Maximum entropy principle

A rule for selecting a probability distribution by maximizing entropy subject to known constraints.

The maximum entropy principle is the prescription: given a nonempty set $\mathcal C$ of candidate probability distributions, select a distribution $P^\star$ satisfying

$$P^\star \in \operatorname*{arg\,max}_{P\in\mathcal C} H(P),$$

where $H$ is the Shannon entropy for discrete laws (and typically the differential entropy for absolutely continuous laws). In applications, $\mathcal C$ is usually defined by constraints such as normalization, support restrictions, or moment/expectation constraints of the form $\mathbb E_P[g_i(X)] = c_i$ for a random variable $X$ and given functions $g_i$ (where $\mathbb E_P$ denotes expectation under $P$).
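To make the prescription concrete, here is a minimal numerical sketch, not taken from the source: the support $\{1,\dots,6\}$, the target mean $4.5$, and the use of SciPy's general-purpose solver are assumptions made for illustration.

```python
# Maximize Shannon entropy over distributions on {1,...,6} subject to
# a mean constraint E_P[X] = 4.5 (illustrative setup, not from the source).
import numpy as np
from scipy.optimize import minimize

outcomes = np.arange(1, 7)   # support {1, ..., 6}
target_mean = 4.5            # moment constraint E_P[X] = 4.5

def neg_entropy(p):
    # Negative Shannon entropy; clipping treats 0*log(0) as 0.
    p = np.clip(p, 1e-12, 1.0)
    return np.sum(p * np.log(p))

constraints = [
    {"type": "eq", "fun": lambda p: p.sum() - 1.0},               # normalization
    {"type": "eq", "fun": lambda p: outcomes @ p - target_mean},  # mean constraint
]
p0 = np.full(6, 1 / 6)  # start at the uniform distribution
res = minimize(neg_entropy, p0, bounds=[(0, 1)] * 6, constraints=constraints)
print(res.x)  # maximum-entropy distribution under the mean constraint
```

For linear moment constraints like this, the maximizer has the exponential (Gibbs) form $p_k \propto e^{\lambda k}$, and the numerical solution reproduces those geometrically increasing weights.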

The guiding idea is to choose the least-committal distribution consistent with the stated information. In many common settings, maximum-entropy problems can be reformulated in terms of minimizing the Kullback–Leibler divergence to a reference distribution, with Gibbs' inequality guaranteeing nonnegativity of the objective.
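A quick numerical check of this reformulation in the finite case (the choice $n = 6$ and the random test distribution are assumptions for the example): on $n$ outcomes, $H(P) = \log n - D_{\mathrm{KL}}(P \,\|\, U)$ with $U$ uniform, so maximizing entropy is the same as minimizing the divergence to $U$.

```python
# Verify H(P) = log n - D_KL(P || U) for a random P on n = 6 outcomes.
import numpy as np
from scipy.stats import entropy  # entropy(p) is H(P); entropy(p, q) is D_KL(P||Q)

rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(6))  # an arbitrary distribution on 6 outcomes
u = np.full(6, 1 / 6)          # uniform reference distribution

lhs = entropy(p)                 # H(P), natural logarithm
rhs = np.log(6) - entropy(p, u)  # log n - D_KL(P || U), D_KL >= 0 by Gibbs
print(np.isclose(lhs, rhs))      # True: the two formulations agree
```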

Examples:

  • On a finite set of $n$ outcomes, if the only constraint is that the distribution is supported on those outcomes, the maximizer of Shannon entropy is the uniform distribution.
  • Among all distributions on $\mathbb R$ with fixed mean and variance, the maximizer of differential entropy is the normal distribution with those parameters (see the numerical check after this list).
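An illustrative check of the second example (the Laplace and uniform competitors, and their variance-matching scalings, are assumptions for this sketch): among laws scaled to variance $1$, the normal attains the largest differential entropy.

```python
# Compare differential entropies of three unit-variance laws via SciPy's
# closed-form entropy(); the normal should come out on top.
import numpy as np
from scipy.stats import norm, laplace, uniform

sigma = 1.0
h_normal = norm(scale=sigma).entropy()                   # 0.5*log(2*pi*e*sigma^2)
h_laplace = laplace(scale=sigma / np.sqrt(2)).entropy()  # Laplace, variance sigma^2
h_uniform = uniform(scale=np.sqrt(12) * sigma).entropy() # Uniform, variance sigma^2

print(h_normal, h_laplace, h_uniform)  # ~1.419 > ~1.347 > ~1.242
```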