Maximum entropy with constraints (Gibbs/exponential family)
Statement (maximum entropy principle)
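Let $(\Omega,\mathcal{F})$ be a measurable space with a reference measure $\mu$, and let $f_1,\dots,f_m:\Omega\to\mathbb{R}$ be measurable “constraint functions.” Fix target values $c_1,\dots,c_m\in\mathbb{R}$ and consider probability measures $P$ that are absolutely continuous w.r.t. $\mu$, with density $p = dP/d\mu$ satisfying
- normalization: $\int p\,d\mu = 1$,
- moment constraints: $\int f_i\,p\,d\mu = c_i$ for $i=1,\dots,m$.
Assume there exists at least one feasible $p$ with finite Shannon entropy $H(p) = -\int p\log p\,d\mu$ (see Shannon entropy).
If the entropy maximization problem has an interior maximizer (e.g. under standard regularity/feasibility conditions), then any maximizer has the Gibbs/exponential-family form
$$p^*(x) = \frac{1}{Z(\lambda)}\exp\Big(-\sum_{i=1}^m \lambda_i f_i(x)\Big),$$
where $Z(\lambda) = \int \exp\big(-\sum_{i=1}^m \lambda_i f_i\big)\,d\mu < \infty$ and multipliers $\lambda = (\lambda_1,\dots,\lambda_m)$ are chosen so that the constraints hold. Equivalently,
$$\log p^*(x) = -\log Z(\lambda) - \sum_{i=1}^m \lambda_i f_i(x).$$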
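As a concrete illustration of the statement, the following minimal sketch works on a finite set with counting measure as the reference and a single mean constraint. The die faces, the target mean 4.5, and the helper names `gibbs`, `mean`, and `solve_multiplier` are assumptions chosen for the example, and the sign convention $p_k \propto e^{-\lambda x_k}$ is likewise a choice.

```python
import math

def gibbs(values, lam):
    # p_k proportional to exp(-lam * x_k); shift exponents by their max for stability
    exps = [-lam * x for x in values]
    top = max(exps)
    w = [math.exp(e - top) for e in exps]
    z = sum(w)
    return [wi / z for wi in w]

def mean(values, p):
    return sum(x * pi for x, pi in zip(values, p))

def solve_multiplier(values, target, lo=-50.0, hi=50.0, tol=1e-12):
    # The Gibbs mean is strictly decreasing in lam (its derivative is minus the
    # variance), so bisection applies when min(values) < target < max(values).
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean(values, gibbs(values, mid)) > target:
            lo = mid   # mean still too large: move to a larger multiplier
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

faces = [1, 2, 3, 4, 5, 6]          # hypothetical example: a six-sided die
lam = solve_multiplier(faces, 4.5)  # multiplier matching the mean constraint
p = gibbs(faces, lam)
entropy = -sum(pi * math.log(pi) for pi in p)
print(lam, p, entropy)
```

Bisection is used here only because it needs no external library; any one-dimensional root finder would serve the same purpose.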
Key hypotheses
- A feasible set exists: there is at least one density $p \ge 0$ with $\int p\,d\mu = 1$ and $\int f_i\,p\,d\mu = c_i$ for all $i$.
- Finite entropy is attainable: $|H(p)| < \infty$ for some feasible $p$.
- Regularity ensuring the optimizer is not on the boundary (so Lagrange multiplier calculus applies) and that $Z(\lambda)$ is finite at the solution.
Conclusions
- Form of the maximizer: the maximum-entropy density is an exponential tilt of the reference $\mu$.
- Uniqueness (typical): since $H$ is strictly concave on densities and the feasible set is convex (the constraints are linear in $p$), the maximizer is unique $\mu$-almost everywhere whenever it exists.
- Thermodynamic interpretation: with a single constraint $f_1 = E$ (energy), the maximizer is the canonical ensemble $p^*(x) = e^{-\beta E(x)}/Z(\beta)$ with partition function $Z(\beta) = \int e^{-\beta E}\,d\mu$ (see canonical partition function); the multiplier $\beta$ is the inverse temperature (see temperature).
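The thermodynamic reading can be checked numerically on a small discrete system. The sketch below assumes a hypothetical set of four energy levels in arbitrary units and counting measure as the reference; it compares the canonical mean energy with a finite-difference estimate of $-\partial \log Z/\partial\beta$, which should agree for a Gibbs density of this form. The function names are illustrative.

```python
import math

def log_partition(energies, beta):
    # log Z(beta) = log sum_k exp(-beta * E_k), using a log-sum-exp shift
    exps = [-beta * e for e in energies]
    top = max(exps)
    return top + math.log(sum(math.exp(x - top) for x in exps))

def mean_energy(energies, beta):
    # canonical average of E under p_k = exp(-beta * E_k) / Z(beta)
    log_z = log_partition(energies, beta)
    return sum(e * math.exp(-beta * e - log_z) for e in energies)

energies = [0.0, 1.0, 2.0, 5.0]  # hypothetical levels, arbitrary units
beta = 0.7                       # assumed inverse temperature
h = 1e-5

# central finite difference for -d(log Z)/d(beta)
fd = -(log_partition(energies, beta + h) - log_partition(energies, beta - h)) / (2 * h)
u = mean_energy(energies, beta)
print(u, fd)
```

Raising `beta` (lowering the temperature) shifts weight toward the low-energy levels, so the mean energy falls as `beta` grows.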
Cross-links to definitions
- Probability framework: probability measure, expectation.
- Entropy/divergence: Shannon entropy, relative entropy (KL divergence).
- Statistical mechanics: canonical ensemble, partition function.
Proof idea / significance
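Use Lagrange multipliers for the constrained optimization of the strictly concave functional $H(p)$ over an affine slice of densities. Stationarity of
$$\mathcal{L}(p) = -\int p\log p\,d\mu - \lambda_0\Big(\int p\,d\mu - 1\Big) - \sum_{i=1}^m \lambda_i\Big(\int f_i\,p\,d\mu - c_i\Big)$$
forces $\log p$ to be an affine function of the constraint functions $f_i$, giving the exponential form.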
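The stationarity step can be spelled out as a short formal calculation. The notation is an assumption of this sketch: $p$ is the density with respect to the reference measure $\mu$, $\mathcal{L}$ is the entropy minus multiplier terms, $\lambda_0$ enforces normalization, $\lambda_i$ enforce the moment constraints on $f_i$, and $h$ is an arbitrary bounded perturbation. Integrability and positivity of $p$ are taken for granted rather than proved.

```latex
\begin{align*}
\frac{d}{d\varepsilon}\,\mathcal{L}(p+\varepsilon h)\Big|_{\varepsilon=0}
  &= \int h(x)\Big(-\log p(x) - 1 - \lambda_0 - \sum_{i=1}^m \lambda_i f_i(x)\Big)\,d\mu(x) = 0
  \quad \text{for all } h, \\
\log p(x) &= -(1+\lambda_0) - \sum_{i=1}^m \lambda_i f_i(x)
  \quad \mu\text{-a.e.}, \\
p(x) &= e^{-(1+\lambda_0)} \exp\Big(-\sum_{i=1}^m \lambda_i f_i(x)\Big), \\
e^{1+\lambda_0} &= \int \exp\Big(-\sum_{i=1}^m \lambda_i f_i\Big)\,d\mu =: Z(\lambda)
  \quad \text{(from normalization)}.
\end{align*}
```

The last line shows that the normalization multiplier is absorbed into the partition function, so only the $\lambda_i$ remain to be fitted to the moment constraints.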
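Equivalently, maximizing $H$ subject to constraints is the same as minimizing the KL divergence $D(P\,\|\,\mu)$ to the reference measure subject to the same constraints (a projection principle), since $H(p) = -D(P\,\|\,\mu)$ when $\mu$ is a probability measure (and the two differ by an additive constant when $\mu$ is finite and normalized). This links equilibrium ensembles to variational principles.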
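The projection principle can be illustrated on a finite set, where counting measure on $n$ points normalizes to the uniform distribution and $H(p) = \log n - D(p\,\|\,\text{uniform})$. The sketch below assumes a hypothetical four-point space with values 0..3 and an arbitrary multiplier 0.5; it perturbs the Gibbs distribution along a direction that preserves both normalization and the mean, and compares entropy and KL divergence.

```python
import math

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl(p, q):
    # relative entropy D(p || q) on a finite set
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

values = [0, 1, 2, 3]            # hypothetical constraint function f(x) = x
n = len(values)
uniform = [1.0 / n] * n

lam = 0.5                        # assumed multiplier; it fixes the target mean
w = [math.exp(-lam * x) for x in values]
z = sum(w)
p = [wi / z for wi in w]         # Gibbs distribution

# Direction d has sum(d) = 0 and sum(x * d) = 0, so p + t*d stays feasible.
d = [1.0, -2.0, 1.0, 0.0]
perturbed = [[pi + t * di for pi, di in zip(p, d)] for t in (0.05, -0.05)]

def mean(q):
    return sum(x * qi for x, qi in zip(values, q))

for q in perturbed:
    print(entropy(p) - entropy(q), kl(q, uniform) - kl(p, uniform))
```

Both printed differences should be positive and equal to each other: moving away from the Gibbs distribution within the feasible set lowers the entropy by exactly the amount it raises the divergence from the uniform reference.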