Gibbs' inequality (nonnegativity of KL divergence)

The Kullback–Leibler divergence is always nonnegative, and it is zero only when the two distributions are identical.

Gibbs’ inequality: Let $P$ and $Q$ be probability measures on a set $\Omega$ equipped with a $\sigma$-algebra $\mathcal F$, and let $D(P\|Q)$ denote their Kullback–Leibler divergence (allowing the value $+\infty$). Then

$$
D(P\|Q)\ge 0,
$$

with equality if and only if $P=Q$ (as measures on $(\Omega,\mathcal F)$).

Equivalently, in the case $P\ll Q$, writing $f=\frac{dP}{dQ}$ for the Radon–Nikodym derivative, one has

$$
D(P\|Q)=\int_\Omega f\log f\,dQ \ge 0,
$$

and equality holds if and only if $f=1$ $Q$-almost everywhere.
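
One standard way to see why the integral form is nonnegative is sketched below. It is not necessarily the argument the source has in mind; the auxiliary function $\varphi$ is introduced here for illustration only, and the convention $0\log 0=0$ is assumed.

```latex
% Assumes P \ll Q and f = dP/dQ, so f \ge 0 and \int_\Omega f\,dQ = P(\Omega) = 1.
% \varphi is convex on [0,\infty) with \varphi'(t) = \log t, so its minimum is \varphi(1) = 0.
\begin{align*}
\varphi(t) &:= t\log t - t + 1 \;\ge\; 0 \quad (t \ge 0),
  \qquad \varphi(t) = 0 \iff t = 1, \\
D(P\|Q) &= \int_\Omega f\log f\,dQ
         = \int_\Omega \bigl(f\log f - f + 1\bigr)\,dQ
         = \int_\Omega \varphi(f)\,dQ \;\ge\; 0 .
\end{align*}
```

The middle equality uses $\int_\Omega f\,dQ = 1 = \int_\Omega 1\,dQ$. Since $\varphi(f)\ge 0$, the last integral vanishes exactly when $\varphi(f)=0$ $Q$-almost everywhere, that is, when $f=1$ $Q$-almost everywhere; in that case $P(A)=\int_A f\,dQ=Q(A)$ for every $A\in\mathcal F$, so $P=Q$.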

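This is the basic reason relative entropy is called a divergence: for fixed $P$, the map $Q\mapsto D(P\|Q)$ attains its minimum value $0$ uniquely at $Q=P$, even though $D$ is not a metric (it is not symmetric and does not satisfy the triangle inequality). It is also a key input for inequalities relating KL divergence to other quantities.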
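
For distributions on a finite set, these properties can be checked numerically. The following is a minimal sketch, assuming natural logarithms and the convention $0\log 0=0$; the helper name `kl_divergence` and the example vectors `p` and `q` are hypothetical and chosen only for illustration.

```python
import math


def kl_divergence(p, q):
    """Return D(P||Q) in nats for two finite distributions.

    p and q are equal-length sequences of probabilities.
    Returns math.inf when P is not absolutely continuous w.r.t. Q.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # convention: 0 * log(0 / q) = 0
        if qi == 0.0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total


# Illustrative distributions on a three-point set (assumed values).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

d_pq = kl_divergence(p, q)  # strictly positive because p != q
d_qp = kl_divergence(q, p)  # also positive, but generally different from d_pq
d_pp = kl_divergence(p, p)  # exactly zero: every term is p_i * log(1)
```

Comparing `d_pq` with `d_qp` shows the asymmetry that keeps the divergence from being a metric, while `d_pp` illustrates the equality case.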