Prof. Matthieu Bloch
Tuesday September 05, 2023
The only function that satisfies the axioms is \(H(X) = -\sum_xP_X(x)\log_2 P_X(x)\).
Let \(X\in\calX\) be a discrete random variable with \(\card{\calX}<\infty\). The Shannon entropy of \(X\) is defined as
\[\begin{align} \mathbb{H}(X) \eqdef \E[X]{-\log_2 P_X(X)} = - \sum_{x \in {\calX}} P_X(x) \log_2 P_X(x), \end{align}\]
with the convention that \(0\log_2 0 = 0\).
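For example, if \(X\) takes three values with \(P_X = (\tfrac{1}{2},\tfrac{1}{4},\tfrac{1}{4})\), a direct computation gives
\[\begin{align*} \mathbb{H}(X) = \tfrac{1}{2}\log_2 2 + \tfrac{1}{4}\log_2 4 + \tfrac{1}{4}\log_2 4 = \tfrac{3}{2}\textsf{ bits}. \end{align*}\]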
The entropy \(H:\Delta(\calX)\to\bbR:P\mapsto H(P)\) is a concave function.
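For example, mixing the two deterministic binary distributions \(P_0=(1,0)\) and \(P_1=(0,1)\) with equal weights yields the uniform distribution, and indeed
\[\begin{align*} H\left(\tfrac{1}{2}P_0+\tfrac{1}{2}P_1\right) = 1 \geq \tfrac{1}{2}H(P_0)+\tfrac{1}{2}H(P_1) = 0. \end{align*}\]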
Let \(X\in\calX\) be a discrete random variable. Then \(\mathbb{H}(X) \geq 0\) with equality iff \(X\) is a constant, i.e., there exists \(x^*\in\calX\) such that \(\P{X=x^*}=1\).
Let \(X \in {\calX}\) be a discrete random variable. Then \(\mathbb{H}(X)\leq \log_2 \card{\calX}\) with equality if and only if \(X\) is uniform over \({\calX}\), i.e., \(\forall x \in {\calX}, P_X(x) = \frac{1}{\card{\calX}}\).
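Continuing the example above with \(\card{\calX}=3\), the distribution \(P_X=(\tfrac{1}{2},\tfrac{1}{4},\tfrac{1}{4})\) achieves \(\mathbb{H}(X)=1.5\) bits, strictly below the maximum \(\log_2 3\approx 1.585\) bits attained by the uniform distribution.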
The joint entropy of two discrete random variables \(X\in\calX\) and \(Y\in\calY\) with joint PMF \(P_{XY}\) is
\[\begin{align*} \mathbb{H}(X,Y) \eqdef \mathbb{E}_{XY}(-\log_2 P_{XY}(X,Y)) = - \sum_{x \in {\calX}} \sum_{y \in \mathcal{Y}}P_{XY}(x, y) \log_2 P_{XY}(x, y). \end{align*}\]
Furthermore, the conditional entropy of \(Y\) given \(X\) is
\[\begin{align*} \mathbb{H}(Y|X) \eqdef \mathbb{E}_{XY}(-\log_2 P_{Y|X}(Y|X)) = - \sum_{x \in {\calX}} \sum_{y \in \mathcal{Y}} P_{XY}(x,y) \log_2 P_{Y|X}(y|x). \end{align*}\]
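For example, if \((X,Y)\) is uniform over \(\{(0,0),(0,1),(1,0)\}\), then \(\mathbb{H}(X,Y)=\log_2 3\) bits, and since \(Y\) is uniform over \(\{0,1\}\) when \(X=0\) and deterministic when \(X=1\),
\[\begin{align*} \mathbb{H}(Y|X) = P_X(0)\cdot 1 + P_X(1)\cdot 0 = \tfrac{2}{3}\textsf{ bits}. \end{align*}\]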
Let \(X, Y\) be discrete random variables with joint PMF \(P_{XY}\). Then \(\mathbb{H}(Y|X) \geq 0\) with equality if and only if \(Y\) is a function of \(X\).
Let \(X\) and \(Y\) be discrete random variables with joint PMF \(P_{XY}\). Then \(\mathbb{H}(X|Y) \leq \mathbb{H}(X)\) with equality if and only if \(X\) is independent of \(Y\).
Let \(X\), \(Y\), and \(Z\) be discrete random variables with joint PMF \(P_{XYZ}\). Then \[ \mathbb{H}(XY|Z) = \mathbb{H}(X|Z) + \mathbb{H}(Y|XZ) = \mathbb{H}(Y|Z) + \mathbb{H}(X|YZ). \] More generally, if \(\bfX\eqdef\set{X_i}_{i=1}^n\) and \(Z\) are jointly distributed random variables, we have \[ \mathbb{H}(\bfX|Z) = \sum_{i=1}^n\mathbb{H}(X_i|X^{1:i-1}Z) \] with the convention \(X^{1:0}\eqdef \emptyset\).
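On the example above (with trivial \(Z\)), \(\mathbb{H}(X)=\mathbb{H}_b(\tfrac{1}{3})=\log_2 3-\tfrac{2}{3}\) bits, and the chain rule is readily verified:
\[\begin{align*} \mathbb{H}(X)+\mathbb{H}(Y|X) = \left(\log_2 3 - \tfrac{2}{3}\right) + \tfrac{2}{3} = \log_2 3 = \mathbb{H}(X,Y). \end{align*}\]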
Let \(X, Y\) be two random variables with joint PMF \(P_{XY}\). The mutual information between \(X\) and \(Y\) is \[ \bbI(X;Y)\eqdef \bbH(X)-\bbH(X|Y)=\bbD(P_{XY}\Vert P_XP_Y). \]
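On the same example, the joint distribution is symmetric in \(X\) and \(Y\), so \(\mathbb{H}(X|Y)=\tfrac{2}{3}\) bits as well and
\[\begin{align*} \bbI(X;Y) = \mathbb{H}(X)-\mathbb{H}(X|Y) = \left(\log_2 3-\tfrac{2}{3}\right)-\tfrac{2}{3} = \log_2 3 - \tfrac{4}{3}\approx 0.25\textsf{ bits}. \end{align*}\]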
If \(\Delta(\calX\to\calY)\) denotes the set of kernels from \(\calX\) to \(\calY\), then the function \[ I:\Delta(\calX)\times \Delta(\calX\to\calY)\to\bbR:(P,W)\mapsto I(P;W)\eqdef \mathbb{D}(W\cdot P\Vert (W\circ P)P) \] is a concave function of \(P\) for fixed \(W\) and a convex function of \(W\) for fixed \(P\).
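A classical illustration is the binary symmetric channel \(W\) with crossover probability \(p\): for an input distribution \(P=(1-q,q)\) on \(\{0,1\}\),
\[\begin{align*} I(P;W) = \mathbb{H}_b\big(q(1-p)+(1-q)p\big)-\mathbb{H}_b(p), \end{align*}\]
which is concave in \(q\) for fixed \(p\), while for the uniform input \(q=\tfrac{1}{2}\) it reduces to \(1-\mathbb{H}_b(p)\), which is convex in \(p\).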
Let \(X, Y, Z\) be discrete random variables with joint distribution \(P_{XYZ}\). The conditional mutual information between \(X\) and \(Y\) given \(Z\) is \(\mathbb{I}(X ; Y|Z) \eqdef \mathbb{H}(X|Z) - \mathbb{H}(X|YZ)\).
\(\mathbb{I}(X;Y|Z)\geq 0\), with equality if and only if \(X\) and \(Y\) are conditionally independent given \(Z\).
Let \(\bfX\eqdef\set{X_i}_{i=1}^n\), \(Y\), and \(Z\) be jointly distributed random variables. Then,
\[\begin{align*} \mathbb{I}(\bfX;Y|Z) = \sum_{i=1}^n\mathbb{I}(X_i;Y|Z\bfX^{1:i-1}), \end{align*}\]
with the convention \(\bfX^{1:0}=\emptyset\).
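For instance, with \(n=2\) the chain rule reads \(\mathbb{I}(X_1X_2;Y|Z)=\mathbb{I}(X_1;Y|Z)+\mathbb{I}(X_2;Y|ZX_1)\): the information that the pair provides about \(Y\) splits into what \(X_1\) provides and what \(X_2\) provides once \(X_1\) is already known.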
Let \(X\), \(Y\), and \(Z\) be discrete random variables such that \(X \rightarrow Y \rightarrow Z\). Then \(\mathbb{I}(X;Y) \geq \mathbb{I}(X;Z)\) or, equivalently, \(\mathbb{H}(X|Z) \geq \mathbb{H}(X|Y)\).
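For example, if a uniform binary \(X\) is passed through a binary symmetric channel with crossover probability \(0.1\) to produce \(Y\), and \(Y\) through an independent binary symmetric channel with crossover probability \(0.1\) to produce \(Z\), then \(X \rightarrow Y \rightarrow Z\) and
\[\begin{align*} \bbI(X;Y) = 1-\mathbb{H}_b(0.1)\approx 0.53\textsf{ bits} \geq \bbI(X;Z) = 1-\mathbb{H}_b(0.18)\approx 0.32\textsf{ bits}, \end{align*}\]
since the two channels compose into a binary symmetric channel with crossover probability \(0.1\times 0.9+0.9\times 0.1=0.18\).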
Let \(X\) be a discrete random variable with alphabet \({\calX}\), and let \(\widehat{X}\in{\calX}\) be an estimate of \(X\) with joint distribution \(P_{X\widehat{X}}\). Define the probability of estimation error \(P_e \eqdef \mathbb{P}[X\neq\widehat{X}]\). Then, \(\mathbb{H}(X|\widehat{X}) \leq \mathbb{H}_b(P_e) + P_e \log_2(\card{\calX}-1)\).
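For example, with \(\card{\calX}=4\) and \(P_e=0.1\), Fano's inequality gives
\[\begin{align*} \mathbb{H}(X|\widehat{X}) \leq \mathbb{H}_b(0.1)+0.1\log_2 3 \approx 0.47+0.16 = 0.63\textsf{ bits}; \end{align*}\]
conversely, a large residual uncertainty \(\mathbb{H}(X|\widehat{X})\) forces the probability of error \(P_e\) to be large.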
Let \(X\in\calX\) be a discrete random variable. The Rényi entropy of order \(\alpha\in\mathbb{R}\cup\{\infty\}\) of \(X\) is
\[\begin{align*} H_{\alpha}(X)\eqdef \frac{1}{1-\alpha}\log_2 \sum_{x\in\calX} P_X(x)^\alpha, \end{align*}\]
with the convention that \(H_1(X)=\lim_{\alpha\rightarrow 1} H_\alpha(X)\) and \(H_\infty(X)=\lim_{\alpha\rightarrow \infty} H_\alpha(X)\).
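For example, for \(P_X=(\tfrac{1}{2},\tfrac{1}{4},\tfrac{1}{4})\),
\[\begin{align*} H_0(X)=\log_2 3\approx 1.585,\quad H_1(X)=\mathbb{H}(X)=1.5,\quad H_2(X)=-\log_2\tfrac{3}{8}\approx 1.415,\quad H_\infty(X)=-\log_2\tfrac{1}{2}=1, \end{align*}\]
illustrating that \(H_\alpha(X)\) is non-increasing in \(\alpha\) and that \(H_1\) recovers the Shannon entropy.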
Let \(X\in\calX\) be a discrete random variable. Then,
Let \(X\in\calX\) and \(Y\in\calY\) be discrete random variables. Then,
Let \(X,Y,Z\) be three discrete random variables. Then,
Let \(X,Y,Z\) be three discrete random variables. Then, \(\forall \alpha\in\mathbb{R}^+\cup\{\infty\}\) \(H^A_{\alpha}(X|YZ)\leq H^A_{\alpha}(X|Y)\). If \(\alpha\notin\{0,\infty\}\), equality holds if and only if \(X-Y-Z\) forms a Markov chain.
Let \(X,Y,Z\) be three discrete random variables such that \(X-Y-Z\) forms a Markov chain. Then, \(\forall\alpha\in\mathbb{R}^+\cup\{\infty\}\) \(H^A_\alpha(X|Y)\leq H^A_\alpha(X|Z)\).
Let \(X,Y,Z\) be three discrete random variables. Then, \(\forall \alpha\in\mathbb{R}^+\cup\{\infty\}\)
\[\begin{align*} H^A_{\alpha}(X|YZ)\geq H^A_\alpha(XY|Z)-H^A_0(Y|Z) \geq H^A_\alpha(X|Z)-H^A_0(Y|Z)\geq H^A_\alpha(X|Z)-\log_2\card{\calY}. \end{align*}\]