ECE 6605 - Information Theory

Prof. Matthieu Bloch

Tuesday, September 12, 2023

Announcements

  • Assignment 1
    • Deadline passed; will be graded as soon as possible
  • Assignment 2
    • Coming up by Thursday September 14, 2023
    • Topics include entropy rate and typicality
  • Last time
    • Data processing inequality
    • Fano's inequality
    • Renyi entropy
  • Today
    • Renyi entropy
    • Smoothing
    • Typicality (+ a real engineering problem)

Renyi entropy

  • Weaker axiomatic construction of entropy H for \(P_X\in\Delta(\calX)\)
    1. \(H(X)\) only depends on the set of values \(\{p_X(x)\}_{x\in\calX}\) but not on their order.
    2. for \(X\sim B(p)\) with \(p\in[0,1]\), \(H(X)\) is a continuous function of \(p\).
    3. if \(X\sim B(1/2)\), then \(H(X)=1\).

Let \(X\in\calX\) be a discrete random variable. The Renyi entropy of order \(\alpha\in\mathbb{R}\cup\{\infty\}\) of \(X\) is

\[\begin{align*} H_{\alpha}(X)\eqdef \frac{1}{1-\alpha}\log \sum_{x\in\calX} p_X(x)^\alpha, \end{align*}\]

with the convention that \(H_1(X)=\lim_{\alpha\rightarrow 1} H_\alpha(X)\) and \(H_\infty(X)=\lim_{\alpha\rightarrow \infty} H_\alpha(X)\).
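As a quick illustration, here is a minimal Python sketch of this definition (names are illustrative, not part of the lecture; logarithms are base 2, so entropies are in bits), with the two limiting conventions handled explicitly:

```python
import math

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (in bits) of a pmf given as a list."""
    p = [q for q in p if q > 0]      # drop zero-probability symbols
    if alpha == 1:                   # convention: limit alpha -> 1 is Shannon entropy
        return -sum(q * math.log2(q) for q in p)
    if alpha == math.inf:            # convention: limit alpha -> inf is min-entropy
        return -math.log2(max(p))
    return math.log2(sum(q ** alpha for q in p)) / (1 - alpha)
```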

Properties of Renyi entropy

Let \(X\in\calX\) be a discrete random variable. Then,

  • \(H_0(X)=\log \card{\set{x\in\calX: p_X(x)>0}}\), the log of the support size of \(P_X\), is called the max-entropy;
  • \(H_1(X)\eqdef\lim_{\alpha\rightarrow 1} H_\alpha(X) = \bbH(X)\) is the Shannon entropy;
  • \(H_2(X)=-\log\sum_{x}p_X(x)^2\) is called the collision entropy;
  • \(H_\infty(X)\eqdef\lim_{\alpha\rightarrow \infty} H_\alpha(X) = -\log \max_{x}p_X(x)\) is called the min-entropy.
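Continuing the sketch above (the pmf is an arbitrary example), the four named orders can be checked numerically with `renyi_entropy`:

```python
p = [0.5, 0.25, 0.125, 0.125]        # example pmf (assumed)
for alpha, name in [(0, "max"), (1, "Shannon"), (2, "collision"), (math.inf, "min")]:
    print(f"H_{alpha}(X) = {renyi_entropy(p, alpha):.4f} bits  ({name}-entropy)")
# H_0 = 2.0000 (support of size 4), H_1 = 1.7500, H_2 = 1.5406, H_inf = 1.0000
```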

Let \(X\in\calX\) and \(Y\in\calY\) be discrete random variables. Then,

  • For any \(\alpha,\beta\in\mathbb{R}^+\cup\{\infty\}\), if \(\alpha\leq\beta\) then \(H_\alpha(X)\geq H_{\beta}(X)\).
  • For any \(\alpha\in\mathbb{R}^+\cup\{\infty\}\) we have \(0\leq H_\alpha(X)\leq \log \card{\calX}\). The equality \(H_\alpha(X)=0\) holds if and only if \(X\) is constant. For \(\alpha\neq 0\), the equality \(H_\alpha(X)=\log\card{\calX}\) holds if and only if \(X\) is uniformly distributed.
  • For any \(\alpha\in\mathbb{R}^+\cup\{\infty\}\) we have \(H_\alpha(XY)\geq H_{\alpha}(X)\).
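The monotonicity in \(\alpha\) and the bounds above are easy to spot-check numerically, reusing `renyi_entropy` and the example pmf `p` from the sketches above:

```python
alphas = [0, 0.5, 1, 2, 5, math.inf]
values = [renyi_entropy(p, a) for a in alphas]
# nonincreasing in alpha, and every value lies in [0, log2 |X|]
assert all(u >= v - 1e-12 for u, v in zip(values, values[1:]))
assert all(0 <= v <= math.log2(len(p)) for v in values)
```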

Conditional Renyi entropy

  • This is where it gets complicated: there are several options, including the two below (a numerical sketch follows the list)
    • \(H_\alpha^A(X|Y)\eqdef {\frac{\alpha}{1-\alpha}}\log{\left(\sum_yp_Y(y)\left(\sum_xp_{X|Y}(x|y)^\alpha\right)^{\frac{1}{\alpha}}\right)}\);
    • \(H_\alpha^H(X|Y)\eqdef \frac{1}{1-\alpha}\log \left(\sum_yp_{Y}(y)\sum_xp_{X|Y}(x|y)^\alpha\right)\).
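The superscript \(A\) is commonly associated with Arimoto's definition. Both quantities are straightforward to compute from a joint pmf; here is a sketch valid for \(\alpha\notin\{0,1,\infty\}\) (the function name and example joint pmf are illustrative):

```python
import math

def conditional_renyi(p_xy, alpha, kind="A"):
    """H_alpha^A(X|Y) or H_alpha^H(X|Y) in bits, for a joint pmf p_xy[x][y]
    and alpha not in {0, 1, inf}."""
    ny = len(p_xy[0])
    p_y = [sum(row[y] for row in p_xy) for y in range(ny)]
    terms = []
    for y in range(ny):
        if p_y[y] > 0:               # inner sum over x of p(x|y)^alpha
            s = sum((row[y] / p_y[y]) ** alpha for row in p_xy if row[y] > 0)
            terms.append((p_y[y], s))
    if kind == "A":                  # H^A: inner sum raised to the power 1/alpha
        return (alpha / (1 - alpha)) * math.log2(sum(py * s ** (1 / alpha) for py, s in terms))
    return math.log2(sum(py * s for py, s in terms)) / (1 - alpha)   # H^H

p_xy = [[0.5, 0.1], [0.1, 0.3]]      # example joint pmf (assumed)
print(conditional_renyi(p_xy, 2, "A"))   # ~0.5511 bits
print(conditional_renyi(p_xy, 2, "H"))   # ~0.5495 bits: the two definitions differ
```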

Let \(X,Y,Z\) be three discrete random variables. Then,

  • \(\forall \alpha,\beta\in\mathbb{R}^+\cup\{\infty\}\) \(\alpha\leq\beta\Rightarrow H_\alpha^A(X|Y)\geq H^A_{\beta}(X|Y)\).
  • \(\forall \alpha\in\mathbb{R}^+\cup\{\infty\}\) \(0\leq H^A_\alpha(X|Y)\leq \log \card{\calX}\) with equality on the left-hand side if and only if \(X\) is a function of \(Y\); if \(\alpha\neq 0\), equality holds on the right-hand side if and only if \(X\) is uniformly distributed and independent of \(Y\).
  • \(\forall \alpha\in\mathbb{R}^+\cup\{\infty\}\) \(H_{\alpha}^A(XY|Z)\geq H_{\alpha}^A(X|Z)\).


Let \(X,Y,Z\) be three discrete random variables. Then, \(\forall \alpha\in\mathbb{R}^+\cup\{\infty\}\) \(H^A_{\alpha}(X|YZ)\leq H^A_{\alpha}(X|Z)\). If \(\alpha\notin\{0,\infty\}\), equality holds if and only if \(X-Z-Y\) forms a Markov chain.

Let \(X,Y,Z\) be three discrete random variables such that \(X-Y-Z\) forms a Markov chain. Then, \(\forall\alpha\in\mathbb{R}^+\cup\{\infty\}\) \(H_\alpha^A(X|Y)\leq H_\alpha^A(X|Z)\).

Let \(X,Y,Z\) be three discrete random variables. Then, \(\forall \alpha\in\mathbb{R}^+\cup\{\infty\}\)

\[\begin{align*} H^A_{\alpha}(X|Z)\geq H^A_\alpha(XY|Z)-H^A_0(Y|Z) \geq H^A_\alpha(X|Z)-H^A_0(Y|Z)\geq H^A_\alpha(X|Z)-\log\card{\calY}. \end{align*}\]

The first inequality is a chain-rule-type bound; the second uses \(H^A_\alpha(XY|Z)\geq H^A_\alpha(X|Z)\), and the third uses \(H^A_0(Y|Z)\leq\log\card{\calY}\).

Smooth Renyi entropy

  • For a probability distribution \(P\in\Delta(\calX)\) and \(\epsilon\geq 0\), the ball of radius \(\epsilon\) around \(P\) is \[ \calB^\epsilon(P) \eqdef \set{Q:\bbV(P,Q)\leq \epsilon} \]

Let \(X\in\calX\) be a discrete random variable and let \(\epsilon\geq 0\). The \(\epsilon\)-smooth Renyi entropy of order \(\alpha\in\mathbb{R}\cup\{\infty\}\) of \(X\) is

\[\begin{align*} H^\epsilon_{\alpha}(X)\eqdef \frac{1}{1-\alpha}\inf_{Q\in\calB^\epsilon(P_X)}\log \sum_{x\in\calX} Q(x)^\alpha, \end{align*}\]

with the convention that \(H^\epsilon_1(X)=\lim_{\alpha\rightarrow 1} H^\epsilon_\alpha(X)\) and \(H^\epsilon_\infty(X)=\lim_{\alpha\rightarrow \infty} H^\epsilon_\alpha(X)\).

  • This may look very complicated

  • We will get back to this quantity later; it is actually quite natural in some cases (a small sketch follows)
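To make the smoothing concrete, consider \(\alpha=0\). Assuming \(\bbV\) denotes total variation distance, minimizing the support size over the ball amounts to zeroing out the lightest atoms, with total removed mass at most \(\epsilon\), and shifting that mass onto the surviving atoms. A minimal sketch under these assumptions:

```python
import math

def smooth_max_entropy(p, eps):
    """H_0^eps(X) in bits: smallest log-support size within total-variation
    distance eps of p (removed mass is moved onto the surviving atoms)."""
    atoms = sorted(q for q in p if q > 0)   # lightest atoms first
    removed = 0.0
    while len(atoms) > 1 and removed + atoms[0] <= eps:
        removed += atoms.pop(0)             # zero out the lightest atom
    return math.log2(len(atoms))

print(smooth_max_entropy([0.69, 0.1, 0.1, 0.1, 0.01], 0.05))  # 2.0: the 0.01 atom is smoothed away
```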

Asymptotic Equipartition Property

  • Motivation: what can we say about long sequences of i.i.d. random variables?

Let \(\set{X_i}_{i=1}^n\) be i.i.d. random variables \(\sim P_X\). Then, \(\frac{1}{n}\sum_{i=1}^nX_i\) converges in probability to \(\E{X}\). Specifically,

\[\begin{align*} \forall\epsilon>0\qquad \lim_{n\to \infty}\P{\abs{\frac{1}{n}\sum_{i=1}^nX_i-\E{X}}>\epsilon}=0. \end{align*}\]

Let \(\set{X_i}_{i=1}^n\) be i.i.d. random variables \(\sim P_X\). Then,

\[\begin{align*} -\frac{1}{n}\log P_{X}^{\otimes n}(X_1,\cdots,X_n)\mathop{\longrightarrow}_{n\to\infty}\bbH(X)\textsf{ in probability.} \end{align*}\]
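Since \(-\frac{1}{n}\log P_X^{\otimes n}(X_1,\cdots,X_n)=\frac{1}{n}\sum_{i=1}^n-\log p_X(X_i)\), this is just the weak law of large numbers applied to the i.i.d. variables \(-\log p_X(X_i)\), whose mean is \(\bbH(X)\). A quick Monte Carlo sketch (example pmf assumed):

```python
import math, random

p = {"a": 0.5, "b": 0.25, "c": 0.25}               # example pmf
H = -sum(q * math.log2(q) for q in p.values())     # H(X) = 1.5 bits

for n in (10, 100, 10_000):
    xs = random.choices(list(p), weights=list(p.values()), k=n)
    print(n, sum(-math.log2(p[x]) for x in xs) / n)  # -> H(X) as n grows
```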

Typical set

The typical set \(\calA_\epsilon^n(P_X)\) is

\[\begin{align*} \calA_\epsilon^n(P_X) \eqdef \set{x^n\in\calX^n:2^{-n(\bbH(X)+\epsilon)}\leq P_X^{\otimes n}(x^n)\leq 2^{-n(\bbH(X)-\epsilon)}} \end{align*}\]

  1. For \(x^n\in\calA_\epsilon^n(P_X)\)

    \[\begin{align*} \bbH(X)-\epsilon \leq -\frac{1}{n}\log P_X^{\otimes n}(x^n) \leq \bbH(X)+\epsilon \end{align*}\]

  2. \(\P{X^n\in\calA_\epsilon^n(P_X)}\to 1\) as \(n\to\infty\)

  3. \(\card{\calA_\epsilon^n(P_X)} \leq 2^{n(\bbH(X)+\epsilon)}\)

  4. \(\card{\calA_\epsilon^n(P_X)} \geq (1-\epsilon)2^{n(\bbH(X)-\epsilon)}\) for \(n\) large enough
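For a small alphabet, the typical set can be enumerated exhaustively and the properties above checked directly. A sketch for a Bernoulli(\(p\)) source (the parameters are assumed for illustration):

```python
import math, itertools

p, n, eps = 0.3, 12, 0.1                        # assumed source, length, and slack
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

size, prob = 0, 0.0
for xn in itertools.product([0, 1], repeat=n):
    k = sum(xn)                                 # number of ones in x^n
    logp = k * math.log2(p) + (n - k) * math.log2(1 - p)
    if -n * (H + eps) <= logp <= -n * (H - eps):
        size += 1                               # x^n is typical
        prob += 2.0 ** logp                     # accumulate P(x^n)
print(size, "<=", 2 ** (n * (H + eps)))          # property 3
print(prob)                                      # property 2: tends to 1 as n grows
```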