ECE 6605 - Information Theory

Prof. Matthieu Bloch

Wednesday, September 13, 2023

Announcements

  • Assignment 2
    • To be posted on Canvas/Gradescope today
    • Due on Tuesday September 26, 2023 (12 days)
  • Last time
    • Renyi entropy
    • Smoothing
    • Typicality
  • Today
    • Typicality
    • Source coding

Asymptotic Equipartition Property

  • Motivation: what can we say about long sequences of i.i.d. random variables?

Let \(\set{X_i}_{i=1}^n\) be i.i.d. random variables \(\sim P_X\). Then, \(\frac{1}{n}\sum_{i=1}^nX_i\) converges in probability to \(\E{X}\). Specifically,

\[\begin{align*} \forall\epsilon>0\qquad \lim_{n\to \infty}\P{\abs{\frac{1}{n}\sum_{i=1}^nX_i-\E{X}}>\epsilon}=0. \end{align*}\]

Let \(\set{X_i}_{i=1}^n\) be i.i.d. random variables \(\sim P_X\). Then,

\[\begin{align*} \frac{-1}{n}\log P_{X}^{\otimes n}(X_1,\cdots,X_n)\mathop{\longrightarrow}_{n\to\infty}\bbH(X)\textsf{ in probability.} \end{align*}\]
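This convergence can be checked numerically. The sketch below (with illustrative parameters) draws long i.i.d. Bernoulli(\(p\)) sequences and compares \(-\frac{1}{n}\log_2 P_X^{\otimes n}(X^n)\) against \(\bbH(X)\); the function name and parameter choices are ours, not from the lecture.

```python
import math
import random

# Empirical check of the AEP for an i.i.d. Bernoulli(p) source:
# -(1/n) log2 P(X^n) should concentrate around H(X).
random.seed(0)
p, n, trials = 0.3, 2000, 200
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def empirical_entropy_rate():
    xs = [1 if random.random() < p else 0 for _ in range(n)]
    k = sum(xs)  # number of ones in the sample
    # log2 P(x^n) = k log2 p + (n - k) log2 (1 - p)
    log_p = k * math.log2(p) + (n - k) * math.log2(1 - p)
    return -log_p / n

rates = [empirical_entropy_rate() for _ in range(trials)]
avg = sum(rates) / trials
print(f"H(X) = {H:.4f}, mean of -(1/n) log2 P = {avg:.4f}")
```

For \(n=2000\) the empirical values cluster tightly around \(\bbH(X)\approx 0.881\) bits, as the law of large numbers applied to \(-\log_2 P_X(X_i)\) predicts.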

Typical set

The typical set \(\calA_\epsilon^n(P_X)\) is

\[\begin{align*} \calA_\epsilon^n(P_X) \eqdef \set{x^n\in\calX^n:2^{-n(\bbH(X)+\epsilon)}\leq P_X^{\otimes n}(x^n)\leq2^{-n(\bbH(X)-\epsilon)}} \end{align*}\]

  1. For \(x^n\in\calA_\epsilon^n(P_X)\)

    \[\begin{align*} \bbH(X)-\epsilon \leq -\frac{1}{n}\log P_X^{\otimes n}(x^n) \leq \bbH(X)+\epsilon \end{align*}\]

  2. \(\P{\calA_\epsilon^n(P_X)}\to 1\) as \(n\to\infty\)

  3. \(\card{\calA_\epsilon^n(P_X)} \leq 2^{n(\bbH(X)+\epsilon)}\)

  4. \(\card{\calA_\epsilon^n(P_X)} \geq (1-\epsilon)2^{n(\bbH(X)-\epsilon)}\) for \(n\) large enough
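For a small blocklength the typical set can be enumerated exhaustively, which makes properties 2–4 concrete. The sketch below (parameters chosen by us for illustration; feasible only because \(n\) is small) lists every \(x^n\in\set{0,1}^n\) for a Bernoulli(\(p\)) source and tests membership directly against the definition.

```python
import math
from itertools import product

# Exhaustive enumeration of the typical set A_eps^n(P_X) for a small
# Bernoulli(p) source (illustrative parameters; only tractable for small n).
p, n, eps = 0.3, 14, 0.2
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

typical_count, prob_typical = 0, 0.0
for xn in product([0, 1], repeat=n):
    k = sum(xn)
    prob = (p ** k) * ((1 - p) ** (n - k))  # P_X^{otimes n}(x^n)
    # x^n is typical iff 2^{-n(H+eps)} <= P(x^n) <= 2^{-n(H-eps)}
    if 2 ** (-n * (H + eps)) <= prob <= 2 ** (-n * (H - eps)):
        typical_count += 1
        prob_typical += prob

print(f"|A_eps| = {typical_count}  (upper bound {2 ** (n * (H + eps)):.0f})")
print(f"P(A_eps) = {prob_typical:.3f}")
```

Even at \(n=14\) the typical set carries most of the probability while containing far fewer than \(2^n\) sequences, which is the mechanism compression will exploit.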

Application: data compression

Fixed-length source coding

[Figure: fixed-length source coding model]

A fixed-length \((n,M_n)\) code for a discrete memoryless source \(p_X\) consists of:

  1. An encoding function \(f_n:\calX^n\to\set{1;M_n}\)
  2. A decoding function \(g_n:\set{1;M_n}\to \calX^n\)
  • The performance of an \((n,M_n)\) code is measured in terms of
    1. The rate of the code \(R\eqdef \frac{\log_2 M_n}{n}\) (bits/source symbol)

    2. The average probability of error

      \[\begin{align*} P_e^{(n)}(\calC) \eqdef \P{\widehat{X}^n\neq X^n} \eqdef \sum_{x^n\in\calX^n}P_X^{\otimes n}(x^n)\indic{g_n(f_n(x^n))\neq x^n}. \end{align*}\]

Fixed-length source coding

  • Question 1: for a fixed \(n\in\bbN\), what is the minimum \(R\) required to achieve \(P_e^{(n)}\leq \epsilon\)?

  • Question 2: as \(n\to\infty\), what is the minimum \(R\) that ensures that \(\lim_{n\to \infty} P_e^{(n)}=0\)? \[ C_{\textsf{source}}\eqdef \inf\left\{R: \exists \set{(f_n,g_n)}_{n\geq 1} \text{ with } \lim_{n\to\infty}\frac{\log M_n}{n}\leq R \text{ and } \lim_{n\to\infty}P_e^{(n)}=0\right\} \]

For a discrete memoryless source \(p_X\), \(C_{\textsf{source}} = \bbH(X)\)

  • This is an asymptotic result as \(n\to\infty\), without any consideration of the complexity of decoding
  • One can get rates as close to the entropy as one wishes as long as \(n\) is large enough (achievability)
  • One cannot compress below the entropy (converse)
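The achievability side of this theorem can be illustrated with the typical-set code itself: assign a distinct index to each typical sequence and a single extra index to all atypical sequences, which the decoder cannot invert. The sketch below (illustrative parameters and function name ours) computes the exact rate and error probability of this scheme for a Bernoulli(\(p\)) source, grouping sequences by their number of ones since all such sequences are equiprobable.

```python
import math
from math import comb

# Typical-set fixed-length compression of a Bernoulli(p) source:
# rate ~ H(X) + eps, error = P(X^n atypical), computed exactly.
p, eps = 0.3, 0.1
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def typical_code(n):
    """Return (rate, error probability) of the typical-set code at blocklength n."""
    size, p_err = 0, 0.0
    for k in range(n + 1):  # all sequences with k ones have the same probability
        logp = k * math.log2(p) + (n - k) * math.log2(1 - p)  # log2 P(x^n)
        if abs(-logp / n - H) <= eps:          # typical: gets its own index
            size += comb(n, k)
        else:                                  # atypical: causes a decoding error
            p_err += comb(n, k) * (2.0 ** logp)
    return math.log2(size + 1) / n, p_err      # +1 index for "atypical"

for n in (50, 200, 800):
    r, pe = typical_code(n)
    print(f"n={n:4d}  rate={r:.3f}  P_e={pe:.2e}")
```

As \(n\) grows the error probability vanishes while the rate stays below \(\bbH(X)+\epsilon\), matching the achievability claim.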

Converse

Consider a sequence of fixed length \((n,M_n)\) source codes \(\set{(f_n,g_n)}_{n\geq1}\) such that \[ \lim_{n\to\infty}\frac{\log M_n}{n}\leq R \text{ and } \lim_{n\to\infty}P_e^{(n)}=0 \] Then \(R\geq \bbH(X)\).

Achievability

  • The achievability follows using a technique called random binning. Consider a source \(P_U\).
  • First define a set \[ \calB_\gamma\eqdef\left\{u\in\calU:\log\frac{1}{P_{U}(u)}\leq\gamma\right\}. \]
  • Encoding function \(f:\calU\to \set{1;M}\) created by assigning an index uniformly at random in \(\set{1;M}\) to each \(u\in\calU\)
  • Decoding function \(g:\set{1;M}\to\calU:m\mapsto u^*\), where \(u^*=u\) if \(u\) is the unique sequence such that \(u\in\calB_\gamma\) and \(f(u)=m\); otherwise, a random \(u\) is declared

On average (over the random binning used to create \(f\)), \[ \E[C]{P_e} \leq \P[P_{U}]{U\notin \calB_\gamma} + \frac{2^{\gamma}}{M}. \]
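The bound can be checked by Monte Carlo. The sketch below (alphabet, distribution, and parameters are ours for illustration) averages the error probability over many random bin assignments, counting as an error both \(U\notin\calB_\gamma\) and any bin collision inside \(\calB_\gamma\), and compares the result with \(\P{U\notin\calB_\gamma}+2^\gamma/M\).

```python
import math
import random

# Monte Carlo check of the random-binning bound on a small source P_U.
random.seed(1)
U = list(range(32))
weights = [1.0 / (u + 1) for u in U]        # an arbitrary non-uniform source
Z = sum(weights)
P = [w / Z for w in weights]

gamma, M = 5.0, 64
B = {u for u in U if math.log2(1.0 / P[u]) <= gamma}   # the set B_gamma
p_not_B = sum(P[u] for u in U if u not in B)

def avg_error(num_binnings=2000):
    total = 0.0
    for _ in range(num_binnings):
        f = {u: random.randrange(M) for u in U}        # random binning
        for u in U:
            if u not in B:                             # decoder only searches B
                total += P[u]
            elif any(up != u and f[up] == f[u] for up in B):
                total += P[u]                          # collision inside the bin
    return total / num_binnings

emp = avg_error()
bound = p_not_B + 2 ** gamma / M
print(f"empirical avg error = {emp:.4f}, bound = {bound:.4f}")
```

The empirical average sits below the bound, and the gap comes from the union bound on collisions being loose when several sequences share a bin.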