Probabilities - Markov, Chernoff, Hoeffding
Friday, September 9, 2022
Probabilities are not that old. The axiomatic theory was carried out by Kolmogorov in 1932
Why do we we need to deal with \(\calF\) instead of \(\Omega\)?
It really matters to do things properly but we can get away without it for now
Let \((\Omega, \calF, \mathbb{P})\) be a probability space. For any events \(\set{\calA_i}_{i\geq 1}\) we have \(\P{\cup_{i\geq 1}\calA_i}\leq \sum_{i\geq 1}\P{\calA_i}\)
Let \((\Omega, \calF, \mathbb{P})\) be a probability space. If \(\P{\calB}>0\), the conditional probability of event \(\calA\) given event \(\calB\) is given by \(\P{\calA|\calB} = \P{\calA\cap\calB}/\P{\calB}\).
Let \((\Omega, \calF, \mathbb{P})\) be a probability space. Event \(\calA\) is independent of event \(\calB\) if \(\P{\calA\cap\calB} = \P{\calA}\P{\calB}\).
For \(n>2\), the events \(\set{\calA_i}_{i=1}^n\) are independent if \(\forall\calS\subset\intseq{1}{n}\) such that \(\card{\calS}\geq 2\), \[\P{\cap_{i\in\calS}\calA_i}=\prod_{i\in\calS}\P{\calA_i}\]
Let \((\Omega, \calF, \mathbb{P})\) be a probability space and \(X\) a random variable. The CDF of \(X\) is the function \[F_X:\bbR\to[0,1]:x\mapsto \P{\omega\in\Omega:X(\omega)\leq x}\eqdef \P{X\leq x}\]
If \(\card{\calX}<\infty\) or countable, the random variable is discrete. We can write \(\calX=\set{x_i}_{i=1}^\card{\calX}\) and \(P_X(x_i)\eqdef\P{X=x_i}\) is called the probability mass function (PMF) of \(X\).
If the CDF of \(X\) has a finite derivative at \(x\), the derivative is called the probability density function (PDF), denoted by \(p_X\). If \(F_X\) has a derivative for every \(x\in\bbR\), \(X\) is continuous
We often don’t need to specify \((\Omega, \calF, \mathbb{P})\). All we need is a CDF (or PMF or PDF)
Let \(X\) be a random variable with PMF \(P_X\). Then \(\E{X}\eqdef\sum_{x\in\calX}x P_X(x)\).
Let \(X\) be a random variable with PDF \(p_X\). Then \(\E{X}\eqdef\int_{x\in\calX}x p_X(x)dx\).
Expectation of a function \(f\) of a discrete \(X\) is \(\E{f(X)} = \sum_{x\in\calX} f(x)P_X(x)\) (and idem for PDFs).
Let \(X\) be a random variable. The \(m\)th moment of \(X\) is \(\E{X^m}\).
The variance is the second centered moment \(\Var{X}\eqdef \E{(X-\E{X})^2}=\E{X^2}-\E{X}^2\).
Let \(X\) be a random variable and \(\calE\subset\bbR\). Then \[\E{\indic{X\in\calE}}=\P{X\in\calE}\]
\[ \E[XY]{f(X,Y)} = \E[Y]{\E[X|Y]{f(X,Y)|Y}} \]
11th commandment: thou shall denote random variables by capital letters
12th commandment: but sometimes not
Let \(X\) be a non-negative real-valued random variable. Then for all \(t>0\) \[\P{X\geq t}\leq \frac{\E{X}}{t}.\]
Let \(X\) be a real-valued random variable. Then for all \(t>0\) \[\P{\abs{X-\E{X}}\geq t}\leq \frac{\Var{X}}{t^2}.\]
Let \(\{X_i\}_{i=1}^N\) be i.i.d. real-valued random variables with finite mean \(\mu\) and finite variance \(\sigma^2\). Then \[\P{\abs{\frac{1}{N}\sum_{i=1}^N X_i-\mu}\geq\epsilon}\leq\frac{\sigma^2}{N\epsilon^2}\qquad\lim_{N\to\infty}\P{\abs{\frac{1}{N}\sum_{i=1}^N X_i-\mu}\geq \epsilon}=0.\]
We can obtain much better bounds than with Chebyshev
Let \(\{X_i\}_{i=1}^N\) be i.i.d. real-valued zero-mean random variables such that \(X_i\in[a_i;b_i]\) with \(a_i<b_i\). Then for all \(\epsilon>0\) \[\P{\abs{\frac{1}{N}\sum_{i=1}^N X_i}\geq\epsilon}\leq 2\exp\left(-\frac{2N^2\epsilon^2}{\sum_{i=1}^N(b_i-a_i)^2}\right).\]