Deterministic Least Squares

Dr. Matthieu R Bloch

Tuesday, August 23, 2022

Today in ECE 6555

  • Announcements
    1. Read the syllabus (for real!) and ask for clarifications as needed
    2. Register on Piazza
    3. Check Gradescope access
    4. Check out the self-assessment (ungraded, see Canvas later today)
    5. Mathematics of ECE workshops (more on this soon)
    6. Buy a notebook!
  • Today’s plan
    • Review linear algebra
    • Build up to deterministic least squares
    • Why? This will provide plenty of intuition when we move to stochastic least squares
    • Talk about linear algebra (and review as needed)
  • Questions?

Systems of linear equations

  • Many engineering problems reduce to solving a system of linear equations \[ \mathbf{y} = \matH \mathbf{x} \]

    • \(\matH\in\bbR^{m\times n}\), vectors and matrices are real valued (complex-valued explored in homework)
    • Try to make a distinction between \(\mathbf{y}\) (vector) and \(y\) (scalar)
  • The system may have a unique solution, multiple solutions, or none: what determines this?

    • If \(m>n\) (resp. \(m<n\)) the system is overdetermined (resp. underdetermined)
    • If \(m=n\) the system is square
  • Rewrite \(\matH\) in terms of its columns \(\set{\vech_i}_{i=0}^{n-1}\) \[ \matH\eqdef \left[\begin{array}{ccc}|&&|\\\vech_0&\cdots &\vech_{n-1}\\|&&|\end{array}\right] \]

  • When does the system \(\mathbf{y} = \matH \mathbf{x}\) have a solution? (i.e., when is the system consistent?)

  • The system has a solution if and only if \(\mathbf{y}\) is a linear combination of the columns \(\set{\vech_i}_{i=0}^{n-1}\)
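The consistency criterion can be checked numerically: \(\mathbf{y}\) is a linear combination of the columns of \(\matH\) iff appending \(\mathbf{y}\) to \(\matH\) does not increase the rank. A small NumPy sketch (the matrices below are made up for illustration):

```python
import numpy as np

# y = H x is consistent iff rank([H | y]) == rank(H), i.e. y lies in
# the column space of H.  Illustrative overdetermined example (m=3 > n=2).
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y_in = H @ np.array([2.0, -1.0])    # a linear combination of the columns
y_out = np.array([1.0, 0.0, 0.0])   # not in the column space of H

def is_consistent(H, y):
    augmented = np.column_stack([H, y])
    return np.linalg.matrix_rank(augmented) == np.linalg.matrix_rank(H)

print(is_consistent(H, y_in))   # True
print(is_consistent(H, y_out))  # False
```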

Understanding solutions

    • The image (aka column space or range) of \(\matH\) is \(\text{Im}(\matH)\eqdef\set{\matH\bfx:\bfx\in\bbR^n}\)
    • The kernel (aka null space) of \(\matH\) is \(\text{Ker}(\matH)\eqdef\set{\bfx\in\bbR^n:\matH\bfx=\mathbf{0}}\)
  • These sets are vector subspaces… we’ll review vector spaces very quickly
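Orthonormal bases for \(\text{Im}(\matH)\) and \(\text{Ker}(\matH)\) can be read off the singular value decomposition; a hedged NumPy sketch with an illustrative rank-deficient matrix:

```python
import numpy as np

# With H = U S V^T: the columns of U matching nonzero singular values span
# Im(H); the columns of V matching zero singular values span Ker(H).
H = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])      # rank 1: the rows are proportional
U, s, Vt = np.linalg.svd(H)
tol = 1e-10
r = int(np.sum(s > tol))             # numerical rank
im_basis = U[:, :r]                  # basis of Im(H), a subspace of R^m
ker_basis = Vt[r:, :].T              # basis of Ker(H), a subspace of R^n
print(r)                             # 1
print(np.allclose(H @ ker_basis, 0)) # True: kernel vectors map to 0
```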

  • A vector space \(\calV\) over the field \(\bbR\) consists of a set \(\calV\) of vectors, a closed addition rule \(+\) and a closed scalar multiplication \(\cdot\) such that the following 8 axioms are satisfied:
    1. \(\forall x,y\in\calV\) \(x+y=y+x\) (commutativity)
    2. \(\forall x,y,z\in\calV\) \(x+(y+z)=(x+y)+z\) (associativity)
    3. \(\exists 0\in\calV\) such that \(\forall x\in\calV\) \(x+0=x\) (identity element)
    4. \(\forall x\in\calV\) \(\exists y\in\calV\) such that \(x+y=0\) (inverse element)
    5. \(\forall x\in\calV\) \(1\cdot x= x\)
    6. \(\forall \alpha, \beta\in\bbR\) \(\forall x\in\calV\) \(\alpha\cdot(\beta\cdot x)=(\alpha\beta)\cdot x\) (associativity)
    7. \(\forall \alpha, \beta\in\bbR\) \(\forall x\in\calV\) \((\alpha+\beta)x = \alpha x+\beta x\) (distributivity)
    8. \(\forall \alpha\in\bbR\) \(\forall x,y\in\calV\) \(\alpha(x+y) = \alpha x+\alpha y\) (distributivity)

Vector spaces

    • \(0\in\calV\) is unique
    • Every \(x\in\calV\) has a unique inverse
    • \(0\cdot x = 0\)
    • The inverse of \(x\in\calV\) is \((-1)\cdot x\eqdef -x\)
  • A subset \(\calW\) of a vector space \(\calV\) is a vector subspace if \(\forall x,y\in\calW\forall \lambda,\mu\in\bbR\) \(\lambda x+\mu y \in\calW\)

  • If \(\calW\) is a vector subspace of a vector space \(\calV\), \(\calW\) is a vector space.

  • \(\text{Im}(\matH)\) and \(\text{Ker}(\matH)\) are vector subspaces (of \(\bbR^m\) and \(\bbR^n\), respectively)

  • Homework problem: prove that the set of solutions is of the form \(\vecx_0+\text{Ker}(\matH)\eqdef\set{\vecx_0+\vecz:\vecz\in\text{Ker}(\matH)}\) for any particular solution \(\vecx_0\)
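A numerical sanity check of this claim (not a proof): two different solutions of a consistent underdetermined system differ by a kernel element. The matrix below is made up for illustration:

```python
import numpy as np

# Underdetermined system (m=2 < n=3): construct one solution x0 by hand,
# obtain another (the minimum-norm one) via least squares, and verify that
# their difference lies in Ker(H).
H = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
x0 = np.array([1.0, 2.0, 3.0])       # a particular solution, by construction
y = H @ x0
x1 = np.linalg.lstsq(H, y, rcond=None)[0]   # another particular solution
print(np.allclose(H @ x1, y))        # True: x1 also solves the system
print(np.allclose(H @ (x1 - x0), 0)) # True: x1 - x0 is in Ker(H)
```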

Linear combinations and span

  • Let \(\set{v_i}_{i=1}^n\) be a set of vectors in a vector space \(\calV\).

  • For \(\set{a_i}_{i=1}^n\in\bbR^n\), \(\sum_{i=1}^na_iv_i\) is called a linear combination of the vectors \(\set{v_i}_{i=1}^n\).

  • The span of the vectors \(\set{v_i}_{i=1}^n\) is the set \[ \text{span}(\set{v_i}_{i=1}^n)\eqdef \{\sum_{i=1}^na_iv_i:\set{a_i}_{i=1}^n\in\bbR^n\} \]

  • The span of the vectors \(\set{v_i}_{i=1}^n\in\calV^n\) is a vector subspace of \(\calV\).

Linear independence

  • Let \(\set{v_i}_{i=1}^n\) be a set of vectors in a vector space \(\calV\)

  • \(\set{v_i}_{i=1}^n\) is linearly independent (or the vectors \(\set{v_i}_{i=1}^n\) are linearly independent ) if (and only if) \[ \sum_{i=1}^na_iv_i = 0\Rightarrow \forall i\in\intseq{1}{n}\,a_i=0 \] Otherwise the set is (or the vectors are) linearly dependent.

    • \(\matH\) has column rank \(r_c\) if \(r_c\) is the maximal number of linearly independent columns
    • \(\matH\) has row rank \(r_r\) if \(r_r\) is the maximal number of linearly independent rows
    • The row rank and the column rank of a matrix \(\matH\) are equal
  • The inverse of a matrix \(\matH\) can only exist if \(\matH\) is square and full rank
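The equality of row rank and column rank is easy to check numerically; the low-rank matrix below is constructed for illustration:

```python
import numpy as np

# Build a 5x7 matrix of rank (at most) 2 as an outer product of factors,
# then verify that H and H^T report the same rank.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))
B = rng.standard_normal((2, 7))
H = A @ B                            # rank 2 (almost surely)
print(np.linalg.matrix_rank(H))      # 2
print(np.linalg.matrix_rank(H.T))    # 2: row rank equals column rank
```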

Bases

  • A basis of a vector subspace \(\calW\) of a vector space \(\calV\) is a countable set of vectors \(\calB\) such that:
    1. \(\text{span}(\calB)=\calW\)
    2. \(\calB\) is linearly independent
  • You should be somewhat familiar with this in \(\bbR^n\); there are lots of nice features
    • every subspace has a basis
    • every basis for a subspace has the same number of elements
    • the number of elements in a basis is called the dimension
    • the representation of a vector on a basis is unique
    • having a basis reduces the operations on vectors to operations on their components.
  • Things sort of work in infinite dimensions, but we have to be a bit more careful
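The uniqueness of the representation on a basis is concrete in \(\bbR^n\): the coordinates are the unique solution of a square invertible system. A small sketch with an illustrative basis:

```python
import numpy as np

# Columns of B form a basis of R^2; the coordinates of v in that basis
# are the unique solution of B a = v.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])
v = np.array([3.0, 2.0])
a = np.linalg.solve(B, v)            # unique coordinate vector
print(a)                             # [1. 2.]: v = 1*b0 + 2*b1
```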

Inner product and norm

  • An inner product space over \(\bbR\) is a vector space \(\calV\) equipped with a positive definite symmetric bilinear form \(\dotp{\cdot}{\cdot}:\calV\times\calV\to\bbR\) called an inner product

  • An inner product space is also called a pre-Hilbert space

  • An inner product satisfies the Cauchy–Schwarz inequality: \(\forall x,y\in\calV\) \(\dotp{x}{y}^2\leq\dotp{x}{x}\dotp{y}{y}\)
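A quick numerical check of this inequality for the Euclidean inner product, including the equality case for linearly dependent vectors (random vectors here are purely illustrative):

```python
import numpy as np

# <x,y>^2 <= <x,x><y,y> for the Euclidean inner product, with equality
# when the vectors are linearly dependent.
rng = np.random.default_rng(2)
x = rng.standard_normal(4)
y = rng.standard_normal(4)
print((x @ y)**2 <= (x @ x) * (y @ y))          # True
z = 2.0 * x                                      # linearly dependent with x
print(np.isclose((x @ z)**2, (x @ x) * (z @ z))) # True: equality case
```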

  • A norm on a vector space \(\calV\) over \(\bbR\) is a function \(\norm{\cdot}:\calV\to\bbR\) that satisfies:
    • Positive definiteness: \(\forall x\in\calV\) \(\norm{x}\geq 0\) with equality iff \(x=0\)
    • Homogeneity: \(\forall x\in\calV\) \(\forall\alpha\in\bbR\) \(\norm{\alpha x}=\abs{\alpha}\norm{x}\)
    • Subadditivity: \(\forall x,y\in\calV\) \(\norm{x+y}\leq \norm{x}+\norm{y}\)
  • \[\bfx\in\bbR^d\qquad\norm[0]{\bfx}\eqdef\card{\set{i:x_i\neq 0}}\quad\norm[1]{\bfx}\eqdef\sum_{i=1}^d\abs{x_i}\quad \norm[2]{\bfx}\eqdef\sqrt{\sum_{i=1}^d x_i^2}\]
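These three quantities for a concrete vector (note the "\(\norm[0]{\cdot}\)" count of nonzeros fails homogeneity, so it is not a true norm despite the notation):

```python
import numpy as np

# The three quantities above for x = (3, 0, -4).
x = np.array([3.0, 0.0, -4.0])
norm0 = np.count_nonzero(x)          # |{i : x_i != 0}| = 2
norm1 = np.sum(np.abs(x))            # 3 + 4 = 7
norm2 = np.sqrt(np.sum(x**2))        # sqrt(9 + 16) = 5
print(norm0, norm1, norm2)           # 2 7.0 5.0
```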

Induced norm

  • In an inner product space, an inner product induces a norm \(\norm{x} \eqdef \sqrt{\dotp{x}{x}}\)

  • A norm \(\norm{\cdot}\) is induced by an inner product on \(\calV\) iff it satisfies the parallelogram law: \(\forall x,y\in\calV\) \(\norm{x}^2+\norm{y}^2 = \frac{1}{2}\left(\norm{x+y}^2+\norm{x-y}^2\right)\)

    If this is the case, the inner product is given by the polarization identity \[\dotp{x}{y}=\frac{1}{2}\left(\norm{x}^2+\norm{y}^2-\norm{x-y}^2\right)\]
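Both identities are easy to verify numerically for the Euclidean inner product on \(\bbR^3\) (random vectors used purely as a sanity check):

```python
import numpy as np

# Check the parallelogram law and the polarization identity for the
# Euclidean norm/inner product on R^3.
rng = np.random.default_rng(1)
x = rng.standard_normal(3)
y = rng.standard_normal(3)
nx2, ny2 = x @ x, y @ y
# Parallelogram law: ||x||^2 + ||y||^2 = (||x+y||^2 + ||x-y||^2) / 2
print(np.isclose(nx2 + ny2, 0.5 * ((x + y) @ (x + y) + (x - y) @ (x - y))))
# Polarization: <x,y> = (||x||^2 + ||y||^2 - ||x-y||^2) / 2
print(np.isclose(x @ y, 0.5 * (nx2 + ny2 - (x - y) @ (x - y))))
```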

Orthogonality

  • Two vectors \(x,y\in\calV\) are orthogonal if \(\dotp{x}{y}=0\). We write \(x\perp y\) for simplicity.

    A vector \(x\in\calV\) is orthogonal to a set \(\calS\subset\calV\) if \(\forall s\in\calS\) \(\dotp{x}{s}=0\). We write \(x\perp \calS\) for simplicity.

  • If \(x\perp y\) then \(\norm{x+y}^2=\norm{x}^2+\norm{y}^2\)

  • The orthogonal complement of a vector subspace \(\calW\subset\calV\subseteq\bbR^n\) is \[ \calW^\perp\eqdef \set{x\in\calV: \forall y\in\calW,\ \dotp{x}{y}=0} \]

  • \[ \text{Ker}(\matH) = \text{Im}(\matH^\intercal)^\perp\quad\text{Im}(\matH) = \text{Ker}(\matH^\intercal)^\perp \]
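A numerical illustration of the first identity with a made-up matrix: a kernel basis computed from the SVD is orthogonal to every row of \(\matH\) (the rows span the row space \(\text{Im}(\matH^\intercal)\)):

```python
import numpy as np

# H k = 0 says exactly that k is orthogonal to each row of H, i.e.
# Ker(H) is the orthogonal complement of the row space.
H = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])      # rank 2, so dim Ker(H) = 3 - 2 = 1
_, s, Vt = np.linalg.svd(H)
r = int(np.sum(s > 1e-10))
ker = Vt[r:, :].T                    # basis of Ker(H) from the SVD
print(np.allclose(H @ ker, 0))       # True: each row of H is orthogonal to ker
```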

Direct sums

  • Consider vector subspaces \(\calU,\calV,\calW\) of \(\bbR^n\). Then \(\calW=\calU\oplus\calV\) iff for every \(w\in\calW\) there exists a unique pair \((u,v)\in\calU\times\calV\) such that \(w=u+v\)

  • \[ \text{Ker}(\matH)\oplus\text{Im}(\matH^\intercal) = \bbR^n\qquad \text{Ker}(\matH^\intercal)\oplus\text{Im}(\matH) = \bbR^m \]
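The first decomposition can be computed explicitly: \(\matH^+\matH\) (with \(\matH^+\) the pseudo-inverse) is the orthogonal projector onto the row space, so any \(\bfx\in\bbR^n\) splits into a row-space component and a kernel component. A hedged sketch with an illustrative matrix:

```python
import numpy as np

# Decompose x in R^3 as x = x_row + x_ker with x_row in Im(H^T) and
# x_ker in Ker(H), using the projector P = pinv(H) @ H onto the row space.
H = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])
x = np.array([1.0, -1.0, 2.0])
P = np.linalg.pinv(H) @ H            # orthogonal projector onto Im(H^T)
x_row = P @ x                        # component in the row space
x_ker = x - x_row                    # component in the kernel
print(np.allclose(H @ x_ker, 0))     # True: x_ker is in Ker(H)
print(np.isclose(x_row @ x_ker, 0))  # True: the two components are orthogonal
```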