Dr. Matthieu R Bloch
Tuesday August 30, 2022
In most engineering problems, \(\vecy = \matH\vecx\) has no solution because of (unknown) noise! \[ \vecy = \matH\vecx + \vecv \]
We need to introduce a criterion to single out the best approximate solution
The least squares solution is \(\hat{\vecx}\eqdef\argmin_{\vecu\in\bbR^n}\norm[2]{\vecy-\matH\vecu}^2\)
One can consider other norms (\(\norm[1]{\cdot}\), for instance), but \(\norm[2]{\cdot}\) has appealing analytical properties
It will be convenient to introduce the cost function \(J(\vecx)\eqdef \norm[2]{\vecy-\matH\vecx}^2\)
A vector \(\vecx_0\) is a minimizer of \(J(\cdot)\) if and only if it satisfies the consistent normal equations \[ \matH^T\matH\vecx_0 = \matH^T\vecy \] The resulting unique minimum value is \(J(\vecx_0)=\norm[2]{\vecy}^2-\norm[2]{\matH\vecx_0}^2\)
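A minimal numerical sketch of this result (assuming NumPy; the matrix \(\matH\) and vector \(\vecy\) below are randomly generated, purely for illustration), checking that the normal equations recover the least squares solution and that the minimum value matches \(\norm[2]{\vecy}^2-\norm[2]{\matH\vecx_0}^2\):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 5                      # tall system: more equations than unknowns
H = rng.standard_normal((m, n))   # hypothetical full-column-rank H
y = rng.standard_normal(m)        # hypothetical observation vector

# Solve the normal equations H^T H x0 = H^T y
x0 = np.linalg.solve(H.T @ H, H.T @ y)

# Cross-check against the library least squares solver
x_lstsq, *_ = np.linalg.lstsq(H, y, rcond=None)
assert np.allclose(x0, x_lstsq)

# Verify J(x0) = ||y||^2 - ||H x0||^2
J = np.linalg.norm(y - H @ x0) ** 2
assert np.isclose(J, np.linalg.norm(y) ** 2 - np.linalg.norm(H @ x0) ** 2)
```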
Homework problem: \(\text{Im}(\matH^T\matH)=\text{Im}(\matH^T)\)
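A quick numerical sanity check (not a proof, and no substitute for the homework; the rank-deficient \(\matH\) below is a hypothetical example) is consistent with this identity:

```python
import numpy as np

rng = np.random.default_rng(1)
# Deliberately rank-deficient H (rank at most 3) to make the check non-trivial
H = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 5))
# Matching ranks are consistent with Im(H^T H) = Im(H^T),
# since the inclusion Im(H^T H) ⊆ Im(H^T) always holds
assert np.linalg.matrix_rank(H.T @ H) == np.linalg.matrix_rank(H.T)
```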
When \(\matH\) has full column rank, note that \(\hat{\vecy}=\matH\hat{\vecx}=\matH(\matH^T\matH)^{-1}\matH^T\vecy\)
What can we say when \(\matH\) does not have full column rank? More soon!
Recall two results that deserve a geometric explanation: \(J(\vecx_0)\eqdef\norm[2]{\vecy-\matH\vecx_0}^2=\norm[2]{\vecy}^2-\norm[2]{\matH\vecx_0}^2\) and \(\matH^T(\matH\vecx_0-\vecy)=\mathbf{0}\)
Let \(\calW\) be a linear subspace of \(\calV\subset\bbR^n\). An orthogonal projection of \(\vecy\in\calV\) onto \(\calW\) is \(\hat{\vecy}\in\calW\) such that \(\vecy-\hat{\vecy}\in\calW^\perp\).
The orthogonal projection exists and is unique
Let \(\calW\) be a linear subspace of \(\calV\subset\bbR^n\), let \(\vecy\in\calV\), and let \(\hat{\vecy}\) be the orthogonal projection of \(\vecy\) onto \(\calW\).
Then \(\forall \vecz\in\calW\), \(\norm[2]{\vecy-\hat{\vecy}}^2\leq\norm[2]{\vecy-\vecz}^2\).
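The proof is a one-line Pythagorean argument: since \(\vecy-\hat{\vecy}\in\calW^\perp\) and \(\hat{\vecy}-\vecz\in\calW\), the cross term vanishes, so \[ \norm[2]{\vecy-\vecz}^2 = \norm[2]{(\vecy-\hat{\vecy})+(\hat{\vecy}-\vecz)}^2 = \norm[2]{\vecy-\hat{\vecy}}^2 + \norm[2]{\hat{\vecy}-\vecz}^2 \geq \norm[2]{\vecy-\hat{\vecy}}^2 \]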
This explains the normal equations very intuitively!
When \(\matH\) has full column rank, the matrix \(P_\matH\eqdef \matH(\matH^T\matH)^{-1}\matH^T\) is the orthogonal projection matrix onto \(\text{Im}(\matH)\)
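A small numerical sketch (again with a random, hypothetical full-column-rank \(\matH\)) illustrating the defining properties of \(P_\matH\): it is symmetric, idempotent, fixes \(\text{Im}(\matH)\), and produces residuals orthogonal to \(\text{Im}(\matH)\):

```python
import numpy as np

rng = np.random.default_rng(2)
H = rng.standard_normal((20, 5))          # hypothetical full-column-rank H
P = H @ np.linalg.solve(H.T @ H, H.T)     # P_H = H (H^T H)^{-1} H^T

assert np.allclose(P, P.T)                # symmetric
assert np.allclose(P @ P, P)              # idempotent
assert np.allclose(P @ H, H)              # fixes every column of H, i.e. Im(H)

y = rng.standard_normal(20)
y_hat = P @ y                             # orthogonal projection of y onto Im(H)
assert np.allclose(H.T @ (y - y_hat), 0)  # residual orthogonal to Im(H)
```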
Consider a matrix \(\matH\in\bbR^{m\times n}\) with full column rank and \(m\gg n\)
Let \(\hat{\vecx}^{(n)}\) be the least squares solution of \(\vecy\approx \matH\vecx\)
Assume we get one more input (preserving full rank) so that we want to solve \[ \vecy\approx \left[\begin{array}{cc}\matH &\vech_{n+1}\end{array}\right] \left[\begin{array}{c}\vecx\\x_{n+1}\end{array}\right] \]
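As a sketch of what this order update looks like numerically (before deriving any efficient recursion; all names and dimensions below are illustrative), one can simply append the new column and re-solve:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 50, 4
H = rng.standard_normal((m, n))
h_new = rng.standard_normal((m, 1))       # the extra column h_{n+1}
y = rng.standard_normal(m)

x_n, *_ = np.linalg.lstsq(H, y, rcond=None)           # \hat{x}^{(n)}
H_aug = np.hstack([H, h_new])                          # [H  h_{n+1}]
x_n1, *_ = np.linalg.lstsq(H_aug, y, rcond=None)       # \hat{x}^{(n+1)}

# The augmented fit can only reduce the residual:
# Im(H) is a subspace of Im([H h_{n+1}]), so the projection gets closer to y
r_n  = np.linalg.norm(y - H @ x_n) ** 2
r_n1 = np.linalg.norm(y - H_aug @ x_n1) ** 2
assert r_n1 <= r_n + 1e-12
```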
There are many variations on the least squares minimization problem
Weighted least squares \[ J(\vecx)\eqdef \norm[\matW]{\vecy-\matH\vecx}^2\eqdef (\vecy-\matH\vecx)^T\matW(\vecy-\matH\vecx) \] for some symmetric positive definite matrix \(\matW\)
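Setting the gradient of this \(J\) to zero gives weighted normal equations \(\matH^T\matW\matH\,\vecx = \matH^T\matW\vecy\); a minimal sketch (random, hypothetical \(\matH\), \(\vecy\), and \(\matW\succ 0\)), which also checks the classical reduction to ordinary least squares through a Cholesky factor of \(\matW\):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 20, 5
H = rng.standard_normal((m, n))
y = rng.standard_normal(m)
A = rng.standard_normal((m, m))
W = A @ A.T + m * np.eye(m)               # hypothetical symmetric positive definite weight

# Weighted normal equations: H^T W H x = H^T W y
x_w = np.linalg.solve(H.T @ W @ H, H.T @ W @ y)

# Equivalent ordinary least squares problem via W = L L^T (L lower triangular):
# (y - Hx)^T W (y - Hx) = ||L^T y - L^T H x||^2
L = np.linalg.cholesky(W)
x_ols, *_ = np.linalg.lstsq(L.T @ H, L.T @ y, rcond=None)
assert np.allclose(x_w, x_ols)
```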
Regularized least squares \[ J(\vecx)\eqdef (\vecy-\matH\vecx)^T\matW(\vecy-\matH\vecx) + (\vecx-\vecx_0)^T\mathbf{\Pi}(\vecx-\vecx_0) \] for some symmetric positive definite matrices \(\matW\) and \(\mathbf{\Pi}\)
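Setting the gradient to zero here gives \((\matH^T\matW\matH+\mathbf{\Pi})\hat{\vecx} = \matH^T\matW\vecy+\mathbf{\Pi}\vecx_0\); a minimal sketch (all matrices random and hypothetical), which recovers ridge regression when \(\matW\) is the identity, \(\mathbf{\Pi}=\lambda \mathbf{I}\), and \(\vecx_0=\mathbf{0}\):

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 20, 5
H = rng.standard_normal((m, n))
y = rng.standard_normal(m)
x0 = rng.standard_normal(n)               # hypothetical prior guess x_0
W = np.eye(m)                             # weight matrix (identity for simplicity)
Pi = 0.1 * np.eye(n)                      # hypothetical regularizer Π

# Minimizer of (y - Hx)^T W (y - Hx) + (x - x0)^T Π (x - x0)
x_hat = np.linalg.solve(H.T @ W @ H + Pi, H.T @ W @ y + Pi @ x0)

# Sanity check: the gradient vanishes at x_hat
grad = -2 * H.T @ W @ (y - H @ x_hat) + 2 * Pi @ (x_hat - x0)
assert np.allclose(grad, 0)
```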