Mathematical Foundations of Machine Learning
Prof. Matthieu Bloch
Monday, December 2, 2024
Last time
- Last class: Monday November 25, 2024
- We talked about maximum likelihood estimation
- Today: We will talk about principal component analysis
- Great way to end the course where we started
- To be effectively prepared for today's class, you should have:
- Gone over the slides and read the associated lecture notes
- Submitted Homework 8
Final exam is coming
- Review all notes and exams: exam is comprehensive
- Frequently asked questions
- How many problems on the midterm?
- How many questions on the midterm?
- What topics will the midterm cover?
- Do you have sample exams?
- Can we do additional work for extra credit?
- Do you give bonuses for participation or for CIOS completion?
- We will be very available for help and review sessions
- Tuesday December 03, 2024 12pm: Anuvab (hybrid)
- Wednesday December 04, 2024 9am: Dr. Bloch (online)
- Wednesday December 04, 2024 11:30am: Jack (online)
Back to first lecture: Kernel PCA
Kin et al., Genome Informatics, (2002)
- tRNA (transfer RNA): plays a key role in the creation of the amino acid sequences of proteins
- G G G G A A T T A G C T C A A G C G G T A G A G C G …
- Challenge: compare, classify, analyze, visualize sequences
- Datasets of tRNA sequences
- Lots happening behind the scenes
- What does it mean to represent the data in 2D?
- How do we measure distances between tRNA sequences?
- We can explain a lot now!
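One way to make the two questions above concrete is a small kernel PCA sketch. It is only an illustration: the k-mer "spectrum" kernel used here is a simple stand-in chosen for brevity, not the kernel used by Kin et al., and the function names `spectrum_features` and `kernel_pca_embedding` as well as the toy sequences are hypothetical.

```python
from itertools import product

import numpy as np


def spectrum_features(seq, k=2, alphabet="ACGT"):
    """Count occurrences of every length-k substring (k-mer) of seq."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: j for j, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for start in range(len(seq) - k + 1):
        counts[index[seq[start:start + k]]] += 1
    return counts


def kernel_pca_embedding(sequences, k=2, n_components=2):
    """Embed sequences in n_components dimensions with kernel PCA."""
    features = np.array([spectrum_features(s, k) for s in sequences])
    gram = features @ features.T                 # K[i, j] = <phi(s_i), phi(s_j)>
    n = gram.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n  # removes the mean in feature space
    gram_centered = centering @ gram @ centering
    eigvals, eigvecs = np.linalg.eigh(gram_centered)   # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]   # indices of the largest ones
    top_vals = np.clip(eigvals[order], 0.0, None)      # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


# Made-up toy sequences for illustration only (not real tRNA data).
toy_sequences = [
    "GGGGAATTAGCTCAAGCGG",
    "GGGGAATTAGCTCAAGCGT",
    "ACACACACACACACACACA",
    "ACACACACACACACACACG",
]
embedding = kernel_pca_embedding(toy_sequences)  # one 2D point per sequence
```

In this sketch, "distance between sequences" means Euclidean distance between k-mer count vectors, and "representing the data in 2D" means keeping the two leading kernel principal components.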
Principal Component Analysis
- Feature extraction method: unsupervised, linear, based on the sum of squared errors
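- Idea is to find an approximation of the data as $x_i \approx \mu + A\theta_i$, where $\mu \in \mathbb{R}^d$, $A \in \mathbb{R}^{d \times k}$, $\theta_i \in \mathbb{R}^k$, and $A$ has orthonormal columns
Principal Component Analysis consists in solving the problem
$$\min_{\mu, A, \{\theta_i\}} \sum_{i=1}^n \|x_i - \mu - A\theta_i\|_2^2 \quad \text{subject to } A^\top A = I$$
- Given $A$, relatively easy to find $\{\theta_i\}$ and $\mu$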
Solving PCA
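Assume that the offset $\mu$ and the basis $A$ (with orthonormal columns) are fixed. Then, the sum of squared errors is minimized by $\theta_i = A^\top (x_i - \mu)$.
Assume $A$ is fixed and $\theta_i = A^\top (x_i - \mu)$. Then, one minimizer is the sample mean $\mu = \frac{1}{n}\sum_{i=1}^n x_i$.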
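The two closed-form statements above can be sanity-checked numerically. The sketch below is a minimal check on synthetic data, not a proof; the names `X`, `A`, `mu_hat`, `Theta_hat`, and `sum_squared_errors` are assumptions introduced for illustration, with data points stored as rows of `X`.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 50, 5, 2
X = rng.normal(size=(n, d)) + 3.0           # rows are the data points x_i

# Any matrix with orthonormal columns works for this check; QR provides one.
A, _ = np.linalg.qr(rng.normal(size=(d, k)))


def sum_squared_errors(X, mu, A, Theta):
    """Sum over i of ||x_i - mu - A theta_i||^2, with theta_i the rows of Theta."""
    residuals = X - mu - Theta @ A.T
    return float(np.sum(residuals ** 2))


mu_hat = X.mean(axis=0)                     # sample mean
Theta_hat = (X - mu_hat) @ A                # row i is A^T (x_i - mu_hat)
best = sum_squared_errors(X, mu_hat, A, Theta_hat)

# Perturbing the coefficients (offset fixed) should not decrease the error.
Theta_other = Theta_hat + 0.1 * rng.normal(size=Theta_hat.shape)
perturbed_theta = sum_squared_errors(X, mu_hat, A, Theta_other)

# Perturbing the offset (coefficients recomputed) should not decrease it either.
mu_other = mu_hat + 0.1 * rng.normal(size=d)
perturbed_mu = sum_squared_errors(X, mu_other, A, (X - mu_other) @ A)
```

The sample mean is not the only optimal offset: shifting it along the column space of `A` leaves the error unchanged, which is why the check uses "not smaller" rather than "strictly larger".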
Solving PCA
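One possible choice of $A$ is
$$A = [u_1, \dots, u_k]$$
where the $u_j$'s are the eigenvectors corresponding to the $k$ largest eigenvalues of
$$S = \sum_{i=1}^n (x_i - \mu)(x_i - \mu)^\top$$
with $\mu$ the sample mean.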
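A minimal NumPy sketch of this eigenvector-based solution is given below, on synthetic data. The function name `pca_eig`, the variable names, and the data are assumptions for illustration; data points are stored as rows of `X`, so the scatter matrix is formed as `centered.T @ centered`.

```python
import numpy as np


def pca_eig(X, k):
    """Return (mu, A, Theta) for data rows X using the top-k eigenvectors of S."""
    mu = X.mean(axis=0)
    centered = X - mu
    S = centered.T @ centered               # sum_i (x_i - mu)(x_i - mu)^T
    eigvals, eigvecs = np.linalg.eigh(S)    # eigenvalues in ascending order
    A = eigvecs[:, ::-1][:, :k]             # eigenvectors of the k largest eigenvalues
    Theta = centered @ A                    # row i is A^T (x_i - mu)
    return mu, A, Theta


rng = np.random.default_rng(1)
# Synthetic data lying close to a 2D subspace of R^4, plus a constant offset.
latent = rng.normal(size=(200, 2)) * np.array([5.0, 2.0])
mixing, _ = np.linalg.qr(rng.normal(size=(4, 2)))
X = latent @ mixing.T + 0.1 * rng.normal(size=(200, 4)) + 1.0

mu, A, Theta = pca_eig(X, k=2)
reconstruction = mu + Theta @ A.T
pca_error = float(np.sum((X - reconstruction) ** 2))

# For comparison: the error obtained with a random orthonormal basis B.
B, _ = np.linalg.qr(rng.normal(size=(4, 2)))
random_error = float(np.sum((X - mu - (X - mu) @ B @ B.T) ** 2))
```

With this choice of `A`, the residual error equals the sum of the discarded (smallest) eigenvalues of the scatter matrix, and no other orthonormal basis of the same size achieves a smaller error.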
- Proof steps
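- Step 1: introduce the eigendecomposition of the scatter matrix $S = U \Lambda U^\top$ and the change of variables $W = U^\top A$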
- Step 2: introduce linear program
- Step 3: solve linear program
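- Connection to SVD $X = U \Sigma V^\top$, where the columns of $X$ are $x_i - \mu$: $S = X X^\top = U \Sigma^2 U^\top$, so the $u_j$'s are the left singular vectors of $X$ associated with the $k$ largest singular values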
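The connection to the SVD can be checked numerically. The sketch below assumes a matrix `X` whose columns are the centered data points and compares the eigendecomposition of `X @ X.T` with the SVD of `X`; all names and the synthetic data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 100, 6, 3
scales = np.diag([4.0, 3.0, 2.0, 1.0, 0.5, 0.25])
data = rng.normal(size=(n, d)) @ scales     # rows are the data points x_i
mu = data.mean(axis=0)
X = (data - mu).T                           # d x n, columns are x_i - mu

# Route 1: eigenvectors of the scatter matrix S = X X^T.
S = X @ X.T
eigvals, eigvecs = np.linalg.eigh(S)        # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # reorder to descending

# Route 2: left singular vectors of X (singular values come out descending).
U, singular_values, Vt = np.linalg.svd(X, full_matrices=False)

A_eig = eigvecs[:, :k]
A_svd = U[:, :k]

# Eigenvectors are only defined up to sign, so compare the projectors A A^T.
same_subspace = np.allclose(A_eig @ A_eig.T, A_svd @ A_svd.T)
same_spectrum = np.allclose(eigvals, singular_values ** 2)
```

In practice the SVD route avoids forming `X @ X.T` explicitly, which is usually preferable numerically.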

