Mathematical Foundations of Machine Learning
Prof. Matthieu Bloch
Monday August 19, 2024 (v1.1)
Don't Panic
- If you are not officially enrolled in the class:
- Ask for a permit as appropriate here
- There will be lots of movement in the waitlist - historically,
everyone who wanted a seat in the course got it
- Let me know on Thursday August 22, 2024 in the evening if you're
still unable to register
- If you are officially enrolled in the class:
- As a courtesy to others, decide whether to stay or not by the evening
of Wednesday August 21, 2024
Motivating example: Kernel PCA
- tRNA (transfer RNA): plays a key role in the
creation of the amino acid sequences of proteins (source)
- G G G G A A T T A G C T C A A G C G G T A G A G C G …
- Challenge: compare, classify, analyze, visualize
sequences
- Example: Kernel Principal Component Analysis
- Datasets of tRNA sequences \(\set{\bfx_i}_{i=1}^n\)
- Lots happening behind the scenes
Sequences \(\bfx\in\calS\)
embedded in Hilbert space \(\calH\)
(dimension \(N\gg 1\)) \[
\Phi:\calS\to\calH:\bfx\mapsto\Phi(\bfx)
\]
Approximate the embedded sequences in a low-dimensional (\(d\)-dimensional) subspace
\[\argmin_{\bfmu,\bfA,\bftheta_i}\sum_{i=1}^n\norm[2]{\Phi(\bfx_i)-\bfmu-\bfA\bftheta_i}^2
\text{ with }\bfA\in\bbR^{N\times d}\]
- How do we choose \(\Phi\)? What is
the computational complexity? How do we find \(\bfA\), \(\bfmu\), \(\bftheta_i\)?
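A minimal sketch of how these questions can be answered in practice, under stated assumptions. The slides do not say which \(\Phi\) is used; here we assume a "spectrum" feature map that counts length-\(k\) substrings, so \(N=4^k\). The sequences are toy strings, not real tRNA data, and the names `spectrum_features` and `kernel_pca` are hypothetical. For the least-squares problem above, a standard solution takes \(\bfmu\) to be the mean of the \(\Phi(\bfx_i)\), the columns of \(\bfA\) to be the top \(d\) principal directions, and \(\bftheta_i=\bfA^\top(\Phi(\bfx_i)-\bfmu)\). The coordinates \(\bftheta_i\) can be computed from the \(n\times n\) Gram matrix alone, without ever forming \(\bfA\).

```python
import itertools
import numpy as np

def spectrum_features(seq, k=2, alphabet="ACGT"):
    """Assumed feature map Phi(x): counts of every length-k substring (k-mer)."""
    kmers = ["".join(p) for p in itertools.product(alphabet, repeat=k)]
    index = {kmer: j for j, kmer in enumerate(kmers)}
    phi = np.zeros(len(kmers))
    for start in range(len(seq) - k + 1):
        phi[index[seq[start:start + k]]] += 1
    return phi

def kernel_pca(K, d):
    """Return the n x d matrix whose rows are theta_i, given the n x n Gram matrix K."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix: subtracts mu implicitly
    Kc = J @ K @ J                           # Gram matrix of the centered features
    eigvals, eigvecs = np.linalg.eigh(Kc)    # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:d]      # indices of the d largest eigenvalues
    lam = np.clip(eigvals[top], 0.0, None)   # guard against tiny negative round-off
    return eigvecs[:, top] * np.sqrt(lam)

# Toy sequences (illustrative only).
sequences = [
    "GGGGAATTAGCTCAAGCGG",
    "GGGGAATTAGCTCAAGTGG",
    "ACACACACACACACACACA",
    "TTTTGGGGTTTTGGGGTTT",
    "GCGGATTTAGCTCAGTTGG",
]
Phi = np.stack([spectrum_features(s) for s in sequences])  # n x N with N = 4**k
K = Phi @ Phi.T                                            # K[i, j] = <Phi(x_i), Phi(x_j)>
theta = kernel_pca(K, d=2)                                 # one 2-D point per sequence
```

In this sketch the eigendecomposition costs on the order of \(n^3\) operations regardless of \(N\), which is why working with the Gram matrix is attractive when \(N\gg n\). Each row of `theta` can be plotted directly to visualize the sequences.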
Mathematical Foundations of Machine Learning
Representations
- How do we represent signals and operators for data analysis?
- Linear models (and why they matter so much)
Models
- How do we model datasets?
Estimation
- How do we estimate parameters?
Computing
- How do we run algorithms for machine learning?
- Optimization (gradient descent)
What to expect in ECE 7750
ECE 7750 is about the mathematical foundations
of machine learning
- We will talk a lot about probability and linear
algebra
- We will prove a lot of things formally (theorems,
lemmas)
- We will not develop cool apps based on Deep Neural
Nets
- Exams and homework will have theoretical components
All that being said…
- We will also use simulations to understand concepts
- We will talk about machine learning
- Homework will have an experimental component (Python required)
- ECE 7750 is a fun course and you will learn a lot of
useful concepts
If you're unsure about taking the class, the
self-assessment is here to help!
ECE 7750 will give you a solid background to self-study or take
other ML courses at GT
Logistics
- Class time and venue: Monday and Wednesday
3:30pm-4:45pm
- In-person live course
- Synchronous online lecture recorded for asynchronous viewing (DL and
on-campus)
- Instructor: Prof. Matthieu Bloch
- Email: matthieu.bloch@ece.gatech.edu
- Office: TSRB 437 (appointments only)
- Office hours: to be announced (mixed
online/on-campus)
- Teaching assistants: Jack Hill
Knack tutoring
- Students looking for additional assistance outside of the classroom
are advised to consider working with a peer tutor through Knack.
- Georgia Tech has partnered with Knack to provide students access to
verified peer tutors who have previously aced this course.
- This is a pilot program; I have never used Knack myself
Electronic communication policy
- General guidelines
- Email the Dean of Students if your personal
situation requires special academic consideration
- Use Piazza for technical questions
- You can be anonymous to your peers, not to the
instructors
- You can use \(\LaTeX\) (\(\min_\beta\Vert y-X\beta\Vert_2^2\))
- Be courteous in your electronic interactions
- Avoid judgmental language, e.g., "The answer is
obvious", "This is trivial."
- Try to be constructive
- Avoid typos and use correct syntax
- If you really have to email me
- Include `[ECE 7750]` in the subject of the email
- I am usually reasonably fast
- If you email the TAs, cc me
Pop Quiz: Question 1
- You are not feeling well and you cannot turn in your
homework on time. Who do you contact?
- (a) The Dean
- (b) The Dean of Students
- (c) The School Chair
- (d) The Instructor
Pop Quiz: Question 2
- You are facing personal challenges that may affect your
semester. Who do you contact?
- (a) The Dean
- (b) The Dean of Students
- (c) The School Chair
- (d) The Instructor
Pop Quiz: Question 3
- You are not happy with your midterm grade.
Who do you contact?
- (a) The Dean
- (b) The Dean of Students
- (c) The School Chair
- (d) The Instructor
Writing
- Be extra careful with written communication
- Proper greeting and proper closing
- 3 lines
- Clear asks
Grading
- Self-assessment and assignments (50%)
- Self-assessment is here to help you decide whether
ECE 7750 is right for you
- Review of concepts from calculus, linear algebra, probability
theory, and programming.
- Open-book/internet test
- 2% of 50% for submission
- Assignments
- Due approximately every 10 days (about 8 assignments, subject to
updates).
- Both mathematical and programming problems.
- You are encouraged to typeset in \(\LaTeX\), but do not waste time.
- Allocate time to submit on Gradescope
- The maximum number of homework points that you can earn is \((N-1)\times 100\), where \(N\) is the number of assignments
- Midterm exam (25%)
- Wednesday October 9, 2024 3:30pm-4:45pm in class
- Final exam (25%)
- Friday December 6, 2024 2:40pm-5:30pm
Assignments policy
- Two-stage deadline policy
- Soft official deadline with 2% bonus (conditions
apply, read the fine print)
- Hard deadline 48 hours after the soft deadline; no late
homework accepted after the hard deadline
- Abide by the Georgia Tech honor code
- Reference all your sources
- Do not plagiarize other sources (Python code, homework solutions,
etc.)
- Do not upload course material on other websites
- Use of Generative AI without acknowledgment is plagiarism
- When in doubt regarding what constitutes plagiarism,
ask!
- Assignments are individual, but light collaboration is permitted and
encouraged
- Piazza is here for that purpose
- Small study groups are ok
Final thoughts
- I believe in accountability,
integrity, and fairness
- I will hold you to the same standards
- I trust you by default - trust is easily lost, not
easily regained
- Don't be shy and don't hesitate to talk to me!
- I don't negotiate grades, but I'm here to help you learn
- I value your feedback - I use it to make improvements
- To be effective in ECE 7750 you should:
- Come to class and leave all distractions behind
- Be disciplined and complete reading and writing assignments on
schedule
- Come to office hours if you have questions
- Enjoy the learning process, including the necessary struggles with
the assignments