Dr. Matthieu R Bloch

Monday August 23, 2021

**tRNA (transfer RNA):** plays a key role in building the amino acid sequence of proteins

`GGGGAATTAGCTCAAGCGGTAGAGCGCTCCCTTAGCATGCGAGAGGTAGCGGGATCGACGCCCCCATTCTCTA`

- (source: http://lowelab.ucsc.edu/GtRNAdb/Hsapi19/hg19-tRNAs.fa)

**Challenge:** compare, classify, analyze, visualize sequences

- Example: Kernel Principal Component Analysis to identify clusters
- Datasets of tRNA sequences \(\set{\bfx_i}_{i=1}^n\)
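
A minimal sketch of the kernel PCA example, under assumptions the slides do not state: the kernel is taken to be a 3-mer count ("spectrum") kernel, and the helper names `spectrum_features` and `kernel_pca` are hypothetical. The first toy sequence is a prefix of the tRNA sequence above; the other three are made up for illustration.

```python
import numpy as np
from itertools import product

def spectrum_features(seq, k=3, alphabet="ACGT"):
    """Count occurrences of every length-k substring (k-mer) of seq."""
    index = {"".join(p): j for j, p in enumerate(product(alphabet, repeat=k))}
    counts = np.zeros(len(index))
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        if kmer in index:  # ignore k-mers with symbols outside the alphabet
            counts[index[kmer]] += 1
    return counts

def kernel_pca(K, d=2):
    """Coordinates of n points on the top-d kernel principal components,
    computed from the n x n Gram matrix K only."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = J @ K @ J                           # Gram matrix of centered features
    eigvals, eigvecs = np.linalg.eigh(Kc)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d]    # indices of the d largest
    lam = np.clip(eigvals[order], 0.0, None) # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(lam)

# Toy data (assumption): two pairs of near-identical sequences
seqs = [
    "GGGGAATTAGCTCAAGCGG",
    "GGGGAATTAGCTCAAGCGT",
    "TTTTCCCCAAAATTTTCCCC",
    "TTTTCCCCAAAATTTTCCCA",
]
X = np.array([spectrum_features(s) for s in seqs])
K = X @ X.T              # spectrum kernel = inner product of k-mer counts
Z = kernel_pca(K, d=2)   # one 2-D point per sequence
print(Z)
```

Only the Gram matrix `K` enters `kernel_pca`, so the same function would work with any other positive semidefinite kernel on sequences, without ever forming the embedding explicitly.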

**Lots happening behind the scenes**

Sequences \(\bfx\in\calS\) are embedded in a Hilbert space \(\calH\) (dimension \(N\gg 1\))

\[ \Phi:\calS\to\calH:\bfx\mapsto\Phi(\bfx) \]

Approximate the embedded sequences in a low-dimensional subspace of dimension \(d\)

\[\argmin_{\bfmu,\bfA,\bftheta_i}\sum_{i=1}^n\norm[2]{\Phi(\bfx_i)-\bfmu-\bfA\bftheta_i}^2 \text{ with }\bfA\in\bbR^{N\times d}\]

- How do we choose \(\Phi\)? What is the computational complexity? How do we find \(\bfA\), \(\bfmu\), \(\bftheta_i\)?
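
One standard answer to the last question, sketched under the assumption that \(N\) is finite and the embedded vectors \(\Phi(\bfx_i)\) are available explicitly as the columns of a matrix: take \(\bfmu\) to be the sample mean, \(\bfA\) the top \(d\) left singular vectors of the centered data, and \(\bftheta_i=\bfA^\top(\Phi(\bfx_i)-\bfmu)\). The function name `fit_affine_subspace` and the synthetic data are hypothetical, and the minimizer is not unique (any invertible change of basis of \(\bfA\) can be absorbed into the \(\bftheta_i\)).

```python
import numpy as np

def fit_affine_subspace(Phi, d):
    """Least-squares fit of an affine d-dimensional subspace.

    Phi: N x n array whose columns are the embedded points Phi(x_i).
    Returns mu (N,), A (N x d, orthonormal columns), Theta (d x n).
    """
    mu = Phi.mean(axis=1)                    # optimal offset: sample mean
    centered = Phi - mu[:, None]
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    A = U[:, :d]                             # top-d left singular vectors
    Theta = A.T @ centered                   # coordinates theta_i as columns
    return mu, A, Theta

# Synthetic data (assumption): points close to a 3-dimensional affine subspace
rng = np.random.default_rng(0)
N, n, d = 50, 200, 3
A_true = rng.standard_normal((N, d))
offset = rng.standard_normal((N, 1))
Phi = A_true @ rng.standard_normal((d, n)) + offset
Phi = Phi + 0.01 * rng.standard_normal((N, n))   # small perturbation

mu, A, Theta = fit_affine_subspace(Phi, d)
residual = np.linalg.norm(Phi - mu[:, None] - A @ Theta) ** 2
total = np.linalg.norm(Phi - mu[:, None]) ** 2
print(f"unexplained fraction: {residual / total:.2e}")
```

The cost of the SVD grows with \(N\), which is one reason the choice of \(\Phi\) and the computational complexity are raised together: when \(N\) is very large, working with the \(n\times n\) matrix of inner products instead can be preferable.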

**Representations**

- How do we represent signals and operators for data analysis?
- Linear models (and why they matter so much)

**Models**

- How do we model datasets?

**Estimation**

- How do we estimate parameters?

**Computing**

- How do we run algorithms for machine learning?
- Optimization (gradient descent)

ECE 7750 is about the *mathematical foundations* of machine learning

- We *will* talk a lot about probability and linear algebra
- We *will prove* a lot of things formally (theorems, lemmas)
- We *will not* develop cool apps based on Deep Neural Nets
- Exams and homework will have theoretical components

All that being said…

- We will also use simulations to understand concepts
- We will talk about machine learning
- Homework will have an experimental component (Python required)
- ECE 7750 is a fun course and you will learn a lot of *useful* concepts

If you’re unsure about taking the class, the *self-assessment* is here to help!

ECE 7750 will give you a solid background to self-study or take other ML courses at GT

**Class time and venue**: Monday and Wednesday 3:30pm-4:45pm

- In-person live course
- Asynchronously recorded lectures (DL and on-campus) + synchronous BlueJeans

**Instructor:** Prof. Matthieu Bloch

- **Email:** matthieu.bloch@ece.gatech.edu
- **Office:** TSRB 441 (appointments only)
- **Office hours:** to be announced (mixed online/on-campus)

**Teaching assistants:** being finalized

**Websites**

- Canvas: for assignment posting and submission
- Piazza: for Q&A (register via the link on Canvas)
- Gradescope: for assignment submission

We are officially back in-person (no social distancing, etc.)

Official Institute policy at Tech Moving Forward

*We encourage everyone in the Georgia Tech community to follow the Centers for Disease Control and Prevention’s (CDC) recommendations, vaccinate, and wear a mask in campus buildings.*

- Stamps Health Services is offering free Covid-19 vaccines in August and September.
- Asymptomatic testing on campus is easy, convenient and free

**General guidelines**

- Email the *Dean of Students* if your personal situation requires special academic consideration
- Use *Piazza* for technical questions
  - You can be *anonymous* to your peers, not to the instructors
  - You can use \(\LaTeX\) (\(\min_\beta\Vert y-X\beta\Vert_2^2\))
- Be *courteous* in your electronic interactions
  - Avoid judgmental language, e.g., *“The answer is obvious.”*
  - Try to be constructive
  - Avoid typos and use correct syntax

**If you really have to email me**

- Include `[ECE 7750]` in the subject of the email
- I am usually reasonably fast
- If you email the TAs, cc me

**Self-assessment and assignments (50%)**

*Self-assessment* is here to help you decide whether ECE 7750 is right for you

- Review of concepts from calculus, linear algebra, probability theory, and programming.
- Open-book/internet test

*Assignments*

- Due approximately every week (~10-16 assignments overall).
- Both mathematical and programming problems.
- You are encouraged to typeset in \(\LaTeX\), but do not waste time.
- Allocate time to submit on Gradescope

**Midterm exams (2x15%)**

- Take-home exams, 24 to 48 hours to complete

**Final exam (20%)**

- Take-home exam, 48 hours to complete

**Two-stage deadline policy**

- *Soft official deadline* with 2% bonus (conditions apply, read the fine print)
- *Hard deadline* 48 hours after the soft deadline; no late homework accepted after the hard deadline

**Abide by the Georgia Tech honor code**

- Reference *all* your sources
- Do not plagiarize other sources (Python code, homework solutions, etc.)
- Do not upload course material on other websites

*When in doubt regarding what constitutes plagiarism, ask!*

Assignments are individual, but light collaboration is permitted and *encouraged*

- Piazza is here for that purpose
- Small study groups are ok

I believe in *accountability*, *integrity*, and *fairness*

- I will hold you to the same standards
- I **trust** you by default; trust is easily lost, not easily regained

Don’t be shy and don’t hesitate to talk to me!

- I have little bandwidth for whining and complaining but I’m usually friendly

I value your feedback - I use it to make improvements

Aspects students least liked about the course

- “The lack of examples during lecture. Going from theory to homework problems was very difficult for me.”
- “It was frustrating to be able to download the raw slides (in order to watch the lecture and take notes), but then not know when the annotated slides would be uploaded.”
- “Each assignment consisted of numerous questions of high difficulty.”
- “The lack of advanced materials on learning.”
- “The class uses proofs too soon. Proofs are the highest level of understanding of a concept, and aren’t a good introduction into a concept.”
- “I also wish there’d be at least audio from our side (if not video)”