Week 3: Classification and Regression
Announcements
-
Class participation, EdSTEM, Figma, and Google Folder.
Tuesday
Recap last week
-
Highlight the patterns you recognized
-
Hands-on in-class exercises
-
Revisit Lab 2 requirements
Pattern Recognition
Narrowing down from the course introduction (Identify Patterns) to:
-
Classification
-
Regression
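The difference between the two tasks can be sketched in a few lines of plain Python. This is a minimal illustration, not course-provided code: the toy data, the helper names `nearest_neighbor_label` and `fit_line`, and the choice of a 1-nearest-neighbor rule and a least-squares line are all assumptions made for the example. Classification returns a discrete label, while regression returns a continuous number.

```python
from math import dist

# Toy data invented for illustration only (not from any course dataset).

# Classification: each training point carries a discrete class label.
labeled_points = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.8), "A"),
    ((4.0, 4.2), "B"),
    ((3.8, 4.0), "B"),
]

def nearest_neighbor_label(query):
    """Return the label of the training point closest to `query`."""
    _, label = min(labeled_points, key=lambda item: dist(item[0], query))
    return label

# Regression: each input is paired with a continuous target value.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # these points lie exactly on y = 2x

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(xs, ys)

predicted_class = nearest_neighbor_label((1.1, 0.9))  # a label
predicted_value = slope * 5.0 + intercept             # a number

print("classification output:", predicted_class)
print("regression output:", predicted_value)
```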
Textbooks
All textbooks are freely available online and are optional, not required.
-
Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
-
A Course in Machine Learning by Hal Daume III
Algorithms
-
Linear Discriminant Analysis (LDA)
-
Support Vector Machine (SVM)
-
Nearest Neighbors
-
Decision Trees
-
Ensemble methods
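As a rough sketch of how the algorithms listed above look in scikit-learn, the following fits one estimator per family on the Iris dataset and reports test accuracy. The 70/30 split, the `random_state` values, the default hyperparameters, and the use of a random forest to stand in for "ensemble methods" are assumptions made for this example, not course requirements.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Iris ships with scikit-learn: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing (split sizes are an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# One estimator per algorithm family, mostly with default settings.
models = {
    "LDA": LinearDiscriminantAnalysis(),
    "SVM": SVC(),
    "Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Ensemble (random forest)": RandomForestClassifier(
        n_estimators=100, random_state=0
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                    # learn from training data
    scores[name] = model.score(X_test, y_test)     # mean accuracy on held-out data
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

All five estimators share the same `fit` / `score` interface, which is why they can be swapped inside one loop.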
Datasets
-
1. Supervised learning
-
1.1. Linear Models
-
[the diabetes dataset]
-
1.1.11. Logistic regression
-
Regularization path of L1-Logistic Regression, [Iris]
-
MNIST classification using multinomial logistic + L1, [MNIST]
-
1.4. Support Vector Machines
-
SVM-Anova: SVM with univariate feature selection, [Iris]
-
1.6. Nearest Neighbors
-
Nearest Neighbors Classification [Iris]
-
1.10. Decision Trees
-
Plot the decision surface of decision trees trained on the iris dataset [Iris]
-
Understanding the decision tree structure [Iris]
-
1.11. Ensemble methods
-
Plot the decision surfaces of ensembles of trees on the iris dataset [Iris]
-
1.11.6. Voting Classifier [Iris]
-
1.12. Multiclass and multioutput algorithms [Iris]
Code
In-class exercises:
-
Explain to your classmates what classification is.
-
Explain to your classmates what regression is.
Lab 03
-
Classification
-
Regression
-
Programming competitions, such as those on Kaggle, also use classification and regression tasks.
-
Videos, such as Classification with Iris Dataset.
Thursday
Supervised learning
-
From Algorithms perspective
-
From Dataset perspective
Datasets
-
Popular datasets, UCI Datasets
-
Other institutions also use these datasets.
-
sklearn datasets examples
-
38 Examples using the iris dataset (classification and clustering)
The Diabetes dataset
-
regression
-
Efron, Bradley, Trevor Hastie, Iain Johnstone, and Robert Tibshirani. "Least angle regression." The Annals of Statistics 32, no. 2 (2004): 407-499.
-
Cited by 11138
-
Google Scholar has the Paper PDF file
-
Lab 04
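A minimal sketch of a regression task on the Diabetes dataset, using the copy bundled with scikit-learn. The plain `LinearRegression` model, the 80/20 split, and the `random_state` value are assumptions chosen for illustration. This is not the least angle regression method from the paper cited above.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# The bundled Diabetes dataset: 442 patients, 10 numeric features,
# and a continuous target measuring disease progression.
X, y = load_diabetes(return_X_y=True)
print("data shape:", X.shape)

# Hold out 20% of the samples for testing (split size is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Ordinary least squares as a simple baseline regressor.
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)  # average squared error, lower is better
r2 = r2_score(y_test, y_pred)             # fraction of variance explained

print(f"test MSE = {mse:.1f}")
print(f"test R^2 = {r2:.3f}")
```

Unlike the Iris examples, the target here is a continuous number, so the model is scored with error measures such as MSE and R² rather than accuracy.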
Midterm
-
research questions
-
literature review
-
paper format, two-page
-
timeline
-
team, student pairs
Math
-
textbooks
Concept
-
loss function
-
Accuracy
-
Train-test
-
Cross validation
-
Overfitting and underfitting
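The concepts above can be sketched in plain Python without any library. This is an illustrative sketch only: the function names, the toy inputs, the default 25% test fraction, and the interleaved fold assignment are assumptions made for the example. scikit-learn offers maintained equivalents such as `train_test_split` and `KFold`.

```python
import random

def mean_squared_error(y_true, y_pred):
    """A common regression loss: the average squared difference."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def train_test_split_indices(n_samples, test_fraction=0.25, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]

def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# Toy values invented for illustration.
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
acc = accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])
train_idx, test_idx = train_test_split_indices(8)
folds = list(k_fold_indices(10, k=5))

print("MSE:", mse)
print("accuracy:", acc)
print("train/test sizes:", len(train_idx), len(test_idx))
print("number of folds:", len(folds))
```

Comparing a model's score on the training indices with its score on the held-out indices is the usual way to spot the last item in the list: a high training score with a much lower test score suggests overfitting, while low scores on both suggest underfitting.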