## COMP 4211 - Spring 2013

**Spring 2013, COMP 4211 Machine Learning [3 units]**

Lecture 1, MoWe 10:30-11:50, Rm 1511 at L27/28

**Prof. Dekai WU**, Rm 3539,
2358-6989, dekai@cs.ust.hk

Tutorial 1 TA: **DAI Jie**, Fr 18:00-18:50, Rm 3598 at L27/28, jdaiaa@cse.ust.hk

You are welcome to knock on the door of the instructor any time. The TAs' office hours are posted at http://course.cs.ust.hk/comp4211/ta/.

### ANNOUNCEMENTS

Welcome to COMP4211! (This course was formerly called COMP328.) Tutorials will begin after Week 2.

**Always** check the Discussion Forum for up-to-the-minute
announcements.

**Discussion forum** is at http://comp151.cse.ust.hk/~dekai/content/?q=forum/3.
Always read before asking/posting/emailing your question. This forum is based
on modern, powerful software, instead of using the old clunky ITSC newsgroup.

**Course home page** is at http://www.cs.ust.hk/~dekai/4211/.

**Tutorial info** is at http://course.cs.ust.hk/comp4211/ta/.

### ORIENTATION

#### Course Description

**COMP 4211.** Fundamentals of machine learning. Concept
learning. Evaluating hypotheses. Supervised learning, unsupervised learning and
reinforcement learning. Bayesian learning. Ensemble methods. Exclusion(s): COMP
4331, ISOM 3360. Prerequisite(s): COMP 171/171H (prior to 2009-10) or COMP
2012/2012H, and MATH 2411/2421/246.

### TEXTBOOKS

- *Introduction to Text Alignment: Statistical Machine Translation Models from Bitexts to Bigrammars* (forthcoming), by **Dekai WU**. Springer, 2013.
- *Machine Learning: A Probabilistic Perspective*, by **Kevin Patrick MURPHY**. MIT Press, 2012.
- *Machine Learning*, by **Tom MITCHELL**. McGraw Hill, 1997.
- *Artificial Intelligence: A Modern Approach* (2nd Edition), by **Stuart RUSSELL** and **Peter NORVIG**. Prentice-Hall, 2003. ISBN-13: 978-0137903955.
- *Structure and Interpretation of Computer Programs* (2nd edition), by **Harold ABELSON** and **Gerald Jay SUSSMAN**, with **Julie SUSSMAN**. MIT Press, 1984. ISBN-10: 0-262-01077-1.

**Full text and code are available online at no cost** for the Scheme book (*Structure and Interpretation of Computer Programs*) at http://mitpress.mit.edu/sicp/.

### HONOR POLICY

To receive a passing grade, you are required to sign an honor statement acknowledging that you understand and will uphold all policies on plagiarism and collaboration.

#### Plagiarism

All materials submitted for grading must be your own work. You are advised
against being involved in any form of copying (either copying other people's
work or **allowing others to copy yours**). If you are found to be involved
in an incident of plagiarism, **you will receive a failing grade for the
course and the incident will be reported for appropriate disciplinary
actions**.

University policy requires that students who cheat more than once be expelled. Please review the cheating topic from your UST Student Orientation.

Warning: sophisticated plagiarism detection systems are in operation!

#### Collaboration

You are encouraged to collaborate in study groups. However, you must write up solutions on your own. **You must also acknowledge your collaborators in the write-up for each problem, whether or not they are classmates.** Other cases will be dealt with as plagiarism.

### GRADING

The course will be graded on a curve, but no matter what the curve is, I guarantee you the following.

| If you achieve | you will receive at least a grade of |
| --- | --- |
| 85% | A |
| 75% | B |
| 65% | C |
| 55% | D |

Your grade will be determined by a combination of factors:

| Component | Weight |
| --- | --- |
| Midterm exam | ~20% |
| Final exam | ~25% |
| Participation | ~5% |
| Assignments | ~50% |

#### Examinations

No reading material is allowed during the examinations. No make-ups will be given unless prior approval is granted by the instructor, or you have a medical condition on the day of the examination documented by a physician. In addition, according to university regulations, being absent from the final examination results in automatic failure of the course, unless prior approval is obtained from the department head.

There will be one midterm (Fri 12 Apr, 18:00, Rm 3598, L27-28) worth approximately 20%, and one final exam (Tue 21 May, 12:30, Rm 2464, L25-26) worth approximately 25%.

#### Participation

Science and engineering (including software engineering!) are about communication between people. Good participation in class and/or the online forum will count for approximately 5%.

#### Assignments

All assignments must be submitted by **23:00** on the due date,
unless otherwise specified. Late assignments cannot be accepted. Sorry, in
the interest of fairness, exceptions cannot be made.

Assignments will account for a total of approximately 50%.

Assignment 3 (use Weka as described in lecture)

#### Tutorials

All information for tutorials is at http://course.cs.ust.hk/comp4211/ta/.

### SYLLABUS

- Motivations and applications: scientific/cognitive, engineering, social
- Formulating machine learning: inducing input/output relationships, objective criteria; parametric vs. nonparametric models; supervised vs unsupervised vs reinforcement learning
- Search spaces: performance vs learning components; solution spaces vs model spaces
- Overfitting vs underfitting; curse of dimensionality; model selection; misclassification rate; generalization error; training vs validation sets vs test sets; k-fold cross-validation; Bayesian priors; no free lunch theorem
- Search strategies: dynamic programming; shortest-path
- Probability distributions: the gamma function; binomial, multinomial, Poisson, beta, Dirichlet, and gamma distributions
- Entropy and information
- Maximum likelihood vs Bayesian inference; frequentist vs Bayesian interpretation of probability; Occam's razor; conjugate priors; MAP estimation
- Minimum description length
- Expectation maximization
- Example-based models
- K-nearest neighbors classification: weighted, ensemble variants
- Naive Bayes classification
- Decision tree learning: ID3, C4.5, MDL pruning
- Neural networks
- Training and testing of Markov models
- Hidden Markov models: Viterbi decoding
- Hidden Markov models: evaluation; forward algorithm, backward algorithm
- Hidden Markov models: parameter estimation via Baum-Welch forward-backward training
- Ensemble methods
- Use of the Weka machine learning toolkit
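To give a flavor of the topics above, here is a minimal sketch of k-nearest neighbors classification, one of the example-based models on the syllabus. The toy 2-D data, Euclidean distance metric, and function names are illustrative assumptions, not course material (assignments use Weka as described in lecture):

```python
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort training points by Euclidean distance to the query point.
    by_distance = sorted(
        train,
        key=lambda pair: math.dist(pair[0], query),
    )
    # Majority vote among the labels of the k closest points.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D data: two well-separated clusters.
train = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b"),
]
print(knn_classify(train, (0.15, 0.15), k=3))  # → a
```

The weighted and ensemble variants mentioned in the syllabus replace the plain majority vote with, e.g., distance-weighted votes.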

#### Other material and background review

- Weka 3 machine learning / data mining software (Windows, OSX, Linux)
- Neural Networks, Steve Tanimoto
- Perceptrons, Andrew Rosenberg
- Neural Networks, Andrew Rosenberg
- Decision Tree Learning (ID3 and C4.5), Tom Mitchell
- Exciting Guide to Probability Distributions, Jamie Frost
- Exciting Guide to Probability Distributions (part 2), Jamie Frost
- Scheme slides
- Scheme R5RS [html, pdf]
- Chicken Scheme 3.4 manual
- Chicken Scheme 3 eggs
- COMP221 A1
- COMP221 A2

*dekai@cs.ust.hk*

Last updated: 2013.05.13