COMP 4221/5221 - Fall 2013
Fall 2013, COMP 4221 Introduction to Natural Language Processing
Fall 2013, COMP 5221 Natural Language Processing [3-0-0:3]
Lecture 1, WeFr 16:30-17:50, Rm 4504 at L25/26
Prof. Dekai WU, Rm 3539, 2358-6989, firstname.lastname@example.org
You are welcome to knock on the instructor's door at any time. The TAs' office hours are posted at http://course.cs.ust.hk/comp4221/ta/.
Welcome to COMP4221 for UGs and COMP5221 for PGs! (The COMP4221 course was formerly called COMP300H and COMP326, and the COMP5221 course was formerly called COMP526.) Tutorials will begin after Week 2.
Always check the Discussion Forum for up-to-the-minute announcements.
Discussion forum is at http://comp151.cse.ust.hk/~dekai/content/?q=forum/3.
Always read the forum before asking, posting, or emailing your question. This forum runs on modern software rather than the old, clunky ITSC newsgroup. You must register for your account at the first lecture, tutorial, or lab.
Course home page is at http://www.cs.ust.hk/~dekai/4221/.
Tutorial info is at http://course.cs.ust.hk/comp4221/ta/.
Abbreviated Course Catalog Description
COMP 4221. Human language technology for text and spoken language. Machine learning, syntactic parsing, semantic interpretation, and context-based approaches to machine translation, text mining, and web search.
COMP 5221. Techniques for parsing, interpretation, context modeling, plan recognition, generation. Emphasis on statistical approaches, neuropsychological and linguistic constraints, large text corpora. Applications include machine translation, dialogue systems, cognitive modeling, and knowledge acquisition. Background: COMP 3211 or equivalent.
Human language technology for processing text and spoken language. Fundamental machine learning, syntactic parsing, semantic interpretation, and context models, algorithms, and techniques. Applications include machine translation, web technologies, text mining, knowledge management, cognitive modeling, intelligent dialog systems, and computational linguistics.
- Introduction to Text Alignment: Statistical Machine Translation Models from Bitexts to Bigrammars (forthcoming), by Dekai WU. Springer, 2013.
- Artificial Intelligence: A Modern Approach (2nd Edition), by Stuart RUSSELL and Peter NORVIG. Prentice-Hall, 2003. ISBN-13: 978-0137903955.
- Structure and Interpretation of Computer Programs (2nd edition), by Harold ABELSON and Gerald Jay SUSSMAN, with Julie SUSSMAN. MIT Press, 1996. ISBN-10:
Full text and code are available online at no cost for the Scheme book (Structure and Interpretation of Computer Programs) at http://mitpress.mit.edu/sicp/.
All materials submitted for grading must be your own work. You are advised against being involved in any form of copying (either copying other people's work or allowing others to copy yours). If you are found to be involved in an incident of plagiarism, you will receive a failing grade for the course and the incident will be reported for appropriate disciplinary actions.
Warning: sophisticated plagiarism detection systems are in operation!
Collaboration
You are encouraged to collaborate in study groups. However, you must write up solutions on your own. You must also acknowledge your collaborators in the write-up for each problem, whether or not they are classmates. Other cases will be dealt with as plagiarism.
The course will be graded on a curve, but no matter what the curve is, I guarantee you the following.
|If you achieve||85%||you will receive at least an||A||grade.|
Your grade will be determined by a combination of factors:
Examinations
No reading material is allowed during the examinations. No make-ups will be given unless prior approval is granted by the instructor, or you have a medical condition on the day of the examination that is documented by a physician. In addition, absence from the final examination results in automatic failure of the course according to university regulations, unless prior approval is obtained from the department head.
There will be one midterm worth approximately 20%, and one final exam worth approximately 25%.
Science and engineering (including software engineering!) is about communication between people. Good participation in class and/or the online forum will count for approximately 5%.
All assignments must be submitted by 23:00 on the due date. Scheme programming assignments must run under Chicken Scheme on Linux. Assignments will be collected electronically using the automated CASS assignment collection system. Late assignments cannot be accepted. Sorry, in the interest of fairness, exceptions cannot be made.
Programming assignments will account for a total of approximately 50%.
All information for tutorials is at http://course.cs.ust.hk/comp4221/ta/.
|2013.09.04||1||Lecture||Does God play dice? Assumptions: scientific method, hypotheses, models, learning, probability. Administrivia (honor statement, HKUST classroom conduct)|
|2013.09.06||1||Lecture||Languages of the world|
|2013.09.11||2||Lecture||Linguistic relativism and the Sapir-Whorf hypothesis; inductive bias, language bias, search bias; the great cycle of intelligence|
|2013.09.12||2||Lecture||Is machine translation intelligent? Interactive simulation [20:00 Lab 3]|
|2013.09.13||2||Lecture||Learning to translate: engineering, social, and scientific motivations [at tutorial]|
|2013.09.13||2||Lecture||"It's all Chinese to me": linguistic complexity; challenges in modeling translation|
|2013.09.18||3||Lecture||[rescheduled to previous 2013.09.12 session]|
|2013.09.25||4||Lecture||Evaluating translation quality: adequacy, fluency, fidelity, speed, memory, n-grams, BLEU|
|2013.09.26||4||Lecture||Machine translation in Macau [14:30 LTH]|
|2013.09.27||4||Lecture||Evaluating translation quality: case frames, semantic frames, semantic role labeling, predicate-argument structure [at tutorial]|
|2013.09.27||4||Lecture||Evaluating translation quality: alignment; aligning semantic frames|
|2013.10.02||5||Lecture||Anagrams; bag translation|
|2013.10.04||5||Lecture||Markov models, n-gram models|
|2013.10.09||6||Lecture||Uninformed search; Dijkstra's shortest path algorithm|
|2013.10.11||6||Lecture||Anagrams with replacement; Chinese anagrams; word n-grams|
|2013.10.16||7||Lecture||HMM/SFSA/WFSA: hidden Markov models, finite-state models; parts of speech; generation vs recognition/parsing; converting state-based to transition-based FSAs|
|2013.10.18||7||Lecture||[rescheduled to previous 2013.09.13 session]|
|2013.10.23||8||Lecture||HMM/SFSA/WFSA: formalization for Viterbi decoding and evaluation [slides]|
|2013.10.28||9||Lecture||Making sense in translation: Addressing lexical choice errors when translating across domains [16:00 LTF]|
|2013.10.30||9||Lecture||HMM/SFSA/WFSA: forward algorithm, backward algorithm, expectations|
|2013.11.01||9||Lecture||HMM/SFSA/WFSA: forward-backward algorithm, expectation maximization (EM) algorithm|
|2013.11.06||10||Lecture||Segmental HMM/SFSA/WFSAs; WFST: finite-state translation models|
|2013.11.08||10||Lecture||AND/OR graphs; FSGs (finite-state grammars); segmental FSGs|
|2013.11.13||11||Lecture||Sentence alignment [chapter]|
|2013.11.15||11||Lecture||CFGs (context-free grammars); segmental CFGs|
|2013.11.20||12||Lecture||Syntax-directed transduction grammars|
|2013.11.22||12||Lecture||Inversion transduction grammars [article]|
|2013.11.27||13||Lecture||The magic number 4: how the generative capacity of ITGs explains the evolution of semantic frame structure|
|2013.11.29||13||Lecture||Bracketing inversion transduction grammars (BITGs): alignment, bibracketing, translation-driven segmentation, learning phrasal translation lexicons, projection/coercion, EM [article]|
|2013.11.29||13||Lecture||ITGs: translation, incorporating language models [article]; general, linguistic, and non-binary rank ITGs [article] [18:00 Rm 3401]|
|2013.12.11||14||Exam||COMP5221 Final [Rm 2464, L25-26, 12:30-15:30]|
|2013.12.12||14||Exam||COMP4221 Final [Rm 2405, L17-18, 16:30-19:30]|
- Scheme slides
- Scheme R5RS [html, pdf]
- Chicken Scheme 3.4 manual
- Chicken Scheme 3 eggs
- COMP221 A1
- COMP221 A2
Last updated: 2013.12.04