All lectures and tutorials will be held ONLINE LIVE INTERACTIVELY at the regularly scheduled times.
You can find the recurring Zoom meetings for the lectures and tutorials in Canvas, and you are strongly encouraged to join the meetings from there. Alternatively, you must register for the lectures and tutorials at the following links; after registering, you will receive a confirmation email containing information about joining the meeting. Note that these Zoom meetings admit only authenticated users with ITSC accounts (with domain connect.ust.hk or ust.hk), and you can join the meetings only via one of the two paths above.
After you are registered, you may use the following links to join the lectures and tutorials:
If you haven’t done so, please watch this video to get your HKUST Zoom account ready as soon as possible, not just for this course but also for all other courses at HKUST:
Lecture 1: TuTh 09:00-10:20, Rm TBA
Tutorial 1: M 16:00-16:50, Rm TBA
Office hours: TuTh 10:30-11:00. The TA's office hours are posted at http://course.cs.ust.hk/comp4221/ta/.
Course: http://www.cs.ust.hk/~dekai/4221/ is the master home page for the course.
Tutorial: http://course.cs.ust.hk/comp4221/ta/ contains all information for the tutorials.
Forum: http://comp151.cse.ust.hk/~dekai/content/?q=forum/3 is where all discussion outside class should be done. Always read the forum before asking, posting, or emailing your question. Note that you must register for your forum account at the first lecture, tutorial, or lab.
COMP 4221. Human language technology for text and spoken language. Machine learning, syntactic parsing, semantic interpretation, and context-based approaches to machine translation, text mining, and web search.
Human language technology for processing text and spoken language. Fundamental machine learning, syntactic parsing, semantic interpretation, and context models, algorithms, and techniques. Applications include machine translation, web technologies, text mining, knowledge management, cognitive modeling, intelligent dialog systems, and computational linguistics.
At the end of the Natural Language Processing course, you will have achieved the following outcomes.
To receive a passing grade, you are required to sign an honor statement acknowledging that you understand and will uphold all policies on plagiarism and collaboration.
All materials submitted for grading must be your own work. Do not engage in any form of copying, whether copying other people's work or allowing others to copy yours. If you are found to be involved in an incident of plagiarism, you will receive a failing grade for the course and the incident will be reported for appropriate disciplinary action.
University policy requires that students who cheat more than once be expelled. Please review the cheating topic from your UST Student Guide.
Warning: sophisticated plagiarism detection systems are in operation!
You are encouraged to collaborate in study groups. However, you must write up solutions on your own. You must also acknowledge your collaborators in the write-up for each problem, whether or not they are classmates. Failure to follow these rules will be treated as plagiarism.
Course grading will be adjusted to the difficulty of assignments and exams. Moreover, I guarantee you the following.
If you achieve | you will receive at least |
85% | A |
75% | B |
65% | C |
55% | D |
Your grade will be determined by a combination of factors:
Exams | 0% (due to university coronavirus measures) |
Pop quizzes | ~10% |
Class participation | ~15% |
Forum participation | ~10% |
Assignments | ~65% |
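To make the arithmetic concrete, here is a purely illustrative sketch (in Scheme, since course assignments use Chicken Scheme) of how the approximate weights above combine with the guaranteed grade cut-offs. The component scores are hypothetical, and this is not the official grading formula:

;; Purely illustrative: hypothetical component scores combined with the
;; approximate weights above, then mapped to the guaranteed minimum grade.
(define weights '((quizzes . 0.10) (class . 0.15) (forum . 0.10) (assignments . 0.65)))
(define scores  '((quizzes . 82)   (class . 90)   (forum . 75)   (assignments . 88)))  ; hypothetical

(define (overall)
  (apply + (map (lambda (w) (* (cdr w) (cdr (assq (car w) scores)))) weights)))

(define (guaranteed-minimum total)
  (cond ((>= total 85) "A")
        ((>= total 75) "B")
        ((>= total 65) "C")
        ((>= total 55) "D")
        (else "no guarantee")))

(print "overall: " (overall) "; guaranteed at least: " (guaranteed-minimum (overall)))

With these hypothetical scores the overall is 86.4, which falls in the 85% band, so the guaranteed minimum grade would be an A.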
No reading material is allowed during the examinations. No make-ups will be given unless prior approval is granted by the instructor, or you have a documented medical condition (with a physician's note) on the day of the examination. In addition, absence from the final examination results in automatic failure of the course according to university regulations, unless prior approval is obtained from the department head.
Science and engineering (including software engineering!) are about communication between people. Good participation in class will count for approximately 15%, and good participation in the online forum will count for approximately 10%.
All assignments must be submitted by 23:00 on the due date. Assignments will be collected electronically using the automated CASS assignment collection system. Late assignments cannot be accepted. Sorry, in the interest of fairness, exceptions cannot be made.
Scheme programming assignments must run under Chicken Scheme on Linux.
Programming assignments will account for a total of approximately 65%.
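As a quick sanity check that your setup meets the Chicken Scheme requirement, here is a minimal sketch of a script that runs under Chicken Scheme on Linux. The file name hello-count.scm and the word-counting task are illustrative, not an actual assignment:

;; hello-count.scm -- minimal Chicken Scheme script: counts the words on each line of stdin.
(import (chicken io)        ; read-line
        (chicken string))   ; string-split

(define (count-words line)
  (length (string-split line)))

(let loop ((line (read-line)))
  (unless (eof-object? line)
    (print (count-words line) " words: " line)
    (loop (read-line))))

On a Linux machine with CHICKEN 5 installed, you can run it interpreted with csi -s hello-count.scm < some-text-file, or compile it with csc hello-count.scm -o hello-count. Make sure whatever you submit runs this way before the deadline.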
Any linked material (unless labeled "Supplementary references") is required reading that you are responsible for.
We will choose from the topics in the list below.
date | wk | event | topic | |
20200908 | 1 | Lecture | Welcome; introduction; survey; administrivia (honor statement, HKUST classroom conduct) | |
20200910 | 2 | Lecture | Does God play dice? Assumptions: scientific method, hypotheses, models, learning, probability; linguistic relativism and the Sapir-Whorf hypothesis; inductive bias, language bias, search bias; the great cycle of intelligence | |
20200915 | 2 | Lecture | Languages of the world | |
20200917 | 3 | Lecture | Learning to translate: engineering, social, and scientific motivations | |
| | Lecture | Language structures thought (Sapir-Whorf hypothesis); "It's all Chinese to me": linguistic complexity; challenges in modeling translation [at tutorial] |
| | Lecture | Is machine translation intelligent? Interactive simulation; Turing test; translation-based test of AI; error analysis; what's still missing in AI |
| | Lecture | Introduction to search |
| | Lecture | State spaces; the anagram problem |
| | Lecture | Conditional decomposition; Markovian approximations; memoryless property; Markov models |
| | Lecture | Character n-gram models; goal states and objective functions |
| | Lecture | Uninformed search: BFS, DFS, depth-bounded search, iterative deepening, bidirectional search |
| | Lecture | Informed search; greedy best-first search; agenda-driven search; search fringe; Dijkstra's shortest path algorithm |
| | Lecture | Implementing agenda-driven search for finding best anagrams (Assignment due TBA); physical demo of agenda-driven search for Dijkstra's algorithm; anagrams with replacement |
| | Lecture | Basic probability theory; conditional probabilities; Bayes' theorem |
| | Lecture | Chinese anagrams; word vs character n-grams; first- and second-order n-grams; Shannon |
| | Lecture | Hidden Markov models; finite-state models; weighted and stochastic FSAs; parts of speech; generation vs recognition/parsing |
| | Lecture | Converting state-based to transition-based FSAs for both logical and stochastic FSAs; segmental HMM/SFSA/WFSAs; WFSTs: finite-state translation models |
| | Lecture | HMM/SFSA/WFSA decoding, evaluation, learning: unrolling in time; ways to search the lattice; formalization for Viterbi decoding and evaluation [slides] |
| | Lecture | Introduction to neural networks and their language biases |
| | Lecture | Implementing feedforward neural network n-gram models; model design following the scientific method for machine learning/adaptation in practice (Assignment due TBA) |
| | Lecture | Search bias; forward algorithm for HMM/SFSA/WFSAs |
| | Lecture | Training HMM/SFSA/WFSAs: backward algorithm; completing missing data; expectations; EM (expectation maximization); Baum-Welch parameter estimation |
| | Lecture | RNNs: recurrent neural networks |
| | Lecture | Implementing RNNs; RNN language models; recurrent model design (Assignment due TBA); memoization/caching; HMM alignment models; memory-bounded search; heuristics; admissibility; A* search; memory-bounded heuristic search |
| | Lecture | Revisiting knowledge representation and language bias; propositional logic; conjunctive normal form; AND/OR graphs; AND/OR hypergraphs; forward chaining |
| | Lecture | Definite clauses; definite clause grammars; Knuth's algorithm; context-free grammars (CFGs); weighted and stochastic CFGs |
| | Lecture | Probabilistic dynamic programming based chart parsing; probabilistic Cocke-Kasami-Younger (CKY) parsing; inside-outside algorithm; parsing by theorem proving; productions as inference rules; backward chaining; abductive (diagnostic or explanatory) inference vs deductive inference |
| | Lecture | From CFGs to ITGs (monolingual vs bilingual modeling); how bilingual conditions make grammar induction easier; the mystery of the magic number 4 in semantic frames; simple and full syntax-directed transduction grammars (SDTGs); tree vs matrix constituent alignment visualizations |
| | Lecture | Exploring inversion transduction grammars (ITGs); ITG characteristics; stochastic ITGs; polynomial-time transduction and learning; resolving the mystery of the magic number 4 |
| | Lecture | Evaluating translation quality: alignment; aligning semantic frames: interactive exercise |
| | Lecture | Evaluating translation quality: MEANT |
| | Lecture | Evaluating translation quality: semantic role labeling (SRL), case frames, semantic frames, predicate-argument structure |
| | Lecture | Automatic semantic role labeling (ASRL) |
| | Lecture | Implementing a feedforward neural network based part-of-speech tagger (Assignment due TBA); context-independent POS tagging |
| | Lecture | I/O representations for feedforward networks; context-dependent POS tagging |
| | Tutorial | Basic probability theory; conditional probabilities; Bayes' theorem |
| | Lecture | Example-based, instance-based, memory-based, case-based, analogy-based, lazy learning for classification; translation via nearest neighbors (NN); k-NN; weighted k-NN |
| | Lecture | Learning vs performance components in machine learning; supervised learning; word sense disambiguation; lexical choice; example-based prediction models; nearest neighbor classifiers; similarity metrics; k-NN classifiers |
| | Lecture | Exploring different feedforward neural network architectures for POS tagging; model design following the scientific method for machine learning in practice (Assignment due TBA) |
| | Lecture | Naive Bayes classifiers for WSD and lexical choice |
| | Lecture | Modern approaches to SRL |
| | Lecture | Implementing chunkers and shallow parsers via IOBES tagging plus a POS tagger (Assignment due TBA) |
| | Lecture | Chunking via IOBES representations; shallow bracketing |
| | Lecture | Shallow syntactic parsing; shallow semantic parsing; language bias of IOBES representations, bags of words, and one-hot representations |
| | Lecture | Introduction to word embeddings |
| | Lecture | Vector space models; classic word vector approaches |
| | Lecture | Learning word embeddings via prediction tasks; skip-grams; word2vec |
| | Lecture | Recursive autoencoders (RAEs) and recursive auto-associative memories (RAAM); learning word embeddings by making RAEs predict |
| | Lecture | Context-free grammars (CFGs); generative vs parsing models; top-down vs bottom-up parsing; dynamic programming based chart parsing; Cocke-Kasami-Younger (CKY) parsing |
| | Lecture | Recursive autoencoders (RAEs) and recursive auto-associative memory (RAAM); TRAAM (transduction RAAM); recursive neural network realizations of ITGs; a self-learning rap battle bot |