THE HONG KONG UNIVERSITY OF SCIENCE AND TECHNOLOGY
Department of Computer Science and Engineering
Department of Electronic and Computer Engineering
Human Language Technology Center
K-best Algorithms in Parsing and Machine Translation
University of Pennsylvania
Joint work with David CHIANG (USC Information Sciences Institute)
Date : 30 July 2007 (Monday)
Time : 10:30-12:00
Venue : Rm 2578 (Lifts 29-30)
The general idea of k-best search is to find the top k solutions of a given problem, since in many cases the 1-best solution is not guaranteed to be optimal given more (global) information. Recently it has become a popular technique in natural language processing, especially in parsing and machine translation. However, fast and exact k-best algorithms are largely unknown to the NLP community. In this work, we develop a series of efficient algorithms for exact k- best search in the general framework of directed hypergraphs, and demonstrate their performance on state-of-the-art Collins/Bikel parsers.
Since the publication of our paper, these algorithms have been implemented in many widely used systems and packages, including the Charniak parser, the McDonald dependency parser, ISI syntax-based MT system, CMU syntax-based MT system, the Berkeley parser, the Hopkins Dyna language for dynamic programming, and the ISI Tiburon package for tree transducers.
These algorithms have also been successfully applied to search problems in machine translation. In particular, I will talk about an adaptation called "forest rescoring," for decoding with integrated language models, which achieves more than ten fold speed-up against conventional beam search in both phrase-based and syntax-based translation systems.
- Liang Huang and David Chiang (2005). Better k-best Parsing. In Proceedings of the 9th International Workshop on Parsing Technologies (IWPT). http://www.cis.upenn.edu/~lhuang3/huang-iwpt-correct.pdf
- Hao Zhang, Liang Huang, Daniel Gildea, and Kevin Knight (2006). Synchronous Binarization for Machine Translation. In Proceedings of the HLT-NAACL 2006, Brooklyn, NY. http://www.cis.upenn.edu/~lhuang3/acl-cube.pdf
Liang Huang is finishing his fourth year in the PhD program at the University of Pennsylvania, under Aravind Joshi. He is mainly interested in the theoretical aspects of computational linguistics, in particular, efficient algorithms in parsing and machine translation, and formal properties of synchronous grammars. He also works on applying computational linguistics to structural biology.
For enquiries, please call 2358-7008 or visit our website at http://www.cs.ust.hk/~hltc/seminars.html.
The Hong Kong University of Science & Technology
HKUST, Clear Water Bay, Hong Kong
http://www.cs.ust.hk/~hltcLast updated: 2007.07.25 Dekai Wu