Computational Linguistics: 2012-2013
Overview

The aim of this series of lectures is to provide an introduction to some of the major topics in computational linguistics. No previous knowledge of linguistics is required.
Learning outcomes

By the end of this lecture series you should understand the concerns of computational linguists and be familiar with some of the major topics in the area. You should also be in a position to find out more of the practical details for yourself.
A prior understanding of fundamental linguistic concepts is helpful but not required.
The course does assume an understanding of basic concepts in statistics and probability, including Bayes' rule, discrete distributions, conditional and joint probability, and expectation.
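As a gauge of the level assumed, you should be able to apply Bayes' rule to a problem like the following sketch (all of the numbers here are invented for illustration):

```python
# Worked example of Bayes' rule with made-up figures: suppose 20% of emails
# are spam, the word "offer" appears in 60% of spam emails and in 5% of
# non-spam emails. What is P(spam | "offer")?

p_spam = 0.2
p_offer_given_spam = 0.6
p_offer_given_ham = 0.05

# Law of total probability: P(offer) = P(offer|spam)P(spam) + P(offer|ham)P(ham)
p_offer = p_offer_given_spam * p_spam + p_offer_given_ham * (1 - p_spam)

# Bayes' rule: P(spam | offer) = P(offer | spam) * P(spam) / P(offer)
p_spam_given_offer = p_offer_given_spam * p_spam / p_offer

print(round(p_spam_given_offer, 3))  # 0.75
```

If the derivation above is unfamiliar, revisiting an introductory probability text before the course begins would be worthwhile.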
The practical component of the course is a relatively substantial programming assignment. Students may use any programming language, but should be familiar with the following concepts:
- File I/O
- Associative Data Structures (e.g., Hash Maps)
- String Parsing (Some experience with regular expressions is helpful)
Students with little or no programming experience may find the practical component very challenging; we suggest developing basic proficiency in these areas before attempting the course.
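As a rough benchmark, the assignment assumes you could comfortably write something like the following sketch, which combines all three concepts above: reading a file, tokenising lines with a regular expression, and tallying counts in an associative data structure (the filename in the usage comment is hypothetical):

```python
import re
from collections import Counter

def word_counts(path):
    """Return a Counter mapping each lowercased word to its frequency."""
    counts = Counter()
    # File I/O: stream the file line by line rather than loading it whole.
    with open(path, encoding="utf-8") as f:
        for line in f:
            # String parsing: \w+ matches runs of word characters as tokens.
            counts.update(re.findall(r"\w+", line.lower()))
    return counts

# Example usage (hypothetical corpus file):
#   counts = word_counts("corpus.txt")
#   counts.most_common(10)
```

Counting events in a corpus like this is the basic building block of most of the statistical models covered in the lectures.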
- Week 1 - Parts of Speech, Hidden Markov Model POS tagging, Constituent Structure.
- Week 2 - Word alignment for machine translation, Chunking: shallow parsing, Parsing Context Free Grammars.
- Week 3 - Efficient Parsing of Context Free Grammars, Probabilistic Parsing.
- Week 4 - Language Modelling, Machine Translation.
- Week 5 - Introduction to semantics and pragmatics, Semantics and Logic.
- Week 6 - Inference and question answering, Word sense disambiguation.
- Week 7 - Distributional models of semantics, Compositionality in Semantics.
- Week 8 - Advanced topics in NLP: Guest Lectures.
The following textbooks cover much of the material. More detailed references will be given with the lectures. Lecture handouts will be supplied.
- Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd Ed), Daniel Jurafsky and James H. Martin, 2009, Prentice-Hall.
- Statistical Machine Translation, Philipp Koehn, 2009, Cambridge University Press.
- Foundations of Statistical Natural Language Processing, Christopher Manning and Hinrich Schütze, 1999, MIT Press.
General background reading:
- The Language Instinct, Steven Pinker, 1994, Penguin Books.
- An Introduction to Language, Victoria Fromkin, Robert Rodman, and Nina Hyams, 2003, Wadsworth Publishing Company.
- Understanding Syntax, Maggie Tallerman, 2005, OUP.