Deep Learning for Natural Language Processing: 2016-2017
This is an advanced course on natural language processing. Automatically processing natural language inputs and producing language outputs is a key component of Artificial General Intelligence. The ambiguities and noise inherent in human communication render traditional symbolic AI techniques ineffective for representing and analysing language data. Recently, statistical techniques based on neural networks have achieved a number of remarkable successes in natural language processing, leading to a great deal of commercial and academic interest in the field.
This will be an applied course focussing on recent advances in analysing and generating speech and text using recurrent neural networks. We will introduce the mathematical definitions of the relevant machine learning models and derive their associated optimisation algorithms. The course will cover a range of applications of neural networks in NLP, including analysing latent dimensions in text, transcribing speech to text, translating between languages, and answering questions. These topics will be organised into three high-level themes that form a progression: first, the use of neural networks for sequential language modelling; second, their use as conditional language models for transduction tasks; and finally, approaches that combine these techniques with other mechanisms for advanced applications. Throughout the course, the practical implementation of such models on CPU and GPU hardware will also be discussed.
This course will be led by Phil Blunsom and delivered in partnership with the DeepMind Natural Language Research Group. Example lecturers include:
- Phil Blunsom (Oxford University and DeepMind)
- Chris Dyer (Carnegie Mellon University and DeepMind)
- Edward Grefenstette (DeepMind)
- Karl Moritz Hermann (DeepMind)
- Andrew Senior (DeepMind)
- Wang Ling (DeepMind)
- Jeremy Appleyard (NVIDIA)
After studying this course, students will:
- Understand the definition of a range of neural network models.
- Be able to derive and implement optimisation algorithms for these models.
- Understand neural implementations of attention mechanisms and sequence embedding models, and how these modular components can be combined to build state-of-the-art NLP systems.
- Have an awareness of the hardware issues inherent in implementing scalable neural network models for language data.
- Be able to implement and evaluate common neural network models for language.
This course will make use of a range of basic concepts from Probability, Linear Algebra, and Continuous Mathematics. Students should have a good knowledge of basic Machine Learning, either from an introductory course or practical experience. No prior linguistic knowledge will be assumed. The course will contain a significant practical component and it will be assumed that participants are proficient programmers.
- Introduction/Conclusion: Why neural networks for language and how this course fits into the wider fields of Natural Language Processing, Computational Linguistics, and Machine Learning.
- Simple Recurrent Neural Networks: model definition; the backpropagation through time optimisation algorithm; small scale language modelling and text embedding.
- Advanced Recurrent Neural Networks: Long Short Term Memory and Gated Recurrent Units; large scale language modelling, open vocabulary language modelling and morphology.
- Scale: minibatching and GPU implementation issues.
- Speech Recognition: Neural Networks for acoustic modelling and end-to-end speech models.
- Sequence to Sequence Models: Generating from an embedding; attention mechanisms; Machine Translation; Image Caption generation.
- Question Answering: QA tasks and paradigms; neural attention mechanisms and Memory Networks for QA.
- Advanced Memory: Neural Turing Machine, Stacks and other structures.
- Linguistic Models: syntactic and semantic parsing with recurrent networks.
Recurrent Neural Networks, Backpropagation Through Time, Long Short Term Memory, Attention Networks, Memory Networks, Neural Turing Machines, Machine Translation, Question Answering, Speech Recognition, Syntactic and Semantic Parsing, GPU optimisation for Neural Networks
As the material covered in this course is based on recent research results, there is no relevant textbook for the area. The readings for the course will thus be based on published papers and online material.