Stream-based Algorithms for Online Machine Translation

Abby Levenberg (University of Edinburgh)

16:30 1st March 2011 (week 7, Hilary Term 2011)
Lecture Theatre B

The amount of raw text available on the web is massive, and its rate of growth increases every day. These unbounded text streams can be useful for Statistical Machine Translation (SMT): incorporating more training data decreases sparsity and improves model coverage of the target language domain. However, traditional methods for building SMT systems do not work in this setting.

In this work we investigate a new approach to SMT training using the streaming model of computation. We develop and test incrementally retrainable models which, given an incoming source of new data, can adapt to and efficiently incorporate the stream data whilst online. By continually adding new data to the system we can take advantage of recency effects in the stream. A naive approach to processing the stream would require an unbounded amount of space, which is clearly infeasible. Hence we consider online adaptation operating within bounded space.

Seminar Series

Departmental Seminars

Coordinators

Ronnie Clark
