Hierarchical Bayesian Models of Sequential Data
In this talk I will present a new approach to modelling sequence data called the sequence memoizer. As opposed to most other sequence models, ours does not make any Markovian assumptions. Instead, we use a hierarchical Bayesian approach which enforces sharing of statistical information across the different parts of the model. To better model the power-law statistics often observed in sequence data, we use Bayesian nonparametric priors called Pitman-Yor processes as the building blocks of the hierarchical model. We show that computations in the resulting model can be performed efficiently by pruning the hierarchy, resulting in a suffix tree data structure. We show state-of-the-art results on language modelling and text compression.
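To illustrate the power-law behaviour the abstract refers to, here is a minimal sketch (not the sequence memoizer itself) of the Chinese-restaurant view of a single Pitman-Yor process. The function name and parameter defaults are my own for illustration; with a positive discount, the number of occupied tables (distinct types) grows roughly as a power of the number of customers (tokens), which is the statistic that makes the Pitman-Yor process a good fit for text.

```python
import random

def pitman_yor_crp(n, discount=0.5, concentration=1.0, seed=0):
    """Simulate the Chinese-restaurant seating of a Pitman-Yor process.

    Returns the table (cluster) assignment of each of n customers.
    With discount > 0 the number of occupied tables grows like a
    power of n, mimicking the power-law statistics of text.
    """
    rng = random.Random(seed)
    counts = []          # number of customers at each occupied table
    assignments = []
    for i in range(n):   # i customers have been seated so far
        # probability of opening a new table
        p_new = (concentration + discount * len(counts)) / (concentration + i)
        if i == 0 or rng.random() < p_new:
            counts.append(1)
            assignments.append(len(counts) - 1)
        else:
            # pick an existing table k with probability ∝ counts[k] - discount
            weights = [c - discount for c in counts]
            r = rng.random() * sum(weights)
            for k, w in enumerate(weights):
                r -= w
                if r <= 0:
                    break
            counts[k] += 1
            assignments.append(k)
    return assignments

tables = pitman_yor_crp(10_000, discount=0.8)
print("occupied tables:", len(set(tables)))
```

Running this with a large discount produces many more distinct tables than the discount-zero case (which reduces to the ordinary Dirichlet-process restaurant, whose table count grows only logarithmically), matching the heavy-tailed type/token behaviour of natural language.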
This is joint work with Frank Wood, Jan Gasthaus, Cedric Archambeau and Lancelot James.