University of Oxford Logo University of OxfordDepartment of Computer Science - Home
Linked in
Linked in
Follow us on twitter
Twitter
On Facebook
Facebook
Instagram
Instagram

Estimating the unseen: a sublinear-sample estimator of distributions

Greg Valiant (UC Berkeley)

Info

Date

24th May 2011 (week 4, Trinity Term 2011)

Time

11:30

Place

147

Abstract

In joint work with Paul Valiant, we introduce a new approach to
characterizing the unobserved portion
of a distribution, which provides sublinear-sample additive estimators
for a class of properties that includes entropy, distribution
support size, and various distance metrics between pairs of
distributions (including L1 distance).
Together with our new matching lower bounds (which I
will only have time to touch upon), this settles the longstanding
question of the sample complexities of these estimation problems (up
to constant factors). Our algorithm estimates these properties up to
an arbitrarily small additive constant, using O(n log n) samples
(where n is a bound on the support size, or in the case of estimating
the support size, 1/n is a lower bound the probability of any element
of the domain). Previously, no explicit sublinear-sample algorithms
for any of these estimation tasks were known. Additionally, our algorithm
runs in time linear in the number of samples used.

If time permits, I will also briefly discuss the black magic of
Stein's method--an extremely versatile approach to proving central
limit theorems, which plays a key role in our lower-bounds.

Further info

Related series