Video Captioning
Supervisors
Suitable for
Abstract
Video captioning means automatically generating a sentence describing what’s happening in a video. Deep learning methods have
improved greatly at this task over the last 3-4 years, but the use of natural language to describe a video has several disadvantages.
Some work in our lab has proposed using a formal language that can describe videos in terms of objects and relations between
them: for example, “throws(person,ball)” instead of “a person is throwing a ball”
(see http://www.cs.ox.ac.uk/publications/publication14259-abstract.html). This project will extend the above paper, which
could just mean scaling it to larger datasets with the latest deep learning techniques, or could involve extending the idea
to make it more efficient or performant. Advanced Machine Learning is a prerequisite, and proficiency in logic and reasoning
could also be useful, depending on the direction the project takes.
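
To make the formal-language idea concrete, here is a minimal Python sketch of how a caption such as “throws(person,ball)” might be held as structured data rather than as free text. The names Relation, parse_relation and format_relation are hypothetical and chosen only for illustration; they are not taken from the paper linked above, whose actual representation may differ.

```python
import re
from dataclasses import dataclass

# Hypothetical representation: one predicate applied to a tuple of object names.
@dataclass(frozen=True)
class Relation:
    predicate: str
    arguments: tuple

# Matches strings of the form name(arg1,arg2,...) with optional whitespace.
_PATTERN = re.compile(r"^\s*(\w+)\s*\(\s*([^()]*)\)\s*$")

def parse_relation(text):
    """Parse a string like 'throws(person,ball)' into a Relation."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a relation: {text!r}")
    predicate, raw_args = match.groups()
    # Split on commas and drop surrounding whitespace and empty entries.
    arguments = tuple(a.strip() for a in raw_args.split(",") if a.strip())
    return Relation(predicate, arguments)

def format_relation(relation):
    """Render a Relation back into the compact predicate(arg,...) form."""
    return f"{relation.predicate}({','.join(relation.arguments)})"

caption = parse_relation("throws(person,ball)")
print(caption.predicate, caption.arguments)
print(format_relation(caption))
```

Unlike a natural-language sentence, a structured caption of this kind can be compared for exact equality, which is one reason a formal language may be easier to evaluate automatically.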