University of Oxford, Department of Computer Science

Docu-Research. Object Annotation and Visualization for PDF Documents

Abstract

(Joint supervisor with G Orsi)

PDF is the de facto standard for unstructured documents on the web. Being able to effectively analyse, annotate and manipulate PDF documents is a challenging area of research and a profitable commercial opportunity. In particular, current PDF processing tools (e.g., Adobe Acrobat) allow users to search the full content of a PDF but not to locate interesting objects in it (e.g., what is the contribution of this paper? Is there a budget in this project proposal?). Two particularly challenging aspects of this problem are the annotation and the visualization of such objects. This project is concerned with the semantic annotation of structured objects in scientific PDF documents (e.g., conference papers, journal articles, and project proposals) as well as their effective visualization. Good knowledge of Java is required. Knowledge of (semantic) web technology and natural-language processing is preferred.