A Methodology for Evaluating Algorithms for Table Understanding in PDF Documents

Max Göbel, Tamir Hassan, Ermelinda Oro and Giorgio Orsi

Abstract

This paper presents a methodology for the evaluation of table understanding algorithms for PDF documents. The evaluation takes into account three major tasks: table detection, table structure recognition and functional analysis. We provide a general and flexible output model for each task along with corresponding evaluation metrics and methods. We also present a methodology for collecting and ground-truthing PDF documents based on consensus-reaching principles and provide a publicly available ground-truthed dataset.

Book Title
Proc. of the 12th ACM Symp. on Document Engineering (DocEng)
Pages
45–48
Year
2012