Query Induction with Schema-Guided Pruning Strategies
Joachim Niehren (INRIA, Lille)
Info
Date: 28th June 2011 (week 9, Trinity Term 2011)
Time: 11:30
Place: 147
Abstract
Induction algorithms for tree automata that define node-selecting queries in unranked trees rely on tree pruning strategies. These impose additional assumptions for node selection that are needed to compensate for small numbers of annotated examples. Pruning-based heuristics in query learning algorithms for Web information extraction often substantially improve their quality and speed up inference. In this talk, we will distinguish the class of regular queries that are stable under a given pruning strategy, and show that it is learnable with polynomial time and data. Our learning algorithm is obtained by adding pruning heuristics to the traditional learning algorithm for tree automata from positive and negative examples. While our learning algorithm has a solid formal foundation, it also performs very well in practice for schema-guided pruning strategies.
This is joint work with Jérôme Champavère, Rémi Gilleron, and Aurélien Lemay.
