
How Big Must Complete XML Query Languages Be?

Clemens Ley and Michael Benedikt


Marx and de Rijke have shown that the navigational core of the W3C XML query language XPath is not first-order complete; that is, it cannot express every query definable in first-order logic over the navigational predicates. How can one extend XPath to get a first-order complete language? Marx has shown that Conditional XPath, an extension of XPath with an "Until" operator, is first-order complete. The completeness argument makes essential use of the upward axes in Conditional XPath. We examine whether it is possible to obtain "forward-only" languages that are first-order complete for XML Boolean queries. It is easy to see that a variant of the temporal logic CTL* is first-order complete; the variant has path quantifiers for downward, leftward and rightward paths, while along a path one can check arbitrary formulas of linear temporal logic (LTL). This language has two major disadvantages: it requires path quantification in both horizontal directions (in particular, it requires looking backward at the prior siblings of a node), and it requires considering LTL formulas of arbitrary complexity on vertical paths. The latter contrasts with Marx's Conditional XPath, which requires checking only a single Until operator on a path. We investigate whether either of these requirements can be eliminated. Our main results are negative. We show that if we restrict our CTL* language to an Until operator in only one horizontal direction, then we lose completeness. We also show that no restriction to a "small" subset of LTL along vertical paths is sufficient for first-order completeness. Smallness here means bounded "Until Depth", a measure of the complexity of LTL formulas defined by Etessami and Wilke. In particular, it follows from our work that Conditional XPath with only forward axes is not expressively complete; this extends results proved by Rabinovich and Maoz in the context of infinite unordered trees.
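
To make the notion of Until Depth concrete, the following is a minimal Python sketch, not taken from the paper. It assumes that Until Depth is the maximum nesting of Until operators in a formula, with Next and the Boolean connectives not counted, and it evaluates formulas over a finite path using a non-strict Until. The class names, the `until_depth` and `holds` functions, and the encoding of a path as a list of label sets are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass(frozen=True)
class Atom:
    label: str


@dataclass(frozen=True)
class Not:
    arg: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Next:
    arg: "Formula"


@dataclass(frozen=True)
class Until:
    hold: "Formula"  # must hold at every position before the goal position
    goal: "Formula"  # must hold at some position from the current one on


Formula = Union[Atom, Not, And, Next, Until]


def until_depth(f: Formula) -> int:
    """Maximum nesting of Until operators (assumed reading of Until Depth)."""
    if isinstance(f, Atom):
        return 0
    if isinstance(f, (Not, Next)):
        return until_depth(f.arg)
    if isinstance(f, And):
        return max(until_depth(f.left), until_depth(f.right))
    if isinstance(f, Until):
        return 1 + max(until_depth(f.hold), until_depth(f.goal))
    raise TypeError(f"unknown formula: {f!r}")


def holds(f: Formula, path: List[Set[str]], i: int = 0) -> bool:
    """Evaluate f at position i of a finite path (assumed non-strict Until)."""
    if isinstance(f, Atom):
        return f.label in path[i]
    if isinstance(f, Not):
        return not holds(f.arg, path, i)
    if isinstance(f, And):
        return holds(f.left, path, i) and holds(f.right, path, i)
    if isinstance(f, Next):
        return i + 1 < len(path) and holds(f.arg, path, i + 1)
    if isinstance(f, Until):
        for j in range(i, len(path)):
            if holds(f.goal, path, j):
                return True   # goal reached; hold was true at i..j-1
            if not holds(f.hold, path, j):
                return False  # hold broke before goal was reached
        return False
    raise TypeError(f"unknown formula: {f!r}")


# A vertical path whose nodes carry one label each, read top-down.
path = [{"a"}, {"a"}, {"b"}, {"c"}]
nested = Until(Atom("a"), Until(Atom("b"), Atom("c")))  # a U (b U c)
flat = Until(Atom("a"), Atom("c"))                      # a U c
```

Under these assumptions, `nested` has Until Depth 2 and `flat` has Until Depth 1; a fragment of bounded Until Depth would admit only formulas below some fixed value of `until_depth`.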

Oxford University
International Conference on Database Theory