Skip to main content

Datalog Relaunched: Simulation Unification and Value Invention

François Bry‚ Tim Furche‚ Bruno Marnette‚ Clemens Ley‚ Benedikt Linse and Olga Poppe


For reasoning on the Web, Datalog is lacking data extraction and value invention. This article proposes to overcome these limitations with "simulation uni cation" and RDFLog". Simulation uni cation is a non-standard uni cation inspired from regular path queries. Like standard uni cation, it yields bindings for variables in both terms to unify. Unlike standard uni cation, it does not try to make the two terms identical but instead to embed the query into the data. Simulation uni cation is decidable. Without variables, it has polynomial complexity. With variables it is, like standard uni cation, np-complete. We identify a number of interesting special cases of uni cation, e.g., in presence or absence of term injectivity. In particular, we show that simulation uni cation without term injectivity on tree data is linear and in presence of injectivity it is still polynomial even on unordered trees in contrast to the np-complete unordered tree inclusion problem. RDFLog is Datalog with arbitrary quanti er alternation: Blank nodes, i.e., existentially quanti ed variables, in rule heads may be governed by universally quanti ed variables, universally quanti ed variables by blank nodes. RDFLog's declarative semantics is de ned in terms of RDF entailment; its sound and complete operational semantics, in terms of Skolemization, standard Datalog evaluation, and un-Skolemization. We show that RDFLog limited to forall* exists*-pre xes is (up to unique helper pred- icates) equivalent to RDFLog with full quanti er alternation. A light- weight implementation points to the eciency of the approach

Pro­ceed­ings of the Dat­a­log 2.0 Work­shop