RDFLog: It's like Datalog for RDF
Francois Bry, Tim Furche, Clemens Ley, Benedikt Linse and Bruno Marnette
RDF data is set apart from relational or XML data by its support for rich existential information in the form of blank nodes. Whereas SQL null values are always scoped over a single statement, blank nodes in RDF can span any number of statements and can thus be seen as existentially quantified variables scoped over conjunctions of RDF triples. Most RDF query languages support querying blank nodes, but blank node construction, i.e., the introduction of new blank nodes, has been mostly ignored (e.g., in Triple) or treated in a very limited form (e.g., in SPARQL). In this paper, we classify three kinds of blank nodes in RDF query languages and introduce the recursive, rule-based RDF query language RDFLog. RDFLog is the first RDF query language with full arbitrary quantifier alternation: blank nodes may occur in the scope of all, some, or none of the universal variables of a rule. In addition, RDFLog is aware of important RDF features such as the distinction between blank nodes, literals, and URIs, or the RDFS vocabulary. The semantics of RDFLog is closed (every answer is an RDF graph), but lifts RDF's restrictions on literal and blank node occurrences for intermediary data. We show how to define a sound and complete operational semantics that can be implemented using existing logic programming techniques. Our experimental evaluation shows that our prototype implementation of RDFLog is comparable in efficiency to SPARQL implementations, yet considerably more expressive.
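
To make the three quantifier scopes concrete, the following is an illustrative sketch only. The ternary predicate T (read "is a triple") and the property names attends, hasRecord, forLecture, hasFile, covers, and enrolledAt are assumptions chosen for exposition, not vocabulary or notation taken from the paper. The three rules differ only in where the existential quantifier for the blank node z sits relative to the universal variables x and y:

```latex
% Illustrative only: T and all property names are hypothetical.
\begin{align*}
% z in the scope of ALL universal variables: one fresh blank node per pair (x, y)
&\forall x\,\forall y\,\exists z\;
  \bigl(T(x,\mathit{attends},y) \rightarrow
        T(x,\mathit{hasRecord},z) \wedge T(z,\mathit{forLecture},y)\bigr)\\
% z in the scope of SOME universal variables: one blank node per x, shared by all y
&\forall x\,\exists z\,\forall y\;
  \bigl(T(x,\mathit{attends},y) \rightarrow
        T(x,\mathit{hasFile},z) \wedge T(z,\mathit{covers},y)\bigr)\\
% z in the scope of NO universal variable: a single blank node for the whole answer
&\exists z\,\forall x\,\forall y\;
  \bigl(T(x,\mathit{attends},y) \rightarrow T(x,\mathit{enrolledAt},z)\bigr)
\end{align*}
```

Under the standard Skolemization reading, z corresponds to a term f(x, y) in the first rule, to f(x) in the second, and to a constant in the third, which is why the three rules generally yield different numbers of blank nodes on the same input.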