PDQ

Overview

PDQ (Proof-Driven Querying) is an approach to generating query plans over semantically-interconnected datasources with diverse access interfaces. PDQ unifies a number of application scenarios involving such datasources.

For example, PDQ can provide a solution for querying data available through Web-based APIs. For a given query there may be many Web-based sources that can be used to answer it, with the sources overlapping in their vocabularies and differing in their access restrictions (required arguments) and cost. PDQ can determine whether the query can be answered using the datasources and, if so, generate the optimal plan. PDQ can also be applied to more traditional database scenarios, such as query optimization in relational databases in the presence of integrity constraints and query optimization using materialized views. PDQ works by generating query plans from proofs that a query is answerable. The PDQ planner performs optimization by exploring a space of proofs, with each proof corresponding to a different plan.
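The role of access restrictions and cost in planning can be illustrated with a toy sketch. The code below is not PDQ's API and does not build proofs or use integrity constraints; it is a simplified model, written for illustration only, in which each hypothetical access method requires some attributes to be bound before it can be called, binds further attributes when called, and has a fixed cost. All names (AccessMethod, cheapest_plan, and the three example methods) are assumptions introduced here. The search simply enumerates call sequences and keeps the cheapest one that is executable and yields the wanted attributes.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class AccessMethod:
    """A hypothetical access method: inputs must be bound before the call."""
    name: str
    inputs: frozenset   # attributes required as arguments
    outputs: frozenset  # attributes bound after the call
    cost: float


def is_executable(plan, known):
    """True if every method's required inputs are bound when it is called."""
    bound = set(known)
    for method in plan:
        if not method.inputs <= bound:
            return False
        bound |= method.outputs
    return True


def covers(plan, known, wanted):
    """True if the plan binds every wanted attribute."""
    bound = set(known)
    for method in plan:
        bound |= method.outputs
    return wanted <= bound


def cheapest_plan(methods, known, wanted):
    """Brute-force search over call sequences; returns (cost, plan) or None."""
    best = None
    for size in range(1, len(methods) + 1):
        for plan in permutations(methods, size):
            if is_executable(plan, known) and covers(plan, known, wanted):
                cost = sum(m.cost for m in plan)
                if best is None or cost < best[0]:
                    best = (cost, plan)
    return best


# Illustrative (made-up) sources: two restricted lookups and one costly free dump.
methods = [
    AccessMethod("profile_by_id", frozenset({"id"}), frozenset({"name", "phone"}), 1.0),
    AccessMethod("ids_by_city", frozenset({"city"}), frozenset({"id"}), 5.0),
    AccessMethod("phone_dump", frozenset(), frozenset({"id", "phone"}), 50.0),
]

cost, plan = cheapest_plan(methods, known={"city"}, wanted={"phone"})
print(cost, [m.name for m in plan])  # 6.0 ['ids_by_city', 'profile_by_id']
```

In this example the free but expensive dump is rejected in favour of chaining the two restricted lookups, and a call to profile_by_id on its own is ruled out because its required argument is not yet bound. Exhaustive enumeration is used only to keep the sketch short; it does not scale beyond a handful of methods.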

PDQ also has strong connections with many fundamental issues in computational logic. It can be seen as an application of interpolation and Beth definability to data management, and as part of the project we have examined decidability and effective interpolation for a number of logics.

Current members

Michael Benedikt – Principal investigator

Former members

Brandon Moore – Software Developer (2021)
Camilo Ortiz – Intern (2020)
Fergus Cooper – Software Developer (2020)
Gabor Gyorkei – Software Developer (2018-2020)
Stanislav Kikot – Postdoc (2018-2020)
Mark Ridler – Software Developer (2018-2019)
Stefano Germano – 2019-present
George Konstantinidis – 2016-2017
Julien Leblay – 2014-2015
Efthymia Tsamoura – 2014-2018

References

Generating Low-cost Plans From Proofs
Michael Benedikt, Balder ten Cate, and Efthymia Tsamoura, PODS 2014
Draft of long version, to appear in ACM TODS, and its electronic appendix
PDQ: Proof-driven Query Answering over Web-based Data
Michael Benedikt, Julien Leblay, and Efthymia Tsamoura, VLDB 2014 (demo)
Querying with Access Patterns and Integrity Constraints
Michael Benedikt, Julien Leblay, and Efthymia Tsamoura, VLDB 2015 (see also our experiments site)
Interpolation with Decidable Fixpoint Logics
Michael Benedikt, Balder ten Cate, and Michael Vanden Boom, ICALP 2015
Generating Plans from Proofs: The Interpolation-based Approach to Query Reformulation
Michael Benedikt, Julien Leblay, Efthymia Tsamoura, and Balder ten Cate, book published by Morgan & Claypool
Biological Web Services: Integration, Optimization, and Reasoning
Michael Benedikt, Rodrigo Lopez-Serrano, and Efthymia Tsamoura, IJCAI BAI 2016
Reformulating Queries: Theory and Practice
Michael Benedikt, Egor Kostylev, Fabio Mogavero, and Efthymia Tsamoura, IJCAI 2017
Characterizing Definability in Decidable Fixpoint Logics
Michael Benedikt, Pierre Bourhis, and Michael Vanden Boom, ICALP 2017
When Can We Answer Queries Using Result-bounded Data Interfaces?
Antoine Amarilli and Michael Benedikt, PODS 2018

Obtaining PDQ

The current version of the PDQ software is available on GitHub here.

An older version of the PDQ source code, which was used for the experiments in several papers, can be found here. Users can also directly download the pre-built .jar. A quick introduction to plan creation and plan execution using PDQ can be found in the source code wiki. By downloading the sources and/or binaries of PDQ, you agree to the following academic licence.

User interface

A demonstration version of the PDQ web GUI can be found here. The older user interface can be downloaded from here. The following video presents a walk-through of the user interface.

Proof-Driven Query Planning