PDQ (Proof-Driven Querying) is an approach to generating query plans over semantically-interconnected datasources with diverse access interfaces. PDQ unifies a number of application scenarios involving such datasources.

For example, PDQ can provide a solution for querying data available through Web-based APIs. For a given query there may be many Web-based sources that can be used to answer it, with the sources overlapping in their vocabularies and differing in their access restrictions (required arguments) and cost. PDQ can determine whether the query can be answered using the datasources and, if so, can generate the optimal plan. PDQ can also be applied to more traditional database scenarios, such as optimization of queries in relational databases in the presence of integrity constraints and query optimization using materialized views. PDQ works by generating query plans from proofs that a query is answerable. The PDQ planner performs optimization by exploring a space of proofs, with each proof corresponding to a different plan.
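To make the notions of access restrictions and plan cost concrete, here is a toy sketch in Python. It is not PDQ's algorithm or API: it ignores integrity constraints and proofs entirely, and simply runs a uniform-cost search over sets of bound attributes. The class `AccessMethod`, the function `cheapest_plan`, and the example sources (`directory_by_name`, `profile_by_id`, `profile_dump`) are all hypothetical names introduced for illustration.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class AccessMethod:
    """Hypothetical model of a source interface (not a PDQ class)."""
    name: str
    inputs: frozenset   # attributes that must be bound before the call
    outputs: frozenset  # attributes the call makes available
    cost: float         # assumed fixed cost per call


def cheapest_plan(methods, known, wanted):
    """Uniform-cost search over sets of bound attributes.

    Returns (total_cost, [method names]) for a cheapest sequence of
    accesses that binds every wanted attribute, or None if no sequence
    of the given access methods can do so.
    """
    wanted = frozenset(wanted)
    tie = count()  # tie-breaker so the heap never compares sets or lists
    frontier = [(0.0, next(tie), frozenset(known), [])]
    settled = set()
    while frontier:
        cost, _, bound, plan = heapq.heappop(frontier)
        if wanted <= bound:
            return cost, plan
        if bound in settled:
            continue
        settled.add(bound)
        for m in methods:
            # A method is usable only if its required arguments are bound,
            # and worth calling only if it binds something new.
            if m.inputs <= bound and not m.outputs <= bound:
                heapq.heappush(
                    frontier,
                    (cost + m.cost, next(tie), bound | m.outputs, plan + [m.name]),
                )
    return None


# Hypothetical sources with overlapping vocabularies and different costs.
methods = [
    AccessMethod("directory_by_name", frozenset({"name"}), frozenset({"id"}), 5),
    AccessMethod("profile_by_id", frozenset({"id"}), frozenset({"email", "phone"}), 1),
    AccessMethod("profile_dump", frozenset(), frozenset({"id", "name", "email"}), 50),
]

best = cheapest_plan(methods, known={"name"}, wanted={"email"})
unanswerable = cheapest_plan(methods, known=set(), wanted={"salary"})
```

With these made-up sources, the two-step plan through `directory_by_name` and `profile_by_id` is preferred over the unrestricted but expensive `profile_dump`, and a request for an attribute that no source exposes yields `None`. PDQ itself works at the level of proofs over schemas with constraints, which this attribute-level sketch does not attempt to model.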

PDQ also has strong connections with many fundamental issues in computational logic. It can be seen as an application of interpolation and Beth definability to data management, and as part of the project we have examined decidability and effective interpolation for a number of logics.

Current members

Former members


Obtaining PDQ

The current version of the PDQ software is available on GitHub here.

An older version of the PDQ source code, which was used for the experiments in several papers, can be found here. Users can also directly download the pre-built .jar. A quick introduction to plan creation and plan execution using PDQ can be found in the source code wiki. By downloading the sources and/or binaries of PDQ, you agree to the following academic licence.

User interface

A demonstration version of the PDQ Webgui can be found here. The older user interface can be downloaded from here. The following video presents a walk-through of the user interface.