PDQ (Proof-Driven Querying) is an approach to generating query plans over semantically-interconnected datasources with diverse access interfaces. PDQ unifies a number of application scenarios involving such datasources.
For example, PDQ can provide a solution for querying data available through Web-based APIs. For a given query there may be many Web-based sources that can be used to answer it, with the sources overlapping in their vocabularies and differing in their access restrictions (required arguments) and cost. PDQ can determine whether the query can be answered using the datasources and, if so, can generate the optimal plan. PDQ can also be applied to more traditional database scenarios, such as optimization of queries in relational databases in the presence of integrity constraints and query optimization using materialized views.
PDQ works by generating query plans from proofs that a query is answerable. The PDQ planner performs optimization by exploring a space of proofs, with each proof corresponding to a different plan.
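The role of access restrictions and cost can be illustrated with a toy sketch. It is not PDQ's algorithm or API: PDQ searches a space of proofs, whereas this sketch simply enumerates orderings of accesses by brute force. The class `AccessMethod`, the functions `bound_after` and `cheapest_plan`, and the three example sources are all hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class AccessMethod:
    """Hypothetical model of a source with required arguments and a cost."""
    name: str
    inputs: frozenset   # attributes that must be known before the call
    outputs: frozenset  # attributes known after the call
    cost: float


def bound_after(plan, bound):
    """Return the attributes known after running `plan`, or None if some
    access is called before its required arguments are available."""
    known = set(bound)
    for method in plan:
        if not method.inputs <= known:
            return None
        known |= method.outputs
    return known


def cheapest_plan(methods, bound, wanted):
    """Brute-force search (an assumption of this sketch, not PDQ's
    proof-driven search): try every ordering of every subset of methods
    and keep the cheapest executable one that yields `wanted`."""
    best = None
    for size in range(1, len(methods) + 1):
        for plan in permutations(methods, size):
            known = bound_after(plan, bound)
            if known is None or not wanted <= known:
                continue
            cost = sum(m.cost for m in plan)
            if best is None or cost < best[0]:
                best = (cost, list(plan))
    return best


# Made-up sources: overlapping vocabularies, different restrictions and costs.
methods = [
    AccessMethod("directory", frozenset({"name"}), frozenset({"phone"}), 5.0),
    AccessMethod("profile", frozenset({"name"}), frozenset({"id"}), 1.0),
    AccessMethod("contacts", frozenset({"id"}), frozenset({"phone"}), 2.0),
]

answerable = cheapest_plan(methods, bound={"name"}, wanted={"phone"})
unanswerable = cheapest_plan(methods, bound=set(), wanted={"phone"})
```

Under these made-up costs, the query is answerable when `name` is given, and chaining `profile` and `contacts` is cheaper than calling `directory` directly; with no bound attributes, no access can be made and the search returns `None`.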
Generating Low-cost Plans From Proofs, PODS 2014.
PDQ: Proof-driven Query Answering over Web-based Data, VLDB 2014 (demo).
Querying with Access Patterns and Integrity Constraints, VLDB 2015 (see also our experiments sites).
Generating Plans from Proofs: The Interpolation-based Approach to Query Reformulation, book published by Morgan & Claypool.
Obtaining PDQ
The PDQ source code can be found here. Users can also directly download the pre-built .jar. A quick introduction to plan creation and plan execution using PDQ can be found here and here, while this tutorial describes the steps to build PDQ from its sources. By downloading the sources and/or binaries of PDQ, you agree to the following academic licence.
The user interface can be downloaded from here. The following video presents a walk-through of the user interface.