Query Answering under Uncertainty in the Semantic Web

Supervisor

Suitable for

MSc in Computer Science

Abstract: The use of preferences in query answering, both in traditional databases and in ontology-based data access, has recently received much attention due to its many real-world applications. We are currently developing general methodologies for top-k query answering in Datalog+/- ontologies, subject both to the preferences of the user issuing a query and to a collection of opinions recorded in (subjective) reports provided by other users. The main objective is to develop computational tools for engineering objectivity or, depending on the application, for adjusting the opinions of others to the point of view of the querying user. As a quick example, consider the problem one faces when looking for a hotel with location and price as the most important features: how should one react to reviews that focus primarily on the hotel's parking facilities, staff friendliness, or gym facilities? Though ignoring them is an option, it may not be the best one, since such reviews may also contain information relevant to the user. As in many real-world services (such as hotels.com or orbitz.com), each report is assumed to consist of scores for a list of features (price, location, service, etc.), its author's preferences among the features, and other relevant information. The pieces of information in every report then need to be combined with the querying user's preferences and their trust in each report to rank the query results. Our goal is to develop different methodologies for arriving at such rankings, along with algorithms for top-k query answering under these rankings. Central to the effort is finding suitable conditions under which our algorithms run in polynomial time in data complexity.
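To make the ranking idea concrete, the following is a minimal Python sketch of one naive way to combine reports, user preferences, and trust into a top-k ranking. It is not the methodology the project aims to develop: the function name `rank_top_k`, the report layout `(object_id, scores, trust)`, and the trust-weighted average are all assumptions chosen for illustration, and the sketch ignores both the report authors' own preferences and any ontological reasoning.

```python
import heapq
from collections import defaultdict


def rank_top_k(reports, user_weights, k):
    """Return the k objects with the highest trust-weighted aggregate score.

    reports: iterable of (object_id, scores, trust), where scores maps a
             feature name to a number and trust is a non-negative weight
             (hypothetical layout, assumed for this sketch).
    user_weights: maps a feature name to the querying user's weight for it.
    """
    weighted_sum = defaultdict(float)
    trust_sum = defaultdict(float)
    for object_id, scores, trust in reports:
        # Re-score the report from the querying user's point of view:
        # features the user does not care about get weight 0.
        total_weight = sum(user_weights.get(f, 0.0) for f in scores)
        if total_weight == 0 or trust <= 0:
            continue  # the report says nothing usable for this user
        view = sum(user_weights.get(f, 0.0) * s for f, s in scores.items())
        view /= total_weight
        weighted_sum[object_id] += trust * view
        trust_sum[object_id] += trust
    # Trust-weighted average of the per-report scores for each object.
    aggregate = {o: weighted_sum[o] / trust_sum[o] for o in weighted_sum}
    return heapq.nlargest(k, aggregate.items(), key=lambda item: item[1])
```

In the hotel example, a review that scores only parking and price still contributes its price score for a user who cares about price and location, rather than being discarded outright.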

Potential Projects: Development of a general framework for query answering based on user preferences and a set of subjective reports over given objects (hotels, holiday packages, electronics, etc.). Key questions involve how to represent the user preferences and the reports, how to adapt opinions expressed in reports to the (perhaps quite different) point of view of the querying user, how to automatically extract preferences and reports from Web sources, and how to use them for ranking answers to ontological queries over knowledge bases.

Extension of the above framework to settings in which uncertainty plays an important role. In particular, this ties in with our previous research in probabilistic ontologies.

Implementation of the framework and the algorithms for ranking and top-k querying on different fragments of probabilistic Datalog+/-.

Experimental evaluation of implementations over both synthetic and real-world data.

Prerequisites: Participation in all aspects of the project requires good analytical skills and a background in knowledge representation and reasoning (in particular, first-order logic and logic programming). For the theoretical aspects, a background in computational complexity and database theory would be ideal. The practical (implementation-oriented) parts require software engineering skills, knowledge of Web programming to develop Web interfaces, and good knowledge of Java.

Supervision: The projects will be supervised by Gerardo Simari and Maria Vanina Martinez, in the context of the research being carried out in Thomas Lukasiewicz's group.