Uncertain Database Management Systems

Supervisor

Dan Olteanu

Suitable for

MSc in Computer Science

Mathematics and Computer Science, Part C

Computer Science, Part C

Computer Science, Part B

Abstract

Not available in 2013/14

Today, uncertainty is commonplace in data management scenarios that involve data integration, sensor readings, information extraction from unstructured sources, or manually entered information, which is prone to inaccuracy or incompleteness. In these scenarios, uncertainty arises, respectively, from alternative mappings between the schemas of different sources or possible non-identical duplicate records, different interpretations of sensor data, multiple extraction possibilities from unstructured data, and several possible readings of manually filled forms.

To accommodate uncertainty, current data management technology should pursue a paradigm shift from deterministic to possible-worlds semantics and address the basic data management problems in the new context. Projects on this topic should support this paradigm shift by investigating some of the following directions:

compact representation systems for large sets of possible worlds,
techniques for processing succinct representations of possible worlds and for expressing constraints on them,
uncertainty-aware query languages beyond relational algebra.
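
To make the first direction concrete, the following minimal Python sketch shows how a compact representation can stand for many possible worlds. It assumes a tuple-independent table, in which each tuple is present independently with a given probability; this is one common representation, chosen here purely for illustration. The table contents, the function names, and the example query are all hypothetical and are not taken from any particular system.

```python
from itertools import product

# Hypothetical tuple-independent table of sensor readings: each
# (sensor, value) tuple is present independently with the given probability.
readings = [
    (("s1", 21), 0.9),
    (("s2", 35), 0.5),
    (("s3", 38), 0.2),
]


def possible_worlds(table):
    """Yield (world, probability) for every subset of the table's tuples."""
    for choices in product([True, False], repeat=len(table)):
        world = []
        prob = 1.0
        for (tup, p), present in zip(table, choices):
            if present:
                world.append(tup)
                prob *= p
            else:
                prob *= 1.0 - p
        yield world, prob


def query_probability(table, query):
    """Sum the probabilities of the worlds in which a Boolean query holds."""
    return sum(prob for world, prob in possible_worlds(table) if query(world))


def some_reading_above_30(world):
    """Example Boolean query: does any reading in this world exceed 30?"""
    return any(value > 30 for _sensor, value in world)


# Probability obtained by enumerating all 2**3 = 8 possible worlds.
p_enumerated = query_probability(readings, some_reading_above_30)

# For this simple query over independent tuples, the same probability can be
# read off the compact representation directly, without enumeration.
p_direct = 1.0 - (1.0 - 0.5) * (1.0 - 0.2)
```

The three stored tuples describe eight possible worlds, and the number of worlds doubles with every further uncertain tuple. Enumeration therefore only serves to define the semantics; evaluating queries directly on the succinct representation, as in the last line of the sketch, is the kind of processing the directions above are concerned with.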

All of these directions can lead to both theoretical and practical (implementation-oriented) projects. Anyone interested in doing a project on one of these topics is encouraged to get in touch with Dan Olteanu to explore specific ideas, such as:

Query evaluation: tractability and efficient algorithms
Approximate and incremental view maintenance in probabilistic databases
Synthesising query mappings for input and output probabilistic data
View materialization for query optimization in probabilistic databases
Modelling and processing streams of uncertain sensor data
Algebraic optimizations for the MayBMS query language

Prerequisites: All projects within this framework require prior exposure to databases, though some projects may require knowledge of only database theory or only database systems. In the latter case, strong C/C++ skills are essential (proficiency in any other general-purpose programming language is, in any case, an important start). Students with very good marks in the Database Systems Implementation course are preferred.
