Schema Mappings and Automated Services for Data Integration and Exchange
1st February 2007 to 31st July 2010
This project, which is predominantly in the area of database theory, deals with schema mappings in the context of data exchange and data integration. Data Exchange is the problem of inserting data structured under one or more source schemas into a target schema of different structure (possibly with integrity constraints) while reflecting the source data as accurately as possible. Data Integration is the problem of answering queries formulated over a target schema that integrates the information provided by several data sources over one or more source schemas. In order to achieve significant progress, we will study classes of schema mappings for which these two problems become tractable and practicable. Rather than restricting ourselves to the study of single schema mappings between one source and one target schema, we will also consider more general settings, where several schemas are connected by various schema mappings and their compositions. We will start from the framework where the integrity constraints on schema mappings and databases are given by arbitrary embedded dependencies. This setting is far too general and needs to be restricted in order to obtain tractability. In the context of a comprehensive complexity analysis, we will therefore identify critical parameters, analyze restrictions (e.g., imposing bounds on some parameters) and investigate tractable settings. Based on these results, we will then design new algorithms for data exchange and integration. We will study how data exchange and integration can best be performed via Web services and develop a corresponding model. Finally, we will implement and test our new algorithms in this context. The project is organised into four main work packages. WP1 studies the efficient computation of solutions to data exchange problems, mainly through new variants of the Chase procedure. WP2 focuses on the efficient computation of succinct solutions, so-called "cores", and on the study of algebraic properties of schema mappings.
WP3 consists of a comprehensive complexity study tracing the tractability frontier for data exchange problems based on various parameters. In WP3, we will also develop heuristics for dealing with computationally hard problems. In WP4, we will elaborate a new model for using Web service technology for data integration and exchange, implement and test our new algorithms and methods in that model, and assess the feasibility of the overall approach. The scientific project staff will consist of the applicant (part-time), one post-doc, and two doctoral students. While both students are expected to cooperate closely with each other, one will do more theory-oriented work, while the other will spend more time on developing the Web-service model for data integration and exchange. We plan to publish the results in top database journals and at leading international conferences. The Web services will be made accessible to the public.
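To make the data exchange setting described above more concrete, the following is a minimal Python sketch of a chase step for a single source-to-target dependency. It is an illustration only and not one of the project's algorithms. The relations Emp, Dept and Works, the dependency Emp(n, d) -> EXISTS i . Dept(i, d) AND Works(n, i), and the function name chase are all hypothetical and chosen purely for this example. Values the source does not determine (here, the department identifier i) are filled with fresh labelled nulls such as _N1.

```python
from itertools import count


def chase(emp_facts):
    """Chase one hypothetical source-to-target dependency:

        Emp(n, d) -> EXISTS i . Dept(i, d) AND Works(n, i)

    emp_facts is a set of (name, department) pairs from the source.
    Returns the target relations (dept, works) as sets of tuples.
    """
    fresh = count(1)          # generator for labelled-null suffixes
    dept, works = set(), set()
    # Sorting only makes the naming of nulls deterministic.
    for n, d in sorted(emp_facts):
        # Fire the dependency only if its conclusion is not yet satisfied,
        # i.e. there is no i with Dept(i, d) and Works(n, i) in the target.
        satisfied = any((n, i) in works for (i, d2) in dept if d2 == d)
        if not satisfied:
            null = f"_N{next(fresh)}"   # fresh labelled null for i
            dept.add((null, d))
            works.add((n, null))
    return dept, works


source = {("alice", "db"), ("bob", "db"), ("carol", "ai")}
dept, works = chase(source)
for fact in sorted(dept):
    print("Dept", fact)
for fact in sorted(works):
    print("Works", fact)
```

The resulting target instance contains one labelled null per source fact, since nothing in the assumed dependency forces two employees of the same department to share an identifier; adding target constraints would change this.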