Skip to main content

G-Store: A Storage Manager for Graph Data

Supervisor

Suitable for

Abstract

Many applications require data modeled by very large graphs. The standard approach to such applications is to shred such graphs into the very rigid relational model, which is backed up by decades of research and comercially successful database management systems such as Oracle, IBM DB2, or PostgreSQL. One major problem with this relational encoding of graphs is that natural operations on graphs, such as checking for reachability or computing degrees of separation, require a partial or total reconstruction of the original graph through the use of very many joins. Such joins can be very expensive and need to be repeatedly computed at run-time. The aim of this project is to investigate alternative low-level storage models for graph data that better support common operations on graphs. A new storage manager should be specified and implemented assuming the standard I/O sequential model. Ideally, the storage manager should provide an API similar to BerkeleyDB or PostgreSQL storage manager, so that it can be directly used as a replacement of standard relational managers in existing applications. Further tasks include: query engine for graph queries on top of the storage manager, incorporation of indexes, graph query optimization.

The Web page of the current prototype is hosted at g-store.sourceforge.net.

Prerequisites: Strong C/C++ skils; Very good marks in the Database Systems Implementation course (or evidence of equivalent qualification).