Database Systems Implementation: 2008-2009
Lecturer | |
Degrees | Schedule C1 — Computer Science Part C — Mathematics and Computer Science |
Term | Hilary Term 2009 (20 lectures) |
Overview
This course examines the data structures and algorithms underlying database management systems such as Oracle or PostgreSQL. It covers techniques from both research literature and commercial systems.
Learning outcomes
At the end of this course, students should
- have a good insight into how DBMSs function internally
- understand how to analyse the performance of data-intensive systems
- be familiar with a variety of programming techniques for large-scale data manipulation
- apply the insights achieved to build the major components of a mini-DBMS (Minibase, see http://pages.cs.wisc.edu/~dbbook/openAccess/Minibase/minibase.html)
Prerequisites
This course assumes a reasonable understanding of the following topics:
- developing applications on relational DBMSs (SQL, relational algebra - completed an introductory course on Databases)
- sorting/searching techniques (quick/merge sorts, binary trees, hash tables - course on Design and Analysis of Algorithms)
This course will use C and C++. The practicals will have increasing difficulty, to allow students to learn the languages.
Synopsis
- Introduction: DBMS concepts; Short recap on relational algebra and SQL;
- Data Storage (Disks and Files): Page Access, Arrangement; Buffer Management; File organizations (Heap, Sorted, Clustered files); Row stores versus column stores;
- Indexes: Tree-Structured (ISAM, B+Tree); Hash-based (Static, Extendible, Linear)
- External Sorting: external merge sort; sorting based on (un)clustered B+tree;
- Query Evaluation: Binary matching operations (nested-loops, index nested, block nested, merge, hash joins); projection (with duplicate removal; based on hashing, sorting); set operations; aggregation; universal quantification; impact of buffering; evaluation techniques in existing systems;
- Query Optimization: equivalences of relational algebra; query plans; cost estimation; nested queries; optimization techniques in existing systems;
- Transaction Management: Concurrency Control (Serializability, Index/Predicate/ Locking, Multiple Granularity Locks, Kung Robinson Model); Crash Recovery (ACID properties, Logging, Checkpointing, Analysis/REDO/UNDO Phases)
Syllabus
The internals of a relational database management system; datastorage; indexes; sorting; query evaluation and optimization
techniques; transaction management; concurency control; recovery;
comparisons of some existing systems; use cases: Minibase and
PostgreSQL.
Reading list
- Ramakrishan and Gehrke. Database Management Systems, McGraw-Hill, 2002 (3rd ed).
- Garcia-Molina, Ullman, Widom. Database Systems: The Complete Book. Prentice Hall, 2002 (or 2008, 2nd ed).
Survey
- Goetz Graefe. Query evaluation techniques for large databases. ACM
Computing Surveys (CSUR). Volume 25 , Issue 2 (June 1993). Available at http://portal.acm.org/citation.cfm?id=152611.
Taking our courses
This form is not to be used by students studying for a degree in the Department of Computer Science, or for Visiting Students who are registered for Computer Science courses
Other matriculated University of Oxford students who are interested in taking this, or other, courses in the Department of Computer Science, must complete this online form by 17.00 on Friday of 0th week of term in which the course is taught. Late requests, and requests sent by email, will not be considered. All requests must be approved by the relevant Computer Science departmental committee and can only be submitted using this form.