FDB: Factorised Databases

January 2010, ongoing

We investigate foundational and systems aspects of scalable data management at the confluence of compression, distribution, and approximation for mixed query and machine learning workloads on relational data.

New PhD and postdoc positions available in my group funded by a 5-year ERC consolidator grant! Please contact a project member for details if you are interested.

News

  • Sept 2017: Our own Ahmet Kara helped out at the ERC-funded Curiosity Festival in Oxford.
  • June 2017: We received a Microsoft Azure grant to support our research experiments. Thank you Microsoft!
  • March 2017: The CS department and Oxford celebrated ERC on its 10th anniversary, see here and here.
  • October 2016: Pierre-Yves was awarded the Hoare Prize for best MSc in CS project 2016. His project investigates the problem of many-core and distributed regression learning over factorised joins.
  • May 2016: Maximilian received a SIGMOD travel grant to attend SIGMOD in San Francisco in June 2016.
  • March 2016: The FDB project was featured in three interviews with Dan on Romanian National Radio (Radio Romania Cultural).
  • March 2016: Maximilian received an honourable mention (2nd place out of 17 submissions) in the Vienna Centre for Logic and Algorithms (VCLA) International Student Awards for Outstanding Master Thesis.
  • December 2015: Dan received a prestigious ERC Consolidator Grant worth almost 2 million euros to work on foundations of factorized data management systems.
  • September 2015: Maximilian was awarded the Hoare Prize for best MSc in CS project 2015. His project investigates the problem of learning linear regression models over factorised joins.
  • January 2015: We received an AWS in Education Research grant. Thank you Amazon!
  • August 2014: We received a Google Research Award. Thank you Google!
  • November 2013: Congratulations to Jakub, who successfully defended his PhD and joined Google Zurich!
  • October 2013: Laura was awarded the Hoare Prize for best MSc in CS project 2013. Her project investigates the problem of updates in factorised databases.
  • September 2012: Tomáš was awarded the Gloucester Research Project Prize for best 4th year Maths&CS project 2012. His project investigates the evaluation problem for queries with ORDER-BY and GROUP-BY clauses on factorised databases.
  • June 2012: Jakub will do an internship over the summer of 2012 at Google Zurich.

Talks

Given by various members of the project

  • Covers of Query Results
    • Workshop Aggregate Queries, Inria Lille and University of Lille, June 2017. [slides]
  • From Joins to Aggregates and Optimization Problems: One idea to rule them all and at their core factorize them!
    • Turing Data Science Course, Alan Turing Institute, Jan 2018.
    • Tutorial at LogiCS/RiSE Summer School, TU Vienna, July 2017. [slides]
  • A Unified Approach to Incremental View Maintenance of In-Database Analytics
    • Workshop Big Graph Analysis Systems, University of Copenhagen, Aug 2017.
  • In-Database Factorized Learning
    • Dagstuhl Seminar Recent Trends in Knowledge Compilation, Dagstuhl, Sept 2017.
    • Workshop Big Graph Analysis Systems, University of Copenhagen, Aug 2017.
    • Seminar, CDT Data Science, University of Edinburgh, March 2017.
    • Computer Science Colloquium, University of Warwick, March 2017.
  • Cultural Learnings of Factorized Databases for Make Benefit Glorious Family of Regression Tasks
    • University of Twente, June 2016.
    • Data Science Seminar, LogicBlox, Oct 2015.
    • Tech Seminar, Palantir, Palo Alto, Aug 2015.
    • Tech Seminar, LogicBlox, Berkeley, Aug 2015.
    • Database Seminar, Université Lille 1 and INRIA Lille - Nord Europe. July 2015, Lille.
  • Factorised Relational Databases
    (on various content of the FDB project and with slightly different titles)
    • Keynote talk at Alberto Mendelzon Workshop (AMW), Lima (Peru), May 2015.
    • LSDS (Large-Scale Distributed Systems) seminar, Imperial College London, November 2014.
    • Google MapReduce Infrastructure Seminar, February 7, 2014, Mountain View (California).
    • Pivotal Colloquium Series, January 24, 2014, Palo Alto (California).
    • LogicBlox Internal Online Seminar, December 17, 2013, Berkeley (California).
    • Google Ads Infrastructure Seminar, November 20, 2013, Mountain View (California).
    • Database Seminar, UC Santa Cruz, November 18, 2013, Santa Cruz (California).
    • Database Seminar, U Washington, November 8, 2013, Seattle (Washington).
    • AmpLab Seminar, UC Berkeley, October 30, 2013, Berkeley (California).
    • EECS Seminar, UC Merced, October 11, 2013, Merced (California).
    • Constraints Seminar, Oxford, March 2012.
    • Birkbeck Computer Science Departmental Seminar, March 2012, London.
    • Systems Seminar, EPFL, Feb 2012, Lausanne.
    • Information Systems Seminar, Oxford, November 2011.
    • Database Seminar, University of Edinburgh, March 2011, Edinburgh.
    • Dagstuhl Seminar on "Foundations of Distributed Data Management", Oct 2011, Dagstuhl.
  • Readability of Query Provenance
    Talk given by Dan in the Colloquium at UC Davis (Davis, August 2013).
  • Factorised Relational Databases: Results So Far and Open Questions
    A series of 3 invited lectures (each 45 minutes) given by Jakub at Olomouc University (September 2013, Czech Republic)

Theses

  • Nadezda Knorozova: Queries with Equality and Disequality Joins over Factorised Databases.
    MSc in CS, Oxford 2016.
  • Antonio Lombardo: Storage Layer for Factorized Data.
    MSc in CS, Oxford 2016.
  • Joe Kirk: Worst-Case Optimal Join At A Time.
    MSc in CS, Oxford 2015.
  • Szymon Wyleżoł: Cost-based Query Optimisation for Factorised Databases.
    MSc in Maths and CS, Oxford 2012.
  • Nurzhan Bakibayev: A Query Engine for Factorised Databases.
    MSc in CS, Oxford 2011.

Publications

The publications are grouped, with some overlap, under five main topics: overview papers; in-database analytics; incremental view maintenance; succinct data representations and factorized query processing; and provenance compression.

Overviews

  • Factorized Databases. [preliminary version]
    Dan Olteanu and Maximilian Schleich.
    In SIGMOD Record (Database Principles Column), vol. 45, no. 2, June 2016.
  • Factorized Databases: A Knowledge Compilation Perspective.
    Dan Olteanu. [pdf]
    To appear in BeyondNP, AAAI workshop, Phoenix, April 2016.

On In-Database Analytics

  • In-Database Learning with Sparse Tensors. [arxiv]
    Mahmoud Abo Khamis, Hung Ngo, XuanLong Nguyen, Dan Olteanu, and Maximilian Schleich.
    To appear in ACM Principles of Database Systems (PODS), Houston, June 2018.
    arXiv report 1703.04780, March 2017.
  • Incremental Maintenance of Regression Models over Joins. [local, arxiv]
    Milos Nikolic and Dan Olteanu.
    arXiv report 1703.07484, March 2017.
  • In-Database Factorized Learning. [local]
    Hung Ngo, XuanLong Nguyen, Dan Olteanu, and Maximilian Schleich.
    In Alberto Mendelzon Workshop (AMW), Montevideo, June 2017.
    Extended version in arXiv report 1703.04780.
  • F: Regression Models over Factorized Views. [pdf, poster]
    Dan Olteanu and Maximilian Schleich.
    To appear in PVLDB 9(13), New Delhi, Sept 2016.
  • Learning Linear Regression Models over Factorized Joins. [updated version, poster]
    Maximilian Schleich and Dan Olteanu and Radu Ciucanu.
    In SIGMOD, San Francisco, June 2016.

On Incremental View Maintenance

  • Incremental Maintenance of Regression Models over Joins. [local, arxiv]
    Milos Nikolic and Dan Olteanu.
    arXiv report 1703.07484, March 2017.

On Succinct Data Representations and Factorized Query Processing

  • Covers of Query Results. [arxiv]
    Ahmet Kara and Dan Olteanu.
    To appear in Int Conf on Database Theory (ICDT), Vienna, March 2018.
    arXiv report 1709.01600, September 2017.
  • Worst-Case Optimal Join At A Time.
    Radu Ciucanu and Dan Olteanu.
    Technical report, November 2015.
  • Size Bounds for Factorised Representations of Query Results. [preliminary version]
    Dan Olteanu and Jakub Závodný.
    ACM Transactions on Database Systems (TODS) 40(1):2, 2015.
  • Aggregation and Ordering in Factorised Databases. [pdf]
    Nurzhan Bakibayev, Tomáš Kočiský, Dan Olteanu, and Jakub Závodný.
    March 2013.
    In Very Large Data Bases (PVLDB), vol 6, 2013.
    Complementary information is available in Tomáš's thesis.
  • Demonstration of the FDB Query Engine for Factorised Databases. [pdf; poster]
    Nurzhan Bakibayev and Dan Olteanu and Jakub Závodný.
    System demonstration. In Very Large Data Bases (PVLDB), 5(12), 2012. Istanbul, 2012.
  • FDB: A Query Engine for Factorised Relational Databases. [pdf]
    Nurzhan Bakibayev and Dan Olteanu and Jakub Závodný.
    In Very Large Data Bases (PVLDB), 5(12), 2012. Istanbul, 2012.
    Prior version in arXiv technical report abs/1203.2672.
  • Factorised Representations of Query Results: Size Bounds and Readability. [pdf, slides]
    Dan Olteanu and Jakub Závodný.
    In Int Conf on Database Theory (ICDT), Berlin, 2012.
    Prior version (strict subset of results) in arXiv technical report 1104.0867, April 2011.

On Provenance Compression

  • Factorised Representations of Query Results: Size Bounds and Readability. [pdf, slides]
    Dan Olteanu and Jakub Závodný.
    In Int Conf on Database Theory (ICDT), Berlin, 2012.
    Prior version (strict subset of results) in arXiv technical report 1104.0867, April 2011.
  • On Factorisation of Provenance Polynomials. [pdf, poster]
    Dan Olteanu and Jakub Závodný.
    In 3rd USENIX Workshop on the Theory and Practice of Provenance (TaPP), June 2011, Heraklion, Crete.

Current Team

  • Ahmet Kara (postdoc)
  • Dan Olteanu (PI)
  • Maximilian Schleich (PhD)
  • Yu Tang (PhD)
  • Haozhe Zhang (PhD)

We are fortunate to work with the following collaborators: Mahmoud Abo Khamis (formerly LogicBlox, Inc.; now in stealth mode), Milos Nikolic (University of Oxford), Hung Q. Ngo (formerly LogicBlox, Inc.; now in stealth mode), and XuanLong Nguyen (University of Michigan).

Alumni:

  • Nurzhan Bakibayev (MSc in CS, 2011)
  • Szymon Wyleżoł (4th year Maths&CS, 2012)
  • Tomáš Kočiský (4th year Maths&CS, 2011-2012)
  • Laura Draghici (MSc in CS, 2013)
  • Jakub Závodný (DPhil, 2014)
  • Joe Kirk (MSc in CS, 2015)
  • Lambros Petrou (MSc in CS, 2015)
  • Maximilian Schleich (MSc in CS, 2015)
  • Radu Ciucanu (postdoc, 2015-2016)
  • Pierre-Yves Bigourdan (MSc in CS, 2016)
  • Nadezda Knorozova (MSc in CS, 2016)
  • Antonio Lombardo (MSc in CS, 2016)
  • Lukas Kobis (4th year, 2017)
  • Amir Payberah (postdoc; April - Aug 2017)
  • Denis Rochau (MSc in CS, 2017)
  • Ruth Wells (MSc in CS, 2017)

Acknowledgments

Jakub Závodný has been partially supported by the EPSRC DTA Grant EP/P505216/1. Radu Ciucanu has been supported by the DBOnto EPSRC platform grant (2016). Yu Tang has been supported by the VADA EPSRC programme grant (2015-2017). Dan Olteanu acknowledges the support of an Astor Travel Fund grant (2013), a Google Research Award (starting 2014), an AWS in Education Research grant (2015-2017), a Microsoft Azure grant (2017-), the FADAMS ERC consolidator grant (2016-2021), and a Fondation Wiener Anspach grant (2017-2019).
