My
DBLP entry,
Google
Scholar profile, and
Microsoft
academic research profile.
2012
-
DAGger: Clustering Correlated Uncertain Data.
Dan Olteanu and Sebastiaan van Schaik.
System demonstration. In ACM SIGKDD Conf on Knowledge Discovery and Data Mining (KDD), Beijing 2012.
-
FDB: A Query Engine for Factorised Relational Databases.
[pdf]
Nurzhan Bakibayev and Dan Olteanu and Jakub Závodný.
In Very Large Data Bases (PVLDB), Istanbul, 2012.
CoRR technical report abs/1203.2672.
-
Aggregates in Probabilistic Databases via Knowledge
Compilation.
[pdf,
slides]
Robert Fink, Larisa Han, and Dan Olteanu.
In Very Large Data Bases (PVLDB), 5(5):490-501, 2012. Istanbul, 2012.
CoRR technical report abs/1201.6569.
-
Factorised Representations of Query Results: Size Bounds and Readability.
[pdf, slides]
Dan Olteanu and Jakub Závodný.
In Int Conf on Database Theory (ICDT), Berlin, 2012.
Prior version (strict subset of results) in
arXiv technical report 1104.0867, April 2011.
-
Ranking in Probabilistic Databases: Complexity and Efficient
Algorithms.
[pdf]
Dan Olteanu and Hongkai Wen.
In IEEE Int Conf on Data Engineering (ICDE),
Washington DC, 2012.
2011
-
Probablistic Databases
Synthesis Lectures on Data
Management, Morgan & Claypool, 2011, 180 pages.
Dan Suciu, Dan Olteanu, Christopher Re, Christoph Koch.
DOI: 10.2200/S00362ED1V01Y201105DTM016.
- The Next Big Challenge in Data Management:
Probabilistic Databases.
Robert Fink and Dan Olteanu.
Invited article in The Digit Magazine, June 2011
(10th Anniversary Special), in print.
More about the Digit Magazine in Wikipedia.
- Managing Probabilistic Data with SPROUT.
Dan Olteanu.
Article
in Inspired
Research (2nd edition), The Department of Computer Science,
University of Oxford. June 2011.
- On Factorisation of Provenance Polynomials.
[pdf, poster]
Dan Olteanu and Jakub Závodný.
In 3rd USENIX Workshop on the Theory and
Practice of Provenance (TaPP), June 2011, Heraklion, Crete.
- Repeatability and Workability Evaluation of SIGMOD
2011.
Philippe Bonnet, Stefan Manegold, Matias Bjorling, Wei Cao, Javier
Gonzales, Joel Granados, Nancy Hall, Stratos Idreos, Milena Ivanova,
Ryan Johnson, David Koop, Tim Kraska, Rene Mueller, Dan Olteanu,
Paolo Papotti, Christine Reilly,
Dimitris Tsirogiannis, Cong Yu, Juliana Freire and Dennis Shasha.
In SIGMOD Record, June 2011.
- SPROUT^2: A Squared Query Engine for Uncertain Web Data.
[pdf, poster,
web page]
Robert Fink, Andrew Hogue, Dan Olteanu, and Swaroop Rath.
System demonstration. In ACM Special Interest Group on Management of Data (SIGMOD), Athens, June 2011.
SPROUT² is a query answering system that allows users to ask
structured queries over tables embedded in Web pages,
over Google
Fusion tables, over uncertain data extracted
by NELL (Never-Ending
Language Learning), and over uncertain tables that can be
extracted from answers to Google Squared. At the core of this
service
lies SPROUT,
our query engine for probabilistic databases. A more complete
description of the demonstration is
available here.
- Providing Support for Full Relational Algebra in
Probabilistic Databases.
[pdf]
Robert Fink, Dan Olteanu, and Swaroop Rath.
In IEEE Int Conf on Data Engineering (ICDE),
Hannover, April 2011.
We show how arbitrary relational algebra queries can be
evaluated exactly or approximately in probabilistic databases
using (partial) compilation of query lineage. We also conduct
a tractability study for non-repeating queries with difference
and for quantified queries (such as division and set
inclusion/equivalence/incomparability).
- On the Optimal Approximation of Queries using
Tractable Propositional Languages.
[pdf]
Robert Fink and Dan Olteanu.
In Int. Conf. on Database Theory (ICDT),
Uppsala, March 2011.
We investigate approximations of non-repeating conjunctive
queries in probabilistic databases by lower and upper bound
queries that can be computed in polynomial time. The approach
is based on finding model-based greatest lower bounds and
least upper bounds for propositional formulas representing
query lineage, where the bounds are in tractable languages
such as (DNF) read-once formulas.
2010
- G-Store: A Storage Manager for Graph Data.
Robin Steinhaus, Dan Olteanu, and Tim Furche.
Technical report. Oct 2010.
-
Probabilistic XML via Markov Chains.
[pdf]
Michael Benedikt, Evgeny Kharlamov, Dan Olteanu, and Pierre Senellart.
In Very Large Data Bases (VLDB), Singapore, Sept 2010.
We define a space of probabilistic XML models based on
(restrictions of) Recursive Markov Chains and study their
expressiveness, succinctness, and tractability of (MSO and
XPath) query evaluation w/o unit-cost arithmetic. In contrast
to existing probabilistic XML models, these new models (i)
capture probabilistic versions of DTDs, (ii) can define
infinite probability distributions, and (iii) can be
exponentially more succinct, while still preserving MSO
tractability.
-
Approximate Confidence Computation in Probabilistic Databases.
[pdf,
talk]
Dan Olteanu, Jiewen Huang, and Christoph Koch.
In IEEE Int. Conf. on Data Engineering (ICDE), Long Beach, March 2010.
We enhance our query engine SPROUT
with a deterministic approximation algorithm with error
guarantees for confidence computation in (arbitrarily
correlated) probabilistic databases, which is based on
incremental compilation of query lineage into so-called
decomposition trees that support linear-time probability
computation. The compilation is incremental and we show
experimentally that it can achieve a given approximation within
a few steps. In case of tractable (hierarchial or inequality)
queries, it always finishes in polynomial time.
-
Bridging the Gap Between Intensional and Extensional
Query Evaluation in Probabilistic Databases.
[pdf]
Abhay Jha, Dan Olteanu, and Dan Suciu.
In Extending Data Base Technology (EDBT), Lausanne, March 2010.
We extend the notion of tractable queries on (arbitrarily
correlated) probabilistic databases to tractable data-query
instances, where general hard queries can become tractable on
restricted data instances. The query evaluation is separated
into two steps: (1) the (possibly large) tractable data-query
instance is evaluated first by efficient database techniques,
and then (2) the (usually small) intractable residue is
processed using inference techniques based on treewidth.
2009
-
On Tractability of Inequality Queries over
Probabilistic Databases and Counting Vertex Covers.
Dan Olteanu and Rasmus Wissmann.
Technical report, September 2009.
-
Secondary-Storage Confidence Computation for
Conjunctive Queries with Inequalities.
[pdf,
citations]
Dan Olteanu, Jiewen Huang.
In ACM Special Interest Group on Management of Data (SIGMOD), Providence, 2009.
This paper is the first to discuss tractability of
conjunctive queries with inequalities in probabilistic
databases. It presents a class of hard queries, and gives a
scalable algorithm for tractable inequality queries that is
based on ordered binary decision diagrams and is implemented in
our query engine SPROUT.
Our implementation passed the SIGMOD'09 repeatability and
workability evaluation (RWE). Out of 63 accepted papers, 19
participated in RWE, and 10 passed RWE.
Resources:
-
Scripts for running the experiments.
-
Code of the SPROUT query engine used in the experiments.
-
MayBMS: A Probabilistic Database Management
System (System demonstration). [pdf,
citations]
Jiewen Huang, Lyublena Antova, Christoph Koch, and Dan Olteanu.
In ACM Special Interest Group on Management of Data
(SIGMOD), Providence, June 2009.
Demo of the latest version of MayBMS with full support for
uncertainty-aware queries, succinct representation formalism for
probabilistic data, and the new query engine SPROUT.
Resources:
-
SPROUT: Lazy vs. Eager Query Plans for Tuple-Independent Probabilistic
Databases.
[pdf,
citations]
Dan Olteanu, Jiewen Huang, Christoph Koch.
In IEEE Int Conf on Data Engineering (ICDE),
Shanghai, April 2009.
We show that the tractable hierarchical queries on
probabilistic databases admit relational query plans with
unrestricted join ordering, thereby overcoming the limitations
of the safe plans of earlier work. In this framework, safe
plans can be seen as one specific type of eager plans (where
confidence computation is done as soon as possible). We extend
relational plans with a new aggregation operator that is
optimized using query signature and schema information (keys)
to minimize the number of scans it needs to compute tuple
confidences. This operator basically factorizes the query
lineage into read-once functions (called 1OF formulas in the
paper) in polynomial time. An extensive study on TPC-H queries
is available here.
Resources:
-
10^10^6 Worlds and Beyond:
Efficient Representation and Processing
of Incomplete Information.
[preliminary version,
citations]
Lyublena Antova, Christoph Koch, Dan Olteanu.
In the Special Issue of the VLDB Journal
on Probabilistic Databases, 2009.
Extended version of the ICDE 2007 paper.
We discuss world-set decompositions, a complete representation
formalism for uncertain and probabilistic data, and study query
evaluation on these representations. World-set decompositions are
based on factorizations to exploit independence and, at least in
their probabilistic form, can be thought of as shallow Bayesian
Networks.
2008
-
Using OBDDs for Efficient Query Evaluation on
Probabilistic Databases.
[pdf,
citations]
Dan Olteanu, Jiewen Huang.
In 2nd
Int. Conf. on Scalable Uncertainty Management (SUM), Napoli,
Oct. 2008.
This paper is the first to consider OBDDs for efficient
evaluation of queries in probabilistic databases. It shows
that for tractable hierarchical queries, the query lineage can
be efficiently brought into one-occurrence form (that is, into
an equivalent read-once function), where each variable occurs
exactly once. Such factorizable lineage admit OBDDs of size
linear in the number of their variables only. For hard
queries, existing results that bound the OBDD size in the
pathwidth of the lineage apply immediately.
-
Conditioning Probabilistic Databases.
[pdf,
citations]
Christoph Koch, Dan Olteanu.
In Very Large Data Bases (VLDB), volume 1, 2008.
Also ACM CORR Report cs.DB/0803.2212.
This paper is the first to consider the problem of
conditioning a probabilistic database outside the context of
graphical models. The core contribution is an exact confidence
computation algorithm and its extension to achieve
conditioning by database constraints.
Resources:
-
Fast and Simple Relational Processing of Uncertain Data.
[pdf,
citations
]
Lyublena Antova, Thomas Jansen, Christoph Koch, Dan Olteanu.
In IEEE Int. Conf. on Data
Eng. (ICDE), Cancun, April 2008 (ranked 2nd in the research track).
Also ACM CORR Report cs.DB/0707.1644.
This paper presents U-relational databases, the
representation system of MayBMS that is an extension of c-tables
with probabilities and vertical partitioning. We also discuss
the efficient, purely relational evaluation of relational
algebra in such databases.
Resources:
-
CODE: Extension of TPC-H population
generator for attribute-level U-relational databases.
-
CODE: Queries and
(U-relations to ULDBs, attribute-level to
tuple-level) translators used in the experiments.
-
TALK at ICDE'08.
-
World-Set Decompositions: Expressiveness and
Efficient Algorithms.
[pdf,
citations]
Dan Olteanu, Christoph Koch, Lyublena Antova.
In
Elsevier Journal
Theoretical Computer Science (TCS)
, volume 403, Issues 2-3, Pages 133-416, 2008.
Also ACM CORR Report cs.DB/0705.4442. Long version of the
ICDT'07 paper.
This paper studies the theory of world-set decompositions
(that is, succinctness, expressiveness, and tractability of
standard decision problems). Of particular interest is the
factorization algorithm, which does a form of minimization of
representations.
-
MayBMS: A System for Managing Large Uncertain and
Probabilistic Databases.
Lyublena Antova, Christoph Koch, Dan Olteanu.
Best Poster Award at Spring'08 North East DB/IR Day, Columbia
University, April 18, 2008.
-
A semi-empirical simulation of the extragalactic
radio continuum sky.
R.J. Wilman, L. Miller, M.J. Jarvis, T. Mauch,
F. Levrier, F.B. Abdalla, S. Rawlings, H.-R. Kloeckner,
D. Obreschkow, D. Olteanu, S. Young.
In Monthly Notices of the Royal Astronomical
Society Main Journal, 2008.
Also ACM CORR Report astro-ph 0805.3413.
2007
-
World-Set Decompositions: Expressiveness and
Efficient Algorithms.
[pdf,
citations]
Lyublena Antova, Christoph Koch, Dan Olteanu.
In Int. Conf. on Database Theory (ICDT), Barcelona,
Jan. 2007.
Long version appeared in TCS 2008 and is available as
ACM CORR Report
cs.DB/0705.4442.
This paper studies the theory of world-set decompositions
(that is, succinctness, expressiveness, and tractability of
standard decision problems). Of particular interest is the
factorization algorithm, which does a form of minimization of
representations.
-
10^10^6 Worlds and Beyond:
Efficient Representation and Processing
of Incomplete Information.
[pdf,
citations]
Lyublena Antova, Christoph Koch, Dan Olteanu.
In IEEE 23rd Int. Conf. on Data Eng. (ICDE), Istanbul, April 2007.
Extended version in ACM CORR
Report cd.DB/0606075.
We discuss world-set decompositions, a complete representation
formalism for uncertain data, and study query evaluation on these
representations. World-set decompositions are based on
factorizations to exploit independence.
-
From Complete to Incomplete Information and
Back.
[pdf,
citations]
Lyublena Antova, Christoph Koch, Dan Olteanu.
In ACM Special Interest Group on Management of Data (SIGMOD), Beijing, June 2007.
This paper presents the nonprobabilistic version of the
MayBMS query language and studies some of its properties:
genericity, expressiveness, conservativity over relational
algebra, efficiency of evaluation, and algebraic equivalences.
-
Query language support for incomplete information
in the MayBMS system (Demonstration).
[pdf,
citations]
Lyublena Antova, Christoph Koch, Dan Olteanu.
In Very Large Data Bases (VLDB), Vienna, Sept 2007.
Demonstration of the first version of the MayBMS query language. It
was much improved since then (check out the SIGMOD'09 demo).
-
MayBMS: Managing Incomplete Information with Probabilistic
World-Set Decompositions (Demonstration).
[pdf,
citations]
Lyublena Antova, Christoph Koch, Dan Olteanu.
In IEEE 23rd Int. Conf. on Data
Eng. (ICDE), Istanbul, April 2007.
Demo of the first version MayBMS which used world-set
decompositions as the representation formalism for uncertain
data. In the mean time, MayBMS migrated to U-relational databases,
a more succinct representation formalism (check out the SIGMOD'09
demo).
-
Forward Node-Selecting Queries Over Trees.
[preliminary
version,
citations]
Dan Olteanu.
In ACM Trans. on Database
Systems (TODS), vol. 32 no. 1, March 2007.
This paper gives an in-depth theoretical study of the
problem of rewriting XPath-like queries into equivalent queries
without backwards steps. It introduces and studies two
rewriting systems (confluence, termination, soundness and
completeness, complexity, succinctness of rewritten over input
queries) that can rewrite any "absolute" XPath-like
navigational query into a backward-free equivalent.
-
SPEX: Streamed and Progressive Evaluation of
XPath. [preliminary version,
citations]
Dan Olteanu.
In IEEE
Trans. Know. and Data Eng. (TKDE), vol. 19 no. 7, July 2007.
This is the latest and (main) paper on the SPEX query
engine, one of the first streaming XPath engines with polynomial
combined complexity. SPEX compiles forward XPath queries (with
support for all forward axes) into networks of transducers that
process the XML stream tag after tag. For XPath with backwards
axes, it uses a rewriting technique defined in earlier work of
the author. SPEX features a distributed stack-like data
structure that is shared by the incremental computations of the
(possibly potential) answers, and a mechanism for reducing the
stream traffic across transducers. It is publicly available at
spex.sourceforge.net.
2006
-
Accelerating XPath Evaluation against XML Streams (Demonstration).
Dan Olteanu.
In Programming Language
Technologies for XML (PLAN-X), Charleston, South Carolina, January
2006.
-
Building a Native XML-DBMS as a Term Project in a Database Systems
Course. [pdf]
Christoph Koch, Dan Olteanu, Stefanie Scherzinger.
In XQuery Implementations, Experience, and
Perspectives (XIME-P), Chicago, June 2006.
2005
-
Evaluation of XPath Queries against XML streams.
[
abstract,
pdf (1514 KB),
citations
]
Dan Olteanu.
PhD Thesis, Institute for Informatics, Ludwig Maximilian University of Munich.
-
The XML Stream Query Processor SPEX (Demonstration).
[
pdf,
citations
]
Francois Bry, Fatih Coskun, Serap Durmaz, Tim Furche, Dan Olteanu,
Markus Spannagel.
In IEEE 21st
Int. Conf. on Data Eng. (ICDE), Tokyo,
April 2005.
2004
-
An Efficient Single Pass Query Evaluator for XML
Data Streams.
[
citations
]
Dan Olteanu, Tim Furche,
Francois Bry.
In ACM Symposium
on Applied Computing (SAC) (Data Streams Track), Nicosia, March 2004.
-
Evaluating Complex Queries against XML Streams with Polynomial Combined
Complexity.
[
citations
]
Dan Olteanu, Tim Furche, Francois Bry.
In British National Conf. on Databases (BNCOD21), Edinburgh, July 2004.
Full version in technical report PMS-FB-2003-15, University
of Munich, February 2003.
-
Aktuelles Schlagwort: Datenströme.
Francois Bry, Tim Furche, Dan Olteanu.
In Informatik Spektrum, 2/27, April 2004. Also in GI Informatiklexikon.
2003
-
An Evaluation of Regular Path Expressions with
Qualifiers against XML Streams.
[
citations
]
Dan Olteanu, Tobias Kiesling, Francois Bry.
In IEEE 19th Int. Conf. on
Data Eng. (ICDE), Bangalore,
March 2003.
Full version in
Technical Report PMS-FB-2002-12.
2002
-
XPath: Looking Forward.
[
pdf,
citations
]
Dan Olteanu, Holger Meuss, Tim Furche, Francois Bry.
In Workshop on XML-Based Data Management
(XMLDM) at EDBT 2002,
Prague, March 2002.
A previous version of this article is entitled Symmetry in
XPath [citations].
Full version in
Technical Report PMS-FB-2001-17.
This paper shows that "absolute" XPath navigational queries can be
rewritten into equivalent queries without backwards steps.
2001
-
Towards Grouping Constructs for Semistructured
Data.
[
pdf
]
Francois Bry, Dan Olteanu, Sebastian Schaffert.
In Workshop on
Electronic Bussiness Hubs (WEBH) at DEXA 2001, Munich 2001.
Full version in
Technical Report PMS-FB-2001-7
.
-
Aktuelles Schlagwort: Semistrukturierte Daten.
[
citations
]
Francois Bry, Michael Kraus, Dan Olteanu, Sebastian
Schaffert.
In Informatik Spektrum, 4/24, August 2001. Also in GI Informatiklexikon.
2000
-
Agora - Living with XML and Relational
(Demonstration).
[
citations
]
Ioana
Manolescu, Daniela Florescu, Donald Kossmann, Dan Olteanu, Florian
Xhumari.
In Very Large Data Bases (VLDB), Cairo, 2000.
-
Answering Queries using Views in Agora.
Dan Olteanu.
Diploma Thesis. Work completed at Polytechnic University of Bucharest and INRIA Rocquencourt. 2000.