Open-World Probabilistic Databases: An Abridged Report
İsmail İlkan Ceylan, Adnan Darwiche and Guy Van Den Broeck
Abstract
Large-scale probabilistic knowledge bases are becoming increasingly important in academia and industry alike. They are constantly extended with new data, powered by modern information extraction tools that associate probabilities with database tuples. In this paper, we revisit the semantics underlying such systems. In particular, the closed-world assumption of probabilistic databases, that facts not in the database have probability zero, clearly conflicts with their everyday use. To address this discrepancy, we propose an open-world probabilistic database semantics, which relaxes the probabilities of open facts to default intervals. For this open-world setting, we lift the existing data complexity dichotomy of probabilistic databases, and propose an efficient evaluation algorithm for unions of conjunctive queries. We also show that query evaluation can become harder for non-monotone queries.
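To make the contrast between the two semantics concrete, the following is a minimal sketch and not the evaluation algorithm of the paper. It assumes a tuple-independent relation R, a default interval of the form [0, lam] for every open fact, and the single monotone query "there exists x such that R(x)". The function names, the domain, and all probabilities are hypothetical and chosen only for illustration. Because raising a tuple probability can never lower the probability of a monotone query, the lower bound is the closed-world answer and the upper bound is reached when every open fact is set to lam.

```python
from math import prod


def prob_exists(tuple_probs):
    """P(exists x. R(x)) over independent tuples: 1 - prod(1 - p)."""
    return 1.0 - prod(1.0 - p for p in tuple_probs)


def open_world_interval(known, domain, lam):
    """Bounds on P(exists x. R(x)) under the assumed open-world reading.

    known  -- dict mapping constants to the probability of R(constant)
    domain -- all constants; any constant missing from `known` is an open fact
    lam    -- assumed upper end of the default interval [0, lam]
    """
    # Closed-world answer: open facts have probability 0 and drop out.
    lower = prob_exists(known.values())
    # Monotone query: the maximum is reached with every open fact at lam.
    completed = [known.get(c, lam) for c in domain]
    upper = prob_exists(completed)
    return lower, upper


# Hypothetical data: R(a) and R(b) are stored, R(c) is an open fact.
known = {"a": 0.5, "b": 0.4}
lower, upper = open_world_interval(known, ["a", "b", "c"], lam=0.2)
print(f"[{lower:.2f}, {upper:.2f}]")  # prints [0.70, 0.76]
```

Under the closed-world assumption the query would receive the single value 0.70; in this sketch the open fact R(c) widens the answer to the interval [0.70, 0.76]. The shortcut of evaluating only the two extreme completions relies on monotonicity, so it does not carry over to non-monotone queries.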