Talking with data: Natural Language Querying of RDF Knowledge Graphs
Supervisor
Suitable for
Abstract
Natural language interfaces to databases have become increasingly important because they allow users to query complex data without deep technical knowledge of query languages. RDF Knowledge Graphs (KGs), which store structured semantic data, are typically queried using SPARQL, a powerful but syntactically demanding language. This creates a barrier for non-expert users who wish to retrieve information by asking simple questions in natural language. Bridging this gap requires robust techniques for translating human language into precise SPARQL queries while preserving semantic intent and handling ambiguity via semantic search.
The goal of this project is to develop a system that converts natural language questions into SPARQL queries for RDF Knowledge Graphs, enabling intuitive access to semantic data. The work will explore state-of-the-art approaches in natural language processing and semantic parsing, leveraging LLMs and ontology-aware generation to ensure accurate query generation. To meet performance and scalability requirements, the implementation will target the RDF engine RDFox.
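As a rough illustration of what ontology-aware generation could look like, the Python sketch below builds an LLM prompt that lists the permitted vocabulary and then checks a candidate query for terms the ontology does not declare. Everything in it is an assumption made for illustration: the `ex:` vocabulary, the function names `build_prompt` and `unknown_terms`, and the hand-written candidate query. No LLM or RDFox call is made.

```python
import re

# Hypothetical toy vocabulary; a real system would read this from the graph's ontology.
ONTOLOGY = {
    "ex:Person": "class: a person",
    "ex:City": "class: a city",
    "ex:livesIn": "property: links a Person to the City they live in",
    "ex:name": "property: the name of a resource",
}
PREFIXES = "PREFIX ex: <http://example.org/>"


def build_prompt(question, ontology=ONTOLOGY):
    """Assemble an LLM prompt that lists the allowed vocabulary before the question."""
    vocabulary = "\n".join(f"- {term} ({desc})" for term, desc in sorted(ontology.items()))
    return (
        "Translate the question into a SPARQL query.\n"
        f"Use only these terms:\n{vocabulary}\n"
        f"Start the query with: {PREFIXES}\n"
        f"Question: {question}\n"
    )


def unknown_terms(sparql, ontology=ONTOLOGY):
    """Return ex: terms used in the query that the ontology does not declare."""
    used = set(re.findall(r"\bex:[A-Za-z_][A-Za-z0-9_]*", sparql))
    return used - set(ontology)


# A candidate query as an LLM might return it (written by hand here).
candidate = PREFIXES + " SELECT ?n WHERE { ?p a ex:Person ; ex:worksAt ?c ; ex:name ?n }"
print(unknown_terms(candidate))  # {'ex:worksAt'}
```

A check of this kind is one possible guard against hallucinated vocabulary: a query that uses undeclared terms can be rejected or sent back to the model with the offending terms named, before it ever reaches the RDF engine.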
Enhanced topics:
- Graph-based memory – storing conversation metadata in a conversation graph to provide additional memory-based context.
- Reasoning-enhanced querying – exploring how reasoning can be used to simplify the most frequently asked questions and increase the accuracy of the answers.
- Working with small language models that target smaller devices.
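The graph-based memory topic could be prototyped along the following lines. This is a minimal in-memory sketch, not a proposed design: the `ConversationGraph` class, the `conv:` predicates, and the `ex:` entities are all hypothetical, and in the project itself the triples would presumably live in the RDF store rather than in a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationGraph:
    """Stores conversation metadata as (subject, predicate, object) triples."""
    triples: list = field(default_factory=list)
    turn_count: int = 0

    def add_turn(self, question, sparql, entities):
        """Record one question/query pair and the entities it mentions."""
        self.turn_count += 1
        turn = f"conv:turn{self.turn_count}"
        self.triples.append((turn, "conv:question", question))
        self.triples.append((turn, "conv:query", sparql))
        for entity in entities:
            self.triples.append((turn, "conv:mentions", entity))
        if self.turn_count > 1:
            # Link each turn to its predecessor so the dialogue order is queryable.
            self.triples.append((turn, "conv:follows", f"conv:turn{self.turn_count - 1}"))
        return turn

    def recent_entities(self, last_n=2):
        """Return entities mentioned in the last `last_n` turns, oldest first."""
        first = max(1, self.turn_count - last_n + 1)
        recent = {f"conv:turn{i}" for i in range(first, self.turn_count + 1)}
        return [o for s, p, o in self.triples if s in recent and p == "conv:mentions"]


memory = ConversationGraph()
memory.add_turn("Who is Alice?", "SELECT ...", ["ex:Alice"])
memory.add_turn("Where does she live?", "SELECT ...", ["ex:Oxford"])
memory.add_turn("Who else lives there?", "SELECT ...", ["ex:Bob"])
print(memory.recent_entities(last_n=2))  # ['ex:Oxford', 'ex:Bob']
```

The recently mentioned entities could then be added to the prompt for the next question, which is one way to resolve references such as "she" or "there" in follow-up questions.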
This project presents the opportunity to work with Oxford Semantic Technologies, one of the Computer Science department’s spinout companies and success stories. As well as helping candidates build their CVs, the project offers strong candidates the opportunity of summer internships with Oxford Semantic Technologies.
Background reading
Aidan Hogan et al. Knowledge Graphs. Synthesis Lectures on Data, Semantics, and Knowledge, Morgan & Claypool Publishers 2021, ISBN 978-3-031-00790-3, pp. 1-257
Heiko Paulheim. Knowledge graph refinement: A survey of approaches and evaluation methods. Semantic Web 8(3): 489-508 (2017)
Katrin Affolter, Kurt Stockinger, Abraham Bernstein. A comparative survey of recent natural language interfaces for databases. VLDB J. 28(5): 793-819 (2019)
Vincent Emonet, Jerven T. Bolleman, Severine Duvaud, Tarcisio Mendes de Farias, Ana Claudia Sima: LLM-based SPARQL Query Generation from Natural Language over Federated Knowledge Graphs. HGAIS@ISWC 2024
Jacopo D’Abramo, Andrea Zugarini, Paolo Torroni. Investigating Large Language Models for Text-to-SPARQL Generation. In Proceedings of the 4th International Workshop on Knowledge-Augmented Methods for Natural Language Processing, pages 66–80. Association for Computational Linguistics (2025)