Graph Machine Learning with Neo4j
Supervisor
Suitable for
Abstract
Co-supervisor Neo4j (industrial partner)
We have discussed a set of projects concerning graph ML in industry with Brian Shi (an Oxford alumnus) of the Neo4j graph data science team, including (but not limited to):
- Bridging graph neural networks and graph databases. The basic GNN message-passing equation can be implemented in a graph query language, but which GNN architectures can be (partly) expressed as queries, and which queries can be implemented and learnt by GNNs? How can a graph database be leveraged for efficient GNN training and inference?
- GraphRAG: A popular approach to answering graph questions nowadays is to leverage LLMs: given a large knowledge graph and a question, retrieve a high-recall subgraph as context and prompt the LLM with it. This raises several questions: how important the provable expressiveness of GNNs is in practice, how well transformers and transformer-based LLMs can approximately achieve it, how well they can reason on graphs, and how to perform efficient and reliable subgraph retrieval for reasoning.
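To make the two themes above concrete, here is a minimal Python sketch under stated assumptions. It is not Neo4j's API or any method from the project. The function names `message_passing_step` and `k_hop_subgraph`, the scalar node features, and the property name `h` in the comment are all hypothetical choices for illustration. The first function performs one round of sum-aggregation message passing with no learned weights or nonlinearity, which is the kind of computation an aggregation query could express. The second retrieves a k-hop neighbourhood around seed nodes, one simple (and not necessarily high-recall) way to pick a subgraph context.

```python
from collections import defaultdict, deque


def message_passing_step(features, edges):
    """One round of sum-aggregation message passing on scalar features.

    features: dict mapping node -> float
    edges: list of directed (source, target) pairs
    Returns h'[v] = h[v] + sum of h[u] over all edges (u, v).

    Assumption: this is roughly what an aggregation query of the shape
    `MATCH (u)-->(v) RETURN v, sum(u.h)` would compute, where `h` is a
    hypothetical node property. No weights are learned here.
    """
    aggregated = defaultdict(float)
    for source, target in edges:
        aggregated[target] += features[source]
    return {node: value + aggregated[node] for node, value in features.items()}


def k_hop_subgraph(edges, seeds, k):
    """Return the nodes within k hops of the seeds and the induced edges.

    Edges are treated as undirected for reachability (an assumption).
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Breadth-first search that stops expanding at depth k.
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue
        for nxt in neighbours[node]:
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    kept = set(distance)
    induced = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, induced


if __name__ == "__main__":
    edges = [("a", "b"), ("b", "c"), ("c", "d")]
    features = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
    print(message_passing_step(features, edges))
    print(k_hop_subgraph(edges, ["a"], 2))
```

In a GraphRAG-style pipeline the retrieved nodes and edges would then be serialised into the LLM prompt; how to do that retrieval reliably is one of the open questions listed above.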
A student would be working with Michael Benedikt and the Neo4j team. The project will be research-focused, and any software produced would be in the public domain. Additional hardware and resources will be provided if needed. The balance of experiment and theory can be tuned to the student's interests. The main prerequisite is a very good knowledge of graph ML, at the level of Oxford's GRL course.