Query rewriting for caching and security

Supervisors

Suitable for

MSc in Advanced Computer Science

Abstract

Prerequisites: Foundational AI/ML background

Natural language interfaces for data-driven systems face a fundamental conflict between personalization and efficiency. User queries are frequently "data-dependent," meaning the user's private data and the logical structure of their request are intertwined within the query string itself. This fusion of logic and private data renders traditional caching ineffective: identical user intentions produce textually unique queries, so the system repeats the same computation for each one. Passing the raw query through the pipeline also exposes sensitive user data to the core query planning and optimization layers, creating privacy vulnerabilities and data leakage risks throughout the system.

This research proposes a new AI model that addresses this challenge through query rewriting. The model acts as an abstraction layer: it intercepts a data-dependent natural language query and transforms it into two distinct components, a canonical, **data-independent template** that represents the abstract operational intent and a separate, structured **parameter object** that isolates all user-specific data. The central hypothesis is that this decoupling can systematically separate the *what* (the logic) from the *who* (the data).
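To make the intended output concrete, the following is a minimal sketch of the template/parameter split. It is not the proposed AI model: a hand-written regular expression stands in for the learned rewriter, and the names `RewrittenQuery`, `rewrite`, and the placeholder scheme (`{email_0}`, `{date_0}`, ...) are assumptions made purely for illustration.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RewrittenQuery:
    """Hypothetical container for the two components described above."""
    template: str  # data-independent template
    params: dict   # parameter object holding the user-specific values


# Toy stand-in for the learned rewriter: one regex with a named group per
# kind of user-specific value. Alternatives are tried left to right.
_TOKEN = re.compile(
    r"(?P<email>[\w.+-]+@[\w-]+(?:\.[\w-]+)+)"
    r"|(?P<date>\d{4}-\d{2}-\d{2})"
    r"|(?P<number>\d+(?:\.\d+)?)"
)


def rewrite(query: str) -> RewrittenQuery:
    """Replace each detected value with a named placeholder, in one pass."""
    params = {}
    seen = Counter()

    def substitute(match):
        kind = match.lastgroup           # name of the alternative that matched
        name = f"{kind}_{seen[kind]}"    # e.g. "email_0", "email_1", ...
        seen[kind] += 1
        params[name] = match.group(0)    # the private value leaves the template
        return "{" + name + "}"

    template = _TOKEN.sub(substitute, query)
    return RewrittenQuery(template, params)


alice = rewrite("Show orders for alice@example.com placed after 2024-01-15 over 100 dollars")
bob = rewrite("Show orders for bob@example.org placed after 2023-06-30 over 250 dollars")

print(alice.template)
print(alice.params)
print(alice.template == bob.template)
```

In this toy setting both queries collapse to the same template while their parameter objects differ. A real rewriter would also have to handle paraphrases of the same intent and values that no fixed pattern can detect, which is where a learned model is expected to be needed.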

The benefits of this separation are twofold. First, the data-independent templates become highly cacheable: the system can reuse a computationally expensive execution plan for every user expressing the same intent, which should improve performance and scalability. Second, it enables a more secure processing model in which the generalized template is handled by a public-facing planner, while the isolated, sensitive data is managed by a secure, trusted module. Private information is thereby shielded from the main planning environment and introduced only at the final point of execution, reducing the risk of data leakage.
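Both benefits can be sketched in a few lines, under the assumption that a plan is something the planner builds from the template alone and that parameters are bound only when the plan runs. `PlanCache`, `toy_planner`, and `execute` are hypothetical names, and the "plan" here is a trivial closure rather than a real execution plan.

```python
class PlanCache:
    """Caches one plan per template; never sees parameter values."""

    def __init__(self, planner):
        self._planner = planner
        self._plans = {}
        self.hits = 0
        self.misses = 0

    def get_plan(self, template):
        if template in self._plans:
            self.hits += 1
        else:
            self.misses += 1
            # The planner receives only the data-independent template.
            self._plans[template] = self._planner(template)
        return self._plans[template]


def toy_planner(template):
    """Stand-in for expensive planning; returns a callable 'plan'."""
    def plan(params):
        # Parameters are bound here, at the final point of execution.
        return template.format(**params)
    return plan


def execute(cache, template, params):
    """Assumed to run inside the trusted module that holds the parameters."""
    plan = cache.get_plan(template)
    return plan(params)


cache = PlanCache(toy_planner)
template = "orders for {email_0} after {date_0}"

first = execute(cache, template, {"email_0": "alice@example.com", "date_0": "2024-01-15"})
second = execute(cache, template, {"email_0": "bob@example.org", "date_0": "2023-06-30"})

print(first)
print(second)
print(cache.misses, cache.hits)
```

The second request reuses the plan built for the first, so planning runs once for the shared intent. The cache key and the planner input contain no user data, which mirrors the proposed split between a public-facing planner and a trusted execution module.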