How to Architect a Query Compiler
In this talk, we study architecting query compilers. The state of the art in query compiler construction is lagging behind that in the compilers field. We attempt to remedy this by exploring the key causes of technical challenges in need of well-founded solutions, and by gathering the most relevant ideas and approaches from the PL and compilers communities for easy digestion by database researchers. All query compilers known to us are more or less monolithic template expanders that do the bulk of the compilation task in one large leap. Such systems are hard to build and maintain. We propose to use a stack of multiple DSLs on different levels of abstraction with lowering in multiple steps to make query compilers easier to build and extend, ultimately allowing us to create more convincing and sustainable compiler-based data management systems. We attempt to derive our advice for creating such DSL stacks from widely accepted principles. We have also re-created a well-known query compiler following these ideas and report on this effort.
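As a rough illustration of lowering in multiple steps, here is a minimal sketch and not the system described in the talk. It assumes a hypothetical three-level stack: a relational DSL (Scan, Select, Project), an imperative loop DSL (ForEach, If, Emit), and generated source text. All class and function names are invented for this example, and the final target is Python source rather than C so that the result can be run directly.

```python
from dataclasses import dataclass

# Level 1: relational operators (hypothetical mini-DSL, not from the talk).
@dataclass
class Scan:
    table: str

@dataclass
class Select:
    child: object
    column: str
    op: str        # comparison operator such as ">"
    value: object

@dataclass
class Project:
    child: object
    columns: list

# Level 2: imperative loop DSL (also hypothetical).
@dataclass
class ForEach:
    table: str
    body: object

@dataclass
class If:
    column: str
    op: str
    value: object
    body: object

@dataclass
class Emit:
    columns: object  # list of column names, or None to emit the whole row

ALLOWED_OPS = {"<", "<=", ">", ">=", "==", "!="}

def lower_to_loops(plan, body=None):
    """Lowering step 1: relational operators -> nested loop DSL."""
    if body is None:
        body = Emit(None)
    if isinstance(plan, Scan):
        return ForEach(plan.table, body)
    if isinstance(plan, Select):
        inner = If(plan.column, plan.op, plan.value, body)
        return lower_to_loops(plan.child, inner)
    if isinstance(plan, Project):
        # Simplifying assumption: Project may only appear at the plan root.
        if not (isinstance(body, Emit) and body.columns is None):
            raise NotImplementedError("Project is only supported at the root")
        return lower_to_loops(plan.child, Emit(plan.columns))
    raise TypeError(f"unknown relational operator: {plan!r}")

def emit_source(node, indent=1):
    """Lowering step 2: loop DSL -> source text, one construct at a time."""
    pad = "    " * indent
    if isinstance(node, ForEach):
        head = f"{pad}for row in tables[{node.table!r}]:\n"
        return head + emit_source(node.body, indent + 1)
    if isinstance(node, If):
        if node.op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {node.op}")
        head = f"{pad}if row[{node.column!r}] {node.op} {node.value!r}:\n"
        return head + emit_source(node.body, indent + 1)
    if isinstance(node, Emit):
        if node.columns is None:
            return f"{pad}out.append(dict(row))\n"
        return f"{pad}out.append({{c: row[c] for c in {node.columns!r}}})\n"
    raise TypeError(f"unknown loop construct: {node!r}")

def compile_query(plan):
    """Run both lowering steps and load the generated function."""
    loops = lower_to_loops(plan)
    src = "def query(tables):\n    out = []\n" + emit_source(loops) + "    return out\n"
    namespace = {}
    exec(src, namespace)  # acceptable here only because src is generated by us
    return namespace["query"], src

plan = Project(Select(Scan("emp"), "age", ">", 30), ["name"])
query, src = compile_query(plan)
tables = {"emp": [{"name": "Ann", "age": 41}, {"name": "Bob", "age": 25}]}
result = query(tables)
```

Each step in the sketch only has to understand the level directly above it. Under these assumptions, a new relational operator needs a case in lower_to_loops alone, and a new backend needs a replacement for emit_source alone, which is the kind of separation a monolithic template expander does not offer.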
Amir Shaikhha is a fifth-year Ph.D. student at EPFL. His research aims to build efficient data analytics systems using high-level languages. More specifically, he is interested in using compilation techniques to generate efficient low-level code (e.g. C code) from high-level specifications (e.g. Scala code) of performance-critical systems (e.g. database systems). He received his M.Sc. from EPFL and his B.S. from the Sharif University of Technology.