Practical challenges to building large applications written in a datalog-based language
Since 2009, we have been developing and operating cloud-deployed, hybrid transactional/analytic processing (HTAP) systems for large retail customers. We implement these applications primarily in a datalog-based language called LogiQL, which permits the concise expression of rich models with powerful constraints and business rules. The ability to build such systems at scale in LogiQL is due to a number of innovations in the LogicBlox platform, including advanced multi-way join algorithms, powerful query optimizations, write-optimized data structures, and the ability to harness the parallelism available in modern servers. Unfortunately, not every LogiQL program can (yet) take full advantage of these innovations. When tuning such a program, a developer will often rewrite parts of it, introducing new predicates and rules that materialize intermediate results. These rewrites inevitably make the program less concise, less readable, and harder to maintain. In practice, we mitigate these costs with a set of optimization heuristics, and we generate a significant part of the LogiQL program from a higher-level form (called CubiQL) so that the heuristics can be applied automatically. This talk surveys the challenges that arise when using LogiQL to build HTAP applications in this domain and the heuristics we developed to address them. We conclude with open problems that remain difficult to automate.
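
As a rough illustration of the kind of rewrite described above, the following sketch contrasts a single rule over a three-way join with an equivalent pair of rules that goes through a materialized intermediate predicate. It is written in plain Python rather than LogiQL, and every predicate name and data value (sales, sku_class, store_region, class_sales, revenue) is hypothetical and chosen only for the example. It makes no claim about how the LogicBlox engine evaluates rules; it only shows why the rewritten program is longer while computing the same result.

```python
from collections import defaultdict

# Hypothetical base predicates, stored as sets of tuples.
sales = {("sku1", "storeA", 10), ("sku2", "storeA", 5), ("sku1", "storeB", 7)}
sku_class = {("sku1", "toys"), ("sku2", "food")}
store_region = {("storeA", "east"), ("storeB", "west")}


def revenue_direct():
    """One rule: revenue[cls, region] = sum of amt over a three-way join."""
    out = defaultdict(int)
    for sku, store, amt in sales:
        for sku2, cls in sku_class:
            if sku2 != sku:
                continue
            for store2, region in store_region:
                if store2 == store:
                    out[(cls, region)] += amt
    return dict(out)


def class_sales():
    """Extra intermediate predicate: class_sales[cls, store] = sum of amt."""
    out = defaultdict(int)
    for sku, store, amt in sales:
        for sku2, cls in sku_class:
            if sku2 == sku:
                out[(cls, store)] += amt
    return dict(out)


def revenue_rewritten(materialized):
    """Second rule: revenue is now derived from the materialized predicate."""
    out = defaultdict(int)
    for (cls, store), amt in materialized.items():
        for store2, region in store_region:
            if store2 == store:
                out[(cls, region)] += amt
    return dict(out)


# Both formulations derive the same revenue predicate.
assert revenue_direct() == revenue_rewritten(class_sales())
```

The rewritten form adds a predicate and a rule that exist only for performance, which is the loss of concision the abstract refers to; an intermediate result of this kind can also be reused by other rules once it is materialized.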