The SQL++ Query Language
SQL-on-Hadoop, NewSQL and NoSQL databases provide semi-structured data models (typically JSON-based) and are now moving towards declarative, SQL-like query languages. However, their idiomatic non-SQL language constructs, their many variations, and their lack of formal syntax and semantics pose problems. Notably, database vendors end up with unclear semantics and complicated implementations as they add one feature at a time.
The presented SQL++ semi-structured data model bridges semi-structured data and the SQL data model. The SQL++ query language aims for backward compatibility with SQL. We show that a relatively small set of SQL restriction removals and feature additions is enough to provide a SQL-compatible extension to semi-structured data. SQL++ is currently being adopted by the industry.
The extension to Configurable SQL++ adds configuration options that describe alternative language semantics and formally capture the variations among existing database languages. Configurable SQL++ is unifying: by appropriate choices of configuration options, its semantics can morph into the semantics of any of the eleven popular semi-structured databases that we surveyed, as the experimental validation shows. In this way, Configurable SQL++ allows a formal characterization of the capabilities of the emerging query languages.
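To make the idea of a configuration option concrete, the following is a minimal Python sketch, not the formal SQL++ semantics. It assumes a single hypothetical option, `absent_attribute`, that decides what navigating to an attribute missing from a record yields: an error, a null, or a distinct "missing" value. The names `Config`, `navigate`, `select_attr` and `MISSING` are illustrative only and are not taken from the SQL++ specification.

```python
from dataclasses import dataclass

# Hypothetical sentinel for "attribute is absent", kept distinct from None (null).
MISSING = object()


@dataclass(frozen=True)
class Config:
    # Hypothetical option: what navigation to an absent attribute yields.
    # Expected values: "error", "null", or "missing".
    absent_attribute: str = "missing"


def navigate(record: dict, attr: str, config: Config):
    """Return record[attr], or handle its absence as the config dictates."""
    if attr in record:
        return record[attr]
    if config.absent_attribute == "error":
        raise KeyError(attr)
    if config.absent_attribute == "null":
        return None
    return MISSING


def select_attr(records, attr, config):
    """Toy evaluation of a query shaped like `SELECT r.attr FROM records AS r`."""
    out = []
    for r in records:
        value = navigate(r, attr, config)
        # A missing value produces no attribute in the output record.
        out.append({} if value is MISSING else {attr: value})
    return out
```

Running the same toy query over `[{"name": "a", "age": 3}, {"name": "b"}]` with different `Config` values gives different results for the second record, which is the kind of variation between databases that a configuration option is meant to capture.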