# Algebraic Data Language Theory

Clemens Ley ( OUCL )

- 11:30 16th February 2011 ( week 5, Hilary Term 2011 )Room 147, Oxford University Computing Laboratory

In algebraic language theory, certain algebraic objects, primarily monoids, are used to analyze the structure of string and tree languages. A fundamental result states that a language is regular iff it is accepted by a finite monoid. Subclasses of regular languages, such as star-free languages correspond to natural classes of finite monoids. Most prominently, Schützenberger, McNaughton, and Papert showed that a regular language is first order definable iff it is star-free iff it is accepted by an aperiodic monoid.

The study of data languages -- languages over an infinite alphabet — is motivated by applications in verification and XML processing. In this talk I will report about a recent extension of algebraic techniques to the study of data languages. A class of infinite monoids, finite-orbit data monoids, has been proposed by Bojanczyk. We present a logic, called rigidly-guarded monadic second order logic, that defines exactly the languages accepted by finite-orbit data monoids. A theorem akin to Schützenberger's Theorem also carries over to data languages: We show that a language is definable in rigidly-guarded first order logic iff it is accepted by an aperiodic data monoid.