Informal minutes of IFIP Working Group 2.8
24th meeting, Nesjavellir, Iceland
July 16-20, 2007

List of attendees

Members: Observers:

Technical presentations

Ralf Hinze, Programming Challenge. (Mon 2007-07-16, 09:00)

A robot is placed in a 2-dimensional grid extending infinitely in all four directions. The robot receives as input a direction (N, E, S, W) and returns as output an indication whether he has visited the place before (Y, N). Your task is to program the robot so that for each input the output is produced in constant time (that is, we are looking for an online algorithm). As an example, the input NEESWNNWSS yields the output NNNNNYNNYY. [The solution depends very much on the underlying computational model. If your model is, say, a 2-dimensional turing machine, then the solution is entirely trivial.]
Here is a solution.

Dick Kieburtz, The Monad of Strict Computations. (Mon 2007-07-16, 09:05)

See slides.

Colin Runciman, Reach. (Mon 2007-07-16, 10:00)

This talk explains a tool for Haskell programmers, called Reach. Given a program with some top-level functions identified as sources and some expressions marked as targets, Reach attempts to construct the simplest applications of the sources that entail evaluation of the targets. Applications include property refutation, assertion breaking, program crashing, customised data generators for test coverage, and program understanding. Reach works by the application of evaluation rules similar to lazy narrowing in functional-logic programming.
See slides.

Discussion (Benjamin Pierce), Proof Assistants in CS (especially PL) Education. (Mon 2007-07-16, 11:00)


Erik Meijer, What I have done in the last ten years (1997-2007). (Mon 2007-07-16, 11:30)

See slides.

Norman Ramsey, "Blame it on Bob and Ralf" or Generic Typed Intermediate Languages for the Masses. (Mon 2007-07-16, 14:00)

In a compiler that uses typed intermediate code, it can be helpful - especially for students -to define multiple intermediate languages, each of which embodies some invariant that holds during a particular stage of compilation. For example, my pedagogical compiler from 'mini-ML' to 2D uses different intermediate languages to represent the results of type inference, defunctionalization, A-normalization, and conversion to circuits. Having different representations makes it easier for students to understand and implement translation algorithms, but it can require more grunt work. In particular, each intermediate language rqeuires implementations of capture-avoiding substitution and similar functions.

I have eliminated this grunt work by adapting Hinze's (2004) ``Generics for the Masses'' to encode languages with free names and binding constructs. In this framework I have implemented free-variable test and capture-avoiding substitution once, generically, and reused the implementation for each intermediate language in the compiler. I have also implemented a generic function for converting terms to linear form.

My hope is that similar techniques may make it possible to write a single, generic type checker for these languages and perhaps others as well.

See slides.

Satnam Singh, Multi-way Rendezvous in Haskell+STM. (Mon 2007-07-16, 15:00)

This talk presents work in progress to implement multi-way rendezvous in Haskell using STM and MVars. The technique we adopt is based on the C-style SHIM language developed at Columiba University. The key motivation for finding a way to effectively implement multi-way rendezvous in Haskell is to make it easier to produce deterministic parallel Haskell programs.
See slides.

David MacQueen, Announcement SML/NJ 4.0. (Mon 2007-07-16, 15:30)

See slides.

Martin Odersky, The Scala Experience. (Mon 2007-07-16, 16:00)

Scala is a general purpose programming language which is fully inter-operable with Java. It smoothly integrates features of object-oriented and functional languages. Scala allows a terseness of style comparable to scripting languages but has at the same time a strong and static type system. In this way, it can express common programming patterns in a concise, elegant, and type-safe way.

This talk will give an introduction to the Scala programming language, highlighting its main innovative features: closures, pattern matching, type abstraction, and mixins. I will show how these features together enable the construction of truly high-level libraries that provide in effect embedded domain-specific languages. I will also give an outline how Scala's higher-level constructs are mapped to Java.

See slides.

Simon Peyton Jones, Indexed type families or DEATH to functional dependencies. (Mon 2007-07-16, 17:00)

See slides.

Steve Zdancewic, Combining Access Control and Information Flow in DCC (work in progress). (Tue 2007-07-17, 09:00)

The Dependency Core Calculus (DCC) was introduced in 1999 by Abadi, Banerjee, Heintze, and Riecke to provide a framework for studying a variety of dependency analyses such as information flow, slicing, and binding time. The key construct in DCC is an indexed family of monads, where the indices are security levels and a type like "T_A int" can be interpreted as "an integer value with security level A". In 2006, Abadi discovered a remarkable connection between DCC and authorization logics, which are used to express access-control policies. The insight is that the type "T_A P" can be read as "A says P". Here, A is treated as a principal rather than a security level.

This talk will explain these two interpretations of DCC and describe some preliminary work on trying to combine them in a language that can express both information flow and access control policies.

See slides.

Derek Dreyer, Why Applicative Functors Matter. (Tue 2007-07-17, 10:00)

Despite much work on the semantics of ML modules, the key distinction between the SML and OCaml module systems - namely, that SML supports generative functors, while OCaml supports applicative functors - remains an issue that few people understand. The main reason, I claim, is that while generativity is critical for ensuring data abstraction in the presence of mutable state, the traditional motivations given for applicativity based on higher-order module programming are not per se very compelling. Another issue is that the question of how to define module equivalence - a key question in applicative functor semantics - has not been satisfactorily resolved.

In this talk, I offer a new, more compelling justification for the importance of applicative functors that I discovered while developing (with Bob Harper and Manuel Chakravarty) modular type classes (POPL'07). While the original MTC framework only supported generative functors, I argue that extending it with applicative functors is a good idea for several reasons. First, applicative functors significantly enhance the expressiveness of modular type classes by enabling the definition of implicitly-typed operations over generic data types implemented by functors. Second, they allow us to more properly model the functionality of Chakravarty et al.'s associated types. Lastly, the extension of MTC to support applicative functors suggests a simple, elegant generalization of term overloading to module overloading, i.e. the ability to project components from an (applicative) functor as if it were a structure.

The use of applicative functors in MTC demands that their dynamic components be purely functional. As it turns out, this purely functional restriction also makes it easy to define module equivalence in a way that conservatively approximates contextual equivalence, and thus preserves data abstraction. This leads me to the conclusion that the complexities of earlier module systems (including my own) should be dropped in favor of the simple slogan: "A functor is applicative if and only if its dynamic components are purely functional."

See slides.

Olivier Danvy, Reduction contexts without headaches (A Functional Pearl). (Tue 2007-07-17, 11:00)

Reduction contexts are notoriously difficult to phrase correctly in a Felleisen-style reduction semantics. Against this backdrop, we propose two simple stepping stones for constructing a grammar of reduction contexts: (1) write a compositional function searching a term for the first redex, if any, given a reduction strategy; (2) extend this search function to yield the redex together with the corresponding `fill' function that, given the contractum, yields the term after one reduction step. This fill function can be constructed in an inherited fashion or in a synthesized fashion, to use the vocabulary of attribute grammars. In either case, the result is directly usable to implement a one-step reduction function and from there, evaluation as iterated reduction. As for the reduction contexts, their outside-in representation is obtained by defunctionalizing the synthesized fill function and their inside-out representation is obtained by defunctionalizing the inherited fill function. A side benefit: if the search for the next redex is implemented as a compositional function (i.e., it is not a full-blown relation) in the first place, the unique-decomposition property, which is mandatory to prove for deterministic languages, follows as a corollary.

In this talk, we illustrate the method with a simple example of conditional arithmetic expressions. This method scales to more substantial languages (it passes the Flatt test) and it suggests new connections between the representations of various operational semantics: structural operational semantics, reduction semantics, abstract machines, and natural semantics.

See slides.

Paul Hudak, Plugging a Space Leak with an Arrow. (Tue 2007-07-17, 12:00)

See slides.

Larry Wall, Perl 6. (Tue 2007-07-17, 14:00)

See slides.

Peter Thiemann, Towards Interface Files for Haskell. (Tue 2007-07-17, 15:00)

Interface types are a very useful concept in object-oriented programming languages. They can enable a clean programming style which relies only on interfaces without revealing their implementation.

Haskell's type classes provide a closely related facility for stating an interface separately from its implementation. However, there is no simple mechanism to hide the identity of the implementation type of an interface as is often desired for constructing libraries. This work provides such a mechanism through the integration of lightweight existential datatypes into Haskell. A simple source-to-source transformation enables the use of interface types with subtyping in Haskell.

See slides.

Simon Marlow, Lightweight Concurrency Primitives for GHC. (Tue 2007-07-17, 16:00)

The Glasgow Haskell Compiler (GHC) has quite sophisticated support for concurrency in its runtime system, which is written in low-level C code. As GHC evolves, the runtime system becomes increasingly complex, error-prone, difficult to maintain and difficult to add new concurrency features.

This talk presents an alternative approach to implement concurrency in GHC. Rather than hard-wiring all kinds of concurrency features, the runtime system is a thin substrate providing only a small set of concurrency primitives, and the rest of concurrency features are implemented in software libraries written in Haskell. This design improves the safety of concurrency support; it also provides more customizability of concurrency features, as new concurrency features can be developed as Haskell library packages and deployed modularly.

See slides.

Benjamin C. Pierce, Fun With String Lenses. (Tue 2007-07-17, 17:00)

A LENS is a bidirectional program. When read from left to right, it denotes an ordinary function that maps inputs to outputs. When read from right to left, it denotes an "update translator" that takes an input together with an updated output and produces a new input that reflects the update. Many variants of this idea have been explored, but none deal fully with ORDERED data. If, for example, an update changes the order of a list in the output, the items in the output list and the chunks of the input that generated them from can be misaligned, leading to lost or corrupted data. We attack this problem in the context of bidirectional transformations over strings, the primordial ordered data type.

We first propose a collection of bidirectional STRING LENS COMBINATORS, based on familiar operations on regular transducers (union, concatenation, Kleene-star) and with a type system based on regular expressions. We then design a new semantic space of DICTIONARY LENSES, enriching the lenses of Foster et al. to support two additional combinators for marking "reorderable chunks" and their keys. To demonstrate the effectiveness of these primitives as a foundation for programming bidirectional string transformations, we describe the design and implementation of Boomerang, a full-blown BIDIRECTIONAL PROGRAMMING LANGUAGE with dictionary lenses at its core. We have used Boomerang to build transformers for complex real-world data formats including the SwissProt genomic database.

We formalize an essential feature of RESOURCEFULNESS - the correct use of keys to associate chunks in the input and output - by defining a refined space of QUASI-OBLIVIOUS LENSES. Several previously studied properties of lenses turn out to have compact characterizations in this space.

See slides.

Kathleen Fisher, From Dirt to Shovels:Inferring PADS descriptions from ASCII Data. (Wed 2007-07-18, 09:00)

Kathleen Fisher, David Walker, Peter White, and Kenny Zhu

An ad hoc data source is any semistructured data source for which useful data analysis and transformation tools are not readily available. Such data must be queried, transformed and displayed by systems administrators, computational biologists, financial analysts and hosts of others on a regular basis. In this paper, we demonstrate that it is possible to generate a suite of useful data processing tools, including a semi-structured query engine, several format converters, a statistical analyzer and data visualization routines directly from the ad hoc data itself, without any human intervention. The key technical contribution of the work is a multi-phase algorithm that automatically infers the structure of an ad hoc data source and produces a format specification in the PADS data description language. Programmers wishing to implement custom data analysis tools can use such descriptions to generate printing and parsing libraries for the data. Alternatively, our software infrastructure will push these descriptions through the PADS compiler and automatically generate fully functional tools. We evaluate the performance of our inference algorithm, showing it scales linearly in the size of the training data—, completing in seconds, as opposed to the hours or days it takes to write a description by hand. We also evaluate the correctness of the algorithm, demonstrating that generating accurate descriptions often requires less than 5% of the available data.

See slides.

Fritz Henglein, Generic Sorting and Multiset Discrimination. (Wed 2007-07-18, 10:00)

We show that sorting discriminators, which sort and partition their input, can be defined generically on a rich class of total preorders. This yields worst-case linear-time sorting algorithms that avoid the information-theoretic bottleneck of defining a comparison function generically and then using that as an argument in a parametric comparison-based sorting algorithm.
See slides.

John Launchbury, Haskell Program Coverage Toolkit. (Wed 2007-07-18, 11:00)

See slides.

Didier Remy, A Fully Graphical presentation of MLF. (Wed 2007-07-18, 12:00)

MLF types generalize second-order types with sharing and implicit instantiation, in order to seamlessly merge ML-style type inference with System F polymorphism. A graphic representation of MLF types has recently been introduced. We extend these (raw) graphic types to (graphic) type constraints, by introducing special nodes to represent type schemes and special edges to represent unification and instantiation constraints. We interpret type constraints as sets of graphic types that satisfy the constraints. We give a set of semantic preserving transformations on type constraints and propose a strategy to reduce type constraints to unification problems on graphic types. We describe MLF type-inference problems as typing constraints and argue that under reasonable assumptions, the complexity of type inference for MLF is almost linear, as for ML.We also introduce a more relaxed interpretation of typing constraints up to a relation called inverse abstraction, which characterizes an implicitly typed version of MLF. We show subject reduction and progress for this system, thus also proving type soundness for the explicitly typed system.
See slides, background material.

Amal Ahmed, Equivalence-Preserving Compilation. (Thu 2007-07-19, 09:00)

See slides.

Xavier Leroy, Formal verification of a compiler front-end for mini-ML. (Thu 2007-07-19, 10:00)

(With Zaynah Dargaye and Andrew Tolmach)

Formal compiler verification consists of applying program proof to compilers in order to prove that the meaning of the source program is preserved through compilation. After some initial successes with the Compcert prototype compiler from C-light to PowerPC assembly, we are working on a front-end compiler for mini-ML (call-by-value, untyped lambda-calculus with datatypes and shallow pattern-matching) down to Cminor (one of the intermediate languages of Compcert). The proof of semantic preservation for this front-end goes down relatively well. However, we also need to prove correct a memory allocator and garbage collector, and interface this proof with that of the compiler front-end. This problem is not solved yet; some approaches are described in the talk.

See slides.

David MacQueen, Abstract Machines for Type Inference Using Ranked Type Variables. (Thu Wed 2007-07-18, 11:00)

See slides.

Mark Jones, Revenge of Functional Dependencies. (Thu 2007-07-19, 12:00)

See slides.

Bob Harper, Extensible Indexed Types. (Thu 2007-07-19, 14:00)

Daniel R. Licata and Robert Harper

Indexed families of types are a way of associating run-time data with compile-time abstractions that can be used to reason about them. We propose an extensible theory of indexed types, in which programmers can define the index data appropriate to their programs and use them to track properties of run-time code. The essential ingredients in our proposal are (1) a logical framework, which is used to define index data, constraints, and proofs, and (2) computation with indices, both at the static and dynamic levels of the programming language. Computation with indices supports a variety of mechanisms necessary for programming with extensible indexed types, including the definition and implementation of indexed types, meta-theoretic reasoning about indices, proof-producing run-time checks, computed index expressions, and run-time actions of index constraints.

Programmer-defined index propositions and their proofs can be represented naturally in a logical framework such as LF, where variable binding is used to model the consequence relation of the logic. Consequently, the central technical challenge in our design is that computing with indices requires computing with the higher-order terms of a logical framework. The technical contributions of this paper are a novel method for computing with the higher-order data of a simple logical framework and the integration of such computation into the static and dynamic levels of a programming language. Further, we present examples showing how this computation supports indexed programming.

See slides.

Ralf Hinze, Closed and Open Recursion. (Thu 2007-07-19, 15:00)

See slides.

Discussion (John Launchbury), What has FP to offer for high-robustness systems? (Thu 2007-07-19, 16:00)


Olin Shivers, Higher-order Flow Analysis with DDP. (Fri 2007-07-20, 09:00)

See slides.

Lennart Augustsson, Perverted Haskell. (Fri 2007-07-20, 10:00)

See slides.

Colin Runciman, Polycell: a rehabilitation of lazy lists (Fri 2007-07-20, 11:00)

See slides.

Photos


Ralf Hinze (email: ralf@informatik.uni-bonn.de)