Informal minutes of IFIP WG2.8 26th meeting

Informal minutes of IFIP Working Group 2.8
26th meeting, Frauenchiemsee, Germany
June 7-12, 2009

List of attendees

Members:

Arvind
Lennart Augustsson
Manuel M.T. Chakravarty
Koen L. Claessen
Olivier Danvy
Jack P. Dennis
Kathleen Fisher
Fritz Henglein
Ralf Hinze (secretary)
Richard B. Kieburtz
John Launchbury
Xavier Leroy
David B. MacQueen
Martin Odersky
Simon Peyton Jones
Norman Ramsey
Didier Remy
Colin Runciman
Joe Stoy
Peter Thiemann
David A. Turner
Stephanie Weirich
Steve Zdancewic
Observers:

Conor McBride
Gabriele Keller
Julia Lawall
Marc Pouzet
Don Stewart
Gilad Bracha
Michael Sperber
Martin Hofmann

Technical presentations

Kathleen Fisher, Typing Directories. (Mon 2009-06-08, 09:20-10:00)

Joint work with David Walker and Kenny Zhu.
PADS describes the contents of individual ad hoc data files, but has no provisions for describing collections of files, i.e., directories. In this talk, I explore examples where having a declarative description of directories as well as files would be useful, including websites, source code trees, source code control systems, operating systems, and scientific data sets. As part of this exploration, I identify essential features of a directory description language and useful tools that might be produced from such a description. I end with a series of questions about how such a language might most easily be implemented in the context of Haskell.

See slides.

Mike Sperber, Data is from Venus - Database is from Mars. (Mon 2009-06-08, 10:00-10:45)

In application development, when a relational database enters the scene, bad things tend to happen: Databases are good at providing persistence, transactional integrity, and fast searches, but their data-modelling capabilities are limited. The lack of a direct model for sum types is particularly significant. Taken together with typical development processes as well as the limited capabilities of object-relational-mapping frameworks to express relationships between database models and data models, this means that the database model often influences the data model in pernicious ways. Moreover, the resulting lack of modularity often makes the software hard to modify or extend. The talk describes the underlying problems, as well as a query framework that addresses them. The query framework lets the programmer express object-relational mappings using arbitrary functions, and thus serves to decouple the database model from the data model. The framework was initially implemented in Scheme, and subsequently transliterated to Java for use in a large enterprise application.

The slides can be obtained from Mike.

Julia Lawall, Isomorphisms for the Coccinelle Program Matching and Transformation Engine. (Mon 2009-06-08, 11:15-12:15)

Joint work with Jesper Andersen, Julien Brunel, Damien Doligez, Rene Rydhof Hansen, Bjoern Haagensen, Gilles Muller, Yoann Padioleau, and Nicolas Palix
Program matching is the process of finding the places in a piece of software that are matched by a given pattern. We have developed the Coccinelle tool for performing program matching in C code, with the goal of automating code evolution and finding bugs. Coccinelle pattern specifications are in the form of C code, abstracting with metavariables and operators representing arbitrary code sequences.
While the use of C code for the pattern language makes patterns easy to write, it can be too specific, as there is a need to take into account every possible syntactic form in which a concept can be expressed. For example, in C, a NULL test can be written as x == NULL or as !x. To avoid requiring the user to consider all of these variations, we have developed the notion of isomorphisms, which are collections of code fragments that can be considered to be the same in the context of a pattern specification. In this talk, we consider the conditions under which such isomorphisms are safe to apply.

See slides.

Special interest groups, short introductory round. (Mon 2009-06-08, 14:00-14:15)

Ralf Hinze, The Chocolate Game. (Mon 2009-06-08, 14:15-14:25)

The chocolate game is a two-person game. The players eat up a chocolate bar according the following rule: the player whose turn it is breaks the bar in two pieces, either horizontally or vertically, and eats the smaller one. The players make alternating moves; the game ends when the bar consists of a single square. The player whose turn it is loses and consequently has to pay the chocolate bar.
Devise a winning strategy (the objective is to avoid paying, not to eat as much as possible).

See puzzle and solution.

Colin Runciman, Partitions. (Mon 2009-06-08, 14:25-14:35)

The partitions of a natural number n are the bags of positive integers that sum to n. Eg. the partitions of 3 are {1,1,1}, {1,2} and {3}.

Define a function parts :: Int -> [[Int]] so that parts n lists the partitions of n, representing bags by lists. You should find, for example, that length (parts n) is 627 for n=20, and 5604 for n=30.

Define reduction count suitably for the programming language you have used - eg. formulate your definition of parts to use an applicative style and count n-ary applications as reductions. Hence determine the number of reductions per second when computing, say, parts 50.

Determine the clock rate for the computer you are using. Hence find the number of reductions per clock tick. Answers above 0.1 are encouraging, above 0.5 competitive and above 1.0 exciting.

See slides.

Koen Classen, Edit distance. (Mon 2009-06-08, 14:35-14:45)

We present a programming challenge where the objective is to write a complete QuickCheck specification for the edit distance problem. The technique we end up using is useful for most search- and optimization problems.

See slides (odp).

Koen Classen, Shrinking functions. (Mon 2009-06-08, 14:45-15:20)

In QuickCheck, you can quantify over functions, but when a counter example is found, the functions can neither be shown nor shrunk. I will show a technique that we can use to do both of these. It turns out you cannot show a function unless you shrink it first!

See slides (odp).

Peter Thiemann, Recency Types for JavaScript. (Mon 2009-06-08, 15:20-16:00)

(joint work with Phillip Heidegger)
Scripting languages are popular because their dynamic type systems accelerate the development of short programs. As larger programs are built with those languages, static analysis becomes an important tool for detecting programming errors.
We specify such an analysis as a type system for a call-by-value lambda calculus with objects and prove its type soundness. The calculus models essential features of JavaScript, a widely used dynamically-typed language: objects as property maps, type change during object initialization, and prototypes.
As a more general technical contribution, our work demonstrates that the idea of recency abstraction can be transferred from abstract interpretation to a type system. We model recency information with a notion of precise object pointers that enables strong, type changing updates of object types during a generalized initialization phase. The same precise object pointers allow for an accurate treatment of the prototype mechanism. Unlike linear types, precise object pointers can refer to ordinary, imprecise object pointers and vice versa.

See slides.

Richard B. Kieburtz, Building a Haskell verifier from component theories. (Mon 2009-06-08, 16:30-17:10)

This talk describes the architecture of an experimental verifier for properties of Haskell 98 programs. In this approach, a semantic theory for Haskell is factored into component subtheories that correspond to its type structures. A verifier is then constructed as a collection of modules that individually realize decision procedures for decidable fragments of the theory components. These approximate the complete theory, and while incomplete, are sufficient to verify, automatically, many simple but useful program property assertions.
A prototype verifier is based upon the Decision Procedure Toolkit (DPT), an open-source collection of cooperating decision procedures implemented in OCAML by a research team at Intel. At the center of the toolkit is a fast, SAT solver that implements the Davis-Putnam-Logeman-Loveland procedure for propositional logic.

See slides.

Xavier Leroy, Programming with dependent types: passing fad or useful tool? (Mon 2009-06-08, 17:15-18:00)

I've been programming medium-sized applications (compilers, etc) within the Coq specification language, which supports both ML/Haskell-style programming (with plain functions and data) as well as dependently-typed programming (where logical properties are embedded in the types of data and functions). The talk presents the experience gained by doing so. A couple of nice uses of dependent types are described: attaching invariants to data structures (abstract data types) and to monadic computations. With these exceptions, I still find it more effective to write plain ML/Haskell-style programs in Coq and separately state and prove properties about them.

See slides.

Steve Zdancewic, Concurrency and Classical Linear F. (Tue 2009-06-09, 09:00-09:50)

In this talk, I explore using classical linear logic as a type system for a concurrent variant of System F. First, I'll describe a variant of System F that provides linear types distinguished by kinds. In cotrast with traditional presentations of linear logic, which use the ! constructor to express unrestricted types, this language, called F^o, is closer to ordinary System F. I give examples of how linearity can be used to enforce protocols over existential packages. Next, I'll show how we can move to a classical variant of linear logic by adding just two primitives for concurrency. The resulting language allows for encoding session types and hopefully yields a "pure" style of concurrent programming.
This is work in progress with my collaborator Karl Mazurak.

See slides.

Conor McBride, Winging It. (Tue 2009-06-09, 09:50-10:40)

I propose a small, readily simulated extension to Haskell's kind and type system, allowing types to become kinds and constructor-form expressions to stand in types. I exploit this extension to explore improvements in precision for both data and control structures. I develop standard examples of data structures indexed by type-level values in the style of dependently typed languages. Where the Functor class captures untyped superstructure relationships, a new IFunctor class expresses a similar connection between indexed value sets and indexed superstructure sets, lifting index-preseving maps
type s :-> t = forall i. s i :-> t i
from one to the other. Through the Curry-Howard lens, an IFunctor looks rather like a monotone predicate transformer.
class IFunctor phi where imap :: (s :-> t) -> phi s :-> phi t
In the rest of the talk, I examine the notion of IMonad which falls naturally from this IFunctor abstraction. IMonads model indexed notions of computation with indexed notions of value, where the index represents the state of the system with which computations interact. Through the Curry-Howard lens, an indexed notion of value is a predicate on states, and an IMonad, phi say, is a predicate transformer -- phi t i holds if a state satisfying t is reachable by a phi-computation starting in state i. The Kleisli arrow types, s :-> phi t, are thus Hoare triples, requiring postcondition t to be reachable if precondition s is satisified. Crucially, to produce such an arrow, one must be ready for any state which satisfies the precondition. Correspondingly, IMonads model computation in changing circumstances. For example, it's easy to give types to file-open and file-read which ensure that we can only read from a file if we know it has opened successfully.

See slides and website.

Stephanie Weirich, Doing dependent types wrong without going wrong. (Tue 2009-06-09, 11:00-12:00)

(joint work with Limin Jia, Jianzhou Zhou and Vilhelm Sjöberg)
Looking at a type soundness proof for a full-spectrum CBV dependently-typed language that allows non-terminating expressions in types.

See slides.

Special interest groups, Working. (Tue 2009-06-09, 14:00-16:30)

Special interest groups, Reporting back. (Tue 2009-06-09, 16:30-18:00)

Dependent types. Joe Stoy, Stephanie Weirich.
Adding dependent types to GHC. Simon Peyton Jones. Didier Remy.
Revision control and patches. Julia Lawall.
Composable GPU programming. Fritz Henglein. See slides.
Coq in the classroom. Olivier Danvy.

Don Stewart, Improving Data Structures. (Wed 2009-06-10, 09:00-09:50)

On viewing the heap live, including sharing and unboxing information, and then turning to the question of improving the uniform polymorphic box representation to yield more efficient data structures.

See slides.

Fritz Henglein, Optimizing relational algebra operations using generic partitioning discriminators and lazy products. (Wed 2009-06-10, 10:00-10:30)

We show how to implement in-memory execution of the core relational algebra operations of projection, selection and cross-product efficiently, using discrimination-based joins and lazy products. We introduce the notion of (partitioning) discriminator, which partitions a list of values according to a specified equivalence relation on keys the values are associated with. We show how discriminators can be defined generically, purely functionally, and efficiently (worst-case linear time) on top of the array-based basic multiset discrimination algorithm of Cai and Paige (1995). Discriminators provide the basis for eliminating duplicates, set union, set difference and discrimination- based joins, a new technique for computing joins that requires neither hashing nor sorting.
We represent a cross-product lazily as a formal pair of the argument sets (relations). This allows the selection operation to recognize on the fly whenever it is applied to a cross-product, in which case it can choose an efficient discrimination-based equijoin implementation. The techniques subsume most of the optimization techniques based on relational algebra equalities, without need for a query preprocessing phase. They require no indexes and behave purely functionally. Full source code in Haskell extended with Generalized Algebraic Data Types (GADTS) is included. GADTs are used to represent sets (relations), projections, predicates and equivalence denotations in a type safe manner. It should be emphasized that the code is only intended for and applicable to operations on in-memory data; that is, in a random-access memory cost model. Most emphatically, for performance reasons it is, as given, inapplicable to data stored on disk or on the network.

See slides.

Gabriele Keller, Adding Support for Multi-Dimensional Arrays to Data Parallel Haskell. (Wed 2009-06-10, 11:10-12:00)

Data Parallel Haskell (DPH) was designed with irregular data parallel applications in mind. Therefore, it provides nested data parallel arrays - that is, arrays with subarrays of arbitrary size. So far, DPH does not support regular structures, such as n-dimensional arrays. Therefore, they have to be expressed in terms of nested arrays as well. This is both inconvenient and inefficient. We are therefore currently adding support for n-dimensional arrays to DPH. We chose the representation of the arrays in a way to enable the definition of shape polymorphic operations, while still trying to do as much static shape checking as possible. In particular, we track the dimensionality of the arguments and result of operations, but not their exact size. In this talk I discuss the design of the parallel n-dimensional array library that is part of DPH, and present benchmarks for a set of common array operations and applications.

See slides.

Gilad Bracha, Newsspeak and its Children: Avarice and Sloth. (Thu 2009-06-11, 09:00-10:10)

Newspeak is a dynamically typed, class based language, in the Smalltalk tradition. Like Self, Newspeak is a message based language: all computation - even an object's own access to its internal structure - is performed by sending messages to objects. Classes can be nested arbitrarily, as in Beta. Since all names denote message sends, all classes are virtual; in particular, superclasses are virtual, so all classes act as mixins and class hierarchy inheritance is supported.
Newspeak has no global scope. Instead, top level classes act as module definitions, which are independent, immutable, self-contained parametric namespaces. They can be instantiated into modules which may be stateful and mutually recursive. Side by side deployment and simultaneous use of multiple alternate library implementations fall out naturally in this model. Newspeak modules are naturally sandboxed in line with the object capability model of security.
Newspeak is an imperative language, but its main design principles apply to purely functional languages as well. We discuss two such possible languages: Sloth, which is lazy, and its eager sibling Avarice.

See slides.

Marc Pouzet, Clocks as Types in Synchronous Dataflow Languages. (Thu 2009-06-11, 10:30-11:30)

Synchronous data-flow languages define a particular class of Kahn process networks which can be executed without any buffer. This is ensured by a dedicated type system which express properties about the clocks of signals, i.e., when data are ready to be consumed. I will show how to derive these typing constraints from the data-flow semantics of a minimal set of primitives. Then, I will show how to extend the synchronous model so as to deal with periodic and quasi-periodic systems, bounded jitter and delays.

See slides.

Martin Oderski, Down with Higher-Kinded Types. (Thu 2009-06-11, 11:30-12:00)

abstract missing

See slides.

Martin Hofmann, Relational semantics for effect-based program transformations: higher-order store. (Thu 2009-06-11, 14:00-14:45)

We give a denotational semantics to an effect system tracking reading and writing to global variables holding values of arbitrary type including effect-triggering functions. Effects are interpreted in terms of the preservation of certain binary relations on the store.
The semantics validates a number of effect-dependent program equivalences and can thus serve as a foundation for effect-based compiler transformations.
The definition of the semantics requires the solution of a mixed-variance equation which is not accessible to the hitherto known methods. We illustrate the difficulties with a number of small example equations one of which is still not known to have a solution.

See slides.

Manuel Chakravarty, WHAT. (Thu 2009-06-11, 14:45-15:20)

Software needs to expose increasing amounts of explicit parallelism to fully utilise modern processors. The current extreme are Graphics Processing Units (GPUs) that require thousands of data-parallel threads to produce maximum performance. Purely functional programming and especially collective array operations are a good match for data-parallel programming. In this talk, I briefly review two approaches to data-parallel programming in Haskell and present some first benchmarks. The two approaches differ in that the first implements the well understood model of flat data-parallelism, whereas the second implements the much more ambitious model of nested data- parallelism. We target GPUs with the flat parallel model and multicore CPUs with the nested parallel model.

See slides.

Ralf Hinze, The Lifting Lemma. (Thu 2009-06-11, 15:20-15:55)

Mathematicians routinely lift operators to structures, so that, for example, + not only denotes plain addition, but also addition lifted point-wise to sets A + B = { a + b | a <- A, b <- B }, or addition lifted point-wise to functions (f + g)(x) = f(x) + g(x).
Haskell programmers routinely lift operators to container types so that + either adds two numbers, or adds two elements of a pointed type (Maybe t), or sequences two parsers adding their semantic values.
Given a base-level property, does this property also hold when lifted to a structure or a container type? This talk answers this question making use of the notion of an applicative functor.

See slides.

Lennart Augustsson, Back to Basic. (Thu 2009-06-11, 15:55-16:05)

abstract missing

See slides.

Simon Peyton Jones, Let should not be generalised. (Thu 2009-06-11, 16:05-16:30)

I argue that local 'let' definitions should not be generalised, completely contrary to the status quo for ML-like languages. I explain why losing generalisation does little harm, and why it leads to a substantial simplification, at least once one adds GADTs and a more sophisticated notion of type equality to the language.

See slides.

Business meeting. (Thu 2009-06-11, 16:45-19:00)

Arvind, Bounded Dataflow Networks (BDNs) and Latency-Insensitive (LI) Circuits. (Fri 2009-06-12, 09:00-09:30)

We present a theory for modular refinement of Synchronous Sequential Circuits (SSMs) using bounded dataflow networks (BDNs). Our theory would let one replace, say a 3-ported register file in an SSM by a one-ported register file without affecting the correctness of the SSM. We provide a procedure for implementing any SSM into an LI-BDN, a special class of BDNs with some good compositional properties. We show that the Latency-Insensitive property of LI-BDNs is preserved under parallel and iterative composition of LI-BDNs. Our theory permits one to make arbitrary cuts in an SSM and turn each of the parts into LI-BDNs without affecting the overall functionality. We can further refine each constituent LI-BDN into another LI-BDN which may take different number of cycles to compute. If the constituent LI-BDN is refined correctly we guarantee that the overall behavior would be cycle-accurate with respect to the original SSM. We give several examples to show how our theory supports a generalization of previous techniques for Latency-Insensitive refinements of SSMs.

See slides.

Koen Classen, Solution to the programming problem. (Fri 2009-06-12, 09:30-09:40) See slides.

Ralf Hinze, Solution to the programming problem. (Fri 2009-06-12, 09:40-09:50) See slides.

Colin Runciman, Solution to the programming problem. (Fri 2009-06-12, 09:50-10:00) See slides.

David MacQueen, Higher-order modules. (Fri 2009-06-12, 10:20-10:50)

abstract missing

See slides.

Olivier Danvy, Defunctionalized Interpreters for Call-by-need Programming Languages. (Fri 2009-06-12, 10:50-11:30)

Starting from a hygienic representation of the standard call-by-need reduction of the lambda-calculus due to Ariola and Felleisen, and to Maraist, Odersky and Wadler, we refocus it into the storeless representation of a lazy abstract machine. We then refunctionalize this representation of an abstract machine into the representation of a continuation-passing interpreter, which we then map back to direct style. The result is a continuation-free representation of a natural semantics for lazy evaluation that embodies Barendregt's variable convention. The resulting spectrum of straightforwardly inter-derivable semantic artifacts touches upon many other such artifacts that have been independently worked out, most recently by Garcia, Lumsdaine and Sabry at POPL'09, for example, and that require independent soundness proofs on a case-by-case basis.

See slides.

Didier Remy, System F with explicit type-level retyping functions. (Fri 2009-06-12, 11:30-12:00)

MLF is a type system that seamlessly merges ML-style implicit but second-class polymorphism with System-F explicit first-class polymorphism. We present xMLF, a Church-style version of MLF with full type information that can easily be maintained during reduction. All parameters of functions are explicitly typed and both type abstraction and type instantiation are explicit. However, type instantiation in xMLF is more general than type application in System-F. We give xMLF a small-step reduction semantics where reduction is allowed in any context and show that reduction is confluent and type-preserving. We also show that both subject reduction and progress hold for weak-reduction strategies, including the call-by-value with the value-restriction. There is a type preserving encoding of MLF into F (which we will not describe), which ensures type soundness for the most general version of MLF.

See slides.

Ralf Hinze (email: ralf.hinze@comlab.ox.ac.uk)

Informal minutes of IFIP Working Group 2.8 26th meeting, Frauenchiemsee, Germany June 7-12, 2009

Informal minutes of IFIP Working Group 2.8
26th meeting, Frauenchiemsee, Germany
June 7-12, 2009