Informal minutes of IFIP WG2.8 22nd meeting

Informal minutes of IFIP Working Group 2.8
22nd meeting, Kalvi Manor, Estonia
October 1-4, 2005

List of attendees

Members:

Arvind
Lennart Augustsson
Olivier Danvy
Kathleen Fisher
Fritz Henglein
Ralf Hinze (secretary)
John Hughes
John Launchbury (chair)
Xavier Leroy
Greg Morrisett
Simon Peyton Jones
Benjamin Pierce
Olin Shivers
Satnam Singh
Peter Thiemann
Stephanie Weirich
Observers:

Juhan Ernits (local)
Andrzej Filinski
Jacques Garrigue
Jim Grundy
Conor McBride
John Reppy
Ando Saabas (local)
Olha Shkaravska (local)
Don Syme
Andrew Tolmach
Tarmo Uustalu (local)
Varmo Vene (local)
Vesal Vojdani (local)

Technical presentations

Xavier Leroy, Fun with coinduction in big-step operational semantics (Sat 2005-10-01, 16:30)

Big-step operational semantics reflect the structure of the program in the evaluation derivation. This makes them very convenient for proving the correctness of program transformations, such as compilation passes. However, they do not account for non-terminating programs, and make it hard to observe a trace of effects during execution. Small-step semantics, on the other hand, are convenient to reason about non-terminating programs and observable effects, but lead to much more difficult correctness proofs for compilation passes.
In this work in progress, we try to extend big-step semantics to describe non-terminating programs as well as terminating programs, using coinductive definitions. We use the simply-typed CBV lambda-calculus as a concrete example, and the Coq proof assistant as our investigation tool.
Cousot and Cousot (POPL 1992) introduced co-inductive big-step characterizations of non-terminating programs. We also consider a co-inductive interpretation of the normal evaluation rules, capturing both finite and infinite evaluation derivations. While the former is well-behaved and matches exactly infinite reduction sequences in small-step semantics, the latter exhibits strange behaviors. However, both coinductive semantics are preserved by compilation down to a simple abstract machine. The use of these coinductive big-step semantics for proving type soundness remains to be investigated.

Jim Grundy, Contexts in reFLect: A Theorem Proving Meta-Language (Sat 2005-10-01, 17:30)

The talk describe the context mechanism of the functional programming language reFLect; a reflective functional programming language developed and used at Intel for a variety of tasks in the design and verification of digital hardware.
The reFLect language supports a reflection mechanism allowing nested quotation and anti-quotation in a style similar to LISP. The context mechanism provides a syntactically convenient means for both constructing quoted terms, for and inspecting them by pattern matching. A key benefit of the context mechanism is that it makes it convenient for programs to access all the subexpressions of a quoted expression that are surrounded by a single logical level of quotation regardless of the actual quotations and anti-quotations surounding them. This is the common requirement in theorem proving applications, and in our own theorem prover we find it significantly simplifies our code.
The talk presented a subset of the material described in the paper A Reflective Functional Language for Hardware Design and Theorem Proving by Jim Grundy, Tom Melham and John O'Leary, which is awaiting publication by the Journal of Functional Programming.

See slides.

Greg Morrisett, Simplifying Regions (Sat 2005-10-01, 18:30)

Cyclone is a type-safe dialect of C that incorporates a region-based, type-and-effects system (a la Tofte & Talpin) intended to support manual memory management. Cyclone extends the Tofte-Talping system with region sub-typing, 1st-class regions and unique pointers. Constructing a model and proving type soundness has proven intractible until now.
In the talk, I sketch a translation from core-Cyclone to the polymorphic, linear lambda calculus extended with a very simple notion of regions. Proving soundness for this target language is very easy, and the translation from the source is also relatively straightforward. I believe this leads to new insights into regions and runST and suggests that linearity might be a better foundation for tracking Kripke-like effects.

See slides. For more information on Cyclone, see homepage. This is joint work with Amal Ahmed and Matthew Fluet.

John Reppy, Towards the optimzation of Concurrent ML (Sun 2005-10-02, 09:00)

CML is a statically-typed higher-order concurrent language that is embedded in Standard ML. Its most notable feature are first-class synchronous operations, which allow programmers to encapsulate complicated communication and synchronization protocols as first-class abstractions. This feature encourages a modular style of programming, where the actual underlying channels used to communicate with a given thread are hidden behind data and type abstraction.
While CML has been in active use for well over a decade, little attention has been paid to optimizing CML programs. In this talk, we present a new program analysis for statically-typed higher-order concurrent languages that is a significant step toward optimization of CML. Our technique is modular (i.e., it analyses and optimizes a single unit of abstraction at a time). The analysis has two major components: the first is a type-sensitive CFA that uses the program's type-abstractions to compute more precise results. We then construct an extended control-flow graph from the results of the CFA and analyze the flow of known channel values using the graph. The analysis is designed to detect special patterns of use, such as one-shot channels, fan-in channels, and fan-out channels. We expect that these special patterns can be exploited by using more efficient implementations of channel primitives.

See slides.

Peter Thiemann, A Type-Safe DOM API (Sun 2005-10-02, 09:45)

DOM (Document Object Model) is the W3C recommendation for an API for manipulating XML documents. The API is stated in terms of a language neutral IDL with normative bindings given for Java and ECMAScript. The type system underlying the DOM API is a simple object-based one which relies entirely on interfaces and interface extension. However, this simplicity is deceiving because the DOM architects implicitly impose a large number of constraints which are only stated informally. The long list of exceptions that some DOM methods may raise witnesses these constraints.
We define a refinement of Java's type system which makes most of these constraints accessible and thus checkable to the compiler. Technically, we graft a polymorphic annotation system on top of the host language's types and propagate the annotations using ideas borrowed from affine type systems. There is a type soundness proof with respect to an operational semantics of a Java core language.

Don Syme, F#: Experiences and Ramifications (Sun 2005-10-02, 11:00)

The aim of this talk was to convincingly demonstrate how much functional programming has to gain from (and contribute to) modern programming platforms such as .NET and Java. In particular, I presented F# by demonstration and by short summary, emphasizing how an efficient, interoperable functional language implementation has been achieved through relatively little investment. I emphasized the importance of bi-directional interoperability - and a "publish, publish, publish" mentality that leaves no barriers in the way of people from related programming disciplines accessing components written in your language. I discussed the ramifications for the design principles that might under-pin a future language such as a "StrictHaskell", in particular the observation that the language design should explicitly make room for extensions to and manifestations of the language that would enable it to be used in the context of pragmatic settings such as .NET.

Conor McBride, Erasing the static parts of dependently typed programs (Sun 2005-10-02, 11:45)

I sketched some work in progress (joint with Edwin Brady), aimed at recovering some sort of erasure semantics for a dependent type theory. More details can be found here.
The basic idea is to echo System F's 'big' and 'small' lambda and application by marking each pi, lambda and application with a numeric level; the type system polices levels appropriately and that terms at lower (more dynamic) levels are always typable at higher (more static) levels. For each i, the operator which erases all terms marked as above level i respects the core reduction rules of the calculus. Exactly how one separates and assigns levels is motivated by the observations one wishes to make on erased terms: at run time we care only about weak head-normalisation in the empty context; at compile time we need partial evaluation and type checking in any context.
If we introduce new datatypes by adding constructors and an induction operator, the types of these things must also be marked in such a way that the erasure still respects the computational behaviour of induction. Edwin Brady's doctoral research shows one way to do this and suggests that instantiating this framework with at least three levels serves some purpose: level 0 terms survive to run time; level 1 terms support partial evalution and type checking; level 2 terms can always be recovered from known information. We can go erase all arguments which serve only to indicate types or to act as proofs, just as we should hope to do.
It is perfectly possible to allow ordinary values right of the colon for typechecking, and still erase types to the left of it for execution.

See slides.

Discussion: Shall a future version of Haskell be strict? (Sun 2005-10-02, 14:00)

Benjamin C. Pierce, Adventures in Bi-Directional Programming (Sun 2005-10-02, 14:45)

Harmony: The System
Harmony, a framework for building high-quality, heterogeneous synchronizers for XML and other forms of tree-structured data, has been under development for several years at Penn. It includes two major components: (1) a generic tree synchronization algorithm whose behavior is driven by the schema of the trees being synchronized, and (2) a domain-specific programming language for describing bi-directional transformations on trees, in which the same program can be interpreted both as a transformation from concrete trees in some specific application format to abstract trees suitable for synchronization and as a transformation mapping updated abstract trees (post synchronization) to updated versions of the original concrete trees. After a big push on the implementation this summer, we are now in the process of making our first public release.
The talk includes a live demo of some of Harmony's core features and an overview of some of the many interesting questions that are still open.

For more information on Harmony, see homepage.

Jacques Garrigue, Structural Types, Recursive Modules, and the Expression Problem (Sun 2005-10-02, 16:00)

We present a series of solutions to the expression problem using polymorphic variants in Objective Caml. It starts with the original solution, where types are mostly left implicit, and then uses new features like recursive modules and private row types to finally obtain a solution which is both compact and scalable. Some of these solutions can be translated to SML or Haskell.

See slides and paper.

Kathleen Fisher, The Next 700 Data Description Languages (Sun 2005-10-02, 16:45)

In the spirit of Landin, we present a calculus of dependent types to serve as the semantic foundation for a family of languages called data description languages. Such languages, which include PADS, DataScript, and PacketTypes, are designed to facilitate programming with ad hoc data, ie, data not in well-behaved relational or XML formats. In the calculus, each type describes the physical layout and semantic properties of a data source. In the semantics, we interpret types simultaneously as the in-memory representation of the data described and as parsers for the data source. The parsing functions are robust, automatically detecting and recording errors in the data stream without halting parsing. We show the parsers are type-correct, returning data whose type matches the simple-type interpretation of the specification. We also prove the parsers are ``error-correct,'' accurately reporting the number of physical and semantic errors that occur in the returned data. We use the calculus to describe the features of various data description languages, and we discuss how we have used the calculus to improve PADS.

See slides.

Andrew Tolmach, Modelling the H Interface inside Haskell (Mon 2005-10-03, 09:00)

The H monad is specialized version of the Haskell IO monad that provides type-safe access to the protected features of the IA32 architecture. It is intended as a basis for building experimental O/S kernels in Haskell. The H monad has been implemented on real hardware, but it is also useful to give a model implementation within Haskell. This talk outlines some of the issues involved in writing such a model, including the treatment of non-deterministic interrupts and support for multiple abstractions for user-space execution. It also suggests that accurate models may need to reflect over some aspects of Haskell execution.

Fritz Henglein, Multiset discrimination for acyclic data (Mon 2005-10-03, 09:45)

Multiset discrimination is a set of fundamental algorithmic techniques pioneered by Paige et al. in the 80s and 90s for partitioning an input list of acyclic data under a given equivalence relation in worst-case linear time, without the use of hashing or sorting.
In this talk we define a notion of parametric polymorphic discriminators that encapsulate multiset discrimination and show how a rich class of discriminators can be constructed by induction on a language of definable equivalence relations that includes structural equivalence, bag and set equivalences and equivalences definable as least fixed points. A definable equivalence relation gives thus rise to both proof of linear-time discriminability and automatic generation of a worst-case optimal time implementation, which is illustrated by a number of applications that are essentially partitioning problems (anagram classes, finite state machine minimization, type isomorphism with commutative type operators). (Extensions to a richer language (preimages, greatest fixed points, etc.) on circular data with application to more complex examples will be hinted at, time permitting. So will be open problems and consequences for programming language design and implementation.)

See slides.

Andrzej Filinski, Data Monads (with an introduction to monadic reflection) (Mon 2005-10-03, 11:00)

The principle of monadic reflection links two distinct monadic interpretations of a computational effect by an observational monad isomorphism. If one of the monads is seen as a reasoning-oriented specification of the effect, and the other as an efficiency-oriented implementation, the isomorphism means that the observable behavior of a program using the effect can be understood entirely in terms of the specification monad, even though the actual execution exploits the implementation monad.
One can show that any specification monad can be uniformly implemented in this way by a continuation-state monad, which in turn can be viewed as the native monad underlying the semantics of ML-like languages with mutable cells (refs) and first-class continuations (callcc). This construction also scales naturally to layered monads, built up incrementally from monad transformers. However, it is a bit inefficient when the effect being modeled involves no control behavior, but only expresses a particular pattern of data flow.
The talk presents a general characterization of such control-less effects in terms of an algebraic structure tentatively named indexed monoids; the monads induced by such monoids are called data monads. Effects expressible as data monads include the usual state, environment, and accumulation monads, as well as more complicated variants, such as write-once or log-structured state. Moreover, much like any monad can be implemented by a continuation-state monad, any data monad can be implemented by a state monad alone. This optimized construction is still robust in the presence of other effects with control-affecting behavior, meaning that it can be used to efficiently implement control-less effect layers even in a general multi-effect setting.

Satnam Singh, Concurrent Programming with Join Patterns via STMs (Mon 2005-10-03, 11:45)

This talk showed how join patterns of the style found in languages like Comega and be implemented as a library in Haskell using STMs. The use of the orElse combinator allows join patterns to be composed to nicely implement choice. The implementation is very straightforward and it is possible to code up more exotic kinds of joins e.g. conditional joins.

See slides.

Stephanie Weirich, The POPLmark Challenge (Mon 2005-10-03, 12:30)

See homepage.

Discussion: relationship between modules & type classes (Mon 2005-10-03, 14:00)

Tarmo Uustalu, Partiality is an effect (and not a defect) (Mon 2005-10-03, 14:45)

This talk is about an approach to partiality from non-termination as a monadic effect using coinductive types. The intuitive idea is very simple: termination/non-termination is about waiting, this waiting can be made official. Looping is supported directly by the monad. In order to obtain general recursion, we use that the homsets of the Kleisli category are domains. This gives us a possibility to translate a language with general recursion into a total language following the general paradigm of monadic translation. We also discuss a combination of partiality with other effects based on a monad transformer.
(Joint work with Thorsten Altenkirch, Venanzio Capretta.)

See slides.

Varmo Vene, The Essence of Dataflow Programming (Mon 2005-10-03, 16:00)

We propose a novel, comonadic approach to dataflow (stream-based) computation. This is based on the observation that both general and causal stream functions can be characterized as coKleisli arrows of comonads and on the intuition that comonads in general must be a good means to structure context-dependent computation. In particular, we develop a generic comonadic interpreter of languages for context-dependent computation and instantiate it for stream-based computation. We also discuss distributive laws of a comonad over a monad as a means to structure combinations of effectful and context-dependent computation. We apply the latter to analyse clocked dataflow (partial streams based) computation.

See slides.

Arvind, Designing Million Gates in a Semester using Bluespec (Mon 2005-10-03, 16:45)

No abstract submitted.

Stephanie Weirich, Boxy type inference for higher-rank and impredicative polymorphism (Tue 2005-10-04, 09:00)

Languages with rich type systems are beginning to employ a blend of type inference and type checking, so that the type inference engine is guided by programmer-supplied type annotations. This work shows, for the first time, how to combine the virtues of two well- established ideas: unification-based inference, and bidirectional propagation of type annotations. The result is a type system that conservatively extends Hindley-Milner, and yet supports both higher-rank types, and impredicativity.

See slides.

Olha Shkaravska, Amortization for Heap Consumption (Tue 2005-10-04, 09:30)

In this talk we discuss static analysis of heap usage by strict functional programs.
We design a heap-aware type system for a functional language over polymorphic lists. Type annotations in the system determine bounds on heap consumption/gain as functions of the size of data. The system extends the amortization-based procedure with constant annotations developed by Hofmann and Jost for linear bounds.
Running the type analysis for a program one collects constraints for numerical annotations. Checking their consistency corresponds to the type-checking procedure, and solving corresponds to the type inference. Full type inference seems to be very complex. In particular, given a program, one must be able to derive the exact dependencies of the output sizes on input sizes.
The approach looks promising for checking asymptotic bounds, that is, given a function T(n) and a program, we answer the question if the resource consumption for the program is of O(T(n)), where n is the size of an input, and find corresponding coefficients.
We plan to reduce the complexity of the checking procedure and extend the approach for other than lists data types.

See slides.

Conor McBride, Differentiating data structures (Tue 2005-10-04, 10:00)

Telling an old story which could be better known, I sketched the connection between Huet's notion of 'zipper', representing one-hole contexts in tree-like data, and the differential calculus. The derivative of a functor represents the one-hole contexts for an element. Correspondingly, a sequence of such elements gives the zipper for the datatype generated by a fixpoint of the functor. The original observation (paper) is now realised as a generic program by Ralf Hinze et al in Generic Haskell (paper), and also by Peter Morris et al in Epigram (paper). Category theorists can find a deeper analysis of the phenomenon (with pictures) here.

See slides.

Olivier Danvy, Delimited-Continuations Blues (Tue 2005-10-04, 11:00)

The goal of this talk is (1) to compare and contrast static and dynamic delimited continuations, to illustrate them (with depth-first and breadth-first traversals), and then explain them in terms of calculi, abstract machines (small-step semantics), and evaluators (big-step semantics); and (2) to reflect on the method, which is based on a syntactic correspondence between calculi and abstract machines, and a functional correspondence between evaluators and abstract machines.

See slides and papers:

Satnam Singh, Haxell: adding regular types to Haskell (Tue 2005-10-04, 11:30)

This talk presented very preliminary work on adding features to Haskell for typing and matching against regular expressions.

See slides.

Ralf Hinze, Generic grouping and sorting (Tue 2005-10-04, 12:00)

This talk shows how to implement Fritz Henglein's multiset discrimination in Haskell using a generalized algebraic data type (GADT).

See slides.

Simon Peyton Jones, ESC/Haskell: baby steps towards extended static checking for Haskell (Tue 2005-10-04, 12:30)

Types are good, but we would like to check more static properties than type systems traditionally support. Dependent types are one promising direction, but so is "extended static checking", pioneered in Modula, and more recently in Java. In this talk I sketched some preliminary ideas for how the general approach might work out in Haskell.

Ralf Hinze (email: ralf@informatik.uni-bonn.de)

Informal minutes of IFIP Working Group 2.8 22nd meeting, Kalvi Manor, Estonia October 1-4, 2005

Informal minutes of IFIP Working Group 2.8
22nd meeting, Kalvi Manor, Estonia
October 1-4, 2005