Informal minutes of IFIP Working Group 2.8
30th meeting, Annapolis, Maryland, USA
November 5-9, 2012

List of attendees

Members: Observers:
Note: many of the links are dead, which simply means that I haven't yet received the material.

Technical presentations

Ralf Hinze, The computational essence of sorting algorithms. (Mon 2012-11-05, 9:05-9:50)

Sorting algorithms are an intrinsic part of functional programming folklore as they exemplify algorithm design using folds and unfolds. This has given rise to an informal notion of duality among sorting algorithms: insertion sorts are dual to selection sorts. Using bialgebras and distributive laws, we formalise this notion within a categorical setting. We use types as a guiding force in exposing the recursive structure of bubble, insertion, selection, quick, tree, and heap sorts. Moreover, we show how to distill the computational essence of these algorithms down to one-step operations that are expressed as natural transformations.

(This is joint work with Daniel W.H. James, Thomas Harper, Nicolas Wu, José Pedro Magalhães.)

See slides and paper.

Xavier Leroy, Validating LR(1) parsers. (Mon 2012-11-05, 9:50-10:30)

For some applications (verified compilers, secure handling of structured documents), it is important to have parsers that are correct with respect to their specifications as a context-free grammar. To this end, we apply translation validation to a LR(1) parser generator such as Menhir: after Menhir generates a pushdown automaton A from a grammar G, a validator, written and proved correct in Coq, checks that A is sound, complete, and nonambiguous with respect to G. We applied this validator to the C grammar from the ISO C 99 standard: it worked, but the known ambiguities and context dependencies in this C grammar require ugly workarounds.

(Joint work with Jacques-Henri Jourdan and François Pottier.)

See slides.

Paul Hudak, The World's Fastest Network Controller -- in Haskell. (Mon 2012-11-05, 10:55-11:35)

Software defined networks (SDN) introduce centralized controllers to drastically increase network programmability. The simplicity of a logical centralized controller, however, can come at the cost of controller programming complexity and scalability. In this paper, we present McNettle, an extensible SDN controller system whose control event processing throughput scales with the number of system CPU cores and which supports control algorithms requiring globally visible state changes occurring at flow arrival rates. Programmers extend McNettle by writing event handlers and background programs in a high-level functional programming language extended with shared state and memory transactions. We implement our framework in Haskell and leverage the multicore facilities of the Glasgow Haskell Compiler (GHC) and runtime system. Our implementation schedules event handlers, allocates memory, optimizes message parsing and serialization, and buffers system calls in order to optimize cache usage, OS processing, and runtime system overhead. We identify and fix bottlenecks in the GHC runtime system and IO manager. Our experiments show that McNettle can serve up to 5000 switches using a single controller with 46 cores, achieving throughput of over 14 million flows per second, near-linear scaling up to 46 cores, and latency under 10 ms with loads consisting of up to 1500 switches.
See slides and paper.

Michael Hicks, Expositor: Scriptable Time-Travel Debugging with First Class Traces. (Mon 2012-11-05, 11:35-12:05)

We present Expositor, a new debugging environment that combines scripting and time-travel debugging to allow developers to automate complex debugging tasks. The fundamental abstraction provided by Expositor is the execution trace, which is a time-indexed sequence of program state snapshots. Developers can manipulate traces as if they were simple lists with operations such as map and filter. Under the hood, Expositor efficiently implements traces as lazy, sparse interval trees, whose contents are materialized on demand. Expositor also provides a novel data structure, the edit hash array mapped trie, which is a lazy implementation of sets, maps, multisets, and multimaps that enables developers to maximize the efficiency of their debugging scripts. We have used Expositor to debug a stack overflow and to unravel a subtle data race in Firefox. We believe that Expositor represents an important step forward in improving the technology for diagnosing complex, hard-to-understand bugs.
See slides and paper.

Daniel Licata, title. (Mon 2012-11-05, 13:30-14:30)

Homotopy type theory is an exciting new generalization of Martin-Lof's intensional type theory, based on connections with homotopy theory and higher-dimensional category theory. It admits certain new logical principles inspired by these semantics, including Voevodsky's univalence axiom, which states that isomorphic types are interchangable in all contexts. Consequently, homotopy type theory has several promising applications to formalized mathematics, such as allowing mathematicians to work up to isomorphism, and enabling logical formalizations of results in homotopy theory. However, making computational sense of these new logical principles requires changing the operational semantics of type theory, reinterpreting certain rules of type theory as generic programs, directed by the structure of types and terms. While this computational interpretation is currently an open problem, we understand some special cases of it. In this talk, I describe the intuition behind the computational interpretation of homotopy type theory, and sketch some applications of the idea that isomorphic types are interchangable to code reuse and specifications for abstract types.
See slides, and paper, and examples examples.

Aaron Turon, Reagents: Functional programming meets scalable concurrency. (Mon 2012-11-05, 14:30-15:20)

Efficient communication and synchronization is crucial for fine-grained parallelism. Libraries providing such features, while indispensable, are difficult to write, and often cannot be tailored or composed to meet the needs of specific users. We introduce reagents, a set of combinators for concisely expressing concurrency algorithms. Reagents scale as well as their hand-coded counterparts, while providing the composability existing libraries lack.
See slides and paper.

Cătălin Hriţcu, All Your IFCException Are Belong To Us. (Mon 2012-11-05, 15:35-16:25)

I talked about how we do exception handling in Breeze, a new programming language that combines fine-grained, dynamic information flow control with label-based discretionary access control. Existing designs for fine-grained, dynamic information-flow control assume that it's acceptable to terminate the entire system when an incorrect flow is detected -- i.e, they give up availability for the sake of confidentiality and integrity. This is an unrealistic limitation for systems such as long-running servers. We identified public labels and delayed exceptions as crucial ingredients for making information-flow errors recoverable while retaining non-interference, and we proposed two new error-handling mechanisms that make all errors recoverable. The first mechanism builds directly on these basic ingredients, using not-a-values (NaVs) and data flow to propagate errors. The second mechanism adapts the standard exception model to satisfy the extra constraints arising from information flow control, converting thrown exceptions to delayed ones at certain points. We have proved formally that both mechanisms are sound, and we believe that they have equivalent expressive power.
See slides and paper.

Jean-Baptiste Tristan, title. (Mon 2012-11-05, 16:25-16:50)

See slides and paper.

John Hughes, Testing atomicity: Finding race conditions by random testing. (Mon 2012-11-05, 16:50-17:20)

What is a race condition in Erlang? Since there is no shared data between processes, and no mutable data, then data races simply cannot occur... and yet programs do sometimes fail because of concurrency errors. We characterize (harmful) race conditions via serializability: given a sequential specification of an API, a parallel test fails if no serialization of it conforms to the sequential spec. The idea dates from ICFP 2009: I talked about a recent success story in which this technique tracked down notorious race conditions in Mnesia, the database packaged with Erlang. Even more recently, the technique has been extended to handle atomic operations that may block, such as claiming a lock. The extended method has been applied to a worker-pool component with an existing QuickCheck specification, and immediately revealed new bugs.
See slides.

Dimitrios Vytiniotis, Towards A Finite model theory for higher-order program verification. (Tue 2012-11-06, 9:00-9:50)

In recent work we have proposed the use of off-the-shelf first order logic theorem provers to verify properties of higher-order functional programs, based on a simple axiomatization of the denotational semantics of programs. In this talk I give a short presentation of that approach and argue that the proposed translation of programs to FOL , although sufficient for proving several interesting properties, is not satisfactory for finding counterexamples. The root of the problem is the loss of the finite model property of the proposed FOL translation: I explain the problem, and give a partial solution that can, under certain conditions, guarantee the existence of finite models when counterexamples exist. Overall, this talk describes our efforts towards an FOL theory that is simultaneously good for verification and model checking of higher-order programs.
See slides.

Arthur Charguéraud, title. (Tue 2012-11-06, 9:00-9:50)

Some programming languages have as primary goal the production of efficient code. Other languages instead have as primary goal the production of correct code, i.e., code that can be formally verified at reasonable cost. In this talk, we focus on the following question: could we design a programming language which can be used to produce code that is both fairly efficient and easy to reason about? We address one particular aspect of this question by proposing to change slightly the semantics of pure values in a language such as Caml. With this change, we are able to simplify formal proofs by removing the need to use representation predicates for reasoning about purely-functional data structures. Representation predicates relate a concrete implementation of a data structure (e.g., a binary tree) with the abstract object that it implements (e.g., a set). By programming directly using abstract objects, we are able to directly reason at the abstract level, thus avoiding the need for representation predicates altogether. By annotating the source code with hints describing which concrete data structures to use for implementing each of the abstract objects, we can then recover the ability to produce efficient code.
See slides.

Daniel "Spoons" Spoonhower, title. (Tue 2012-11-06, 10:45-11:30)

See slides and paper.

Brent Yorgey, Eliminators for Data Types with Symmetries. (Tue 2012-11-06, 11:30-12:10)

The theory of combinatorial species may provide some interesting tools for working with generalizations of algebraic data types, including data types quotiented by symmetries---for example, cycles, bags, or binary trees where sibling nodes are unordered. As one small piece of this larger project, we present two different views on eliminators for such quotiented types, which must not allow the programmer to distinguish between different concrete representations of equivalent structures. The first follows directly from a canonical decomposition theorem for species along with the definition of quotienting. The second is built in a more "modular" way on top of Peter Hancock's "cursor down" operator, and we show that it has other nice applications to quotiented types beyond just elimination.
For those interested in reading more about species, I've been writing a series of blog posts; the first couple are here (with more to come): 1 and 2.
See slides and paper.

Nate Foster, title. (Tue 2012-11-06, 13:30-14:25)

See slides.

Geoffrey Mainland, Regular Array Computation in Haskell ... on GPUs. (Tue 2012-11-06, 14:25-15:00)

Repa provides a excellent Haskell interface for expressing and automatically parallelizing regular array computation on CPUs. Moreover, its use of type indices to specify how arrays are represented allows programmers to use types to reason about time and space costs. We show how to re-use these reasoning techniques in a deeply embedded, monadic DSL for regular array computation on GPUs.
See slides.

Benjamin Pierce, Test-Driven Development of an Information-Flow ISA: A QuickCheck Adventure. (Tue 2012-11-06, 10:45-11:30)

Could a random testing tool like QuickCheck be used to assist in the development of a low-level machine architecture with built-in dynamic information-flow control? I'll describe some recent experiments in this direction, focusing on (1) how to generate random machine-code programs and (2) how to shrink counter-examples to make them easier to understand.
(joint work with Arthur Azevedo de Amorim, Catalin Hritcu, John Hughes, Leonidas Lampropoulos, Ulf Norell, Dimitrios Vytiniotis, and Antal Spector-Zabusky)
See slides.

Robert Harper, Cache Efficient Functional Algorithms. (Tue 2012-11-06, 16:00-16:40)

The widely studied I/O and ideal-cache models were developed to account for the large difference in costs to access memory at different levels of the memory hierarchy. Both models are based on a two level memory hierarchy with a fixed size primary memory (cache) of size M, an unbounded secondary memory organized in blocks of size B. The cost measure is based purely on the number of block transfers between the primary and secondary memory. All other operations are free. Many algorithms have been analyzed in these models and indeed these models predict the relative performance of algorithms much more accurately than the standard RAM model. The models, however, require specifying algorithms at a very low level requiring the user to carefully lay out their data in arrays in memory and manage their own memory allocation.

In this paper we present a cost model for analyzing the memory efficiency of algorithms expressed in a simple functional language. We show how some algorithms written in standard forms using just lists and trees (no arrays) and requiring no explicit memory layout or memory management are efficient in the model. We then describe an implementation of the language and show provable bounds for mapping the cost in our model to the cost in the ideal-cache model. These bound imply that purely functional programs based on lists and trees with no special attention to any details of memory layout can be as asymptotically as efficient as the carefully designed imperative I/O efficient algorithms. For example we describe an O(n/B log_(M/B) n/B) cost sorting algorithm, which is optimal in the ideal cache and I/O models.

See slides and paper.

Amal Ahmed, Verifying an Open Compiler from System F to Assembly. (Wed 2012-11-07, 9:00-10:05)

The field of compiler verification has witnessed remarkable progress of late, with verified-correct compilers for increasingly realistic languages. Most of these compilers, however, are proved correct under a closed-world assumption, that is, assuming that the verified compiler will always compile whole programs. This is an unrealistic assumption since the ``whole'' programs that we run are invariably comprised of code linked together from various sources (e.g. low-level libraries, code compiled using other compilers, and even code written in different source languages). Recent work by Benton and Hur aims to address this problem, but their approach has serious limitations: it does not feasibly support linking with target code produced from a different source language and it seems hard to scale to multi-pass compilers.

In this talk, I'll present a new methodology for verifying an open compiler---one that yields theorems guaranteeing meaning-preserving compilation of *components*, not just whole programs. Specifically, we show how to verify an open, multi-pass, type-preserving compiler from a polymorphic language (with recursive types) to assembly, establishing a compiler correctness theorem that permits reasoning about linking with arbitrary target code (no matter its source). Our compiler pipeline deals with four languages: F (System-F-like source), C (target of closure conversion), A (target of allocation), and T (a stack-based typed assembly language). The central novelty of our approach is that we define a combined language FCAT that embeds these four languages and formalizes the semantics of interoperability between each pair of adjacent languages using boundaries in the style of Matthews and Findler. We can stack these boundaries to allow interoperability between the source and target of the compiler, e.g., FCAT(e_T) allows a target component e_T to be used from within an F expression. Compiler correctness is stated as contextual equivalence in the combined language: if e_F compiles to e_T then e_F is equivalent to FCAT(e_T).

This is joint work with James T. Perconti.

See slides.

Stephanie Weirich, Paradoxical Typecase. (Wed 2012-11-07, 10:05-10:40)

Injective Type Constructors lead to paradoxes in type theories such as Coq and Agda. But why? In this talk I show how the presence of a typecase operator can be used to define looping terms. I also discuss how type constructor injectivity in the presence of type equalities or classical logic can emulate typecase.
See slides.

Philippa Gardner, title. (Wed 2012-11-07, 11:05-12:00)

See slides and paper.

Olivier Danvy, A Saucerful of Proofs in Coq. (Thu 2012-11-08, 9:00-9:45)

This talk is an overview of my new experience in using Coq in the classroom.
See slides.

John Reppy, A different kind of functional language. (Thu 2012-11-08, 9:45-10:30)

This talk is about Diderot, a parallel domain-specific language for image analysis and visualization. Diderot supports programming over continuous tensor fields, which are functions from R^n to tensors, and includes higher-order operators, such as differentiation. This talk gives an overview of Diderot and describes the techniques used to implement fields and the higher-order operations on fields. More information about Diderot is available here.
See slides.

Jack Dennis, Commuting Statements and Functional Programming. (Thu 2012-11-08, 11:00-11:25)

The Habanero Java parallel programming language permits concurrent threads to execute assignment statements and tests of shared variables. If all pairs of concurrent statements commute, that is either order of execution has the same effect, then it is known that the program is repeatable – for any given input data, every run of the program produces the same result. In Habanero Java concurrency of threads is expressed using finish and async statements. We show how to construct the environment transformation performed by execution of a finish statement, which is a function if all pairs of concurrent statements commute. Such a transformation can be used to construct programs in a functional programming language equivalent to Habanero Java programs satisfyng the condition.
See slides.

Peter Thiemann, You Can't Touch This. (Thu 2012-11-08, 11:25-12:00)

Scripting languages provide access to all resources via object properties. An access control mechanism that provides protection of confidential information for such a language has to gauge traversal of object graphs. We propose a domain specific language to specify sets of objects, assign read and write permissions to them, and enforce these permissions in limited scopes of a program. To obtain complete interposition, we build the enforcement into the scripting engine. We are working on an implementation of the enforcement in Firefox.
See slides.

John Launchbury, title. (Thu 2012-11-08, 13:30-14:15)

See slides and paper.

Dave MacQueen, Teaching Induction. (Thu 2012-11-08, 14:15-14:45)

Students have difficulty mastering the technique of inductive proof as used in the theory of programming languages. One source of this difficulty is the degree of "cheating" typically employed when such proofs are presented in textbooks. This talk illustrates the problem by considering a simple version of the Substitution Lemma, which involves an induction over (typing) derivations. I show how the induction can be fully explained when the derivations are made exlicit as an inductive structure so that the inductive principle and induction hypotheses can be made manifest.
See slides.

Simon Peyton Jones, Putting the thrill back into computing at School. (Thu 2012-11-08, 14:45-15:30)

I told the story of the rapid developments around Computer Science as a school subject in UK schools.
See slides.

Steve Zdancewic, Linear Logic and Linear Algebra. (Fri 2012-11-09, 9:00-9:40)

This talk will consider the connection between linear logic and linear algebra. In particular, I will describe work in progress that aims to give an interpretation of linear logic: propositions are interpreted as finite-dimensional vector spaces over a finite field and judgments are interpreted as linear maps between vector spaces. Of particular interest is the treatment of the "exponential" types !A.
See slides.

Norman Ramsey, title. (Fri 2012-11-09, 9:40-10:20)

I am writing a book intended to support a programming-language survey course for people who will spend their careers involved with software. The theme of the book is to give people something to *do* with ideas of lasting value. In a course based on such a book, students can spend perhaps a week learning about laziness. If students already have some background with recursion, lists, and higher-order functions, what are good examples for them to see? If they want to learn more, what exercises should they do? A "zero draft" of this material includes examples using infinite sequences and using memo tables. More complex examples which exploit laziness include Boolean satisfiability (at least two different solutions) and the line-breaking algorithm invented by Don Knuth and Michael Plass.
See slides.

Conal Elliott, Circuit timing analysis, linear maps, and semantic morphisms. (Fri 2012-11-09, 10:50-11:35)

This talk shows some connections between circuit timing analysis and linear algebra, then gives a formulation of linear maps (as a generalized algebraic data type) with a very simple denotational semantics and a category interface, and finally derives an implementation from the semantics.
See slides.

Philippa Gardner and Arthur Charguéraud, title. (Fri 2012-11-09, 11:35-11:50)

See slides and paper.

Ralf Hinze (email: