Informal minutes of IFIP Working Group 2.8
29th meeting, Storulvan, Sweden
February 19-24, 2012

Technical presentations


John Hughes, The Properties of Riak. (Mon 2012-02-20, 09:15-)

Riak is one of the new breed of NoSQL databases that is becoming popular for "big data" applications. Riak itself is developed by Basho Technologies, and has been adopted by, among others, the Danish Health Service and by the company processing SMS votes on UK TV shows. Riak is a key-value store which sacrifices consistency in favour of availability and network-partition tolerance; values retrieved from the database may be stale in the presence of failures, but the database will "eventually" become consistent. I presented a QuickCheck formal model that a team from Basho and I developed together for testing the eventual-consistency property. Our tests drove a deeper understanding of how the Riak API actually behaves--even in the absence of failures--and enabled us to find and fix failures of eventual consistency in extreme failure scenarios.
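For flavour, here is a minimal QuickCheck sketch (in Haskell; the model in the talk was much richer and targeted the real database) of an eventual-consistency property over a toy replicated store. The three replicas, last-writer-wins merge, and gossip rounds below are illustrative assumptions, not Riak's actual protocol:

    import Test.QuickCheck
    import Data.List (foldl')

    type Replica = (Int, String)        -- (logical timestamp, value)

    data Op = Write Int String Int      -- Write replica value timestamp
            | Sync Int Int              -- gossip replica i's entry to replica j
      deriving Show

    n :: Int
    n = 3                               -- number of replicas

    merge :: Replica -> Replica -> Replica
    merge = max  -- last-writer-wins; ties broken by value, so merge is
                 -- commutative, associative and idempotent

    apply :: [Replica] -> Op -> [Replica]
    apply rs (Write i v t) = updateAt i (merge (t, v)) rs
    apply rs (Sync i j)    = updateAt j (merge (rs !! i)) rs

    updateAt :: Int -> (a -> a) -> [a] -> [a]
    updateAt i f xs = [ if k == i then f x else x | (k, x) <- zip [0 ..] xs ]

    genOp :: Gen Op
    genOp = oneof
      [ Write <$> choose (0, n - 1) <*> elements ["a", "b", "c"]
              <*> choose (0, 100)
      , Sync <$> choose (0, n - 1) <*> choose (0, n - 1) ]

    -- "Eventually consistent": after any sequence of writes and partial
    -- syncs, one full round of gossip brings all replicas into agreement.
    prop_eventually_consistent :: Property
    prop_eventually_consistent =
      forAll (listOf genOp) $ \ops ->
        let gossip rs = foldl' apply rs
                          [ Sync i j | i <- [0 .. n - 1], j <- [0 .. n - 1] ]
            final = gossip (foldl' apply (replicate n (0, "")) ops)
        in  all (== head final) final
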
See slides (not yet available).

Rishiyur Nikhil, Programming tightly-coupled HW accelerators. (Mon 2012-02-20)

There is a new generation of HW platforms with support for customizable direct hardware implementations of complex computations that are tightly coupled with conventional CPU cores. "Tight coupling" means low-latency handoffs between the CPU and custom HW, and high-bandwidth access to shared memory (cache-coherent, same addressing). These are motivated both by execution speed and power--custom HW can often be an order of magnitude faster, and consume an order of magnitude less power, than a general-purpose CPU performing the same computation. In this talk I will briefly describe such platforms, and the various ways we've been able to use Bluespec to significantly simplify their programming and testing.
See slides (not yet available).

Dimitrios Vytiniotis, Static contract checking for Haskell through FOL semantics. (Mon 2012-02-20)

Any serious program contains assertions or documented function pre- and post-conditions. An alternative way to express such executable specifications is through contracts. In the context of Haskell, contracts have a well-defined operational semantics, but previous work does not settle the subject of statically checking contracts, expressed as ordinary programs, in an effective and practical way for a lazy functional language. In this work (in progress) we aim to address the problem by taking a denotational semantics approach: We express the semantics of programs and of contracts directly in terms of first-order logic formulae and automatically discharge the proof obligations that arise using a first-order logic theorem prover. We show how we leverage the purity of Haskell for equational reasoning, how we use information from counter-models, and how to avoid the search space explosion in the theorem prover. We demonstrate our ideas with a demo of our prototype implementation and sketch an extension to other features of Haskell, such as type classes and separate contract checking of modules.
(joint work with Koen Claessen, Simon Peyton Jones, Nathan Collins, Charles-Pierre Astolfi and Dan Rosen)
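To fix ideas, here is a hedged Haskell sketch of contracts written as ordinary programs; the combinator below checks at runtime only to make the specification concrete, whereas the talk's point is to discharge such obligations statically via first-order logic:

    -- A contract here is just a predicate; `pre` attaches a precondition.
    type Contract a = a -> Bool

    pre :: Contract a -> (a -> b) -> (a -> b)
    pre c f x
      | c x       = f x
      | otherwise = error "precondition violated"

    -- Example: head restricted to non-empty lists.
    headC :: [a] -> a
    headC = pre (not . null) head

    -- Instead of checking at runtime, the static checker translates the
    -- (pure) definitions and the contract into a first-order obligation,
    -- roughly:  forall xs. not (null xs) ==> headC xs =/= BAD
    -- and discharges it with a first-order theorem prover.
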
See slides (not yet available).

Adam Chlipala, The Next 700 Low-Level Programming Languages. (Mon 2012-02-20)

The design of programming languages must juggle many conflicting considerations, including expressivity, modularity features, run-time performance, and support for effective static analysis or even formal verification. The Bedrock framework combines some of the best ways of achieving these goals, drawn from high-level and low-level languages, providing a language for tasks traditionally done in low-level languages (like C). Bedrock programs are essentially functional programs that compute assembly programs. Our favorite high-level modularity features (e.g., first-class functions) are thus available at compile time, without compromising run-time performance. Bedrock is implemented as an embedded DSL in Coq, with support for both full functional verification and lighter-weight "type checking"-style analysis. Programmers may extend the language soundly with new features, so long as each new feature is proved correct in an appropriate sense.
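The slogan "functional programs that compute assembly programs" can be conveyed by a small analogy (in Haskell; Bedrock itself is an EDSL in Coq, and this is not its syntax):

    type Reg = String

    data Instr
      = Li  Reg Int       -- load immediate
      | Add Reg Reg Reg   -- rd := rs1 + rs2
      deriving Show

    -- A "macro" is an ordinary first-class function that runs at code
    -- generation time, so the abstraction costs nothing at run time.
    unroll :: Int -> (Int -> [Instr]) -> [Instr]
    unroll k body = concatMap body [0 .. k - 1]

    -- Sum the constants 0..3 into r1, with the loop unrolled statically.
    sumProg :: [Instr]
    sumProg = Li "r1" 0 : unroll 4 (\i -> [Li "r2" i, Add "r1" "r1" "r2"])
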
See slides (not yet available).

John Scholes, Dyalog APL. (Mon 2012-02-20, 17:00-)

See slides.

Andy Gill, Kansas Lava: Our playground for DSLs. (Mon 2012-02-20)

Kansas Lava is a system, written in Haskell, for constructing hardware circuits. The system makes use of a dual shallow/deep embedding, IO-based observable sharing, and phantom types to provide an expression language that can be directly compiled into gate logic. We will overview the core system, and give examples of how we use phantom and haunted types. For control and stateful code, we have a second-level DSL, called Wakarusa. We will describe this extension, built using a GADT-based deep DSL, and give examples of code, including a complete working example in our ASCII simulator.
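The dual shallow/deep embedding at the heart of such systems can be sketched in a few lines (an illustration of the technique, not Kansas Lava's actual API): a signal carries both a stream of simulated values and a syntax tree from which gates are generated.

    data AST = Input String | And2 AST AST
      deriving Show

    data Signal = Signal { shallow :: [Bool]   -- simulation semantics
                         , deep    :: AST }    -- netlist to compile

    input :: String -> [Bool] -> Signal
    input nm vs = Signal vs (Input nm)

    and2 :: Signal -> Signal -> Signal
    and2 a b = Signal (zipWith (&&) (shallow a) (shallow b))
                      (And2 (deep a) (deep b))

    -- One description, two interpretations: run `shallow` to simulate,
    -- walk `deep` to emit gate logic.
    example :: Signal
    example = and2 (input "a" [True, True]) (input "b" [True, False])
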
See slides (not yet available).

Nate Foster, Kinetic: Abstractions for Network Update. (Tue 2012-02-21, 08:15-)

Configuration changes are a common source of instability in networks, leading to outages, performance disruptions, and security vulnerabilities. Even when the initial and final configurations are correct, the update process itself often steps through intermediate configurations that exhibit incorrect behaviors. This talk introduces the notion of consistent network updates -- updates which are guaranteed to preserve well-defined behaviors when transitioning between configurations. I will identify two distinct consistency levels, per-packet and per-flow, and present general mechanisms for implementing them in Software-Defined Networks using switch APIs like OpenFlow. I will present a formal model of OpenFlow networks, and show that consistent updates preserve a large class of properties. I will also describe our prototype implementation, including several optimizations that reduce the overhead required to perform consistent updates, as well as a verification tool that leverages consistent updates to significantly reduce the complexity of checking the correctness of network control software.
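A toy model of the per-packet mechanism (a simplifying sketch, not the implementation): packets are stamped with a configuration version at ingress, internal rules match on the stamp, and the update proceeds in two phases so that no packet ever sees a mixture of configurations.

    type Version = Int

    data Packet = Packet { version :: Version, payload :: String }

    type Config = Packet -> String    -- a switch's forwarding behaviour

    -- Internal switches keep rules for every installed version and
    -- select by the packet's ingress stamp.
    process :: [(Version, Config)] -> Packet -> Maybe String
    process installed pkt = ($ pkt) <$> lookup (version pkt) installed

    -- Two-phase update: (1) install the new version's rules alongside
    -- the old ones, (2) flip the ingress stamp to the new version,
    -- (3) only then delete the old rules. Every packet is processed
    -- entirely by one configuration, old or new, never a mixture.
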
See slides.

Eijiro Sumii, Higher Order Distribution and Name Creation. (Tue 2012-02-21)

We introduce HOpiPn, the higher-order pi-calculus with passivation and name creation, and develop an equivalence theory for this calculus. Passivation [Schmitt and Stefani] is a language construct that elegantly models higher-order distributed behaviors like failure, migration, or duplication (e.g. when a running process or virtual machine is copied), and name creation consists in generating a fresh name instead of hiding one. Combined with higher-order distribution, name creation leads to a semantics different from that of name hiding, and is closer to implementations of distributed systems. We define for this new calculus a theory of sound and complete environmental bisimulation to prove reduction-closed barbed equivalence.
(joint work with Adrien Pierard)
See slides (not yet available).

Simon Marlow, Solving an old problem: How do we get a stack trace in a lazy functional language? (Tue 2012-02-21)

I consider the problem of obtaining a stack trace in a lazy language like Haskell, and come to two slightly surprising conclusions: (1) that lazy evaluation is not the primary difficulty, and (2) that the stack trace you get for free in a strict setting is seriously limited in usefulness, particularly when using functional abstractions such as composition and monads (and the problem is not just due to tail-call optimisation). I give an operational semantics for a lazy functional language that maintains a stack trace, and explore two variants of it. The first simulates the stack trace provided by a strict language without tail calls, which I show yields results that are less than useful, and perhaps surprising, when used with programs containing some common functional abstractions. The second is a different semantics that appears to give more useful results, but has less clear properties when considering which compiler transformations are valid. This is work in progress.
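A tiny illustration of the second conclusion (a sketch in Haskell): with point-free composition, a strict-style stack at the error point names the plumbing, not the definitions the programmer cares about.

    f :: Int -> Int
    f x = g x + 1

    g :: Int -> Int
    g = h . negate    -- point-free: entering g really enters (.),
                      -- so g never appears on a strict-style stack

    h :: Int -> Int
    h x = if x < 0 then error "boom" else x

    main :: IO ()
    main = print (f 3)
    -- At the error the strict-style trace mentions (.) and h but not g,
    -- even though g is the context the programmer wants to see.
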
See slides (not yet available).

Simon Peyton Jones, Deferring Type Errors. (Tue 2012-02-21)

In my talk I described an extremely simple implementation technique that allows a program with type errors to be compiled to executable code that, when run, will trigger the type error only when the erroneous bit of code is executed.
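The technique later shipped in GHC as the -fdefer-type-errors flag; a minimal example of the behaviour described:

    {-# OPTIONS_GHC -fdefer-type-errors #-}
    module Main where

    fine :: Int
    fine = length "abc"

    broken :: Int
    broken = 'a' && True   -- ill-typed: compiles (with a warning); the
                           -- error fires only if `broken` is evaluated

    main :: IO ()
    main = print fine      -- prints 3; `broken` is never forced
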
See slides (not yet available).

Derek Dreyer, Toward a New Package System for GHC. (Tue 2012-02-21, 17:00-)

GHC's package system, Cabal, is limited in its support for reusable, statically type-checkable components because of its reliance on version ranges as a substitute for proper package interfaces. In this talk, I present some preliminary work that my student Scott Kilpatrick and I are doing together with Simon Marlow and Simon Peyton Jones on a next-generation package system for GHC, intended to remedy Cabal's deficiencies. The main idea is to employ "mixins" as a way of supporting packages that are a mix of defined and undefined components, with the defined components implemented by ordinary Haskell modules, and the undefined components (or "holes") specified by a new notion of Haskell "signatures". While the high-level approach is modeled after Dreyer and Rossberg's MixML type system (ICFP'08), adapting this approach to Haskell leads to a variety of interesting problems in the design and semantics of the language.
(joint work with Scott Kilpatrick, Simon Marlow, and Simon Peyton Jones)
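For flavour, a rough sketch of a package with a hole. The notation is illustrative (this line of work later evolved into GHC's Backpack, whose eventually-shipped syntax is close to the following):

    -- Str.hsig: a signature, i.e. a hole this package leaves undefined
    signature Str where
      data Str
      empty  :: Str
      append :: Str -> Str -> Str

    -- Lib.hs: an ordinary module programmed against the hole; the
    -- package type-checks once against the signature and can later be
    -- instantiated by any implementation that matches it.
    module Lib (double) where

    import Str

    double :: Str -> Str
    double s = append s s
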
See slides (not yet available).

Ulf Norell, An Agda Talk. (Tue 2012-02-21)

In this talk I show off the new JavaScript backend for Agda and implement an embedded language for simple user interfaces. As an example, the slides for the talk are implemented in the language. I use dependent types to track which events (button clicks) can be emitted by a user-interface component, to ensure statically that all events are handled.
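An approximate rendering of the idea (in Haskell with type-level lists, rather than Agda): the component's type records the events it can emit, and a handler must cover exactly those events.

    {-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies,
                 TypeOperators #-}

    data Event = Click | Tick

    -- A component's type records exactly the events it can emit.
    data UI (es :: [Event]) where
      Label  :: String -> UI '[]
      Button :: String -> UI '[ 'Click ]
      Beside :: UI es -> UI fs -> UI (Append es fs)

    type family Append (xs :: [Event]) (ys :: [Event]) :: [Event] where
      Append '[]       ys = ys
      Append (x ': xs) ys = x ': Append xs ys

    -- One handler per emitted event; omitting one is a type error.
    data Handlers (es :: [Event]) where
      Done :: Handlers '[]
      (:>) :: IO () -> Handlers es -> Handlers (e ': es)
    infixr 5 :>

    run :: UI es -> Handlers es -> IO ()
    run _ _ = return ()   -- event loop elided in this sketch
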
See slides (not yet available).

Didier Remy, Namespaces for OCaml (Moving the module boundary). (Tue 2012-02-21)

Separate compilation in OCaml uses modules as compilation units. A compiled object is a pair of an interface file and a code file with the same basename. When the compiler starts, it builds a mapping from module names to previously compiled objects by searching for compiled objects in all directories of the library path and using the capitalized basename as the canonical module name.
As a result, two compilation units with the same name conflict: their code cannot be linked together in the same project and their modules cannot be accessed simultaneously in the same compilation unit. We first introduce an internal identifier that makes each compilation unit linkable with any other one, regardless of the name of its source file.
We then replace the flat canonical mapping of module names to module objects by an arbitrary tree mapping whose leaves are module objects. We use a small language to populate the initial namespace by scanning the file system for compiled objects, and to reorganize namespaces in almost arbitrary ways by renaming, restriction, merging of namespaces, let-bindings, etc. Namespaces define the compiler's view of the world, which need not match the filesystem's view anymore.
Finally, we discuss functorization---the need to split bodies of functors into separate compilation units. Namespaces can be extended to describe functorized compilation units as well.
See slides (not yet available).

Fritz Henglein, Dynamic Symbolic Computation for Domain-Specific Language implementation. (Wed 2012-02-22, 18:15-)

We describe a simple, but powerful method for gradually enhancing a base DSL implementation with computational performance improvements. It consists of adding constructors for expensive operations if this facilitates asymptotic performance improvements in some contexts, with at most constant-factor overhead. The resulting implementation can be thought of as executing “standard” computation steps from the base implementation interleaved with symbolic computation steps on the newly introduced constructors. Starting with lists as the base implementation for multisets, we arrive at a novel representation with constructors for unions, scalar multiplication, and mapping binary functions over Cartesian products, and show how to implement finite probability distributions with exact (rational) probabilities and efficient computation of expectation and variance on product distributions.
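A sketch of the multiset construction in Haskell (following the shape of the method described, not the talk's code; SumProd simplifies "map a binary function over a Cartesian product" to addition over Double):

    {-# LANGUAGE GADTs #-}

    data MSet a where
      Elems   :: [a] -> MSet a                  -- base: a plain list
      Union   :: MSet a -> MSet a -> MSet a     -- O(1) union
      Scale   :: Int -> MSet a -> MSet a        -- n-fold repetition, O(1)
      SumProd :: MSet Double -> MSet Double -> MSet Double
        -- {x + y | x <- s, y <- t}, kept symbolic, never materialized

    size :: MSet a -> Int
    size (Elems xs)    = length xs
    size (Union s t)   = size s + size t
    size (Scale k s)   = k * size s
    size (SumProd s t) = size s * size t        -- no product is built

    total :: MSet Double -> Double
    total (Elems xs)    = sum xs
    total (Union s t)   = total s + total t
    total (Scale k s)   = fromIntegral k * total s
    total (SumProd s t) =   -- sum of (x+y) over the product splits:
      fromIntegral (size t) * total s + fromIntegral (size s) * total t

    -- Mean of a product in O(|s| + |t|) rather than O(|s| * |t|).
    mean :: MSet Double -> Double
    mean s = total s / fromIntegral (size s)
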
See slides (not yet available).

Michal Palka, Testing an optimising compiler using a random simply-typed lambda-term generator. (Thu 2012-02-23, 08:15-)

We present a method of testing the middle-end of the GHC optimising compiler using a random simply-typed lambda-term generator for generating random Haskell programs. The generator works by generating terms top-down and performing local choice based on the typing rules of the simply-typed lambda calculus. The compiler is tested using differential testing, where the observable behaviour of programs compiled with different optimisation levels is compared. The testing was able to uncover interesting bugs in GHC that changed the strictness of compiled expressions, even though only a very small subset of Haskell programs is covered. We also discuss unsolved problems in the generation of expressions that involve polymorphic constants.
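A condensed sketch of such a generator (illustrative, and far simpler than the real one):

    import Test.QuickCheck

    data Type = TInt | Type :-> Type
      deriving (Eq, Show)

    data Term = Var Int | Lit Int | Lam Type Term | App Term Term
      deriving Show

    -- Top-down generation driven by the typing rules: every local choice
    -- preserves well-typedness. Variables are de Bruijn indices.
    genTerm :: [Type] -> Type -> Int -> Gen Term
    genTerm env goal n = oneof (always ++ vars ++ apps)
      where
        vars = [ pure (Var i) | (i, t) <- zip [0 ..] env, t == goal ]
        always = case goal of       -- a rule that always applies, so
          TInt    -> [ Lit <$> arbitrary ]      -- generation never sticks
          a :-> b -> [ Lam a <$> genTerm (a : env) b (n - 1) ]
        apps
          | n <= 0    = []
          | otherwise =
              [ do a <- genType 2   -- invent an argument type
                   f <- genTerm env (a :-> goal) (n `div` 2)
                   x <- genTerm env a (n `div` 2)
                   pure (App f x) ]

    genType :: Int -> Gen Type
    genType 0 = pure TInt
    genType k = frequency
      [ (2, pure TInt)
      , (1, (:->) <$> genType (k - 1) <*> genType (k - 1)) ]

Closed candidate programs can then be drawn with sized (genTerm [] TInt) and their compiled behaviour compared across optimisation levels.
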
See slides (not yet available).

Benjamin Pierce, Verification Challenges of Pervasive Information Flow. (Thu 2012-02-23)

The CRASH/SAFE project aims to design a new computer system that is highly resistant to cyber-attack. It offers a rare opportunity to rethink the hardware / OS / software stack from a completely clean slate, unhampered by legacy constraints. We are building novel hardware, a new high-level programming language, and a suite of modern operating system services, all embodying fundamental security principles -- separation of privilege, least privilege, mutual suspicion, etc. -- down to their very bones. Achieving these goals demands a co-design methodology in which all system layers are designed together, with a ruthless insistence on simplicity, security, and verifiability at every level.
I will describe the current state of the CRASH/SAFE design and discuss some of the most interesting verification challenges that it raises.
More information on CRASH/SAFE can be found at http://crash-safe.org.
See slides (not yet available).

Discussion groups. (Thu 2012-02-23, 10:45-12:30)


Business meeting. (Thu 2012-02-23, 17:00-)


Adam Chlipala, Statically Typed Metaprogramming in Ur. (Fri 2012-02-24, 09:00-)

Many real-world Web applications depend on metaprograms, or code generators. Traditionally, these programs are implemented with little help from the compiler in static checking, so that it is quite easy to build a metaprogram that performs appropriately on some inputs but outputs faulty programs on other inputs. The Ur programming language is a practical extension of System F omega that supports programming with syntax trees that have expressive types, based on extensible records and type-level programming. I presented the core of Ur and some of the type inference heuristics that make metaprograms more pleasant to write than in, e.g., Haskell or Agda. In particular, authors of metaprograms do not need to write proof terms to justify typing claims, and authors of metaprogram client code need not even write types fancier than in the mainstream.
See slides (not yet available).

Dimitrios Vytiniotis, Project Rhea: ... in data analytics ... Static Analysis for optimizing MapReduce. (Fri 2012-02-24)

Hadoop is a popular framework for processing large datasets. Many Hadoop jobs are very selective and operate on just a fraction of their input data. That data is often unstructured (for instance text files), which makes it difficult to apply out-of-the-box classical database optimization techniques. In this cross-group project at MSRC we have used static analysis techniques to examine the (executable bytecode of the) map phase of a job and automatically extract a filter that identifies the interesting `rows' and `columns' of the input data. Instead of sending all data from the storage to the compute cluster, we automatically identify and send only the subset of interest. Our automatically-generated filters are purely an optimization: they soundly approximate the set of interesting data, they are side-effect free (whereas mappers need not be), and can be killed or restarted on demand. Using our filters on example jobs, we have reduced network overheads by a factor of 5, and job completion times by a factor of 3 to 4 for certain jobs. In this talk I will focus on the static analysis part, and show how the domain of Hadoop map jobs is a great fit for a static analysis that is very simple to implement, cheap to run, and effective at improving job-completion times.
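The filter idea in miniature (a hand-written Haskell sketch; the real system extracts the filter automatically from the job's bytecode):

    import Data.List (isInfixOf)

    -- A mapper that only reacts to rows mentioning "ERROR".
    mapper :: String -> [(String, Int)]
    mapper line
      | "ERROR" `isInfixOf` line = [(head ws, length ws)]
      | otherwise                = []
      where ws = words line

    -- The extracted row filter: a sound over-approximation of the rows
    -- the mapper can react to. It is side-effect free, cheap, and safe
    -- to kill or restart.
    rowFilter :: String -> Bool
    rowFilter line = "ERROR" `isInfixOf` line

    -- Soundness: not (rowFilter l) implies mapper l == [], so filtering
    -- storage-side never changes the job's output, only its input size.
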
See slides (not yet available).

Discussion groups continued. (Fri 2012-02-24, 10:45-11:30)


Ralf Hinze (email: ralf.hinze@cs.ox.ac.uk)