Publications of the Algebra of Programming Research Group

The following publications are in (approximately) reverse chronological order. Links are given to electronic copies and to BibTeX references, where available. (If you would like a copy of any publication for which no links are available, please write to the author or to the group's web administrator.)

All of these publications are copyrighted, some by the authors and some by publishing companies. It is a publisher's requirement to display the following notice:

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Go to publications for: 2006 | 2005 | 2004 | 2003 | 2002 | 2001 | 2000 | 1999 | 1998 | 1997 | 1996 | 1995 | 1994 | 1993.

Barney Stratford (2006). A Faster Arithmetic Coder. Unpublished.

Abstract: Arithmetic coding is a highly effective method for data compression. One of the criticisms levelled at it is that it runs fairly slowly. We propose a novel method for arithmetic coding that runs at a significantly higher speed and can be implemented using simple hardware if required.
pdf (17 pages).

Jeremy Gibbons (2006). Design Patterns as Higher-Order Datatype-Generic Programs. Submitted for publication.

Abstract: Design patterns are reusable abstractions in object-oriented software. However, using current programming languages, these elements can only be expressed extra-linguistically: as prose, pictures, and prototypes. We believe that this is not inherent in the patterns themselves, but evidence of a lack of expressivity in the languages of today. We expect that, in the languages of the future, design patterns will be expressible as reusable library code. Indeed, we claim that the languages of tomorrow will suffice; the future is not far away. The necessary features are higher-order and datatype-generic constructs; these features are already or nearly available now. We argue the case by presenting higher-order datatype-generic programs capturing Origami, a small pattern language of recursive data structures.
pdf (18 pages).

Jeremy Gibbons (2006). Fission for Program Comprehension. Submitted for publication.

Abstract: Fusion is a program transformation that combines adjacent computations, flattening structure and improving efficiency at the cost of clarity. Fission is the same transformation, in reverse: creating structure, ex nihilo. We explore the use of fission for program comprehension, that is, for reconstructing the design of a program from its implementation. We illustrate through rational reconstructions of the designs for three different C programs that count the words in a text file.
pdf (18 pages).

Nils Anders Danielsson, Jeremy Gibbons, John Hughes and Patrik Jansson (2006). Fast and Loose Reasoning is Morally Correct. To appear at POPL2006.

Abstract: We justify reasoning about non-total (partial) functional languages using methods seemingly only valid for total ones; this permits "fast and loose" reasoning without actually being loose.

Two languages are defined, one total and one partial, with identical syntax. The semantics of the partial language includes partial and infinite values and lifted types, including lifted function spaces. A partial equivalence relation is then defined, the domain of which is the total subset of the partial language. It is proved that if two closed terms have the same semantics in the total language, then they have related semantics in the partial language. It is also shown that the partial equivalence relation gives rise to a bicartesian closed category, which can be used to reason about values in the domain of the relation.

pdf (12 pages). Accompanying proofs as a rough collection of text files.

Ralf Hinze, Andres Löh, and Bruno C. d. S. Oliveira (2005). "Scrap Your Boilerplate" Explained. Submitted to FLOPS 2006

Abstract: The paper "Scrap your Boilerplate" (SYB) introduces a combinator library for generic programming that offers generic traversals and queries. Classically, support for generic programming consists of two essential ingredients: a way to write type-indexed functions, and independently, a way to access the structure of data types. SYB seems to lacks the second. As a consequence, it is difficult to compare with other approaches such as PolyP or Generic Haskell. In this paper we reveal the structural view that SYB builds upon. This allows us to define the combinators as generic functions in the classical sense. We explain the SYB approach in this changed setting from ground up, and use the understanding gained to relate it to other approaches to generic programming. Furthermore, we show that the SYB view is applicable to a very large class of data types, including generalized algebraic data types.
Fetch the PDF file (188K). Extended version, source code and useful other information can be found here.

Bruno C. d. S. Oliveira and Jeremy Gibbons (2005). TypeCase: A Design Pattern for Type-Indexed Functions. In Haskell Workshop, September 2005.

Abstract: A type-indexed function is a function that is defined for each member of some family of types. Haskell's type class mechanism provides open type-indexed functions, in which the indexing family can be extended at any time by defining a new type class instance. The purpose of this paper is to present TypeCase: a design pattern that allows the definition of closed type-indexed functions. It is inspired by Cheney and Hinze's work on lightweight approaches to generic programming. We generalise their techniques as a design pattern. Furthermore, we show that type-indexed functions with type-indexed types, and consequently generic functions with generic types, can also be encoded in a lightweight manner, thereby overcoming one of the main limitations of the lightweight approaches.
pdf (12 pages).

Jeremy Gibbons (2005). Metamorphisms: Streaming Representation-Changers. Submitted for publication, January 2005.

Abstract: Unfolds generate data structures, and folds consume them. A hylomorphism is a fold after an unfold, generating then consuming a virtual data structure. A metamorphism is the opposite composition, an unfold after a fold; typically, it will convert from one data representation to another. In general, metamorphisms are less interesting than hylomorphisms: there is no automatic fusion to deforest the intermediate virtual data structure. However, under certain conditions fusion is possible: some of the work of the unfold can be done before all of the work of the fold is complete. This permits streaming metamorphisms, and among other things allows conversion of infinite data representations. We present the theory of metamorphisms and outline some examples.
pdf (51 pages). (This paper is a revised and extended version of Streaming Representation-Changers, from MPC2004.)

Jeremy Gibbons, Unbounded Spigot Algorithms for the Digits of π. To appear in American Mathematical Monthly. One of the algorithms described in this paper won a prize for "most useful submission" at the Succ Zeroth International Obfuscated Haskell Code Contest!

Abstract: Rabinowitz and Wagon (in American Mathematical Monthly 102(3):195--203, 1995) present a spigot algorithm for computing the digits of π. A spigot algorithm yields its outputs incrementally, and does not reuse them after producing them. Their algorithm is inherently bounded; it requires a commitment in advance to the number of digits to be computed, and in fact might still produce an incorrect last few digits. We propose some streaming algorithms based on the same and some similar characterizations of π, with the same incremental characteristics but without requiring the prior bound.
pdf (10 pages).

Jeremy Gibbons, David Lester and Richard Bird, Enumerating the Rationals. To appear in Journal of Functional Programming, 2005.
Abstract: We present a series of functional programs for enumerating the rational numbers without duplication, drawing on some elegant results of Neil Calkin, Herbert Wilf and Moshe Newman.
pdf (10 pages).

Jeremy Gibbons and Graham Hutton, Proof Methods for Corecursive Programs. Fundamenta Informatica 66(4):353-366, April/May 2005.
Abstract: Recursion is a well-known and powerful programming technique, with a wide variety of applications. The dual technique of corecursion is less well-known, but is increasingly proving to be just as useful. This article is a tutorial on the four main methods for proving properties of corecursive programs: fixpoint induction, the approximation (or take) lemma, coinduction, and fusion.
pdf (13 pages). (This paper is a revised version of [Gibbons&Hutton99:Proof].)

Jeremy Gibbons, Streaming Representation-Changers. In LNCS 3125: Mathematics of Program Construction, pages 142-168, July 2004.

Abstract: Unfolds generate data structures, and folds consume them. A hylomorphism is a fold after an unfold, generating then consuming a virtual data structure. A metamorphism is the opposite composition, an unfold after a fold; typically, it will convert from one data representation to another. In general, metamorphisms are less interesting than hylomorphisms: there is no automatic fusion to deforest the intermediate virtual data structure. However, under certain conditions fusion is possible: some of the work of the unfold can be done before all of the work of the fold is complete. This permits streaming metamorphisms, and among other things allows conversion of infinite data representations. We present the theory of metamorphisms and outline some examples.
pdf (27 pages).

Clare Martin, Sharon Curtis, and Ingrid Rewitzky, Modelling Nondeterminism. To appear in Mathematics of Program Construction, July 2004.

ps (26 pages).

Sharon Curtis. The Classification of Greedy Algorithms. Science of Computer Programming 49:125-157, 2003.

Abstract: This paper presents principles for the classification of greedy algorithms for optimization problems. These principles are made precise by their expression in the relational calculus, and illustrated by various examples. A discussion compares this work to other greedy algorithms theory.
ps.gz (39 pages).

Roland Backhouse and Jeremy Gibbons, editors. Summer School on Generic Programming. Lecture Notes in Computer Science 2793, November 2003.

Fetch the table of contents as a PDF file.

Jeremy Gibbons, Patterns in Datatype-Generic Programming. To appear in Declarative Programming in the Context of Object-Oriented Languages, Uppsala, 25th August 2003.

Abstract: Generic programming consists of increasing the expressiveness of programs by allowing a wider variety of kinds of parameter than is usual. The most popular instance of this scheme is the C++ Standard Template Library. Datatype-generic programming is another instance, in which the parameters take the form of datatypes. We argue that datatype-generic programming is sufficient to express essentially all the genericity found in the Standard Template Library, and to capture the abstractions motivating many design patterns. Moreover, datatype-generic programming is a precisely-defined notion with a rigorous mathematical foundation, in contrast to generic programming in general and the C++ template mechanism in particular, and thereby offers the prospect of better static checking and a greater ability to reason about generic programs. This paper describes work in progress.
pdf (13 pages).

Shin-Cheng Mu, A Calculational Approach to Program Inversion. DPhil thesis, January 2003.
Abstract: Many problems in computation can be specified in terms of computing the inverse of an easily constructed function. However, studies on how to derive an algorithm from a problem specification involving inverse functions are relatively rare. The aim of this thesis is to demonstrate, in an example-driven style, a number of techniques to do the job. The techniques are based on the framework of relational, algebraic program derivation.

Simple program inversion can be performed by just taking the converse of the program, sometimes known as to `run a program backwards'. The approach, however, does not match the pattern of some more advanced algorithms. Previous results, due to Bird and de Moor, gave conditions under which the inverse of a total function can be written as a fold. In this thesis, a generalised theorem stating the conditions for the inverse of a partial function to be a hylomorphism is presented and proved. The theorem is applied to many examples, including the classical problem of rebuilding a binary tree from its preorder and inorder traversals.

This thesis also investigates into the interplay between the above theorem and previous results on optimisation problems. A greedy linear-time algorithm is derived for one of its instances — to build a tree of minimum height. The necessary monotonicity condition, though looking intuitive, is difficult to establish. For general optimal bracketing problems, however, the thinning strategy gives an exponential-time algorithm. The reason and possible improvements are discussed in a comparison with the traditional dynamic programming approach. The greedy theorem is also generalised to a generic form allowing mutually defined algebras. The generalised theorem is applied to the optimal marking problem defined on non-polynomial based datatypes. This approach delivers polynomial-time algorithms without the need to convert the inputs to polynomial based datatypes, which is sometimes not convenient to do.

The many techniques are applied to solve the Countdown problem, a problem derived from the popular television program of the same name. Different strategies toward deriving an efficient algorithm are experimented and compared.

Finally, it is shown how to derive from its specification the inverse of the Burrows-Wheeler transform, a string-to-string transform useful in compression. Not only do we identify the key property why the inverse algorithm works, but, as a bonus, we also outline how two generalisations of the transform may be derived.

ps.gz (167 pages).

Clare Martin, Jeremy Gibbons and Ian Bayley, Disciplined, efficient, generalised folds for nested datatypes. Formal Aspects of Computing 16(1):19-35, 2004.
Abstract: Nested (or non-uniform, or non-regular) datatypes have recursive definitions in which the type parameter changes. Their folds are restricted in power due to type constraints. Bird and Paterson introduced generalised folds for extra power, but at the cost of a loss of efficiency: folds may take more than linear time to evaluate. Hinze introduced efficient generalised folds to counter this inefficiency, but did so in a pragmatic way, at the cost of a loss of reasoning power: without categorical or equivalent underpinnings, there are no universal properties for manipulating folds. We combine the efficiency of Hinze's construction with the powerful reasoning tools of Bird and Paterson's.
pdf (17 pages).

Richard Bird and Jeremy Gibbons, Arithmetic Coding with Folds and Unfolds. In Johan Jeuring and Simon Peyton Jones, editors, Advanced Functional Programming 4. Lecture Notes in Computer Science 2638, Springer-Verlag, 2003. pdf; code.

Jeremy Gibbons and Johan Jeuring, editors, Generic Programming. Proceedings of the IFIP TC2 Working Conference on Generic Programming, Schloss Dagstuhl, July 2002. ISBN 1-4020-7374-7. Kluwer Academic Publishers, 2003. Buy it from Amazon. pdf of preface and table of contents.

Jeremy Gibbons and Oege de Moor, editors, The Fun of Programming. ISBN 0333992857 in paperback, 1403907722 in hardback. Palgrave, March 2003. Buy it from Amazon in paperback or hardback. Pdf of preface and table of contents.

Jeremy Gibbons, Origami Programming. In Jeremy Gibbons and Oege de Moor, editors, The Fun of Programming.

Shin-Cheng Mu and Richard Bird. Inverting functions as folds. To appear in Mathematics of Program Construction, Schloss Dagstuhl, Germany, July 2002.
Abstract: This paper is devoted to the proof and applications of a theorem giving conditions under which the inverse of a partial function can be expressed as a relational hylomorphism. The theorem is a generalisation of a previous result, due to Bird and de Moor, that gave conditions under which a total function can be expressed a relational fold. The theorem is illustrated with three problems, all dealing with constructing trees with various properties.
.ps.gz (24 pages).

Ian M. Bayley. Generic Operations on Nested Datatypes. DPhil thesis, Oxford University Computing Laboratory, Michaelmas Term 2001 (revised May 2002).
Abstract: Nested datatypes are a generalisation of the class of regular datatypes, which includes familiar datatypes like trees and lists. They typically represent constraints on the values of regular datatypes and are therefore used to minimise the scope for programmer error.

An operation is said to be generic if it is parameterised by a datatype. This thesis explains how to define and reason with generic operations for nested datatypes. The operations we choose to illustrate the method include the zip and membership operations of Hoogendijk's thesis. These operations are thereby generalised from regular datatypes to nested datatypes.

We use fold operators to define and reason with these generic operations. It is therefore sufficient for us to define these fold operators generically and to express neatly generic theorems for these folds, such as universal properties and fusion laws. This we do for the three types of folds on nested datatypes. We demonstrate the theorems by proving the fold-equality law, which connects two varieties of fold representing two different modes of evaluation.

Much of our reasoning is with relations rather than functions so we have to adapt our semantics for nested datatypes to incorporate the operators of relational calculus. For this reason, we extend our fold operators and associated theorems to apply to relations.

Okasaki has argued informally that every nested datatype represents a constraint on some regular datatype. We prove this formally by defining an injective embedding function for each nested datatype. Since this operation connects programs that use nested datatypes with programs that use only regular datatypes, we can use it to remove nested datatypes from programs. There is some hope that we can also use it as a means of constructing programs for nested datatypes.

pdf (189 pages).

Roland Backhouse, Roy Crole and Jeremy Gibbons, editors. Algebraic and Coalgebraic Methods in the Mathematics of Program Construction. Lecture Notes in Computer Science 2297, January 2002.

Fetch the table of contents as a PDF file.

Jeremy Gibbons. Calculating Functional Programs. In Algebraic and Coalgebraic Methods in the Mathematics of Program Construction, Lecture Notes in Computer Science 2297, p148-203, January 2002.
Abstract: Functional programs are merely equations; they may be manipulated by straightforward equational reasoning. In particular, one can use this style of reasoning to calculate programs, in the same way that one calculates numeric values in arithmetic. Many useful theorems for such reasoning derive from an algebraic view of programs, built around datatypes and their operations. Traditional algebraic methods concentrate on initial algebras, constructors, and values; dual co-algebraic methods concentrate on final co-algebras, destructors, and processes. Both methods are elegant and powerful; they deserve to be combined.
Fetch the PDF file. (This chapter supersedes Lecture Notes on Algebraic and Coalgebraic Methods for Calculating Functional Programs, from the Estonian Winter School on Computer Science, March 1999.)

Richard Bird, Jeremy Gibbons and Shin-Cheng Mu. Algebraic Methods for Optimization Problems. In Algebraic and Coalgebraic Methods in the Mathematics of Program Construction, Lecture Notes in Computer Science 2297, p281-307, January 2002.
Abstract: We argue for the benefits of relations over functions for modelling programs, and even more so for modelling specifications. To support this argument, we present an extended case study for a class of optimization problems, deriving efficient functional programs from concise relational specifications.
Fetch the PDF file.

Jeremy Gibbons. Towards a colimit-based semantics for visual programming. In LNCS 2315: Coordination Models and Languages, ed Farhad Arbab and Carolyn Talcott, p166-173, April 2002.
Abstract: Software architects such as Garlan and Katz promote the separation of computation from coordination. They encourage the study of connectors as first-class entities, and superposition of connectors onto components as a paradigm for component-oriented programming. We demonstrate that this is a good model for what visual programming tools like IBM's VisualAge actually do. Moreover, Fiadeiro and Maibaum's categorical semantics of parallel programs is applicable to this model, so we can make progress towards a formal semantics of visual programming.
.pdf of conference version (8 pages) or .pdf of extended version (16 pages).

Shin-Cheng Mu and Richard Bird. Functional Quantum Computation. In 2nd Asian Workshop on Programming Languages and Systems, KAIST, Dajeaon, Korea, December 17-18, 2001. .ps.gz (14 pages).

Richard Bird and Shin-Cheng Mu. Inverting the Burrows-Wheeler Transform. In Haskell Workshop, Firenze, Italy, September 2001. This revised version submitted for journal publication.
Abstract: Our aim in this pearl is to exploit simple equational reasoning to derive the inverse of the Burrows-Wheeler transform from its specification. As a bonus, we will also sketch the outlines of deriving the inverse of two more general versions of the transform, one proposed by Schindler and another by Chapin and Tate.
.ps.gz (9 pages).

Jeremy Gibbons, Graham Hutton and Thorsten Altenkirch. When is a function a fold or an unfold?. In Coalgebraic Methods in Computer Science, April 2001.
Abstract: We give a necessary and sufficient condition for when a set-theoretic function can be written using the recursion operator fold, and a dual condition for the recursion operator unfold. The conditions are simple, practically useful, and generic in the underlying datatype.
.ps.gz (13 pages).

Silvija Seres. The Algebra of Logic Programming. D.Phil. thesis, April 2001.
Abstract: At present, the field of declarative programming is split into two main areas based on different formalisms; namely, functional programming, which is based on lambda calculus, and logic programming, which is based on first-order logic. There are currently several language proposals for integrating the expressiveness of these two models of computation. In this thesis we work towards an integration of the methodology from the two research areas. To this end, we propose an algebraic approach to reasoning about logic programs, corresponding to the approach taken in functional programming.

In the first half of the thesis we develop and discuss a framework which forms the basis for our algebraic analysis and transformation methods. The framework is based on an embedding of definite logic programs into lazy functional programs in Haskell, such that both the declarative and the operational semantics of the logic programs are preserved.

In spite of its conciseness and apparent simplicity, the embedding proves to have many interesting properties and it gives rise to an algebraic semantics of logic programming. It also allows us to reason about logic programs in a simple calculational style, using rewriting and the algebraic laws of combinators. In the embedding, the meaning of a logic program arises compositionally from the meaning of its constituent subprograms and the combinators that connect them.

In the second half of the thesis we explore applications of the embedding to the algebraic transformation of logic programs. A series of examples covers simple program derivations, where our techniques simplify some of the current techniques. Another set of examples explores applications of the more advanced program development techniques from the Algebra of Programming by Bird and de Moor, where we expand the techniques currently available for logic program derivation and optimisation.

.ps.gz (203 pages).

Clare Martin and Jeremy Gibbons, On the Semantics of Nested Datatypes. In Information Processing Letters, 80(5) p233-238, December 2001.
Abstract: Nested (or non-regular or non-uniform) datatypes are recursively defined parameterised datatypes in which the parameter of the datatype changes in the recursive call. The standard semantic definition of recursively defined datatypes is as initial algebras in the category Set of sets and total functions. Bird and Meertens have shown that this theory is inadequate to describe nested datatypes. Instead, one solution proposed there was to define them as initial algebras in the functor category Nat(Set), with objects all endofunctors on Set and arrows all natural transformations between them. It was observed, however, that this approach had not been validated, in the sense of ensuring that such initial algebras exist. We show here that they are not guaranteed to exist in the functor category itself, but that they do exist in one of its subcategories: the category of all cocontinuous endofunctors and natural transformations. This category is then a suitable semantic domain for nested datatypes, both first order and higher-order.
ps.gz (7 pages). (Revised, February 2001.)

Richard McPhee. Compositional Logic Programming. DPhil thesis, Oxford University Computing Laboratory, Trinity Term 2000.
Abstract: Relational program derivation has gathered momentum over the last decade with the development of many specification logics. However, before such relational specifications can be executed in existing programming languages, they must be carefully phrased to respect the evaluation order of the language. In turn, this requirement inhibits the rapid prototyping of specifications in a relational notation. The aim of this thesis is to bridge the gap between the methodology and practice of relational program derivation by realising a compositional style of logic programming that permits specifications to be phrased naturally and executed declaratively.

The first contribution of this thesis is the identification of a collection of desiderata that sets out the particular language requirements necessary to support our notion of compositionality. Thus, from the outset, we differentiate between the execution of specifications and programs, the latter an enterprise best left to the likes of Prolog and Mercury.

Compositionality is obtained in this thesis by translating higher-order elements of functional programming style into the logic paradigm. By doing so, these elements become enriched with the extra expressiveness fostered by nondeterminism, logic variables, and program converses. Another contribution of this thesis is the demonstration that a curried representation of programming terms in a compositional logic language is sufficient to provide the desired higher-order facilities without the need for either extra-logical predicates or higher-order unification.

A further contribution of this thesis is the rediscovery of fair SLD-resolution as a fundamental way to guarantee termination of compositional programs within the confines of resolution. Unfortunately, though, fair SLD-resolution using the `breadth-first' computation rule exhibits efficiency behaviour that is difficult to predict. Consequently, this thesis proposes and implements two novel versions of fair SLD-resolution that overcome the deficiencies of the breadth-first computation rule.

The first strategy, based on a new formulation of tabled evaluation, restores efficiency by eliminating redundant computation and also inherits the extra termination benefits intrinsic to tabled evaluation. The second strategy, called prioritised fair SLD-resolution, selects literals in a goal from those whose resolution is known to terminate at the expense of the others. Prioritised resolution centres around an original adaptation of an existing termination analysis for logic programs. Although termination analysis has recently been used to verify the correctness of logic programs, its application in this thesis to improve the efficiency of compositional programs is new.

pdf (196 pages).

Paul Hoogendijk and Oege de Moor, Container Types Categorically. In Journal of Functional Programming 10(2) p.191-225, 2000. (This paper is a revised version of [Hoogendijk&deMoor96:What]).

Silvija Seres and Shin-Cheng Mu. Optimization Problems in Logic Programming: An Algebraic Approach, in LPSE, July 2000.
Abstract: Declarative programming, with its mathematical underpinning, was aimed to simplify rigorous reasoning about programs. For functional programs, an algebraic calculus of relations has previously been applied to optimisation problems to derive efficient greedy or dynamic programs from the corresponding inefficient but obviously correct specifications. Here we argue that this approach is natural also in the logic programming setting.
ps.gz (19 pages).

Silvija Seres and Mike Spivey. Higher-Order Transformation of Logic Programs, in LOPSTR, July 2000.
Abstract: It has earlier been assumed that a compositional approach to algorithm design and program transformation is somehow unique to functional programming. Elegant theoretical results codify the basic laws of algorithmics within the functional paradigm and with this paper we hope to demonstrate that some of the same techniques and results are applicable to logic programming as well.
ps.gz (9 pages).

Oege de Moor and Jeremy Gibbons, Pointwise Relational Programming, to appear at AMAST, May 2000.
Abstract: The point-free relational calculus has been very successful as a language for discussing general programming principles. However, when it comes to specific applications, the calculus can be rather awkward to use: some things are more clearly and simply expressed using variables. The combination of variables and relational combinators such as converse and choice yields a kind of nondeterministic functional programming language. We give a semantics for such a language, and illustrate with an example application.
ps.gz (20 pages).

Graham Hutton and Jeremy Gibbons The Generic Approximation Lemma, Information Processing Letters 79(4) p197-201, August 2001.
Abstract: The approximation lemma was recently introduced as a simplification of the well-known take lemma, and is used to prove properties of programs that produce lists of values. We show how the approximation lemma, unlike the take lemma, can naturally be generalised from lists to a large class of datatypes, and present a generic approximation lemma that is parametric in the datatype to which it applies. As a useful by-product, we find that generalising the approximation lemma in this way also simplifies its proof.
ps.gz (8 pages).

Tyng-Ruey Chuang and Shin-Cheng Mu. Out-of-core functional programming with type-based primitives. In 2nd International Workshop on Practical Aspects of Declarative Languages, January 2000. .ps (15 pages).

Silvija Seres, Mike Spivey and Tony Hoare, Algebra of Logic Programming, to appear at ICLP'99, November 1999.
Abstract: A declarative programming language has two kinds of semantics. The more abstract helps in reasoning about specifications and correctness, while an operational semantics determines the manner of program execution. A correct program should reconcile its abstract meaning with its concrete interpretation.

To help in this, we present a kind of algebraic semantics for logic programming. It lists only those laws that are equally valid for predicate calculus and for the standard depth-first strategy of Prolog. An alternative strategy is breadth-first search, which shares many of the same laws. Both strategies are shown to be special cases of the most general strategy, that for free searching. The three strategies are defined in the lazy functional language Haskell, so that each law can be proved by standard algebraic reasoning. The laws are an enrichment of the familiar categorical concept of a monad, and the links between such monads are explored.

ps.gz (16 pages).

Jeremy Gibbons and Graham Hutton, Proof Methods for Structured Corecursive Programs. In First Scottish Functional Programming Workshop, August/September 1999.
Abstract: Corecursive programs produce values of greatest fixpoint types, in contrast to recursive programs, which consume values of least fixpoint types. There are a number of widely used methods for proving properties of corecursive programs, including fixpoint induction, the take lemma, and coinduction. However, these methods are all rather low-level, in the sense that they do not exploit the common structure that is often present in corecursive definitions. We argue for a more structured approach to proving properties of corecursive programs. In particular, we show that by writing corecursive programs using an operator called unfold that encapsulates a common pattern of corecursive definition, we can then use high-level algebraic properties of this operator to conduct proofs in a purely calculational style that avoids the use of either induction or coinduction.
ps.gz (14 pages). (This paper has been superseded by [Gibbons&Hutton2000:Proof].)

Silvija Seres and Mike Spivey, Embedding Prolog into Haskell, presented at HASKELL'99, September 1999.
Abstract: The distinctive merit of the declarative reading of logic programs is the validity of all the laws of reasoning supplied by the predicate calculus with equality. Surprisingly many of these laws are still valid for the procedural reading; they can therefore be used safely for algebraic manipulation, program transformation and optimisation of executable logic programs.

This paper lists a number of common laws, and proves their validity for the standard (depth-first search) procedural reading of Prolog. They also hold for alternative search strategies, e.g. breadth-first search. Our proofs of the laws are based on the standard algebra of functional programming, after the strategies have been given a rather simple implementation in Haskell.

ps.gz (14 pages).

Richard Bird, Jeremy Gibbons and Geraint Jones, Program Optimisation, Naturally. In Millenial Perspectives in Computer Science, Palgrave, 2000.
Abstract: It is well-known that each polymorphic function satisfies a certain equational law, called a naturality condition. Such laws are part and parcel of the basic toolkit for improving the efficiency of functional programs. More rarely, some polymorphic functions also possess a higher-order naturality property. One example is the operation unzip that takes lists of pairs to pairs of lists. Surprisingly, this property can be invoked to improve the asymptotic efficiency of some divide-and-conquer algorithms from O(n log n) to O(n). The improved algorithms make use of polymorphic recursion, and can be expressed neatly using nested datatypes, so they also serve as evidence of the practical utility of these two concepts.
ps.gz (9 pages).

Jeremy Gibbons, Generic Downwards Accumulations. Science of Computer Programming 37(1-3) p37-65, 2000.
Abstract: A downwards accumulation is a higher-order operation that distributes information downwards through a data structure, from the root towards the leaves. The concept was originally introduced in an ad hoc way for just a couple of kinds of tree. We generalize the concept to an arbitrary regular datatype; the resulting definition is co-inductive.
ps.gz (29 pages).

Oege de Moor and Jeremy Gibbons, Bridging the Algorithm Gap: A Linear-Time Functional Program for Paragraph Formatting. Science of Computer Programming, 35(1), September 1999. Revised version of Technical Report CMS-TR-97-03, School of Computing and Mathematical Sciences, Oxford Brookes University.
Abstract: In the constructive programming community it is commonplace to see formal developments of textbook algorithms. In the algorithm design community, on the other hand, it may be well known that the textbook solution to a problem is not the most efficient possible. However, in presenting the more efficient solution, the algorithm designer will usually omit some of the implementation details, thus creating an algorithm gap between the abstract algorithm and its concrete implementation. This is in contrast to the formal development, which usually proceeds all the way to the complete concrete implementation of the less efficient solution.

We claim that the algorithm designer is forced to omit some of the details by the relative expressive poverty of the Pascal-like languages typically used to present the solution. The greater expressiveness provided by a functional language would allow the whole story to be told in a reasonable amount of space. In this paper we use a functional language to present the development of a sophisticated algorithm all the way to the final code. We hope to bridge the algorithm gap between abstract and concrete implementations, and thereby facilitate communication between the constructive programming and algorithm design communities.

ps.gz (27 pages).

Richard Bird and Ross Paterson, Generalised Folds for Nested Datatypes. Formal Aspects of Computing, 11 p200-222, 1999.
Abstract: Nested datatypes generalise regular datatypes in much the same way that context-free languages generalise regular ones. Although the categorical semantics of nested types turns out to be similar to the regular case, the fold functions are more limited because they can only describe natural transformations. Practical considerations therefore dictate the introduction of a generalised fold function in which this limitation can be overcome. In the paper we show how to construct generalised folds systematically for each nested datatype, and show that they possess a uniqueness property analogous to that of ordinary folds. As a consequence, generalised folds satisfy fusion properties similar to those developed for regular datatypes. Such properties form the core of an effective calculational theory of inductive datatypes.
ps.gz (23 pages).

Richard Bird and Ross Paterson, De Bruijn Notation as a Nested Datatype. Journal of Functional Programming, 11 p200-222, 1999.
Abstract: de Bruijn notation is a coding of lambda terms in which each occurrence of a bound variable x is replaced by a natural number, indicating the `distance' from the occurrence to the abstraction that introduced x. One might suppose that in any datatype for representing de Bruijn terms, the distance restriction on numbers would have to maintained as an explicit datatype invariant. However, by using a nested (or non-regular) datatype, we can define a representation in which all terms are well-formed, so that the invariant is enforced automatically by the type system.

Programming with nested types is only a little more difficult than programming with regular types, provided we stick to well-established structuring techniques. These involve expressing inductively defined functions in terms of an appropriate fold function for the type, and using fusion laws to establish their properties. In particular, the definition of lambda abstraction and beta reduction is particularly simple, and the proof of their associated properties is entirely mechanical.

ps.gz (15 pages).

Jeremy Gibbons, A Pointless Derivation of Radixsort. Journal of Functional Programming, 9(3) p339-346, 1999.
Abstract: This paper is about point-free (or `pointless') calculations - calculations performed at the level of function composition instead of that of function application. We address this topic with the help of an example, namely calculating the radix-sort algorithm from a more obvious specification of sorting. The message that we hope to send is that point-free calculations are sometimes surprisingly simpler than the corresponding point-wise calculations.
ps.gz (10 pages).

Richard Bird, Meertens' number. Journal of Functional Programming, 8(1):83-88, 1998.

Silvija Seres, Unifying Functional and Logic Programming, transfer thesis, September 1998. ps.gz

Jeremy Gibbons and Geraint Jones, The Under-Appreciated Unfold. In International Conference on Functional Programming, Baltimore, MD, September 1998.
Abstract: Folds are appreciated by functional programmers; the benefits of encapsulating common patterns of computation as higher-order operators are well-known and well understood. Their dual, unfolds, are nearly as well-known, but not nearly as well appreciated. We believe they deserve better. To illustrate, we present (indeed, we calculate) a number of algorithms for computing the breadth-first traversal of a tree. We specify breadth-first traversal in terms of level-order traversal, which we characterize first as a fold. The presentation as a fold is simple, but it is inefficient, and removing the inefficiency makes it no longer a fold. We calculate a characterization as an unfold from the characterization as a fold; this unfold is equally clear, but more efficient. We also calculate a characterization of breadth-first traversal directly as an unfold; this turns out to be the `standard' queue-based algorithm.
ps.gz (7 pages).

Geraint Jones, Tabulation for Type Hackers. Unpublished notes, July 1998. html

Richard Bird and Lambert Meertens, Nested Datatypes, in Johan Jeuring (ed), LNCS 1422: Mathematics of Program Construction, Springer-Verlag, 1998.

Richard Bird, Introduction to Functional Programming using Haskell, Prentice-Hall, 1998. Resources.

Richard Bird and Jesus Ravelo, On computing representatives. Information Processing Letters, 63:1-7, 1997.

Richard Bird, Geraint Jones, and Oege de Moor, More haste, less speed: lazy versus eager evaluation, Journal of Functional Programming, 7(5):541-547, 1997. dvi.gz, ps.gz (5 pages).

Richard Bird, On building trees with minimum height. Journal of Functional Programming, 7(4):441-445, 1997.

Jeremy Gibbons, More on Merging and Selection. Technical Report CMS-TR-97-08, School of Computing and Mathematical Sciences, Oxford Brookes University.
Abstract: In his paper On Merging and Selection (Journal of Functional Programming 7(3), 1997), Bird considers the problem of computing the nth element of the list resulting from merging the two sorted lists x and y. Representing x and y by trees, Bird derives an algorithm for the problem taking time proportional to the sum of their depths. Bird's derivation is more complicated than necessary. By the simple tactic of delaying a design decision (in this case, the decision to represent the lists as trees) as long as possible, we obtain a much simpler solution.
ps.gz (5 pages).

Richard Bird, On merging and selection. Journal of Functional Programming, 7(3):349-354, 1997.

Pedro Borges, Sequence Implementations in Haskell, 1997.
Abstract: An abstract data type Sequence is defined with the operations empty, isEmpty, cons, snoc, popFront, popRear, lenghtSeq, toList, and toSeq. A sequence with the operations lookupSeq and updateSeq is an Indexable Sequence. A sequence with catenation is called a Catenable Sequence. Some functional implementations of these abstract data types taken from the literature are described. These implementations are classified as stacks, deques, flexible arrays, and catenable lists, if they can be used as efficient implementations for each of these traditional data types. Some of them are extended to provide the operations defined for sequences. Some comments and directions for further research are also included.

The implementations are made in the functional programming language Haskell as instances of one or more of the classes Sequence, IndSeq, and CatSeq, with the operations defined for each type. These instances are classified by the subset of these operations that each instance supports efficiently, i.e. with at most logarithmic complexity.

dvi.gz, ps.gz (61 pages).

Jesus Ravelo, Calculating with Relations for Graph Algorithmics, Presented at RelMiCS'97: 3rd International Seminar on the Use of Relational Methods in Computer Science, 6-10 January 1997, Hammamet, Tunisia. dvi.gz, ps.gz (10 pages).

Richard Bird, Allegories as a basis for algorithmics. In E. Moggi and G Rossolini, editors, Proceedings of Category Theory and Computer Science, volume 1290 of Lecture Notes in Computer Science, pages 34-46, 1997.

Richard Bird and Oege de Moor, Algebra of Programming, Prentice-Hall, 1996. Resources.

Jesus Ravelo, A Class of Graph Algorithms, qualifying dissertation submitted in application for transfer to DPhil status, 1996. dvi.gz, ps.gz (45 pages).

Richard McPhee and Oege de Moor, Compositional Logic Programming, M. Chakravarty and Y. Guo and T. Ida (eds), Proceedings of the JICSLP'96 post-conference workshop: Multi-paradigm logic programming, Report 96-28, Technische Universitaet Berlin, 1996, pp. 115--127. dvi.gz, ps.gz (12 pages).

Paul Hoogendijk and Oege de Moor, What is a data type?, 1996. dvi.gz, ps.gz (30 pages). (This paper has been superseded by [Hoogendijk&deMoor2000:Container].)

Richard Bird, Oege de Moor and Paul Hoogendijk, Generic functional programming with types and relations. Journal of Functional Programming, 6(1), 1996. ps.gz (25 pages).

Richard Bird, Functional algorithm design. Science of Computer Programming, 26:15-31, 1996.

Oege de Moor, An Exercise in Polytypic Program Derivation: repmin, 1995. dvi.gz, ps.gz (15 pages).

Oege de Moor, A Generic Program for Sequential Decision Processes, PLILP 1995: 1-23, 1995. dvi.gz, ps.gz (23 pages).

Richard McPhee, Towards a Relational Programming Language, qualifying dissertation submitted in application for transfer to DPhil status, 1995. dvi.gz, ps.gz (65 pages).

Richard McPhee, Implementing Ruby in a Higher-Order Logic Programming Language, 1995. dvi.gz, ps.gz (15 pages).

Sharon Curtis and Gavin Lowe, A Graphical Calculus, 1995. ps.gz (18 pages).

Sharon Curtis, A Relational Approach to Optimization Problems, D.Phil. thesis, 1995. ps.gz. (137 pages).

Richard Bird and Oege de Moor, Relational program derivation and context-free language recognition, In A. W. Roscoe, editor, A Classical Mind: Essays Dedicated to C. A. R. Hoare, Prentice-Hall, 1994.

Richard Bird and Oege de Moor, The Algebra of Programming. In Manfred Broy, editor, Proceedings of the 1994 International Summer School, Marktoberdorf, NATO ASI series F. Springer-Verlag, 1994.

Oege de Moor, Categories, Relations and Dynamic Programming. Mathematical Structures in Computer Science 4(1): 33-69 (1994).

Sharon Curtis, Partitions Revisited, qualifying dissertation submitted in application for transfer to DPhil status, 1993. ps.gz (55 pages).

Doaitse Swierstra and Oege de Moor, Virtual Data Structures, in Bernhard Moeller et al (eds), Proceedings of the State-of-the-Art Seminar on Formal Program Development, Rio de Janeiro, Springer LNCS 755, 1993. dvi.gz, ps.gz (17 pages).

Richard Bird and Oege de Moor, From Dynamic Programming to Greedy Algorithms, in Bernhard Moeller et al (eds), Proceedings of the State-of-the-Art Seminar on Formal Program Development, Rio de Janeiro, Springer LNCS 755, 1993. dvi.gz, ps.gz (19 pages).

Mike Spivey, Category theory for functional programming. Technical Report TR-7-93, OUCL, June 1993.

Paul Gardiner, Clare Martin, and Oege de Moor, An Algebraic Construction of Predicate Transformers, Science of Computer Programming 22(1-2): 21-44 (1994). (Earlier version appears in MPC92.) dvi.gz, ps.gz (20 pages).

Richard Bird and Oege de Moor, Solving Optimisation Problems with Catamorphisms, In MPC92, pages 45-66. dvi.gz, ps.gz (22 pages).

Geraint Jones and Mary Sheeran, Designing arithmetic circuits by refinement in Ruby In MPC92, pages 208-232. ps.Z (26 pages).

Richard Bird, Carroll Morgan and Jim Woodcock (eds), Proceedings of the Second International Conference on Mathematics of Program Construction, volume 669 of Lecture Notes in Computer Science, Springer-Verlag, 1993.

Richard Bird and Oege de Moor, List Partitions, Formal Aspects of Computing, 5(1):61-78, 1993.

Richard Bird and Oege de Moor, Between Dynamic Programming and Greedy: Data Compression, 1992. dvi.gz, ps.gz (33 pages).

Oege de Moor, Inductive Data Types for Predicate Transformers, Information Processing Letters 43(3): 113-117, 1992. dvi.gz, ps.gz (9 pages).

Jeremy Gibbons (Jeremy.Gibbons@comlab.ox.ac.uk)

[Oxford Spires]

Oxford University Computing Laboratory Courses Research People About us News