# Functional Pearl: A fresh look at binary search trees

*Ralf Hinze*

### Abstract

Binary search trees are old hat, aren't they? Search trees are routinely covered in introductory computer science classes and they are widely used in functional programming courses to illustrate the benefits of algebraic data types and pattern matching. And indeed, the operation of insertion enjoys a succinct and elegant functional formulation. Alas, both succinctness and elegance are lost when it comes to implementing the dual operation of deletion. Why this discrepancy? The algorithmic explanation is that insertion always takes place at an external node, that is, at a leaf whereas deletion always takes place at an internal node and that manipulating internal nodes is notoriously more difficult than manipulating external nodes. Our own stab at explaining this phenomenon is algebraic or, if you like, linguistic. Arguably, the data type Tree with its two constructors, Leaf and Node, does not constitute a particularly elegant algebra. If we use binary search trees for representing sets, then Leaf denotes the empty set ∅ and Node l a r denotes the set s ∪ a ∪ t where s and t are the denotations of l and r, respectively. One might reasonably advance that Node mingles two abstract operations, namely, forming a singleton set `-' and taking the disjoint union `∪' of two sets, and that it is preferable to consider these two operations separately. Of course, there is a good reason for using a ternary constructor: the second argument of Node, the split key, is vital for steering the binary search. Thus, as a replacement for the tree constructors the algebra ∅, -, `∪' is inadequate; we additionally need a substitute for the split key. Now, a search tree satisfies the invariant that for each node the split key is greater than the elements in the left subtree (and smaller than the ones in the right subtree). This suggests to augment the algebra with an observer function max (or min, equivalently) that determines the maximum (or the minimum) element of a set. We will see that all standard operations on search trees can be conveniently expressed using this extended algebra. This does not mean, however, that we abandon binary search trees altogether. Rather, we shall use the algebra as an interface to the concrete representation of this data structure. This is the point of the pearl: even concrete data types may benefit from data structural abstraction.