Difference lists are a (very) useful data structure in Prolog, particularly when it comes to parsing with definite clause grammars (DCGs) and when it comes to list construction (again, using DCGs). Why? Well, to answer that question, let's examine the structure of difference lists. The Prolog (or relational) representation of a difference list is of the form:

`X = [1,2,3|Y] - Y`

What that relation describes is that `X`

is the *difference*of the pair of lists of (first)

`[1,2,3]`

appended with the list `Y`

*minus*(second) the list

`Y`

.Well, what is the value of the list

`Y`

? To a Prologian, this question is not important (for `Y`

, being a logic variable, can be *any*list or, even, not a list at all because the variable is still uninstantiated (a "free" variable)). Its importance is that it gives us an "instant" index into the list after the elements we've already processed. The significance of that index for parsing is that we now have a stream over the parsed (list) data, which gives us the ability to translate BNF grammars directly into working parsers. The significance for list construction is that we "simply" walk "backward" through the DCG to construct a list from a (parsed) internal representation, without paying the linear cost of either (a) traversing the list each time to add a new element (which results in exponential time list construction) or (b) building the list in reverse (adding new elements to the head of the list) and then applying

`reverse`to the end result.

"Big deal!" you scoff, "zippers [also called 'finger trees'] already give us indices into data structures. Who needs a funny-looking specialized zipper just for lists?"

"Good point!" is my response. Zippers are powerful and flexible data structures that not only provide that index over any data structure, but, even better than difference list's index that only goes forward (with any dignity) in constant time, the zipper's index can go both forward and backward through that (generic) structure in constant time. So, if you are not familiar with zippers, I request that you run, don't walk, to a good overview to learn more about their utility.

With all their flexibility and power, in certain situations, a zipper is not the correct choice for some list processing tasks; the difference list's simplicity shines in these situations. First, for the unidirectional index of the difference list (it moves forward through the list well, but backwards ... not so much), this, for list construction purposes, is hardly ever an issue, and if it becomes one, then backtracking is easily installed with some

`MonadPlus`magic. Next, you, the coder, are responsible for constructing a zipper list data type each time you need one, but Don Stewart has provided Data.DList; no construction necessary! Last, an example structure of a zipper list, for the list

`a++b`

is `(``reverse` a,b)

, so if we wish to reconstruct the original list, for an `a`

of large magnitude, we still pay the linear (`reverse`) cost in reconstruction

*as well as*the linear cost of appending

`b`

to that result.You heard it here first, folks — difference lists: [zipper lists] can't touch this!

Okay, I've finished throwing it down for now. Let's look at difference lists from the Haskell perspective and then use this data structure in a list construction example.

`> ``import` `Data`.`DList`

For my examples, I'm going to be looking at the subset of difference lists that I can compare and see:With the above>instanceEqa =>Eq(DLista)where

> x == y = toList x == toList y

>instance(Eqa,Showa) =>Show(DLista)where

>showx | x == empty = "\\x -> [x]"

> |otherwise= "\\x -> [" ++ show (head x)

> ++ doShow (toList (tail x)) ++ ":x]"

>wheredoShow [] = ""

> doShow (a:b) = ", " ++ show a ++ doShow b

`instance`declarations, we see that

`(singleton 1) `snoc` 2 `snoc` 3`

gives us the representation`λ x . [1, 2, 3:x]`

... `snoc`

appends an element to the end of a list (so it's the reverse of `cons`

).Prologians are always smug with their ability to create an infinite stream of 1s with

`X = [1|X]`

. And, no doubt about it, Prolog is a programming language that makes list construction easy ... **ahem* almost*as easy as list construction in Haskell ("Your list comprehensions outshine even Prolog for sure..."):

But no self-respecting Prologian would freeze their interpreter with a such a rash proof; Haskellians have no such concern, thanks to weak head normal form:repeat1

[1,1..]... and now with difference lists ...letx = unDL (singleton 1) xinx

`take` 5 (`repeat` 1)

So there!Let's examine an application of difference lists. I have a little scanner that I use to help me in my oration. The meat of the algorithm (before difference lists) was as follows:

Ugh, yes,> lettersIn word = lettersIn' word []

> lettersIn' [] ans =reverseans

> lettersIn' (h:t) ans | isLetter h = lettersIn' t (h:ans)

> |otherwise=reverseans

`reverse`s galore! This is an oft-recurring pattern in Haskell code that is very easily fixed with some difference list magic:

A very simply example, I grant you, for if we take a list whose elements are all in the set of the alphabet, the above code is a very complicated way of reinventing the> lettersIn word = lettersIn' word empty

> lettersIn' [] ans = toList ans

> lettersIn' (h:t) ans | isLetter h = lettersIn' t (ans `snoc` h)

> |otherwise= toList ans

`id`function. Or, put another way, who's going to teach me about

`scanl`? Hm, would I be required to use the continuation

`Monad`to escape the scan, I wonder? But you get the general idea for list construction, right? Instead of building the list by adding to the head ("cons"ing) and then applying

`reverse`to the result, the different list allows one to "

`snoc`

" each element and return the correctly ordered result. For a single list construction, the benefit may not be all that obvious, but when one builds, e.g., an XML generator for a complex internal representation, reversing the reverse of the reversed reversed child element becomes an odious labour — difference lists eliminate all the internal juggling, replacing that complication with a straightforward constant-time list and element in-place append.So, next time you need to do some list construction, instead of falling back on the

`reverse`standby, flex your new difference list muscles; you'll be pleasantly surprised.

## 8 comments:

"zippers [also called 'finger trees']"

Um. No they're not. A finger tree is a totally different data structure to a zipper. A zipper is a view into another data structure with a "special point" in that structure. A finger tree is an tree designed for sequences with fast concat, append and prepend.

Zipper: http://en.wikipedia.org/wiki/Zipper_(data_structure)

Finger tree: http://en.wikipedia.org/wiki/Finger_tree

I wonder why you didn't write

lettersIn (x:xs) | isLetter x = x : lettersIn xs

lettersIn _ = []

Or even better:

lettersIn = takeWhile isLetter

David R. MacIver:

Yes, finger trees aren't the same as zippers, but zippers are really just fingers. Finger trees could have been called zipper trees if Kaplan & Tarjan hadn't gotten there first. Their fingers are actually special points in trees, along with a path from those points to the root, just like a zipper!

I like difference lists. However, I think anything you can do with a difference list you can do with constant-time-append deques, which provide other operations as well:

Kaplan & Tarjan's simpler real-time version

Okasaki's amortized-time version

Might be worth noting that the book Concepts, Techniques, and Models of Computer Programming has a nice explanation of Difference Lists using Oz in section 3.4.4. I translated the code to Alice ML, where futures can used to get the same effect.

Prologians are always smug with their ability to create an infinite stream of 1s with X = [1|X].Yes, because breaking the logical interpretation of unification is a virtue in logic programming.

(Actually, I think that Prologians are secretly happy every time they come up with a way to do anything at all in Prolog. I digress.)

What I actually wanted to say was that I found it amusing that you used "show" rather than "shows" in an article about difference lists.

Post a Comment