Wednesday, May 28, 2008

Composing functions with Arrows

The Arrow class in Haskell is a generalization of the Monad class. Whereas a Monad type contains a value, and therefore is often used to represent state (in that the value captures the output of a function), an Arrow type represents the entire function, capturing both its inputs and outputs.

This is very Prolog-y. In Prolog, there are no functions, per se: the procedure's parameters (both input and output, if the procedure has any) are explicit in the declaration [1]. It's this total control of the procedure's environment that makes writing, e.g., aspects for Prolog a trivial exercise.

With Arrows, Haskell, which has always been friendly to function composition, receives another way of controlling data flow and composing functions. This topic is broad, diverse and general, and I am just starting down the path of exploring the Arrow class, so this entry will concentrate on only one aspect of it: data flow through a composed arrow.

The problem comes up often enough: one wishes to dispatch some input through different functions and then marry the results for disposition by another function (if this sounds like neural nets or circuits, that's because those systems do just that). The usual way of going about this work is for the programmer to shepherd the data as it flows from function to function. However, the Arrow protocol provides tools to automate this shepherding. To provide a specific example of the above described problem, we will use problem 54 from Project Euler to illustrate: which hand of poker wins (with the problem statement listing the rules of what makes a hand of poker and the ranking of hands)?

To see which hand wins, first the hands must be classified to their type: high card, pair, two pairs, three of a kind, straight, flush, full house, four of a kind, straight flush and royal flush. We will concentrate on hands that are straights, flushes, straight flushes and royal flushes [2].

Given a hand is a list of cards of the following type ...

data Face = Jack | Queen | King | Ace deriving (Eq, Ord, Enum)
data N = Two | Three | Four | Five | Six | Seven | Eight
       | Nine | Ten deriving (Eq, Ord, Enum)
data Rank = N N | Face Face deriving (Eq, Ord)

data Suit = Diamonds | Clubs | Spades | Hearts deriving Eq

data Card = Card Rank Suit deriving Eq -- see note 3

... and cards are typed as follows ...

data HighCard = High Rank deriving (Eq, Ord)

data Type = [...]
          | Straight HighCard
          | Flush HighCard Rank Rank Rank Rank
          | StraightFlush HighCard
          | RoyalFlush deriving (Eq, Ord)

... and the cards are ranked in descending order, seeing if a hand is a straight is simply a check to see if all cards are a run:

run :: MonadPlus m ⇒ [Card] → m HighCard
run (Card rank _:cards)
    = High rank |- descending (value rank)
                              (map value cards)
  where descending _ [] = True
        descending x (card:t) = x - 1 ≡ card
                                ∧ descending card t

... where value returns the value (of the rank) of the card (Two ≡ 2, Three ≡ 3, ..., King ≡ 13, Ace ≡ 14), and the |- function [pron: "implied by"] converts a value to its MonadPlus equivalent based on the result of a predicate:

(|-) :: MonadPlus m ⇒ a → Bool → m a
x |- p | p = return x
       | otherwise = mzero
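As a quick illustration of how |- behaves, here is a minimal sketch in plain ASCII Haskell. The names asMaybe and asList are hypothetical helpers introduced only for this example; the point is that the caller selects the MonadPlus type through the result type.

```haskell
import Control.Monad (MonadPlus, mzero)

-- ASCII rendering of the (|-) operator defined above.
(|-) :: MonadPlus m => a -> Bool -> m a
x |- p
  | p         = return x
  | otherwise = mzero

-- Hypothetical helpers: the same test, read back in two MonadPlus types.
asMaybe :: Int -> Maybe Int
asMaybe n = n |- even n

asList :: Int -> [Int]
asList n = n |- even n
```

Here asMaybe 4 is Just 4 and asMaybe 3 is Nothing, while asList 4 is [4] and asList 3 is the empty list.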

A flush is a hand that has all cards of the same suit ...

sameSuit :: MonadPlus m ⇒ [Card] → m Suit
sameSuit (Card _ suit:cards) = suit |- all inSuit cards
  where inSuit (Card _ s) = s ≡ suit

... or in the mode of sameSuit a flush is a hand where every other card has the same suit as the first card.

So now we have functions that determine if a hand is a flush or if it's a straight. Now, during the discussion, you may have been wondering at the return type of these functions. Why not simply return a truth value, as these are predicates on straights and flushes? If these functions tested only for straights and flushes, (mutually) exclusively, a return type of Bool would be most appropriate, but we intend to use these tests for more than just that: we will be using these simple functions to build our tests for straight flushes and royal flushes, and their return types help in that eventuality.

The strength of the Arrow class comes into its own in the following development, as well.

First off, what are straight flushes and royal flushes, conceptually? A straight flush is a straight in the same suit with a discriminating high card (for, after all, 10-9-8-7-6 beats 7-6-5-4-3). A royal flush is a straight flush with an ace as the high card. Now it is clear that the return type of the test functions above needs to carry more information than what Bool conveys.

So, we need to process the hand through both run and sameSuit, obtaining positive results from both functions, and we need to retrieve the result from run either to ensure the hand is a royal flush (because the high card must be an ace) or to fix the value of the straight flush. The Arrow protocol has a function, &&& [pron. "fan-out"], that splits an input and sends it off to two Arrow types for separate processing, so to determine a straight flush we start the test with ...

run &&& sameSuit

... which gives the following results (with the Maybe type selected to represent the particular MonadPlus):

AKQJ10, all spades    (Just (High (Face Ace)), Just Spades)
AKQJ10, mixed suits   (Just (High (Face Ace)), Nothing)
AK762, all spades     (Nothing, Just Spades)
QQQ44                 (Nothing, Nothing)

Now that we have those results (in the form of a tuple pair), we need simply to perform a logical-and over the different values to obtain a conjoined truth result and to retrieve the high card's value to determine the "blue-blooded"ness of the straight flush.

That solution gives us two problems to deal with:

  1. I'm not aware of a logical-and operation for the MonadPlus types; but that doesn't matter because:

  2. We're working with two different types of MonadPlus type values anyway.

So, I rolled my own conjunctive function, favoring the result of the run, given both tests resolve positively:

mand :: Monad m ⇒ m a → m b → m a
x `mand` y = y >> x >>= return
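To see mand at work, here is a small sketch with Maybe as the MonadPlus type. The three example values (bothPass, secondFails, firstFails) are hypothetical names made up for this illustration.

```haskell
-- ASCII rendering of mand as defined above; the trailing '>>= return'
-- is kept from the original, although the monad laws make it a no-op.
mand :: Monad m => m a -> m b -> m a
x `mand` y = y >> x >>= return

-- Hypothetical sample values: two tests of different result types.
bothPass, secondFails, firstFails :: Maybe Char
bothPass    = Just 'a' `mand` Just True
secondFails = Just 'a' `mand` (Nothing :: Maybe Bool)
firstFails  = Nothing `mand` Just True
```

Only bothPass yields a result (Just 'a'); the other two are Nothing. Note that the two operands carry different value types (Char and Bool), which is exactly the second problem listed above.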

Given the above definition of mand and introducing the composition function of the Arrow protocol, >>>, we can now write the computation that determines a straight flush and returns its high card:

run &&& sameSuit >>> uncurry mand

In fact, this pattern, f &&& g >>> uncurry mand [4], neatly captures the logic programming paradigm of goal-seeking via conjunction, so it deserves its own apt abstraction:

conjoin :: Monad m ⇒ (a → m b) → (a → m c) → a → m b
conjoin f g = f &&& g >>> uncurry mand
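The data flow through conjoin can be tried out on something simpler than cards. The following is a minimal sketch; the two goals positive and evenHalf are hypothetical and not part of the poker solution.

```haskell
import Control.Arrow ((&&&), (>>>))

mand :: Monad m => m a -> m b -> m a
x `mand` y = y >> x

-- (&&&) fans the input out to both goals; (>>>) pipes the resulting
-- pair into 'uncurry mand', which keeps the first goal's answer.
conjoin :: Monad m => (a -> m b) -> (a -> m c) -> a -> m b
conjoin f g = f &&& g >>> uncurry mand

-- Hypothetical goals, for illustration only.
positive :: Int -> Maybe Int
positive n = if n > 0 then Just n else Nothing

evenHalf :: Int -> Maybe Int
evenHalf n = if even n then Just (n `div` 2) else Nothing
```

With these definitions, (positive &&& evenHalf) 6 is the pair (Just 6, Just 3), and (positive `conjoin` evenHalf) 6 is Just 6. An input of 7 fails the second goal and -4 fails the first, so both give Nothing.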

So, now we have the tools available to discriminate a hand's type along the straight or royal flush -- the straight flush simply returns the type with the high card value ...

straightFlush :: [Card] → Maybe Type
straightFlush hand = (run `conjoin` sameSuit) hand >>=
                     return . StraightFlush

... and a royal flush is a straight flush with a guaranteed high ace:

royalFlush :: [Card] → Maybe Type
royalFlush hand = straightFlush hand >>= highAce
  where highAce kind
          = RoyalFlush |- (kind ≡ StraightFlush (High (Face Ace)))

Note the complementary rôles of conjoin and >>= [pron. bind]: conjoin marries two truth values (emphasizing the first), and >>= forwards a truth value for disposition. Due to the monadic nature of these functions, a false result (or failure) at any point in the computation aborts it, so these nondeterministic functions are very well-behaved: what is returned is either Just a straight flush or a royal flush if and only if every test is true along the way, or Nothing if even a single test fails.
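For readers who want to run the pieces together, here is a self-contained ASCII sketch of the development above. Several parts are assumptions on my side: value is written from its prose description (Two is 2 up to Ace is 14), Type is trimmed to the two constructors used here, empty-hand cases are added so the functions are total, and mkHand with the sample hands royal, low and mixed are hypothetical.

```haskell
import Control.Arrow ((&&&), (>>>))
import Control.Monad (MonadPlus, mzero)

data Face = Jack | Queen | King | Ace deriving (Eq, Ord, Enum, Show)
data N = Two | Three | Four | Five | Six | Seven | Eight | Nine | Ten
  deriving (Eq, Ord, Enum, Show)
data Rank = N N | Face Face deriving (Eq, Ord, Show)
data Suit = Diamonds | Clubs | Spades | Hearts deriving (Eq, Show)
data Card = Card Rank Suit deriving (Eq, Show)
data HighCard = High Rank deriving (Eq, Ord, Show)

-- Trimmed: only the constructors this sketch needs.
data Type = StraightFlush HighCard | RoyalFlush deriving (Eq, Ord, Show)

-- Assumed definition, written from the prose description of 'value'.
value :: Rank -> Int
value (N n)    = fromEnum n + 2   -- Two .. Ten  ->  2 .. 10
value (Face f) = fromEnum f + 11  -- Jack .. Ace -> 11 .. 14

(|-) :: MonadPlus m => a -> Bool -> m a
x |- p
  | p         = return x
  | otherwise = mzero

run :: MonadPlus m => [Card] -> m HighCard
run []                    = mzero  -- added case: an empty hand fails
run (Card rank _ : cards) =
  High rank |- descending (value rank) (map value cards)
  where
    descending _ []      = True
    descending x (c : t) = x - 1 == c && descending c t

sameSuit :: MonadPlus m => [Card] -> m Suit
sameSuit []                    = mzero  -- added case
sameSuit (Card _ suit : cards) = suit |- all inSuit cards
  where inSuit (Card _ s) = s == suit

mand :: Monad m => m a -> m b -> m a
x `mand` y = y >> x

conjoin :: Monad m => (a -> m b) -> (a -> m c) -> a -> m b
conjoin f g = f &&& g >>> uncurry mand

straightFlush :: [Card] -> Maybe Type
straightFlush hand = (run `conjoin` sameSuit) hand >>= return . StraightFlush

royalFlush :: [Card] -> Maybe Type
royalFlush hand = straightFlush hand >>= highAce
  where highAce kind = RoyalFlush |- (kind == StraightFlush (High (Face Ace)))

-- Hypothetical sample hands, highest card first.
mkHand :: Suit -> [Rank] -> [Card]
mkHand s = map (`Card` s)

royal, low, mixed :: [Card]
royal = mkHand Spades [Face Ace, Face King, Face Queen, Face Jack, N Ten]
low   = mkHand Hearts [N Seven, N Six, N Five, N Four, N Three]
mixed = Card (Face Ace) Hearts : tail royal
```

Under these assumptions royalFlush royal is Just RoyalFlush, straightFlush low is Just (StraightFlush (High (N Seven))), and both royalFlush low and straightFlush mixed are Nothing. The parentheses around the comparison in highAce avoid relying on any particular fixity for |-.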

In summary, this particular aspect of the Arrow protocol, which captures the totality of functions and simplifies the control of data flow, gives a particularly powerful and flexible set of tools to the user. Combining this flexibility with (Arrow) composition and the existing Monad and MonadPlus facilities allows a programming style as simple, declarative and powerful as the logic-programming paradigm.

[1] Further distancing Prolog procedures from Haskell functions, the outcome of a procedure is not an "output" but its validity, or truth value -- how very un-Haskell-y!
[2] But seriously! A royal flush? How many of you have ever drawn that hand? There should be some special rule for drawing a royal flush, like: you get a life-time supply of marzipan loaded into the trunk of your brand new Bentley parked at your vacation home in St. Croix.
[3] Note that in poker there is no precedence of suits (unlike in other card games, like Contract Bridge), so Suit is not an Ord type. This means that, since the Card type contains a Suit value, it, too, is not ordered ... if it were, then the situation would arise where a high card hand of 10-9-... would lose to a high card hand of 10-6-..., which is patently not within the rules of poker.
[4] In fact, the pattern foo &&& bar >>> uncurry quux deserves and has its own article which very prettily describes and illustrates graphically what this pattern does.

Monoid use

I've been reading a bit about Monoids, which are a generalization of the MonadPlus specialization of the Monad class. The generalization of Monoids is that whereas a particular Monad type carries some (hidden) value, a Monoid type has no such requirement.

Firstly, what is a Monoid type? It is something that has an empty representation (mempty :: a) and has the property that allows two values to be joined (appended) to form a new value (mappend :: a → a → a). This protocol sounds very much like the one for MonadPlus types, doesn't it? It does, indeed, and with good reason, for every MonadPlus type is also a Monoid type:

instance MonadPlus m ⇒ Monoid (m a) where
  mempty = mzero
  mappend = mplus

What are some examples of Monoid? Well, of course, the types that are MonadPlus types, like Maybe and specified list types (but with an important caveat: the generalized list container must be specialized to contain some particular type a). But besides these, Monoid gives us more types that do not fit within the MonadPlus protocol, for example, and importantly, integrals under addition or multiplication.
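A small sketch may make the examples concrete. It uses the Sum and Product wrappers from Data.Monoid to select addition or multiplication for numbers; the names total, product4 and joined are hypothetical.

```haskell
import Data.Monoid (Product (..), Sum (..))

-- Integers under addition: mempty is Sum 0, mappend adds.
total :: Int
total = getSum (mconcat (map Sum [1, 2, 3, 4]))

-- Integers under multiplication: mempty is Product 1, mappend multiplies.
product4 :: Int
product4 = getProduct (mconcat (map Product [1, 2, 3, 4]))

-- Lists: mempty is [], mappend is concatenation.
joined :: [Int]
joined = mconcat [[1, 2], [], [3]]
```

Here total is 10, product4 is 24 and joined is [1,2,3]; the same mconcat serves all three types.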

Ah! So suppose one were, say, doing an operation with some kind of list and then doing the exact same kind of thing, but simply marking results (counting, as [brilliantly!] demonstrated by the Peano series, is operating on integers under addition). Without the Monoid abstraction, we would need to write two separate procedures (in the infamous style of copy-and-waste, er, -paste) to work with the two separate data types, but with Monoid we can have simply one procedure working with the Monoid type, giving us some measure of polytypic expressiveness. This brings us to where we were at the end of the previous entry on factorization, where we had developed a unifying function for factorization (where its rôle changed depending on what seed value it was given: it returned the factors when given a list or returned a count of the factors when given an integer). Recall that our helper function, mkAdder, needed to be supplied with a translator function for the individual values processed and a concatenator function to tally those translated values:

showfactors x = factors x (mkAdder return mplus) []
countfactors x = factors x (mkAdder (const 1) (+)) 0

The first using function, showfactors, seized the advantage that lists are MonadPlus types, but the second, countfactors, could not do the same because integers are not monadic (they are not (polymorphic) containers), so this function had to provide its own versions of translation and concatenation.

This problem goes away, however, since both these particular types are Monoid types, right? Yes, well, there's the issue of how to enforce this relation -- an unspecialized list is not a Monoid type, nor are integers, in general. For this particular case, we must demonstrate that our particular data are of the Monoid class (a list of integers for the former and integers under addition for the latter).

So, what needs to be done is that these types need to be declared instances of the Monoidic domain by injecting their values into that domain (thanks to Dirk Thierbach's explanation):

class Injector m where
  inject :: Integral a ⇒ a → m a

instance Injector [] where
  inject x = [x]

instance Injector Sum where
  inject x = Sum 1

The above instance declarations do just that, making an injector on generalized lists one that translates some integral value into a specified (i.e. monoidic) list and making an injector on (generalized) sums one that translates some integral value (again) into a specified (again, monoidic) sum -- in this case the value is always translated into 1, because we are summing the number of factors (each factor adding 1 to the total number of factors), not the value of factors.

With these instances, we can now employ the protocol of Monoid to generalize the mkAdder function. Recall its definition ...

mkAdder :: Integral a ⇒
           (a → b) → (b → b → b) → a → a → (b → b)
mkAdder f g y z = g (if y ≡ z
                     then f y
                     else g (f y) (f z))

... where f is the translation function and g is the concatenation function. So, what mkAdder does is to provide a concatenator for either just one of the values (if both values are the same) or for both values.

Now, we add the properties of the Injector as well as those of the Monoid to get a new mkAdder function that can stand alone, needing neither a translator nor a concatenator to be provided from using functions ...

mkAdder :: (Injector m, Monoid (m a), Integral a) ⇒
           a → a → (m a → m a)
mkAdder y z = mappend (if y ≡ z
                       then inject y
                       else inject y `mappend` inject z)

... where the generic functions f and g are replaced, respectively, by the inject function from the Injector class and the mappend function from the Monoid class. Note, also, that the type-relation that was unspecific in the previous version, (a → b), now has an explicit relation, (a → m a), thanks to the relation of the types between the Injector and Monoid classes. This relationship gives us a weak equivalent of a functional dependency. With this change in place, the using functions no longer need to specify these functions, so they, in turn, simplify to:

showfactors x = factors x mkAdder []
countfactors x = getSum (factors x mkAdder (Sum 0))
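Putting the pieces of this entry together, the following is a runnable ASCII sketch. The factors function is my rendering of the fix-point version from the factorization entry, and the concrete Integer signatures on showfactors and countfactors are an assumption added so the example type-checks without annotations at the call site.

```haskell
import Data.Monoid (Sum (..))

class Injector m where
  inject :: Integral a => a -> m a

instance Injector [] where
  inject x = [x]

instance Injector Sum where
  inject _ = Sum 1  -- every factor counts as one

mkAdder :: (Injector m, Monoid (m a), Integral a) => a -> a -> (m a -> m a)
mkAdder y z = mappend (if y == z then inject y else inject y `mappend` inject z)

-- Rendering of the generalized 'factors' from the factorization entry.
factors :: Integral a => a -> (a -> a -> b -> b) -> b -> b
factors x adder seed = go 2 (x, adder 1 x seed)
  where
    go y (top, acc)
      | y >= top  = acc
      | otherwise = step (x `divMod` y) y (top, acc)
    step (newtop, 0) y (_, acc) = go (y + 1) (newtop, adder y newtop acc)
    step _           y res      = go (y + 1) res

showfactors :: Integer -> [Integer]
showfactors x = factors x mkAdder []

countfactors :: Integer -> Integer
countfactors x = getSum (factors x mkAdder (Sum 0))
```

With this sketch showfactors 12 is [3,4,2,6,1,12] (unsorted, in discovery order) and countfactors 12 is 6; the square 25 gives [5,1,25] and 3 respectively.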

So, if you find yourself doing the same thing to very different types, and not all of them are monadic, then perhaps the monoidic form may give the generality to encode those disparate types under one solution.

Saturday, May 24, 2008

Optimizing Factorization

I've been working my way through the problems hosted by Project Euler. Usually, using a naïve approach coupled with efficient nondeterminism suffices (oh! don't other problem solvers wish they had the MonadPlus in their semantics when tackling the now-trivial pandigital problems!), but I ran into a snag when I was factoring large triangular numbers.

The problem was my naïveté, of course.

My original version of factors proceeded thusly ...

factors :: Integral a ⇒ a → [a]
factors x = filter (λy . x `rem` y ≡ 0) [1..x]

... which is straightforward enough. One can't get any simpler than: "Keep ('filter') all the numbers from 1 to x that divide x with no remainder"!

Simple? Yes. Naïve? Yes. For small numbers, this factorization algorithm did just fine: even for the 384th triangle, 73920, it found the 112 factors in a blink. The problem appeared when doing a (triangular) search along the natural number line for an unknown, large triangle. The factorization algorithm is linear, and combining it with a (nearly) linear search makes the entire activity quadratic. Twenty-six hours after I began my search ("find the first triangular number with more than 500 factors") the algorithm was still going with no solution.
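The naïve version is short enough to try directly. In this ASCII sketch the triangle helper is an assumption (the entry uses it without showing it); I take it to be the usual n(n+1)/2.

```haskell
-- Keep every number from 1 to x that divides x with no remainder.
factors :: Integral a => a -> [a]
factors x = filter (\y -> x `rem` y == 0) [1 .. x]

-- Assumed helper: the n-th triangular number.
triangle :: Integral a => a -> a
triangle n = n * (n + 1) `div` 2
```

For instance, triangle 384 is 73920 and length (factors (triangle 384)) is 112, matching the figures quoted above. Every call walks the whole range [1..x], which is what makes the search quadratic overall.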

There must be a better way to go about this. And, of course, there are more than a few ways, some of which you can pay for, but none at the top of the search tree appeared simple enough to comprehend and to implement in a reasonable time.

Fortunately, there also exists an obvious sub-linear solution: if we are given in the linear progression [1..x] at the current index, y, that y is a factor of x, then obviously ...

z where z = x / y

... is also a factor. Not only that, but, since we are progressing linearly, z now becomes the upper bound in the factorization algorithm. The fix-point of the algorithm described gives us a new, much more efficient, factorization ...

factors x = factors' 2 (x, [1,x])
  where factors' y res@(top, ans) | y ≥ top = ans
                                  | otherwise = factors'' (x `divMod` y) y res
        factors'' (newtop, 0) y (_, ans)
            = factors' (succ y) (newtop, y:newtop:ans)
        factors'' _ y res
            = factors' (succ y) res

... Much more efficient? Yes. Self-describing? Not so much. Also, there's now a subtle error introduced by assuming all factors are unique, for ...

> factors 25 → [5,5,1,25]

... duplicates are introduced when factoring squares. Testing uniqueness adds its own complexity:

factors x = factors' 2 (x, [1,x])
  where factors' y res@(top, ans) | y ≥ top = ans
                                  | otherwise = factors'' (x `divMod` y) y res
        factors'' (newtop, 0) y (_, ans)
            = factors' (succ y) (newtop, addin y newtop ans)
        factors'' _ y res
            = factors' (succ y) res
        addin x y list | x ≡ y = x:list
                       | otherwise = x:y:list

Complexity on top of complexity! This goes against my grain (as well as Richard O'Keefe's as you see in the side-bar quote), but is it worth it?

> sort (factors (triangle 12375)) → [1,2,3,...572 others...,76576500]

Yes, as the above interaction took no noticeable time, whereas before no solution was available after a day's worth of computation. So this optimized factorization algorithm is "good enough" for the task at hand, for I did solve the problem using that algorithm.
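Here is a runnable ASCII sketch of the optimized algorithm with the uniqueness test. It pairs each factor y with the quotient it was just found with, and that quotient also becomes the new upper bound. The seed [1, x] is kept from the version above, so the degenerate input 1 yields [1,1].

```haskell
factors :: Integral a => a -> [a]
factors x = go 2 (x, [1, x])
  where
    -- Stop once the candidate reaches the smallest quotient seen so far.
    go y res@(top, ans)
      | y >= top  = ans
      | otherwise = step (x `divMod` y) y res
    -- Remainder 0: y and its quotient are both factors; lower the bound.
    step (newtop, 0) y (_, ans) = go (succ y) (newtop, addin y newtop ans)
    step _           y res      = go (succ y) res
    -- A square root pairs with itself, so add it only once.
    addin a b list
      | a == b    = a : list
      | otherwise = a : b : list
```

With this sketch factors 25 is [5,1,25], without the duplicate, and factors 12 is [3,4,2,6,1,12].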

But, returning to the fix-point: it does more than generalize recursive algorithms -- it can be used to think of algorithms, qua algorithms, more generally. In this particular case, I have no desire to view every one of the 576 factors of this triangular number. No, I simply wish to ensure that there are more than 500 factors, no matter what those factors are. So, in the large, something like a countfactors (that simply returns the number of factors, instead of the list of all the factors) is more desirable. How are we to go about writing such an algorithm? Most coders, I regret to observe, would punt to the copy-and-paste style of coding, as the desired change is buried deeply within the algorithm. Not only that, but the type signature of the function varies with our alternatives: in what we've developed so far, the algorithm returns a list, but what we wish to have instead is (only one) number.

Fortunately, Dirk Thierbach introduced me to this particular use of the fix-point -- we simply extract the engine of computation into an external function. So now we have two functions that collapse into one line each ...

showfactors :: Integral a ⇒ a → [a]
showfactors x = factors x (λ y z ans . if y ≡ z then y:ans else y:z:ans) []

countfactors :: Integral a ⇒ a → a
countfactors x = factors x (λ y z ans . if y ≡ z then 1 + ans else 2 + ans) 0

... with factors being (slightly) modified to accept the new functional argument and base case:

factors :: Integral a ⇒ a → (a → a → b → b) → b → b
factors x adder ans = factors' 2 (x, adder 1 x ans)
  where factors' y (tot, ans) | y ≥ tot = ans
                              | otherwise = factors'' (x `divMod` y) y (tot, ans)
        factors'' (tot, 0) y (_, ans)
            = factors' (y+1) (tot, adder y tot ans)
        factors'' _ y res
            = factors' (y+1) res

With this approach, the return type depends on the calling function's use and is threaded throughout the utility function, factors, by the adder computation engine.
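The following ASCII sketch shows the extracted engine with both callers, so the effect of swapping the adder and the seed can be checked directly. The local names go and step are mine; they play the rôles of factors' and factors''.

```haskell
factors :: Integral a => a -> (a -> a -> b -> b) -> b -> b
factors x adder seed = go 2 (x, adder 1 x seed)
  where
    go y (top, acc)
      | y >= top  = acc
      | otherwise = step (x `divMod` y) y (top, acc)
    step (newtop, 0) y (_, acc) = go (y + 1) (newtop, adder y newtop acc)
    step _           y res      = go (y + 1) res

-- Seeded with a list, the engine collects the factors ...
showfactors :: Integral a => a -> [a]
showfactors x = factors x (\y z ans -> if y == z then y : ans else y : z : ans) []

-- ... and seeded with a number, it merely counts them.
countfactors :: Integral a => a -> a
countfactors x = factors x (\y z ans -> if y == z then 1 + ans else 2 + ans) 0
```

With these definitions showfactors 12 is [3,4,2,6,1,12] and countfactors 12 is 6, from one and the same traversal.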

One final note: the λ-terms in both calling functions showfactors and countfactors are also of the same structure, so we can again perform surgery, extracting the engine from the structure:

mkAdder :: Integral a ⇒ (a → b) → (b → b → b) → a → a → (b → b) 
mkAdder f g y z = g (if y ≡ z then f y else g (f y) (f z))

Now, anyone who's been working with monads for more than a brief period will see the b type as monadic and the functional type (a → b) as a lifting function (return) and the composition function, g, as monadic addition. And, as it turns out, this concept fits very nicely with the new implementation ...

showfactors x = factors x (mkAdder return mplus) []
countfactors x = factors x (mkAdder (const 1) (+)) 0

With these new implementations, how much does factors need to change? Not one bit. Functional purity: why can't all programming languages have this? And with this generalization, I was able to achieve the desired result:

> countfactors (triangle 12375) → 576
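To see what mkAdder builds in each rôle, here is a small stand-alone sketch. The names listAdder and countAdder are hypothetical; they are just the two partial applications used by showfactors and countfactors, pinned to Int for the example.

```haskell
import Control.Monad (mplus)

mkAdder :: Integral a => (a -> b) -> (b -> b -> b) -> a -> a -> (b -> b)
mkAdder f g y z = g (if y == z then f y else g (f y) (f z))

-- Lift each factor with return and join with mplus: collects factors.
listAdder :: Int -> Int -> [Int] -> [Int]
listAdder = mkAdder return mplus

-- Translate each factor to 1 and join with (+): counts factors.
countAdder :: Int -> Int -> Int -> Int
countAdder = mkAdder (const 1) (+)
```

So listAdder 2 6 [1,12] is [2,6,1,12] and listAdder 5 5 [1,25] is [5,1,25], while countAdder 2 6 2 is 4 and countAdder 5 5 2 is 3.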

Now it is known that both lists of some type (in this case integers) and also integers under addition are monoids. Given that, mkAdder can be further simplified. Also, the if-then-else is easily replaced by the Either type and proper use of arrows, but these are discussions for another day.

Friday, May 16, 2008

Guarded Choice with MonadPlus

In the previous article, I introduced the MonadPlus class and three examples of monads that allow for non-determinism in programming (Maybe and the list data type, both of which are MonadPlus types, and Either, which can be coerced into a MonadPlus type). These types were introduced, but besides showing (unexplained) examples and minimal explanation of the Maybe lookup example, there is not much there to show how to program in a declarative nondeterministic manner. Let's rectify that. First, we'll show how to program nondeterministically and narrow the options down with guard. We will be using the standard nondeterministic "Hello, world!" problem, that is: solving the cryptarithmetic problem ...

  SEND
+ MORE
------
 MONEY

... by iteratively improving the efficiency of the solution.

First up, list comprehension is a powerfully expressive programming technique that so naturally embodies the nondeterministic programming style that users often don't know they are programming nondeterministically. A list comprehension is of the form:

[ x | qualifiers on x]
where x represents each element of the generated list, and the qualifiers either generate or constrain values for x

Given the above definition of list comprehension, writing the solution for our cryptarithmetic problem becomes almost as simple as writing the problem itself:

[(s,e,n,d,m,o,r,e,m,o,n,e,y) | s ← digit, e ← digit, n ← digit,
                               d ← digit, m ← digit, o ← digit,
                               r ← digit, y ← digit,
                               s * 1000 + e * 100 + n * 10 + d
                               + m * 1000 + o * 100 + r * 10 + e
                               ≡ m * 10000 + o * 1000 + n * 100
                                 + e * 10 + y]
  where digit = [0..9]

Easy, but when run, we see that it's not really what we needed, for the answer is ...

[(0,0,0,0,0,0,0,0,0,0,0,0,0), ...

... and 1153 others. No, we wish to have SEND + MORE = MONEY such that S and M aren't zero and that all the letters represent different digits, not, as was the case in the first solution, all the same digit (0). Well, whereas we humans can take some obvious constraints by implication, software must be explicit, so we need to code that S and M are strictly positive (meaning, "greater than zero") and that all the letters are different from each other. Doing that, we arrive at the more complicated, but correct, following solution ...

[(s,e,n,d,m,o,r,e,m,o,n,e,y) | s ← digit, s > 0,
                               e ← digit, n ← digit, d ← digit,
                               m ← digit, m > 0,
                               o ← digit, r ← digit, y ← digit,
                               different [s,e,n,d,m,o,r,y],
                               num [s,e,n,d] + num [m,o,r,e]
                               ≡ num [m,o,n,e,y]]
  where digit = [0..9]
        num = foldl ((+).(*10)) 0
        different (h:t) = diff' h t
        diff' x [] = True
        diff' x lst@(h:t) = all (/= x) lst && diff' h t

A bit of explanation -- the function num folds the list of digits into a number. Put another way ...

num [s,e,n,d] ≡ ((s * 10 + e) * 10 + n) * 10 + d

... and the function different, via the helper function diff', ensures that every element of the argument list is (not surprisingly) different -- a translation of diff' is ...

diff' x [] = True "A list is 'different' if there is only one number"
diff' x lst@(h:t) = all (≠ x) lst && diff' h t "A list is 'different' if one of the numbers is different than every other number in the list and if this is true for all the numbers in the list"

... and after a prolonged period [434 seconds], it delivers the answer:

[(9,5,6,7,1,0,8,5,1,0,6,5,2)]

Okay! We now have the solution, so we're done, right? Well, yes, if one has all that time to wait for a solution and is willing to do the waiting. However, I'm of a more impatient nature: the program can be faster; the program must be faster. There are a few ways to go about doing this, and they involve providing hints (sometimes answers) to help the program make better choices. We've already done a bit of this with the constraints for both S and M to be positive and adding the requirement that all the letters be different digits. So, presumably, the more hints the computer has, the better and faster it will be in solving this problem.

Knowing the problem better often helps in arriving at a better solution, so let's study the problem again:

  SEND
+ MORE
------
 MONEY

The first thing that strikes me is that in MONEY, the M is free-standing -- its value is the carry from the addition of the S from SEND and the M from MORE. Well, what is the greatest value for the carry? If we maximize everything, assigning 8 and 9 to S and M, we find the carry can be at most 1, even if there's carry over (again, of at most 1) from adding the other digits: 9 + 8 + 1 = 18. That means M, since it is not 0, must be 1.

What about S, can we narrow its value? Yes, of course. Since M is fixed to 1, S must be of a value that carries 1 over to M. That means it is either 9, if there's no carry from the addition of the other digits, or 8, if there is. And what of O? The thousands column reads S + M + carry = 10 + O, so with S being 8 or 9, O can only be 0 or 1. But O cannot be 1 (as M has taken that value for itself), so it turns out that there's only one value for O to be: 0! We've fixed two values and limited one letter to one of two values, 8 or 9. Let's provide those constraints ("hints") to the system.

But before we do that, our list comprehension is growing larger with these additional constraints, so let's unwind into an alternate representation that allows us to view the smaller pieces individually instead of having to swallow the whole pie of the problem in one bite. This alternative representation uses the do-notation, with constraints defined by guards.

A guard is of the following form:

guard :: MonadPlus m ⇒ Bool → m ()

What does that do for us? Recall that MonadPlus types have a base value (mzero) representing failure and other values, so guard translates the input Boolean constraint into either mzero (failure) or into a success value. Since the entire monadic computation is chained by >>=, a failure of one test voids that entire branch (because mzero propagates through every later step of that branch).
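A tiny sketch shows guard pruning branches in two MonadPlus types. The functions evenOnly and pairs are hypothetical examples, unrelated to the puzzle.

```haskell
import Control.Monad (guard)

-- In Maybe, a failed guard turns the whole computation into Nothing.
evenOnly :: Int -> Maybe Int
evenOnly n = do
  guard (even n)
  return n

-- In the list monad, a failed guard drops just that (x, y) branch.
pairs :: [(Int, Int)]
pairs = do
  x <- [1 .. 4]
  y <- [1 .. 4]
  guard (x < y && x + y == 5)
  return (x, y)
```

Here evenOnly 4 is Just 4, evenOnly 3 is Nothing, and pairs is [(1,4),(2,3)]: sixteen branches are generated and fourteen are voided by the guard.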

So, now we are armed with guard, we rewrite the solution with added constraints in the new do-notation.

do let m = 1
       o = 0
   s ← digit
   guard $ s > 7
   e ← digit
   n ← digit
   d ← digit
   r ← digit
   y ← digit
   guard $ different [s,e,n,d,m,o,r,y]
   guard $ num [s,e,n,d] + num [m,o,r,e] ≡ num [m,o,n,e,y]
   return (s,e,n,d,m,o,r,e,m,o,n,e,y)
  where digit = [2..9]

Besides the obvious structural difference from the initial simple solution, we've introduced some other new things --

  • When fixing a value, we use the let-construct.

  • As we've grounded M and O to 1 and 0 respectively, we've eliminated those options from the digit list.

  • Since the do-notation works with monads in general (it's not restricted to lists only), we need to make explicit our result. We do that with the return function at the end of the block.

What do these changes buy us?

[(9,5,6,7,1,0,8,5,1,0,6,5,2)] returned in 0.4 seconds
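For anyone wishing to reproduce this, the block above can be written out as a complete ASCII module along the following lines. The top-level name solutions, the explicit Int types and the empty-list case of different are my additions; everything else follows the code above.

```haskell
import Control.Monad (guard)

-- Fold a list of digits into a number: num [s,e,n,d] is s*1000 + ... + d.
num :: [Int] -> Int
num = foldl ((+) . (* 10)) 0

-- True when no digit occurs twice.
different :: [Int] -> Bool
different []      = True  -- added case, not needed by the puzzle
different (h : t) = diff' h t
  where
    diff' _ []           = True
    diff' x lst@(y : ys) = all (/= x) lst && diff' y ys

solutions :: [(Int, Int, Int, Int, Int, Int, Int, Int, Int, Int, Int, Int, Int)]
solutions = do
  let m = 1  -- fixed: M is the carry
      o = 0  -- fixed: the only value left for O
  s <- digit
  guard (s > 7)
  e <- digit
  n <- digit
  d <- digit
  r <- digit
  y <- digit
  guard (different [s, e, n, d, m, o, r, y])
  guard (num [s, e, n, d] + num [m, o, r, e] == num [m, o, n, e, y])
  return (s, e, n, d, m, o, r, e, m, o, n, e, y)
  where
    digit = [2 .. 9]  -- 0 and 1 are already taken by O and M
```

Evaluating solutions yields the single answer shown above, 9567 + 1085 = 10652; timings will of course differ by machine.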

One thing one learns quickly when doing logic, nondeterministic, programming is that the sooner a choice is settled correctly, the better. By fixing the values of M and O we not only entirely eliminate two lines of inquiry but also eliminate two options from all the other following choices, and by refining the guard for S we eliminate all but two options when generating its value.

In nondeterministic programming, elimination is good!

So, we're done, right? Yes, for enhancing performance: once we're in sub-second territory, further optimization becomes unnecessary. So, in that regard, we are done. But there is some unnecessary redundancy in the above code from a logical perspective -- once we generate a value, we know that we are not going to be generating it again. We know this, but digit, being the amb operator, doesn't: it regenerates that value, and the discrepancy is corrected only later in the computation, when it encounters the different guard.

We need the computation to work a bit more like we do: it needs to remember what it has already chosen and not choose that value again. We've already used memoization when we implemented the Fibonacci sequence and the Ackermann function with the State monad; so let's incorporate that into our generator here.

What we need is for our amb operator to select from the pool of digits, but when it does so, it removes that selected value from the pool. In a logic programming language, such as Prolog, this is accomplished easily enough as nondeterminism and memoization (via difference lists) are part of the language semantics. A clear way of dissecting this particular problem was presented to me by Dirk Thierbach in a forum post on comp.lang.haskell, so I present his approach in full:

  • I need both state and nondeterminism, so I have to combine the state monad and the list monad. This means I need a monad transformer and a monad (you need to have seen this before, but if you have once, it's easy to remember).

  • The state itself also has to be a list (of candidates).

  • So the final monad has type StateT [a] [] b.

  • I need some function to nondeterministically pick a candidate. This function should also update the state.

  • Played around a short time with available functions, didn't get anywhere.

  • Decided I need to go to the "bare metal".

  • Expanded StateT [a] [] a into [a] → [(a,[a])], then it was obvious what choose should do.

  • Decided the required functionality "split a list into one element and rest, in all possible ways" was general enough to deserve its own function.

  • Wrote it down, in the first attempt without accumulator.

  • Wrote it down again, this time using an accumulator.

With this approach presented, writing the implementation simply follows the type declaration:

splits :: Eq a ⇒ [a] → [(a, [a])]
splits list = list >>= λx . return (x, delete x list)

Please do note that, although this implementation differs significantly from Dirk's, they both accomplish the same result. Now we lift this computation into the State monad transformer (transformers are a topic covered much better elsewhere) ...

choose :: Eq a ⇒ StateT [a] [] a
choose = StateT $ λs . splits s

... and then replace the (forgetful) digit generator with the (memoizing) choose (which then eliminates the need for the different guard) to obtain the same result with a slight savings of time [the result returned in 0.04 seconds]. By adding these two new functions and lifting the nondeterminism into the StateT we not only saved an imperceptible fraction of a second (my view is that optimizing performance on sub-second computations is silly), but, importantly, we eliminated more unnecessary branches at the nondeterministic choice-points.
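The final program is not shown in full above, so here is one way it might look, as a sketch under stated assumptions: StateT is taken from Control.Monad.Trans.State in the transformers package, the pool of candidates is seeded with [2..9] (mirroring the earlier digit list), and the top-level name solutions is mine.

```haskell
import Control.Monad (guard)
import Control.Monad.Trans.State (StateT (..), evalStateT)
import Data.List (delete)

-- Split a list into one element and the rest, in all possible ways.
splits :: Eq a => [a] -> [(a, [a])]
splits list = list >>= \x -> return (x, delete x list)

-- Pick a candidate nondeterministically and remove it from the pool.
choose :: Eq a => StateT [a] [] a
choose = StateT splits

num :: [Int] -> Int
num = foldl ((+) . (* 10)) 0

solutions :: [(Int, Int, Int, Int, Int, Int, Int, Int, Int, Int, Int, Int, Int)]
solutions = flip evalStateT [2 .. 9] $ do
  let m = 1
      o = 0
  s <- choose
  guard (s > 7)
  e <- choose
  n <- choose
  d <- choose
  r <- choose
  y <- choose
  -- No 'different' guard: choose never hands out the same digit twice.
  guard (num [s, e, n, d] + num [m, o, r, e] == num [m, o, n, e, y])
  return (s, e, n, d, m, o, r, e, m, o, n, e, y)
```

For reference, splits [1,2,3] is [(1,[2,3]),(2,[1,3]),(3,[1,2])], and solutions again evaluates to [(9,5,6,7,1,0,8,5,1,0,6,5,2)].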

In summary, this entry has demonstrated how to program with choice using the MonadPlus class. We started with a simple example that demonstrated (naïve) nondeterminism, then improved on that example by pruning branches and options with the guard helper function. Finally, we incorporated the technique of memoization here that we exploited to good effect in other computational efforts to prune away redundant selections. The end result was a program that demonstrated declarative nondeterministic programming not only fits in the (monadic) idiom of functional programming but also provides solutions efficiently and within acceptable performance measures.

Wednesday, May 14, 2008

Choice with Monads: List, Maybe, Either

We've seen up to now what monads are and how they can be useful in simple ways, such as for heavy lifting on recursive addition. So, monads are useful for housekeeping when you have more work than a computer could handle in a straightforward manner. This work is called deterministic, meaning that the computation occurs in only one way. As we have seen, monads can be helpful with this kind of computation. But monads can be helpful with nondeterministic computations, or computations that proceed along multiple possible paths, as we shall see.

Haskell comes with three kinds of monads that have been used specifically for nondeterministic computation: the Maybe monad, the list data type and, a new one, the Either monad.

We saw the first one in the previous post: the Maybe monad. This monad type has two forms: Nothing and Just x (where x is the specific value of the computation). The Maybe monad is illustrated by the two dialogues below:

Scenario 1
Waiter: How is the pork chop, can I get you anything to go with that?
Custamah: Oh, Nothing for me, thanks.
Waiter: Wonderful, enjoy your meal.
Scenario 2
Waiter: How is the pork chop, can I get you anything to go with it?
Custamah: Oh, Just a small bowl of applesauce, please?
Waiter: Sure, I'll bring that right out.

The waiter in the above two scenarios doesn't know exactly what the customer will want, but that waiter is pretty sure the customer will ask for Nothing or for Just something, and these options describe the Maybe monad type.

Another example of this kind of monad is the list data type. But whereas the Maybe monad allows two options (the answer or failure), the list data type (a monad) allows multiple answers (including no answers, which is represented by the empty list). These kinds of monads form a protocol called the MonadPlus class, just as the more general monad data types form the more general protocol of the Monad class, and just like regular monads, conform to a set of laws.

First, let us specify and explain what the MonadPlus protocol is. All MonadPlus types must have the following two properties defined:

mzero :: m a — the base, or usually interpreted as fail, value; and,
mplus :: m a → m a → m a — a function that chooses a success value when offered two values

For the Maybe MonadPlus type the above properties are defined as follows:

mzero = Nothing
Nothing `mplus` b = b
a `mplus` b = a

In other words, Nothing is the failure case, and mplus tries to choose a non-Nothing value (roughly: "If a is Nothing, pick b; otherwise pick a." Here's a question for you: what happens when both a and b are Nothing, and for what reason?) Note the interesting semantics of mplus — it is not at all addition, as we expect, for:

Just 3 `mplus` Just 4 = Just 3

Recall that if we wish to do monadic addition, we need to define such an operator.

madd :: (Monad m, Num a) ⇒ m a → m a → m a
madd = liftM2 (+)
Just 3 `madd` Just 4 = Just 7

So, now madd has the triple meaning here: it is not mplus (which is not addition), it is addition for monads containing numbers, and it either heightens awareness or annoys the cause of "MADD". Got all that?
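As a quick check of the definition above, here is madd as a small loadable module (plain ASCII syntax; nothing here goes beyond Control.Monad from the base library). The same function works for any monad holding numbers, so it is shown for both Maybe and lists in the note below.

```haskell
import Control.Monad (liftM2)

-- Monadic addition: add the numbers carried by two monadic values.
madd :: (Monad m, Num a) => m a -> m a -> m a
madd = liftM2 (+)
```

With Maybe, a Nothing on either side makes the whole sum Nothing; with lists, madd [1, 2] [10, 20] adds every combination, giving [11, 21, 12, 22].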

The Maybe type has a special handler, called maybe. Its type signature is:

maybe :: b → (a → b) → Maybe a → b

What does this function do? Well, we've already seen it in action with the monadic Ackermann and Fibonacci solutions. One can read the arguments from right to left, to get the feel of an if-then-else: if the last argument is Just a, then pass a to the second argument (which is a function that converts an a to the proper return type); else execute the first argument. A very compact and useful function when working with Maybe types.
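To make the if-then-else reading concrete, here is a tiny sketch built on the waiter dialogues from earlier. The function describe is a made-up name for illustration only.

```haskell
-- First argument of maybe: the "else" value, used for Nothing.
-- Second argument: the "then" function, applied to the value inside Just.
describe :: Maybe String -> String
describe = maybe "nothing extra" (\side -> "a side of " ++ side)
```

So describe (Just "applesauce") takes the function branch, and describe Nothing falls back to the default.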

The second most commonly used data type for non-deterministic computation is the list MonadPlus data type. It has an interesting variation from the Maybe definition:

mzero = []
mplus = (++)

In other words, the empty list ([]) is the base (failure) case, and mplus here actually is addition ('concatenation', to be technically correct); addition, that is, in the list-sense. But it all works out, particularly when it comes to the base cases, for:

[3] `mplus` [] = [3]
Just 3 `mplus` Nothing = Just 3

But, on the other hand, mplus is different when handling non-base cases for the Maybe and list monad types, for:

[3] `mplus` [4] = [3, 4]
Just 3 `mplus` Just 4 = Just 3

But this difference is consistent with the different types: the list monad allows for multiple solutions, whereas the Maybe monad allows only one.
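The contrast can be seen by writing one expression against the MonadPlus protocol and then asking for it at each type. This is only a sketch; offer is a hypothetical name.

```haskell
import Control.Monad (MonadPlus, mplus, mzero)

-- Two alternatives followed by a failure, written once for any MonadPlus.
offer :: MonadPlus m => m Int
offer = (return 3 `mplus` return 4) `mplus` mzero
```

Asked for as a Maybe Int, offer settles on the first success; asked for as an [Int], it keeps every answer.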

The list data type has too many special functions associated with it to review in this post. I recommend a review of the Haskell online report to get a taste of list's rich functionality, and then read Eric Kidd's post on backtracking with monads for some insights into using list monads in nondeterministic programming.

The third data type that is used, albeit less frequently, for non-deterministic computation is the Either data type. Its structure is as follows:

data Either a b = Left a | Right b

The way Either operates is that it offers a mutually-exclusive choice. For example, little Isabel sits to my Left and her até Elena Marie sits to my Right, so at 4 p.m. I must choose Either one to serve tea first: Left Isabel or Right ElenaMarie.

The interesting distinction of the Either monad to MonadPlus types such as the list data type and the Maybe monad is that both options are weighed equally, or, more to the point, neither is considered to be the base case. This means that Either, qua Either, is not in the MonadPlus class. With this caveat, can the Either type be used for non-deterministic computation? Yes, absolutely!

Not only can the Either type be used in its basic monadic form, but it also can be coerced into the MonadPlus class. How? It's simple, really. By simply choosing one of the branches to be the base (the Haskell library designers chose Left), the Either type now conforms to that protocol. The convention assigns the error message (a String) to the Left and the value sought is assigned to the Right one. This rather reduces Either to a glorified, error-handling, Maybe, and that is how it is used in every-day Haskell code for the most part.

The Either monad also has a special handler, either, with the type signature of:

either :: (a → c) → (b → c) → Either a b → c

This function is in the same vein as the Maybe handler, but complicated by the fact that maybe has only one (success) type to handle, whereas this function has two possible types it deals with — either's type translates as: if the answer from the third argument (Either a b) is Left a, then feed a to the first argument (a function that converts the input value of type a to the output of type c), but if the answer from the third argument is of type Right b, then feed b to the second argument (a function that converts the input value of type b to the output of type c).
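Here is a small sketch of the error-on-the-Left convention together with the either handler. Both safeDiv and report are hypothetical names chosen for this example.

```haskell
-- Convention: Left carries an error message, Right carries the value sought.
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)

-- either feeds a Left to the first function and a Right to the second.
report :: Either String Int -> String
report = either (\msg -> "error: " ++ msg) (\n -> "result: " ++ show n)
```

Both handlers must agree on the output type (here String), which is the c in either's signature.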

What we've seen in this entry is an introduction to the MonadPlus class and three examples of monads that allow for choice (Maybe, the list data type and Either), along with an example for each that demonstrated its ability to code with choice.

The next entry will further explore the MonadPlus class and some of its powerful functions, such as msum and guard, and how the MonadPlus class allows us to code in a declarative nondeterministic style.

No "fib"bing; getting into the monad groove

The title is groan-inducing, as all my "jokes" are. I guarantee it: once you've finished reading this entry, reread the title, and you won't be able to stop from groaning.

The Fibonacci series goes as follows: 0,1,1,2,3,5,8,13..., or put another way, the current number is obtained by adding the previous two numbers. It is useful for many things, as the ratio of successive terms approaches the golden ratio, found in nature (the spiral of some crustaceans and the population growth of rabbits [they must be stopped!]) and in artifice (windows, paintings, doors and buildings follow the width and height of this ratio).

Any fibonacci number can be easily computed from the following formula ...

fib :: Integer → Integer
fib n | n ≡ 0 = 0
      | n ≡ 1 = 1
      | otherwise = fib (n-1) + fib (n-2)

... that is, easily computed if this world were purely mathematical, including the caveat that any computable function could be instantly computed. Run on a computer, fib 25 slows noticeably and fib 50 may as well describe the halting problem, because I wasn't going to wait around for it to terminate.

Note the similarity between the computation of the Fibonacci series to the computation of the Ackermann table. They are not the same kind of problem, mind you, as the Ackermann is not primitively recursive; the Fibonacci is "only" doubly (branching) recursive. But they are similar enough in that they can be solved in similar ways. Given that the current Fibonacci number is the sum of the previous two Fibonacci numbers, we need only a (reversed) list to memoize the previous results, so the above sequence becomes:

[..., 13, 8, 5, 3, 2, 1, 1, 0]

and the "next" Fibonacci number would be simply the sum of the first two elements of this list (21, in this case). But how do we know where we are in the sequence? Easy: the length of this list tells us where we are, and in this case, the list has 8 elements, meaning the "next" Fib is 9th in the sequence.
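The idea in this paragraph fits in a few lines. This is only a sketch of the reversed-list trick, not the memoizing code itself, and next is a made-up name.

```haskell
-- Given the sequence so far in reverse order, report the position and
-- value of the next Fibonacci number (position counted from 1).
next :: [Integer] -> Maybe (Int, Integer)
next mem@(a : b : _) = Just (length mem + 1, a + b)
next _               = Nothing  -- fewer than two numbers known: nothing to add
```

For the list shown above, next answers with position 9 and value 21.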

So, turning to monads with this list structure for memoization, the code falls out as follows:

fib :: Int → FibS
fib n = get >>= λmem . maybe (fib' n >>= update)
                             return
                             (gimme mem n)

So, fib is now simply a decision: "Do I have the fib requested in the list?" "Yes: gimme it and return it" or "No: compute it and then update the list"

The list is a very slight variation on a regular list type, as we choose to carry around its length (as opposed to recomputing it at each iteration), and we lift this new data type into the State monad (as we did with the Map data type for our Ackermann monad optimization):

data Mem = Mem [Integer] Int
type FibS = State Mem Integer

The actual computation function is lifted into the monadic form with very little variation ...

fib' :: Int → FibS
fib' n | n ≡ 0 = return 0
       | n ≡ 1 = return 1
       | otherwise = liftM2 (+) (fib (n - 1)) (fib (n - 2))

... where monadic addition, liftM2 (+), replaces addition in the usual sense (recall the spot of bother we had "adding" Thing One to Thing Two), and where the plain numbers are lifted into the monad with return. In brief, the substance is now monadic but the structure is the same as our original, plain, fib.

The other housekeeping functions are new for the monadic solution, but what one would expect. The update follows in the same vein as the one for the ackermann monad:

update :: Integer → FibS
update n = do (Mem lst len) ← get
              put (Mem (n:lst) (len + 1))
              return n

An interpretation of the update function is that it is the identity function with the side-effect that it remembers the result that it returns.

The only other function is the self-describing gimme which retrieves a previously-computed fibonacci number from memory:

gimme :: Mem → Int → Maybe Integer
gimme (Mem (h:t) len) n | len ≡ n = Just h
                        | len > n = let x = (len - 1) - n
                                    in Just (t !! x)
                        | otherwise = Nothing

This gimme function uses the Maybe monad and its functionality, saying "If I have the value already computed [If the list length is equal to or greater than the requested index], then return Just that value; otherwise I've got Nothing, so you need to compute that value."
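Pulling the pieces together, here is one way the whole thing might be assembled into a loadable module, in plain ASCII syntax. Several parts are my assumptions rather than anything shown above: the mtl package's Control.Monad.State, the seed memory start, the runner runFib, the extra gimme clause for an empty list, and the reading of the counter as the index of the newest entry (so the seed is Mem [0] 0), which is what makes gimme's arithmetic come out right.

```haskell
import Control.Monad (liftM2)
import Control.Monad.State (State, evalState, get, put)

-- Known Fibonacci numbers, newest first, plus the index of the newest one.
data Mem = Mem [Integer] Int
type FibS = State Mem Integer

-- Assumed seed: only fib 0 is known at the start.
start :: Mem
start = Mem [0] 0

-- Manager: return the remembered value, or compute it and remember it.
fib :: Int -> FibS
fib n = get >>= \mem -> maybe (fib' n >>= update) return (gimme mem n)

-- The original recursion, lifted into the monad.
fib' :: Int -> FibS
fib' n | n == 0    = return 0
       | n == 1    = return 1
       | otherwise = liftM2 (+) (fib (n - 1)) (fib (n - 2))

-- Identity with a side effect: remember the value being returned.
update :: Integer -> FibS
update n = do Mem lst len <- get
              put (Mem (n : lst) (len + 1))
              return n

-- Look up a previously computed value, if there is one.
gimme :: Mem -> Int -> Maybe Integer
gimme (Mem (h : t) len) n
  | len == n  = Just h
  | len > n   = Just (t !! ((len - 1) - n))
  | otherwise = Nothing
gimme (Mem [] _) _ = Nothing  -- added clause: an empty memory knows nothing

-- Hypothetical runner: evaluate fib n from the seed memory.
runFib :: Int -> Integer
runFib n = evalState (fib n) start
```

The sketch relies on values being remembered in index order, which holds here because fib (n - 1) is always evaluated before fib (n - 2).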

In summary, we've decorated the naïve fibonacci algorithm with some memory and three new functions (one manager and two support functions). What is the payoff?

Ready> fib 1500
[no delay]

Ready> fib 50000
[4 second delay]
[10615 digits]

Ready> fib 60000
[2.5 second delay]
*** Exception: stack overflow

These results are a great step forward over the naïve fib implementation (which essentially froze at fib 50) and even over a memoized implementation reported elsewhere (which ran out of memory after fib 1450).

Huzzah! then for efficient program representation in Haskell and monadic code.

Oh! my acking monad!

So, we just saw, after reams of paper spent proving the three monadic laws in inscrutable detail, that using monads is pretty easy. Fancy that, Hedda!

Haskell does provide several different kinds of monads for various uses, but sometimes one must roll one's own to fit the current need. I was exploring the ackermann function, curious to know what A(4,2) looked like [oh, yes, 19729 digits in all its (gory) glory -- I went there!] (other than the cop-out of 2^65536 - 3 that is listed in Wikipedia), and whether my computer could calculate and display that value [obviously it could, but it took some work (see below) to arrive at that point]. The ackermann function written in Haskell looks very much like its representation in standard mathematical notation:

a :: Integer → Integer → Integer
a m n | m ≡ 0 = n + 1
      | m > 0 ∧ n ≡ 0 = a (m - 1) 1
      | m > 0 ∧ n > 0 = a (m - 1) (a m (n - 1))

The "problem" with the Ackermann function is that even though it is easy to write, the recursion required to arrive at a solution from even small (single digit) inputs is staggering. My computer staggered and then killed the computation in mid-process (too bad Redjak didn't have that option). The Ackermann function, being not primitively recursive, is also highly redundantly recursive, so it would be nice to provide "hints" to the program, something like: "Oh, you've already computed A(3,1) to be 13, so you don't need to recompute that entire branch any more."

This "hinting" has a name in computer science; it's called "memoization", and Haskell provides the State monad that can be used to cover that functionality quite nicely. As we're computing the values that build up the solution, we put the smaller-indexed solutions into a dictionary, something like:

... and so forth ...

Such a dictionary in Haskell is called a Map, because we are mapping from the "word" (the indices of the Ackermann function) to the "definition" (the solution)...

type Index = (Integer, Integer)
type AckMap = Map Index Integer

... and we wrap that Map with the State monad ...

type AckMapS = State AckMap Integer

... where AckMapS uses the AckMap dictionary to provide the preexisting (partial) solution, or, given there's none yet, populates the solution calculated (the Integer) into the dictionary at the current Index. Simple, yes? So, all we need is a utility function that does the lookup or updates the state ...

ackify :: Integer → Integer → AckMapS
ackify m n = get >>= λma . maybe (a' m n >>= update m n)
                                 return
                                 (Map.lookup (m, n) ma)

... the update function that actually does the housekeeping ...

update :: Integer → Integer → Integer → AckMapS
update m n ans = do mappo ← get
                    put (Map.insert (m, n) ans mappo)
                    return ans

... so that our monadic version of the ackermann function is as follows:

a' :: Integer → Integer → AckMapS
a' m n | m ≡ 0 = return (n + 1)
       | m > 0 ∧ n ≡ 0 = ackify (m - 1) 1
       | m > 0 ∧ n > 0 = ackify m (n - 1) >>= ackify (m - 1)

which looks very much the same as the original function, with calls to a being replaced by calls to the lookup-or-update ackify function and function composition (e.g. a (m - 1) (a m (n - 1))) being replaced by the monadic composer, >>=. So, from the above demonstration we see that monads are not only easy to use, but also easy to "roll your own" and integrate into preexisting non-monadic code without much fuss.
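For anyone wishing to run it, the fragments above might be gathered into one module along the following lines (plain ASCII syntax). The assumptions are mine: the mtl and containers packages, the runner runAck starting from an empty dictionary, and the final guard that rejects negative arguments.

```haskell
import Control.Monad.State (State, evalState, get, put)
import qualified Data.Map as Map

type Index   = (Integer, Integer)
type AckMap  = Map.Map Index Integer
type AckMapS = State AckMap Integer

-- Look the answer up, or compute it and record it.
ackify :: Integer -> Integer -> AckMapS
ackify m n = get >>= \ma ->
  maybe (a' m n >>= update m n) return (Map.lookup (m, n) ma)

-- Record an answer in the dictionary and hand it back.
update :: Integer -> Integer -> Integer -> AckMapS
update m n ans = do mappo <- get
                    put (Map.insert (m, n) ans mappo)
                    return ans

-- The Ackermann function with recursive calls routed through ackify.
a' :: Integer -> Integer -> AckMapS
a' m n | m == 0          = return (n + 1)
       | m > 0 && n == 0 = ackify (m - 1) 1
       | m > 0 && n > 0  = ackify m (n - 1) >>= ackify (m - 1)
       | otherwise       = error "a': negative argument"  -- added guard

-- Hypothetical runner: start from an empty dictionary.
runAck :: Integer -> Integer -> Integer
runAck m n = evalState (ackify m n) Map.empty
```

Small inputs such as runAck 3 3 come back at once; I would not expect this sketch alone to reach A(4,2), given the remarks on memory further down.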


Although the incorporation of the State monad dramatically speeds processing solutions and makes some solutions computable that were unreachable under the naïve approach, there are at least two better generalizations:

  1. Use the fixpoint (or the Y-combinator) of the ackermann function so that one can now decorate the engine of computation without altering its structure at all! Incredible! So instead of using a monad to abstract stateful decoration of computations, the fixpoint can be used to abstract any decoration of computation!

  2. Use the compiler to change the code, instead of having it programmed in. Many functional language compilers have the option to memoize computations automagically. So, instead of writing code that memoizes partial results, the compiler intercepts the computation, replacing computation with previously calculated values (or running the computation if no such value exists and then storing that result into the runtime). No extra coding; no "troublesome" monads.
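To give a flavour of the first point, here is a minimal sketch of the fixpoint style. For brevity it uses the Fibonacci function rather than Ackermann, and the names fibF, fibPlain and fibMemo are invented for this example. The engine is written once with its recursive call left open; how the knot is tied decides whether it is decorated with memory.

```haskell
import Data.Function (fix)

-- The engine, with recursion left open: it calls 'self', not itself.
fibF :: (Int -> Integer) -> Int -> Integer
fibF _    0 = 0
fibF _    1 = 1
fibF self n = self (n - 1) + self (n - 2)

-- Undecorated: the plain fixpoint behaves like the naive definition.
fibPlain :: Int -> Integer
fibPlain = fix fibF

-- Decorated: recursive calls are answered from a lazily built table.
-- The engine fibF is untouched. Assumes a non-negative argument.
fibMemo :: Int -> Integer
fibMemo = (table !!)
  where table = map (fibF fibMemo) [0 ..]
```

The list lookup is linear, so this is a demonstration of the idea rather than a tuned implementation.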

Critique on the critique

On the other hand ("there are five fingers", as my favorite Mother-in-law enjoys rejoining),

  1. The fixpoint solution is more general and more elegant, but being more general, it suffers a slight decrease in efficiency: it more than likely will run more slowly than the specialized monadic state solution.

  2. With an explicit encoding of state, the programmer has direct say into what is memoized and what is not, but with a compiler directive, everything is memoized. Since the Ackermann function can be represented by simple functions for m < 4, the map can be replaced by a faux-map for those values, reducing the memory footprint of memoization to only a couple of cells for m == 4 (as opposed to over 150,000 cells for m == 4 and n == 1 using the naïve memoization scheme).

... so the above described solution may be "good enough" for the task at hand, after all. Put another way, when I said I was wrong, I might have been wrong. Caveat programmer.
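The "faux-map" for m < 4 could be sketched as follows, using the well-known closed forms of the Ackermann function for small m. The name ackSmall is hypothetical, and the sketch assumes a non-negative n.

```haskell
-- Closed forms for m < 4. Nothing means "not covered here: fall back
-- to the memoizing engine".
ackSmall :: Integer -> Integer -> Maybe Integer
ackSmall 0 n = Just (n + 1)
ackSmall 1 n = Just (n + 2)
ackSmall 2 n = Just (2 * n + 3)
ackSmall 3 n = Just (2 ^ (n + 3) - 3)
ackSmall _ _ = Nothing
```

A lookup function could consult ackSmall first and touch the real dictionary only when it answers Nothing.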

Orators' exercise

I practice this exercise daily. Don't sight read it; read it aloud. The Haskell program that produced this uses a monad; the heart of which is as follows:

-- parses a poem, referring words to CMU for pronunciation

parsePoem = do p ← getContents
               return (checkEachLine (lines p))

main = do p ← parsePoem
          putStr p

Do you see the monad? Well, that's a rather mean question, because the monad is entirely implied in this program fragment.

One of the motivators for incorporating monads into Haskell was dealing with things in the "real world" that fall outside easy descriptions in purely functional (mathematical) domains. One of those things is input and output. Whereas an equation...


... was true yesterday, is true now, and will be true tomorrow (albeit, it took over 100 pages of introduction to set theory in the Principia Mathematica to be able to back up that truth), things of the "real world", particularly input and output, change, and change irrevocably. For example, if I buy and eat some ice cream (mint chocolate chip, please!), I'm not getting a refund. The state of the world has changed: I'm parted from my money, and that scoop of ice cream is no longer in the shop's inventory (and they have a Form 1120S, Schedule A to prove it). Input/output, as part of the state of the world, is one of the members of the awkward squad. So, to help it to be a used and useful language, the Haskell designers included the IO monad as one of the libraries.

So, if one were to query main and parsePoem, one would find that they have the types IO () and IO [Char], respectively, meaning that main interacts with IO, returning "nothing", and parsePoem interacts with IO, returning a list of characters (the poem that you just finished reading aloud).
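Writing the types out makes the implied monad visible. In this sketch checkEachLine is only a stand-in (the real one, which consults the CMU dictionary, is not shown here), and the top-level action is named recite rather than main so the fragment can sit beside other code.

```haskell
-- Hypothetical stand-in for the real checker: just rejoin the lines.
checkEachLine :: [String] -> String
checkEachLine = unlines

-- Reads all of standard input; String is a synonym for [Char].
parsePoem :: IO String
parsePoem = do p <- getContents
               return (checkEachLine (lines p))

-- Same shape as main in the fragment above: interacts with IO, returns ().
recite :: IO ()
recite = do p <- parsePoem
            putStr p
```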

In short, the IO monad does all the handshaking (or, more correctly, handwaving) with the state of the world, so the programmer can do something as "simple" as printing "Hello, world!" easily. All this monad asks, as every monad asks, is "Pay no attention to that man behind the curtain."

Trivial Monad solutions (cont.)

This is a continuation of the previous entry on the subject, with the solutions for exercises 3 and 4. But, we are now finally and formally introduced to the Haskell-style bind operator: >>=. What >>= (which is pronounced: "bind") does is to give the value the monad contains to the function that then uses it. More formally defined (specific to sigfpe's trivial monad example):

W a >>= f ≡ f a

An example of the syntax (again, following sigfpe's post) is as follows:

W 5 >>= f

(where f was defined in sigfpe's post as an increment function for his trivial monad: f x = W (x + 1)). The above example would result in the (monadic) value W 6. So, now you see >>= (trivially) in action, and sigfpe asks us to put it into use for the following exercises.
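For those following along at the prompt, here is one possible rendering of the trivial monad in plain ASCII Haskell. The Functor and Applicative instances are there only because current GHC releases require them before a Monad instance; they are my addition, not part of sigfpe's post.

```haskell
-- The trivial wrapper.
data W a = W a deriving (Eq, Show)

instance Functor W where
  fmap k (W a) = W (k a)

instance Applicative W where
  pure = W
  W k <*> W a = W (k a)

-- W a >>= f is f a; return defaults to pure, that is, W.
instance Monad W where
  W a >>= k = k a

-- The increment function from the example.
f :: Int -> W Int
f x = W (x + 1)
```

With these definitions, W 5 >>= f evaluates to W 6, and binds can be chained: W 5 >>= f >>= f.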

The first exercise that we'll continue on in this entry is exercise 3, which is to prove the three monad laws:

Left Identity: return a >>= f ≡ f a

or, a monadized value bound to a function is the same as applying a function to the plain value.

Right Identity: m >>= return ≡ m

or, a monad bound to return is just that monad, again (or, the identity function for monads is bind return), and finally:

Associativity: (m >>= f) >>= g ≡ m >>= (λ x . f x >>= g)

which is similar in spirit to associativity for addition, i.e.: (a + b) + c ≡ a + (b + c).

Just a side note here, some people become discouraged when they are told "to prove" something, whereas if they are asked "to solve" a problem, they turn right to it. Fogel and Michalewicz rightly point out that proving and solving are basically the same process!

My cara spoza asked me, as I was attempting to (re-)prove liftM2 (+) for exercise 2, "Is proving difficult?" The question gave me pause: "Well, sometimes the simplest proof can elude me for days, and sometimes something that appears much more difficult comes in a flash."

So, let's prove each of these laws in turn, using the definitions of >>= and return as axioms.

First up, Left Identity:

Prove: return a >>= f ≡ f a
1. return a >>= f ≡ W a >>= f   (return definition)
2.                ≡ f a         (>>= definition)

Next, Right Identity:

Prove: m >>= return ≡ m
1. m >>= return ≡ W a >>= return   (monad identity)
2.              ≡ return a         (>>= definition)
3.              ≡ W a              (return definition)
4.              ≡ m                (monad identity)

And, finally, let's prove associativity. We'll start out by reducing the left hand side ("lhs") to something simpler:

Simplify: (m >>= f) >>= g
1. (m >>= f) >>= g ≡ (W a >>= f) >>= g   (monad identity)
2.                 ≡ f a >>= g           (>>= definition)

... but it was here that I became stymied, for I didn't see how to transform f a >>= g into something resembling the right hand side (rhs), which is m >>= (λ x . f x >>= g). Do you? Give it a try!


The thing to do, when one becomes stuck solving or proving something, is to acknowledge that fact, and then to try something entirely different. One way to try something different is to rework the simplification of the lhs so it assumes a form much closer to what the rhs is. I didn't see this approach to be fruitful, however: I reduced the lhs to be something pretty simple, so a rework would make the lhs more complicated -- clutter often obscures correlation. Another approach is to leave the lhs be and work on simplifying the rhs -- maybe a simplification there would yield an useful correspondence. Let's do that:

Simplify: m >>= (λ x . f x >>= g)
1. m >>= (λ x . f x >>= g) ≡ W a >>= (λ x . f x >>= g)   (monad identity)
2.                         ≡ (λ x . f x >>= g) a         (>>= definition)
3.                         ≡ f a >>= g                   (β-reduction)

Ha! Now we see that the simplification of the lhs is identical to the simplification of the rhs. That was easier than expected.
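A proof settles the matter, but it does no harm to spot-check the three laws on sample values. The sketch below does that for the trivial monad; the functions f and g are arbitrary picks of mine, and the Functor and Applicative instances are included only because current GHC requires them.

```haskell
data W a = W a deriving (Eq, Show)

instance Functor W where
  fmap k (W a) = W (k a)

instance Applicative W where
  pure = W
  W k <*> W a = W (k a)

instance Monad W where
  W a >>= k = k a

-- Two sample functions into the monad (arbitrary choices).
f, g :: Int -> W Int
f x = W (x + 1)
g x = W (x * 2)

-- Each law, stated for a monadic value of the form W a.
leftIdentity, rightIdentity, associativity :: Int -> Bool
leftIdentity a  = (return a >>= f) == f a
rightIdentity a = (W a >>= return) == W a
associativity a = ((W a >>= f) >>= g) == (W a >>= (\x -> f x >>= g))
```

Checking a handful of integers is evidence, not proof; the derivations above are what carry the argument.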


To summarize this exercise, proving something or solving something is really a rather simple task: using the tools you have to match the solution to the problem. The fun part comes in when you discover new ways to use the given tools, or, finding out the given tools aren't sufficient, which means finding new tools to solve the problem. Once you have your tools, along with the problem statement and desired outcome, it is often the case that there is a simple walk from the problem to the solution, the solution to the problem, or both meet somewhere in the middle. Odds are, you are not the first one solving a problem; if you need help, there's a whole big world out there -- someone probably has already left the solution out there for you to find.

Exercise 4: Monads cannot be unwrapped completely back to the plain value they carry, but if a monad is layered, that is a monad contains a monad, it is possible to remove one or more layers. join is such a function that does this "unlayering". Given a layered monad mm = W (W x), join mm will return the value W x. The declaration for join is:

join :: W (W a) → W a

Define it.

Solution 4: Well, the declaration for join looks suspiciously like an application of >>= (you are pronouncing that operator "bind", right?), as it hands the value carried by the monad for function application.

join mm = mm >>= ?f?

The question then becomes "what function to apply the carried value to?" The problem is that >>= has already given us the answer, W x, so we just wish to have join return that value unaltered. Fortunately for us, there is such a function from combinatory logic called the identity combinator, I, which has the form λ x . x, and Haskell already has that definition (it is named id), so we simply use that for the hole in the above definition:

join mm = mm >>= id
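As a loadable sketch in plain ASCII Haskell, with the trivial monad spelled out again so the fragment stands alone (the Functor and Applicative instances are there only because current GHC requires them):

```haskell
data W a = W a deriving (Eq, Show)

instance Functor W where
  fmap k (W a) = W (k a)

instance Applicative W where
  pure = W
  W k <*> W a = W (k a)

instance Monad W where
  W a >>= k = k a

-- Remove one layer: bind hands over the inner W x, and id returns it as is.
join :: W (W a) -> W a
join mm = mm >>= id
```

Applied to W (W 5), join yields W 5; applied to W (W (W 5)), it removes just one layer, leaving W (W 5).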


In conclusion, I hope these entries have helped you to see that monads are actually a rather simple concept that is easy to use. This is just the tip of the iceberg, however: monads are used pervasively in Haskell, tackling a great many tasks. May I suggest the excellent and practical tutorial on Monad Transformers? Please do not be put off by the imposing front page to the paper, as this tutorial shows how monads and their transformers shine, giving a step-by-step introduction of new functionality of monads into a plain-vanilla system.

Trivial Monad solutions

Okay, a monad is a mathematical form, which sigfpe describes in his blog. He then proposes a series of exercises [reminiscent of my own exercises in combinatory logic], and even provides solutions. I came up with quite a different solution set, as I used his entry, and his entry only, as the source material.

A recap of monad: a monad is an opaque decorator. It takes a thing (including taking nothing) and hides that thing behind the monadic form. In sigfpe's example, the monad, W, takes a number (or anything else, for that matter), consuming it.

A monad, then, is like a katamari or a black hole, ... or Kuzco: no touchy!

So, then, the question naturally arises: how does one interact with the captured data? Well, the monad interface provides two basic operations in its protocol: return and bind (sometimes written >>=).

  • return "lifts" an object into the monadic form. In propositional logic, return is a → m a — 'return 5' would yield 'W 5' in sigfpe's example.

  • bind takes a function that transforms an object to a monad and converts it into a function that transforms the object within that monad. In propositional logic, bind is (a → m b) → (m a → m b)

So, bind is used frequently when working with data and their monadic forms, and from it other functions may be derived, such as fmap, which maps (converts) a function that works on "ordinary" data to one that works on monads ... fmap is (a → b) → (m a → m b). Note the closeness to bind: the derivation of fmap is λ x . bind (return • x) (where • is composition (the B combinator, or Bluebird, from combinatory logic), and λ introduces a variable in the expression).

The above is enough introductory material to solve sigfpe's first two exercises, which I reproduce here:

Exercise 1: Write a function, g, that adds a number to a monad containing a number, returning a new monad with the sum, or g :: a → m a → m a.

Solution 1: well, given that addition is of the form (+) :: a → a → a, and that fmap lifts an ordinary function to a monadic one, the solution nearly writes itself:

g x my = fmap (+ x) my

What is simply done here is to construct a "plus x" function by applying x to the addition operator — note that (+x) :: a → a. We then lift that function to its monadic equivalent with fmap and apply the result to the number (y) contained in the monad (my). Q.E.D.
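Here is the solution as a loadable sketch in plain ASCII Haskell, including the derivation of fmap from bind described earlier. The names ret and fmapW are mine, chosen to avoid clashing with the Prelude's return and fmap; bind follows the argument order used in this entry.

```haskell
data W a = W a deriving (Eq, Show)

-- The two operations of the protocol.
ret :: a -> W a
ret = W

bind :: (a -> W b) -> W a -> W b
bind k (W a) = k a

-- fmap derived from bind: \x -> bind (ret . x)
fmapW :: (a -> b) -> W a -> W b
fmapW x = bind (ret . x)

-- Exercise 1: add a plain number to a wrapped number.
g :: Num a => a -> W a -> W a
g x my = fmapW (+ x) my
```

So g 3 (W 4) gives W 7, with the W never unwrapped by g itself.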

Exercise 2: Write a function, h, that adds monadic numbers, returning a new monad containing the sum, or h :: m a → m a → m a

Solution 2: This provided a spot of bother for me. Whereas in the last solution, one could take the first number and apply it to addition to come up with a function that fit the form of fmap, in this case, we don't have any numbers with which to simplify addition to a form we can use, as they are all rolled up into monadic forms.

I described the problem thusly to my daughter, EM.

Me: You know addition, right?
EM: What's that, Papa?
Me: What's one plus two?
EM, immediately: Three! ... and ten plus ten is twenty! ... and twenty plus twenty is thirty! ... and thirty plus thirty is forty!
Me, clearing my throat, letting the errors pass in light of the delighted exuberance with which she delivered her answers: Um, yes, well. Here's my problem, let's say Thing One has a number hiding behind its back, that it won't show you, and Thing Two has a number hiding behind its back, too, that it won't show you, and you're asked to add those two numbers and give the answer to Thing Three. How do I do that?
EM, rolling her eyes: Papa, there isn't a Thing Three in the Cat in the Hat!
Me: Yes, well, suppose there was; how would I do this?
EM: Can you peek behind their backs to see what the numbers are?
Me: No, you're not allowed; besides, those Things are much too fast to run behind them, remember how they ran through the house with kites? ... So, we must use something like the fmap-box-maker, because addition is a box that takes numbers and yields numbers, but we need a box that takes Things and yields Things.

Using fmap does work its way into the solution I eventually derived, but another important player in this solution is bind, used in conjunction with a technique of deferred computation.

You see, just like the solution for exercise 1, we'd like to apply a number to addition to obtain a function for use by fmap. The problem is that we have no such number (just monadic values), but this is where bind is of use. Recall that bind :: (a → m b) → (m a → m b). The standard given explanation of bind is that it transforms a function that lifts ordinary data into monads to a function that works entirely in the monadic domain. But an entirely different way of looking at bind is that it treats a given monad like ordinary data as input to the lifting function. εὕρηκα! [No, I'm not about to hop out of the tub and run around town with my towel flapping in the wind]

The latter usage is the only one for those using >>= (which has the English language pronunciation of "bind", as well, confusingly), but if we take sigfpe's entry as the sole and base source, this "discovery" is quite the epiphany. And with that understanding, along with some deferred computational hand-waving, the solution then, again, presents itself:

h w1 w2 = bind (λ x . fmap (+ x) w2) w1

Working from the parenthetical expression outward, we again see the fmap (+ x) form, unaltered from the solution to exercise 1, giving us a number-to-monad function. The catch here is that we have no such number, so we give a promise that there will be such a number eventually with the λ-introduction. We then immediately fulfill that promise with bind: we apply the number-to-monad function to w1, treating that monadic value as if it were a plain number (that's what bind gives us permission to do), rendering the solution. Q.E.D.
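The same solution as a loadable sketch in plain ASCII Haskell. As before, ret and fmapW are names of my choosing (to avoid the Prelude's return and fmap), and the definitions are repeated so the fragment stands alone.

```haskell
data W a = W a deriving (Eq, Show)

ret :: a -> W a
ret = W

bind :: (a -> W b) -> W a -> W b
bind k (W a) = k a

fmapW :: (a -> b) -> W a -> W b
fmapW x = bind (ret . x)

-- Exercise 2: the lambda promises a plain number x; bind fulfils the
-- promise by supplying the number hidden inside w1.
h :: Num a => W a -> W a -> W a
h w1 w2 = bind (\x -> fmapW (+ x) w2) w1
```

So h (W 3) (W 4) gives W 7: Thing One and Thing Two never show their numbers, yet Thing Three gets the sum.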


I obtained solution 1 almost immediately by matching the logic of fmap to that of the problem description. It seemed to me to be rather trivial, so I was taken aback at the lengths and "arcane-ness" that sigfpe put into his solution.

On the other hand, sigfpe merely needed to augment his first solution with variable introduction to obtain his second solution. His second solution was trivially-obtained and obvious, given his approach to the first solution, whereas I had to struggle quite some time — introducing both λ-abstraction (as sigfpe does) and bind (which sigfpe already had in the alternate syntactic form of >>=) — to obtain mine.

In my defense, my solutions use only the monadic tools introduced in sigfpe's blog entry. On the other hand, Haskell programmers accustomed to monads rarely use fmap, regarding it as esoteric or superfluous, and instead rely on >>=, or inline binding entirely with the more "natural" do-notation (although there have been some grumblings against it), so in the light of that experience, sigfpe's solutions are natural and the first way most Haskell programmers would solve these exercises. This entry here, then, is from the perspective of someone not familiar with Haskell's approach to monads deriving the solutions from the principles of the provided function bind and its derived sibling fmap as discussed in sigfpe's blog entry.

Please see my next post for the solutions to sigfpe's exercises 3 and 4.