SEND + MORE = MONEY
... by iteratively improving the efficiency of the solution.
First up, list compression is a powerfully expressive programming technique that so naturally embodies the nondeterministic programming style that users often don't know they are programming nondeterministically. List compression is of the form:
[ x | qualifiers on x]
wherex
represent each element of the generated list, and the qualifiers either generate or constraint values forx
Given the above definition of list compression, writing the solution for our cryptarithmetic problem becomes almost as simple as writing the problem itself:
[(s,e,n,d,m,o,r,e,m,o,n,e,y) | s ← digit, e ← digit, n ← digit,
d ← digit, m ← digit, o ← digit,
r ← digit, y ← digit,
s * 1000 + e * 100 + n * 10 + d
+ m * 1000 + o * 100 + r * 10 + e
≡ m * 10000 + o * 1000 + n * 100
+ e * 10 + y]
where digit = [0..9]
Easy, but when run, we see that it's not really what we needed for the answer is ...
[(0,0,0,0,0,0,0,0,0,0,0,0,0),(0,0,0,1,0,0,0,0,0,0,0,0,1),...]
... and 1153 others. No, we wish to have
SEND + MORE = MONEY
such that S
and M
aren't zero and that all the letters represented different digits, not, as was in the case of the first solution, all the same digit (0
). Well, whereas we humans can take some obvious constraints by implication, software must be explicit, so we need to code that S
and M
are strictly positive (meaning, "greater than zero") and that all the letters are different from each other. Doing that, we arrive at the more complicated, but correct, following solution ...[(s,e,n,d,m,o,r,e,m,o,n,e,y) | s ← digit, s > 0,
e ← digit, n ← digit, d ← digit,
m ← digit, m > 0,
o ← digit, r ← digit, y ← digit,
different [s,e,n,d,m,o,r,y],
num [s,e,n,d] + num [m,o,r,e]
≡ num [m,o,n,e,y]]
where digit = [0..9]
num = foldl ((+).(*10)) 0
different (h:t) = diff' h t
diff' x [] = True
diff' x lst@(h:t) = all (/= x) lst && diff' h t
A bit of explanation -- the function
num fold
s the list of digits into a number. Put another way ...num [s,e,n,d] ≡ ((s * 10 + e) * 10 + n) * 10 + d
... and the function
different
, via the helper function diff'
, ensures that every element of the argument list are (not surprisingly) different
-- a translation of diff'
is ...diff' x [] = True
"A list is 'different' if there is only one number"diff' x lst@(h:t) = all (≠ x) lst && diff' h t
"A list is 'different' if one of the numbers is different than every other number in the list and if this is true for all the numbers in the list"
... and after a prolonged period [434 seconds], it delivers the answer:
[(9,5,6,7,1,0,8,5,1,0,6,5,2)]
Okay! We now have the solution, so we're done, right? Well, yes, if one has all that time to wait for a solution and is willing to do tha waiting. However, I'm of a more impatient nature: the program can be faster; the program must be faster. There are few ways to go about doing this, and they involve providing hints (sometimes answers) to help the program make better choices. We've already done a bit of this with the constraints for both
S
and M
to be positive and adding the requirement that all the letters be different digits. So, presumably, the more hints the computer has, the better and faster it will be in solving this problem.Knowing the problem better often helps in arriving at a better solution, so let's study the problem again:
SEND
+MORE
MONEY
The first (highlighted) thing that strikes me is that in
MONEY
, the M
is free-standing -- its value is the carry from the addition of the S
from SEND
and the M
from MORE
. Well, what is the greatest value for the carry? If we maximize everything, then the values assigned are 8 and 9, then we find the carry can at most be 1, even if there's carry over (again, of at most 1) from adding the other digits. That means M
, since it is not 0, must be 1.What about for
S
, can we narrow its value? Yes, of course. Since M
is fixed to 1, S
must be of a value that carries 1 over to M
. That means it is either 9 if there's no carry from addition of the other digits or 8 if there is. Why? Simple: O
cannot be 1 (as M
has taken that value for itself), so it turns out that there's only one value for O
to be: 0! We've fixed two values and limited one letter to one of two values, 8 or 9. Let's provide those constraints ("hints") to the system.But before we do that, our list compression is growing larger with these additional constraints, so let's unwind into an alternate representation that allows us to view the smaller pieces individually instead of having to swallow the whole pie of the problem in one bite. This alternative representation uses the do-notation, with constraints defined by guards.
A guard is of the following form:
guard :: MonadPlus m ⇒ Bool → m ()
What does that do for us? Recall that MonadPlus kinds have a base value (
mzero
) representing failure and other values, so guard translates the input Boolean constraint into either mzero
(failure) or into a success value. Since the entire monadic computation is chained by mplus
, a failure of one test voids that entire branch (because the failure propagates through the entire branch of computation).So, now we are armed with guard, we rewrite the solution with added constraints in the new do-notation.
do let m = 1
o = 0
s ← digit
guard $ s > 7
e ← digit
n ← digit
d ← digit
r ← digit
y ← digit
guard $ different [s,e,n,d,m,o,r,y]
guard $ num [s,e,n,d] + num [m,o,r,e] ≡ num [m,o,n,e,y]
return (s,e,n,d,m,o,r,e,m,o,n,e,y)
where digit = [2..9]
Besides the obvious structural difference from the initial simple solution, we've introduced some other new things --
- When fixing a value, we use the let-construct.
- As we've grounded
M
andO
to 1 and 0 respectively, we've eliminated those options from thedigit
list. - Since the do-notation works with monads in general (it's not restricted to lists only), we need to make explicit our result. We do that with the return function at the end of the block.
What do these changes buy us?
[(9,5,6,7,1,0,8,5,1,0,6,5,2)]
returned in 0.4 seconds
One thing one learns quickly when doing logic, nondeterministic, programming is that the sooner a choice is settled correctly, the better. By fixing the values of
M
and O
we entirely eliminate two lines of inquiry but also eliminate two options from all the other following choices, and by refining the guard for S
we eliminate all but two options when generating its value.In nondeterministic programming, elimination is good!
So, we're done, right? Yes, for enhancing performance, once we're in the sub-second territory, it becomes unnecessary for further optimizations. So, in that regard, we are done. But there is some unnecessary redundancy in the above code from a logical perspective -- once we generate a value, we know that we are not going to be generating it again. We know this, but
digit
, being the amb
operator doesn't, regenerating that value, then correcting that discrepancy only later in the computation when it encounters the different guard
.We need the computation to work a bit more like we do, it needs to remember what it already chose and not choose that value again. We've already use memoization when we implemented the Fibonacci sequence and the Ackermann function with the State monad; so let's incorporate that into our generator here.
What we need is for our
amb
operator to select from the pool of digits, but when it does so, it removes that selected value from the pool. In a logic programming language, such as Prolog, this is accomplished easily enough as nondeterminism and memoization (via difference lists) are part of the language semantics. A clear way of dissecting this particular problem was presented to me by Dirk Thierbach in a forum post on comp.lang.haskell, so I present his approach in full:- I need both state and nondeterminism, so I have to combine the state monad and the list monad. This means I need a monad transformer and a monad (you need to have seen this before, but if you have once, it's easy to remember).
- The state itself also has to be a list (of candidates).
- So the final monad has type
StateT [a] [] b
. - I need some function to nondeterministically pick a candidate. This function should also update the state.
- Played around a short time with available functions, didn't get anywhere.
- Decided I need to go to the "bare metal".
- Expanded
StateT [a] [] a
into[a] → [(a,[a])]
, then it was obvious whatchoose
should do. - Decided the required functionality "split a list into one element and rest, in all possible ways" was general enough to deserve its own function.
- Wrote it down, in the first attempt without accumulator.
- Wrote it down again, this time using an accumulator.
With this approach presented, writing the implementation simply follows the type declaration:
splits :: Eq a ⇒ [a] → [(a, [a])]
splits list = list >>= λx . return (x, delete x list)
Although, please do note, this implementation differs significantly from Dirk's, they both accomplish the same result. Now we lift this computation into the State monad transformer (transformers are a topic covered much better elsewhere) ...
choose :: StateT [a] [] a
choose = StateT $ λs . splits s
... and then replace the (forgetful)
digit
generator with the (memoizing) choose
(which then eliminates the need for the different guard
) to obtain the same result with a slight savings of time [the result returned in 0.04 seconds]. By adding these two new functions and lifting the nondeterminism into the StateT we not only saved an imperceptibly few sub-seconds (my view is optimizing performance on sub-second computations is silly), but, importantly, we eliminated more unnecessary branches at the nondeterministic choice-points.In summary, this entry has demonstrated how to program with choice using the MonadPlus class. We started with a simple example that demonstrated (naïve) nondeterminism, then improved on that example by pruning branches and options with the guard helper function. Finally, we incorporated the technique of memoization here that we exploited to good effect in other computational efforts to prune away redundant selections. The end result was a program that demonstrated declarative nondeterministic programming not only fits in the (monadic) idiom of functional program but also provides solutions efficiently and within acceptable performance measures.
1 comment:
This post is very informative, though I will need some time to digest it. I am currently using nondeterminism in instruction selection for a compiler - it relates to branch instruction size being dependent on the offset jumped. This will involve some dependency analysis (via Data.Graph I imagine) - but currently just tries all options - which makes it scale pretty horribly when multiple branches instructions exist.
Post a Comment