Wednesday, June 4, 2008

A Haskell Rule Engine

Those of you who have read other entries on this blog know that the title does not contain a (semantic) typo. Those of you new to this discourse may wonder, "Rule Engine? Not Haskell; Prolog! End of discussion, right?"

Well, yes and no.1

Prolog is nearly perfectly situated, semantically, for building rule engines. In fact, the Prolog interpreter is a rule engine, so making one's own rule engine amounts simply to adding rules (as rules are native-language constructs in Prolog) to the interpreter (which are then compiled into the Prolog system). What could be simpler? Well, nearly nothing, and that's why Prolog is the language of choice for these kinds of systems (and for knowledge bases, and for work-flow analysis, and for resource scheduling systems, and for ... etc).

The problem, however, is the Prolog offers a little too little and a little too much -- a little too little in typing and a little too much side-effects (à la assert, retract, and the ! (pron: "red cut")) -- and this abundance (of side effects) and dearth (of typing) both seriously hamper confidence in the correctness, or provability, of the rule system.

Let's take a moment to consider how a rule-based system works. The system is fed data; it then forms a supposition from those data, and the rules contribute, to a greater or lesser extent, to proving or disproving that supposition. Rule-based systems often, but not always, have a dynamic element to them -- rules may be added or removed (or activated or passivated) while the system is running to deal with emergent situations. Furthermore, rule-based systems often have a collaborative element to them: it is often desirable that rule-findings (or their antithesis) affect other rules in the same or similar category. Contribution, dynamism, and collaboration: these are the ingredients for successful rule-based systems. For these ingredients to work with any success, however, the supposition must have a well-understood structure as it is tested from rule to rule. Furthermore, findings that contribute to the strength of the supposition must be adjoined in well-defined ways.

Caveat: if you're looking for objectivity, read Ayn Rand. However, even though the next section is anything but even-tempered, it is indeed tempered by years of experience on real systems really in production. YMMV

The Prolog programmer writing the rule-based system may render the structure of the supposition either implicitly (by testing the dynamic structure of the supposition in the goals of each rule) or explicitly (by pattern matching in the head of each rule with tagged terms of the supposition). Either approach requires uniformity, which is just a euphemism for typing. In short, the Prolog programmer must type the supposition, but the Prolog system has no built-in check guaranteeing correctness of that type from rule to rule. This places the burden of typing on the Prolog programmer, and this often results in the programmer accidentally building an ad hoc type system extension, and often poorly enforcing that discipline from rule to rule. The result: an ill-typed supposition is allowed to pass through the rule engine generating false positives or failing to recognize real positives from the base data set.

So much for the need-for-typing screed.

Now, as for the side-effecting (a.k.a. "extra-logical") constructs in Prolog (assert, retract, and !), the literature is resplendent with caveats, so I will leave that discussion for your discovery (which would simply be to read the latter part of chapter 1 of any introductory Prolog text). I will add one bit of personal professional experience: a rule-based system built on the foundation of these extra-logical constructs (each rule issued more than a few asserts which then their duals later retracted [fingers crossed]). What happened when an unusual finding surfaced? Well, what didn't happen was a felicity in the debugger, even with a complete trace it was well nigh impossible to sort out which lemmas were asserted and at what points. Were there detritus lemmas from the previous transaction that were never retracted due to a ! or an unexpected failure resulting in backtracking? These questions were hard enough to answer in that system, but the difficulty was often compounded by the inconsistency between different findings on the exact same input data. In short, if you are a Prolog programmer who uses the extra-logical features of the language to build your rule-based system, you may have something that works from time to time, but you've also sacrificed consistency and confidence in the results. Please, for my sake,2 find a way to code the system relying on the logical features of the language as the mainstay.3

We now return to our regularly-scheduled programming...

Haskell, being a typed and pure functional programming language, has its own strengths and weaknesses.4 The type system requires the program to be well-typed and consistent ("proofs-as-types"), otherwise the program is rejected by the Haskell system. Conversely, foundational operations of logical programming are not to be found as part of Haskell's base syntax nor semantics. Unification? Backtracking? Neither are available in "plain" Haskell. In Prolog, what are the operators for unification and backtracking? There are none, as these concepts are fundamental parts of Prolog's semantics -- they are as invisible and as necessary as the air, permeating the entire system down to every goal in each rule clause.

If not addressed, these deficiencies of plain Haskell would stop the show. However, previous articles have demonstrated typed, concise, and natural representations of choice (backtracking). So, the rest of this article will develop a rule-based system using those tools, and along the way demonstrate how "good-enough" unification may be achieved with monads in Haskell.

Before diving in to building a rule-based system in Haskell, let's pause again to consider how rule findings should be captured. Some systems are implemented where rules return a score, and the summation of all the rule findings determine if the supposition holds. Many such systems exist using this approach. The problem with this approach is that it makes defining the special cases (which always crop up in these kinds of systems) difficult. For example, a special rule finding would be: "whenever these preconditions are met, then the supposition is true, no matter what other findings occur." How does one encode this in a points-based system? "Obviously, that rule finding is alloted enough points to tip the scales in support of the supposition," you answer. However, this answer, although it works today, will no longer work as, inevitably, new rules are added (and old rules removed). When this change occurs, new thresholds are set, and the value assigned to this "ultimate" rule finding no longer triggers the expected result. Of course, one can use indirection, where the rule finding's point value is a reference to the threshold's value, but then this introduces additional housekeeping that the "simplification" of "rule findings are points" was supposed to avoid. These factors -- the paradigm broken (a rule finding is not a point result but a reference to one) and the additional housekeeping -- are an open invitation to introduce logic errors into the system during the maintenance phase of the system's life cycle.

And, what about the rule findings that monitor the users of the rule-based system? An integral part of every rule-based system (or expert assistance system) is the monitoring of the experts using the system. And, in every one of these kinds of systems, there are one or more experts who consistently subvert the system and there are certain experts who consistently and markedly outperform the system. For example, in the banking industry, the system may make recommendations that the current customer move funds over $10,000.00 out of the savings account into longer term investments, but it may also observe that the current teller is consistently successful in upselling certain profitable instruments successfully (and therefore recommend awards, promotions, or raises for that teller or may recommend pulling that teller in for interviews to enhance the system with their experience and know-how). On the other side, there may be tellers who consistently report totals that are different than what the daily transaction report shows, and so the system may recommend to managers to bring additional supervision, training, or legal action against that teller.

This subversion or excellence may be unconscious, but whatever the motive, it is imperative that this expert not be notified that their activities are being monitored. Such a rule finding ("expert in dereliction" or "expert outperforming") would, in a point-based system, result in adding 0 points, as it doesn't affect the supposition, but it does have importance elsewhere.

In short, a point-based system may work for the "normal" rules, but is unsatisfactory for any real-world system that must include special cases, such as ultimatums and rules that deal with issues not concerning proof of the supposition.

Well, if points are not the way to go, what about having the rule findings immediately trigger the appropriate action? I don't know what to call that kind of application, but it certainly isn't a rule-based system. A system such as this immediately disintegrates into incoherency. Let's examine an example to demonstrate, e.g.: a rule-based system for surveying banking customers has a rule that runs as follows: "A male, usually wearing a red shirt, and waving a pistol is a robber; call the police." So, under a ruling precipitates action system, the next person (male or female) who walks in wearing a red shirt (one of the triggers) would have the SWAT team come barrelling in, weapons free. This may impact customer satisfaction and disrupt business operations.

No, part of the raison d'être of rule-based system is that first a case must be build for, or against, a supposition with findings that are either sufficient or overwhelming in number, then, only after a compelling case is built does the system render a decision or recommendation with associated actions. The product of such systems is not only a recommendation, but also a coherent model, a "why", in support of that recommendation.

Okay, rules-as-points are out, and the imperative model ("ready-fire-aim!") is out, so what is left? The coherent model; and a very simple model is just the set of rule findings, themselves. With those findings, the arbiter subsystem may use any method for reaching a recommendation (points or winner-take-all or dispatch or monte carlo, etc), or even change its methodology, but the rule finding process does not need to change with each change of the arbiter.

Sometimes, loose coupling is good.

What better way to represent rule findings than as terms of a type? We use this representation in our example rule-based system that follows (which is the poker hand classifier developed earlier). We already have the rules for straights, flushes, straight flushes and royal flushes:

type Query = [Card] → [(Rank,Int)] → Maybe Type

royalFlush :: Query
royalFlush hand x = straightFlush hand x >>= highAce
where highAce hand = RoyalFlush |-
hand ≡ StraightFlush (High (Face Ace))

straightFlush :: Query
straightFlush hand _ = (run `conjoin` sameSuit) hand >>=
return . StraightFlush

flush :: Query
flush hand x = let (HighCard hc r1 r2 r3 r4) = fromJust $ highCard hand x
in sameSuit hand >> return (Flush hc r1 r2 r3 r4)

straight :: Query
straight hand _ = run hand >>= return . Straight

The second argument of the Query type, which is a bag representation of the hand -- the rank of the cards and the count for each rank, was not used in the above rules, but becomes indispensable for the other rule findings, starting with the full house (3 of a kind and 2 of a kind in one hand):

fullHouse _ bag = let (a,3) = head bag
(b,2) = head (tail bag)
in return (FullHouse (Threes a) (Twos b))

No, that's not right. The let-expression is not an assignment, it's an unification! For, after all, if the hand is not a full house, then the "assignment" fails in its "pattern match", so, not only are we attempting to fix the values of a and b but we are also propagating a success or truth value through that process. Hm, the "success of an assignment", what type would that be? Ah! Monadic, of course, and list compression compactly captures the concept of essayed assignment which we then distill (my much less cumbersome name for the standard library's listToMaybe function):

fullHouse :: Query
fullHouse _ bag = distill [FullHouse (Threes a) (Twos b)
| (a,3) ← bag, (b,2) ← bag]

It is almost criminally embarrassing on how easy this rule is to translate from the specification: "A full house is the distillation of a hand into three a's and two b's."5 From this understanding, the other hands fall out naturally:

fourofakind :: Query
fourofakind _ bag = distill [FourofaKind (Fours a) (High b)
| (a,4) ← bag, (b,1) ← bag]

threeofakind :: Query
threeofakind _ bag = distill [ThreeofaKind (Threes a) (High b) c
| (a,3) ← bag, (b,1) ← bag,
(c,1) ← bag, b > c]

twopairs :: Query
twopairs _ bag = distill [TwoPair (Twos a) (Twos b) (High c)
| (a,2) ← bag, (b,2) ← bag,
a > b, (c,1) ← bag]

onepair :: Query
onepair _ bag = distill [OnePair (Twos a) (High b) c d
| (a,2) ← bag, (b,1) ← bag,
(c,1) ← bag, b > c,
(d,1) ← bag, c > d]

... and finally there's the catch-all for the "I'd better be good at bluffing" hand:

highCard :: Query
highCard hand _ = let [hc,r1,r2,r3,r4] = map settleRank hand
in return (HighCard (High hc) r1 r2 r3 r4)

settleRank :: CardRank
settleRank (Card rank _) = rank

Now that we have the rules, we need simply write the marshalling code, the rule-finding loop, and the arbiter to complete our rule engine.

The marshalling code pre-processes the hand into a bag in order to facilitate n-of-a-kind rule findings:

let groups = group $ map settleRank cards
cards = reverse (sort hand)
counts = map length groups
bag = zipWith (curry (first head)) groups counts

The rule-finding loop applies the rules to the (processed) hand in order:

rules = [royalFlush, straightFlush, fourofakind,
fullHouse, flush, straight,
threeofakind, twopairs, onepair,
findings = map (λf . f cards bag) rules6

And, finally, the arbiter in this case is very simple: it takes the first (highest ordered) rule-finding as the result.

in fromJust (msum findings)

With that, we have built the kernel of a complete rule-based system.

In this article, we demonstrated that Haskell, with standard extensions, can follow a process that constructs rules. This process not only matches the ease and simplicity of rule-building in Prolog, but it also has the benefit that these rules in Haskell have the type-correctness that provides a stronger degree of confidence in the rule-finding results.

End notes
1 I must be getting older. Not too long ago, my refutation would have been more emphatic ... to the tune of: "Wrong!"
2 I say "for my sake" because I'm the guy you're going to be complaining about: "All I wanted him to do is just add this one rule set to the system, and he goes off and rewrites the entire system to do it!"
3 While you're at it, put all your debug statements into aspects (because nl should not even appear in core production code much less be the most frequently called "goal" in the system) and then have a complete(ly automated) unit test suite so that additions or deletions are entirely covered by this safety net, but those are whole ('nother) discussions onto themselves.
4 But why choose a functional programming language like Haskell? If Prolog nearly fits the bill, wouldn't a well-typed logic programming language, say, Mercury, or others, fair better than Haskell? The problems with most of these languages is that they fall into the research/academic ghetto. The user base is comprised of the language developers and maybe one or two others, and so the languages stagnate in their insularity. Haskell, and some other languages, including Prolog itself, have garnered enough outside attention and use to grow organically beyond their inventors' control, boundaries and strictures -- these languages retain their academic foundations but also gain a set of practical, useful, tools and extensions.
5 ... cannot resist ... "or not two b's" *sigh*
6 The Applicative style (see this paper) would have the list of functions applied to the list of arguments thusly ...

map uncurry rules <*> [(cards, bag)]

... but I suppose that, in itself, opens up a new topic of discussion, which I will currently shunt to another day.