
Tuesday, September 2, 2008

Fuzzy unification parser in Haskell

Synopsis

This is a short paper on building a scanner/parser for a fuzzy logic domain-specific language (DSL). The system takes as input a file containing an ordered set of fuzzy statements and outputs the equivalent Prolog program. We first briefly and informally introduce the topic of fuzzy unification and then give a Backus-Naur Form (BNF) grammar of the fuzzy DSL. Next we provide example fuzzy statements and show their transformation into Prolog statements. We then present the Haskell types that form an internal representation (IR) of the fuzzy DSL, along with the instances of Show that output the Prolog predicates constituting the executable representation of the fuzzy DSL, followed by the scanner/parser itself. Finally, we translate two input fuzzy files and execute queries against the results in a Prolog listener.

This document is neither an introduction to fuzzy logic and unification nor a tutorial on how to build and weight fuzzy terms. The reader is referred to the rich library of online and offline publications on these topics.

Introduction

Standard unification in Prolog requires that two ground atoms be of the same type and of the same value in order to unify. This rigor is very good for proofs of program correctness and for settings where there is no room for tolerances; in short, for classic predicate-logic proofs, unification does exactly what we need it to do. However, standard unification hinders more than it helps in the presence of real-world, messy data, or where some generality is needed in, e.g., the decision-making process of an expert system.

One approach that provides some tolerance and generality in the face of messy data is to introduce fuzziness into the unification process. In this way, we may state facts with some degree of associated certainty, and we may embed fuzzy techniques in the rule-resolution process. Three such techniques for fuzzy rule-finding are:
  1. Product logic, where and_prod (x, y) = x * y
  2. Gödel intuitionistic logic, where and_godel (x, y) = min x y
  3. Lukasiewicz logic, where and_luka (x, y) = max 0 (x + y - 1)
These techniques are conjunctive and are implemented in the Prolog file named prelude.pl as follows:
and_prod(X,Y,Z) :- Z is X * Y.
and_godel(X,Y,Z) :- min(X, Y, Z).
and_luka(X,Y,Z) :- H is X+Y-1, max(0, H, Z).
The fuzzy DSL also allows disjunctions of the above. Their implementation can also be found in prelude.pl:
or_prod(X,Y,Z) :- Z is X + Y - (X * Y).
or_godel(X,Y,Z) :- max(X, Y, Z).
or_luka(X,Y,Z) :- H is X+Y, min(1, H, Z).
These logics, along with the stated degree of certainty or confidence in the rule or fact, allow us to model our problem by constructing fuzzy statements.
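For readers who want to see the arithmetic directly, here is a small stand-alone Haskell sketch of the same connectives (the names are illustrative only; these functions are not part of the parser developed below):

-- Illustrative Haskell versions of the prelude.pl connectives.
andProd, andGodel, andLuka, orProd, orGodel, orLuka :: Float -> Float -> Float
andProd  x y = x * y               -- product t-norm
andGodel x y = min x y             -- Gödel t-norm
andLuka  x y = max 0 (x + y - 1)   -- Łukasiewicz t-norm
orProd   x y = x + y - x * y       -- probabilistic sum
orGodel  x y = max x y             -- Gödel t-conorm
orLuka   x y = min 1 (x + y)       -- Łukasiewicz t-conorm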

Grammar

The grammar of a <program> in the fuzzy DSL that this scanner/parser supports is as follows:
<program> ::= <statement>+
<statement> ::= (<rule> | <fact>) <ss> "with" <ss> <float> ".\n"
<float> ::= Float

<fact> ::= <term>
<rule> ::= <term> <ss> <implication> <ss> <entailment>

<term> ::= <name> "(" <arguments> ")" | <name>
<name> ::= String1

<arguments> ::= <argument> <opt-args>
<opt-args> ::= "," <arguments> | ε

<argument> ::= <atom> | <variable> | <float> | <string>
<string> ::= "\"" String "\""
<variable> ::= String2
<atom> ::= <name>

<implication> ::= "<" <kind>
<kind> ::= "prod" | "luka" | "godel"

<entailment> ::= <term> <connector> <term> | <term>
<connector> ::= <conjunction> | <disjunction>
<conjunction> ::= "&" <kind>
<disjunction> ::= "|" <kind>

<ss> ::= " " <opt-ss>
<opt-ss> ::= <ss> | ε

1 no spaces, first character lowercase alpha, rest underscores and alphanums
2 no spaces, first character is "_" or upcase alpha

Transformation

An example of a statement of fact in the fuzzy DSL is as follows:
r(a) with 0.8.
An example of a rule statement is:
p(X) <prod q(X) &godel r(X) with 0.7.
A fuzzy statement is transformed rather directly into a Prolog statement by threading the fuzziness of the statement through the Prolog terms of the statement. This explanation is rather vague, but the examples demonstrate the mechanics of the transformation well enough. The fuzzy statement of fact is transformed into the following Prolog statement:
r(a, 0.8).
The fuzzy rule statement requires quite a bit more threading, and the system uses a chain of logic variables to effect this:
p(X, Certainty) :-
    q(X, _TV1), r(X, _TV2), and_godel(_TV1, _TV2, _TV3),
    and_prod(0.7, _TV3, Certainty).
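To see the threading numerically, suppose (hypothetically) that q(a) holds with certainty 0.9 and r(a) with certainty 0.6. The Gödel conjunction takes the minimum of the two truth values, and the product implication scales that result by the rule's own weight:

-- _TV1 = 0.9, _TV2 = 0.6
-- and_godel(_TV1, _TV2, _TV3)     gives _TV3      = min 0.9 0.6 = 0.6
-- and_prod(0.7, _TV3, Certainty)  gives Certainty = 0.7 * 0.6   = 0.42
certainty :: Float
certainty = 0.7 * min 0.9 0.6   -- 0.42 (modulo floating-point rounding)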


Strategy

This is a simple language, with no ambiguities, so it requires only a simple parser. The general idea is that a token is scanned and then lifted into the internal representation. This happens operationally under the aegis of the Maybe monad, which controls the flow of the parser: the system returns a Just foo when parsing succeeds and a Nothing when the scanner/parser encounters something unexpected. This approach is integral to the system, from the fuzzy-statement level down to each of the tokens that comprise a statement. This means that if something goes wrong in a line (and a statement is required to fit on exactly one line), then the entire statement is rejected. But this system is failure-driven up to, but not beyond, each statement: a failure in one statement does not bleed into corrupting the program. In short, this parser returns a program of the statements it can parse and omits, as noise, the ones it cannot.
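To illustrate that noise-dropping behaviour in isolation, here is a toy line-acceptor (standing in for the real statement parser below); mapMaybe simply discards the lines that fail:

import Data.Maybe (mapMaybe)

-- Toy stand-in for the real statement parser: accept only lines ending in a
-- full stop; anything else is treated as noise and dropped.
toyParse :: String -> Maybe String
toyParse line | not (null line) && last line == '.' = Just line
              | otherwise                           = Nothing

-- mapMaybe toyParse ["r(a) with 0.8.", "garbage line", "s(b) with 0.9."]
--   == ["r(a) with 0.8.", "s(b) with 0.9."]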

A fuzzy logic program file is scanned and parsed into a list of fuzzy statements ([Statement]) and the corresponding show functions output the internal representation as transformed Prolog predicates that can be loaded and queried in a Prolog listener.

Haskell Types

The Haskell types that form the internal representation of a fuzzy program follow the BNF rather closely (recall the technique of parsing via lifting functions; this module uses that technique):
> module FuzzyTypes where

> import Control.Arrow

> data Term = Term String [Arg]

A term requires no transformation from fuzzy DSL to Prolog:

> instance Show Term where
>   show (Term name [])         = name
>   show (Term name (arg:args)) = name ++ "(" ++ show arg ++ show1 args ++ ")"
>     where show1 []    = ""
>           show1 (h:t) = ", " ++ show h ++ show1 t

> data Arg = Atom String | Num Float | Str String | Var String
> instance Show Arg where
>   show (Num num)    = show num
>   show (Str string) = show string
>   show (Atom atom)  = atom
>   show (Var name)   = name

> data Kind = Prod | Luka | Godel
> instance Show Kind where
>   show Prod  = "prod"
>   show Luka  = "luka"
>   show Godel = "godel"
The following lifting function converts a string handed to the scanner into the corresponding Kind value.
> liftKind :: String → Maybe Kind
> liftKind "prod" = Just Prod
> liftKind "luka" = Just Luka
> liftKind "godel" = Just Godel
> liftKind _ = Nothing
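For example (evaluations shown as comments; an unknown kind fails, which in turn rejects the enclosing statement):

-- liftKind "luka"   -->  Just Luka
-- liftKind "fuzzy"  -->  Nothing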

> data Implication = Impl Kind
We don't have a Show instance for Implication because we need to weave in the thread of fuzziness from the consequence and entailment. So, we do the showing from the Rule perspective.
> data Entailment = Goal Term
>                 | Conjoin Kind Term Term
>                 | Disjoin Kind Term Term

> display :: Entailment → (String, Arg)
> display (Goal term) = (show . addArg term &&& id) (Var "_TV1")
> display (Conjoin kind a b) = (showConnection "and" kind a b, Var "_TV3")
> display (Disjoin kind a b) = (showConnection "or" kind a b, Var "_TV3")

> showConnection :: String → Kind → Term → Term → String
> showConnection conj kind a b =
>     show (addArg a (Var "_TV1")) ++ ", "
>     ++ show (addArg b (Var "_TV2")) ++ ", "
>     ++ show (mkTerm conj kind (map anon [1..3]))

> mkConnection :: Char → Kind → Term → Term → Maybe Entailment
> mkConnection conn kind t0 t1 | conn == '|' = Just $ Disjoin kind t0 t1
>                              | conn == '&' = Just $ Conjoin kind t0 t1
>                              | otherwise   = Nothing

> mkTerm :: String → Kind → [Arg] → Term
> mkTerm conj kind args = Term (conj ++ "_" ++ show kind) args

> anon :: Int → Arg
> anon x = Var ("_TV" ++ show x)
We've finally built up enough infrastructure to represent a fuzzy rule:
> data Rule = Rule Term Implication Entailment Float

e.g.: Rule (Term "p" [Var "X"]) (Impl Prod)
           (Conjoin Godel (Term "q" [Var "X", Var "Y"])
                          (Term "r" [Var "Y"])) 0.8

> instance Show Rule where
>   show (Rule conseq (Impl kind) preds fuzz) =
>     let cert = Var "Certainty"
>         fuzzyHead = addArg conseq cert
>         (goals, var) = display preds
>         final = mkTerm "and" kind [Num fuzz, var, cert]
>     in show fuzzyHead ++ " :- " ++ goals ++ ", " ++ show final
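Applying show to the example Rule above yields exactly the shape of clause we saw in the Transformation section:

-- show (Rule (Term "p" [Var "X"]) (Impl Prod)
--            (Conjoin Godel (Term "q" [Var "X", Var "Y"]) (Term "r" [Var "Y"]))
--            0.8)
--   ==> "p(X, Certainty) :- q(X, Y, _TV1), r(Y, _TV2), and_godel(_TV1, _TV2, _TV3), and_prod(0.8, _TV3, Certainty)"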
Representing and showing fuzzy facts turns out to be a rather underwhelming spectacle:
> data Fact = Fact Term Float
> instance Show Fact where
>   show (Fact term fuzz) = show (addArg term (Num fuzz))

e.g. Fact (Term "r" [Var "_"]) 0.7
Fact (Term "s" [Atom "b"]) 0.9
And a fuzzy statement is either a fuzzy rule or a fuzzy fact:
> data Statement = R Rule | F Fact
> instance Show Statement where
>   show (R rule) = show rule ++ "."
>   show (F fact) = show fact ++ "."
Yes, I realize the following implementation of snoc ("consing" to the end of a list) is horribly inefficient, but since all the argument lists seem to be very small, I'm willing to pay the O(n²) cost. If it becomes prohibitive, I'll swap out the term-argument (proper) list for a difference list.
> snoc :: [a] → a → [a]
> list `snoc` elt = reverse (elt : reverse list)

> addArg :: Term → Arg → Term
> addArg (Term t args) arg = Term t (args `snoc` arg)
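A quick sanity check of these two helpers:

-- [1,2,3] `snoc` 4                          -->  [1,2,3,4]
-- addArg (Term "q" [Var "X"]) (Var "_TV1")  -->  Term "q" [Var "X", Var "_TV1"]
--   whose Show rendering is:  q(X, _TV1)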
Haskell Scanner/Parser

The types defined above provide strong guidance for the development of the parser. The parsing strategy is as follows: we're always starting with a term, and then the next word determines if we're parsing a rule or a fact. A rule has the implication operators; a fact, the 'with' closure.

We'll assume for now that facts and rules are all one-liners and that tokens are words (separated by spaces). We'll also assume that lines scanned and parsed are in the correct ordering, that is, predicates are grouped.
> module FuzzyParser where

> import Control.Monad
> import Control.Arrow
> import Control.Applicative
> import Data.Maybe
> import FuzzyTypes
This module scans a file of fuzzy information and then parses that information into an internal representation, the output of which is the underlying Prolog representation. We weave nondeterminism into the fuzzy scanner/parser by transporting the parsed result in the Maybe monad: if we encounter a situation where we are unable to parse (all or part of) the Statement, the value flips to Nothing and that statement's parse bails out with fail.
> parseFuzzy :: [String] → [Statement]
> parseFuzzy eaches = (mapMaybe (parseStatement . words) eaches)

> parseStatement :: [String] → Maybe Statement
> parseStatement []          = Nothing  -- a blank line is noise, not a statement
> parseStatement (term:rest) =
>     let t = parseTerm term
>     in maybe (parseRule t rest >>= return . R)
>              (return . F . Fact t)
>              (parseFuzziness rest)
The Term is a fundamental part of the fuzzy system, and is where we spend the most time scanning/parsing and hand-holding (as it has a rather huge helper function: parseArgs).
> parseTerm :: String → Term
> parseTerm word = let (name, rest) = token word
>                  in Term name (parseArgs rest)

> parseArgs :: String → [Arg]
> parseArgs arglist = parseArgs' arglist
>   where parseArgs' []         = []
>         parseArgs' args@(_:_) =
>             let (anArg, rest) = token args
>             in parseArg anArg : parseArgs rest
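For instance, scanning and parsing a single term behaves as follows:

-- parseTerm "q(X,Y)"  builds  Term "q" [Var "X", Var "Y"]
--   whose Show rendering is:  q(X, Y)
-- parseTerm "r(_)"    builds  Term "r" [Var "_"]   -- "_" scans as a variable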
For parseArg we try to convert the argument to (in sequence) a number, a variable, a quoted string, and finally an atom; the first conversion that succeeds wins. We do this with some Control.Applicative magic (<*> applies each function in the first list to the argument in the second list), followed by some monadic magic: msum over Maybe returns the first successful value (atomArg, which always succeeds, guarantees there will be at least one success), and fromJust converts that Maybe success into a plain (non-monadic) value.
> parseArg :: String → Arg
> parseArg arg = fromJust (msum ([numArg, varArg, strArg, atomArg] <*> [arg]))
For the following functions, recall how my "implied-by" operator (|-) works: in a |- b, a is returned (in Just) when b is True; otherwise the result is Nothing. Given that, the functions below attempt to convert the scanned argument into a parsed (typed) one: a number, a (logic) variable, a string, or an atom:
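Since (|-) is not defined anywhere else in this post, here is a minimal definition consistent with that description (add it to this module if you don't already have it in a utility library):

> -- Implied-by: return the value in Just only when the condition holds.
> (|-) :: a → Bool → Maybe a
> x |- True  = Just x
> x |- False = Nothing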
Here's how we try to convert an argument ...

First we try to see if it's a number

> numArg :: String → Maybe Arg
> numArg x = Num (read x) |- all (flip elem ('.' : ['0' .. '9'])) x

Next, is it a (possibly anonymous) variable?

> varArg :: String → Maybe Arg
> varArg x@(h:_) = Var x |- (h == '_' || h `elem` ['A' .. 'Z'])

Maybe it's a string?

> strArg :: String → Maybe Arg
> strArg x@(h:t) = Str (chop t) |- (h == '"')

Okay then, it must be an atom

> atomArg :: String → Maybe Arg
> atomArg = return . Atom
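A few sample conversions (shown with their constructors rather than their Show renderings):

-- parseArg "0.7"      -->  Num 0.7
-- parseArg "X"        -->  Var "X"
-- parseArg "_Thing"   -->  Var "_Thing"
-- parseArg "\"txt\""  -->  Str "txt"
-- parseArg "a"        -->  Atom "a"    -- the always-succeeding fallback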

... and chop we shamelessly steal from the Perl folks.

> chop :: String → String
> chop list = chop' [head list] (tail list)
>   where chop' ans rest@(h:t) | t == []   = reverse ans
>                              | otherwise = chop' (h:ans) t
Now that we've laid the groundwork, let's parse in the statements. A statement is a fact or a rule. Remember that parseStatement parsed the first term and then branched on whether an implication followed (for a rule) or the with fuzziness closed out the statement (for a fact). So we'll tackle parsing a fact first; since a fact is just a term, and that term has already been parsed, pretty much all we need to do now is reify it into the Fact type:
> parseFact :: Term → [String] → Maybe Fact
> parseFact term fuzzes = return $ Fact term (read $ chop (head fuzzes))
That was easy! But, of course, the system does not consist of fuzzy facts alone: relations between facts (and rules) are described by fuzzy rules, and these require quite a bit more effort. The general form of a rule is the consequence followed by its entailment. The two are connected by conjunctive implication, which for this fuzzy logic system is one of the three kinds of logic described in the introduction.
> parseRule :: Term → [String] → Maybe Rule
> parseRule conseq rest =
>     -- the first word is the implication type
>     parseImpl rest >>= \(impl, r0) →
>     -- then we have a term ...
>     let t0 = parseTerm $ head r0
>     -- then either a connection or just the "with" closer
>     in parseEntailment t0 (tail r0) >>= \(ent, fuzz) →
>        return (Rule conseq impl ent fuzz)
Parsing the implication is easy: we simply lift the kind of the fuzzy logic used for the implication into the Implication data type:
> parseImpl :: [String] → Maybe (Implication, [String])
> parseImpl (im:rest) = guard (head im == '<') >>
>                       liftKind (tail im) >>= \kind →
>                       return (Impl kind, rest)
Parsing entailment also turns out to be a simple task (recall my description of how maybe works): we parse in a term and then attempt to parse a fuzzy value. If we succeed, it is a simple entailment (of that term only); if we fail to parse the fuzzy value, we proceed to parse the entailment as a pair of terms (the first one already parsed, of course) connected by a conjunctive or disjunctive fuzzy logic kind.
> parseEntailment :: Term → [String] → Maybe (Entailment, Float)
> parseEntailment t rest = maybe (parseConnector t rest)
>                                (\fuzz → return (Goal t, fuzz))
>                                (parseFuzziness rest)
The parser for compound entailment is also a straightforward monadic parser: it lifts the connector into its appropriate Kind, parses the connected Term and then grabs the fuzzy value to complete the conjunctive or disjunctive Entailment.
> parseConnector :: Term → [String] → Maybe (Entailment, Float)
> parseConnector t0 strs@(conn:rest) = liftKind (tail conn) >>= \kind →
>     parseFuzziness (tail rest) >>= \fuzz →
>     mkConnection (head conn) kind t0 (parseTerm (head rest)) >>= \ent →
>     return (ent, fuzz)
Finally, parseFuzziness reads the fuzzy value from the stream as a floating-point number, given that it is preceded by "with" (as dictated by the grammar):
> parseFuzziness :: [String] → Maybe Float
> parseFuzziness trail = read (chop (cadr trail)) |- (head trail == "with")
The rest of the system consists of low-level scanning routines and helper functions:
> cadr :: [a] → a
> cadr = head . tail

> splitters :: String
> splitters = "(), "

> token :: String → (String, String)
> token = consumeAfter splitters

> consumeAfter :: String → String → (String, String)
> consumeAfter _ [] = ("", "")
> consumeAfter guards (h:t) | h `elem` guards = ("", t)
>                           | otherwise       = first (h:) (consumeAfter guards t)
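With the helpers in place, the whole chain can be exercised on a single statement: token splits on the characters in splitters, and parseStatement reassembles the pieces into the very clause we built by hand in the Transformation section:

-- token "q(X,Y)"  -->  ("q", "X,Y)")
-- token "with"    -->  ("with", "")
--
-- parseStatement (words "p(X) <prod q(X) &godel r(X) with 0.7.")
--   yields a Statement whose Show rendering is:
--   p(X, Certainty) :- q(X, _TV1), r(X, _TV2), and_godel(_TV1, _TV2, _TV3), and_prod(0.7, _TV3, Certainty).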


Running the system

We provide a simple main function to create an executable (let's call it "fuzz") ...
> module Main where

> import FuzzyParser

> main :: IO ()
> main = do file ← getContents
>           putStrLn ":- [prelude].\n"
>           mapM_ (putStrLn . show) (parseFuzzy (lines file))
... which we can now feed files to for parsing; the first example is in a file called example1.flp:
p(X) <prod q(X,Y) &godel r(Y) with 0.8.
q(a,Y) <prod s(Y) with 0.7.
q(b,Y) <luka r(Y) with 0.8.
r(_) with 0.6.
s(b) with 0.9.
We run the system in the shell...
geophf$ ./fuzz < example1.flp > example1.pl
... obtaining the resulting logic program:
:- [prelude].

p(X, Certainty) :- q(X, Y, _TV1), r(Y, _TV2), and_godel(_TV1, _TV2, _TV3), and_prod(0.8, _TV3, Certainty).
q(a, Y, Certainty) :- s(Y, _TV1), and_prod(0.7, _TV1, Certainty).
q(b, Y, Certainty) :- r(Y, _TV1), and_luka(0.8, _TV1, Certainty).
r(_, 0.6).
s(b, 0.9).
... which can be loaded into any Prolog listener, such as Jinni or SWI:
geophf$ prolog

?- [example1].
yes

?- p(X, Certainty).
X = a, Certainty = 0.48 ;
X = b, Certainty = 0.32 ;
no
Similarly, a different fuzzy system, described in the file example2.flp:
p(X) <prod q(X) with 0.9.
p(X) <godel r(X) with 0.8.
q(X) <luka r(X) with 0.7.
r(a) with 0.6.
r(b) with 0.5.
... results in the following Prolog file (saved as example2.pl):
:- [prelude].

p(X, Certainty) :- q(X, _TV1), and_prod(0.9, _TV1, Certainty).
p(X, Certainty) :- r(X, _TV1), and_godel(0.8, _TV1, Certainty).
q(X, Certainty) :- r(X, _TV1), and_luka(0.7, _TV1, Certainty).
r(a, 0.6).
r(b, 0.5).
... and gives the following run:
geophf$ prolog

?- [example2].
yes

?- p(X, Certainty).
X = a, Certainty = 0.27 ;
X = b, Certainty = 0.18 ;
X = a, Certainty = 0.6 ;
X = b, Certainty = 0.5 ;
no


Conclusion

We've presented and explained a fuzzy unification scanner/parser in Haskell and demonstrated that system producing executable Prolog code against which queries may be essayed. The Haskell system is heavily influenced by strong typing of terms and is written in the monadic style. It comprises three modules, totalling fewer than 250 lines of code. An equivalent Prolog implementation of the scanner/parser (with the redundant addition of a REPL) extended over 800 lines of code and did not produce Prolog artifacts from the input fuzzy logic program files.

2 comments:

  1. Why aren't you using some parsing library, e.g., parsec?

  2. Why are you writing this in Haskell at all?

    With SWI-Prolog's term expansion and user-defined operators, I bet you can write your fuzzy code as it is inside an ordinary Prolog module among ordinary Prolog predicates. Using the term expansion mechanism would mean that your clauses would be transformed on the fly while loading the code in the interpreter.

    I bet this can be done in under 100 lines of Prolog code.
