Thursday, August 28, 2008

Scanner-parsers I: lifting functions

I'll start right off with an apology: I would've loved this entry to be on comonadic parsing, but I'm still trying to get my head around the Uustalu and Vene papers. I keep looking in them for something that says the equivalent of "... and this comonadic parsing technique resolves to something very much like Prolog's definite clause grammar", but I haven't found that conclusion there. Would somebody please point out the gap in my understanding?

Instead, we'll concentrate on converting a very simple command language into an internal representation using lifting functions. Daniel Lyon suggested this "Mars rover" problem.

> {-# LANGUAGE UnicodeSyntax #-}
> module MarsRover where

> import Data.Char
> import Data.List

Problem statement: MARS ROVERS
A squad of robotic rovers are to be landed by NASA on a plateau on Mars. This plateau, which is curiously rectangular, must be navigated by the rovers so that their on-board cameras can get a complete view of the surrounding terrain to send back to Earth.

A rover's position and location is represented by a combination of x and y co-ordinates and a letter representing one of the four cardinal compass points. The plateau is divided up into a grid to simplify navigation. An example position might be 0, 0, N, which means the rover is in the bottom left corner and facing North.

In order to control a rover, NASA sends a simple string of letters. The possible letters are 'L', 'R' and 'M'. 'L' and 'R' make the rover spin 90 degrees left or right respectively, without moving from its current spot. 'M' means move forward one grid point, and maintain the same heading.

Assume that the square directly North from (x, y) is (x, y+1).

INPUT:

The first line of input is the upper-right coordinates of the plateau, the lower-left coordinates are assumed to be 0,0.

The rest of the input is information pertaining to the rovers that have been deployed. Each rover has two lines of input. The first line gives the rover's position, and the second line is a series of instructions telling the rover how to explore the plateau.

The position is made up of two integers and a letter separated by spaces, corresponding to the x and y co-ordinates and the rover's orientation.

Each rover will be finished sequentially, which means that the second rover won't start to move until the first one has finished moving.

OUTPUT

The output for each rover should be its final co-ordinates and heading.

INPUT AND OUTPUT

Test Input:

5 5
1 2 N
LMLMLMLMM
3 3 E
MMRMMRMRRM


Expected Output:

1 3 N
5 1 E

Problem solution

Types

This problem decodes neatly into a state machine and a command interpreter, where the interpreter first lifts the input characters into their correspondingly typed values.
The state machine, a.k.a. the Mars rover, is:

> data Location = Loc Integer Integer
> data Orientation = N | E | S | W
>   deriving (Show, Enum, Read)
> data Position = Pos Location Orientation

The typed commands are:

> data Direction = L | R deriving Show
> data Command = Turn Direction | Move

... with some Show instances for sanity checks and output

> instance Show Command where
>   show (Turn dir) = show dir
>   show Move = "M"

> instance Show Location where
>   show (Loc x y) = show x ++ " " ++ show y

> instance Show Position where
>   show (Pos loc dir) = show loc ++ " " ++ show dir

Our scanner-parser for this very simple language grabs the next thing to be scanned from the input and returns it together with the remaining input.

> type Scan a = String → (a, String)

Lifting functions

The lifting functions are rather simple affairs, as they convert what's in the input stream to their typed internal equivalents:

> liftOrient :: Scan Orientation
> liftOrient command = head $ reads command

> liftLoc :: Scan Location
> liftLoc command = let (x, locstr):_ = reads command
>                       (y, out):_    = reads locstr
>                   in  (Loc x y, out)

... I'm not quite getting the intent of reads: it seems to return either the empty list (denoting failure) or a singleton list containing the value and the continuation (to be precise, the rest of the input after the first value is read). I'm not doubting its utility, but I do wonder why a list was chosen as the return type if reads behaves as a semideterministic function (fail or singleton success) rather than a nondeterministic one (fail or multiple successes). The Maybe data type is more closely aligned with semideterministic functions; lists, with nondeterministic ones. So I believe that either Either or Maybe would be a more appropriate return type for reads.
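
As a minimal sketch of that idea, the list that reads returns can be collapsed into a Maybe by keeping only the first parse on offer. The name readsMaybe is made up for illustration and is not a library function; listToMaybe is the standard one from Data.Maybe. Plain ASCII arrows are used here so the snippet compiles without any language extension.

```haskell
import Data.Maybe (listToMaybe)

-- Hypothetical helper: keep only the first parse that reads offers, if any.
readsMaybe :: Read a => String -> Maybe (a, String)
readsMaybe = listToMaybe . reads

-- A successful parse: the value plus the rest of the input.
okExample :: Maybe (Integer, String)
okExample = readsMaybe "12 N"      -- Just (12, " N")

-- A failed parse: reads returns [], which becomes Nothing.
failExample :: Maybe (Integer, String)
failExample = readsMaybe "N 12"    -- Nothing
```

Note that this throws away any additional parses, so it only suits types whose Read instance is unambiguous.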

At any rate, with the above two lifting functions, lifting the position of the rover (the first line in our command language) simplifies to chaining the above two lifting functions:

> liftPos :: Scan Position
> liftPos command = let (loc, rest) = liftLoc command
>                       (dir, out)  = liftOrient rest
>                   in  (Pos loc dir, out)

So, we have the scan and parse of the rover's starting position done. We're halfway there! Now we simply need to scan and then parse the commands. The neat thing for us is that the commands are atomic (just one character each), so there's no need for the continuation scheme that we used for scanning positions above. We simply map the command-lifting function over the line of commands (the "command line") to obtain our internal representation:

> liftCmd :: Char → Command
> liftCmd 'L' = Turn L
> liftCmd 'R' = Turn R
> liftCmd 'M' = Move

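
liftCmd is partial: any character other than 'L', 'R' or 'M' is a pattern-match failure at run time. Should that matter, one possible total variant is sketched below. The names liftCmdSafe and liftCmdLine are hypothetical, and the types are redeclared (with derived Show and Eq, unlike the hand-written Show instance above) only so that the snippet stands on its own.

```haskell
data Direction = L | R deriving (Show, Eq)
data Command = Turn Direction | Move deriving (Show, Eq)

-- Hypothetical total variant of liftCmd: unknown characters yield Nothing.
liftCmdSafe :: Char -> Maybe Command
liftCmdSafe 'L' = Just (Turn L)
liftCmdSafe 'R' = Just (Turn R)
liftCmdSafe 'M' = Just Move
liftCmdSafe _   = Nothing

-- Lift a whole command line; a single bad character fails the whole line.
liftCmdLine :: String -> Maybe [Command]
liftCmdLine = traverse liftCmdSafe
```

With these definitions, liftCmdLine "LMR" gives Just [Turn L,Move,Turn R], while liftCmdLine "LXM" gives Nothing.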
Command interpreter

What does a command do? It moves the rover or reorients the rover. What is movement? A change in Position (via a change in Location). What is reorientation? A change in Orientation. So, a Command's action (which is either to move or to turn the rover) results in a Δ (pron: "Delta") of the Position:

> type Δ a = a → a

> command :: Command → Δ Position
> command (Turn dir) (Pos loc orient)
>   = Pos loc (turn dir orient)
> command Move pos = move pos

> move :: Δ Position
> move (Pos loc dir) = Pos (dLoc dir loc) dir
>   where dLoc :: Orientation → Δ Location
>         dLoc N (Loc x y) = Loc x (y+1)
>         dLoc E (Loc x y) = Loc (x+1) y
>         dLoc W (Loc x y) = Loc (x-1) y
>         dLoc S (Loc x y) = Loc x (y-1)

> turn :: Direction → Δ Orientation
> turn dir orient
>   = toEnum $ (intVal dir + fromEnum orient) `mod` 4
>   where intVal :: Direction → Int
>         intVal L = -1
>         intVal R = 1

I suppose turn could've been modeled more realistically on radians, and would work just the same, but since the command language is so simple (turns are exactly 90°), I can get away with using the four points of the compass as (circularly) enumerated values.
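
To make the circular arithmetic concrete, here is a small standalone sketch of the same calculation. The types are redeclared with ASCII arrows so it compiles by itself, and leftFromNorth and fullCircle are names invented for the illustration. The case worth checking is a left turn from N: Haskell's mod takes the sign of the divisor, so (-1 + 0) `mod` 4 is 3, which toEnum maps to W. Using rem instead would give -1 and toEnum would fail.

```haskell
data Orientation = N | E | S | W deriving (Show, Eq, Enum)
data Direction = L | R deriving (Show, Eq)

-- Same arithmetic as turn above: step -1 or +1 around the compass, mod 4.
turn :: Direction -> Orientation -> Orientation
turn dir orient = toEnum $ (intVal dir + fromEnum orient) `mod` 4
  where intVal L = -1
        intVal R = 1

-- Turning left from N: (-1 + 0) `mod` 4 is 3, so we land on W.
leftFromNorth :: Orientation
leftFromNorth = turn L N

-- Four turns in the same direction bring us back to where we started.
fullCircle :: Orientation -> Orientation
fullCircle = turn R . turn R . turn R . turn R
```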

The Glue

Given all the above, the rover simply runs as follows.
Note that I ignore the grid-size command, as it appears to be superfluous to the problem definition (collision detection between rovers and out-of-the-grid bounding errors are NOT covered in the problem statement ... since they choose to ignore these issues, I will, too).

> runRovers :: String → IO ()
> runRovers commnd = let (_:commands) = lines commnd
>                    in  putStrLn $ run' commands
>   where run' [] = ""
>         run' (pos:cmds:rest)
>           = let (start, _) = liftPos pos
>                 mycmds = map liftCmd cmds
>                 stop = foldl (flip command)
>                              start mycmds
>             in  show stop ++ "\n" ++ run' rest

> runTestEx :: IO ()
> runTestEx = runRovers ("5 5\n1 2 N\nLMLMLMLMM\n"
>                        ++ "3 3 E\nMMRMMRMRRM")

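
For checking a run by hand, it can help to see every intermediate position rather than only the final one. The sketch below does that by swapping foldl for scanl, which keeps each accumulated state. It is a deliberately flattened, standalone variant and not the module above: Position carries its coordinates directly, Command has three plain constructors, and the names step, liftC and tracePath are all hypothetical.

```haskell
data Orientation = N | E | S | W deriving (Show, Eq, Enum)
data Position = Pos Integer Integer Orientation deriving (Show, Eq)
data Command = TurnL | TurnR | Move deriving (Show, Eq)

-- One transition of the state machine.
step :: Position -> Command -> Position
step (Pos x y o) TurnL = Pos x y (toEnum ((fromEnum o - 1) `mod` 4))
step (Pos x y o) TurnR = Pos x y (toEnum ((fromEnum o + 1) `mod` 4))
step (Pos x y N) Move  = Pos x (y + 1) N
step (Pos x y E) Move  = Pos (x + 1) y E
step (Pos x y S) Move  = Pos x (y - 1) S
step (Pos x y W) Move  = Pos (x - 1) y W

-- Partial, like liftCmd: only 'L', 'R' and 'M' are handled.
liftC :: Char -> Command
liftC 'L' = TurnL
liftC 'R' = TurnR
liftC 'M' = Move

-- Every position the rover passes through, starting position included.
tracePath :: Position -> String -> [Position]
tracePath start = scanl step start . map liftC
```

For the first test rover, tracePath (Pos 1 2 N) "LMLMLMLMM" yields ten positions, ending at Pos 1 3 N, which agrees with the expected output.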
Summary

So, there you have it, folks: scanning and parsing as mapped lifting functions. Would that all DSLs were that simple! That topic is for when I cover how I reinvent the Parsec wheel; in the meantime, you can read the next entry, which covers scanning and parsing for this problem with the State monad transformer.

1 comment:

Anonymous said...

Actually, reads is defined as returning a list of possible parses, so it need not always be a singleton when successful.

This seems rarely explained; for a long time I wondered why the parser chapter in Hutton's book used a list as the return value.