Thursday, July 24, 2008

My Top-"10" Movie List

The following article is presented as a literate (literal) Haskell program. You can read it, and you can just as easily feed it to a Haskell interpreter, such as ghci or hugs.

I often get into conversations with acquaintances about movies, and when a (really) good movie is mentioned, I exclaim, "That's on my Top '10' list!", because, well, it is. After about 17 of these exclamations in a (series of) conversations, I'm often called on my assertions: "Well, what are your top 10 favorite movies then?" So, I've been forced to show my hand. Which language, though, to display this (ever-changing) list? HTML, no? XML? No, dreck, and no-no-no (been there, done that; barely survived to tell the tale). Haskell, of course. What follows is my top "10" movies as a program as activated data. If you need to see the movies in something other than code, I've posted the run result: voilà.

> module Movies where

The philosophy of this system, other than the fact that I count to 10 accurately, but I appear to have difficulty stopping there (just as the protagonist (?) in the movie "High Fidelity" has in counting to 5), is that I use the State monad to control how the output of the database is formatted. The resulting movie database is a tabled-HTML output.

Quite simply, the State monad contains a (modular) counter, and when my logic determines, then things happen at each cycle.

The Reader transformer monad carries the constant of when I wish to do these things, so the net result is a State monad counter wrapped in a Reader (constant) transformer. This article is not about the (very useful) topic of monad transformers, so I recommend the excellent article that covers that material.
> import Control.Monad.State
> import Control.Monad.Reader

> data Cat a = Cat String [a]

... unfortunately, the type-class system balks at using bare or aliased types, such as string, so here I wrap String in the "S" data type and provide my HTML representation against that.
> type Caddy = Cat (S Int)   -- quite the cad, indeed!

> data S s = S s String

> instance Show (S s) where
> show (S s str) = cdata str []

Not particularly happy with the cdata transformation, either, but it's a quick and dirty implementation to handle a problem of character data in the database that would muck up an HTML representation.
> cdata [] ans = reverse ans 
> cdata (a:b) ans = cdata b ((if a == '&' then 'n' else a) : ans)

The typeclass Html is a pretty printer that generates HTML from the data set.
> class Html a where
> asHtml :: a → ReaderT Int (State Int) String

Here, when the typeclass is given an instance of "String" (wrapped in the S data type), it pretty prints columns, wrapping every n columns were n is the constant held by the ReaderT transformer monad.
> instance Html (S s) where
> asHtml x = do pre ← prefix
> post ← postfix
> return (pre ++ "<td>" ++ show x
> ++ "</td>" ++ post)
> where prefix = do idx ← get
> return (if idx == 0 then "<tr>" else "")
> postfix = do sz ← ask
> idx ← get
> put ((idx + 1) `mod` sz)
> return (if idx + 1 == sz
> then "</tr>\n" else "")

This is the engine control of the pretty printer. When given a category, it prints the title, resets the counter to zero, and then hands off the printing of columns to either itself (if the category (e.g. "Top 10 movies") contains categories (such as the genres)) or to the "String" printer (if the category just contains the movie list).
> instance Html a => Html (Cat a) where
> asHtml (Cat title list)
> = do rows ← ask
> put 0
> cells ← mapM asHtml list
> rest ← get
> return ("\n<tr><th colspan='"
> ++ show rows ++ "'>" ++ title
> ++ "</th></tr>\n" ++ foldl (++) "" cells
> ++ closeCells ((rows - rest) `mod` rows))
> where closeCells 0 = ""
> closeCells x = foldl
> (λx y → x ++ "<td>&nbsp;</td>")
> "" [1..x]
> ++ "</tr>\n"

The database of movies follows:
> scifi, documentary, ferrn, romcom :: Caddy
> western, musicals, comedy :: Caddy
> horror, brit, oldies, sitcom, indie, drama :: Caddy

> movies :: Cat Caddy
> movies = Cat "My Top 10 Movies"
> [scifi, ferrn, romcom, musicals,
> comedy, indie, documentary,
> horror, brit, sitcom, drama, oldies,
> western, anime]

> anime = Cat "Animation"
> (map (S 1) ["Emporer's New Groove", "Wallace & Gromit",
> "Shrek"])
> scifi = Cat "Science Fiction"
> (map (S 2) ["Solaris", "Blade Runner", "Galaxy Quest",
> "Donnie Darko", "Fight Club"])
> ferrn = Cat "Foreign"
> (map (S 3) ["Le Placard", "Yojimbo", "Bleu, Blanc, Rouge",
> "Siti no Samurai", "Sanjuro", "Eiron Shimbum",
> "Schultze gets the Blues", "Eat Drink Man Woman",
> "Legend of Drunken Master", "Shaolin Soccer",
> "Kung Fu (Hustle)", "Cronos"])
> romcom = Cat "Romantic Comedy"
> (map (S 4) ["George of the Jungle", "Charade", "Bull Durham",
> "Bride and Prejudice", "L.A. Story",
> "Moonstruck", "My Big, Fat Greek Wedding",
> "Bullets Over Broadway", "Clueless", "Ocean's 11",
> "Whole Nine Yards", "Princess Bride"])
> musicals = Cat "Musical"
> (map (S 5) ["Guys and Dolls", "Singin' in the Rain",
> "Mary Poppins"])
> brit = Cat "Brit"
> (map (S 6) ["Cold Comfort Farm", "Importance of Being Earnest",
> "Hot Fuzz"])
> oldies = Cat "Ole Timey"
> (map (S 7) ["O Brother, Where art Thou?",
> "To Have and Have Not", "Hopscotch"])
> western = Cat "Western"
> (map (S 8) ["Shanghai Noon", "Pale Rider",
> "Dances with Wolves"])
> comedy = Cat "Comedy"
> (map (S 9) ["Blazing Saddles", "Arsenic and Old Lace",
> "Dodgeball", "Young Frankenstein",
> "Harold & Kumar go to White Castle",
> "Revenge of the Pink Panther"])
> sitcom = Cat "Sitcom"
> (map (S 10) ["Addams Family", "Addams Family Values",
> "State and Main", "Rushmore"])
> horror = Cat "Horror/Suspense"
> (map (S 11) ["Jacob's Ladder", "Silence of the Lambs",
> "Shawn of the Dead"])
> drama = Cat "Drama"
> (map (S 12) ["Grave of the Fireflies", "Village", "Payback",
> "Shadowlands", "Station Agent", "Pleasantville",
> "Shoot 'em up", "Searching for Bobby Fischer"])
> indie = Cat "Indies"
> (map (S 13) ["Bagdad Cafe", "Ghost World",
> "Living in Oblivion", "My New Gun"])
> documentary = Cat "Documentaries"
> (map (S 14) ["Good Night and Good Luck",
> "thirty two short films about Glenn Gould",
> "Koyaanisqatsi", "Unzipped", "Ed Wood"])

This function calls the monadic system to print the movie database.
> showMovies :: Int → String
> showMovies cols = "<table>\n"
> ++ evalState (runReaderT (asHtml movies) cols) 0
> ++ "\n</table>\n"

> test = putStrLn (showMovies 3)

You can download it and see the embedded HTML in a code browser, such as eclipse. At any rate, executing test will get the above movie database in a tabled-HTML form.

What's the payoff?

Ah, yes. Well, if I had encoded my movie list in one of the currently popular metadata formats, such as HTML or XML/XSTL, then changing the number of columns per row would be prohibitively difficult with the former (HTML), and getting the metadata output as a properly formatted set of table rows would have been prohibitively difficult with the latter (XSTL). The Haskell program demonstrates both: you've seen how (relatively) easy it is to output the structured data as table rows of three columns per row, to change the number of columns (e.g.: to two columns per row), it's as simple as rewriting the test function to ...
test = putStrLn (showMovies 2)

... paid off; or, if you prefer, Q.E.D.

Alternative Implementations

Using the Reader monad is one of the ways to pass around "global" values, defining a ("global") constant function is another, sigfpe discusses those, as well as using the Reader Comonad, in an article on comonadic plumbing. I chose the Reader transformer monad instead of the Reader comonad for this article, because combining a comonad with a monad introduces arrows as the glue. Another article will explore comonads first before we dive into the thick of monad/comonad/arrow plumbing.

No comments: