So, today's question on @1HaskellADay was this:
write a function
countOccurences :: [Stirng] -> Map Char Int
(typos faithfully reproduced)
lookup 'l' $ countOccurences "Hello"~>
lookup 'q' $ countOccurences "Hello"~> Nothing
Okay, that can be done easily enough, I suppose, by torquing Map into something that it isn't, so one gets wrapped around the axel of creating a mapping from characters to occurrences.
First of all,
countOccurencesmaps a String (not a List of Strings) to a Map, and that map is a very specialized kind of map that has existed in the literature for quite a while, and that map is known as the Bag data type, and is also, nowadays, called the MultiSet by people too embarrassed to say the word 'bag' in a sentence, because of their prior drug convictions.
("I got two months for selling a dime bag.")
So they now twist the word 'Set' (a 'collection of unique objects') to mean something that's not a set at all, the 'Multi'Set, which is a 'collection of unique objects, but you can have multiples of these unique objects, so they're not unique at all, so it isn't a set at all, but we need to say the word 'set' because we can't say the word 'bag' because saying the word 'bag' would make us sound plebeian for some reason.'
Yeah, that. 'MultiSet.'
What. Ev. Er.
But I digress.
So I COULD write the
String -> Map Char Intfunction, but then: why bother? You can either write tons of algorithmic code that obscures the intent or just simply use the appropriate data type.
I went for the latter.
Now, I wuz gonna do a dependently-typed pair to represent an occurrence...
... notice how
countOccurencesis so badly misspelled, by the way?
SOMEbody didn't QA-check their problem for the day today, I'm thinking.
... but then I said: 'eh!'
I mean: WHY is
lookup 'q' $ countOccurences "Hello"~> Nothing?
WHY can't it be that
count 'q'for a Bag Char representation of "Hello" be 0? 0 is a valid answer and it keeps everything nice and monoidal without having to lift everything unnecessarily into the monadic domain.
So, yeah. Let's do that, instead.
So, here we go, and in Idris, because that's how I'm rolling these days. The advantages of dependent types have been enumerated elsewhere, so we'll just go with that they're better as an assumption and move on, using them, instead of extolling them, in this post.
So, my first attempt at Bag crashed and burned, because I did this:
data Bag : (x : Type) -> Type where
add : Bag x -> x -> Bag x
emptyBag : Bag x
and the compiler was fine with that. Hey, I can declare any type I'd like, so long as the types just stay as types, but as soon as I tried to define these things:
emptyList : List x
emptyList = 
emptyBag = Bag emptyList
add (Bag ) x = Bag [(x, 1)]
add (Bag ((x, y) :: rest)) x = Bag ((x, y + 1) :: rest)
add (Bag ((z, y) :: rest)) x = Bag ((z, y) :: (add rest x))
The compiler looked at me and asked: 'geophf, what in tarnation are you-ah tryin' to do?'
And about the only intelligent answer I could muster was: 'Ummmm... idk.'
I had gotten too clever for myself by half, trying to reshape a data type you learn in Comp.Sci. 101 as a purely functional type.
Back to Basics ... (but not BASIC)
So, let's just declare Bag to be what it is and KISS: 'keep it simple, stupid!'
data Bag x = Air | Stuffed (x, Nat) (Bag x)
Now, I so totally could've gone with the balanced binary-tree representation instead of the simple and standard linked list, but, you know: 'live and learn!'
With this declaration the emptyBag becomes so trivial as to be unnecessary, and then add is simplicity, itself, too, but add is, either way, so that's not saying much.
add : Eq x => Bag x -> x -> Bag x
add Air x = Stuffed (x, 1) Air
add (Stuffed (z, y) rest) x =
case x == z of
True => Stuffed (x, y + 1) rest
False => Stuffed (z, y) (add rest x)
Now, you see me relying on the
case-statement, here. Unhappily.
I'd like my dependent types to say, 'unify x with x (reflexive) for the isomorphic case, and don't unify x with z for the other case.' But we're not there yet, or my coding isn't on par with being there yet, so I forced total coverage bifurcating the result-set into isomorphic and not with a hard-case statement.
Ick. I hate explicit case-statements! Where is really, really, really smart pattern-matching when I need it?
add,constructing a Bag becomes easy, and then counting elements of that bag is easy, too (again, with another
count : Eq x => x -> Bag x -> Nat
count _ Air = 0
count x (Stuffed (z, y) rest) =
case x == z of
True => y
False => count x rest
countOccurences(with one-too-few 'r's in the function name) becomes easy, given the Bag data type:
countOccurences : String -> Bag Char
countOccurences str = co' (unpack str) where
co'  = Air
co' (char :: rest) = add (co' rest) char
But look at this:
depth : Bag x -> Nat
depth Air = 0
depth (Stuffed _ rest) = 1 + depth rest
sample : ?bag
sample = countOccurences "The quick, brown fox jumped over the lazy dog."
bag = proof search
When we do a
depth sample, we get the not-surprising answer of
29 : Nat
Perhaps this could be made a tad bit more efficient?
Well, then, let's do that!
data Bag x = Air | Stuffed (x, Nat) (Bag x) (Bag x)
We make Bag balanced, with the
add-function, doing the work of (very simply) branching off new nodes:
add : Ord x => Bag x -> x -> Bag x
add Air x = Stuffed (x, 1) Air Air
add (Stuffed (z, y) less more) x =
case (compare x z) of
LT => Stuffed (z, y) (add less x) more
GT => Stuffed (z, y) less (add more x)
EQ => Stuffed (z, y + 1) less more
Then all the other functions change ('morph') to work with a tree, not a list and work with Ord elements, not with (simply) Eq ones.
And so, the redefined
depth-function gives a very different result:
9 : Nat
Not bad! Not bad! The improved data-structure improves efficiency across the board from O(N) to O(log N).
Hm, perhaps I'll have
countreturn a dependently-typed pair, just as the library function
filterdoes on List types, but not tonight.
Good night, Moon!