Tuesday, June 2, 2009

Realized Constants are Comonadic

An interesting problem that often arises is "to make" constants. Put another way, it often happens that a system acquires information over time. The system may wish to formalize what it has acquired by creating a constant value.

Here's the problem, however: "Variables don't; Constants aren't"citation?

Or, put another way:

1. One man's constant is another man's variable.


To make a constant sometime down the road, as it were, in languages that have logic variables is simplicity itself: once a variable is unified with a value, it keeps that value throughout that proof.

In "pure" functional languages, that is, languages that do not have side effects, the same can be said.

What about the rest of the world? Take Java, for example. One can make a variable keep its value by declaring that variable final, but that is not helpful if we do not know what our constant value is to be at the time of object creation.

In what scenario could this possibly occur? We need a constant value, but we do not know what it is?

Actually, this situation arises quite frequently. Let's take a concrete example. You have many databases in your database management system, and at compile time you do not know on which port your DBMS is running nor do you know which database the user wishes to query. That's fine, you can lazily create the database connection and access the value through a method:

2. Functions delay binding; data structures induce binding. Moral: Structure data late in the programming process.


But what's not fine is this: the user is data-mining, examining chunks at a time, occasionally calling for the next chunk. How does one know that one has fallen off the end of the table? A simple SELECT on the maximum indexed key will tell you that, but to do that with every query? This seems wasteful. So, once we do the query once, let's just store that value in a variable with a flag to say we've already looked it up.

That sounds suspiciously like a lazily initialized constant, right?

But here's the rub: code that sets a flag allows for other code to unset it, just as code that sets a constant value allows other code to unset that. Using just plain-old regular variables gives no guarantee that the variable, once set, stays set at that constant value.

What to do?

Well, what is the work-flow of this process? We perform an action, checking that we are still within the bounds the valid domain, but the domain only becomes valid after program start, so we cannot make the bounds constant using the final keyword on the variable. This is a very common programming action ... sort of like ... dare I say ... a design pattern. Gee, I wish there was one invented that did all this.

In fact, there does exist such a pattern, and it comes from us via Category Theory. The programming language Haskell has incorporated elements of Category theory in its use of monads and arrows, but the downside of these ways of thinking about computation is that one must "lift" the entire computation into that domain, transforming the original computation to work in that regime.

No, what is needed is something less intrusive, and I found that less intrusive thing when I read an article by sigfpe on comonadic plumbing. In this article he describes three different ways of looking at constants: 1. as a constant, 2. as the Reader Monad, 3. or as the Reader Comonad.

The first one is sometimes untenable, given the programming need, and doesn't work for our case.

The second one is useful if one is already programming in a monad. Umm ... how many OO programmers use monads? Hands up.

[sound of crickets] ... thought so.

The third one allows one to use the constant on demand. And here's the thing, the comonad is very easily comprehended: it has a value that can be extracted "down" from the comonad, it allows a value to be extended over the comonad.

This sounds, in OO parlance, very much like an object. Let's ignore the extentionability of the comonad for now and simply look at extraction (this narrowed functionality is a thing in and of itself. Objects of this type are called Copointed, but let us not be too dogmatic here). Creating such a type is simplicity itself:

> public interface Comonad<T> { public T extract(); }

Um, yawn?

But that is the power of pattern language: not the ability to create these incredibly complex things in a controlled way (they do do that), but the ability to recognize that such and so is a pattern and then encapsulate behavior into the pattern.

Using the comonad pattern, we need simply to make our "maximum row number of Table x" an implementation of the Comonad interface, and then, when we do have enough information to create the database connection (that is, we now have our domain in which our constant resides), we instantiate the comonadic object (which is that domain). Whenever extract is called, it simply returns the constant value required, with the implementation that it does a database look up the first time, then it does the internal constant value look up all other times. Since the Comonad interface does not have methods to change its internals (here), so long as one has the copointed object, the value extract has a guarantee that it remains constant.

Summary

This article examined a commonly recurring problem: one needs to constify a value at some point during a program run and guarantee that it remain constant after being created. A simplification of the comonad was offered as a pattern that is simple to define and to implement.