Incorporates strong typing over predicate logic programming, and, conversely, incorporates predicate logic programming into strongly typed functional languages. The style of predicate logic is from Prolog; the strongly typed functional language is Haskell.
Tuesday, May 7, 2013
Small survey of Category Theory introductions
Some time ago a friend asked for a good introductory work on Category Theory. I never did answer his question to my satisfaction: what I knew of the subject I had picked up piecemeal, as I needed it, and I thought there was no succinct introductory work.
Well, I thought wrongly.
http://en.wikipedia.org/wiki/Categories_for_the_Working_Mathematician
The link above points to the seminal summa, available to you if you wish to pursue this delightful area of research into expressivity in mathematics.
Also, of course, there's the working-quantum-physicist's introduction at:
http://math.ucr.edu/home/baez/categories.html
Having a working knowledge of quantum computation is not necessary, probably not even helpful, but Quantum Computation and Quantum Information by Nielsen and Chuang is a very nice introduction, if you wish to see the source from which I came to categories and Category Theory.
Dr. Baez's work starts off lightly and playfully, but then gets pretty deep pretty quickly, as he goes into the groupoid/topoid-theoretic applications of Category Theory. That's to be understood: quantists are always concerned with (super-)symmetries. I, not so much, as I look for the more practical application of categories in monoids and the relational calculus. But there it is.
I do, of course, have more advanced works on this topic if you wish to research further, and there's always this blog, where I look at the logical implications of cat theory (heh: 'logical' 'implications' ... Math humor). There is, e.g., an introductory article on monads and their computational application at:
http://logicaltypes.blogspot.com/2011/09/monads-in-java.html
Wednesday, December 19, 2012
That's a pretty good guess... Naïve Bayesian Classifiers
... so, I don't need to write any more, correct? Classification problem? Apply a naïve Bayesian system to it, and you are way ahead of the curve, and you've expended a fraction of the effort to stand the system up.
... shortest posting ever written.
So, anyway.
So, a Bayesian classifier is a predictor. If it were to speak, it would say, "Hm, I've seen this before, so I think this is a that." And, given a reasonable tolerance for error or noise, it's pretty good at doing its job. In fact, it works supremely well in noisy environments, shines in them, even, where more rigidly defined models tend rapidly to become unstable at the slightest irregularity (the inductive-logic-programming model comes to mind).
Yes, I've worked on classification systems. We've had a hybrid system composed of a deductive rule-based system (that we hashed out with Subject Matter Experts over months of establishing, parameterizing, then fine tuning each rule) running in parallel with a Bayesian classifier. Guess where 99% of our usable results came from?
Sir Bayes walked away with the win, nearly every time.
How does the naïve Bayesian classifier work?
The algorithm is simplicity itself ... embarrassingly so. It rests on a single observation: the probability of something happening, or being so, depends on all of its antecedents.
Obviously. All these antecedents occurred, lining up to cause this thing to be.
Let's give a real-world example.
The probability of Tennessee winning last Monday's game (they did) was predicated on:
- It was a Monday night game -- Monday or F1
- They were playing on their home field -- Home or F2
- They were favored to win (with an even 2-point spread advantage) -- 2.0 spread or F3
- They were playing against the New York Jets -- vs(NYJ) or F4
Computing that probability directly, as one joint observation of all the factors together, works, but only when everything lines up exactly so.
So the brilliance of Bayes, with the 'naïve' independence assumption layered on top, is that the above probability can be rewritten (up to a shared normalizing constant) as:
p(win(Tennessee)) * Πi p(Fi | win(Tennessee))
And there you have your classifier: you run each class through the probabilities of its antecedents, and the class with the greatest probability is the winner.
Simple, and the more input data the system gets, matched against labeled classes, the better it gets at classification.
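As a sketch of the above (Python, used here purely for illustration; the class and the made-up game counts are mine, not the system described in this post), the whole classifier fits in a few lines:

```python
from collections import Counter, defaultdict

class NaiveBayes:
    """Minimal naive Bayesian classifier over discrete features."""

    def __init__(self):
        self.class_counts = Counter()               # times each class was seen
        self.feature_counts = defaultdict(Counter)  # per-class feature tallies

    def train(self, features, label):
        self.class_counts[label] += 1
        for f in features:
            self.feature_counts[label][f] += 1

    def classify(self, features):
        total = sum(self.class_counts.values())

        def score(label):
            n = self.class_counts[label]
            p = n / total                                  # the prior, p(class)
            for f in features:
                p *= self.feature_counts[label][f] / n     # p(F_i | class)
            return p

        # the class with the greatest probability is the winner
        return max(self.class_counts, key=score)

nb = NaiveBayes()
nb.train(["monday", "home", "favored"], "win")
nb.train(["monday", "home", "underdog"], "win")
nb.train(["sunday", "away", "underdog"], "loss")
print(nb.classify(["monday", "home"]))  # -> win
```

Training is just counting; classification is just multiplying tallied frequencies, which is why the classifier keeps improving as labeled data accumulates.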
Problems
Well, I never promised you a rose garden...
One of the problems is a real-world concern, not a mathematical one: many of these probabilities are minute, and multiplying them together ramps the scale down to the minuscule, very quickly. Computers are ... well: 'eh' at handling very small fractional numbers, and floating-point representation rolls imprecision into them, compounding the problem with each multiplication until the product underflows to zero.
How to handle these very small numbers?
Well, obviously, scale them. And this is very simply handled, for, after all, if it's a sense of scale that concerns you, you just rescale your domain. The logarithm does that. 'Automagically.'
So, the above classifier can be rewritten:
log(p(win(Tennessee))) + Σi log(p(Fi | win(Tennessee)))
And the scaling issue for computers just goes away.
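As a minimal illustration (Python; the function name and the numbers are mine, chosen for demonstration), the log rewrite turns a product that underflows into a sum that stays comfortably representable:

```python
import math

def log_score(prior, likelihoods):
    """Sum of logs replaces a product of probabilities: same ordering, no underflow."""
    return math.log(prior) + sum(math.log(p) for p in likelihoods)

# A product of a thousand small probabilities underflows to exactly 0.0 ...
probs = [1e-5] * 1000
product = 1.0
for p in probs:
    product *= p
print(product)                # 0.0: every class scores zero, nothing can be ranked

# ... but the log-space score is an ordinary, modestly sized float.
print(log_score(1.0, probs))  # about -11512.9
```

Because the logarithm is monotonic, the class that maximizes the log score is the same class that maximizes the original product, so nothing about the classification changes.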
There's another problem, however, and that is how to deal with the messiness of real-world input data.
The naïve Bayesian classifier assumes that all the evidence is present for each class. If a single piece of evidence is missing, its probability will be zero, instantly voiding the entire formula (in log space, log 0 is negative infinity).
So, what to do?
I've observed that voiding the formula for one datum often leads to classifiers that oscillate too wildly and are therefore unusable.
So, I neutralize the nulled probability: instead of letting it void the formula, I set it to unit, and it therefore acts as a 'pass-through.'
This is an artifice, I know. I'm intentionally saying a missing datum is actually fully present, but I've found that this works 'eh' ... that is 'passably well.'
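A sketch of this pass-through trick (Python; a hypothetical scorer written for this post, not the author's production code). In log space, 'unit' probability means contributing log 1.0 = 0.0, so a missing datum simply adds nothing:

```python
import math

def log_score(prior, likelihoods):
    """Score one class in log space; None marks a missing datum.

    Instead of letting missing evidence zero out the whole product,
    treat it as probability 1.0; log(1.0) == 0.0, a 'pass-through'.
    """
    total = math.log(prior)
    for p in likelihoods:
        if p is None:        # missing datum: neutralize, don't void
            continue         # contributes log(1.0) == 0.0 to the sum
        total += math.log(p)
    return total

complete = log_score(0.5, [0.8, 0.3, 0.6])
one_hole = log_score(0.5, [0.8, None, 0.6])  # middle datum missing
print(one_hole > complete)  # True: a hole removes a penalty, never adds one
```

Note the artifice the post admits to: a hole can only raise a class's score relative to its complete self, which is exactly why classifiers built this way are 'passably well' behaved rather than principled.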
I've done this on two classifiers so far, and the classifiers are pretty darn good, but not good enough to my mind.
This is a fundamental problem with the naïve Bayesian classifier, and I'm sure everyone who undertakes this approach has to deal with this, one way or the other. What are good ways to make the naïve Bayesian classifier much more effective, but without excessive effort or thought around null data points that come up?
A problem, but not a crippling one.
I won last week's picks, after all, and that's not the first time.
Tuesday, December 18, 2012
Logical Types, LLC, Archived News
2007-03

29: I have eliminated my "corrections" to the Mercury extras xml library (I had an outdated version of the library, as outlined in the 2007-03-18 news post, so my corrections were redundant to the Mercury team's). At the same time, I have moved my XML enhancements into the utils library and promoted the doug_graph module from alpha; it is now also included in utils.

18: A new library and several improvements can be found in the shared repository:

2007-02

14:

02: We present a very tiny foreign interface to libtiff. The module tiffany allows reading and writing simple RGBA TIFFs with two-dimensional arrays. As the matrix module matures, we will use that protocol for more comprehensive image filtering. This interface is presented with some samples and (currently) no documentation other than code comments, and is available from our shared repository. Both libtiff and the matrix protocol must be available to run the samples.
Update 2007-02-06:

2007-01

31: As the Mercury standard library does not have a matrix protocol, I am building a module that handles some basic matrix operations; this module and its test suites will be added to the utils library when in serviceable condition.

30: Thanks to the efforts of the Mercury research group at Melbourne, their Mercury compiler (mmc) is now available for Macintosh computers with the Intel chipset. This is the alpha ROTD (2007-01-21), as opposed to the currently sanctioned 0.13.1 release, so some things are experimental in this version of the compiler. Email logicaltypes.com if you wish to have a binary distribution.
Our compiler, ltq (which allows syntactic extensions with the op/3 directive; see the news archives at 2006-03-02), may need some work, as the Mercury research team has changed the ops module. This, of course, also means that systems that depend on these improvements, such as the test-building framework qcpt (see news archive post 2006-05-12), must also be retested. We will post an update when qcpt is again tested and working.

2006-07

17: The various data structures provided in the ROTD extras/ distribution that handle solver types (particularly the various any modes) do not compile because of purity issues; also, since their release, the coding style has changed for module qualification. The fixes to get these types compiled are here. For the Mercury team's review are the diffs to fold back into the distribution.

2006-05

13: We present a complete rewrite of Mercury's QuickCheck implementation: qcheck2 (the sample unit test module, peano_unit_tests, requires the peano module in order to run). The essence of testing à la qcheck -- type discernment to obtain (random) test values with a test specification language -- remains unchanged. The new features of this version enhance the system with:
The documentation for qcheck (upon which qcheck2 is built) is a model for any system to follow; so documentation for qcheck2, to include a system description and transition guide, is under development.
Update 2006-05-20:
Update 2006-05-21:

12: ltq (the Logical Types Quicksilver compiler) now comes with qcpt (QuickCheck Predicate Types), a system that discerns the interface predicates and functions of a system (to facilitate comprehensive unit testing). Perhaps even more important is the inclusion of a README that doubles as a HOWTO and INSTALL document.

11: Completed the document describing an implementation of mutable syntax for Mercury. This article is available in the literature section (see above).

2006-04

30: The graph module takes a rather non-directed approach to locating a path from Node1 to NodeN. This is fine if the path has no associated cost, but if one is looking for the best path (where a path has an associated cost), this laissez-faire approach becomes problematic.
As I often need an efficient path in a graph, I have added best_path/5 and other supporting predicates that cannot be implemented well operationally given the protocol of module graph, and I have submitted these changes to the Mercury team for review. While they consider these changes, I provide the implementation here as module doug_graph, with an associated example. Locating the best path in the given example using the naive path implementation took over 20 seconds; the reimplementation here completes the computation in less than a second.

27: The following Mercury compiler distributions are available from here on request:
These distributions include extras/, the samples/ (in extras/) and the HTML documentation (in doc/).

2006-03

2006-02

2006-01

2005-12
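The 2006-04-30 entry above contrasts cost-blind path finding with best-path search. The Mercury doug_graph code itself is not reproduced here, but the idea behind a best_path predicate can be sketched with Dijkstra's algorithm (Python, with a hypothetical toy graph; the function name echoes the Mercury predicate, nothing more):

```python
import heapq

def best_path(graph, start, goal):
    """Cheapest path from start to goal in a weighted graph.

    graph: dict mapping node -> list of (neighbour, cost) pairs.
    Returns (total_cost, path) or None if the goal is unreachable.
    """
    frontier = [(0, start, [start])]   # (cost so far, node, path taken)
    seen = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path          # first pop of goal is the cheapest
        if node in seen:
            continue
        seen.add(node)
        for neighbour, step in graph.get(node, []):
            if neighbour not in seen:
                heapq.heappush(frontier, (cost + step, neighbour, path + [neighbour]))
    return None

g = {"a": [("b", 1), ("c", 10)], "b": [("c", 2)], "c": []}
print(best_path(g, "a", "c"))  # (3, ['a', 'b', 'c'])
```

An undirected, cost-blind search is free to return the direct but expensive edge a-c; expanding nodes in order of accumulated cost is what buys the 20-seconds-to-under-a-second difference the entry describes.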
Logical Types Libraries
Libraries, etc.

Logical Types has the following libraries, systems, and sample code. All are provided with the usual caveats (they are not guaranteed to work, and Logical Types is not liable for your downloading and using this code):

Libraries

utils: a set of utilities useful for building production systems; contains modules:

qcheck2: a testing/verification framework for Mercury programs; see:

Systems

ltq: extends Mercury with op/3 declarations, allowing syntax modification; see installation instructions

Fixes/Patches

anys: a set of fixes to the utilities provided in the Mercury extras distribution supporting operations with the any mode; see:

Alpha: You Have Been Warned!

matrix: work in progress to add a viable matrix protocol; will be rolled into the utils library.

tiffany: a very small, thread-unsafe, foreign interface to libtiff; also requires matrix (which it already bundles).
Installation Instructions for QuickSilver
PowerPC/Apple architecture

Creating syntax with op/3 can become complicated when several operators interact to create a term. I've provided a module that prints the canonical representation of a parsed term (write_canonical.m) and a testing module (test_op.m) that allows prototyping of operator declarations and submitting terms under that syntax. The whole test system may be built in the usual way:

$ mmake test_op.depend
$ mmake test_op
Logical Types Company Mission Statement
Rule-based Systems

Rule-based systems process data according to sets of constraints established by the user. The results of these systems are a collection of rule findings that can be used to construct the final product or to assist the user in rendering an informed decision.

The above description may seem all too general, as it describes what most programs do. This is indeed correct: rule-based programming is sufficiently powerful to describe any computable system. In fact, the rule-based approach is now viewed by the mainstream as the preferred method for workflow analysis, process, resource scheduling, service-based systems, etc. Nearly every software system has a set of rules, explicit or implied, to which it adheres. The rule-based programming style models these rule constructs directly and facilitates their manipulation as the system grows and changes.

Deductive Logic

The traditional approach to building logic systems is to construct a set of clauses where the head of the clause matches a condition and the rest of the clause verifies the match through a set of goals for that match. These clauses are known as rules; a set of rules is a predicate, and this predicate can be used, in turn, as a goal in a new rule. Deductive logic matches problem specifications very closely and is a very effective way to convert a set of requirements into a production system. Some examples of types of problems that are easily modelled in deductive logic are expert systems, planning systems and scheduling systems.

Deductive systems are used when the rules are clear and the user requires certain outcomes; they are very good at "explaining" what the rule findings are and how they were arrived at.

Inductive Logic

The opposite approach to a deductive system is an inductive one. Whereas in a deductive system the user has very exacting control over the process and outcome, in an inductive system the rules are obtained by deriving the relations between input data and their outcomes, with very little guidance, if any, from the builder of the system. Traditional inductive systems required very clean data and had little tolerance for deviation; a slight perturbation in the data set could cause the system to fall into an undefined state. Modern inductive systems have taken a different approach: reaping the benefits of recent advances in probability and statistics, these systems (such as Bayesian systems and neural networks) are highly redundant and adaptive. These new systems consistently perform well: they have excellent success narrowing to a classification from apparently unrelated attributes, and they have a high rate of stability, being very fault-tolerant, even in the presence of very noisy data. It is also trivial to convert a statically trained inductive system to one that learns continuously from new inputs and outcomes.

Inductive systems are used when users cannot explain how they arrive at decisions (attributing the outcomes to a "feel" for the situation), and where gradual trends result in eventual changes to outcomes. These systems excel at making the correct decision with a very high degree of confidence, but are poor at explaining what prompted the decision.

Our Approach

Logical Types, LLC uses both deductive and inductive logic to build systems as the needs of the customer demand:
PADL Symposium 2006 Post-Mortem
After Action Report: PADL 2006
Practical Aspects of Declarative Languages
Introduction
This is a highly opinionated report on the presentations at PADL 2006. The sessions are presented chronologically, but are tabled in the contents by topic, and cross-referenced by their practicality (i.e., what I found useful in each presentation).
Presentations
Cross References

By topic:
- Invited Speakers: Phil Wadler; Erik Meijer; no-show: David Roundy
- "Practical" "Applications" [1]: CHRs for testing; Probabilistic music; Genomes with distribution type
- Constraints: Optimal paths with dominators; Constraints in Mercury; Informing constraints with failure
- Analysis and Verification: Typing Prolog with cliques; Model Checking model checkers
- Logic Programming: Generic (foreign) Cuts; Tabling in Mercury; Correct Tabling in XSB; Declarative debugging
- Browsing and Querying: Logical Eclipse plugin; description logic queries; Queries with sets with regex

By practicality:
- Practical Results: Dominators; Probabilistic Prolog; Distribution type; Theorem Provers (no, really!); Declarative debugging; Improving the search from failed branches; Cliques as power sets; Complexity measures
- "Theoretical" [2] (including compiler implementations and optimizations [3]): Tabling and tabling [4]; Constraints [5]; Model checking; Reimplementing cut; Description logics [6]
Retrospective
As you can see below, my criticisms are very harsh. This raises two questions: harsh against whom? and would I go again to a PADL? The answer to the latter is resoundingly in the affirmative (as William Byrd would pipe up: 't'): the density of useful concepts I took away from this conference was much greater than at most other conferences I've attended, including, not surprisingly, the POPL.
So, why so harsh in my reviews? Was I criticizing the speakers? In some cases, I was indeed criticizing the speakers, particularly when they sacrificed rigor for herd thinking (and doubly so when they did not acknowledge such lapses). But in the main, a criticism such as 'this topic does not interest me' was not a criticism of the speaker, nor even of the choice of topic (with the caveat that, with 17 of the 33 submitted papers accepted, the accepted papers should have been more practical and less esoteric; if the reviewers chose academic curiosities over less polished papers on the topic of applications, then I do fault the selection process), but a severe criticism of me. What I mean here is that the PADL's mission is to gather reports from the front lines, as it were. That's where companies, like mine, operate. So companies like mine, including mine, should be submitting these papers. We do not, and the PADL has suffered for it.
In short, the PADL was what it set out to be: a free exchange of ideas on the application of research and an open channel between researchers and industry. AND the PADL, as good as it was, could be even better with more practical, real applications developed in the declarative style showcased for all to see. It motivates the researchers ('my work has relevance! an audience! a fan club!'), and it improves the tools industry uses to deliver these applications ('you can do that?'). Win/win. QED.
The Challenge
So you companies that use declarative programming ... you know who you are (*cough* galois *cough* ... um, Doug, what about you? Yeah, me, too), belly up to the bar, 'cause drinks are on the house! (trans: submit papers; you are not divulging company secrets, and you're gonna get lots of neato ideas that will help you on your current projects).
Day 1: Monday, January 9, 2006
| Time | Talk | Presenter(s) |
|---|---|---|
| 9:00 am | Links: Linking Theory to Practice for the Web | Phil Wadler |
| 10:30 am | Using CHRs to generate functional test cases for the Java Card Virtual Machine | Sandrine-Dominique Gourand (presenter), Arnaud Gotlieb |
| 11:00 am | Probabilistic-logical Modeling of Music | Jon Sneyers, et al. |
| 11:30 am | Modeling Genome Evolution with a DSEL for Probabilistic Programming | Martin Erwig and Steve Kollmansberger (presenter) |
| 2:00 pm | Using Dominators for Solving Constrained Path Problems | Luis Quesada, et al. |
| 2:30 pm | Adding constraint solving to Mercury | Ralph Becket, et al.; presented by Zoltan Somogyi (co-author) |
| 3:00 pm | A Hybrid BDD and SAT Finite Domain Constraint Solver | Peter Hawkins (presenter) and Peter Stuckey |
| 4:00 pm | Efficient top-down set-sharing analysis using cliques | Jorge Navas, Francisco Bueno and Manuel Hermenegildo (presenter) |
| 4:30 pm | Automatic Verification of a Model Checker by Reflection | Bow-Yaw Wang |
Day 2: Tuesday, January 10, 2006
| Time | Talk | Presenter(s) |
|---|---|---|
| 9:10 am [8] | Generic Cut Actions for External Prolog Predicates | Tiago Soares, Ricardo Rocha and Michel Ferreira (presenter) |
| 9:40 am | Tabling in Mercury: Design and Implementation | Zoltan Somogyi (presenter) and Konstantinos Sagonas |
| 10:10 am | Incremental Evaluation of Tabled Prolog: Beyond Pure Logic Programs | Diptikalyan Saha and C. R. Ramakrishnan |
| 10:40 am | Controlling search space materialization in a practical declarative debugger | Ian MacLarty and Zoltan Somogyi (presenter) |
| 2:00 pm | LINQ: Reconciling objects, relations and XML in the .NET framework: a Personal Perspective | Erik Meijer |
| 3:30 pm | A Generic Code Browser with a Declarative Configuration Language | Kris De Volder |
| 4:00 pm | Translating Description Logic Queries to Prolog | Zsolt Nagy, Gergely Lukácsy (presenter) and Péter Szeredi |
| 4:30 pm | Querying Complex Graphs | Yanhong A. Liu (presenter) and Scott D. Stoller |
Endnotes
| 1 | As all of the talks under "Practical" "Applications" were neither practical (they were all used to further research or as dissertations) nor applications (several speakers throughout the conference, when I queried them about aspects of their presentations, stated that the works they discussed were still under development), I question the accuracy of the topic heading for these talks. A saving grace is that all of the talks on this topic did have results that were practical and immediately usable for real-world applications, unlike what some of the other presentations had to offer. |
| 2 | |
| 3 | |
| 4 | Yes, there were two presentations on tabling. Both of them I found to be very "useful" in a "theoretical" sense of the word. |
| 5 | Okay, okay, so I am looking into constraints. This does point out an issue with the symposium: it may point out things that I don't know I don't know, but it wasn't very helpful in educating me about the practical applications in these areas. |
| 6 | I think a more appropriate term for 'description logics' or 'ontologies' is 'hooey'. Bluntly, description logics are as innovative and as useful as Java is ('the most distressing thing to hit computing since MS-DOS', Alan Kay). The claims proponents grandiosely make about knowledge representation are, I must remind everyone, very like the claims the AI community made for the symbolic representation of knowledge (what, again, is description logic? Oh, the 'symbolic representation of knowledge'? Ah, yes, we have a winner here!), in the same self-satisfied tone, just before AI went dark for twenty years; AND, given the bedfellows it has (whispered: 'eggs'-'em'-'ill'), I am shocked that more gullible consumers of this hogwash haven't made the obvious connections and avoided this dead end... |
| 7 | The 'random' in "'random' testing" means 'comprehensive, with random generation of parameters', not 'a random selection [i.e., not comprehensive] of possible tests'. |
| 8 | This is a departure from the published schedule; the invited speaker for the morning was a no-show. |
(originally published January 16, 2006)