In August 2008, I posted my answer to the question "Why should we be interested in categories" on the A Categorical Manifesto thread in the n-Category Café blog. The posting was long, and I had to feed it in in sections, which became interrupted by readers' replies. So to make it easier to read, I've put it up here, as one document. The original, with replies, is at the link just mentioned.

- Category theory and the Interesting Truths
- Unifying unification with holographic reduced representations
- Unifying neural nets with logic
- Unifying dynamical systems with logic
- Understanding analogical reasoning — the core of cognition and a chemistry of concepts?
- "A necessary tool in the construction of an adequately explicit science of knowing"
- By the way — spreadsheets!

Here’s my answer. In general, of course, categories will be useful
because
they’ll help us understand the “Interesting
Truths” of computer science: the “subtle unifying
insights between broad classes of mathematical structures”,
to quote the paragraph from
Greg Egan’s
novel *Incandescence* that
I posted:

‘Interesting Truths’ referred to a kind of theorem which captured subtle unifying insights between broad classes of mathematical structures. In between strict isomorphism — where the same structure recurred exactly in different guises — and the loosest of poetic analogies, Interesting Truths gathered together a panoply of apparently disparate systems by showing them all to be reflections of each other, albeit in a suitably warped mirror. Knowing, for example, that multiplying two positive integers was really the same as adding their logarithms revealed an exact correspondence between two algebraic systems that was useful, but not very deep. Seeing how a more sophisticated version of the same could be established for a vast array of more complex systems — from rotations in space to the symmetries of subatomic particles — unified great tracts of physics and mathematics, without collapsing them all into mere copies of a single example.

But more specifically, I’m interested in them because I think they have huge — and under-used — potential in understanding human cognition and in building “intelligent” computers. I’m going to give several examples, which will make quite a long post. I hope that’s OK.

My first example concerns understanding how “logical unification” and “holographic reduced representations” are related. Logical unification is a kind of pattern matching. (It’s often just called “unification”, but I’ll preface with “logical” to avoid confusion with “unify” as in “insight”. Logicians reading this, please forgive such a cursory description.)

For instance, in the programming language Prolog, unifying the term 2*x+3*y with the term U+V, will bind the variable U to 2*x and the variable V to 3*y. Prolog programmers love this, because it’s such an easy way to get at pieces of structures.

But — computer science also knows a different kind of pattern matching, implemented by so-called “holographic reduced representations” or HRRs. These encode structures as bitwise or numeric vectors in very high-dimensional spaces, using circular convolution and circular correlation (or analogous operations) to build and decompose vectors. I’ll show an example in the next paragraph; for those who want to know more, these references give some background:

- Pentti Kanerva,
*Dual Role of Analogy in the Design of a Cognitive Computer*. In*Advances in Analogy Research: Integration of Theory and Data from the Cognitive, Computational, and Neural Sciences*. Workshop. Sofia, Bulgaria, July 17–20, 1998. - Jane Neumann,
*Learning the Systematic Transformation of Holographic Reduced Representations*.*Cognitive Systems Research*,**3**, (2002), 227-235. - Chris
Eliasmith,
*Structure Without Symbols: Providing A Distributed Account of Low-level and High-level Cognition*. Southern Society for Philosophy and Psychology Conference March, 1997.

Here’s an example from Kanerva’s paper. The following equation computes a bitwise vector encoding data about France:

France = [Capital*Paris + Location*WesternEurope + Currency*Euro]

Here, + is vector addition, * is exclusive-or. The brackets [] denote a “normalised mean”, whose exact definition is given in Kanerva’s paper. Capital, Location and Currency are three randomly-generated vectors. The vector Capital represents a slot or field whose value will represent the capital of France; similarly for the vectors Location and Currency.

Paris, WesternEurope and Euro are three more randomly-generated vectors, representing the values of these slots. It’s important to note that all six vectors should be interpreted as atomic symbols: the actual values of their components have no significance.

Having computed the vector France, we can “probe” it by exclusive-oring with the vector Capital:

WhatCity = Capital *France

= Capital * [Capital*Paris + Location*WesternEurope + Currency*Euro]

= Capital*Capital*Paris + Capital*Location*WesternEurope + Capital*Currency*Euro

= Paris + (a bit of other stuff).

This works because * (that is, exclusive-or) is its own inverse, and because it distributes over the normalised mean [].

The final stage is to do what Kanerva calls a “clean-up”. You assume there is an associative memory that contains the six vectors originally generated. Fed any vector, it will quickly return whichever of the six closest to it. Which in this case,we hope will be Paris.

For this to work, various conditions on the distribution of vectors must be satisfied. But if they are, you can do some really nifty analogical reasoning. For example, you can ask how France is related to Paris by exclusive-oring the vector France with the vector Paris. This works the same way as above, and after the clean-up, will return Capital. Kanerva goes on to describe more ambitious reasoning tasks, such as forming a vector that represents Sweden, then asking “What is the Paris of Sweden”?

On the surface, holographic reduced representations are utterly different from logical unification. But I can’t help feeling that, at a deeper level, they are closely related. And there is a categorical formulation of logical unification, which regards it as finding coequalisers in a category whose objects are sets of terms and whose arrows are substitutions. This is described in the first reference below, by Rydeheard and Burstall. They say their formulation is derived from an observation by Goguen. So it may be based (I’m not an expert) on the ideas in the second reference:

- David Rydeheard and Rod Burstall,
*Computational Category Theory*. Prentice Hall, 1988. See Chapter 8. - Joseph Goguen,
*What is unification? A categorical view of substitution, equation and solution*. In*Resolution of Equations in Algebraic Structures*,**1**:*Algebraic Techniques*, Academic Press, 1989.

So, can we extend that categorical formulation to holographic reduced representations? I don’t know. But if we could, we would better understand how they are related to logic programming, and we might gain new tools for analogical reasoning. It’s worth trying.

My theme is that category theory has heaps to offer Artificial Intelligence and cognitive science. We ought to be doing all we can to promote it, and to make it easy to learn. And we should be showing people the kinds of task it could be useful for. Which includes unifying deeply disparate areas of computing. Which AI and cognitive science are full of!

So here’s my second example: neural networks, and logic-based representations. I suppose most readers have come across neural networks, but for those who haven’t, I’ll describe them briefly. Neural networks consist of “nodes” connected in a network. The connections between nodes correspond to synapses between biological neurons. Typically, a node sums the signals from its input nodes, multiplying each input signal by the strength of the corresponding connection. It may then process the sum in some way — e.g. by thresholding — to calculate an output and an “activation”. The entire network can be regarded as a vector of these activation values, representing the data in the network.

Most kinds of neural network can learn by adjusting the
strengths of their connections, thus mimicking —
in a *very* simplified way — how the brain learns by
adjusting synaptic strengths.

In Artificial Intelligence, there is a vast cultural chasm between neural nets and symbolic ways to represent knowledge. It’s rare for the same researcher to work on both, and students usually get taught about them in different courses. This reflects the differences between the two. The symbols in symbolic representations denote things. But in some kinds of neural net, the individual nodes don’t, although a whole set of nodes might. Therefore, such nets are often called “subsymbolic”, a word introduced by Paul Smolensky in the following paper:

- Paul Smolensky,
*On the Proper Treatment of Connectionism*. In*Readings in Philosophy and Cognitive Science*, MIT Press, 1993.

(Holographic reduced representation vectors are also subsymbolic.)

As well as the “symbolic/subsymbolic divide”,
there is a “compositional/non-compositional divide”. Symbolic
ways to
represent knowledge
are “compositional”, in that
the meaning of a sequence of symbols is
determined by the meanings of its parts.
Compositionality is good: if you can
represent “coffee cup”
and “cake”, it’s nice to be able to represent
“coffee cup and cake” too. But if
you have (say) an image-processing
net which has learnt to represent
images of coffee cups by one pattern of
activations, and images of cakes by another,
you can’t assume that
superimposing or adding the patterns will
represent a coffee cup and a piece of cake. So
there has been a lot of argument about questions like:
is the brain compositional; is composition by
superimposing activation patterns the only kind of
compositionality; and (as engineers) how can we design
nets that *are* compositional?

Yet another difference is that symbolic ways to represent knowledge are often said to be “contextual”: they are leaky, in that what happens at one node can be very sensitive to data elsewhere in the network. Which is different from my Prolog program that will happily unify 2*x+3*y with U+V, giving the same results no matter what else is in my laptop’s memory. Again, there’s been a lot of argument about this. Unfortunately, a lot of the key papers on this and the other topics, though on-line, are locked up in pay-to-view sites. The reference below seems a good, and free, starting point:

- James Garson,
*Connectionism*, entry in the*Stanford Encyclopedia of Philosophy*. First published Sun May 18, 1997; substantive revision Wed Mar 7, 2007.

I mention these differences not only to show how far apart neural nets are from symbolic systems, but also because category theory must have a lot to offer in answering such questions. Which is why the two references below are exciting:

- Ronald Brown and Timothy Porter ,
*Category Theory and Higher Dimensional Algebra: potential descriptive tools in neuroscience*. - Michael Healy,
*Category Theory Applied to Neural Modeling and Graphical Representations*. In*Proceedings of the International Joint Conference on Neural Networks*, (IJCNN 2000), IEEE.

In the first, Brown and Porter suggest using colimits to understand brain activity in terms of the component structures and neurons.

Healy goes further. He sets up a category Conc whose objects are theories representing concepts. Arrows represent “is a subconcept of”: there is an arrow from c1 to c2 if c1 is a subconcept of c2. He sets up another category Neur, whose objects are nodes in a particular neural network. Arrows represent the “can activate” relationship: there is an arrow from n1 to n2 if n1 can activate n2. Such an arrow tells us something about the neural network, namely that there is a path between n1 and n2, and that all the synaptic connections along this path are strong enough to propagate a signal from n1 to n2.

This wouldn’t be interesting unless the categories Conc and Neur were related, and indeed, Healy lets us define functors from Conc to Neur. In effect, each functor maps a network of concepts, represented as theories, to regions in a neural network where these concepts are represented.

What’s nice is that we can combine these neural-network representations by using natural transformations. I hope I have this right, because Healy doesn’t give much detail, but the point seems to be that a single neural network might learn different instances of the same concept.

For example, imagine an ambidextrous robot with mirror-identical arms and hands. Each hand has touch sensors: the left hand’s sensors are wired to one set of input nodes in the network, and the right hand’s to another set of input nodes. Assume the robot is taught to grasp an orange with its left hand, and then, independently, with its right hand. The region of its neural network fed from the left hand sensors learns a set of associations between finger pressure and movement; the region fed from the right hand learns, independently, a very similar set. This is an example I made up, by the way, so blame me if it turns out to be misleading.

The left hand and right hand associations can then be regarded as two functors, L and R, from Conc to Neur. I’m not sure what the logical content of the objects in Conc would be — theories describing what an orange is for, or what its shape is, or how to grasp it — but at any rate, they’d say something about oranges, and they’d be represented in two different regions in the neural network. We can merge these, Healy says, by setting aside a third region in the network which combines corresponding outputs from each of the first two.

But is this in fact a valid representation of the learnt concepts? Yes, if we let natural transformations guide us. As well as functors L and R, let there be a functor M which maps Conc to this third region. Set up mappings from L to M and R to M. Then, Healy shows, the third region is a valid representation if these mappings are natural transformations.

Can such ideas be applied to other kinds of neural network? Can we show other kinds of neural network to be reflections, in suitably warped Eganian mirrors, of logical theories? Can we characterise neural networks by properties of the categories containing them? (I suspect many of these categories won’t have products, because of interference between subnetworks, but that’s only a passing thought.)

And — very importantly, because we engineers have such small
minds, so need to break problems into tiny pieces before we can understand
them — can we use colimits and other tools to help us build
*modular* neural nets?

My third example is similar in spirit to the last. In the early days of programming, one of Artificial Intelligence’s motivating ideas was the Physical Symbol System Hypothesis put forward by Allen Newell and Herbert A. Simon in the following paper:

- Allen Newell and Herbert A. Simon,
*Computer Science as Empirical Inquiry: Symbols and Search*. In*Communications of the ACM*,**19**, (March 1976).

More recently, various researchers have proposed using dynamical systems ideas to understand cognition. This is the Dynamical Systems Hypothesis, and there’s some background on it in these references:

- Nils J. Nilsson,
*The Physical Symbol System Hypothesis: Status and Prospects*. - Robert F. Port,
*Dynamical Systems Hypothesis in Cognitive Science*. Draft entry (December 5, 2000) for*Encyclopedia of Cognitive Science*, MacMillan Reference.

I was reminded of this by a nice paper I found, written by David Tall about the learning of mathematical concepts. He wrote it before the Dynamical Systems Hypothesis became popular, but it uses dynamical systems ideas to explain how learning one concept might be blocked by a conflicting earlier version of the same concept:

- David Tall,
*Cognitive Conflict and the Learning of Mathematics*. Presented at First Conference of The International Group for the Psychology of Mathematics Education, Utrecht, Netherlands, Summer 1977.

So, in a spirit similar to that of
my neural network section, I ask: can we use
category theory to help us map between logical
descriptions of knowledge and their implementation as
dynamical systems. Can we use it to help
us build *modular* dynamical systems implementations?

My fourth example ratchets up the stakes, and turns — I’ll explain why in a minute — to Margaret Thatcher. In the 1980s, she was Prime Minister of Britain, and Denis Thatcher was her husband. Ronald Reagan was President of America. So, if you were to ask an American “Who is First Lady of Britain?”, what could they reply? A reasonable answer is “Denis Thatcher”. But to make it, you must be willing to “slip” the concept of First Lady, weakening it to First Spouse as you transfer from an American frame of reference to a British.

Now consider the following two number patterns:

A: 1 2 3 4 5 5 4 3 2 1

B: 1 2 3 4 4 3 2 1

What is to B as 4 is to A? Most people would answer 3. It’s to the left of the central pair of numbers in B, just as 4 is to the left of the central pair of numbers in A. 4 in A and 3 in B are like Nancy Reagan and Denis Thatcher: they’re not the “most distinguished point” in their landscape, but they are closely related to that point. They fill the same rôle.

Here is another number problem:

A: 1 2 3 4 5 5 4 3 2 1

D: 1 1 2 2 3 3 4 4 5 4 4 3 3 2 2 1 1

What is to D as 4 is to A? A brute-force attempt to find D’s central pair won’t work. But it seems reasonable to “slip” the concepts of “pair” and “singleton” as we transfer from A’s frame of reference to D’s. Which gives us the answer 4 4.

These problems come from a column
that Douglas
Hofstadter wrote for *Scientific American*. He collected them,
with later
additions, into a book, and I’m particularly
interested here in two of its chapters:

- Douglas Hofstadter,
*Metamagical Themas*, Basic Books, 1985. See*Variations on a Theme as the Crux of Creativity*(Chapter 12) and*Analogies and Roles in Human and Machine Thinking*(Chapter 24).

Hofstadter argues that analogy is at the core of human thought and creativity. To quote from Chapter 24:

Don’t press an analogy too far, because it will always break down. In that case, what good are analogies? Why bother with them? What is the purpose of trying to establish a mapping between two things thatdo notmap onto each other in reality? The answer is surely very complex, but the heart of it must be that it is good for our survival (or our genes’ survival) , because we do it all the time. Analogy and reminding, whether they are accurate or not, guide all our thought patterns. Being attuned to vague resemblances is the hallmark of intelligence, for better or for worse.

Therefore, he says, human analogical thought is what we need to study. And we should do so by within “microdomains” such as the number problems, because then — in proper scientific spirit — we can isolate analogical thought from other phenomena.

He was particularly interested in why we slip some concepts more than others, and in what one might call rôles versus literals. As an example of the first: I’m in the pub, and I accidentally spill my pint of beer over my shoes. Why is it natural for me to “slip” my situation to “Good job I don’t have to go anywhere smart tonight!”, but not to “Shame gravity isn’t weaker, so the beer would have stayed in the air!”?

As an example of the second, why, when solving the two number problems, do we perceive 4 as filling a rôle at all in pattern A, then try to find a value for the equivalent rôle in patterns B and D? Why don’t we just say, “Well, the obvious answer is the same value, 4”?

There are some lovely examples in those two chapters, which I do recommend to anyone interested. But what’s the connection with categories?

Hofstadter and his colleagues wrote several programs to implement their theories of analogical thought in microdomains. These programs include include Copycat:

- Page on “Letter Analogies”, Center for Research on Concepts and Cognition (also known as the Fluid Analogies Research Group), Indiana University, 2002.
- Douglas Hofstadter and Melanie Mitchell,
*The Copycat Project: A Model of Mental Fluidity and Analogy-Making*.

And they include Tabletop:

- Robert French,
*The Subtlety of Sameness: a Theory and Computer Model of Analogy-Making*, MIT Press, 1995.

Doesn’t that title sound like something to interest every red-blooded n-categorist? Anyway, Tabletop solves problems such as “I am sitting at a table and I touch my nose. What should you touch?”. Or “I touch the flower vase in the centre. What should you touch?”. Or, one more: “I touch the cup near me. You have a glass but no cup. What should you touch?”

Now, these programs are *complicated*.
They contain swarms of biologically-inspired enzyme-like entities
swimming around in a kind of
symbol-filled cytoplasmic glop, glomming onto bits of concepts, and slowly
crystallising out into a finished answer to the problem. They
lack — as far as I know — any easily understandable
semantics.

So, I’m going to suggest, can we use categories to give them such a semantics? Or, more generally, to formalise our ideas about the best solutions to the kind of analogy problems that Hofstadter and his colleagues discuss? A good starting point here would be Goguen’s ideas on conceptual blending, already mentioned earlier in this thread. For example:

- Joseph Goguen and D. Fox Harrell,
*Style as a Choice of Blending Principles*. - Joseph Goguen,
*Formal Notation for Conceptual Blending*. 2003.

Let me finish this section with two more Hofstadter quotes. As I said at the beginning, the stakes are high:

[…] I have been emphasising the idea of the internal structure ofoneconcept. In my view, the way that concepts can bond together and form conceptual molecules on all levels of complexity is a consequence of their internal structure. What results from a bond may surprise us, but it will nonetheless always have been completely determined by the concepts involved in the fusion, if only we could understand how they are structured. Thus the crux of the matter is the internal structure of a single concept, and how it “reaches out” towards things it is not. The crux isnotsome magical, mysterious process that occurs when two indivisble concepts collide; it is a consequence of the divisibility of concepts into subconceptual elements. As must be clear from this, I am not one to believe that wholes elude descriptions in terms of their parts. I believe that if we come to understand the “physics of concepts”, then perhaps we can derive from it a “chemistry of creativity”, just as we can derive the principles of the chemistry of atoms and molecules from those of the physics of quanta and particles. But as I said earlier, it is not just around the corner. Mental bonds will probably turn out to be no less subtle than chemical bonds.

You can think of concepts as start, and knob-twiddling as carrying you from one point on an orbit to another point. If you twiddle enough, you may well find yourself deep within the attractive zone of an unexpected but interesting concept and be captured by it. You may thus migrate from concept to concept. In short, knob-twiddling is a device that carries you from one concept to another, taking advantage of their overlapping orbits.

Of course, all this cannot happen with a trivial model of concepts. We see it happening all the time in minds, but to make it happen in computers or to locate it physically in brains will require a fleshing-out of what concepts really are. It’s fine to talk of “orbits around concepts” as a metaphor, but developing it into a full scientific notion that either can be realised in a computer model or can be located inside a brain is a huge task. This is the task that faces cognitive scientists if they wish to make “concept” a legitimate scientific term. This goal, suggested at the start of this article, could be taken to be the central goal of cognitive science, although such things are often forgotten in the inane hoopla that is surrounding artificial intelligence more and more these days

“These days” were in the 1980s, so perhaps we can disregard the remark about the inane hoopla. Though there seems to be a lot of inane hoopla surrounding computing in general. But what about the rest of the quotes. Does it make sense to think about a “chemistry of concepts”; to make “concept” a legitimate scientific notion?

Actually, Michael Healy, who I mentioned earlier for his work on mapping concepts to neural nets, has with coauthors Thomas Caudell and Timothy Goldsmith, released another report:

- Michael Healy, Thomas Caudell, and
Timothy Goldsmith,
*A Model of Human Categorization and Similarity Based Upon Category Theory*. University of New Mexico Technical Report EECE-TR-08-0010, 1st July 2008.

This is their abstract:

Categorization and the judgement of similarity are fundamental in cognition. We propose that these and other activities are based upon an underlying structure of knowledge, or concept representation, in the brain. Further, we propose that this structure can be represented mathematically in a declarative form via category theory, the mathematical theory of structure. We test the resulting mathematical model in an experiment in which human subjects provide judgements of similarity for pairs of line drawings using a numerical scale to represent degrees of similarity. The resulting numerical similarities are compared with those derived from the category-theoretic model by comparing diagrams. The diagrams represent distributed concept structures underlying the line drawings. To compare with a more conventional analysis technique, we also compare the human judgements with those provided by a two-dimensional feature space model equipped with a distance metric for the line drawings. The results are equally favorable for both models. Because of this and the putative explanatory power of the category-theoretic model, we propose that this model is worthy of further exploration as a mathematical model for cognitive science.

Which is extremely interesting, and reminiscent of Hofstadter’s remark: “Thus the crux of the matter is the internal structure of a single concept […]”.

I’ll finish by talking about a Lawvere paper. It’s the following reference — which, I’m pleased to note, is public at Google Books — and I mention it because I found it in a book on cognition:

- William Lawvere,
*Tools for the Advancement of Objective Logic: Closed Categories and Toposes*. In John Macnamara and Gonzalo E. Reyes (eds.),*The Logical Foundations of Cognition*, Oxford University Press, 1994.

Here’s how Lawvere concludes the paper:

Despite some simplification in the above, needed for rapid description, I hope that I have made clear that there is a great deal of useful precision lying behind my illustrations, and a great deal to be developed on the same basis. Thus I believe to have demonstrated the plausibility of my thesis that category theory will be a necessary tool in the construction of an adequately explicit science of knowing.

Now, I don’t know enough to describe these “illustrations” succinctly, but they include his notion of unity-and-identity-of-opposites, which I discovered from this reference:

- Steven Awodey, Lars Birkedal,
Dana S. Scott,
*Local Realizability Toposes and a Modal Logic for Computability*(Extended Abstract). Marked as “To be published in*Electronic Notes in Theoretical Computer Science*”.

Lawvere’s illustrations also include the
notion of an adequacy comonad.
Now, I hadn’t come across adequacy comonads before, but
they sound as though they might be very useful in understanding
approximations to concepts. Moreover, the adequacy
comonad sounds like an exquisite and intricate
piece of mathematical machinery,
and I hope someone will give me a working model.
One I can pick up, prod, peer into, pull to pieces, and
generally play with. Steam *or* electric.

However, I’ve another reason for
mentioning the notion. Lawvere uses it to show
how the study of closed categories contributes *to*
the study of metric spaces (in this case,
via the use of adequacy comonads in the
geodesic remetrisation of a metric space). He
then says that, conversely, we can take
notions such as “radius” and
“engulfing” *from*
the study of metric spaces and
create “precise analogues
[of these notions] for concrete generals enriched in
any (even non-posetal) closed categories V”.

I don’t need to fear the strange phrase “concrete general”, because John Baez explains it in:

*This Week’s Finds in Mathematical Physics (Week 200)*, December 31, 2003.

A “concrete general” is the category of all the models of a theory. (Is there more to it than that?)

Now, I don’t know whether Lawvere is using the words “radius” and “engulfing” in some sense far removed from their everyday meaning. But if not, can his remark also apply to other spatial and dynamic notions? Such as “orbit” (around a concept); “attractive zone” (of a concept); “migrate” (from concept to concept). In other words, to the metaphors that Hofstadter uses, and that he says will need to be made precise before we can create a “chemistry of concepts”. If so, then category theory will be an invaluable tool for cognitive science.

Even if I’m completely wrong in
that last paragraph, Lawvere says
category theory “will be a necessary
tool in the construction of an adequately
explicit science of knowing”, which is
good enough for me. So I did an experiment. I searched
Google and CiteseerX on his main title,
*Tools for the Advancement of Objective Logic*.

Apart from citations in library catalogues, and
papers on philosophy or pure category theory, I
found only *two* authors whose papers seemed,
on a quick skim, to depend in some important
way on Lawvere’s paper. For interest,
I give the references below. (The second author,
Zippora Arzi Gonczarowski, had one paper on a pay-to-view
site that I couldn’t read, but the abstract
shows me that it’s about the same topic as the paper below):

- François Magnan,
*Distributed Components Aggregation for eLearning: Conducting Theory and Practice*. - Zippora Arzi Gonczarowski,
*Wisely Non-Rational*. 1998.

Lawvere’s paper seems filled with crystals of compressed insight, glittering through the surface of the text like quartz through a covering of dry pine-needles. I find them intensely satisfying to contemplate, in the same way I do some El Lissitzky posters and De Stijl typography. Perhaps, high in my brain’s abstraction hierarchy, the same neurons are activated by both: category theory as the ultimate abstract sculpture.

But, aesthetics aside, how can we bring such power to more researchers than the two who cited it? I’m using Lawvere’s paper as a representative of category theory generally, of course. Ask a first-year LISP hacker to iterate over a list, and they’ll reach naturally for recursion or a MAP function. Can we bring up a generation of category-theory hackers, who, when you ask them to model putting something together, will reach naturally for a colimit?

Assuming that category theory really does have something to offer AI and cognitive science.

I thought I’d add one final reason for my interest in category theory. Spreadsheets!

Excel spreadsheets are a big source of errors. One reason is that Excel doesn’t have the notion of “modules”: chunks of program that can be written, tested and debugged one at a time, then used in as many different spreadsheets as you want. If you develop a spreadsheet in Excel, and want to use part of it in another spreadsheet, you have to copy and paste cells from the first spreadsheet to the second. If you then need to change these cells — perhaps because they have bugs, or perhaps to make the formulae in them faster — you have to make the same changes to both spreadsheets. You can’t just change one component, and have the changes propagated to every use of it. So Excel programmers tend to accumulate numerous almost-identical copies of their code. Version-control becomes a nightmare.

I’ve been experimenting with ways to overcome this, via various schemes for modularising spreadsheets. And below are references to a demo of one such scheme — a prototype component library for Google’s online spreadsheet, which lets you add code to a spreadsheet much as you would bar-charts and other displays — and to a paper about the component library for Google’s spreadsheet and for Excel. Both are for spreadsheet users, so not very technical:

- Jocelyn Ireson-Paine,
*Spreadsheet Components, Google Spreadsheets, and Code Reuse*. Dr Dobbs blog entry, May 2008. This is the demo, with screen shots. - Jocelyn Ireson-Paine,
*Spreadsheet Components For All*. In*Proceedings of European Spreadsheet Risks Interest Group 2008*, Greenwich, July 10-11, 2008.

Both these were inspired by Joseph Goguen’s work on sheaf semantics, and his dogma that the behaviour of a system is the limit of the behaviours of its components, explained in:

- Joseph Goguen,
*Sheaf Semantics for Concurrent Interacting Objects*. In*Mathematical Structures in Computer Science*,**2**2(1992).

I said a bit about this in a posting here. It seems that his work also inspired research into the algebraic specification languages for software.

It illustrates the power of category theory to project down into
unsuspected regions of computing. Even if we will
never see a book on *Category Theory with Excel*.

19th November 2009