What is category theory and what is it's relation to computer science?

What is category theory useful for? I notice that many sites describing the theory say it is becoming more and more useful in the realm of computer science, especially to do with programming languages. I also notice that category theorists talk of functors and some languages, for example ML, also have functors (coincidence?).

So, what exactly is category theory, what is it useful for what relevance does it have to computer science and why?

I’ll take a stab at this, but I’m only vaguely familiar with category theory. Category theory is a very abstract attempt to unify relations and transformations of mathematical objects. It is akin to trying to formalize something more general than algebra. A category is essentially a set of objects and a set of relations between the objects that follow a few basic rules (IIRC, a composition of relations should still be a relation in the category, and stuff like that, I’m sure somebody more knowledgable will come along and correct me).

I believe, and I could be wrong, application of category theory to computer science is rather straightforward but abstract - you represent an algorithm, data structure or what have you as a category - a set of mathematical objects and their relations. At this point you have a whole slew of neat little theorems and methods to transform your computer science category, refactor, analyze, or what have you.

Does this help?

This seems to be a speciality of Mathochist - perhaps he’ll be along to explain.

Indeed, this is exactly my thing.

Category theory started from the field of homology theory, which was a major tool for understanding topology. Nowadays, it’s usually presented as generalizing notions from set theory, so that’s where I’ll jump off.

Hopefully you know the basic notions of abstract algebra, like groups and rings. A category is another such structure. There is a collection (think of a set, but there are subtleties involved) of “objects” and another of “morphisms” between the objects. Each morphisms has a “source” and “target” object. For sets, sets are objects and functions between sets are the morphisms. There’s a notion of composition of two morphisms to give a third, as long as the target of the first morphism is the source of the second. This composition is required to be associative, and for each object there’s an “identity” morphism from that object to itself. Like there are groups homomorphisms and ring homomorphisms between groups and rings, there are “functors” between categories. A functor is a map from one category to another that preserves the categorical structure in some sense.

The connection to computer science is through logic, and finds its most direct application in the semantics of programming languages, especially functional programming (like ML). In fact, there’s a special kind of category that is basically the same thing as a “typed lambda-calculus”.

Consider a formal system S. The well-formed formulas in the system are the objects, and the implications are the morphisms. For instance, “p and q” is an object, as is “p”, and there is a morphism from “p and q” to “p”, which corresponds to the fact that the statement “p and q” implies the statement p.

In particular, if the category has a property called Cartesian closure, then given two objects there is a third which corresponds in some sense to the collection of arrows between the two. This is the distinction between the (outside the system) assertion that statement p implies statement q and the (inside the system) statement “p -> q”. It’s also directly related to the connection in lambda calculi between the formula (M) and the function (\lambda x.M).

It’s difficult to answer such a general question without basically reconstructing the literature. A standard starting reference is MacLane’s “Categories for the Working Mathematician”, and for more advanced reference I like Lambek and Scott’s “Introduction to Higher-Order Categorical Logic”.

Ahh, thanks a lot, that explained it well.

One more question: is category theory itself described in terms of category theory? In other words, do categories constitute a category?

Well, actually the two questions aren’t quite the same. The second is easier.

Classically, categories cannot themselves form a category because of set-theoretic conditions. The object part of a category is properly a class, which is a bit more general than a set and was introduced to avoid paradoxes about sets containing themselves. This means that a “category of categories” would have to have as objects a collection of classes, which isn’t done. There are ways around it, though, like introducing another level of sets which can contain classes, and on that constructing a notion of a “quasicategory”. Since a quasicategory isn’t a category there isn’t a problem with it containing itself, like there might be with a category of categories.

More recently has arisen the notion of a “2-category”, which has objects, 1-morphisms between objects, and 2-morphisms between morphisms. In logic terms it would be like having a collection of statements, implications between statements, and higher-level implications between implications. Since a 2-category isn’t itself a category (if you forget 2-morphisms it is, but technically that’s not “the same thing” as the original 2-category) you can have a 2-category of categories. The objects are categories, the morphisms are functors, and the 2-morphisms are natural transformations between functors. This notion actually is turning out to have great applications in physics, as “categorifying” – moving from 1-categories to 2-categories – turns out to handle extending gauge field theories of point particles (like the Standard Model) to an analogous theory for strings.

Now, is category theory expressible in terms of categories? It turns out that this is a simple “yes”. First I’ll show how the idea works in terms of groups. This technique is ultimately due to Bill Lawvere’s thesis, and is one of the coolest parts of the categorical viewpoint, in my opinion.

Normally we say a group is a set with a binary operation defined on it which satisfies a number of rules.
[ul]
[li] (ab)c = a(bc)[/li][li] there is an element 1 such that a1 = a = 1a for all a[/li][li] for every a there is an a[sup]-1[/sup] with a*a[sup]-1[/sup] = a[sup]-1[/sup]a = 1[/li][/ul]
Now I can also say this in terms of functions. There’s a set G and functions m:GxG -> G, 1:{
} -> G, and i:G -> G, which express the multiplication, identity, and inverse. They satisfy certain equations, which are usually expressed in terms of “commuting diagrams”, which I can’t draw here. For example, given a triple of elements of G, we can first take the arrow from GxGxG to GxG multiplying the first two elements and leaving the last one alone, then the arrow from GxG to G by multiplication. We could also have started with the arrow from GxGxG to GxG multiplying the second and third elements. This gives us two arrows from GxGxG to G, which we assert must be equal. There are similar descriptions for the identity and inverse properties. The important thing is we’ve never really used the fact that these are sets and functions, so I can describe a “group object” in any category I want to be an object with three morphisms satisfying those equations.

There is a similar, but more complicated notion of a “category object” in a category, and all the axioms of category theory can be expressed themselves category-theoretically.

It is not that I disagree with anything mathochist has said, but I did want to give my own view of the subject.

At one point, “set of objects” and “set of morphisms” (I prefer to call them arrows) were mentioned. In the very first paper describing category theory, called The general theory of natural equivalences by Samuel Eilenbeg and Saunders Mac Lane, Transactions Amer Math Soc, 1945, no sets are mentioned. They defined a category in terms of objects and arrows sure enough, but made no assumption about these being sets and in the very first categories they described, they weren’t. For example, the category of sets has ALL sets as objects and we know that in standard set theory there is no set of all sets. (There are variations in which they are, but that possibility is not mentioned.) Then there are arrows. some of which compose, namely when the source object of the second is the target object of the first. Composition, insofar as it is defined is associative and each object is assumed to come with an identity arrow whose composition with any arrow (that it can compose with) doesn’t change the latter.

Is there a category of all categories? The concept does not seem to lead to any obvious contradiction. Obviously, you have to do something to prevent the formation of the category of all categories that are not objects of themselves, so some care must be taken. But all in all, I would prefer that there be a category of all category. More precisely a 2-category of all categories, as mathochist has ably described, but that is the topic of a different post.

Somewhat surprisingly, since categories can be defined independent of sets, it is possible to use categories instead of sets as a foundation of all mathematics. It is rather easy to write down, once you understand the lingo, a set of axioms for a category that allow it to function as a category of sets, without having any preconceived idea of what a set is. And since a set, in Zermelo-Frankel set theory is defined somewhat untuititively as a well-founded tree (consisting of its elements, the elements of its elements, the elements of its elements of its elements, …, every down chain terminating in a finite number of steps in the empty set), there are some of us who prefer that point of view. I definitely do.

All this is far from the considerations of computer science. Indeed one way CT is used in CS is via something like Lambek & Scott, mentioned above. But there is an unrelated path, having little to do with logic. Consider the datatype of natural numbers (actually a CS-ist would want one of integers, including negatives, but that is only a relatively minor variant). The datatype of integers can be described as follows. There is an integer called 0 and an integer called 1. There are operations of addition, negation and multiplication with 0 and 1 being the neutral elements with respect to addition and multiplication, respectively. There are some equations, associativity, commutativity, distributivity, that everyone who has followed this up to now will be able to reproduce (also negation). This is what universal algebraists call an equational theory and category theorists call a sketch (more precisely a finite product or FP sketch, for reasons I will not explain here). A model of the sketch is something (ok, a set, if you insist) with operations called addition, negation, and multiplication that satisfy the equations. Any mathematician could call a model a commutative ring. Among all models is an initial model and that is the integers. Initial here means that there is a unique homomorphism of the ring of integers into any (commutative) ring. (It happens to be initial among all rings.)

Now this works for all datatypes. If you impose the operations and relations that some datatype satisfies, then the actual type you get is the initial model of the sketch or theory. At one time, this got a few people, including me, pretty excited. Having sober second thoughts I think it is pretty thin. But then I think the applications of logic in CS are pretty thin too.

Much more interesting is that I have heard that some string theorists are finding that ideas from CT are helpful in understanding string theory. Don’t know much about that. Maybe if I did, I would find that thin too.

It is in certain branches of mathematics that CT has found its greatest use. Something called the Weil conjectures (essentially p-adic versions of the Riemann hypothesis) were proved in a way that categorical ideas were crucial. All of homological algebra benefits from infusion of category theory. It is a big topic.

Yes, often enough category theorists will wave our hands at set-theoretic considerations.

Actually, when you really work it out, this is the same thing. A typed lambda-calculus actually corresponds to a “topos”, which is the special kind of category I brushed past the term for earlier. What you’re requiring is that the topos have a natural numbers object. Specifically, what I did for groups last time you can do for the Peano axioms and assert the existence of a universal diagram S -> N -> N, where S is the terminal object in the category (corresponding to the singleton set). From that and the other structure a topos has you can make the standard construction of a rig structure on N, and then extend that by the normal process (translating every logical step into the internal logic of the topos) to the ring of integers, the field of rational numbers, and the complete field of Dedekind reals.

The existence of universals may be thin, but other than that you’re describing the same notion of “internalization” I ran through for groups before. This, it turns out is a big deal, because internalizations “commute”. For instance, the category of group objects in the category of category objects in category X is equivalent to the category of category objects in the category of group objects in category X.

Why does this matter? Because the process of categorification, which is becoming very big, starts with replacing sets by (small) categories. So we want to replace groups (group objects in Set – the category of sets) by group objects in the category of small categories, which is the category of category objects in Set. Equivalently, then, we can consider the category of category objects in the category of groups to get the notion of a 2-group, and this notion is far easier to work with. The same technique applied to the category of bundles gives the category of 2-bundles, which is essential to higher gauge theory, for instance.

See just above. Also, the overview paper Higher Gauge Theory by Baez and Schreiber. That will refer you to their more detailed paper.

My own work actually tries to pull away from the metamathematical notions of categories. I try to emphasize the viewpoint starting from the category Mat® of matrices over a given ring R. Specifically, I work with categories of tangles, which come out of knot theory. These are very tidy categories, and no more abstract than a group or ring. There’s rather little application to computer science, though, so it’s a bit away from the OP.

This is not the place to get involved in this discussion. I will only say that categorists do not wave their hands at set theoretic considerations as much as hold that categories can and maybe should be viewed as pre-set-theoretic in the same as sets are. That is a set theorist must begin somewhere before set theory is avaiable and categorists can begin in the same place. I have been involved in this kind of discussion for 40 years and more and it never seems to lead anywhere.

Okay, I should be a bit more explicit there. Category theorists in casual conversation tend to wave their hands.

Yes, there may be a category-theoretic foundation theory, and some people are working on that. On the other hand, attempting to supplant set theory is part of what pissed off regular mathematicians, and it is politically expedient to play within the set-theoretic foundation since (as long as you’re careful about notions of identity and use things like classes) it’s not really that hard. The problem, though, is that there’s immediately another layer of disclaimers and tweaks that have to be thrown in, and most in the field are sufficiently aware of them that we just don’t say it anymore. Casually we’ll refer to the “category of categories” or the “category of functors from C to D” without referring to smallness conditions or giving other set-theoretic caveats. That’s the hand-waving I meant – more about rhetorical style than logical rigor.