Classical First-Order Logic (Classical FOL) has an absolutely central place in traditional logic, model theory, and set theory. It is the foundation upon which ZF(C), which is itself often taken as the foundation of mathematics, is built. When classical FOL was being established there was a lot of study and debate around alternative options. There are a variety of philosophical and metatheoretic reasons supporting classical FOL as The Right Choice.
This all happened, however, well before category theory was even a twinkle in Mac Lane’s and Eilenberg’s eyes, and when type theory was taking its first stumbling steps.
My focus in this article is on what classical FOL looks like to a modern categorical logician. This can be neatly summarized as “classical FOL is the internal logic of a Boolean First-Order Hyperdoctrine. Each of the three words in this term,”Boolean”, “First-Order”, and “Hyperdoctrine”, suggest a distinct axis in which to vary the (class of categorical models of the) logic. All of them have compelling categorical motivations to be varied.
The first and simplest is the term “Boolean”. This is what differentiates the categorical semantics of classical (first-order) logic from constructive (first-order) logic. Considering arbitrary first-order hyperdoctrines would give us a form of intuitionistic first-order logic.
It is fairly rare that the categories categorists are interested in are Boolean. For example, most toposes, all of which give rise to first-order hyperdoctrines, are not Boolean. The assumption that they are tends to correspond to a kind of “discreteness” that’s often at odds with the purpose of the topos. For example, a category of sheaves on a topological space is Boolean if and only if that space is a Stone space. These are certainly interesting spaces, but they are also totally disconnected unlike virtually every non-discrete topological space one would typically mention.
The next term is the term “first-order”. As the name suggests, a first-order hyperdoctrine has the necessary structure to interpret first-order logic. The question, then, is what kind of categories have this structure and only this structure. The answer, as far as I’m aware, is not many.
Many (classes of) categories have the structure to be first-order hyperdoctrines, but often they have additional structure as well that it seems odd to ignore. The most notable and interesting example is toposes. All elementary toposes (which includes all Grothendieck toposes) have the structure to give rise to a first-order hyperdoctrine. But, famously, they also have the structure to give rise to a higher order logic. Even more interesting, while Grothendieck toposes, being elementary toposes, technically do support the necessary structure for first-order logic, the natural morphisms of Grothendieck toposes, geometric morphisms, do not preserve that structure, unlike the logical functors between elementary toposes.
The natural internal logic for Grothendieck toposes turns out to be geometric logic. This is a logic that lacks universal quantification and implication (and thus negation) but does have infinitary disjunction. This leads to a logic that is, at least superficially, incomparable to first-order logic. Closely related logics are regular logic and coherent logic which are sub-logics of both geometric logic and first-order logic.
We see, then, just from the examples of the natural logics of toposes, none of them are first-order logic, and we get examples that are more powerful, less powerful, and incomparable to first-order logic. Other common classes of categories give other natural logics, such as the cartesian logic from left exact categories, and monoidal categories give rise to (ordered) linear logics. We get the simply typed lambda calculus from cartesian closed categories which leads to the next topic.
A (posetal) hyperdoctrine essentially takes a category and, for each object in that category, assigns to it a poset of “predicates” on that object. In many cases, this takes the form of the Sub functor assigning to each object its poset of subobjects. Various versions of hyperdoctrines will require additional structure on the source category, these posets, and/or the functor itself to interpret various logical connectives. For example, a regular hyperdoctrine requires the source category to have finite limits, the posets to be meet-semilattices, and the functor to give rise to monotonic functions with left adjoints satisfying certain properties. This notion of hyperdoctrines is suitable for regular logic.
It’s very easy to recognize that these functors are essentially indexed |(0,1)|-categories. This immediately suggests that we should consider higher categorical versions or at the very least normal indexed categories.
What this means for the logic is that we move from proof-irrelevant logic to proof-relevant logic. We now have potentially multiple ways a “predicate” could “entail” another “predicate”. We can present the simply typed lambda calculus in this indexed category manner. This naturally leads/connects to the categorical semantics of type theories.
Pushing forward to |(\infty, 1)|-categories is also fairly natural, as it’s natural to want to talk about an entailment holding for distinct but “equivalent” reasons.
Moving in all three of these directions simultaneously leads pretty naturally to something like Homotopy Type Theory (HoTT). HoTT is a naturally constructive (but not anti-classical) type theory aimed at being an internal language for |(\infty, 1)|-toposes.
Okay, so why did people pick classical FOL in the first place? It’s not like the concept of, say, a higher-order logic wasn’t considered at the time.
Classical versus Intuitionistic was debated at the time, but at that time it was primarily a philosophical argument, and the defense of Intuitionism was not very compelling (to me and obviously people at the time). The focus would probably have been more on (classical) FOL versus second- (or higher-)order logic.
Oversimplifying, the issue with second-order logic is fairly evident from the semantics. There are two main approaches: Henkin-semantics and full (or standard) semantics. Henkin-semantics keeps the nice properties of (classical) FOL but fails to get the nice properties, namely categoricity properties, of second-order logic. This isn’t surprising as Henkin-semantics can be encoded into first-order logic. It’s essentially syntactic sugar. Full semantics, however, states that the interpretation of predicate sorts is power sets of (cartesian products of) the domain1. This leads to massive completeness problems as our metalogical set theory has many, many ways of building subsets of the domain. There are metatheoretic results that state that there is no computable set of logical axioms that would give us a sound and complete theory for second-order logic with respect to full semantics. This aspect is also philosophically problematic, because we don’t want to need set theory to understand the very formulation of set theory. Thus Quine’s comment that “second-order logic [was] set theory in sheep’s clothing”.
On the more positive and (meta-)mathematical side, we have results like Lindström’s theorem which states that classical FOL is the strongest logic that simultaneously satisfies (downward) Löwenheim-Skolem and compactness. There’s also a syntactic result by Lindström which characterizes first-order logic as the only logic having a recursively enumerable set of tautologies and satisfying Löwenheim-Skolem2.
There’s one big caveat to the above. All of the above results are formulated in traditional model theory which means there are various assumptions built in to their statements. In the language of categorical logic, these assumptions can basically be summed up in the statement that the only category of semantics that traditional model theory considers is Set.
This is an utterly bizarre thing to do from the standpoint of categorical logic.
The issues with full semantics follow directly from this choice. If, as categorical logic would have us do, we considered every category with sufficient structure as a potential category of semantics, then our theory would not be forced to follow every nook and cranny of Set’s notion of subset to be complete. Valid formulas would need to be true not only in Set but in wildly different categories, e.g. every (Boolean) topos.
These traditional results are also often very specific to classical FOL. Dropping this constraint of classical logic would lead to an even broader class of models.
A Boolean category is just a coherent category where every object has a complement. Since coherent functors preserve complements, we have that the category of Boolean categories is a full subcategory of the category of coherent categories.
One nice thing about, specifically, classical first-order logic from the perspective of category theory is the following. First, coherent logic is a sub-logic of geometric logic restricted to finitary disjunction. Via Morleyization, we can encode classical first-order logic into coherent logic such that the categories of models of each are equivalent. This implies that a classical FOL formula is valid if and only if its encoding is. Morleyization allows us to analyze classical FOL using the tools of classifying toposes. On the one hand, this once again suggests the importance of coherent logic, but it also means that we can use categorical tools with classical FOL.
There are certain things that I and, I believe, most logicians take as table stakes for a (foundational) logic3. For example, checking a proof should be computably decidable. For these reasons, I am in complete accord with early (formal) logicians that classical second-order logic with full semantics is an unacceptably worse alternative to classical first-order logic.
However, when it comes to statements about the specialness of FOL, a lot of them seem to be more statements about traditional model theory than FOL itself, and also statements about the philosophical predilections of the time. I feel that philosophical attitudes among logicians and mathematicians have shifted a decent amount since the beginning of the 20th century. We have different philosophical predilections today than then, but they are informed by another hundred years of thought, and they are more relevant to what is being done today.
Martin-Löf type theory (MLTT) and its progeny also present an alternative path with their own philosophical and metalogical justifications. I mention this to point out actual cases of foundational frameworks that a (very) superficial reading of traditional model theory results would seem to have been “ruled out”. Even if one thinks the FOL+ZFC (or whatever) is the better foundations, I think it is unreasonable to assert that MLTT derivatives are unworkable as a foundations.
It’s worth mentioning that this is exactly what categorical logic would suggest: our syntactic power objects should be mapped to semantic power objects.↩︎
While nice, it’s not clear that compactness and, especially, Löwenheim-Skolem are sacrosanct properties that we’d be unwilling to do without. Lindström’s first theorem is thus a nice abstract characterization theorem for classical FOL, but it doesn’t shut the door on considering alternatives even in the context of traditional model theory.↩︎
I’m totally fine thinking about logics that lack these properties, but I would never put any of them forward as an acceptable foundational logic.↩︎