Morleyization is a fairly important operation in categorical logic for which it is hard to find readily
accessible references to a statement and proof. Most refer to D1.5.13 of “Sketches of an Elephant” which is
not an accessible text. 3.2.8 of “Accessible Categories” by Makkai and Paré is another reference, and
“Accessible Categories” is more accessible but still a big ask for just a single theorem.
Here I reproduce the statement and proof from “Accessible Categories” albeit with some notational and
conceptual adaptations as well as some commentary. This assumes some basic familiarity with the ideas
and notions of traditional model theory, e.g. what structures, models, and theories are.
Preliminaries
The context of the theorem is infinitary, classical (multi-sorted) first-order logic.
$L$ will stand for a language aka a signature, i.e. sorts, function symbols, and predicate symbols as usual,
except if we’re allowing infinitary quantification we may have function or predicate symbols of infinite
arity. We write $L_{\kappa\lambda}$ for the corresponding classical first-order logic where we
allow conjunctions and disjunctions indexed by sets of cardinality less than the regular (infinite) cardinal
$\kappa$ while allowing quantification over sets of variables of (infinite) cardinality less than
$\lambda$. $\lambda = 0$ is also allowed to indicate a propositional logic.
If $\kappa$ or $\lambda$ are $\infty$, that means conjunctions/disjunctions or quantifications over
arbitrary sets. $L_{\omega\omega}$ would be normal finitary, classical first-order logic. Geometric
logic would be a fragment of $L_{\infty\omega}$. The theorem will focus on $L_{\infty\infty}$,
but inspection of the proof shows that the theorem would hold for any reasonable choice of $\kappa$
and $\lambda$.
As a note, infinitary logics can easily have a proper class of formulas. Thus, it will make sense
to talk about small subclasses of formulas, i.e. ones which are sets.
Instead of considering logics with different sets of connectives, Makkai and Paré introduce the
fairly standard notion of a positive existential formula which is a formula that uses only
atomic formulas, conjunctions, disjunctions, and existential quantification. That is, no implication,
negation, or universal quantification. They then define a basic sentence as “a conjunction of
a set of sentences, i.e. closed formulas, each of which is of the form $\forall\vec x(\varphi \to \psi)$
where $\varphi$ and $\psi$ are [positive existential] formulas”.
It’s clear the component formulas of a basic sentence correspond to sequents of the form
$\varphi \vdash \psi$ for open positive existential formulas $\varphi$ and $\psi$. A basic sentence corresponds to what
is often called a theory, i.e. a set of sequents. Infinitary logic lets us smash a theory down
to a single formula, but I think the theory concept is clearer, though I’m sure there are benefits
to having a single formula. Instead of talking about basic sentences, we can talk about a theory
in the positive existential fragment of the relevant logic. This has the benefit that we don’t
need to introduce connectives or infinitary versions of connectives just for structural reasons.
I’ll call a theory that corresponds to a basic sentence a positive existential theory for
conciseness.
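Spelled out, a basic sentence

$$\bigwedge_{i \in I} \forall \vec x_i.(\varphi_i \to \psi_i)$$

corresponds to the theory $\{\varphi_i \vdash \psi_i \mid i \in I\}$; the (possibly infinite) outer conjunction and the universal quantifiers are exactly the connectives that are only there for structural reasons.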
Makkai and Paré also define a class of formulas, which I’ll write $L^*_{\kappa\lambda}$ here, “for the class of formulas
which are conjunctions of formulas in [$L_{\kappa\lambda}$] in each of which the only conjunctions occurring are
of cardinality [less than $\omega$]”. For us, the main significance of this is that geometric theories
correspond to basic sentences in $L^*_{\infty\omega}$ as this limits the conjunctions to the
finitary case. Indeed, Makkai and Paré include the somewhat awkward sentence: “Thus, a geometric
theory is the same as a basic sentence in [$L^*_{\infty\omega}$], and a coherent theory is
a conjunction of basic sentences in [$L_{\omega\omega}$].” Presumably, the ambiguous meaning of
“conjunction” leads to the differences in how these are stated, i.e. a basic sentence is already
a “conjunction” of formulas.
The standard notions of an $L$-structure and model are used, and I won’t give a precise definition
here. An $L$-structure assigns meaning (sets, functions, and relations) to all the sorts and
symbols of $L$, and a model of a formula (or theory) is an $L$-structure which satisfies the
formula (or all the formulas of the theory). We’ll write $\mathbf{Str}(L)$ for the category of $L$-structures
and homomorphisms. In categorical logic, an $L$-structure would usually
be some kind of structure preserving (fibred) functor usually into $\mathbf{Set}$, and a homomorphism
is a natural transformation. A formula would be mapped to a subobject, and a model would require
these subobjects to satisfy certain factoring properties specified by the theory. A sequent
$\varphi \vdash \psi$ in the theory would require a model to have the interpretation of $\varphi$
factor through the interpretation of $\psi$, i.e. for the former to be a subset
of the latter when interpreting into $\mathbf{Set}$.
Theorem Statement
A class of formulas $F$ is called a fragment of $L_{\kappa\lambda}$ if:
it contains all atomic formulas of $L$,
it is closed under substitution,
if a formula is in $F$ then so are all its subformulas,
if $\forall\vec x.\varphi \in F$, then so is $\neg\exists\vec x.\neg\varphi$, and
if $\varphi \to \psi \in F$, then so is $\neg\varphi \vee \psi$.
Basically, and the motivation for this will become clear shortly, formulas in $F$ are
like “compound atomic formulas” with the caveat that we must include the classically equivalent
versions of $\forall\vec x.\varphi$ and $\varphi \to \psi$ in terms of $\neg$ and $\exists$, and $\neg$ and $\vee$, respectively.
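For example, if $\forall x.(R(x) \to S(x))$ is in $F$ for some predicate symbols $R$ and $S$, then so are $\neg\exists x.\neg(R(x) \to S(x))$ and $\neg R(x) \vee S(x)$, as well as all of their subformulas such as $\neg(R(x) \to S(x))$ and $\neg R(x)$.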
Given $F$, we define an $F$-basic sentence exactly like a basic sentence
except that we allow formulas from $F$ instead of just atomic formulas as the base
case. In theory language, an $F$-basic sentence is a theory, i.e. set of sequents,
using only the connectives $\wedge$, $\vee$, and $\exists$, except within subformulas
contained in $F$ which may use any (first-order) connective. We’ll call such a
theory a positive existential $F$-theory. Much of the following will be double-barrelled
as I try to capture the proof as stated in “Accessible Categories” and my slight reformulation
using positive existential theories.
$\mathbf{Mod}_F(T)$ for a theory $T$ (or
$\mathbf{Mod}_F(\sigma)$ for a basic sentence $\sigma$) is the category
whose objects are $L$-structures that are models of $T$ (or $\sigma$), and whose arrows are the
$F$-elementary mappings. An $F$-elementary mapping,
for any subset of formulas $F$ of $L_{\kappa\lambda}$, is a mapping of $L$-structures $h : M \to N$
which preserves the meaning of all formulas in $F$.
That is, $M \models \varphi[\vec a]$ implies $N \models \varphi[h(\vec a)]$ for all
formulas $\varphi \in F$ and appropriate sequences $\vec a$. We can define
the elementary mappings for a language $L$ as the $F$-elementary mappings
where $F$ consists of (only) the atomic formulas of $L$.
$\mathbf{Mod}(T)$ (or $\mathbf{Mod}(\sigma)$) can be defined as
$\mathbf{Mod}_F(T)$ (or $\mathbf{Mod}_F(\sigma)$) for the $F$ determined this way.
Here’s the theorem as stated in “Accessible Categories”.
Theorem (Proposition 3.2.8): Given any small fragment $F$ and an $F$-basic
sentence $\sigma$, the category $\mathbf{Mod}_F(\sigma)$ of models of $\sigma$ is equivalent to $\mathbf{Mod}(\sigma')$
for some other language $L'$ and basic sentence $\sigma'$ over
$L'$, hence by 3.2.1, to the category of models of a small sketch as well.
We’ll replace the $F$-basic sentence $\sigma$ and the basic sentence $\sigma'$ with a positive existential
$F$-theory $T$ and a positive existential theory $T'$ respectively.
Implied is that $L \neq L'$, i.e. that $L$ and $L'$ may be distinct
and usually will be. As the proof will show, they agree on sorts and function symbols, but we have
different predicate symbols in $L'$.
I’ll be ignoring the final comment referencing Theorem 3.2.1. Theorem 3.2.1 is the main theorem of
the section and states that every small sketch gives rise to a language $L$ and theory $T$
(or basic sentence $\sigma$) and vice versa such that the category of models of the sketch is
equivalent to the category of models of $T$ (or $\sigma$). Thus, the final comment is an immediate corollary.
For us, the interesting part of 3.2.8 is that it takes a classical first-order theory, $T$ (or $\sigma$),
and produces a positive existential theory, as represented by $L'$ and $T'$ (or $\sigma'$), that has
an equivalent, in fact isomorphic, category of models. This positive existential theory is called
the Morleyization of the first-order theory.
In particular, if we have a finitary classical first-order theory, then we get a coherent theory
with the same models. This means to study models of classical first-order theories, it’s enough
to study models of coherent theories via the Morleyization of the classical first-order theories.
This allows many techniques for geometric and coherent theories to be applied, e.g. (pre)topos theory
and classifying toposes. As stated before, the theorem statement doesn’t actually make it clear that
the result holds for a restricted degree of “infinitariness”, but this is obvious from the proof.
Proof
I’ll quote the first few sentences of the proof to which I have nothing to add.
The idea is to replace each formula in $F$ by a new predicate. Let the
sorts of the language $L'$ be the same as those of $L$, and similarly for the [function]
symbols.
The description of the predicate symbols is complicated by their (potential) infinitary nature.
I’ll quote the proof here as well as I have nothing to add and am not as interested in this case.
The finitary quantifiers case would be similar, just slightly less technical. It would be even
simpler if we defined formulas in a given (ordered) variable context as is typical in categorical
logic.
With any formula $\varphi$ in $F$, with $\vec y_\varphi = \langle y_i \rangle_{i<\alpha}$ the repetition free sequence
of exactly the free variables of $\varphi$ in a
once and for all fixed order of variables, let us associate the new [predicate] symbol
$P_\varphi$ of arity $\langle S_i \rangle_{i<\alpha}$ such that $y_i$ is of sort $S_i$. The [predicate]
symbols of $L'$ are the $P_\varphi$ for all $\varphi \in F$.
The motivation of $F$-basic sentences / positive existential $F$-theories should
now be totally clear. The $F$-basic sentences / positive existential $F$-theories
are literally basic sentences / positive existential theories in the language $L'$ if we
replace all occurrences of subformulas in $F$ with their corresponding predicate symbols in $L'$.
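For example, with hypothetical predicate symbols $R$ and $S$ in $L$, if $S(x)$ and $\neg\exists y.R(x,y)$ are both in $F$, then the sequent $S(x) \vdash \neg\exists y.R(x,y)$ of a positive existential $F$-theory becomes the sequent $P_{S(x)}(x) \vdash P_{\neg\exists y.R(x,y)}(x)$, which is positive existential over $L'$ in the ordinary sense.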
We can extend any $L$-structure $M$ to an $L'$-structure $M^*$ such that they agree on all the sorts
and function symbols of $L$, and $M^*$ satisfies $P_\varphi[\vec a]$ if and only if
$M \models \varphi[\vec a]$. Which is to say, we define the interpretation of $P_\varphi$ to
be the subset of the interpretation of its domain specified by $M \models \varphi[\vec a]$ for all
$\vec a$ in the domain. In more categorical language, we define the subobject that
$P_\varphi$ gets sent to to be the subobject that $\varphi$ gets interpreted as in $M$.
We can define an $L$-structure, $N^\circ$, for an $L'$-structure $N$ by, again, requiring it to do the
same thing to sorts and function symbols as $N$, and defining the interpretation of each predicate
symbol $R$ of $L$ as $N^\circ \models R[\vec a]$ if and only if $N \models P_{R(\vec y)}[\vec a]$.
We immediately have $(M^*)^\circ = M$.
We can extend this to $L'$-formulas. Let $\psi$ be an $L'$-formula, then $\psi^\circ$ is defined
by a connective-preserving operation for which we only need to specify the action on predicate
symbols. We define that by declaring that $P_\varphi(\vec t\,)$ gets mapped to $\varphi[\vec t/\vec y_\varphi]$.
We extend to theories via $S^\circ = \{\psi^\circ \mid \psi \in S\}$.
A similar induction allows us to prove $M^* \models \psi[\vec a] \iff M \models \psi^\circ[\vec a]$
for all $L$-structures $M$ and appropriate $\vec a$.
We have $T = (T')^\circ$ for a positive existential theory $T'$ over $L'$
(or $\sigma = (\sigma')^\circ$ for a basic $L'$-sentence $\sigma'$),
namely the $T'$ (or $\sigma'$) obtained by replacing subformulas in $F$ with their predicate symbols as above, and thus
$M \models T \iff M^* \models T'$ for all $M$ (or $M \models \sigma \iff M^* \models \sigma'$).
We want to make it so that any $L'$-structure $N$ which is a model of $T'$ (or $\sigma'$) is
of the form $M^*$ for some $L$-structure $M$. Right now that doesn’t happen because, while
the definition of $M^*$ forces it to respect the logical connectives in the formula $\varphi$
associated to the predicate symbol $P_\varphi$, this isn’t required for an arbitrary model $N$.
For example, nothing requires $N \models P_{\neg\varphi}[\vec a]$ to hold exactly when $N \models P_\varphi[\vec a]$ fails.
The solution is straightforward. In addition to $T'$ (or $\sigma'$) representing
the theory $T$ (or $\sigma$), we add in an additional set of axioms
that capture the behavior of the (encoded) logical connectives of the formulas associated to the
predicate symbols.
These axioms are largely structural with a few exceptions that I’ll address separately. I’ll present
this as a collection of sequents for a theory, but we can replace $\vdash$ and $\dashv\vdash$ with
$\to$ and $\leftrightarrow$ for the basic sentence version. $\dashv\vdash$ stands
for two sequents going opposite directions.
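For each $\varphi \in F$ with the indicated outermost connective, we get axioms along the following lines, where I’m suppressing the bookkeeping of free variables and substitution instances and $\vec y$ stands for the appropriate tuple of variables in each case:

$$\begin{aligned}
P_{\bigwedge_{i} \varphi_i}(\vec y) &\dashv\vdash \bigwedge_{i} P_{\varphi_i}(\vec y_i) \\
P_{\bigvee_{i} \varphi_i}(\vec y) &\dashv\vdash \bigvee_{i} P_{\varphi_i}(\vec y_i) \\
P_{\exists \vec x.\varphi}(\vec y) &\dashv\vdash \exists \vec x. P_{\varphi}(\vec y\,')
\end{aligned}$$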
We then have two axiom schemas that eliminate the $\forall$ and $\to$ by leveraging the defining
property of being a fragment.
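Namely, something like:

$$P_{\forall \vec x.\varphi}(\vec y) \dashv\vdash P_{\neg\exists \vec x.\neg\varphi}(\vec y) \qquad\qquad P_{\varphi \to \psi}(\vec y) \dashv\vdash P_{\neg\varphi \vee \psi}(\vec y)$$

The closure conditions in the definition of a fragment are exactly what guarantee that the predicate symbols on the right-hand sides exist.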
We avoid needing negation by axiomatizing that $P_{\neg\varphi}$ is the complement to $P_\varphi$. This
is arguably the key idea. Once we can simulate the behavior of negation without actually needing it, then
it is clear that we can embed all the other non-positive-existential connectives.
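That is, for each $\neg\varphi \in F$ we have something like:

$$\top \vdash P_{\varphi}(\vec y) \vee P_{\neg\varphi}(\vec y) \qquad\qquad P_{\varphi}(\vec y) \wedge P_{\neg\varphi}(\vec y) \vdash \bot$$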
$E$ is the set of all these sequents. (For the basic sentence version, $\varepsilon$ is the conjunction of the universal
closures of the corresponding implications for all $\varphi \in F$.)
Another straightforward structural induction over the subformulas of $\varphi$ shows that
$N \models P_\varphi[\vec a] \iff N^\circ \models \varphi[\vec a]$ for any $L'$-structure $N$
which is a model of $E$. The only interesting case is the negation case. Here, the induction hypothesis
states that the interpretation of $\varphi$ in $N^\circ$ agrees with the interpretation of $P_\varphi$ in $N$, and the axioms
state that the interpretation of $P_{\neg\varphi}$ is the complement of the latter which thus agrees with the
complement of the former which is the interpretation of $\neg\varphi$ in $N^\circ$.
From this, it follows that $(N^\circ)^* = N$ for $N \models E$ or, equivalently, $N \models \varepsilon$.
The operations $(-)^*$ and $(-)^\circ$ thus establish a bijection between the objects of
$\mathbf{Mod}_F(T)$ (or $\mathbf{Mod}_F(\sigma)$) and
the objects of $\mathbf{Mod}(T' \cup E)$ (or $\mathbf{Mod}(\sigma' \wedge \varepsilon)$).
The morphisms of these two categories would each be subclasses of the morphisms of $\mathbf{Str}(L_0)$ where $L_0$ is
the language consisting of only the sorts and function symbols of $L$ and thus of $L'$. We can show that they
are identical subclasses which basically comes down to showing that an elementary mapping of
$\mathbf{Mod}(T' \cup E)$ (or $\mathbf{Mod}(\sigma' \wedge \varepsilon)$)
is an $F$-elementary mapping.
The idea is that such a morphism is a map $h : N \to N'$ in $\mathbf{Str}(L_0)$ which must satisfy
$N \models P_\varphi[\vec a]$ implies $N' \models P_\varphi[h(\vec a)]$ for
all $\varphi \in F$ and appropriate $\vec a$. However, since
$N \models P_\varphi[\vec a] \iff N^\circ \models \varphi[\vec a]$, and similarly for $N'$, we have
$N^\circ \models \varphi[\vec a]$ implies $(N')^\circ \models \varphi[h(\vec a)]$ for all
$\varphi \in F$, and every such $h$ thus corresponds to an $F$-elementary mapping $N^\circ \to (N')^\circ$.
Choosing $N = M^*$ and $N' = (M')^*$ allows us to show the converse for any $F$-elementary
mapping $h : M \to M'$.
Commentary
The proof doesn’t particularly care that we’re interpreting the models into $\mathbf{Set}$ and would
work just as well if we interpreted into some other category with the necessary structure. The amount
of structure required would vary with how much “infinitariness” we actually used, though it would need
to be a Boolean category. In particular, the proof works as stated (in its theory form) without
any infinitary connectives being implied for mapping finitary classical first-order logic to coherent logic.
We could simplify the statement and the proof by first eliminating $\forall$ and $\to$ and then
considering the proof over classical first-order logic with the connectives
$\wedge$, $\vee$, $\neg$, and $\exists$. This would simplify the definition of fragment and
remove some cases in the proof.
To reiterate, the key is how we handle negation.
Defunctionalization
Morleyization is related to defunctionalization¹.
For simplicity, I’ll only consider the finitary, propositional case, i.e. $L_{\omega 0}$.
In this case, we can consider each $P_\varphi$ to be a new data type. In most cases, it would be
a newtype to use Haskell terminology. The only non-trivial case is $P_{\neg\varphi}$. Now, the
computational interpretation of classical propositional logic would use control operators to handle
negation. Propositional coherent logic, however, has a straightforward (first-order) functional
interpretation. Here, a negated formula, $\neg\varphi$, is represented by a primitive type
$P_{\neg\varphi}$.
The sequent $P_\varphi \wedge P_{\neg\varphi} \vdash \bot$ is the apply
function for the defunctionalized continuation (of type $\neg\varphi$). Even more clearly, this
is interderivable with $\hat\varphi \wedge P_{\neg\varphi} \vdash \bot$ where $\hat\varphi$ is
the same as $\varphi$ except the most shallow negated subformulas are replaced with the corresponding
predicate symbols. In particular, if $\varphi$ contains no negated subformulas, then $\hat\varphi = \varphi$.
We have no way of creating new values of $P_{\neg\varphi}$ other than via whatever sequents have been given.
We can, potentially, get a value of $P_{\neg\varphi}$ by case analyzing on the corresponding instance of $\top \vdash P_\varphi \vee P_{\neg\varphi}$.
What this corresponds to is a first-order functional language with a primitive type for each negated formula.
Any semantics/implementation for this will need to decide if the primitive type is
empty or not, and then implement the sequents involving it appropriately (or allow inconsistency). A
programmer writing a program in this signature, however, cannot assume either way whether $P_{\neg\varphi}$
is empty unless they can create a program with that type.
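To make the signature concrete, here is a sketch of what it might look like in Haskell for a single atomic proposition. The specific names are mine, chosen to line up with the example below, and the undefined bodies stand in for whatever a particular implementation chooses.

```haskell
import Data.Void (Void, absurd)

-- A hypothetical atomic proposition A and the primitive types standing in for
-- its negations. They are given no constructors: the signature itself assumes
-- nothing about whether they are inhabited.
data A
data NotA
data NotNotA

-- Instances of the schema P_phi ∧ P_{¬phi} ⊢ ⊥: the "apply" functions for the
-- defunctionalized continuations.
apply_NotA :: (NotA, A) -> Void
apply_NotA = undefined -- to be supplied by an implementation

apply_NotNotA :: (NotNotA, NotA) -> Void
apply_NotNotA = undefined -- to be supplied by an implementation

-- Instances of the schema ⊤ ⊢ P_phi ∨ P_{¬phi} (excluded middle).
lem_A :: Either A NotA
lem_A = undefined -- to be supplied by an implementation

lem_NotA :: Either NotNotA NotA
lem_NotA = undefined -- to be supplied by an implementation
```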
As a very slightly non-trivial example, let’s consider implementing $A \to \neg\neg A$,
corresponding to double negating. Using Haskell-like syntax, the program looks like:
```haskell
proof :: A -> NotNotA
proof a = case lem_NotA of
  Left notNotA -> notNotA
  Right notA   -> absurd (apply_NotA (notA, a))
```
where lem_NotA :: Either NotNotA NotA, apply_NotA :: (NotA, A) -> Void, and absurd :: Void -> a
is the eliminator for $\bot$ where $\bot$ is represented by Void.
Normally in defunctionalization we’d also be adding constructors to our new types for all the
occurrences of lambdas (or maybe the control operators would be better in this case). However, since the only
thing we can do (in general) with NotA is use apply_NotA on it, no information can be extracted
from it. Either it’s inhabited and behaves like (), i.e. $\top$, or it’s not inhabited and
behaves like Void, i.e. $\bot$. We can even test for this by case analyzing on lem_A which
makes sense because in the classical logic this formula was decidable.
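As a sketch of that test, in terms of the hypothetical signature above:

```haskell
-- Decide whether NotA is (morally) empty or inhabited by case analyzing on
-- the excluded-middle witness for A.
testNotA :: Either (NotA -> Void) NotA
testNotA = case lem_A of
  Left a     -> Left (\notA -> apply_NotA (notA, a))
  Right notA -> Right notA
```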
Bonus: Grothendieck toposes as categories of models of sketches
The main point of this section of “Accessible Categories” is to show that we can equivalently
view categories of models of sketches
as categories of models of theories. In particular, models of geometric sketches, those whose
cone diagrams are finite but cocone diagrams are arbitrary, correspond to models of geometric theories.
We can view a site, $(\mathcal{C}, J)$, for a Grothendieck topos as the
data of a geometric sketch. In particular, $\mathcal{C}$ becomes the underlying category of the sketch, we
add cones to capture all finite limits, and the coverage, $J$, specifies the cocones. These cocones
have a particular form as the quotient of the kernel of a sink
as specified by the sieves in $J$. (We need to use the apexes of the cones representing pullbacks instead
of actual pullbacks.)
Lemma 3.2.2 shows the sketch-to-theory implication. The main thing I want to note about its proof is that
it illustrates how infinitely large cones would require infinitary (universal) quantification (in addition
to the unsurprising need for infinitary conjunction), but infinitely large cocones do not (though they do
require infinitary disjunction). I’ll not reproduce it here, but it comes down to writing out the normal
set-theoretic constructions of limits and colimits (in $\mathbf{Set}$), but instead of using some first-order
theory of sets, like ZFC, uses of sets would be replaced with (infinitary) logical operations. The
“infinite tuples” of an infinite limit become universal quantification over an infinitely large number of
variables. For the colimits, though, the most complex use of quantifiers is an infinite disjunction of
increasingly deeply nested quantifiers to represent the transitive closure of a relation, but no single
disjunct is infinitary. Figuring out the infinitary formulas is a good exercise.
¹ An even more direct connection to defunctionalization is the fact that geometric logic is the internal logic
of Grothendieck toposes, but Grothendieck toposes are elementary toposes and so have the structure to model
implication and universal quantification. It’s just that those connectives aren’t preserved by geometric
morphisms. For implication, the idea is that $\varphi \Rightarrow \psi$ is represented by
$\bigvee\{\chi \mid \chi \wedge \varphi \vdash \psi\}$ where the conjunction $\chi \wedge \varphi$ is finite. We can even see how
a homomorphism that preserves the geometric logic structure will fail to preserve this definition of $\Rightarrow$.
Specifically, there could be additional contexts, i.e. $\chi$’s, not in the image of the homomorphism that should be
included in the image of the disjunction for it to yield $\varphi \Rightarrow \psi$ in the target but won’t be.