Programmers in typed languages with higher order functions and algebraic data types are already comfortable with most of the basic constructions of set/type theory. In categorical terms, those programmers are familiar with finite products and coproducts and (monoidal/cartesian) closed structure. The main omissions are subset types (equalizers/pullbacks) and quotient types (coequalizers/pushouts) which would round out limits and colimits. Not having a good grasp on either of these constructions dramatically shrinks the world of mathematics that is understandable, but while subset types are fairly straightforward, quotient types are quite a bit less intuitive.
See this StackOverflow answer by me which provides a more software engineering perspective.
In my opinion, most programmers can more or less immediately understand the notion of a subset type at an intuitive level.
A subset type is just a type combined with a predicate on that type that specifies which values of the type we want. For
example, we may have something like { n:Nat | n /= 0 }
meaning the type of naturals not equal to |0|. We may use this for
the type of the denominator for the division function. Consuming a value of a subset type is easy, a natural not equal to |0| is still
just a natural, and we can treat it as such. The difficult part is producing a value of a subset type. To do this, we
must, of course, produce a value of the underlying type — Nat
in our example — but then we must further
convince the type checker that the predicate holds (e.g. that the value does not equal |0|). Most languages provide no mechanism
to prove potentially arbitrary facts about code, and this is why they do not support subset types. Dependently typed
languages do provide such mechanisms and thus either have or can encode subset types. Outside of dependently typed languages
the typical solution is to use an abstract data type and use a runtime check when values of that abstract data type are created.
The dual of subset types are quotient types. My impression is that this construction is the most difficult basic construction
for people to understand. Further, programmers aren’t much better off, because they have little to which to connect the idea.
Before I give a definition, I want to provide the example with which most people are familiar: modular (or clock) arithmetic.
A typical way this is first presented is as a system where the numbers “wrap-around”. For example, in arithmetic mod |3|, we
count |0|, |1|, |2|, and then wrap back around to |0|. Programmers are well aware that it’s not necessary to guarantee that an input
to addition, subtraction, or multiplication mod |3| is either |0|, |1|, or |2|. Instead, the operation can be done and the mod
function
can be applied at the end. This will give the same result as applying the mod
function to each argument at the beginning.
For example, |4+7 = 11| and |11 \bmod 3 = 2|, and |4 \bmod 3 = 1| and |7 \bmod 3 = 1| and |1+1 = 2 = 11 \bmod 3|.
For mathematicians, the type of integers mod |n| is represented by the quotient type |\mathbb{Z}/n\mathbb{Z}|. The idea is that the values of |\mathbb{Z}/n\mathbb{Z}| are integers except that we agree that any two integers |a| and |b| are treated as equal if |a - b = kn| for some integer |k|. For |\mathbb{Z}/3\mathbb{Z}|, |\dots -6 = -3 = 0 = 3 = 6 = \dots| and |\dots = -5 = -2 = 1 = 4 = 7 = \dots| and |\dots = -4 = -1 = 2 = 5 = 8 = \dots|.
To start to formalize this, we need the notion of an equivalence relation. An equivalence relation is a binary relation |\sim| which is reflexive (|x \sim x| for all |x|), symmetric (if |x \sim y| then |y \sim x|), and transitive (if |x \sim y| and |y \sim z| then |x \sim z|). We can check that “|a \sim b| if and only if there exists an integer |k| such that |a-b = kn|” defines an equivalence relation on the integers for any given |n|. For reflexivity, we have |a - a = 0n|. For symmetry, we have if |a - b = kn| then |b - a = -kn|. Finally, for transitivity, we have if |a - b = k_1 n| and |b - c = k_2 n| then |a - c = (k_1 + k_2)n| which we get by adding the preceding two equations.
Any relation can be extended to an equivalence relation. This is called the reflexive-, symmetric-, transitive-closure of the relation. For an arbitrary binary relation |R| we can define the equivalence relation |({\sim_R})| via “|a \sim_R b| if and only if |a = b| or |R(a, b)| or |b \sim_R a| or |a \sim_R c| and |c \sim_R b| for some |c|”. To be precise, |{\sim_R}| is the smallest relation satisfying those constraints. In Datalog syntax, this looks like:
A, A).
eq_r(A, B) :- r(A, B).
eq_r(A, B) :- eq_r(B, A).
eq_r(A, B) :- eq_r(A, C), eq_r(C, B). eq_r(
If |T| is a type and |({\sim})| is an equivalence relation, we use |T/{\sim}| as the notation for the quotient type, which we read as “|T| quotiented by the equivalence relation |{\sim}|”. We call |T| the underlying type of the quotient type. We then say |a = b| at type |T/{\sim}| if and only if |a \sim b|. Dual to subset types, to produce a value of a quotient type is easy. Any value of the underlying type is a value of the quotient type. (In type theory, this produces the perhaps surprising result that |\mathbb Z| is a subtype of |\mathbb Z/n\mathbb Z|.) As expected, consuming a value of a quotient type is more complicated. To explain this, we need to explain what a function |f : T/{\sim} \to X| is for some type |X|. A function |f : T/{\sim} \to X| is a function |g : T \to X| which satisfies |g(a) = g(b)| for all |a| and |b| such that |a \sim b|. We call |f| (or |g|, they are often conflated) well-defined if |g| satisfies this condition. In other words, any well-defined function that consumes a quotient type isn’t allowed to produce an output that distinguishes between equivalent inputs. A better way to understand this is that quotient types allow us to change what the notion of equality is for a type. From this perspective, a function being well-defined just means that it is a function. Taking equal inputs to equal outputs is one of the defining characteristics of a function.
Sometimes we can finesse needing to check the side condition. Any function |h : T \to B| gives rise to an equivalence relation on |T| via |a \sim b| if and only if |h(a) = h(b)|. In this case, any function |g : B \to X| gives rise to a function |f : T/{\sim} \to X| via |f = g \circ h|. In particular, when |B = T| we are guaranteed to have a suitable |g| for any function |f : T/{\sim} \to X|. In this case, we can implement quotient types in a manner quite similar subset types, namely we make an abstract type and we normalize with the |h| function as we either produce or consume values of the abstract type. A common example of this is rational numbers. We can reduce a rational number to lowest terms either when it’s produced or when the numerator or denominator get accessed, so that we don’t accidentally write functions which distinguish between |1/2| and |2/4|. For modular arithmetic, the mod by |n| function is a suitable |h|.
In set theory such an |h| function can always be made by mapping the elements of |T| to the equivalence classes that contain them, i.e. |a| gets mapped to |\{b \mid a \sim b\}| which is called the equivalence class of |a|. In fact, in set theory, |T/{\sim}| is usually defined to be the set of equivalence classes of |({\sim})|. So, for the example of |\mathbb Z/3\mathbb Z|, in set theory, it is a set of exactly three elements: the elements are |\{ 3n+k \mid n \in \mathbb Z\}| for |k = 0, 1, 2|. Equivalence classes are also called partitions and are said to partition the underlying set. Elements of these equivalence classes are called representatives of the equivalence class. Often a notation like |[a]| is used for the equivalence class of |a|.
Here is a quick run-through of some significant applications of quotient types. I’ll give the underlying type and the equivalence relation and what the quotient type produces. I’ll leave it as an exercise to verify that the equivalence relations really are equivalence relations, i.e. reflexive, symmetric, and transitive. I’ll start with more basic examples. You should work through them to be sure you understand how they work.
Integers can be presented as pairs of naturals |(n, m)| with the idea being that the pair represents “|n - m|”. Of course, |1 - 2| should be the same as |2 - 3|. This is expressed as |(n_1, m_1) \sim (n_2, m_2)| if and only if |n_1 + m_2 = n_2 + m_1|. Note how this definition only relies on operations on natural numbers. You can explore how to define addition, subtraction, multiplication, and other operations on this representation in a well-defined manner.
Rationals can be presented very similarly to integers, only with multiplication instead of addition. We also have pairs |(n, d)|, usually written |n/d|, in this case of an integer |n| and a non-zero natural |d|. The equivalence relation is |(n_1, d_1) \sim (n_2, d_2)| if and only if |n_1 d_2 = n_2 d_1|.
We can extend the integers mod |n| to the continuous case. Consider the real numbers with the equivalence relation |r \sim s| if and only if |r - s = k| for some integer |k|. You could call this the reals mod |1|. Topologically, this is a circle. If you walk along it far enough, you end up back at a point equivalent to where you started. Occasionally this is written as |\mathbb R/\mathbb Z|.
Doing the previous example in 2D gives a torus. Specifically, we have pairs of real numbers and the equivalence relation |(x_1, y_1) \sim (x_2, y_2)| if and only if |x_1 - x_2 = k| and |y_1 - y_2 = l| for some integers |k| and |l|. Quite a bit of topology relies on similar constructions as will be expanded upon on the section on gluing.
Here’s an example that’s a bit closer to programming. Consider the following equivalence relation on arbitrary pairs of the same type: |(a_1, b_1) \sim (a_2, b_2)| if and only if |a_1 = a_2| and |b_1 = b_2| or |a_1 = b_2| and |b_1 = a_2|. This just says that a pair is equivalent to either itself, or a swapped version of itself. It’s interesting to consider what a well-defined function is on this type.1
Returning to topology and doing a bit more involved construction, we arrive at gluing or pushouts. In topology, we often want to take two topological
spaces and glue them together in some specified way. For example, we may want to take two discs and glue their boundaries together. This gives a sphere.
We can combine two spaces into one with the disjoint sum (or coproduct, i.e. Haskell’s Either
type.) This produces a space that contains both
the input spaces, but they don’t interact in any way. You can visualize them as sitting next to each other but not touching. We now want to say that
certain pairs of points, one from each of the spaces, are really the same point. That is, we want to quotient by an equivalence relation that would
identify those points. We need some mechanism to specify which points we want to identify. One way to accomplish this is to have a pair of
functions, |f : C \to A| and |g : C \to B|, where |A| and |B| are the spaces we want to glue together. We can then define a relation |R| on the disjoint sum
via |R(a, b)| if and only if there’s a |c : C| such that |a = \mathtt{inl}(f(c))| and |b = \mathtt{inr}(g(c))|. This is not an equivalence relation, but we can extend it to one.
The quotient we get is then the gluing of |A| and |B| specified by |C| (or really by |f| and |g|). For our example of two discs, |f| and |g| are the same
function, namely the inclusion of the boundary of the disc into the disc. We can also glue a space to itself. Just drop the disjoint sum part. Indeed,
the circle and torus are examples.
We write |\mathbb R[X]| for the type of polynomials with one indeterminate |X| with real coefficients. For two indeterminates, we write |\mathbb R[X, Y]|. Values of these types are just polynomials such as |X^2 + 1| or |X^2 + Y^2|. We can consider quotienting these types by equivalence relations generated from identifications like |X^2 + 1 \sim 0| or |X^2 - Y \sim 0|, but we want more than just the reflexive-, symmetric-, transitive-closure. We want this equivalence relation to also respect the operations we have on polynomials, in particular, addition and multiplication. More precisely, we want if |a \sim b| and |c \sim d| then |ac \sim bd| and similarly for addition. An equivalence relation that respects all operations is called a congruence. The standard notation for the quotient of |\mathbb R[X, Y]| by a congruence generated by both of the previous identifications is |\mathbb R[X, Y]/(X^2 + 1, X^2 - Y)|. Now if |X^2 + 1 = 0| in |\mathbb R[X, Y]/(X^2 + 1, X^2 - Y)|, then for any polynomial |P(X, Y)|, we have |P(X, Y)(X^2 + 1) = 0| because |0| times anything is |0|. Similarly, for any polynomial |Q(X, Y)|, |Q(X, Y)(X^2 - Y) = 0|. Of course, |0 + 0 = 0|, so it must be the case that |P(X, Y)(X^2 + 1) + Q(X, Y)(X^2 - Y) = 0| for all polynomials |P| and |Q|. In fact, we can show that all elements in the equivalence class of |0| are of this form. You’ve now motivated the concrete definition of a ring ideal and given it’s significance. An ideal is an equivalence class of |0| with respect to some congruence. Let’s work out what |\mathbb R[X, Y]/(X^2 + 1, X^2 - Y)| looks like concretely. First, since |X^2 - Y = 0|, we have |Y = X^2| and so we see that values of |\mathbb R[X, Y]/(X^2 + 1, X^2 - Y)| will be polynomials in only one indeterminate because we can replace all |Y|s with |X^2|s. Since |X^2 = -1|, we can see that all those polynomials will be linear (i.e. of degree 1) because we can just keep replacing |X^2|s with |-1|s, i.e. |X^{n+2} = X^n X^2 = -X^n|. The end result is that an arbitrary polynomial in |\mathbb R[X, Y]/(X^2 + 1, X^2 - Y)| looks like |a + bX| for real numbers |a| and |b| and we have |X^2 = -1|. In other words, |\mathbb R[X, Y]/(X^2 + 1, X^2 - Y)| is isomorphic to the complex numbers, |\mathbb C|.
As a reasonably simple exercise, given a polynomial |P(X) : \mathbb R[X]|, what does it get mapped to when embedded into |\mathbb R[X]/(X - 3)|, i.e. what is |[P(X)] : \mathbb R[X]/(X - 3)|?2
Moving much closer to programming, we have a rather broad and important example that a mathematician might describe as free algebras modulo an equational
theory. This example covers several of the preceding examples. In programmer-speak, a free algebra is just a type of abstract syntax trees for some language.
We’ll call a specific abstract syntax tree a term. An equational theory is just a collection of pairs of terms with the idea being that we’d like these
terms to be considered equal. To be a bit more precise, we will actually allow terms to contain (meta)variables. An example equation for an expression
language might be Add(
|x|,
|x|) = Mul(2,
|x|)
. We call a term with no variables a ground term. We say a ground term matches another
term if there is a consistent substitution for the variables that makes the latter term syntactically equal to the ground term. For example, Add(3, 3)
matches
Add(
|x|,
|x|)
via the substitution |x \mapsto|3
. Now, the equations of our equational theory gives rise to a relation on ground terms |R(t_1, t_2)| if and only if there
exists an equation |l = r| such that |t_1| matches |l| and |t_2| matches |r|. This relation can be extended to an equivalence relation on ground terms, and
we can then quotient by that equivalence relation.
Let’s consider a worked example. We can consider the theory of monoids. We have two operations (types of AST nodes): Mul(
|x|,
|y|)
and 1
. We
have the following three equations: Mul(1,
|x|) =
|x|, Mul(
|x|, 1) =
|x|, and Mul(Mul(
|x|,
|y|),
|z|) = Mul(
|x|, Mul(
|y|,
|z|))
.
We additionally have a bunch of constants subject to no equations. In this case, it turns out we can define a normalization function, what I called |h| far above,
and that the quotient type is isomorphic to lists of constants. Now, we can extend this theory to the theory of groups by adding a new operation, Inv(
|x|)
,
and new equations: Inv(Inv(
|x|)) =
|x|, Inv(Mul(
|x|,
|y|)) = Mul(Inv(
|y|), Inv(
|x|))
, and Mul(Inv(
|x|),
|x|) = 1
. If we ignore the last
of these equations, you can show that we can normalize to a form that is isomorphic to a list of a disjoint sum of the constants, i.e. [Either Const Const]
in Haskell if Const
were the type of the constant terms. Quotienting this type by the equivalence relation extended with that final equality corresponds
to adding the rule that a Left c
cancels out Right c
in the list whenever they are adjacent.
This overall example is a fairly profound one. Almost all of abstract algebra can be viewed as an instance of this or a closely related variation. When you hear about things defined in terms of “generators and relators”, it is an example of this sort. Indeed, those “relators” are used to define a relation that will be extended to an equivalence relation. Being defined in this way is arguably what it means for something to be “algebraic”.
The Introduction to Type Theory section of the NuPRL book provides a more comprehensive and somewhat more formal presentation of these and related concepts. While the quotient type view of quotients is conceptually different from the standard set theoretic presentation, it is much more amenable to computation as the |\mathbb Z/n\mathbb Z| example begins to illustrate.