For many people interested in type systems and type theory, their first encounter with the literature presents them with this:
Since this notation is ubiquitous, authors (reasonably) expect readers to already be familiar with it and thus provide no explanation. Because the notation is ubiquitous, the beginner looking for alternate resources will not escape it. All they will find is that the notation is everywhere but exists in myriad minor variations which may or may not indicate significant differences. At this point the options are: 1) to muddle on and hope understanding the notation isn’t too important, 2) look for introductory resources which typically take the form of $50+ 500+ page textbooks, or 3) give up.
The goal of this article is to explain the notation part-by-part in common realizations, and to cover the main idea behind the notation which is the idea of an inductively defined relation. To eliminate ambiguity and make hand-waving impossible, I’ll ground the explanations in code, in particular, in Agda. That means for each example of the informal notation, there will be how it would be realized in Agda.1 It will become clear that I’m am not (just) using Agda as a formal notation to talk about these concepts, but that Agda’s2 data type mechanism directly captures them3. The significance of this is that programmers are already familiar with many of the ideas behind the informal notation, and the notation is just obscuring this familiarity. Admittedly, Agda is itself pretty intimidating. I hope most of this article is accessible to those with familiarity with algebraic data types as they appear in Haskell, ML, Rust, or Swift with little to no need to look up details about Agda. Nevertheless, Agda at least has the benefit, when compared to the informal notation, of having a clear place to go to learn more, an unambiguous meaning, and tools that allow playing around with the ideas.
To start, if you are not already familiar with it, get familiar with the Greek alphabet. It will be far easier to (mentally) read mathematical notation of any kind if you can say “Gamma x” rather than “right angle thingy x” or “upside-down L x”.
Using the example from the introduction, the whole thing is a rule. The “
The judgment can be viewed as a proposition and the rule is an “if-then” statement read from top to bottom.
So the “
So what is the judgment saying? First, the judgment is, in this case, a four argument relation. The
arguments of this relation are \ x -> e
.4
In the context of type systems,
In most recent type system research multiple judgments are necessary to describe the type system, and so you
may see things like
Relations in set theory are boolean valued functions. Being programmers, and thus constructivists, we want
evidence, so a relation R : (A , B) -> Set
. w : R a b
. An inductively defined relation or judgment is then just a
type constructor for an (inductive) data type. That means, if R
is an inductively defined relation, then
its definition is data R : A -> B -> Set where ...
. A rule is a constructor of this data type.
A derivation is a value of this data type, and will usually be a tree-like structure. As a bit of
ambiguity in the terminology (arguably arising from a common ambiguity in mathematical notation), it’s a bit
more natural to use the term “judgment” to refer to something that can be (at the meta level) true or false.
For example, we’d say
Since a judgment is a relation, we need to describe what the arguments to the relation look like. Typically
something like BNF is used.
The BNF definitions provide the types used as parameters to the judgments. It is common to use a
Fortran-esque style where a naming convention is used to avoid the need to explicitly declare the types
of meta-variables. For example, the following says meta-variables like
n, m ::= Z | S n
BNF definitions translate readily to algebraic data types.
data Nat : Set where
: Nat
Z : Nat -> Nat S
Agda note:
Set
is what is called*
in Haskell. “Type” would be a better name. Also, these sidebars will cover details about Agda with the aim that readers unfamiliar with Agda don’t get tripped up by tangential details.
Sometimes it’s not possible to fully capture the constraints on well-formed syntax with BNF. In other words,
only a subset of syntactically valid terms are well-formed. For example, Nat Nat
is syntactically
valid but is not well-formed. We can pick out that subset with a predicate, i.e. a unary relation. This is,
of course, nothing but another judgment. As an example, if we wired the Maybe
type into our type
system, we’d likely have a judgment that looks like
In a scenario like this, we’d also have to make sure the rules of our typing judgment also required the types involved to be well-formed. Modifying the example from the introduction, we’d get:
As a very simple example, let’s say we wanted to provide explicit evidence that one natural number was less than or equal to another in Agda. Scenarios like this are common in dependently typed programming, and so we’ll start with the Agda this time and then “informalize” it.
data _isLessThanOrEqualTo_ : Nat -> Nat -> Set where
: {n : Nat} -> Z isLessThanOrEqualTo n
Z<=n : {m : Nat} -> {n : Nat} -> m isLessThanOrEqualTo n -> (S m) isLessThanOrEqualTo (S n) Sm<=Sn
Agda notes: In Agda identifiers can contain almost any character so
Z<=n
is just an identifier. Agda allows any identifier to be used infix (or more generally mixfix). The underscores mark where the arguments go. So_isLessThanOrEqualTo_
is a binary infix operator. Finally, curly brackets indicate implicit arguments which can be omitted and Agda will “guess” their values. Usually, they’ll be obvious to Agda by unification.
In the informal notation, the types of the arguments are implied by the naming. n
is a natural number because
it was used as the metavariable (non-terminal) in the BNF for naturals. We also implicitly quantify over
all free variables. In the Agda code, this quantification was explicit.
Again, I want to emphasize that these are defining isLessThanOrEqualTo
and
In Agda:
: (S (S Z)) isLessThanOrEqualTo (S (S (S Z)))
twoIsLessThanThree = Sm<=Sn (Sm<=Sn Z<=n) twoIsLessThanThree
In the informal notation:
Here’s a larger example that also illustrates that these judgments do not need to be typing judgments. Here we’re defining a big-step operational semantics for the untyped lambda calculus.
x variable
v ::= λx.e
e ::= v | e e | x
In informal presentations, binders like Lambda
constructor taking a function as its argument. In a later section, I’ll use a different
approach using deBruijn indices.
-- PHOAS approach to binding
mutual
data Expr (A : Set) : Set where
: Value A -> Expr A
Val : Expr A -> Expr A -> Expr A
App : A -> Expr A
Var
data Value (A : Set) : Set where
: (A -> Expr A) -> Value A
Lambda
-- A closed expression
: Set1
CExpr = {A : Set} -> Expr A
CExpr
-- A closed expression that is a value
: Set1
CValue = {A : Set} -> Value A CValue
Agda note:
Set1
is needed for technical reasons that are unimportant. You can just pretend it saysSet
instead. More important is that the definitions ofExpr
andValue
are a bit different than the definition for_isLessThanOrEqualTo_
. In particular, the argument(A : Set)
occurs to the left of the colon. When an argument occurs to the left of the colon we say it parameterizes the data declaration and that it is a parameter. When it occurs to the right of the colon we say it indexes the data declaration and that it is an index. The difference is that parameters must occur uniformly in the return type of the data constructors while indexes can be different in each data constructor. The arguments of an inductively defined relation like_isLessThanOrEqualTo_
will always be indexes (though there could be additional parameters.)
The
The
: {A : Set} -> Expr (Expr A) -> Expr A
subst (Var e) = e
subst (Val (Lambda b)) = Val (Lambda (λ a -> subst (b (Var a))))
subst (App e1 e2) = App (subst e1) (subst e2)
subst
data _EvaluatesTo_ : CExpr -> CValue -> Set1 where
: {v : CValue} -> (Val v) EvaluatesTo v
EvaluateTrivial : {e1 : CExpr} -> {e2 : CExpr}
EvaluateApp -> {e : {A : Set} -> A -> Expr A}
-> {v2 : CValue} -> {v : CValue}
-> e1 EvaluatesTo (Lambda e)
-> e2 EvaluatesTo v2
-> (subst (e (Val v2))) EvaluatesTo v
-> (App e1 e2) EvaluatesTo v
The EvaluateTrivial
constructor explicitly uses the Val
injection of values into
expressions. The EvaluateApp
constructor starts off with a series of implicit arguments that
introduce and quantify over the variables used in the rule. After those, each judgement above the line
in the EvaluateApp
constructor.
In this case ↓ is defining a functional relation, meaning for every expression there’s at most one value that the expression evaluates to. So another natural way to interpret ↓ is as a definition, in logic programming style, of a (partial) recursive function. In other words we can use the concept of mode from logic programming and instead of treating the arguments to ↓ as inputs, we can treat the first as an input and the second as an output.
↓ gives rise to a partial function because not every expression has a normal form. For _EvaluatesTo_
this is realized by the fact that we simply won’t be able to construct a term of type e EvaluatesTo v
for any v
if e
doesn’t have a normal form. In fact, we can use the inductive structure of
the relationship to help prove that statement. (Unfortunately, Agda doesn’t present a very good experience
for data types indexed by functions, so the proof is not nearly as smooth as one would like.)
Next we’ll turn to type systems which will present an even larger example, and will introduce some concepts that are specific to type systems (though, of course, they overlap greatly with concepts in logic due to the Curry-Howard correspondence.)
Below is an informal presentation of the polymorphic lambda calculus with explicit type abstraction and type application. An interesting fact about the polymorphic lambda calculus is that we don’t need any base types. Via Church-encoding, we can define types like natural numbers and lists.
α type variable
τ ::= τ → τ | ∀α. τ | α
x variable
c constant
v ::= λx:τ.e | Λτ.e | c
e ::= v | e e | e[τ] | x
In this case I’ll be using deBruijn indices to handle the binding structure of the terms and types. This
means instead of writing
data TType : Set where
: Nat -> TType -- α
TTVar _=>_ : TType -> TType -> TType -- τ → τ
: TType -> TType -- ∀α. τ
Forall
mutual
data TExpr : Set where
: TValue -> TExpr -- v
TVal : TExpr -> TExpr -> TExpr -- f x
TApp : TExpr -> TType -> TExpr -- e[τ]
TTyApp : Nat -> TExpr -- x
TVar
data TValue : Set where
: TType -> TExpr -> TValue -- λx:τ.e
TLambda : TExpr -> TValue -- Λτ.e
TTyLambda : Nat -> TValue -- c TConst
In formulating the typing rules we need to deal with open terms, that is terms which refer to variables
that they don’t bind. This should only happen if some enclosing terms did bind those variables, so we
need to keep track of the variables that have been bound by enclosing terms. For example, when type
checking
Γ ::= . | Γ, x:τ
Δ ::= . | Δ, α
We see that this is just a (snoc) list. In the first case,
data List (A : Set) : Set where
: List A
Nil _,_ : List A -> A -> List A
: Set
Context = List TType
Context
: Set
TypeContext = Nat TypeContext
Signatures keep track of what primitive, “user-defined” constants might exist. Often the signature is omitted since nothing particularly interesting happens with it. Indeed, that will be the case for us. Nevertheless, we see that the signature is just another association list mapping constants to types.
Σ ::= . | Σ, c:τ
: Set
Signature = List TType Signature
The main reason I included the signature, beyond just covering it for the cases when it is included, is that sometimes certain rules can be better understood as manipulations of the signature. For example, in logic, universal quantification is often described by a rule like:
What’s happening and what “freshness” is is made a bit clearer by employing a signature (which for logic is
usually just a list of constants similar to our TypeContext
):
To define the typing rules we need two judgements. The first,
The Agda is
data _<_ : Nat -> Nat -> Set where
: {n : Nat} -> Z < S n
Z<Sn : {n m : Nat} -> n < S m -> S n < S (S m)
Sn<SSm
data _isValidIn_ : TType -> TypeContext -> Set where
: {n : Nat} -> {ctx : TypeContext} -> n < ctx -> (TTVar n) isValidIn ctx
TyVarJ : {t1 t2 : TType} -> {ctx : TypeContext} -> t1 isValidIn ctx -> t2 isValidIn ctx -> (t1 => t2) isValidIn ctx
TyArrJ : {t : TType} -> {ctx : TypeContext} -> t isValidIn (S ctx) -> (Forall t) isValidIn ctx TyForallJ
The meat is the following typing judgement, depending on the judgement defining well-formed types. I’m not really going to explain these rules because, in some sense, there is nothing to explain. Beyond explaining the notation itself, which was the point of the article, the below is “self-explanatory” in the sense that it is a definition, and whether it is a good definition or “meaningful” depends on whether we can prove the theorems we want about it.
Here’s the corresponding Agda code. Note, all Agda is doing for us here is making sure we haven’t written self-contradictory nonsense. In no way is Agda ensuring that this is the “right” definition. For example, it could be the case (but isn’t) that there are no values of this type. Agda would be perfectly content to let us define a type that had no values.
: TType -> TType -> TType
tySubst = tySubst' t1 t2 Z
tySubst t1 t2 where tySubst' : TType -> TType -> Nat -> TType
(TTVar Z) t2 Z = t2
tySubst' (TTVar Z) t2 (S _) = TTVar Z
tySubst' (TTVar (S n)) t2 Z = TTVar (S n)
tySubst' (TTVar (S n)) t2 (S d) = tySubst' (TTVar n) t2 d
tySubst' (t1 => t2) t3 d = tySubst' t1 t3 d => tySubst' t2 t3 d
tySubst' (Forall t1) t2 d = tySubst' t1 t2 (S d)
tySubst'
data _isIn_at_ {A : Set} : A -> List A -> Nat -> Set where
: {a : A} -> {l : List A} -> a isIn (l , a) at Z
Found : {a : A} -> {b : A} -> {l : List A} -> {n : Nat} -> a isIn l at n -> a isIn (l , b) at (S n)
Next
data _hasType_inContext_and_given_ : TExpr -> TType -> Context -> TypeContext -> Signature -> Set where
: {t : TType} -> {c : Nat}
ConstJ -> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> t isIn Sigma at c
-> t isValidIn Delta
-> (TVal (TConst c)) hasType t inContext Gamma and Delta given Sigma
: {t : TType} -> {x : Nat}
VarJ -> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> t isIn Gamma at x
-> t isValidIn Delta
-> (TVar x) hasType t inContext Gamma and Delta given Sigma
: {t1 : TType} -> {t2 : TType} -> {e1 : TExpr} -> {e2 : TExpr}
AppJ -> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> e1 hasType (t1 => t2) inContext Gamma and Delta given Sigma
-> e2 hasType t1 inContext Gamma and Delta given Sigma
-> (TApp e1 e2) hasType t2 inContext Gamma and Delta given Sigma
: {t1 : TType} -> {t2 : TType} -> {e : TExpr}
TyAppJ -> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> e hasType (Forall t1) inContext Gamma and Delta given Sigma
-> t2 isValidIn Delta
-> (TTyApp e t2) hasType (tySubst t1 t2) inContext Gamma and Delta given Sigma
: {t1 : TType} -> {t2 : TType} -> {e : TExpr}
AbsJ -> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> e hasType t2 inContext (Gamma , t1) and Delta given Sigma
-> (TVal (TLambda t1 e)) hasType (t1 => t2) inContext Gamma and Delta given Sigma
: {t : TType} -> {e : TExpr}
TyAbsJ -> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> e hasType t inContext Gamma and (S Delta) given Sigma
-> (TVal (TTyLambda e)) hasType (Forall t) inContext Gamma and Delta given Sigma
Here’s a typing derivation for the polymorphic constant function:
: TExpr -> TExpr
tyLam = TVal (TTyLambda e)
tyLam e
: TType -> TExpr -> TExpr
lam = TVal (TLambda t e)
lam t e
polyConst: tyLam (tyLam (lam (TTVar Z) (lam (TTVar (S Z)) (TVar (S Z))))) -- Λs.Λt.λx:t.λy:s.x
(Forall (Forall (TTVar Z => (TTVar (S Z) => TTVar Z)))) -- ∀s.∀t.t→s→t
hasType
inContext Nil and Z
given Nil= TyAbsJ (TyAbsJ (AbsJ (AbsJ (VarJ (Next Found) (TyVarJ Z<Sn))))) -- written by Agda
polyConst
data False : Set where
: Set -> Set
Not = A -> False
Not A
wrongType: Not (tyLam (lam (TTVar Z) (TVar Z)) -- Λt.λx:t.x
(Forall (TTVar Z)) -- ∀t.t
hasType
inContext Nil and Z)
given Nil(TyAbsJ ()) wrongType
Having written all this, we have not defined a type checking algorithm (though Agda’s auto
tactic does a pretty
good job); we’ve merely specified what evidence that a program is well-typed is. Explicitly, a type checking algorithm
would be a function with the following type:
data Maybe (A : Set) : Set where
: Maybe A
Nothing : A -> Maybe A
Just
: (e : TExpr) -> (t : TType) -> (sig : Signature) -> Maybe (e hasType t inContext Nil and Z given sig)
typeCheck = ? typeCheck
In fact, we’d want to additionally prove that this function never returns Nothing
if there does exist
a typing derivation that would give e
the type t
in signature sig
. We could formalize this in Agda by
instead giving typeCheck
the following type:
data Decidable (A : Set) : Set where
: A -> Decidable A
IsTrue : Not A -> Decidable A
IsFalse
: (e : TExpr) -> (t : TType) -> (sig : Signature) -> Decidable (e hasType t inContext Nil and Z given sig)
typeCheckDec = ? typeCheckDec
This type says that either typeCheckDec
will return a typing derivation, or it will return a proof that there is
no typing derivation. As the name Decidable
suggests, this may not always be possible. Which is to say, type
checking may not always be decidable. Note, we can always check that a typing derivation is valid — we just
need to verify that we applied the rules correctly — what we can’t necessarily do is find such a derivation
given only the expression and the type or prove that no such derivation exists. Similar concerns apply to type inference
which could have one of the following types:
record Σ (T : Set) (F : T -> Set) : Set where
field
: T
fst : F fst
snd
: (e : TExpr) -> (sig : Signature) -> Maybe (Σ TType (λ t → e hasType t inContext Nil and Z given sig))
inferType = ?
inferType
: (e : TExpr) -> (sig : Signature) -> Decidable (Σ TType (λ t → e hasType t inContext Nil and Z given sig))
inferTypeDec = ? inferTypeDec
where Σ indicates a dependent sum, i.e. a pair where the second component (here of type e hasType t inContext Nil and Z given sig
)
depends on the first component (here of type TType
). With type inference we have the additional concern that there may
be multiple possible types an expression could have, and we may want to ensure it returns the “most general” type in some sense.
There may not always be a good sense of “most general” type and user-input is required to pick out of the possible types.
Sometimes the rules themselves can be viewed as the defining rules of a logic program and thus directly provide an algorithm. For example, if we eliminate the rules, types, and terms related to polymorphism, we’d get the simply typed lambda calculus. A Prolog program to do type checking can be written in a few lines with a one-to-one correspondence to the type checking rules (and, for simplicitly, also omitting the signature):
, [T|_], T).
lookup(zN), [_|Ctx],T) :- lookup(N, Ctx, T).
lookup(s(
var(N), Ctx, T) :- lookup(N, Ctx, T).
typeCheck(F,X), Ctx, T2) :- typeCheck(F, Ctx, tarr(T1, T2)), typeCheck(X, Ctx, T1).
typeCheck(app(B), Ctx, tarr(T1, T2)) :- typeCheck(B, [T1|Ctx], T2). typeCheck(lam(
This Agda file contains all the code from this article.↩︎
Most dependently typed languages, such as Coq or Epigram would also be adequate.↩︎
See this StackExchange answer for more discussion of this.↩︎