Introduction to Type Theory

Seminar
[12:06] <vixey> so we are primarily looking at the mechanics of a family of proof checking systems for constructive math
[12:07] <vixey> You'll all have seen Set Theory and probably Category Theory as a foundation for mathematics
[12:07] <vixey> where we encode tuples as {{a},{a,b}}, functions as sets of tuples, etc..
[12:09] <vixey> I should give some motivation actually
[12:10] <vixey> Some theorems that have been formalized in type theory are the 4 color theorem (I'll give some reasons it's especially practical for things like this later), Godel's theorems and some computer stuff e.g. Fast Fourier Transform, a proof checker for COC with one universe
[12:11] <vixey> So I'll start with a couple of diversions, first untyped lambda calculus and then a little on combinatory logic
[12:11] <vixey> (And do just interrupt me if I say anything that doesn't make sense or make any mistakes)
[12:12] <vixey> so the syntax for untyped lambda calculus (I'll call lam) is: <lam> ::= <var> | <lam> <lam> | \<var>, <lam>
[12:12] <vixey> so terms might look like, \x -> x,  \x y -> y x, \u -> u u
[12:13] <vixey> \ is supposed to look like a lambda symbol, I just chose not to use unicode for this
[12:14] <vixey> I'm not going to be tediously formal because the idea is quite simple but the first relation I need is alpha-equivalence
[12:14] <vixey> \x y -> y x is alpha equivalent to \u v -> v u; in general alpha equivalence means that I renamed some bound variables
[12:16] <vixey> oops I've been typing -> instead of , (going to start using , instead of -> like I meant to..)
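A minimal sketch of the three-production syntax and of alpha-equivalence, assuming a Python encoding. The class names Var, App, Lam and the function alpha_eq are illustrative and not from the talk. Two terms are compared by checking that each variable refers to the same binder position on both sides, which is one common way to make "renamed some bound variables" precise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


def alpha_eq(s, t, env_s=(), env_t=()):
    """Compare two terms up to renaming of bound variables.

    env_s and env_t list the binders currently in scope, innermost first.
    """
    if isinstance(s, Var) and isinstance(t, Var):
        i = env_s.index(s.name) if s.name in env_s else None
        j = env_t.index(t.name) if t.name in env_t else None
        if i is None and j is None:
            return s.name == t.name  # both free: names must agree
        return i == j  # both bound: must point at the same binder
    if isinstance(s, App) and isinstance(t, App):
        return (alpha_eq(s.fn, t.fn, env_s, env_t)
                and alpha_eq(s.arg, t.arg, env_s, env_t))
    if isinstance(s, Lam) and isinstance(t, Lam):
        return alpha_eq(s.body, t.body,
                        (s.param,) + env_s, (t.param,) + env_t)
    return False


# \x y, y x   and   \u v, v u   from the talk
flip_xy = Lam("x", Lam("y", App(Var("y"), Var("x"))))
flip_uv = Lam("u", Lam("v", App(Var("v"), Var("u"))))
# \u v, u v   is a different term, not just a renaming
apply_uv = Lam("u", Lam("v", App(Var("u"), Var("v"))))
```

With these definitions alpha_eq(flip_xy, flip_uv) holds and alpha_eq(flip_xy, apply_uv) does not.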
[12:16] <vixey> so next I need beta reduction for terms, which is just substitution. For example (\x, \y, x y x) e ~beta~> \y, e y e
[12:17] <vixey> and I can identify terms as beta equivalent if there is some sequence of beta reductions after which they become equal
[12:18] <vixey> so for example \i, i is alpha-beta equivalent to (\o, (\u v, v) o o) because (\o, (\u v, v) o o) ~beta~> \o, o
[12:19] <vixey> There's also the notion of eta equivalence, that is any function f is equivalent to \x, f x (and eta equivalence will be the transitive closure of that)
[12:19] <vixey> This untyped calculus is really not a useful proof system though, because 1) There's no meaning to the terms 2) alpha-beta-eta equality is not decidable
[12:20] <vixey> so diversion 2 is the S K combinatory logic
[12:20] <vixey> we take two axioms,
[12:20] <vixey>  K: A => (B => A)
[12:20] <vixey>  S: (A => (B => C)) => ((A => B) => (A => C))
[12:21] <vixey> and you can combine them (modus ponens)
[12:21] <vixey> so for example
[12:21] <vixey>  SK: ((A => B) => (A => A))
[12:21] <vixey>  SKK: (A => A)
[12:22] <vixey> S & K is actually a complete basis for the implicational fragment of intuitionistic logic (Gentzen's LJ)
[12:23] <vixey> We can actually assign types to lambda terms in a very simple way though
[12:24] <vixey> There will be a rule for each of the var, app and lam productions of the syntax earlier
[12:25] <vixey> in the natural deduction style,
[12:25] <vixey> Gamma |- M : A -> B
[12:25] <vixey> Gamma |- N : A
[12:25] <vixey> --------------------
[12:25] <vixey> Gamma |- M N : B
[12:25] <vixey> (I better give a syntax for simple types,    <type> ::= <var> | <type> -> <type>)
[12:26] <vixey> this rule says that in a type context Gamma, given that M has the type A -> B, and N has type A, then the application M N has type B
[12:26] <vixey> for the lambda abstraction the rule is:
[12:26] <vixey> Gamma; x : A |- M : B
[12:26] <vixey> --------------------
[12:26] <vixey> Gamma |- \x, M : A -> B
[12:27] <vixey> which says that if M has the type B in the context Gamma extended with x given some assumed type A, then \x, M has type A -> B
[12:27] <vixey> the variable case means you just take the type from the context:
[12:27] <vixey> Gamma; x : T; Delta |- x : T  (x not in Delta)
[12:28] <vixey> We can now define the S and K combinators using this language, called simply typed lambda calculus
[12:28] <vixey> K = \x,\y,x
[12:28] <vixey> S = \m,\n,\x,mx(nx)
[12:28] <vixey> I'll show the type derivation for S
[12:29] <vixey> m : A -> B -> C, n : A -> B, x : A |- mx : B -> C
[12:29] <vixey> m : A -> B -> C, n : A -> B, x : A |- nx : B
[12:29] <vixey> m : A -> B -> C, n : A -> B, x : A |- mx(nx) : C
[12:29] <vixey> m : A -> B -> C, n : A -> B |- \x, mx(nx) : A -> C
[12:29] <vixey> m : A -> B -> C |- \n,\x,mx(nx) : (A -> B) -> (A -> C)
[12:29] <vixey> |- \m,\n,\x,mx(nx) : (A -> B -> C) -> (A -> B) -> (A -> C)
[12:30] <vixey> So now this lambda notation is another way to write proofs, as the S and K Hilbert system was
[12:30] <vixey> This (simply typed lambda) calculus has lots of nice properties, the one that really matters is strong normalization.
[12:30] <vixey> meaning that every term reduces down to a unique normal form
[12:31] <vixey> Turing has shown that this is strongly normalizing by lexicographic induction on a list of types sorted by complexity, but you can also do induction on the size of the type derivation and in various other ways.
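The three typing rules (var, app, lam) can be read directly as a recursive checker. The following is a rough sketch under assumptions, not the speaker's code: every lambda carries an explicit annotation for its parameter type (the rules in the talk leave A to be guessed), and all names (TVar, Arrow, type_of, and so on) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TVar:          # <type> ::= <var>
    name: str


@dataclass(frozen=True)
class Arrow:         # <type> ::= <type> -> <type>
    dom: object
    cod: object


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Lam:
    param: str
    param_ty: object  # assumed annotation, not part of the talk's syntax
    body: object


def type_of(term, ctx=()):
    """Return the type of term in context ctx, or raise TypeError.

    ctx is a tuple of (name, type) pairs, most recent binding last.
    """
    if isinstance(term, Var):
        # var rule: take the most recent binding of the name
        for name, ty in reversed(ctx):
            if name == term.name:
                return ty
        raise TypeError(f"unbound variable {term.name}")
    if isinstance(term, Lam):
        # lam rule: check the body in the extended context
        body_ty = type_of(term.body, ctx + ((term.param, term.param_ty),))
        return Arrow(term.param_ty, body_ty)
    if isinstance(term, App):
        # app rule: M : A -> B and N : A give M N : B
        fn_ty = type_of(term.fn, ctx)
        arg_ty = type_of(term.arg, ctx)
        if not isinstance(fn_ty, Arrow) or fn_ty.dom != arg_ty:
            raise TypeError("ill-typed application")
        return fn_ty.cod
    raise TypeError("not a term")


A, B, C = TVar("A"), TVar("B"), TVar("C")

# K = \x,\y,x
K = Lam("x", A, Lam("y", B, Var("x")))

# S = \m,\n,\x,mx(nx)
S = Lam("m", Arrow(A, Arrow(B, C)),
        Lam("n", Arrow(A, B),
            Lam("x", A,
                App(App(Var("m"), Var("x")),
                    App(Var("n"), Var("x"))))))
```

Calling type_of(S) retraces the derivation above from the bottom up and yields the Arrow value encoding (A -> B -> C) -> (A -> B) -> (A -> C). The self-application \u, u u from the untyped examples is rejected whenever u is given a type variable as its type.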
[12:31] <vixey> The consequence is that alpha-beta-eta equality is decidable.
[12:32] <vixey> viewed as a programming language, you can express booleans, pairs, numbers and addition, subtraction, exponentiation.
[12:32] <vixey> in fact the proof of P -> P is the identity function
[12:33] <vixey> that is given i : P -> P, then  i x ~beta~> x
[12:33] <vixey> also a proof of (C <- B) -> (B <- A) -> (C <- A) is the 'o' combinator for function composition (you know "f o g")
[12:33] <vixey> This general idea of relating proofs with programs and their types (or meaning) with logical statements is called the Curry-Howard Isomorphism or BHK interpretation.
[12:34] <vixey> So this simply typed calculus <=> (implicational) LJ
[12:34] <vixey> But it's not very _useful_, you can't express second order logic, or even first order
[12:35] <vixey> if you want to prove statements like P(x) -> P(y) for example, judgements about formation of sentences are required  (things like P : * -> *; x : * |- P(x) : *)
[12:35] <vixey> (I'm using * as the type of types in this example)
[12:36] <vixey> That still wouldn't really be enough so I won't dwell on that idea
[12:36] <vixey> We want to express statements like f : R^n -> R^m, so we really need to allow values (or proofs) inside types/statements
[12:37] <vixey> (so (^) is a function * -> N+ -> *, X^1 = X, X^2 = (X,X))
[12:37] <vixey> That is, to introduce and eliminate arrows from not only proofs to proofs, but types to types, proofs to types and types to proofs
[12:39] <vixey> So to do that I have to generalize the simply typed calculus to something, this means figuring out 3 things
[12:39] <vixey> 1) generalize the X -> Y into the dependent product ||(x:X), Y
[12:39] <vixey> (I'm using || as ASCII notation for the conventional uppercase pi symbol)
[12:40] <vixey> 2) fold the value and the type level (either side of the : judgement) into one, so that the rules can apply to terms on either side
[12:40] <vixey> 3) make sure it still flies (i.e. that there is no proof of False in the theory)
[12:41] <vixey> I should probably mention something about the weird terminology: the arrow was generalized into a 'product'
[12:41] <vixey> That term comes from reading the product ||x,P(x) as forall x, P(x), i.e. P(1) and P(2) and P(3) and ...
[12:42] <vixey> I'll just show how a couple of the rules for making the type judgement changed,
[12:42] <vixey> G; (x:A) |- M : B   G |- ||x:A, B
[12:42] <vixey> --------------------------------- lam
[12:42] <vixey>    G |- (\x:A, M) : (||x:A, B)
[12:43] <vixey> the lambda case is very similar except that x might occur inside B now
[12:43] <vixey> also not every product '||x:A, B' is allowed, so  G |- ||x:A, B is there too now
[12:44] <vixey> and especially * : * is not allowed, this would allow a paradox similar to the famous set of all sets paradox.
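For readers who want to try the dependent product in a proof checker, here is a small Lean 4 sketch. Lean's type theory is a relative of the calculus in the talk rather than the same system, and the name ident is invented for this example. Lean writes ||(x:A), B as (x : A) → B and * as Type.

```lean
-- The simple arrow X → Y is the dependent product whose codomain ignores x.
example (X Y : Type) : ((_ : X) → Y) = (X → Y) := rfl

-- Here the bound variable A occurs in the rest of the type, which the
-- simple arrow could not express: ||(A:*), A -> A.
def ident : (A : Type) → A → A :=
  fun (A : Type) (a : A) => a

-- Lean stratifies its universes, so Type lives in Type 1 and not in itself.
#check (Type : Type 1)
```

The first example corresponds to step 1) above, and ident is an instance of the lam rule where x does occur inside B.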
[12:44] <vixey> really each * has a subscript index with *[0] : *[1], *[1] : *[2] but I'm going to leave all these implicit
[12:45] <vixey> the next interesting change is the app rule;
[12:45] <vixey> G |- f : (||x:A, B)  a : A'   A =alpha-beta= A'
[12:45] <vixey> ----------------------------------------------- app
[12:45] <vixey>               f a : B[x := a]
[12:45] <vixey> A =alpha-beta= A' means that the terms A and A' have to be convertible
[12:46] <vixey> and B[x := a] means that the value a is substituted inside the type B
[12:46] <vixey> (it's in the same way that beta reduction happens)
[12:46] <vixey> When you think about Curry-Howard again, this app rule is actually the cut rule
[12:47] <vixey> in fact cut elimination is strong normalization
[12:47] <vixey> And I don't know if anybody is wondering why I left out eta,
[12:48] <vixey> It causes a problem though, this term \x:A,(\x:B,x)x reduces into two different normal forms with eta
[12:48] <vixey> \x:A,(\x:B,x)x ~beta~> \x:A,x
[12:48] <vixey> \x:A,(\x:B,x)x ~eta~> \x:B,x
[12:48] <vixey> so we've lost strong normalization (and more directly the Church-Rosser property: different reductions of the same term can always be reduced further into equivalent forms)
[12:49] <vixey> Since that was all awfully abstract, I'll put it into use a bit.. and show how to build some blocks of logic up with those
[12:50] <vixey> Define False = ||(P:*), P  that is the absurd statement: for any statement P, P is provable
[12:51] <vixey> The consistency argument for this calculus (CoC) is that every term has a normal form and there is clearly no normal form with type False, so nothing proves False
[12:51] <vixey> And negation can be defined: Not P = P -> False
[12:51] <vixey> (I'm using X -> Y as shorthand for ||(_:X),Y)
[12:52] <vixey> This is still a constructive logic so we won't get excluded middle or the equivalent, Not (Not P) -> P
[12:53] <vixey> but Not (Not (Not P)) -> Not P _is_ provable
[12:53] <vixey> You can actually embed classical reasoning in the intuitionistic logic by double negation
[12:53] <vixey> Here's the actual proof term:
[12:53] <vixey> \(N : Not (Not (Not P))), \(p : P), N (\(N': Not P), N' p) : Not (Not (Not P)) -> Not P
[12:54] <vixey> You can also express equality, Eq A x y = ||(P:A -> *), P(x) -> P(y)  (by Leibniz)
[12:54] <vixey> logical conjunction, and A B = ||(C:*), (A -> B -> C) -> C (given A B : *)
[12:55] <vixey> To actually construct (or introduce) a conjunction: conj a b = \C:*, \z:A->B->C, z a b
[12:55] <vixey> and to take projections, pi1 p = p A (\a, \b, a)   pi2 p = p B (\a, \b, b)
[12:55] <vixey> Even the natural numbers are definable in this calculus, but there is a problem: we can't prove induction! :(
[12:56] <vixey> For a stronger theory (in which we can encode real mathematics like algebra, calculus, real analysis, ..)
[12:56] <vixey> You can allow defining new terms (as long as they fit some criteria) in the same way that you add new datatypes to a programming language
[12:57] <vixey> (and they each get an induction principle derived with them)
[12:57] <vixey> by Curry-Howard this means that data-constructors are logical introduction rules and the induction principle/elimination rules correspond to recursion combinators
[12:57] <vixey> so in the style of Peano, declaring N : * with constructors
[12:57] <vixey>   zero : N
[12:57] <vixey>   succ : N -> N
[12:58] <vixey> the derived induction scheme (I'll call it N!) is:  N! : ||(P:N->*), (P zero) -> (||(n:N), P n -> P (succ n)) -> ||n, P n
[12:58] <vixey> I'll try to illustrate the anatomy of it with some ASCII diagram..
[12:59] <vixey> N!                               <-- eliminator
[12:59] <vixey> : ||(P:N->*),                    <-- motive
[12:59] <vixey> (P zero) ->                      /___ methods
[12:59] <vixey> (||(n:N), P n -> P (succ n)) ->  \
[12:59] <vixey> ||n, P n                         <-- target
[12:59] <vixey> (took that terminology from Elimination with a Motive)
[12:59] <vixey> I'm sort of running out of time (I thought I'd have 10 more mins..)
[12:59] <vixey> but you can define functions like addition: x + y := N! (\_, N) y (\_ m, succ m) x
[13:00] <vixey> All of the normal logical connectives can be defined similarly,
[13:00] <vixey> False with no constructors leads to the induction principle:
[13:00] <vixey> False! : ||(P:False -> *), ||(f : False), P f  i.e. for any P:*,  False -> P
[13:01] <vixey> You can even reflect the alpha-beta conversion relation into a term of the type theory
[13:01] <vixey> Eq: ||(A:*), A -> A -> * with the constructor (or introduction rule) reflexivity : ||(A:*), ||(x:A), Eq A x x
[13:01] <vixey> Now we can prove that 1 + 1 = 2!
[13:01] <vixey> reflexivity N 2 : Eq N (1 + 1) 2
[13:02] <vixey> This judgement is derivable because both (1 + 1) and 2 are alpha-beta convertible to 2
[13:02] <vixey> I'll just wrap up with an induction proof then
[13:03] <vixey> in the same way as 1 + 1 = 2, one can show forall x, 0 + x = x,   \x, reflexivity N x : ||x, Eq N (0 + x) x
[13:03] <vixey> but not ||x, Eq N (x + 0) x
[13:03] <vixey> It has to be split into the case 0 + 0 and, assuming x + 0 = x, showing succ x + 0 = succ x
[13:04] <vixey> N! (\x, Eq N (x + 0) x)
[13:04] <vixey>    (reflexivity N 0)
[13:04] <vixey>    (\x eql, Eq! N (x + 0) (\x', Eq N (succ (x + 0)) (succ x')) (reflexivity N (succ (x + 0))) x eql) : ||x, Eq N (x + 0) x
[13:05] <vixey> that's the actual (quite complex) term, which is using the Eq elimination rule -- which corresponds to substitution (just what you'd use to prove symmetry and transitivity)
[13:05] <vixey> I was going to say a bit about proof by reflection but I think I better stop now, being 6 mins overtime :)
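The proofs from the last part of the talk can be replayed in Lean 4 (plain Lean, no Mathlib assumed). This is a hedged sketch, not the system used in the seminar: the names Talk, add, one, two, zero_add and add_zero are invented here, Lean's induction tactic stands in for N!, and congrArg stands in for the Eq! substitution step. Addition recurses on its first argument to match x + y := N! (\_, N) y (\_ m, succ m) x.

```lean
namespace Talk

-- Not (Not (Not P)) -> Not P, with the same proof term as in the talk.
example (P : Prop) : ¬¬¬P → ¬P :=
  fun (n : ¬¬¬P) (p : P) => n (fun (n' : ¬P) => n' p)

-- N : * with constructors zero and succ.
inductive N where
  | zero : N
  | succ : N → N

-- Addition by recursion on the first argument.
def add : N → N → N
  | .zero,   y => y
  | .succ x, y => .succ (add x y)

def one : N := .succ .zero
def two : N := .succ one

-- 1 + 1 = 2: both sides compute to the same normal form.
example : add one one = two := rfl

-- 0 + x = x also holds by computation alone.
theorem zero_add (x : N) : add .zero x = x := rfl

-- x + 0 = x does not compute for a variable x, so it needs induction.
theorem add_zero (x : N) : add x .zero = x := by
  induction x with
  | zero => rfl                               -- case 0 + 0
  | succ x ih => exact congrArg N.succ ih     -- from x + 0 = x to succ x + 0 = succ x

end Talk
```

The asymmetry between zero_add and add_zero is the same one pointed out at 13:03: which of the two is free depends only on which argument the definition of addition recurses on.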