Curry–Howard correspondence

For most of my schooling (elementary school, high school), I had a private math teacher thanks to my parents. I believe this has shaped my career path for the better.

I’ve always wanted to be a programmer, and for a long time I was interested in nothing else. For the past few years I’ve been lucky enough to be doing exactly what I’ve always wanted: programming.

Back in elementary school, one of my math teachers kept reminding me: “Study math, it is very closely related to programming.” Now I think I really understand what that statement means.

In any case, I’ve recently started digging into the Lean Theorem Prover by Microsoft Research. With some Haskell experience and some experience with mathematical proofs, I find the tutorial mostly easy to follow.

I don’t have much experience with type theory, but I do know a few things about types from playing with Haskell. I’ve heard about the Curry–Howard correspondence a bunch of times, and it kind of made sense, but I had never really understood it in depth. Following the Lean tutorial finally gave me a proper introduction to it.

An excerpt from Wikipedia:

In programming language theory and proof theory, the Curry–Howard correspondence (also known as the Curry–Howard isomorphism or equivalence, or the proofs-as-programs and propositions- or formulae-as-types interpretation) is the direct relationship between computer programs and mathematical proofs.

In simpler words, a proof is a program, and the formula it proves is the type for the program.

Now, as an example, consider the neat little function that swaps the two components of a product type:

swap :: (a, b) -> (b, a)
swap (a, b) = (b, a)

What the Curry–Howard correspondence says is that this program has an equivalent mathematical proof.

Although it may not be immediately obvious, consider the following proof:
Given P and Q, prove Q and P.
The proof uses and-elimination to extract P and Q from the hypothesis, and and-introduction to combine them back in the opposite order.

How does this proof relate to the swap code above? To answer that, we can now consider these theorems within Lean:

variables p q : Prop
-- named and_comm' to avoid clashing with and_comm from Lean's core library
theorem and_comm' : p ∧ q → q ∧ p := fun hpq, and.intro (and.elim_right hpq) (and.elim_left hpq)

variables a b : Type
theorem swap (hab : prod a b) : (prod b a) := prod.mk hab.2 hab.1

Lean is so awesome it has this #check command that can tell us the complete types:

#check and_comm' -- and_comm' : ∀ (p q : Prop), p ∧ q → q ∧ p
#check swap      -- swap      : Π (a b : Type), a × b → b × a

Now the shapes are apparent.

We now see the following:

  • and.intro is equivalent to prod.mk (making a product)
  • and.elim_left is equivalent to prod.fst, i.e. taking the first component of the product
  • and.elim_right is equivalent to prod.snd, i.e. taking the second component of the product
  • forall is equivalent to the dependent function type (Pi-type)
    A dependent type is a type whose definition depends on a parameter. For example, consider the polymorphic type List a: it depends on the parameter a, so List Int is a well-defined type, and List Bool is another example.

    More formally, if we’re given A : Type and B : A -> Type, then B is a family of types over A.
    That is, for each a : A there is a type B a.
    The type of dependent functions mapping each a : A to an element of B a is denoted Pi a : A, B a (a small Lean sketch follows below).
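
To make the table above concrete, here is a small Lean sketch of my own (these #check lines are not from the tutorial, and the exact printed signatures may vary slightly with the Lean version and universe annotations):

#check @and.intro       -- and.intro : ∀ {a b : Prop}, a → b → a ∧ b
#check @prod.mk         -- prod.mk : Π {α β : Type}, α → β → α × β
#check @and.elim_left   -- and.elim_left : ∀ {a b : Prop}, a ∧ b → a
#check @prod.fst        -- prod.fst : Π {α β : Type}, α × β → α
#check ∀ p q : Prop, p ∧ q → q ∧ p   -- this is a Prop
#check Π a b : Type, a × b → b × a   -- this is a Type (Type 1)

The last two lines show that ∀ and Π are really the same binder: over Prop it reads as a universally quantified statement, over Type it reads as a (dependent) function type.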

To conclude, it’s interesting how the commutativity of logical AND corresponds to the swap function on product types, right? 🙂


Code checklist

Here is a checklist of some of the most important things (in my opinion) that any code change should be run through, regardless of the programming language:

  • Does what it’s supposed to do
    • Process it mentally (proof of correctness)
    • Automated tests
    • Manual tests
  • Can’t be further broken into smaller pieces
  • Is written in a generalized way (reusability)
  • Has no repeating parts
  • My future self and coworkers can understand it
    • Is documented
    • Is readable
  • Conforms to (any) coding standards for consistency
  • Separation of concerns

With experience, one doesn’t need to think about these, since they occur naturally.

There might be a couple more that I missed, but these should be a good base.

Induction with Coq

To get ourselves introduced to the induction tactic, we’ll start by showing that 0 is a right identity for the naturals under addition, i.e. \forall n \in \mathbb{N}, n + 0 = n.

The definition of addition is: 0 + n = n and S(m) + n = S(m + n), i.e. recursion on the first argument.
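
For reference, this is essentially how addition on nat is defined in Coq; the sketch below restates it under the hypothetical name my_add so it doesn’t clash with the standard library:

(* Essentially Coq's addition on nat, under a hypothetical name *)
Fixpoint my_add (n m : nat) : nat :=
  match n with
  | 0 => m                 (* 0 + m = m *)
  | S p => S (my_add p m)  (* S p + m = S (p + m) *)
  end.

Compute my_add 2 3. (* = 5 : nat *)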

Mathematically, we start with induction on n.

For the base case we have 0 + 0 = 0, which is true by the definition of addition.
For the inductive step, we assume that n + 0 = n and try to prove that S n + 0 = S n.
We can rewrite our goal using the definition of addition, obtaining S n + 0 = S (n + 0).
Now, using the hypothesis, we get S (n + 0) = S n, which is what we needed to show.

Programmatically, we have:

Compute 2+3. (* Test to see what's 2 plus 3 *)

Theorem zero_identity_on_addition : (forall n:nat, n + 0 = n).
Proof.
  intros n.
  induction n.     (* Use induction on n *)
  (* Base case *)
    simpl.         (* Simplify 0 + 0 = 0 to 0 = 0 *)
    exact eq_refl. (* Which is exactly reflexivity *)
  (* Inductive step *)
  simpl.           (* At this point we have S n + 0 = S n, in other words n + 1 + 0 = n + 1 *)
                   (* Simplify to get (n + 0) + 1 = n + 1 *)
  rewrite IHn.     (* Use the hypothesis to rewrite (n + 0) to n *)
  exact eq_refl.   (* n + 1 = n + 1 is proven by reflexivity *)
Qed.

Neat, right?

Our second example is more complex. We will:
1. Define a recursive function, fact.
2. Prove that fact(3) = 6.
3. Introduce and prove a lemma to be used for our proof.
4. Prove that fact(n) > 0.

Mathematically:

  1. fact(n) = \begin{cases} 1 & \mbox{if } n = 0 \\ n \cdot fact(n - 1) & \mbox{otherwise} \end{cases}
  2. To show what fact(3) is, we go by definitions: 3 * fact(2) = 3 * 2 * fact(1) = 3 * 2 * 1 * fact(0) = 6.
  3. To prove x > 0 \implies x + y > 0, which is the lemma we’ll use in our Coq proof, note we have as givens x > 0, y \ge 0 (since y \in \mathbb{N}).

    From here, we have two cases:
    \begin{cases} y = 0: & x + y = x + 0 = x > 0 \\ y > 0: & x + y > x + 0 = x > 0 \end{cases}

    The first case is immediate; the second follows by transitivity of > on \mathbb{N} (x + y > x and x > 0). In both cases we conclude that x + y > 0.

  4. To prove that fact(n) > 0 for n \ge 0, we use induction on n.

    Base case: n = 0: fact(0) = 1 > 0, which is true.
    Inductive step: Assume fact(k) > 0 for some k. Since k \ge 0, we have k + 1 > 0, so we may multiply both sides of the hypothesis by a positive number.

    Multiplying both sides by k + 1 gives (k + 1) \cdot fact(k) = fact(k + 1) > 0, which is what we needed to show.

    Thus fact(n) > 0 in general.

Now for the programming part.

  1. The recursive definition:

    (* If we try to use Definition, we get fact is not defined *)
    (* Coq provides a special syntax Fixpoint for recursion *)
    Fixpoint fact (n:nat) : nat :=
      match n with
        | S n => S n * fact n
        | 0 => 1
    end.
    
  2. Evaluate and prove fact(3) = 6

    Compute (fact 3).
     
    Theorem fact_3_eq_6 : (fact 3 = 6).
    Proof.
      simpl.           (* Evaluate fact 3 = 6 *)
      exact eq_refl.   (* 6 = 6 is exactly reflexivity *)
    Qed.
    
  3. We start our lemma as follows:

    Require Import Coq.Arith.Plus.
    
    Lemma x_plus_y_gt_0 : (forall x y : nat, x > 0 -> x + y > 0).
    Proof.
      intros x y.
      intros x_gt_0.
      unfold gt.           (* Convert x + y > 0 to 0 < x + y *) 
      unfold gt in x_gt_0. (* Convert x_gt_0 : x > 0 to x_gt_0 : 0 < x *)
      (* We have that Theorem lt_plus_trans n m p : n < m -> n < m + p *)
      (* So we feed 0, x, y to match the arguments, and additionally pass x_gt_0 to match the n < m part *)
      exact (lt_plus_trans 0 x y x_gt_0).
    Qed.
    

    Nothing new going on here. But we can try to be smart and rewrite the lemma as:

    Require Import Omega.
    
    Lemma x_plus_y_gt_0 : (forall x y : nat, x > 0 -> x + y > 0).
    Proof.
      intros x y.
      omega.
    Qed.
    

    The new thing we can notice here is the use of the omega tactic.

    This tactic is an automatic decision procedure for Presburger arithmetic, i.e. it will solve any true arithmetic goal that it understands. Note, however, that it only understands linear arithmetic; in particular, it cannot handle the multiplication of two variables (a small sketch of this is shown right after this list).

  4. The actual proof

    Theorem fact_gt_0 : (forall n : nat, fact n > 0).
    Proof.
      intros n.
      induction n.
      (* Base case *)
        simpl.          (* At this point we have fact 0 > 0, simplify to get 1 > 0 *)
        exact (le_n 1). (* This is exactly 1 > 0 *)
      (* Inductive step *)
        simpl.          (* Simplify to convert fact(n + 1) > 0 to fact n + n * fact n > 0 *)
                        (* Which is exactly our lemma defined above *)
                        (* We also have IHn : fact n > 0 *)
        (* Feed (fact n), (n * fact n) into x_plus_y_gt_0, and IHn as well for the x > 0 part *)
        exact (x_plus_y_gt_0 (fact n) (n * fact n) IHn).
    Qed.
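
To get a feel for what omega can and cannot do, here is a small sketch of my own (not part of the proofs above; note that recent Coq versions replace omega with the lia tactic):

Require Import Omega.

(* Addition, comparisons and multiplication by constants are all linear,
   so omega solves this goal automatically. *)
Lemma linear_ok : forall n m : nat, n <= m -> 2 * n <= 2 * m + 1.
Proof. intros. omega. Qed.

(* A goal that multiplies two variables, e.g. forall n m : nat, n * m = m * n,
   is nonlinear; omega will refuse it, and we would need induction or a
   library lemma instead. *)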
    

Thus, the complete program:

Require Import Omega.

(* If we try to use Definition, we get fact is not defined *)
(* Coq provides a special syntax Fixpoint for recursion *)
Fixpoint fact (n:nat) : nat :=
  match n with
    | S n => S n * fact n
    | 0 => 1
end.

Compute (fact 3).

Theorem fact_3_eq_6 : (fact 3 = 6).
Proof.
  simpl.           (* Evaluate fact 3 = 6 *)
  exact eq_refl.   (* 6 = 6 is exactly reflexivity *)
Qed.

Lemma x_plus_y_gt_0 : (forall x y : nat, x > 0 -> x + y > 0).
Proof.
  intros x y.
  omega.
Qed.

Theorem fact_gt_0 : (forall n : nat, fact n > 0).
Proof.
  intros n.
  induction n.
  (* Base case *)
    simpl.          (* At this point we have fact 0 > 0, simplify to get 1 > 0 *)
    exact (le_n 1). (* This is exactly 1 > 0 *)
  (* Inductive step *)
    simpl.          (* Simplify to convert fact(n + 1) > 0 to fact n + n * fact n > 0 *)
                    (* Which is exactly our lemma defined above *)
                    (* We also have IHn : fact n > 0 *)
    (* Feed (fact n), (n * fact n) into x_plus_y_gt_0, and IHn as well for the x > 0 part *)
    exact (x_plus_y_gt_0 (fact n) (n * fact n) IHn).
Qed.

To conclude from the examples above: Coq, with its type system, gives us a neat way to reason about the properties of our programs.