Intro to proof assistants
who transitioned into knowledge science, and I work every day with machine studying algorithms. I’m fascinated each by their obvious magic and by the arithmetic that underlies them. Pry open any machine studying library and also you’ll discover mathematical tips involving matrix decompositions, convolutions, Gaussian curves, and extra. These, in flip, are constructed on much more basic axioms and guidelines, resembling operate software and logic.
Throughout my journey to be taught these primitives, I collected a complete shelf of arithmetic books, a lot of which now collect mud. I additionally enrolled on the Open College, the place I attended courses over Zoom and realized about syntax and axioms, then how one can mix them to construct theorems.
The subject was fascinating: axioms are like recreation items and authorized strikes on a board, whereas theorems are authorized performs, some extra attention-grabbing than others. Proving a theorem means unwinding the gameplay simply sufficient to persuade the gamers that it’s reachable, or disproving it by mentioning an impossibility. For instance: “The 2 bishops are on white squares, and a bishop can by no means change to a sq. of a special shade.”
However on a sensible degree, the programs have been tedious. I used to be given PDFs to resolve by hand after which submit scans of my work on-line. Whereas scribbling with pencil on paper, scuffling with the course assignments I had two insights:
- Constructing a proof is like programming.
- There’s no manner I can interact with arithmetic in these early century situations. I want an IDE, with sort hints, syntax highlighting, and descriptive error messages, and have interaction with it interactively.
My ideas weren’t new. The primary thought is known as the Curry-Howard correspondence, defined beneath. The latter is the objective of proof assistants resembling Lean and their IDE extensions (often a VS Code extension).
Interactive proof assistants vs computerized theorem provers
Theorem checkers like Lean’s kernel can decide whether or not expressions are well-formed, if the steps are authorized, and if the ultimate conclusion is legitimate. Interactive proof assistants construct on them and also will assist you construct your proofs and counsel steps. Automated theorem provers (ATPs) try to search out the proof on their very own utilizing AI strategies. Lean works as a proof checker and assistant, however it pairs nicely with generative AI to operate successfully as an ATP.
Terence Tao streams AI-assisted proofs on his YouTube channel utilizing Lean 4 and GitHub Copilot. Many of the headlines within the type of “AI simply solved an x-years-old open downside” have been seemingly formalized with Lean (instance 1, instance 2, and 3). AI initiatives and benchmarks resembling LeanDojo, miniF2F, ProofNet, PutnamBench, FormalMATH, use Lean. OpenAI and Meta have skilled fashions to make use of Lean, and DeepMind’s AlphaProof used it to earn a silver medalist on the Worldwide Mathematical Olympiad.

There’s a nice current corpus of guides and tutorials for Lean, in addition to a very good group, but the audience doesn’t appear to be builders skilled in different programming languages. I used to be struggling to parse Lean’s syntax in my head, and as I realized, I observed I used to be strolling on unpaved territory. Hopefully the next sections will information you on this path.
The Curry-Howard correspondence
Let’s begin with a easy logic theorem, one thing you might need to show in an undergraduate logic class:
(P → Q → R) → (P ∧ Q → R)
The operator → denotes logical implication, however within the Curry-Howard correspondence between proofs and pc packages it may be learn as a operate. ∧ is the logical “and.” In plain English, we’re asking whether or not this assertion is true: “If I’ve a operate that, given P, produces a operate from Q to R, then if I have already got each P and Q already, I can produce R.”
Be aware:
(P → Q → R)is a “curried operate,” (Haskell Curry is right here responsible as nicely), and the syntax is equal to(P → (Q → R)). In programming phrases, it’s a operate that returns one other operate, which may later be invoked asfoo(bar)(baz). It’s logically equal tofooreceiving eachbarandbazas a tuplefoo((bar, baz)), however that’s what we’re making an attempt to show right here. Likewise, the whole sentence could possibly be written as(P → Q → R) → P ∧ Q → R.
Let’s think about this with a concrete instance: If I’ve a merchandising machine that given a coin (P) returns a cup of instantaneous noodles (Q → R), which given scorching water (Q) returns noodles (R), than implies that I could make a system that given each a coin and scorching water (P ∧ Q) as inputs, I can return noodles.

In case you have been instructed to show that that is doable as a programmer, you is perhaps tempted to take action by writing a operate that conforms to this state of affairs. In TypeScript we’d write:
Strive it within the TypeScript playground, modifying something will trigger sort errors.
Written generically and simplified (playground):
We outlined a operate — our theorem — that receives a curried operate (P → Q → R) as a parameter and outputs a operate that receives a product sort P ∧ Q and outputs R, that’s, (P ∧ Q → R). This operate acts as a “witness” that the logic theorem is true. The proof appears to be like sound and it even satisfies the categories that spell the unique theorem in TypeScript:
Let’s state this even stronger: The proof is sound as a result of it glad the sort. Propositions are sorts, and proofs are phrases of these sorts. We additionally noticed that conjunctions ∧ are product sorts / tuples, and that logical implications Q → R are capabilities (q: Q) => R. The arrow is equal each syntactically and semantically.
Let’s implement this in Lean:
You’ll be able to play with it in Lean’s playground.
instance is a approach to outline nameless declarations solely to verify the categories (not like the key phrase theorem), and its sort (what follows the colon : as much as the definition :=) is our theorem. The proof is achieved when a definition “inhabits” this sort. As is trendy in useful programming, operate software is simply whitespace: probably the most basic operation is mapped to probably the most primary operator within the language. P ∧ Q is a product sort / tuple, an its members are accessed through .left and .proper .
Discover that the sort and the implementation are almost equivalent — the sort alone is sufficient to decide the code. In truth, given a sort signature, you should utilize Loogle to seek for matching definitions in Lean’s library.
Different Curry-Howard Correspondences
| Logic/Lean | Kind type | TypeScript Approximation |
|---|---|---|
| P ∧ Q | product sort | [P, Q] or { left: P; proper: Q } |
| P ∨ Q | sum sort | P | Q * |
| P → Q | operate sort | (p: P) => Q |
| True | unit sort | undefined, or null |
| False | empty sort | by no means |
| ¬P | operate to empty sort | (p: P) => by no means |
| P ↔ Q | pair of capabilities | [(p: P) => Q, (q: Q) => P] |
| (P ∧ Q) → R | operate from product | (pq: [P, Q]) => R |
| P → Q → R | curried operate | (p: P) => (q: Q) => R |
* A greater sum sort could be a discriminated union: { tag: “left”; worth: P } | { tag: “proper”; worth: Q }
What you haven’t seen but
Dependent Sorts
Let’s take a look at the next, extra advanced, sort:
It means: “For all values (∀) of x which might be pure numbers, 0 is much less or equal to x.” the very first thing we discover is the Unicode mathematical symbols; these can both be auto-completed within the IDE with forall or just substituted with the ASCII key phrase forall.
This strikes us as not wanting like a sort. The sort is dependent upon the worth of x — one thing that we often get from advanced validation libraries like Zod, not from static sort checkers like TypeScript. It’s a sort that “relies upon” on values. But Lean’s sorts don’t depend on runtime checks the best way Zod does; it’s nonetheless a sort, albeit a fancy one. This can be a supply of confusion when studying Lean’s syntax.

Universes
Within the playground or IDE, we are able to verify the kind of phrases with #verify:
Now let’s verify the kind of Nat:
Nat is of sort Kind. However why cease there?
Kind is a sort of Kind 1, and so forth. This reveals that there isn’t a tough syntactic distinction between sorts and phrases in Lean. Sorts are phrases which have their very own sorts, stacked like onion layers — these layers are referred to as ‘universes.’ 3 is a time period of sort Nat and Nat is a time period of sort Kind, Kind n has sort Kind (n+1), and so forth.
Lean should maintain monitor of universe ranges to keep away from the barber’s paradox: “A barber shaves everybody who doesn’t shave themselves, does the barber shave himself?” With sort universes, the barber is a process whose sort can solely function on components of a sort from a universe beneath it, it can not function components by itself degree, together with himself. This avoids the self-referential paradoxes that would trigger inconsistency. Inconsistency would imply we may show False to be true, and thru it show all propositions.
Propositions
Let’s attempt a fair tougher sort:
In plain English: “Is that this true? For each pure quantity x, y, z, and n, given a proof that n > 2and x * y * z ≠ 0, we are able to produce a proof that xⁿ + yⁿ ≠ zⁿ”
That is Fermat’s Final Theorem, which dumbfounded mathematicians for 358 years after Fermat claimed his personal proof didn’t match the margins of the web page. The definition makes use of sorry which is a placeholder for an incomplete proof. It would compile however will inform you: declaration makes use of `sorry`. It seems like utilizing TypeScript’s any to silence its sort errors, however a reviewer may discover it and ask you to repair the lint error. Offering this proof is your “objective.”
Let’s stroll again into one thing simpler:
That is ridiculous. How can I outline a time period whose sort is 2 + 2 = 4?
Let’s begin by wanting on the sort:
The sort is a Prop, quick for proposition. A proposition is a press release that may be true or false. As we acknowledged above, within the Curry-Howard correspondence a proposition is a sort, and a proof is a time period or program of that sort. The proposition is true if the sort has “an inhabitant” and false if it doesn’t. Lean doesn’t “resolve” a proposition right into a Boolean true or false in a computational sense; we simply say it’s “true” or “false.” Utilizing Lean as a proof assistant means working principally with propositions.
Behind the scenes, Lean’s sort checker will normalize this sort for you into 4 = 4, however no additional. We simply want to offer a time period that inhabits this normalized sort. We do that with rfl, an entity constructor robotically imported from the Prelude bundle into your namespace.
This works. It’s declarative syntax for asserting that 2 + 2 = 4 is true by reflexivity. To know why this works, let’s verify the sort:
These are generics, which by conference use Greek letters. The curly brackets imply that Lean will infer them for you.
rfl.{u} is the title of the operate and its universe degree, (your layer of the categories onion). {α : Type u} means α is a sort at universe degree u, nothing to do with “sortable,” it’s simply of the “kind” u. {a : α} means a is a time period of sort α, and the kind of rfl is in the end the proposition a = a. In our case, α is Nat and a is 4 , rfl has the sort 4 = 4 , matching the sort Lean normalized above. We have now discovered an inhabitant for 2 + 2 = 4 proving the proposition by reflexivity.

Ways
Earlier we solved our merchandising machine instance with operate software alone, however there are numerous extra attention-grabbing primitives in Lean. In Lean you can begin your proof with by which prompts “tactic” mode:
This allows you to construct your proof step-by-step, whereas the appropriate pane of your IDE (the InfoView) progressively reveals your native assumptions and your objective, marked by ⊢. Ways are a mix of macros and syntactic sugar that resolve the proof state for the type-checking language kernel. They assist you suppose analytically and in a declarative manner whereas tackling your objectives. It’s an imperative-style answer within a useful language, operating inside a monad.
You’ll have to familiarize your self with ways simply as somebody studying SQL does with analytical constructs like GROUP BY, PIVOT, and LEFT JOIN. Eradicating sorry will present the state and objectives within the InfoView:
We have now one objective to show. Coin and HotWater are propositions. Noodles is an unknown of kind ?u.29, a random ID for a yet-unassigned universe degree. Our objective is to provide a proof of (Coin → HotWater → Noodles) → Coin ∧ HotWater → Noodles.
Let’s use the tactic intro to introduce an assumption into scope. It would robotically seize the enter sort (Coin → HotWater → Noodles) — one of many proofs I already “have” to work with:
The InfoView will refresh and present you the up to date native assumptions and objective:
Along with Coin, HotWater, and Noodles, we now have an area speculation (or assumption): noodleVendingMachine, a relentless of sort Coin → HotWater → Noodles (a curried operate). Our objective is now to provide Coin ∧ HotWater → Noodles.
Let’s introduce one other variable that can robotically seize the subsequent assumption, of sort Coin ∧ HotWater:
InfoView now reveals:
Now we solely have to return Noodles utilizing our toolbox of ways and the assumptions in native scope. We have now a curried operate noodleVendingMachine and a product sort (tuple) coinAndHotWater. Let’s apply the Coin and the HotWater from coinAndHotWater to noodleVendingMachine:
The tactic actual serves as a return key phrase when we have now the precise sort to return. .left and .proper are the weather of the product sort / tuple coinAndHotWater, which we utilized with the intention to the curried operate noodleVendingMachine. Do not forget that whitespace is operate software.
InfoView reveals:
This may be written generically as:
Satisfying our unique theorem (P → Q → R) → (P ∧ Q → R) above.
Some ways assist decompose a objective into smaller obligations, resembling intro, circumstances, constructor, apply, or refine. Others rewrite expression utilizing assumptions of equivalences, resembling rw or simp. Different ways normalize expression by computation or algebraic manipulation, resembling rfl, and ring. And a few present focused automation: linearinth, omega, aesop. Seek the advice of the tactic reference.
Considering in ways is a chic talent that may solely come from expertise. To reuse a earlier simile, it’s like an information analyst pulling up the appropriate analytic capabilities and effortlessly cleansing up, enriching, and summarizing tables — or a senior software program engineer piping collectively the appropriate Bash instructions or recognizing the cleanest design sample for the job.
Overarching Imaginative and prescient of Theorem Provers

Leibniz dreamed of a common formal language of thought (Characteristica universalis) paired with a mechanical calculator (Calculus ratiocinator) that would show all theorems and philosophical arguments by turning a crank. Hilbert requested, “is that this doable?” — satisfied till loss of life that “there may be nothing that we will not know.” This was considered “the principle downside of mathematical logic” and “probably the most common downside of arithmetic.” Gödel although about this and proved that some truths can’t be confirmed, whereas Church and Turing each independently proved that “typically you’ll be turning that crank ceaselessly,” arguably beginning the sector of pc science within the course of. To make use of the chess analogy from earlier, automated theorem proving is like looking the sport tree for successful positions, however arithmetic has an infinite board, so the search could by no means finish — some performs are “undecidable.” Nevertheless, we are able to nonetheless verify whether or not a transfer is authorized, and that is the place we at the moment are with proof assistants.
In 1956, Bertrand Russell wrote a letter concerning Logic Theorist, that had robotically proved theorems from Principia Mathematica, and which is typically described as the primary AI program:
“I want Whitehead and I had identified of this risk earlier than we wasted ten years doing it by hand. I’m fairly prepared to imagine that every little thing in deductive logic might be carried out by equipment.”
Conclusion
Lean has a steep studying curve, partly attributable to its syntax and its use of ways, which you should be taught. Battling Lean comes, nevertheless, with a reward: you’re studying mathematics. Sorts, universes, ways, theorems, and proofs aren’t quirks of yet one more programming language however ideas that mathematicians and logicians struggled with for generations. They aren’t ineffective details to memorize however thought patterns as historical as Euclid and as everlasting because the theorems we proved.
When you’re conversant in the basics, attempt shifting on to extra superior matters with Arithmetic in Lean, the place you’ll sort out issues in discrete arithmetic, linear algebra, differential calculus, and topology. Or delve deep into the foundations ranging from zero (actually) by way of Tao’s Lean companion to “Evaluation I” Guide. Mathlib, Lean’s collaborative library, already has 1.6 million strains of formalized arithmetic that you should utilize to bridge to all branches of arithmetic.















