
Lean into Math
A Concise Guide to Vibe Mathing
Andrej Karpathy who co-founded OpenAI coined the phrase vibe coding in February 2025 by which he meant that software engineers can use Large Language Models (LLMs) such as OpenAI’s ChatGPT to generate code from input prompts. “Vibe coding is claimed by its advocates to allow even amateur programmers to produce software without the extensive training and skills previously required for software engineering.” (Wikipedia - Vibe Coding)
The response from the community was not overwhelmingly positive. In the r/programming subreddit, Kush_238 asks, “Why ‘Vibe Coding’ Makes Me Want to Throw Up?” and some of the answers were:
- It’s the new “10x engineer”, so your initial reaction tracks. It’s mostly LinkedIn lunatics parroting this…
- This is the next buzzword used by the same crop of clueless paste eaters that have been trying to build software without actually building software for 50 years.
- Vibe coding is just programming on accident.
Could this vibe idea be applied to mathematics? Can AI prove theorems? Yes, but even though these Large Language Models (LLMs) are constantly improving, they’re still prone to errors. After all, they were trained to understand and respond to language, not mathematics. What if we could improve the accuracy of LLMs by asking them to apply math tools rather than solving the problem directly? Why ask a language model to solve a math problem when you can use a calculator to provide the exact answer? Wolfram Alpha is available as an app for your phone and will accurately answer any questions you might ask.
Sometimes, to solve a problem you need to know the right substitution to make the solution possible, but even a good Computer Algebra System (CAS) like Mathematica (Wolfram Alpha is based on Mathematica) won’t be able to solve it. On the other hand, an LLM might provide an answer but you shouldn’t trust it without carefully checking the steps.
Let’s take a look at a tricky problem, and see if we can get an LLM to generate the steps needed to solve it with the help of a CAS or a theorem prover such as Lean. Combining actual thinking about the problem with an AI system and real math tools could be the path to vibe mathing.
A Tricky Problem
We want to solve for an integer such that
Wolfram Alpha couldn’t find a solution to this, but if you make a simple substitution the problem becomes much easier. Let so the equation simplifies to
and since the left side may be rewritten as
One solution is to let so both sides become zero. If , then , or . The next step is to realize that which means or So, now we have two solutions. Either or and solving for gives or
Use your favorite CAS or other programming language (I used Pari/GP) to define two functions:
fL(n) = (n^3)! - (n^3-1)!
fR(n) = 720(n^3-1)^2
For both sides are zero since and . When both sides equal , so the solutions check.
Now let’s see if an LLM can solve the problem.
DeepSeek
DeepSeek is the latest LLM. It’s completely free to use and the code is fully open-source and is actively being blocked by governments around the world due to fears of excessive user data collection. Fine. I’ll bore them to tears with math problems. Here’s the prompt I used for the problem ( is shorthand for positive integers):
I want to use the Lean4 Theorem Prover to solve for . How should I set this up in Lean?
DeepSeek responded with a six-step outline that included required Lean libraries, but Deepseek thought the problem was to show that the equation holds for all . It tried and showed it was a solution, and then it tried and again found a solution. Step 5 was:
Step 5: General Proof (if possible)
If you want to prove the equation for all n∈Z+, you would need to find a general proof. However, this might not be straightforward, and you might need to use more advanced techniques or even prove that no other solutions exist beyond a certain point.
Finally, it wrote a Lean function to test other integers.
ChatGPT
ChatGPT by OpenAI has many specialized AI tools, one of them being Wolfram. The Wolfram GPT gave this solution:
We are given the equation:
Step 1: Factorial Factorization
Rewriting the factorial terms:
Thus, substituting in the equation:
Factoring out :
Step 2: Canceling When Possible
If , we divide by :
Step 3: Solving for
We now need to find integer values of such that:
Checking small values of :
- : , division is not possible.
- : This is a valid solution.
- : (26! is vastly larger than , so this does not work.)
Since factorials grow extremely fast, checking larger values of will only increase the left-hand side much more than the right-hand side.
Conclusion
The only integer solution is:
This is almost correct, but it decided that couldn’t be a solution because that might imply a division by zero. Let’s try a hybrid approach.
The Lean Theorem Prover
Lean is both a mathematical proof assistant and a computer language designed to help formalize mathematics. This means that every step of a proof can be checked rigorously. Mathematical concepts and proofs are translated into a language that a computer can understand and verify. Imagine writing a problem out on a whiteboard. Sooner or later, you’ll miss a minus sign, or make some other minor algebraic error and the proof will be incorrect. Lean takes the tedium out of the proof.
The advantage of both Lean and Mathematica is that they don’t hallucinate the way AI systems do, but a good AI can provide the syntax used in the mathematical software. For help setting up Lean and Mathematica, see our previous article, JupyterLab Desktop.
My initial prompt to ChatGPT was
How should the problem statement be entered into Lean4? Don’t try to solve the problem, just set it up.
It showed how to set the problem up, but it also asked if I wanted to restrict instead of all integers because the factorial function is not defined for negative numbers. In fact, Ashwani Thukral described a method for applying the factorial to both negative and imaginary numbers, but we’ll keep the original restriction of non-negative integers. ChatGPT also suggested that I set the problem up as a theorem to be proved, rather than a definition.
Here’s how you can write the problem as a theorem in Lean 4, phrased as the existence of a natural number satisfying the given equation:
import Mathlib.Data.Nat.Factorial.Basic
import Mathlib.Data.Nat.Basic
open Nat
theorem exists_solution_to_factorial_equation :
∃ n : ℕ, let k := n ^ 3
k.factorial - (k - 1).factorial = 720 * (k - 1) ^ 2 :=
sorry
Explanation:
∃ n : ℕ, ...
asserts that there exists a natural number satisfying the condition.- The
let k := n ^ 3
binding makes the expression cleaner. - The
sorry
is a placeholder in Lean for a proof that hasn’t been filled in yet.
You could also add an explicit condition like n ≥ 1
if you want to avoid dealing with edge cases like .
Notice that ChatGPT made the substitution , which is in the general direction of how I want to solve the problem, but I’d rather use the original substitution . Another prompt to ChatGPT gave this:
There are two adjustments needed for correctness in Lean 4 syntax:
- Lean does not allow arbitrary expressions like
(n^3).factorial
unlessn^3
is explicitly defined as a natural number. Sofactorial
must be applied to variables of typeℕ
, and the whole expression must live in aProp
(like an equality or a∃
). - You must quantify
n : ℕ
, since your theorem refers to an unspecified variablen
.
Here is the corrected and idiomatic way to state it in Lean 4 without any substitutions:
import Mathlib.Data.Nat.Factorial.Basic
import Mathlib.Data.Nat.Basic
open Nat
theorem exists_solution_to_factorial_equation :
∃ n : ℕ, (n ^ 3) ! - (n ^ 3 - 1) ! = 720 * (n ^ 3 - 1) ^ 2 :=
sorry
Notes:
- In Lean,
n!
is written asn !
(with a space), or you can usefactorial n
. ^
is exponentiation, and!
is factorial (with proper spacing).- The
∃ n : ℕ, ...
makes this a statement about the existence of such ann
. sorry
is a placeholder so Lean can still compile the file while the proof is missing.
Then, ChatGPT seemed to get stuck on the idea that all it needed to do was to show that and were solutions to the problem:
import Mathlib.Data.Nat.Factorial.Basic
import Mathlib.Data.Nat.Basic
open Nat
theorem all_solutions_to_factorial_equation :
∃ n : ℕ, (n ^ 3)! - (n ^ 3 - 1)! = 720 * (n ^ 3 - 1) ^ 2 :=
by
-- Try n = 1
use 1
norm_num -- (1^3)! - (0)! = 1 - 1 = 0, RHS = 720 * 0 = 0
-- Try n = 2
use 2
norm_num -- (8)! - (7)! = 40320 - 5040 = 35280, RHS = 720 * 49 = 35280
If I were more familiar with the syntax of Lean, I might be able to work around this. The Mechanics of Proof by Heather Macbeth and Mathematics in Lean by Jeremy Avigad and Patrick Massot are recommended on the Lean documentation site.
The Mathematica Solution
The syntax in Mathematica is a bit more similar to how you might write equations on a whiteboard. The first step is to define the equation,
eq = Factorial[n^3] - Factorial[n^3 - 1] == 720*(n^3 - 1)^2
which Mathematica rewrites as
Next, we substitute with
eq1 = eq /. n^3 -> p + 1
Mathematica correctly generated . At this point, Mathematica began to break down. It couldn’t simplify this with the Reduce
command, so I moved on to the last step, and asked it to find the numerical answer,
NSolve[Factorial[p - 1] == 720, p]
which generated this result:
NSolve::ifun: Inverse functions are being used by NSolve, so some solutions may not be found; use Reduce for complete solution information.
Curiously, WolframAlpha finds the correct solution, which we can use to solve for :
Solve[n^3 == 0 + 1 && n ∈ Integers && n ≥ 0, n]
Solve[n^3 == 7 + 1 && n ∈ Integers && n ≥ 0, n]
returning . You can check that these answers are correct by substituting them into the original equation,
eq /. n -> 1
eq /. n -> 2
which will return True
.
Prove-It
Several other theorem-proving systems have been developed, and one that might be worth a second look is Prove-It developed at Sandia National Laboratories. Prove-It is a Python-based interactive theorem-proving assistant designed to mirror informal mathematical reasoning in a JupyterLab Notebook. Prove-It focuses on human intuition in proof construction while supporting automation.
The system is organized into a hierarchical theory package system containing axioms, conjectures, and theorems with associated proofs, and it includes mechanisms to manage theorem dependencies and avoid logical paradoxes such as Russell’s or Curry’s. In their paper, the developers of Prove-It present a formalization of the classic proof that the square root of 2 is irrational and show its potential applications, including quantum computing verification.
The value of Prove-It may be that it generates output using LaTeX symbols bringing the system closer to a whiteboard experience.

The Prove-It interface.
Getting started with AI in mathematics
The value in vibe mathing is in checking your answers. Of course, most of the value comes from using a solid prover or Computer Algebra System (CAS), and not AI. A procedure for solving problems with these tools is:
- Try to solve as much of the problem without aids as possible.
- Learn the basics of Lean and Mathematica (or other similar systems), and apply them to check your work.
- Use AI to recommend keywords. For example, ask it the difference between
Solve
andReduce
in Mathematica. - Avoid telling the AI too much about the problem. AI tends to want to solve the entire problem and can get trapped in an unproductive direction. Give it snippets or similar problems to illustrate what you’re looking for.
You might try Andrej Karpathy’s method of asking AI to solve the problem and then iterating on the solution it provides. I asked Claude (Sonnet 3.5) on Poe and it somehow decided that the solution was . When I asked if was a solution, it checked and found that it was, and then I repeated the question for . Again, it decided that this was a solution, but in the end, thought that all three answers: satisfied the equality.
Wolfram Alpha wasn’t able to find any solution. Still, this may have been a tricky problem, and you may have more luck with another one. If you want to learn how to prove theorems, I recommend The Nuts and Bolts of Proofs by Antonella Cupillari (free online). You don’t need to write out theorems and their proofs formally the way mathematicians do, either. In fact, doing the algebra for a problem step-by-step is a form of proof.
With these thoughts in mind, happy vibe mathing!
Code for this article
The JupyterLab notebook for this article is on GitHub. This should give you a good starting point for any theorems you’d like to prove.
Software
- Wolfram Language - Wolfram Language is a symbolic language, deliberately designed with the breadth and unity needed to develop powerful programs quickly. By integrating high-level forms—like Image, GeoPolygon or Molecule—along with advanced superfunctions—such as ImageIdentify or ApplyReaction—Wolfram Language makes it possible to quickly express complex ideas in computational form.
- Lean Theorem Prover - Lean is a functional programming language that makes it easy to write correct and maintainable code. You can also use Lean as an interactive theorem prover.
- Pari/GP - PARI/GP is a widely used computer algebra system designed for fast computations in number theory (factorizations, algebraic number theory, elliptic curves, modular forms, L functions…), but also contains a large number of other useful functions to compute with mathematical entities such as matrices, polynomials, power series, algebraic numbers etc., and a lot of transcendental functions.
- JupyterLab - JupyterLab is the latest web-based interactive development environment for notebooks, code, and data. Its flexible interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning.
- Prove-It! - Prove-It is a tool for proving and organizing general theorems using Python.
References and further reading
- Thukral, A.K. Factorials of real negative and imaginary numbers - A new perspective. SpringerPlus 3, 658 (2014).
- J. Avigad and P. Massot, Mathematics in Lean, Lean Community, 2023.
- H. Macbeth, The Mechanics of Proof, Fordham University, 2024.
- Jasmin Christian Blanchette, Cezary Kaliszyk, Lawrence C. Paulson, Josef Urban. Hammering towards QED. Journal of Formalized Reasoning, 2016, 9 (1), pp.101-148.
- Peiyang Song, Kaiyu Yang, Anima Anandkumar. Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean, arXiv preprint arXiv:2404.12534, 2025.
- Vibe Coding, Wikipedia.
- W. Weisstein, Factorial, MathWorld–A Wolfram Web Resource, 2024.
- M. L. Bottoni, A. S. Cattaneo, and E. Sacikara, Teaching ‘Foundations of Mathematics’ with the Lean Theorem Prover, arXiv preprint arXiv:2501.03352, 2025.
- C. Purcell, Mathematicians’ Newest Assistants Are Artificially Intelligent, Scientific American, 2024.
- M. Rieffel, The Use of Lean in Undergraduate Education, IEEE Comp. in Education, vol. 67, no. 4, pp. 23–30, 2023.
- A. Gusakov, B. Mehta, and K. A. Miller, Formalizing Hall’s Marriage Theorem in Lean, arXiv preprint arXiv:2101.00127, 2021.
- M. Luciano Bottoni, A. S. Cattaneo, and E. Sacikara, Teaching ‘Foundations of Mathematics’ with the Lean Theorem Prover, arXiv preprint arXiv:2501.03352, 2025.
- K. Yang, Computer Algebra Systems and Lean, Zulip Chat Archive, 2023.
- A. Lubrino, Five free resources to get you started with the Lean Theorem Prover, Medium, 2022.
- A. Karpathy, A Recipe for Training Neural Networks, Andrej Karpathy blog, 2019.
- J. Samuel, Human Languages as New Hot Programming Languages, personal blog.
- B. Edwards, Will the Future of Software Development Run on Vibes?, Ars Technica, 2025.
- W. Witzel, R. Carr, W. Craft, J. E. Madrid Larrañaga, Prove-It: A Proof Assistant for Organizing and Verifying General Mathematical Knowledge, arXiv, 2020.
- A. Cupillari, The Nuts and Bolts of Proofs 4th ed., Elsevier, 2013.
Image credits
- Hero: How Michael Jackson pulled off Smooth Criminal tilt revealed by scientists. Sky News, 23 May 2018.
- Michael Jackson - Smooth Criminal (Official Video). Tilt starts at 7:05.
- Proof of the irrationality of - Prove-It