Formal or not formal? That is the question in AI for theorem proving

3 points by xcthulhu


k749gtnc9l3w

First there is the question on whether computers will ever prove interesting research-level mathematical theorems autonomously at all

Well with symbolic AI the answer is already yes… https://en.wikipedia.org/wiki/Robbins_algebra

Also, there have been precedents of human research-level mathematical advances being almost completely «apply this theorem from adjacent field, finding which one was tricky though», so probably if someone goes all-in on Rocq or Lean or even first-order logic with hints from the existing papers they should be able to find another published-and-cited open problem resolvable like that.

But overall it will probably be like the classical symbolic AI plus new neural-network methods for prioritising search branches plus LLM-based prioritisation hints based on human-written papers.

And certainly this story shows that it can be used for some research level mathematics. But all of it? Not so fast.

A thing not to forget: areas with interesting remaining questions have died off for culture reasons before, so the real question is whether computer tools can incentivise shifting to areas where they are useful.

Applications can keep an area relevant against a fashion, but I am not sure worst-for-automatisation areas are close enough to the applications to be kept active regardless of other trends.

lacker

We need a workflow in formal mathematics where humans introduce the definitions and AIs handle the proofs. The proofs are the vast majority of the difficulty of formalizing mathematics. And introducing an ugly proof isn't anywhere near as bad, in terms of technical debt, as introducing an ugly definition.

MrFantastik

My worries about the formal approach are that currently even the best formal mathematics libraries do not contain the definitions needed to understand most of what is happening in modern research mathematics.

This seems like someone just has to do the work to start contributing these to mathlub. I would be down to take a stab at automating this process but I lack the facilities to verify their correctness :p