AI does math olympiad. How about open problems
[essay] 2 min read
Abstract
Seed-Prover’s success comes from a bottom-up search that builds a knowledge base by curating lemmas. Such is analogous to fuzzing’s two-stage search and queue culling. Further, techniques for directed fuzzing may be a key to apply Seed-Prover to open-ended problems.
본문
Seed-Prover succeeds in solving difficult IMO problems by a systematic, bottom-up search of facts. It does not focus on the problem goal and make conjecture top-down. Instead, it first generates thousands of candidate lemmas before considering the goal. It then builds a knowledge base by scoring lemmas based on difficulty and empirical relevance, and uses top-scored ones to make and prove conjectures. The approach is akin to how human would map out a problem’s landscape before hypothesizing a direct solution.
This architecture is conceptually parallel to modern directed fuzzing. The initial, broad generation of lemmas matches exploration phase to maximize coverage. Selecting top-ranked lemmas for a focused proof matches exploitation and queue culling. Furthermore, prioritizing difficult lemmas matches energy scheduling between targets in multi-directed fuzzing.
However, this core heuristic may not work well for open-ended problems. IMO problems are closed-world systems with a finite search space, where all given clues are guaranteed to be relevant for the proof. Thus, every lemma marks progress, which validates prioritizing difficult (non-trivial) lemmas. Open-ended problems in contrast have a vast search space, where the lemmas criticality comes from its creative insight, not difficulty.
To address this, we might draw inspiration from relevance scoring in directed fuzzing. We could evaluate how relevant a lemma’s pre and post-conditions (properties) are to the final proof’s requirements, which will prune search space and prioritize lemmas effectively. The fundamental challenge in this lies in defining the distance metric for such an analysis. The key step forward is to define one on the space of mathematical knowledge, that would capture the progress of a proof.
분류:수학과 AI