Researchers at Google DeepMind have made remarkable progress in the field of mathematics. They developed two AI systems, AlphaProof and AlphaGeometry 2, to tackle questions from the prestigious International Mathematical Olympiad (IMO).
These AI systems came tantalisingly close to securing a gold medal, scoring 28 out of 42 points. This achievement not only highlights the potential of AI in mathematics but also points to its current limitations.
AI Systems Tackle Olympiad Challenges
Researchers at Google DeepMind have taken a significant step forward with two new AI systems, AlphaProof and AlphaGeometry 2. These systems were built to solve questions from the International Mathematical Olympiad (IMO), an annual competition for secondary-school students.
The Olympiad consists of six difficult questions covering various fields like algebra, geometry, and number theory. DeepMind’s systems scored 28 out of 42, just one point shy of a gold medal, earning a silver instead.
Performance Analysis of AlphaProof
AlphaProof solved three of the six problems using a method that pairs a large language model with reinforcement learning. The system was trained on a vast number of maths problems written in English, enabling it to generate proofs in the formal language Lean.
AlphaProof relies on an approach known as formal mathematics, in which a proof is written as a computer program that can be checked mechanically: the program is only accepted if every step of the proof is valid. Although this method can crack complex problems, it took AlphaProof three days to find a correct formal proof for one of the toughest questions.
Lead researcher Thomas Hubert described the goal as building a bridge between formal and informal mathematics, allowing the system to improve itself by learning from proofs it can verify to be true. This method guaranteed accuracy, but it also revealed how slowly AI can work through particular problems.
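The idea of a proof as a checkable program can be made concrete with a toy example. The sketch below, written in Lean, is purely illustrative and is far simpler than anything AlphaProof tackles; the theorem name is our own invention.

```lean
-- In Lean, a theorem statement is a type, and the proof is a
-- program (a term) of that type. Lean accepts the file only if
-- the proof is actually valid, so a checked proof is guaranteed
-- to be correct.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Replacing the proof term with something wrong would make
-- Lean reject the file outright: there is no way to "bluff"
-- a formal proof past the checker.
```

This is what makes formal proofs attractive as training material: a system can generate candidate proofs freely, and the proof checker provides an unambiguous signal about which ones are correct.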
AlphaGeometry 2’s Efficacy
Unlike AlphaProof, AlphaGeometry 2 focused on geometry problems and solved them remarkably fast. It pairs a language model with a symbolic deduction engine, and dispatched one problem in just 19 seconds.
Its solution was described as unexpectedly efficient, and was likened to DeepMind’s famous “move 37” in its historic Go victory, where the AI made a move no human had thought of.
Thang Luong, the lead on AlphaGeometry 2, noted that the solution connected many triangles elegantly. The system constructed a circle around another point, an approach that initially seemed confusing but was eventually appreciated for its elegance and effectiveness.
Challenges and Limitations
Despite their success, the AI systems showed clear limitations. They either answered a question perfectly or could not start it at all, and they worked under no time limit, unlike human competitors, who get nine hours in total across two sittings.
For some questions, DeepMind’s systems took up to three days to find an answer, highlighting the current limits of AI when it comes to tackling very complex problems swiftly.
This raises questions about the practical applications of AI in real-world scenarios where time is often a critical factor.
Example Problems Faced
AlphaGeometry 2 tackled a problem involving a triangle ABC with specific conditions. It solved this in 19 seconds, making it seem almost effortless.
The problem was: Let ABC be a triangle with AB < AC < BC. Let the incentre and incircle of triangle ABC be I and ω, respectively. Let X be the point on line BC different from C such that the line through X parallel to AC is tangent to ω.
Similarly, let Y be the point on line BC different from B such that the line through Y parallel to AB is tangent to ω. Let AI intersect the circumcircle of triangle ABC again at P ≠ A. Let K and L be the midpoints of AC and AB, respectively. Prove that ∠KIL + ∠YPX = 180°.
Expert Reviews
Prof Timothy Gowers, who marked the answers, noted that the AI systems were either flawless or hopeless in their responses. He appreciated the innovative solutions but also pointed out the inefficiency in tackling some questions.
Gowers’ comments highlighted the potential and the current limitations of AI in solving high-level mathematical problems.
Future Implications
The success of these AI systems in the Olympiad indicates a promising future for AI in mathematics. However, their limitations also show that more work is needed.
As AI continues to evolve, it may eventually match or even surpass human capabilities in solving complex problems. This development could revolutionise fields that rely heavily on mathematical research.
The advancements by Google DeepMind with AlphaProof and AlphaGeometry 2 show the potential that AI holds for complex mathematics. Their near-gold performance in the International Mathematical Olympiad marks a significant achievement.
However, the evident limitations, such as the lengthy time required to solve some problems, highlight areas needing further improvement and the gaps that remain between AI and human problem-solvers.
As research continues, AI may soon bridge these gaps, revolutionising mathematical problem-solving and providing new tools to the academic community.