Google researchers have developed an artificial intelligence (AI) math system that can outsmart gold medalists in international geometry competitions.

The system, called “AlphaGeometry2” (AG2), is an advanced AI framework capable of solving 84% of geometry problems posed in the International Mathematical Olympiad (IMO). By comparison, the average IMO gold medalist solved 81.8% of Olympiad problems.

Engineered by Google DeepMind, it can engage not only in pattern matching but also in creative problem-solving, the scientists said. They outlined their findings in a study uploaded Feb. 7 to the preprint arXiv database.

The company’s announcement comes one month after Microsoft released its own advanced AI math reasoning system, rStar-Math, which uses small language models (SLMs) to solve complex equations. Both companies seek to dominate the AI math domain because scientists say that systems highly capable of solving math problems might sufficiently mimic other forms of human reasoning. AG2 differs from Microsoft’s rStar-Math in that it focuses on solving advanced problems with a hybrid reasoning model, whereas rStar-Math uses smaller language models to solve a broader range of problems.

Google released the original version of AlphaGeometry in January 2024, and its latest version shows a performance increase of 30% over previous iterations, the scientists said in the study. The improvements in AG2 focus on mastery of geometry which, unlike calculus and algebra, requires a mix of visual reasoning and logic to solve complex problems.


Experts, however, caution against treating this milestone as artificial general intelligence (AGI), in which an AI system outperforms humans across multiple disciplines regardless of its training data, rather than being superhuman in just one.

“AlphaGeometry2 represents a form of intelligence, but human intelligence goes far beyond this — we invent, rather than simply apply knowledge or create the illusion of thought,” John Bates, CEO of AI company SER Group and a doctor in computer science from the University of Cambridge, told Live Science.

How AI can solve the hardest math problems

DeepMind’s breakthrough is the successful combination of neural language models and symbolic engines (logic-based systems designed to solve problems using symbols and parameters). The language model suggests geometric constructions while the symbolic engine tests them. This pairing enables the system to take the everyday language a human would see in a geometry problem and convert it into “auxiliary constructions” that the symbolic engine can understand and test.

The two components then work in concert, with the language model proposing new constructions if previous ones don’t work. This search for solutions is done in parallel, passing information from one side of the system to the other until it arrives at a solution.
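The propose-and-test loop described above can be illustrated with a toy sketch. This is not DeepMind’s implementation: the names `deduce` and `solve`, the string-based facts and the hand-written rules are all hypothetical, a fixed list of candidate constructions stands in for the language model, and the candidates are tried one at a time rather than in parallel.

```python
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple

# A rule says: if every premise is known, the conclusion is known too.
Rule = Tuple[FrozenSet[str], str]


def deduce(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Toy symbolic engine: apply rules until no new fact appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def solve(facts: Set[str], rules: List[Rule], goal: str,
          proposals: Iterable[str]) -> Optional[List[str]]:
    """Return the auxiliary constructions needed to reach the goal, or None."""
    if goal in deduce(facts, rules):
        return []  # provable without any extra construction
    for construction in proposals:  # stand-in for language-model suggestions
        if goal in deduce(facts | {construction}, rules):
            return [construction]
    return None  # every proposed construction failed


# Hypothetical example: base angles of an isosceles triangle are equal.
facts = {"AB = AC"}
goal = "angle ABC = angle ACB"
rules = [
    (frozenset({"AB = AC", "M is midpoint of BC"}),
     "triangle ABM is congruent to triangle ACM"),
    (frozenset({"triangle ABM is congruent to triangle ACM"}), goal),
]
proposals = ["D is the foot of the altitude from B", "M is midpoint of BC"]

result = solve(facts, rules, goal, proposals)
print(result)
```

In this sketch the first proposed construction leads nowhere, so the loop moves on to the midpoint, which lets the rule engine reach the goal. A real system would generate proposals with a trained model and search over many combinations of constructions at once.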

AG2 is better than the first version thanks to a neural language model trained on a larger and more diverse data set, alongside a faster symbolic engine primed to verify more geometric constructions. The system also boasts a unique algorithm for searching and finding geometric proofs.

The DeepMind researchers noted that AG2’s drawbacks include its longer processing time and its inability to handle the most challenging IMO geometry problems: those involving 3D geometry, non-linear equations, variable points (points that change position within a geometry problem) or infinite points (problems with an infinite sequence of points and infinitely many solutions). Finally, the system can’t explain how it reached its solutions in any language a human can understand.

The scope of DeepMind’s aspirations for its AG2 system remains squarely in the improvement of mathematical reasoning. Yet improvements in this area can be applied to several disciplines including engineering design, automated systems verification, robotics, pharmaceutical research and genomic research, the scientists said.

The plan is for AG2 to deliver full, error-free automation of geometry problem-solving, the scientists added. In future versions, they hope to expand its support for more geometric concepts and break problems into subgroups. They also plan to speed up the inference process and improve system reliability.
