Welcome to Issue #87 of One Minute AI, your daily AI news companion. This issue discusses a new research announcement from Alibaba Cloud.
Introducing Qwen2-Math
Alibaba Cloud's Qwen team has launched Qwen2-Math, a new series of large language models engineered to tackle intricate mathematical problems with state-of-the-art accuracy. Built upon the Qwen2 architecture, these models are trained on a comprehensive, mathematics-specific corpus that includes web texts, books, code, exam questions, and synthetic data. Evaluations on benchmarks such as GSM8K, MATH, and Gaokao Math highlight Qwen2-Math's strength, particularly with its flagship Qwen2-Math-72B-Instruct model, which surpasses industry leaders like GPT-4o and Claude 3.5 Sonnet on mathematical tasks.
The exceptional performance of Qwen2-Math is largely credited to the use of a math-specific reward model during its development. The models have also shown strong results on challenging math competitions, such as the American Invitational Mathematics Examination (AIME) 2024 and the American Mathematics Competition (AMC) 2023. To ensure the reported results are reliable, the Qwen team applied rigorous decontamination methods to remove overlap between the training corpus and the evaluation benchmarks. Looking ahead, the team plans to release bilingual and multilingual versions of Qwen2-Math, with the goal of making advanced mathematical problem-solving accessible to a global audience.
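For readers who want to experiment, here is a minimal sketch of querying one of the instruct models with the Hugging Face transformers library. It assumes the checkpoints are published under IDs like Qwen/Qwen2-Math-72B-Instruct and that your hardware can host the weights (a smaller variant can be substituted for local testing); treat it as an illustration rather than official usage guidance.

```python
# Minimal sketch: prompting a Qwen2-Math instruct checkpoint via Hugging Face transformers.
# The model ID below is an assumption; swap in a smaller variant if the 72B weights
# are too large for your hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-Math-72B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Pose a math problem using the chat template the instruct models expect.
messages = [
    {"role": "system", "content": "You are a helpful math assistant."},
    {"role": "user", "content": "Solve for x: 2x + 3 = 11. Show your reasoning."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the echoed prompt.
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer)
```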
Want to help?
If you liked this issue, help spread the word and share One Minute AI with your peers and community.
You can also share feedback, as well as AI news you'd like to see featured, by joining our chat on Substack.