HawkInsight

OpenAI employee publicly accuses xAI of publishing misleading benchmark results for its latest AI model, Grok 3

According to online reports, an OpenAI employee recently accused Elon Musk's company xAI of publishing misleading benchmark results for its latest AI model, Grok 3. In response, xAI co-founder Igor Babuschkin insisted that the company had done nothing wrong. xAI's chart shows two versions of Grok 3, Grok 3 Reasoning Beta and Grok 3 mini Reasoning, outperforming OpenAI's strongest currently available model, o3-mini-high, on AIME 2025. However, OpenAI employees quickly pointed out on X that xAI's chart omitted o3-mini-high's AIME 2025 score under the "cons@64" condition, in which the model is given 64 attempts per question and its most frequent answer is taken as final. Babuschkin countered on X that OpenAI has itself published similarly misleading benchmark charts in the past, although those charts compared the performance of OpenAI's own models.
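For readers unfamiliar with the metric, "cons@64" is generally understood as "consensus at 64": the model answers each question 64 times, and the most common answer is the one that gets graded. The minimal Python sketch below shows why this tends to produce higher scores than grading a single attempt. The function names and sample data are hypothetical and chosen only for illustration; this is not xAI's or OpenAI's evaluation code.

```python
from collections import Counter

def score_at_1(samples, correct):
    """Grade only the first sampled answer (a single-attempt score)."""
    return 1.0 if samples[0] == correct else 0.0

def cons_at_k(samples, correct, k=64):
    """Grade the most frequent answer among the first k samples (majority vote)."""
    votes = Counter(samples[:k])
    consensus, _count = votes.most_common(1)[0]
    return 1.0 if consensus == correct else 0.0

# Hypothetical question whose correct answer is "42".
# The first attempt is wrong, but 40 of the 64 attempts are right.
samples = ["17"] + ["42"] * 40 + ["17"] * 23

print(score_at_1(samples, "42"))  # single attempt: wrong
print(cons_at_k(samples, "42"))   # majority of 64 attempts: right
```

Because majority voting can rescue a question that a single attempt gets wrong, a cons@64 score and a single-attempt score for different models are not directly comparable on the same chart.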

Disclaimer: The views in this article are from the original Creator and do not represent the views or position of Hawk Insight. The content of the article is for reference, communication and learning only, and does not constitute investment advice. If it involves copyright issues, please contact us for deletion.
