HawkInsight


Microsoft Open-Sources Magma, a Multimodal AI Agent Model

According to online reports, at 3 a.m. today Microsoft open-sourced Magma, a multimodal foundation model for AI agents, on its official website. Unlike traditional agents, Magma has multimodal capabilities that span both the digital and physical worlds, and it can automatically process different types of data such as images, videos, and text. For example, Magma can be used to place e-commerce orders or check the weather automatically; it can also operate a physical robot or offer assistance during a game of chess on a real board. In addition, Magma has a built-in psychological prediction function, which strengthens its understanding of spatio-temporal dynamics in future video frames and lets it accurately infer the intentions and future behaviors of people or objects in a video.

Disclaimer: The views in this article are those of the original creator and do not represent the views or position of Hawk Insight. The content of this article is for reference, communication, and learning only, and does not constitute investment advice. If it raises any copyright issues, please contact us to have it removed.
