Apple AI Models Use Google's TPU Instead of Nvidia Chips
In its paper, Apple said that it used two types of Google's tensor processing units (TPUs) to train its AI models.
On Monday, Apple released a technical paper detailing the AI models that power its personal intelligence system, Apple Intelligence. The paper discloses a number of important details.
Not using NVIDIA's GPUs
In the paper, Apple describes the two models behind Apple Intelligence: AFM-on-device (AFM stands for Apple Foundation Model), a 3-billion-parameter language model that runs on the device, and AFM-server, a larger server-based language model.
To train these models, Apple said in the paper, it used two types of Google's tensor processing units (TPUs), deployed in large clusters of chips. To build AFM-on-device, which runs on iPhones and other devices, Apple used 2,048 TPUv5p chips. For its server model, AFM-server, Apple deployed 8,192 TPUv4 processors.
The TPU is a processor Google developed specifically to accelerate machine learning. Compared with graphics processing units (GPUs), TPUs are designed for large volumes of low-precision computation. It has been reported that TPUs can achieve 15 to 30 times the performance of GPUs on neural-network inference workloads. However, TPU capacity can also be more expensive, because few manufacturers produce the chips and supply is tight.
NVIDIA does not make TPUs; its main focus is on designing and producing the GPUs that are widely used for AI.
Unlike NVIDIA, which sells its chips and systems as standalone products, Google sells access to TPUs through its cloud platform: customers must build their software on Google Cloud to use the chips.
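To make that concrete, here is a minimal, hypothetical sketch of the kind of program a Cloud TPU customer might write with JAX: a bfloat16 matrix multiply compiled with XLA that runs on whatever TPU cores are attached (and falls back to CPU or GPU elsewhere). It is purely illustrative and is not Apple's training code; all names in it are our own.

```python
# Hypothetical sketch: running a low-precision (bfloat16) computation on
# Cloud TPU devices with JAX. Illustrative only; not Apple's training code.
import jax
import jax.numpy as jnp

def main():
    # On a Cloud TPU VM, jax.devices() lists the attached TPU cores.
    devices = jax.devices()
    print(f"Found {len(devices)} accelerator device(s), platform: {devices[0].platform}")

    # TPUs are built for low-precision matrix math; bfloat16 is their native format.
    key_a, key_b = jax.random.split(jax.random.PRNGKey(0))
    a = jax.random.normal(key_a, (4096, 4096), dtype=jnp.bfloat16)
    b = jax.random.normal(key_b, (4096, 4096), dtype=jnp.bfloat16)

    # jit compiles the computation with XLA and runs it on the available
    # accelerator (TPU here, or CPU/GPU elsewhere, so the sketch runs anywhere).
    matmul = jax.jit(lambda x, y: x @ y)
    c = matmul(a, b)
    print("Result shape:", c.shape, "dtype:", c.dtype)

if __name__ == "__main__":
    main()
```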
By choosing Google's TPUs across the board rather than NVIDIA's GPUs, Apple may be signaling a further deepening of the collaboration between the two companies.
Apple's engineers noted in the paper that Google's chips would make it possible to build even larger and more complex models than the two described.
Reiterating that private user data is not used
When Apple released Apple Intelligence, a number of users raised privacy concerns, suggesting that Apple might have used their private data to train its models.
In the paper, Apple pushed back on those allegations, reiterating that it did not use private user data to train the models.
"We follow a strict data policy to ensure that we do not include Apple user data, and we conduct rigorous legal review of every component of our training corpus." The paper reads, "We protect user privacy through robust device-side processing and groundbreaking infrastructure such as private cloud computing. We do not use users' private personal data or user interactions in training our base model."
The paper emphasizes that Apple Intelligence was designed with Apple's core values in mind at every step and is built on an industry-leading foundation of privacy protection.
Using licensed data
Earlier this month, it was reported that Apple had used a dataset called The Pile, which contains subtitles from hundreds of thousands of YouTube videos, to train a series of models designed for on-device processing. Many of the affected YouTube creators were unaware of this and had not consented to it. Apple later responded that those models were not intended to power any AI features in its products.
In the paper, Apple again addressed how its training data is sourced and licensed.
Apple's engineers wrote in the paper: "The AFM pre-training dataset consists of a mix of high-quality data. This includes data we have licensed from publishers, curated publicly available or open-source datasets, and publicly available information crawled by our web crawler, Applebot."
Apple reportedly struck multi-year deals worth at least $50 million with a number of publishers at the end of 2023, including NBC, Condé Nast, and IAC, allowing it to use the publishers' news content to train its models.
In addition, the unauthorized use of code, even open-source code, to train models is a point of contention among developers: some open-source repositories carry no license, or carry terms of use that do not permit AI training. Apple says it sources its code data from license-filtered open-source repositories on GitHub, keeping only repositories under permissive licenses such as MIT, ISC, or Apache.
Apple said most of the code data covers 14 common programming languages, including Swift, Python, C, Objective-C, C++, JavaScript, Java, and Go.
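As an illustration of what license filtering might look like in practice, here is a minimal, hypothetical Python sketch that keeps only repository records whose license is on a permissive allow-list and whose primary language is one of those named above. The record format, field names, and allow-lists are assumptions made for this example; the paper does not describe Apple's actual pipeline.

```python
# Hypothetical sketch of "license filtering": keep only repositories whose
# license is permissive (MIT, ISC, Apache, etc.). Field names and records
# are invented for illustration; this is not Apple's pipeline.
PERMISSIVE_LICENSES = {"mit", "isc", "apache-2.0", "bsd-2-clause", "bsd-3-clause"}

ALLOWED_LANGUAGES = {
    "Swift", "Python", "C", "Objective-C", "C++",
    "JavaScript", "Java", "Go",  # plus other common languages
}

def keep_repo(repo: dict) -> bool:
    """Return True if a repository record passes the license and language filter."""
    license_id = (repo.get("license") or "").lower()
    language = repo.get("language") or ""
    return license_id in PERMISSIVE_LICENSES and language in ALLOWED_LANGUAGES

# Example usage with made-up repository metadata.
repos = [
    {"name": "example/tokenizer", "license": "MIT", "language": "Swift"},
    {"name": "example/scraper", "license": "GPL-3.0", "language": "Python"},
    {"name": "example/no-license", "license": None, "language": "C"},
]
filtered = [r for r in repos if keep_repo(r)]
print([r["name"] for r in filtered])  # -> ['example/tokenizer']
```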
To improve the mathematical capabilities of the AFM models, Apple specifically included mathematical questions and answers from web pages, math forums, blogs, tutorials, and workshops in the training set, according to the paper: "We evaluated and selected a number of high-quality publicly available datasets that were licensed to allow for use in training language models. We then filtered the datasets to remove personally identifiable information."
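As a rough illustration of the kind of PII filtering described here, the hypothetical sketch below scrubs e-mail addresses and phone-number-like strings from scraped math text. The regular expressions and placeholder tokens are assumptions for illustration only, not Apple's actual rules.

```python
# Hypothetical illustration of PII filtering on scraped math Q&A text.
# Patterns and placeholders are assumptions, not Apple's actual rules.
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s()-]{7,}\d")

def scrub_pii(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text

sample = "Solve 3x + 5 = 20. Email me at alice@example.com or call +1 555 010 0199."
print(scrub_pii(sample))
# -> "Solve 3x + 5 = 20. Email me at [EMAIL] or call [PHONE]."
```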