HawkInsight

The free ride is over? After Twitter, Reddit also wants to charge developers of large language models

Reddit said it is planning to charge for external access to its application programming interface (API). The specific details are under discussion and are expected to be announced in the next few weeks.

On April 18, Huffman, co-founder and CEO of the well-known overseas forum Reddit, said in an interview that he was unwilling to let AI products crawl the forum's data for free to train their language models.

Huffman pointed out that Reddit's corpus of data is extremely valuable, but that Reddit does not need to hand that value over for free to the world's largest companies.

Sources pointed out that in recent years Reddit has effectively served as a free language-training ground for the AI products of Google, OpenAI, Microsoft and other leading artificial intelligence companies. The crawlers developed by these companies can capture conversations, comments, questions, and even arguments between users on the forum, and use them to enrich the large language models behind their AI products.

In China, the forum is affectionately known as the "American Tieba" (Post Bar). According to statistics, the forum has 57 million daily users, so its volume of language data is very large and is updated in real time.

Reddit says it is planning to charge for external access to its application programming interface (API). Through the API, entities outside the company can download and process large volumes of the social network's discussion threads for language-model training.

Analysts believe that Reddit's move may be driven, first, by economic considerations.

On February 14 this year, the US technology outlet The Information reported that Reddit plans to go public in the second half of this year, but the company is still far from profitable. At present, most of the platform's revenue comes from advertising on the forum and from e-commerce transactions on the platform. If Reddit can add API access fees as a revenue stream, then given its daily user base and corpus size, the fees could help it reach its profitability goals ahead of schedule and prepare for a future listing.

Second, by charging the tech giants, Reddit may also be indirectly holding back potential competitors.

Recently, many users on Reddit have said that as the AI products of the American artificial intelligence giants mature, they will become competitors to the Reddit platform, because they will have "similar user communication patterns, similar user comment copy and similar data corpus."

According to reports, the current technical bottleneck of AI lies in two areas: whether computing power is strong enough, and whether there is enough data for machine learning. On computing power, the major Internet companies each have their own signature strengths. Google, for example, deployed TPUv4, the strongest AI chip at the time, in its own data centers as early as 2020. According to the latest data disclosed by Google, against systems of comparable size, TPUv4 can deliver 1.7 times the performance and 1.9 times the energy efficiency. It is this computing power that gives Google's Bard the chance to compete with ChatGPT, developed by Microsoft-backed OpenAI.

Microsoft is not to be outdone in this competition. A few days ago, it emerged that the tech giant is reportedly preparing to launch an AI chip code-named "Athena." According to sources, the "Athena" chip has been in development for nearly 5 years and is designed for training large language models. In addition, "Athena" will be manufactured on a 5nm process and can power the AI software behind ChatGPT.

Compared with the "arms race" in computing power, companies have relatively few options when it comes to training data. At present, the language corpus behind each AI model comes from essentially four channels: first, encyclopedia sites; second, millions of e-books; third, academic articles; and fourth, user discussion platforms like Reddit. Among them, Reddit is popular with AI teams because its corpus is updated in real time.

Huffman says that, unlike anywhere else on the web, Reddit is a home for original, authentic conversations.

In fact, even before Reddit, other original-content platforms had begun to recognize the value of their own content.

In October last year, Shutterstock, a well-known commercial stock photography site in the United States, said it would sell image data from its platform to OpenAI, allowing OpenAI's models to learn from its contributors' work. This collaboration contributed to OpenAI's artificial intelligence drawing tool DALL-E, where users only need to type a few lines of instructions and DALL-E generates images of what they asked for, within the limits of its understanding.

In February, Twitter CEO Musk announced that he would end free access to Twitter's API, saying the free API was being badly abused by bot scammers and opinion manipulators. Musk also said he plans to charge $100 a month for API access; however, currently Twitter is pricing the API at $4 a month..$20,000. On April 19, Twitter made news again: according to foreign media reports, Musk accused Microsoft of illegally using Twitter data to train its AI models and hinted that he would sue Microsoft.

There are signs that the value of original content on social platforms is gradually being recognized. The days when knowledge could be crawled and learned from for free are coming to an end, and machines will have to face the human world's "copyright" problem in the future.

The latest news is that Reddit is finalizing the specific details of its API access charges, which are expected to be announced in the next few weeks.

Disclaimer: The views in this article are from the original author and do not represent the views or position of Hawk Insight. The content of the article is for reference, communication and learning only, and does not constitute investment advice. If it involves copyright issues, please contact us for deletion.