November 12 news: artificial intelligence companies such as OpenAI are developing new training techniques to overcome the unexpected delays and challenges encountered when building ever-larger language models. These techniques aim to make models "think" in a way more similar to humans.
Many scientists, researchers and investors in the field believe these techniques could affect the resources for which artificial intelligence companies have continuous demand, such as energy and types of chips.
OpenAI declined to comment. Since the release of the ChatGPT chatbot, many technology companies have benefited from the artificial intelligence boom, and their valuations have risen sharply. These companies have publicly maintained that they can keep improving artificial intelligence technology by adding more data and computing power.
However, some top scientists are now publicly pointing out that the "bigger is better" philosophy has its limits. Ilya Sutskever, OpenAI's former chief scientist and founder of Safe Superintelligence (SSI), said that scaling up pre-training has hit a plateau. Pre-training is the stage of training an artificial intelligence model in which the model uses vast amounts of unlabeled data to learn language patterns and structures.
Sutskever was an early advocate of driving progress in generative artificial intelligence by scaling up data and computing power, an approach that ultimately produced ChatGPT. Earlier this year, Sutskever left OpenAI and founded SSI. He noted: "The 2010s were the era of scaling models; now we are back in an era of wonder and discovery. Everyone is looking for the next breakthrough. Scaling the right thing matters more now than ever."
Sutskever did not disclose details of the new methods his SSI team is exploring, saying only that the team is studying an alternative approach to scaling up pre-training.
According to three insiders, researchers at major artificial intelligence laboratories have been trying to release a large language model that outperforms OpenAI's GPT-4, a model that launched nearly two years ago.
The "training operation" of these big models is not only costly, but it may reach tens of millions of dollars. It also needs to run hundreds of chips at the same time. The system is complicated and the risk of hardware failure is also high.In addition, researchers usually wait for several months to evaluate the final performance of the model, which increases the uncertainty in the development process.
Even trickier, large language models consume enormous amounts of data, and current artificial intelligence models have nearly exhausted the world's available data resources. At the same time, power shortages have become another major constraint on training runs, since the process requires huge amounts of energy.
To address these challenges, researchers are actively exploring "test-time compute," a technique that enhances a model's performance during the "inference" stage. For example, a model can generate and evaluate multiple possibilities in real time and ultimately select the best path, rather than committing to an answer immediately.
This approach lets models devote more processing power to challenging tasks, such as mathematics or coding problems, or to complex operations that resemble human reasoning and decision-making.
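To make the idea concrete, here is a minimal sketch of one common form of test-time compute, best-of-N sampling: draw several candidate answers and keep the one a scorer rates highest. The `generate` and `score` callables and the toy stand-ins below are hypothetical illustrations, not any lab's actual implementation.

```python
import random
from typing import Callable, List, Tuple

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 8) -> str:
    """Sample n candidate answers and return the highest-scoring one.

    Spending more compute at inference time (a larger n) can buy
    better answers without retraining the underlying model.
    """
    candidates: List[Tuple[float, str]] = []
    for _ in range(n):
        answer = generate(prompt)  # one stochastic sample from the model
        candidates.append((score(prompt, answer), answer))
    # Keep the candidate the scorer rates highest.
    return max(candidates, key=lambda c: c[0])[1]

# Toy stand-ins so the sketch runs on its own.
def toy_generate(prompt: str) -> str:
    # Pretend the model only sometimes gets the arithmetic right.
    return f"7 * 6 = {random.choice([40, 41, 42, 43])}"

def toy_score(prompt: str, answer: str) -> float:
    # A toy verifier: reward answers ending in the correct result.
    return 1.0 if answer.endswith("42") else 0.0

if __name__ == "__main__":
    print(best_of_n("What is 7 * 6?", toy_generate, toy_score, n=8))
```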
OpenAI researcher Noam Brown said at the TED AI conference in San Francisco last month: "It turned out that having a bot think for just 20 seconds in a hand of poker got the same performance boost as scaling up the model by 100,000 times." OpenAI has embraced this technique in its recently released o1 model (formerly known as "Strawberry"), which can "think" through problems in multiple steps, similar to human reasoning. The o1 model also incorporates data and feedback from PhDs and industry experts. At its core, it applies additional training on top of "base" models such as GPT-4, and OpenAI says it plans to apply this technique to more and larger base models.
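To illustrate the multi-step "thinking" described above, here is a minimal, hypothetical sketch (not OpenAI's actual o1 design) of an inference loop that drafts intermediate reasoning steps before answering; the `complete` callable and `toy_complete` stand-in are assumptions for illustration only.

```python
def think_in_steps(question: str, complete, max_steps: int = 4) -> str:
    """Draft intermediate reasoning steps, then answer.

    Each loop iteration spends extra inference-time compute on one
    more reasoning step before the model commits to a final answer.
    """
    scratchpad = f"Question: {question}\n"
    for i in range(max_steps):
        step = complete(scratchpad + f"Step {i + 1}:")
        scratchpad += f"Step {i + 1}: {step}\n"
    return complete(scratchpad + "Final answer:")

# Toy completion function so the sketch runs without a real model.
def toy_complete(context: str) -> str:
    if context.endswith("Final answer:"):
        return "42"
    return "break the problem into smaller parts"

if __name__ == "__main__":
    print(think_in_steps("What is 7 * 6?", toy_complete))
```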
Meanwhile, according to five people familiar with the matter, researchers at other top artificial intelligence laboratories, including Anthropic, xAI and Google DeepMind, are also actively developing their own test-time compute techniques.
OpenAI chief product officer Kevin Weil said at a technology conference in October: "We see a lot of low-hanging fruit right now that we can pick to improve these models quickly."
This trend may reshape the competitive landscape for artificial intelligence hardware, which so far has been dominated by demand for Nvidia's artificial intelligence chips. Prominent venture capital firms such as Sequoia Capital and Andreessen Horowitz have been keenly tracking this shift and are evaluating its impact on their large investments. These firms have poured money into the development of artificial intelligence models at laboratories such as OpenAI and xAI.
Sonya Huang, a partner at Sequoia Capital, said: "This shift will move us from a world of massive pre-training clusters toward inference clouds, that is, distributed, cloud-based servers for inference."
Demand for Nvidia's most advanced artificial intelligence chips has surged, helping the company's market value surpass Apple's in October to become the world's most valuable company. However, while Nvidia dominates the training chip market, the chip giant may face more competition in the inference market.
Asked about the potential impact on demand for its products, Nvidia pointed to recent presentations that emphasized the importance of the technology behind the o1 model. Jensen Huang, Nvidia's CEO, noted that demand for using its chips for inference is rising.
He said at a conference in India: "We have now discovered a second scaling law, the scaling law at inference time ... All of these factors have driven a surge in demand for Blackwell (the company's latest artificial intelligence chip)."