Why Cohere’s former AI research lead is betting against the scaling race.

The Race for Advanced AI: A Shift from Scaling Large Language Models
AI labs are racing to build data centers as large as Manhattan, each costing billions of dollars and consuming as much energy as a small city. The race is driven by a belief in “scaling”: the idea that adding more computing power to existing AI training methods will eventually yield superintelligent systems capable of performing a wide array of tasks.
The Limits of Large Language Models
However, a growing number of AI researchers caution that scaling large language models (LLMs) may be approaching its limits, and that new breakthroughs may be needed to significantly improve AI performance. Among them is Sara Hooker, who co-founded the new startup Adaption Labs with Sudip Roy; both are veterans of Cohere and Google.
Hooker believes that relying solely on scaling LLMs has become an inefficient way to squeeze more performance out of AI models. Shortly after leaving her position as VP of AI Research at Cohere in August, she announced the new venture, which aims to tackle what she sees as one of the field’s most pressing problems: building AI systems that can adapt and continuously learn from real-world experience.
A Vision for Adaptive Learning
In a recent interview with TechCrunch, Hooker elaborated on Adaption Labs’ mission to build AI systems capable of efficiently learning and adapting to their environments. While she refrained from divulging specific methodologies or whether the company will utilize LLM architectures, the idea is clear: in her view, merely scaling up existing models has not led to true intelligence that can interact meaningfully with the world.
Hooker emphasizes that adaptability lies at the “heart of learning.” She likens the process of learning to a human experience—stub your toe on a dining room table, and you learn to navigate more carefully the next time. Although AI researchers have tried to embed this adaptive learning into models through reinforcement learning (RL), Hooker points out that current RL methods fall short in real-world applications, as they do not allow AI systems to learn from mistakes while in production.
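To make that distinction concrete, here is a minimal, hypothetical sketch, not a description of Adaption Labs’ method, of a system that keeps updating a simple policy from live feedback after deployment, in contrast to a model that is frozen once trained. The action names and reward signal are invented for illustration:

```python
import random

# Hypothetical illustration: an epsilon-greedy learner that keeps
# updating from feedback after deployment, unlike a frozen model.
# The actions and the reward signal are invented for this sketch.

ACTIONS = ["template_a", "template_b", "template_c"]


class OnlineLearner:
    def __init__(self, epsilon: float = 0.1):
        self.epsilon = epsilon                      # exploration rate
        self.values = {a: 0.0 for a in ACTIONS}     # running value estimate per action
        self.counts = {a: 0 for a in ACTIONS}       # times each action was tried

    def act(self) -> str:
        # Mostly exploit the best-known action, occasionally explore.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(self.values, key=self.values.get)

    def learn(self, action: str, reward: float) -> None:
        # Incremental average: the "stubbed toe" feedback adjusts the
        # policy immediately, while the system is in production.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


# Simulated production loop: user feedback (e.g. thumbs up/down) is the reward.
learner = OnlineLearner()
for _ in range(1000):
    choice = learner.act()
    reward = 1.0 if choice == "template_b" else 0.0  # stand-in for real feedback
    learner.learn(choice, reward)

print(learner.values)  # "template_b" should dominate after enough feedback
```

The point of the toy loop is that every piece of feedback changes future behavior right away, which is exactly what Hooker argues deployed LLMs cannot currently do.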
The Challenges of Fine-Tuning AI Models
Many AI labs offer consulting services to help enterprises customize AI models, but these often come at a steep cost; OpenAI reportedly requires clients to invest over $10 million before it offers consulting on fine-tuning. Hooker argues that a handful of dominant labs currently dictate a narrow set of AI models that are expensive to adapt, and that this paradigm is broken: if AI systems can learn efficiently from their environments, as she believes they can and must, that would redefine who gets to control and shape AI.
The Growing Skepticism Around Scaling
Adaption Labs reflects a broader shift in the industry’s confidence in scaling LLMs. Recent research out of MIT suggests the largest AI models may soon show diminishing returns, a sentiment echoed by several leading AI researchers. Richard Sutton, a Turing Award winner often called the “father of RL,” argued in a recent conversation that LLMs cannot truly scale because they do not learn from real-world experience. Andrej Karpathy, an early OpenAI employee, has voiced similar reservations about the long-term viability of RL for improving AI models.
Concerns about scaling through pretraining, in which AI models learn patterns from massive datasets, have surfaced as well. Pretraining was previously a cornerstone of the success of labs like OpenAI and Google, but evidence suggests its gains may also be approaching saturation.
Alternative Approaches: AI Reasoning Models
While concerns about the limitations of scaling have gained traction, the AI industry continues to identify alternative ways to enhance model performance. Breakthroughs in AI reasoning models—systems that utilize additional time and computational resources to methodically work through problems before generating responses—have extended the capabilities of AI in exciting directions.
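As a rough illustration of the idea, and not a description of how any lab’s reasoning model actually works, the sketch below spends extra inference-time compute by sampling several candidate answers and keeping the most common one, a generic self-consistency trick. The fake_model function is an invented stand-in for a real model call:

```python
from collections import Counter
import random

# Hypothetical sketch of "spending more compute at inference":
# sample several candidate answers and keep the majority vote.
# fake_model is an invented stand-in for a real LLM call.

def fake_model(question: str) -> str:
    # Pretend the model answers correctly 60% of the time and is noisy otherwise.
    return "42" if random.random() < 0.6 else random.choice(["41", "43"])

def answer_with_votes(question: str, samples: int = 1) -> str:
    votes = Counter(fake_model(question) for _ in range(samples))
    return votes.most_common(1)[0][0]

# One sample vs. many: extra inference-time compute buys reliability.
print(answer_with_votes("ultimate question", samples=1))
print(answer_with_votes("ultimate question", samples=25))
```

With a single sample the answer is unreliable; with 25 samples the majority vote is almost always correct, showing how additional compute at inference time can translate into accuracy.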
AI labs are increasingly convinced that scaling up RL and AI reasoning models is the new frontier. OpenAI previously told TechCrunch that it developed its first AI reasoning model, o1, with the expectation that it would scale well. Researchers from Meta and Periodic Labs recently released work exploring how RL could further boost performance, work that also underscores how expensive current approaches are.
Adaption Labs: Finding Cost-Effective Solutions
Adaption Labs, by contrast, is betting that learning from experience can be made far cheaper. The startup was reportedly in talks this past fall to raise a seed round of $20 million to $40 million; the round has since closed, though the final amount is undisclosed. Hooker says the company aims for significant advances in AI and points to the caliber of its founding team.
At Cohere Labs, Hooker specialized in training compact AI models for enterprise use. Smaller models now increasingly outperform larger ones on benchmarks such as coding and reasoning, a trend she aims to extend.
A Global Approach to AI Research
Beyond her research, Hooker has made a name for herself by broadening access to AI research globally, including hiring talent from underrepresented regions such as Africa. While Adaption Labs plans to open an office in San Francisco, Hooker remains committed to hiring worldwide.
The Broader Implications of Adaptive Learning
Should Hooker and Adaption Labs succeed, the implications could be monumental. Billions of dollars have already been invested on the premise that larger models translate to greater general intelligence; a shift toward true adaptive learning would point to a more powerful, and far more efficient, route to AI advancement.
As researchers and startups like Adaption Labs explore new ways of building intelligent systems, the industry may soon shift away from scaling alone and toward a more nuanced approach based on adaptability and continuous learning. That shift could reshape not only the technology itself but also how people interact with AI systems going forward.