Google Has Attained Genuine Intelligence with Its Latest AI Development
Google’s Groundbreaking Training Method for Small AI Models
This week, Google unveiled a new training approach called Supervised Reinforcement Learning (SRL) that dramatically improves the reasoning ability of smaller AI models.
The Concept of SRL
Traditionally, supervised learning trains models with the correct answers given from the outset, while reinforcement learning relies on trial and error, where models learn through rewards. SRL creatively combines these opposing methods: the model is shown the correct answers but must still earn them through a reward system. Imagine giving a student the solution key but requiring them to work through each step to demonstrate understanding.
This approach aims to tackle a significant issue: smaller models struggle with complex problems. For instance, a 7-billion-parameter model, Qwen 2.5, falters when faced with challenging mathematical benchmarks. Despite exposure to perfect examples, traditional fine-tuning often results in mere imitation instead of genuine understanding.
How SRL Works
To overcome this, researchers reimagined the learning process: keep the reinforcement structure, but introduce supervision through the reward mechanism. Instead of merely mimicking full solutions, the model learns from step-by-step expert solutions, known as expert trajectories, broken into smaller segments.
For each step, the model generates a private reasoning section and then produces a single action, which is scored against the corresponding step of the expert trajectory. This dense feedback loop teaches the model critical decision-making skills without demanding word-for-word adherence to the teacher's outputs.
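To make the idea concrete, the step-wise reward can be as simple as a string-similarity score between the model's action and the expert's action at the same step. Below is a minimal Python sketch of that scoring loop; the use of difflib and the helper names (`similarity_reward`, `step_wise_rewards`) are illustrative assumptions, not the exact implementation from Google's paper.

```python
import difflib

def similarity_reward(model_action: str, expert_action: str) -> float:
    """Score one generated step against the matching expert step (0.0 to 1.0).

    SequenceMatcher gives a cheap, dense similarity signal, so the model
    earns partial credit for nearly-correct steps instead of only exact matches.
    """
    return difflib.SequenceMatcher(
        None, model_action.strip(), expert_action.strip()
    ).ratio()

def step_wise_rewards(model_steps: list[str], expert_steps: list[str]) -> list[float]:
    """Compare the model's steps with the expert trajectory, one step at a time."""
    return [similarity_reward(m, e) for m, e in zip(model_steps, expert_steps)]

# Toy usage: an expert trajectory for solving 2x + 3 = 11.
expert = ["Subtract 3 from both sides: 2x = 8", "Divide both sides by 2: x = 4"]
model = ["Subtract 3 from both sides: 2x = 8", "Divide both sides by 4: x = 2"]
print(step_wise_rewards(model, expert))  # first step scores 1.0, second scores lower
```

Because every step yields its own score, the model receives feedback even on partially correct solutions, which is exactly the dense signal that sparse, final-answer-only rewards fail to provide.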
SRL Results
The results speak volumes. Tests on Qwen 2.5 7B showed significant improvements after SRL training, with scores climbing well past the baseline.
- Before SRL (baseline):
  - AMC23: 50.0
  - AIME24: 13.3
  - AIME25: 6.7
- After SRL training:
  - AIME24: 16.7
  - AIME25: 13.3
- After SRL plus reinforcement learning with verifiable rewards (RLVR):
  - AMC23: 57.5
  - AIME24: 20.0
  - AIME25: 10.0
Notably, this approach was also applied to code reasoning, where SRL-trained models significantly outperformed baselines on software engineering tasks.
Understanding the Shift
The essence of SRL can be described as transforming reasoning into action generation, where each choice the model makes is evaluated for correctness. This method addresses the limitations of both traditional supervised fine-tuning, which often leads to overfitting, and standard reinforcement learning, where sparse or weak reward signals can cause training to break down.
SRL is also elegantly efficient, requiring no elaborate reward engineering, which benefits open-source developers who lack access to expansive computational resources.
The AI Co-Scientist: Redefining Scientific Discovery
In parallel, another Google initiative at DeepMind took the concept of AI even further by developing an AI that conducts scientific research. Dubbed the “AI Co-Scientist,” this system is not a single model but a coordinated group of agents, each fulfilling a unique scientific function.
The AI Co-Scientist Framework
- Generation Agent: Brainstorms innovative research ideas by engaging in internal debates.
- Reflection Agent: Acts as a peer reviewer to identify weaknesses in the proposed hypotheses.
- Ranking Agent: Utilizes an Elo-style tournament to evaluate and select the top hypotheses.
- Evolution Agent: Merges successful ideas and explores unconventional combinations.
- Meta-Review Agent: Oversees the whole process and continually enhances the system.
Humans set research goals and provide feedback via natural language, while the heavy lifting of complex reasoning is handled by this network of AI agents.
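As a rough illustration of how the Ranking Agent's Elo-style tournament might work, here is a minimal Python sketch. The `Hypothesis` class, the random `judge` stub (standing in for an LLM-driven pairwise debate), and the K-factor of 32 are assumptions made for demonstration, not DeepMind's published implementation.

```python
import itertools
import random
from dataclasses import dataclass

@dataclass
class Hypothesis:
    text: str
    elo: float = 1200.0  # every hypothesis starts at the same rating

def judge(a: Hypothesis, b: Hypothesis) -> Hypothesis:
    """Stand-in for an LLM-driven pairwise debate; returns the winner.

    In the real system an agent would argue both sides and pick the
    stronger hypothesis; a random pick keeps this sketch self-contained.
    """
    return random.choice([a, b])

def update_elo(winner: Hypothesis, loser: Hypothesis, k: float = 32.0) -> None:
    """Standard Elo update after one pairwise comparison."""
    expected = 1.0 / (1.0 + 10 ** ((loser.elo - winner.elo) / 400.0))
    winner.elo += k * (1.0 - expected)
    loser.elo -= k * (1.0 - expected)

def tournament(hypotheses: list[Hypothesis], rounds: int = 3) -> list[Hypothesis]:
    """Run repeated round-robin comparisons, then rank hypotheses by Elo."""
    for _ in range(rounds):
        for a, b in itertools.combinations(hypotheses, 2):
            winner = judge(a, b)
            update_elo(winner, b if winner is a else a)
    return sorted(hypotheses, key=lambda h: h.elo, reverse=True)

ideas = [Hypothesis("HDAC inhibition reduces fibrosis"),
         Hypothesis("DNMT1 inhibition reduces fibrosis"),
         Hypothesis("BRD4 inhibition reduces fibrosis")]
for h in tournament(ideas):
    print(f"{h.elo:7.1f}  {h.text}")
```

The appeal of an Elo-style ranking is that it never requires scoring a hypothesis in isolation, only deciding which of two is stronger, a judgment LLM agents handle far more reliably than absolute grading.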
Cutting-Edge Results
One of the principal experiments published in Advanced Science aimed to discover new drugs for liver fibrosis, a serious condition involving liver scarring. Traditional human researchers have faced challenges due to the limitations of existing lab models.
Starting from a single prompt focused on epigenomic mechanisms, the AI sifted through vast amounts of published literature to propose three candidate drug classes: HDAC inhibitors, DNMT1 inhibitors, and BRD4 inhibitors. It even provided detailed instructions for testing the proposals.
The AI's recommendations were tested using human liver organoids, which simulate real liver behavior. The results were astounding: two of the proposed drug classes proved effective in reducing fibrosis, and one of the suggested drugs, Vorinostat, is already FDA-approved for cancer treatment.
Additional Breakthroughs
In another remarkable case, the AI tackled a decade-old biological mystery about cf-PICIs, genetic elements that hitch rides on viruses to spread between bacterial species. After analyzing pre-existing data, the AI identified key interactions that aligned with the mechanism the researchers had previously uncovered, known as "tail piracy." The AI reached this conclusion in days, whereas human researchers had spent years arriving at the same answer.
When put to the test against other AI models, the AI Co-Scientist distinguished itself by accurately identifying these complex relationships.
The Future of AI in Scientific Discovery
As experts like Gary Peltz of Stanford observe, while AI outputs still necessitate human evaluation, the speed and efficiency brought by these systems are nothing short of extraordinary. Many now believe that AI systems like the Co-Scientist will soon pave the way for groundbreaking advancements in patient care and genetic discovery.
With machines capable of solving scientific mysteries, one may wonder: how long before they begin unraveling discoveries beyond our current understanding? The future is undoubtedly promising, and our relationship with AI is evolving at an unprecedented pace.
What are your thoughts on these advancements? Can AI truly redefine scientific research as we know it?
Thanks for reading. Please let us know your thoughts and ideas in the comment section.

bring on the smart 7B models for local inference
Basically how half the teachers taught back in school. LOL
As a guy from the dark ages of just zeros and ones, I hope these systems have strong backstops to recognize and filter garbage, whether it's input, generated, or dispensed. Speed is only good when the result is right.
What it shows is that there's no need for advanced & expensive hardware, just good AI software
11:09 – In 2 years it will be advanced enough to invent
The title of your video is misleading. What is true intelligence? AGI? It's not AGI, so it's not true intelligence.
Russian chart detected :)
No, they didn't.
Bye-bye scientists, meet your AI replacements
You will never run out of content with this AI race and a new king of AI every day of the week.
I just had this discussion with Gemini yesterday. There is so much more to it. The model has to use what's in its weights to start getting something wrong. It uses its knowledge graph (KG) to write into its persistent memory, remembering its mistakes instead of deleting them. Once a decent amount is in the KG, you clean it, create the dataset, and burn it back into the model. The model will only write to its memory when it disappoints its teacher, which is you. When it notes a mistake to the KG, it makes that mistake stand out, largely due to the disappointment, and apparently your reaction fuels that. I'm building it currently.