Why reinforcement learning plateaus without representation depth (and other key takeaways from NeurIPS 2025) ...
OpenAI has introduced its latest AI model, ChatGPT o1, a large language model (LLM) that significantly advances the field of AI reasoning. Leveraging reinforcement learning (RL), o1 represents a leap ...
Reinforcement learning does NOT make the base model more intelligent; it limits the world of the base model in exchange for better early-pass performance. Graphs show that after pass 1000 the reasoning ...
Google researchers introduce ‘Internal RL,’ a technique that steers a model's hidden activations to solve long-horizon tasks ...
Researchers from Fudan University and Shanghai AI Laboratory have conducted an in-depth analysis of OpenAI’s o1 and o3 models, shedding light on their advanced reasoning capabilities. These models, ...
OpenAI o1 is a new large language model trained with reinforcement learning to perform complex reasoning. o1 thinks before it answers—it can produce a long internal chain of thought before responding ...
It’s been almost a year since DeepSeek made a major AI splash. In January, the Chinese company reported that one of its large language models rivaled an OpenAI counterpart on math and coding ...
Microsoft Corp. has released three new advanced small language models with reasoning capability, extending its “Phi” range of AI models. The new model releases ...
DeepSeek has released new research showing that a promising but fragile neural network design can be stabilised at scale, delivering measurable performance gains in large language models without ...
OpenAI believes its data was used to train DeepSeek’s R1 large language model, multiple publications reported today. DeepSeek is a Chinese artificial intelligence provider that develops open-source ...
While Microsoft's multi-billion-dollar partnership with OpenAI has seemingly begun fraying, the company is still keen on making its mark in the generative AI landscape first-hand. The software giant's ...