The advancement of artificial intelligence (AI) algorithms has opened new possibilities for the development of robots that ...
Open Molecules 2025, an unprecedented dataset of molecular simulations, has been released to the scientific community, paving the way for the development of machine learning tools that can accurately ...
The largest single-cell perturbation dataset to date will be generated and released as open source in a new team effort.
A collaborative effort between Meta, Lawrence Berkeley National Laboratory and Los Alamos National Laboratory leverages Los Alamos' expertise in building molecular screening tools.
OpenAI secretly funded and had access to a benchmarking dataset, raising questions about high scores achieved by its new o3 AI model.
EleutherAI, an AI research organization, has released what it claims is one of the largest collections of licensed and open-domain text for training AI models. The dataset, called the Common Pile v0.1 ...
Getty Images is going all in to establish itself as a trusted data ...
Harvard University announced Thursday it’s releasing a high-quality dataset of nearly 1 million public-domain books that could be used by anyone to train large language models and other AI tools. The ...
Apple has released Pico-Banana-400K, a highly curated 400,000-image research dataset which, interestingly, was built using Google’s Gemini-2.5 models. Here are the details. Apple’s research team has ...