Anthropic, OpenAI Push AI Deeper Into Scientific Research
Anthropic launched a research workbench for scientists, while OpenAI introduced a benchmark to test AI reasoning in computational biology.
Image credit: Chetan Jha/MIT Sloan Management Review India
Anthropic and OpenAI on Tuesday announced separate tools for scientific research, sharpening their focus on computational biology and life sciences.
Anthropic launched Claude Science, a beta research workbench that combines literature review, data analysis, code execution and scientific computing in one workspace. The app is available to Claude Pro, Max, Team and Enterprise users on macOS and Linux.
Claude Science lets researchers query scientific databases, generate figures and manuscripts, run analysis pipelines and use computing resources from a single interface. It includes more than 60 curated skills and connectors for fields including genomics, single-cell biology, proteomics, structural biology and cheminformatics.
“AI has the potential to dramatically accelerate the pace of scientific discovery and the development of healthcare interventions,” Anthropic said.
The company said Claude Science keeps an auditable record of outputs, including the code, environment and workflow used to produce figures and analyses. It can run on researchers’ existing infrastructure, including laptops, Linux systems and high-performance computing login nodes.
Researchers testing Claude Science have used it for single-cell RNA sequencing analysis, CRISPR screen design, protein structure prediction and cheminformatics, Anthropic said. The company also plans to support up to 50 AI for Science projects with Claude credits and compute resources from Modal.
Separately, OpenAI introduced GeneBench-Pro, a benchmark meant to test whether AI models can handle complex computational biology analyses that require scientific judgment, not just predefined workflows.
“Scientific data rarely arrive with instructions,” OpenAI said. “Researchers must decide whether a pattern reflects biology or noise, whether the data can support the question being asked, and how each result should change what they do next.”
GeneBench-Pro has 129 synthetic problems across statistical genetics, cancer genomics, proteomics, pharmacogenomics, clinical diagnostics and other areas. OpenAI said the benchmark tests whether models can analyze datasets, choose suitable methods, revise assumptions and reach scientifically valid conclusions.
The company said 82 of the 129 problems, or about 64%, were reviewed by external experts, including graduate students, postdoctoral researchers, industry scientists and professors.
OpenAI said GPT-5.6 Sol reached a 28.7% pass rate at its highest reasoning setting, while GPT-5.6 Sol Pro reached 31.5% in separate Pro runs.
The scores suggest progress but also show the limits of current systems, with even the best reported pass rate below one in three. OpenAI said AI agents remain too unreliable to replace human experts, although even partial automation could have scientific and economic value.
The launches show how major AI companies are moving beyond general-purpose assistants into specialized tools for scientific work, while also building benchmarks to measure whether those systems can reason through messy research problems.

