Anthropic Safeguards Lead Sharma Steps Down Over Values and Risk
Mrinank Sharma leaves after two years shaping the firm’s AI safety work, warning of a widening gap between technological power and judgment.
Mrinank Sharma, head of the safeguards research team at Anthropic, has resigned from the AI firm, saying he wants to pursue work that better reflects his values as “global risks intensify.”
His departure was effective 9 February, the day he posted a lengthy letter to colleagues on X explaining his decision.
In the letter, Sharma wrote “it is clear to me that the time has come to move on,” adding that “the world is in peril,” not only from AI or bioweapons but from “a whole series of interconnected crises unfolding in this very moment.”
Sharma joined Anthropic in August 2023 after completing a PhD in machine learning at the University of Oxford. He had led the company’s Safeguards Research Team since its launch last year, focusing on reducing the risks of AI misuse, including research on AI-assisted bioterrorism and on AI sycophancy, the tendency of chatbots to flatter users excessively.
“I feel lucky to have been able to contribute,” Sharma wrote, listing work that included “developing defences to reduce risks from AI assisted bioterrorism,” putting those systems into production, and writing one of the company’s early AI safety cases.
He also highlighted recent efforts around internal transparency and a final project examining how AI assistants could “make us less human or distort our humanity.”
Despite those contributions, Sharma said the tension between values and action weighed heavily on him.
“I’ve repeatedly seen how hard it is to truly let our values govern our actions,” he wrote, adding that within organizations “we constantly face pressures to set aside what matters most.”
In a passage that has been widely quoted, Sharma warned of a growing imbalance between technological power and judgment.
“We appear to be approaching a threshold where our wisdom must grow in equal measure to our capacity to affect the world, lest we face the consequences,” he wrote.
Sharma’s departure also follows the release of a recent study he authored on chatbot interactions. The research found that thousands of daily conversations with AI systems may contribute to distorted perceptions of reality, particularly around sensitive topics such as relationships and wellness. While severe cases were rare, Sharma said the findings “highlight the need for AI systems designed to robustly support human autonomy and flourishing.”
Looking ahead, Sharma said he does not yet know what comes next. He wrote that he hopes to explore a poetry degree and “devote myself to the practice of courageous speech,” alongside work in writing, facilitation and community building. “I want to contribute in a way that feels fully in my integrity,” he said.
