Scaling Interpretability - Detailed Analysis
Science and engineering are inseparable. Our researchers reflect on the close relationship between scientific and engineering ...

Atticus Geiger from Pr(Ai)²R Group explores “State of ...

A surprising fact about modern large language models is that nobody really knows how they work internally. At Anthropic, the ...

Eric is a PhD student in the Department of Physics at MIT working with Max Tegmark on improving our scientific/theoretical ...
How can we reverse engineer what a neural network is doing? In this IASEAI '25 session, An Introduction to Mechanistic ...

Eric Michaud returns to the stream to talk about his recent work on ...

What's happening inside an AI model as it thinks? Why are AI models sycophantic, and why do they hallucinate? Are AI models ...

At an Anthropic Research Salon event in San Francisco, four of our researchers: Alex Tamkin, Jan Leike, Amanda Askell and ...

Part 1 of a walkthrough of our paper, Progress Measures for Grokking via Mechanistic ...

Stanford AI Lab Faculty Lunch, November 7, 2025. Updated version of 0:59 ...
This has been my favorite video so far to make! I think ...

AI models are trained and not directly programmed, so we don't understand how they do most of the things they do. Our new ...

This talk was recorded at NDC AI in Oslo, Norway.
Photo Gallery

![Atticus Geiger - State of Interpretability & Ideas for Scaling Up [Alignment Workshop]](https://i.ytimg.com/vi/eqZ1iEoor5s/mqdefault.jpg)

![The Dark Matter of AI [Mechanistic Interpretability]](https://i.ytimg.com/vi/UGO_Ehywuxc/mqdefault.jpg)
