Taken together, our experiments suggest that [Large Language Models] possess some genuine capacity to monitor and control their own internal states. This doesn’t mean they’re able to do so all the time, or reliably. In fact, most of the time models fail to demonstrate introspection—they’re either unaware of their internal states or unable to report on them coherently. But the pattern of results indicates that, when conditions are right, models can recognize the contents of their own representations. In addition, there are some signs that this capability may increase in future, more powerful models.
— “Emergent introspective awareness in large language models”, Anthropic Research