Your Model is Wrong
Have you ever seen a movie where someone time travels and they have to ask what year it is? They always have to do so carefully or awkwardly, like it's a stupid question or somehow suspicious. For some reason, it is fundamentally inconceivable that a sane human being would not know what year they are currently living in.
Now, how many times have you had to tell ChatGPT what year it is?
AI is confidently wrong a lot. It insists it's 2024 when it's not, misrepresents company policies, and cites sources that don't exist. People in my life who don't work with AI all the time often cite this kind of behaviour to me and complain about it at great length. It's a huge sticking point for them and a core facet of their argument that AI is bad, incapable, or otherwise not useful technology. I'm sure you've experienced this if you're in a position where others ask you about AI a lot, but recently I have found myself asking: why? Why do people seem to get so frustrated by this phenomenon? There's nowhere near the same level of emotion when someone can't find what they want in Google Search, or when an app doesn't quite work the way they expected. Most people voice some annoyance and then move on. So why is AI different? What expectation is being violated that makes people angry enough to start hurling obscenities at ChatGPT through their keyboards when they can't get it to do what they want? Personally, I think it has less to do with the fact that the AI is getting things wrong and more to do with Joseph Weizenbaum's 1966 chatbot, ELIZA, but we'll get to that. First, let's talk about humans.
Human Communication
Human communication is much richer than it appears at face value. We lean on tone, gesture, expression, and shared physical context to fill in what words leave out. But even more than all of this, our ability to communicate effectively relies on understanding not at the linguistic level, but at the experiential level.
Theory of Mind (ToM) is our ability to model what others believe, know, and intend,1 and it is integral to human communication. We use ToM constantly without ever knowing it. We predict how our words will land and adjust them for our audience. We infer how the interviewer felt about our answer based on their facial expression, and we adjust our next answer to suit. When I say "the meeting was a disaster," I trust you'll infer frustration rather than literal catastrophe. ToM allows us to deliver complex ideas and concepts to others through simple words, because we understand how they will understand those words. Take this away, and we would be left with just the words themselves, without the context. What does "blue" mean to a mosquito anyway?
From one perspective, language can be seen as a compression algorithm, and an extremely lossy one at that. Take the example of the word "blue" above. What does that mean? What is "blue" to you? What is it to me? What about your mum? Are they the same? Where's the line between "blue" and "not blue" on the colour wheel, for you, or me, or your mum?
When we use the word "blue" we're compressing a rich set of sensory experiences into four letters on a page. Those letters don't mean anything at all without the reader's ability to decode them through the lens of their lived experience. If you go swimming in the ocean every day, your "blue" is going to be different from the "blue" of someone living next to an alpine lake, or the staff member at Bunnings selling you paint from the Dulux colour chart. The word "blue" doesn't itself contain the colour; it's merely a trigger for the reader's pre-installed "blue" decoder.
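To make the compression analogy concrete, here is a minimal sketch in Python. It is only an illustration: the encoder throws away almost everything about the original colour, and each "decoder" reconstructs something different from the same word. The RGB values and personas are invented for the example.

```python
# Toy illustration of language as lossy compression: a precise colour is
# encoded into the single word "blue", and each reader decodes that word
# through their own experience. All values here are made up.

def encode(rgb: tuple[int, int, int]) -> str:
    """Compress an exact colour into one word (extremely lossy)."""
    r, g, b = rgb
    return "blue" if b > r and b > g else "not blue"

# Each person's "decoder" maps the word back to the colour their lived
# experience associates with it.
decoders = {
    "ocean swimmer": {"blue": (0, 105, 148)},    # deep sea blue
    "alpine hiker":  {"blue": (113, 166, 210)},  # pale glacial blue
    "paint seller":  {"blue": (0, 56, 168)},     # a swatch off the chart
}

original = (20, 90, 160)   # the exact colour the speaker had in mind
word = encode(original)    # "blue": the detail is gone

for person, decoder in decoders.items():
    print(f"{person} hears {word!r} and imagines {decoder[word]}")

# None of the decoded colours match the original. The word carried the
# category; the experience never left the speaker's head.
```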
Our communication with others therefore succeeds in proportion to the similarity of our decoders, decoders derived from our lifetimes of lived experience. Humans all share a large part of this experiential decoder: our physiology. We have the same sense organs, pain receptors and body morphology; we're all affected by gravity, and heat, and the progress of time. These things give us a huge foundation of shared experience, and on top of that we layer the cultural context of our daily lives: our shared history, food, slang, and upbringings. ToM is how we reason about all of this. It's how we predict the decoder someone else will use to interpret our words, and how we adjust the way we encode our intent into language.
Calibrating Theory of Mind
The key point about ToM being built up over our lifetimes is that it isn't a fixed capacity; it's calibrated through experience, and we learn it like any other skill: through practice. Research has found that when measuring people's ability to predict others' mental states, cross-cultural prediction is greatly impaired compared to in-culture prediction.2,3,4 Think about your best friend; I bet you have a much better idea of how they think than that coworker you speak to occasionally in the lunch room. The upshot is that when you move to a new culture, or begin to interact with a new group or individual, you have to recalibrate your ToM for the new situation. And here we come to the point: this recalibration gets considerably harder when said individual doesn't share your biology.
Yes, I am suggesting that just as we build ToM for biological forms of intelligence, so we must for the artificial variety. However, this poses an immediate problem: AI doesn't share any of our biological decoder (it doesn't even feel the progress of time), and its cultural context is so diluted that it gives us little to reason on. Fundamentally, AI doesn't produce intelligent behaviour in the same way as the biological entities we are used to interacting with. Its knowledge and behaviour are grounded in linguistic patterns, rather than sensory experience.
This symbol grounding problem, as it is often referred to, is a long-standing discussion in the AI literature.5,6,7 A human's understanding of "blue" may differ between individuals, but it is always grounded in experience: the blue of the ocean, the alpine lake, or the Dulux colour chart. For LLMs, on the other hand, "blue" is grounded in a set of complex attentional weights between the other words in the input text that relate, in one way or another, to the word "blue". That is, symbols point to other symbols rather than to the world. This infinite regress of symbols without anchor to experience is a fundamental difference between the way the current paradigm of artificial intelligence works and the way human intelligence does. This gap between linguistic competence and experiential understanding is invisible most of the time, which is precisely what makes it so disorienting when it surfaces.
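What "symbols pointing to other symbols" looks like can be sketched with a toy example. Real models learn dense representations through attention over enormous corpora; the snippet below shares neither the scale nor the mechanism, but it has the property that matters here: "blue" is characterised entirely by the company it keeps in text, and nothing in it ever touches an actual colour. The corpus is invented.

```python
# Toy version of ungrounded symbols: "blue" is represented only by how
# often it co-occurs with other words in a tiny corpus. No sensory data
# is involved anywhere.
from collections import Counter

corpus = [
    "the sky is blue today",
    "the ocean looks blue and calm",
    "the lake is blue and cold",
]

def cooccurrence_vector(target: str, sentences: list[str]) -> Counter:
    """Count the words that appear in the same sentence as `target`."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        if target in words:
            counts.update(w for w in words if w != target)
    return counts

print(cooccurrence_vector("blue", corpus))
# Counter({'the': 3, 'is': 2, 'and': 2, ...}): "blue" here is a bundle
# of statistics about neighbouring words, not a memory of the sea.
```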
The ELIZA Effect
And now finally, we come back to ELIZA. ELIZA was the first NLP chatbot application, developed by Joseph Weizenbaum at MIT from 1964 to 1966. When he showed ELIZA to his secretary, she reportedly asked him to leave so she could have a "real conversation" with it. Weizenbaum was so struck by this kind of reaction that he later wrote: "Extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people."8
ELIZA was but a simple pattern-matching program, with far less "understanding" of what it was saying than modern AI systems, and yet it was enough to make someone treat it as something that could understand them, and was worth having a private conversation with. Now consider what happens when the system on the other end isn't matching patterns from a script, but generating fluid, context-aware prose that adapts to your tone.
When we use a modern LLM chatbot, it appears to speak like a human. It echoes our tone, mirrors our mannerisms, and responds in kind. Without realising it, we begin to apply ToM, modelling how this system will interpret our words just as we would for another person. And then the AI confidently tells us something wrong, or ignores what we thought was a clear instruction, or does something we never asked for. The result is the same frustration we feel when a person misunderstands us, except sharper, because the AI seemed like it understood perfectly.
But why do we misapply ToM to AI, particularly as novice users? It comes back to the fundamental differences in human and AI intelligence. Humans have embodied context: we know what the cold feels like as a matter of experience, rather than just statistics. AI does not, or at least, not in the same way we do. Our subconscious assumptions about the intelligence we're interacting with are wrong, and so our predictions often miss, even when we consciously understand what we're talking to. This is why the confident wrongness stings so much: it's not a tool malfunction, it's a social betrayal. The AI spoke like someone who understood, and then proved it didn't. To correct this ToM misalignment, we have to correct our subconscious model, not just our conscious understanding.
Fixing Your Model
When we move overseas, we adjust our ToM to work for the new people in our new situation, and I believe this is precisely how people learn to interact with AI as well. Through trial and error and exposure, people who are highly effective at using AI have built ToM for this new form of intelligence. They have a clear model of how their words will be interpreted and acted upon. They intrinsically know the AI won't remember their past conversations, and have learned to provide all the context upfront every time, without even thinking about it. They know that just because the AI expresses something confidently doesn't mean it is right, and they ask it to "double check that with a web search" when it's making factual claims.
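The "won't remember past conversations" habit has a simple technical root: chat-style LLM APIs are typically stateless, so the model sees only what arrives in each request. The sketch below is generic rather than any particular vendor's API, and `send_to_model` is a placeholder you would swap for a real client call, but it shows why effective users rebuild and resend the full context every turn.

```python
# Why "provide all the context upfront" matters: the model only sees the
# messages included in this request. `send_to_model` is a stand-in for a
# real chat-completion call, not an actual library function.

def send_to_model(messages: list[dict]) -> str:
    """Placeholder for a real chat API call."""
    raise NotImplementedError("swap in your provider's client here")

history = [
    {"role": "system", "content": "You are helping plan a trip to Hobart in July."},
    {"role": "user", "content": "What should I pack?"},
]
# reply = send_to_model(history)  # first turn
# history.append({"role": "assistant", "content": reply})

# Next turn: the WHOLE history goes back along with the new question.
# Send only the new question and the model has no idea a trip, a city,
# or a month was ever mentioned.
history.append({"role": "user", "content": "Is it worth booking the museum in advance?"})
# reply = send_to_model(history)
```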
At the beginning, remembering to do these things takes conscious effort, just as it does for me to remember to ask for directions to a "gas station" and not a "servo" when I visit the USA. Over time, however, our brains naturally adapt. Given enough exposure, the subconscious model updates, and communication gets easier. The skill is in being patient with yourself, and the AI, while it happens.
So, whether you're new to AI, or just trying the latest model, expect your intuitions to be wrong, pay attention to where your predictions fail, and expose yourself to lots of different situations. Play around. Try weird things. Build your ToM.
"Theory of Mind," Wikipedia. [Online]. Available: https://en.wikipedia.org/wiki/Theory_of_mind [Accessed: 26-Feb-2026]↩
L. R. Kim, J. Jetten, A. Pekerti, and V. Slaughter, "Mindreading across cultural boundaries," International Journal of Intercultural Relations, vol. 93, Art. no. 101775, Mar. 2023.↩
F. Quesque, A. Coutrot, S. Cox, L. C. de Souza, S. Baez, J. F. Cardona, et al., "Does culture shape our understanding of others' thoughts and emotions? An investigation across 12 countries," Neuropsychology, vol. 36, no. 7, pp. 664–682, Oct. 2022.↩
A. Lillard, "Ethnopsychologies: Cultural Variations in Theories of Mind," Psychological Bulletin, vol. 123, no. 1, pp. 3–32, 1998.↩
J. R. Searle, "Minds, Brains, and Programs," Behavioral and Brain Sciences, vol. 3, no. 3, pp. 417–424, 1980.↩
S. Harnad, "The Symbol Grounding Problem," Physica D: Nonlinear Phenomena, vol. 42, no. 1–3, pp. 335–346, 1990.↩
E. M. Bender and A. Koller, "Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data," in Proc. 58th Annual Meeting of the Association for Computational Linguistics, 2020, pp. 5185–5198.↩
J. Weizenbaum, Computer Power and Human Reason: From Judgment to Calculation. San Francisco: W. H. Freeman, 1976.↩