I don’t think this analogy is wrong, but as I’ve said earlier, I think it’s lacking. Same with the colour-blindness one. But I think both are useful – and important.1

Yes, in a way an LLM is like a database. It’s fed a lot of text, and you can query it by giving it text; it responds with whatever is a statistically likely continuation – with a bit of randomness thrown in to make things more interesting.
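
To make that concrete, here’s a minimal sketch of how a next token gets picked – the vocabulary and the scores are made up for illustration, not taken from any real model. The scores become probabilities, and the randomness comes from sampling with a temperature.

```python
import math
import random

def sample_next_token(logits: dict[str, float], temperature: float = 0.8) -> str:
    # Lower temperature makes the most likely token win almost every time;
    # higher temperature adds more of that "bit of randomness".
    scaled = {tok: score / temperature for tok, score in logits.items()}
    # Softmax: turn the scores into a probability distribution.
    biggest = max(scaled.values())
    exps = {tok: math.exp(s - biggest) for tok, s in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    # Sample one token according to those probabilities.
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

# Made-up scores for continuing "The capital of France is".
print(sample_next_token({" Paris": 9.1, " a": 2.3, " not": 1.7}))
```

Lower the temperature and " Paris" wins almost every time; raise it and the less likely continuations show up more often.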

I find it odd to dismiss LLMs completely, the way Schubert does, because they will hallucinate and lie when you query them about things for which there’s too little text. They’re fed such huge quantities of text that querying them is like querying our collective knowledge about nearly everything. Dismissing that is just missing out. It’s a little like refusing to use a search engine because it can’t return things that haven’t been indexed. (If you criticise my analogy I will just ignore you. Sure, it’s not perfect. They never are.)

In the beginning, LLMs just continued whatever text you fed them. Then they were trained to respond in prompt-response form. This is training that happens after they’ve been fed all the text.
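
Concretely, the prompt-response form is just a layout of text that the model learns to continue. A minimal sketch, assuming made-up role markers (real models use their own templates):

```python
# A sketch of the prompt-response form: chat messages are flattened into one
# string with role markers, and the model simply continues the text from the
# final marker. The markers here are illustrative placeholders.
def to_prompt(messages: list[dict[str, str]]) -> str:
    parts = [f"<|{m['role']}|>\n{m['content']}\n" for m in messages]
    parts.append("<|assistant|>\n")  # the model's continuation starts here
    return "".join(parts)

print(to_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarise this paragraph for me."},
]))
```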

More recently they’ve been trained to ’reason’ before responding – to use a chain of ’thought’.

The database analogy is important as a reminder that they don’t actually reason or think. They also do not understand things. You don’t actually teach them things.

Here’s where the database analogy starts to become inadequate, and therefore less useful: LLMs can do so much more than simply look up textual data with a text query.

You don’t just query an LLM. You’re also instructing it to do something. You’re telling it what you want it to do. And it will do things using language – by ’thinking out loud’ to itself. As it does this, it outputs statistically relevant bits of text that also guide the LLM itself. In a way it sees its own thoughts and thereby makes a path to the response it will deliver back to you.
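
That loop is simple to sketch: each token the model emits is appended to the context before the next one is chosen, so the ’thinking out loud’ it just produced shapes everything that follows. The model_step below is a toy stand-in, not a real forward pass.

```python
# A sketch of why the model's own output steers it: every token it emits is
# appended to the context and fed back in before the next token is chosen.
def generate(model_step, prompt: str, max_tokens: int = 100) -> str:
    text = prompt
    for _ in range(max_tokens):
        next_token = model_step(text)  # depends on everything written so far,
        text += next_token             # including the reasoning just produced
        if next_token == "<|end|>":
            break
    return text

# Toy stand-in for a real model: it just replays a canned chain of thought.
canned = iter(["First,", " 2", " +", " 2", " is", " 4.", "<|end|>"])
print(generate(lambda _: next(canned), "What is 2 + 2? Think step by step.\n"))
```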

What I don’t like about the database analogy is that it reduces LLMs to something they once were but no longer are.

Large language models today are as much the result of being fed lots of text as of the training that happens afterwards – the training that taught them the prompt-response form and chain-of-thought reasoning, but also the feeding of prompts paired with human-written responses about things we want the models to handle well.
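
That last part is easiest to picture as data. A hypothetical example of such a training pair – the field names and the wording are made up for illustration, not taken from any particular dataset:

```python
# One hypothetical fine-tuning example: a prompt and the human-written
# response the model is trained to imitate.
example = {
    "prompt": "What does it mean when an LLM hallucinates?",
    "response": "It means the model produces fluent, confident text that "
                "isn't grounded in its training data or in reality...",
}
```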

I wouldn’t rule out that it’s possible to feed one of the currently available models some text and get back a response that could be deemed equivalent to human creativity. A prompt that would cause its chain of thought to produce something that goes far beyond just regurgitating bits of text from the places in the multidimensional space that the input text activated. (Most responses already go beyond that.)

No, we shouldn’t anthropomorphise LLMs. We should be careful about saying that they think or reason. We should remember that it is possible to get them to output anything with the right input. And we should remember that they will always output something – and that something might be a hallucination or a lie, so we must learn how to determine when that’s the case. For this, those two analogies are useful and important. But I think we need better ones.