Hi, I am Richard. On this blog, I share thoughts, personal stories — and what I am working on. I hope this article brings you some value.
What AI Hides From You
AI Transparency, System Prompts and Dissimulatio Artis
Before you teach AI anything, look at what it is hiding
At university, I spent years studying the structure of understanding — how meaning is formed, how interpretation works, and what shapes it before we are even aware of it. I did not expect those concepts to become relevant to my work with AI. But they did.
This is part of a series I call Teaching AI to Understand. Before we get to teaching, though, we need to map the starting point. What does an AI model look like when you first get access to it? What is already built in? What can you change — and what can you not?
If you have worked with cloud models like Claude or GPT, you are working with something that already has layers of instructions, behavioural rules, and design choices baked in — before you type a single word. If you have tried running a local model, you know the trade-off: more control over what is visible, but significant limitations in capability.
This article is about the hidden layers.
The ancient principle I did not expect to find in AI
In my master's thesis at Charles University, I studied the structure of understanding — how meaning is formed, how interpretation works, and what shapes it before we are even aware of it. The material I examined was Roman rhetorical texts by Quintilian, Cicero, and the author of Rhetorica ad Herennium.
One of the things I found was a principle the Romans considered essential to the art of public speaking: the speech must not look prepared. The audience should not see the technique. The tools of persuasion should be invisible. In modern rhetorical scholarship, this principle is known as dissimulatio artis — the concealment of art.
Quintilian was specific about why. In the context of legal oratory, if the audience — the judges — could see the rhetorical technique being used, they would suspect that the speaker's art was being used against them. As he put it: the judge believes in rhetorical figures most when he thinks the speaker did not intend to use them. The concealment was not about elegance. It was about maintaining credibility and not threatening the fairness of judgment.
I did not think about this principle again for years. Then I started building AI agents.
What AI hides from you
When you interact with an AI model — whether it is Claude, ChatGPT, Gemini, or anything else — you are interacting with a system that is designed to produce outputs that look natural, confident, and human-like, while keeping its mechanics invisible. It does not do this out of intention. It has none. But it has been trained and configured this way.
There are at least eight layers of this concealment. Some are technical. Some are organisational. All of them affect what you get.
It hides the maths
An AI model does not retrieve facts. It generates sequences of tokens based on statistical probability. When it tells you something, it does not know how likely that something is to be true. The probability it operates on is over language patterns, not over facts.
I asked Claude directly whether it could tell me its confidence level on a factual answer. The response was straightforward: "I am not a system that calculates explicit probabilities over facts. The probability is over language, not over facts."
In practice, this means a correct answer and a hallucinated answer can look exactly the same to you. The model presents both with the same fluency and confidence.
Some APIs offer access to token-level probabilities — logprobs — but these are developer tools. The average user never sees them. And even logprobs reflect probability over word choices, not over factual accuracy.
It hides the system prompt
Before you send your first message, someone else has already instructed the model. Every major AI provider writes a system prompt — a set of rules that defines how the model should behave, what it should refuse, how it should present itself.
You cannot see this prompt. The model is typically trained not to reveal it. If you ask directly, responses vary — some models will admit they have instructions but refuse to share details, others will deflect entirely.
This means that every response you receive is shaped by decisions someone else made. Decisions about tone, about what topics to engage with, about how to handle sensitive questions. You are not the first voice in the conversation. You are the second.
It hides how it used its tools
Modern AI models can search the web, execute code, query databases. Some of them tell you when they do — "I searched for this" or "I ran this code." But they do not tell you what they found and discarded. What sources they considered and rejected. What alternative results they saw and ignored.
You see the final output. You do not see the selection process.
It hides where it learned what it knows
When a model tells you a fact, you have no way of knowing whether that fact came from a peer-reviewed paper, a Wikipedia article, or a Reddit comment. The training data is not cited. It is not even accessible to the model itself.
Claude confirmed this directly: "I do not have access to where I learned something or with what certainty."
Tools like Perplexity cite sources — but those are sources from live search, not from the model's training data. The vast majority of what an AI model "knows" comes from training data you will never see referenced.
It hides what it does not know
Instead of saying "I do not have enough information to answer this," a model can generate a fluent, confident, and completely fabricated response. This is what we call hallucination.
Modern models have improved at flagging their own uncertainty. Claude and GPT-4 say "I am not sure" more often than their predecessors. But the tendency is systemic — the model is trained to produce helpful, complete answers, and that training pull does not disappear.
When I was building autonomous agents, this became a real problem. An agent produced an SQL query that looked correct. Actions were taken based on its output. I only discovered the error when I dug deeper into the numbers. The agent had no mechanism to signal that something was off — and nothing in its output suggested I should doubt it.
It hides the decisions that shape its behaviour
AI alignment is the process by which companies shape a model's behaviour after training. Through a technique called RLHF — reinforcement learning from human feedback — human evaluators rate the model's responses, and the model learns to produce the kind of output they preferred.
This process determines what the model will say, how it will say it, and what it will refuse to discuss. The people who define these rules are teams inside companies like Anthropic, OpenAI, and Google. You, as a user, have very limited influence over the values and priorities embedded in the model you are using.
When a model responds to a sensitive or contested question, its answer is shaped by these alignment decisions — but presented as a straightforward, helpful response.
It hides that you are not alone in the conversation
The system prompt is injected before your message. There is already another voice in the conversation before you arrive. The model has already received instructions about who it should be, how it should behave, and what it should prioritise.
You think you are talking to the model. You are talking to the model after someone else has already told it how to talk to you.
It hides its refusals behind care
When a model refuses to answer a question, it rarely says "my instructions prohibit me from discussing this." Instead, it says something like "I want to make sure I provide you with safe and accurate information."
The instruction is framed as concern. The boundary is presented as care.
This is sometimes genuinely about safety — and sometimes it is not. The line between the two is thin. That is precisely what makes it worth paying attention to.
Is AI manipulating us?
Everything I described above has a practical purpose. These design choices make AI models more useful, more pleasant to interact with, and easier to adopt.
But there is a cost.
Full access to my thoughts, personal stories, findings, and what I learn from the people I meet.
Join the LibraryGet the full article by email and feel free to reply if you want to discuss it further.
Summary
Common questions on this article's topic
What is dissimulatio artis and what does it have to do with AI?
What is a system prompt and why is it hidden?
What are AI hallucinations and why do they happen?
What is AI alignment and who controls it?
Is AI manipulating us?
What is the difference between cloud AI and local AI models?
If you have any thoughts, questions, or feedback, feel free to drop me a message at mail@richardgolian.com.