The Inferring Self
Part 4 of "The Inference Universe": You are not a thing that thinks. You are the "thinking" itself, the computation running on meat, for a little while.
Close your eyes. Wait three seconds. Now open them.
In that moment of opening, your visual cortex processed roughly 10 million bits of information per second. But here’s the strange part: only about 40 bits per second made it to your conscious awareness. The rest was handled somewhere else in your brain, by something that is you, but a “you” that you never meet.
Where did those 10 million bits go? What did that silent part of you do with them?
It predicted. Before the light hit your retina, your brain had already generated a model of what it expected to see. The 10 million bits weren’t processed as new information. They were compared against a prediction. Only the errors in that prediction, the places where reality diverged from expectation, bubbled up to consciousness.
You don’t see the world. You see your prediction errors.
This is the starting point for understanding what we all are: not a passive receiver of experience, but an active generator of predictions, constantly inferring the world into being and occasionally getting corrected by it.
The predictive brain
Let’s start with the basic architecture.
The Old Model: Stimulus → Response
For most of the 20th century, neuroscience assumed perception worked like this:
World → Senses → Brain → Perception
Light hits your eye. The signal travels to the brain. Your brain processes the signal. And thus you see.
This model is wrong. Not subtly wrong. Architecturally wrong.
The New Model: Prediction → Correction
The actual flow looks more like this:
Brain generates prediction → Compares to sensory input → Updates on its prediction errors
Your brain doesn’t wait for the world to tell it what’s out there. It guesses first, then checks the guess against the input it receives through your senses. The sensory input is primarily used to correct the guess, not to create the perception from scratch.
This is called predictive processing or predictive coding, and it’s now the dominant framework in computational neuroscience.
Predictive processing is the theory that the brain is fundamentally a prediction machine, constantly generating models of expected inputs and updating those models based on prediction errors.
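The loop is simple enough to sketch in a few lines of Python. This is a toy illustration, not a neural model; the learning rate, noise level, and variable names are invented:

```python
import random

random.seed(0)  # reproducible "sensory noise"

def predictive_step(prediction, observation, learning_rate=0.1):
    """One cycle of predictive processing: only the prediction error
    moves the model; a perfectly expected input changes nothing."""
    error = observation - prediction            # what bubbles up to awareness
    return prediction + learning_rate * error, error

world = 20.0    # the hidden state out there (never seen directly)
belief = 0.0    # the brain's starting guess
for _ in range(200):
    observation = world + random.gauss(0, 0.5)  # noisy sensory input
    belief, error = predictive_step(belief, observation)

# After many cycles, the belief tracks the world and the errors shrink:
# perception has converged on a well-calibrated guess.
```

Early in the loop the errors are large and every observation is "news"; by the end, observations mostly confirm the prediction and carry almost no information the model doesn't already have.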
Evidence this is real
This isn’t a vague philosophical point of view. It’s measurable science, via:
Faster processing of expected stimuli: When you see something you predicted, the neural response is smaller (less error to process). When you see something unexpected, the neural response spikes. Depending on the nature of the error, there can also be a related neuromodulatory response; dopamine, for instance, is known to track reward prediction errors.
Illusions as predictions overriding reality: In the hollow face illusion, your brain’s strong prior (“faces are convex”) overrides the actual visual data showing a concave face. You literally see what you expect, not what’s really there.
Placebo effects: Your prediction that a pill will help actually changes your physiological state. The model that you believe affects the physical system.
Phantom limbs: Amputees often report that they feel limbs that don’t exist because the brain’s model still includes the limb. The prediction persists without the input to correct it.
You are, at every moment, hallucinating your reality, and calling it perception only when the hallucination is well-calibrated to the real world.
The free energy principle
Now let’s go deeper. Much deeper.
In 2006, the neuroscientist Karl Friston proposed something audacious: a unified theory of brain function. Not just perception, but action, learning, attention, memory, development; all of it, derivable from a single principle.
The principle: minimize free energy.
What is free energy?
This requires precision and attention. The “free energy” here is not the physics term (Helmholtz or Gibbs free energy). It’s an information-theoretic quantity that Friston borrowed from statistical mechanics and machine learning.
Variational free energy is an upper bound on “surprise”: the negative log probability of sensory observations given an agent’s model of the world.
In less technical terms: free energy measures how badly your predictions are failing.
High free energy = big prediction errors = your model is wrong
Low free energy = small prediction errors = your model fits reality
The Free Energy Principle (FEP) says: all living systems act to minimize free energy.
Here’s the profound part. Minimizing free energy can happen in two ways:
Update your model (change your beliefs to better predict inputs)
Change your inputs (act on the world to make it match your predictions)
This has a striking consequence: perception and action are the same process seen from different angles.
Perception = changing your model to fit the world
Action = changing the world to fit your model
Both minimize free energy. Both reduce prediction error. They’re two sides of one coin.
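The two-sided coin can be shown as toy code. The scenario, rates, and numbers below are arbitrary; the point is only that both routes drive the same error to zero:

```python
def prediction_error(belief, world):
    """The quantity both routes minimize."""
    return abs(belief - world)

def perceive(belief, world, rate=0.5):
    """Perception: move the model toward the world."""
    return belief + rate * (world - belief)

def act(belief, world, rate=0.5):
    """Action: move the world toward the model."""
    return world + rate * (belief - world)

belief, world = 20.0, 5.0   # I expect warmth; the room is cold

# Route 1: update the model (accept that the room is cold).
b = belief
for _ in range(10):
    b = perceive(b, world)

# Route 2: change the world (turn on the heater).
w = world
for _ in range(10):
    w = act(belief, w)

# Both routes end with near-zero prediction error.
```

Route 1 is belief revision; route 2 is goal-directed action. The error term they shrink is identical, which is the formal sense in which they are "two sides of one coin."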
The math (simplified)
For those who want the mathematical structure, here’s the key equation:
F = E_q[log q(s) - log p(o,s)]
Where:
F = free energy (what we minimize)
q(s) = our beliefs about hidden states of the world
p(o,s) = the joint probability of observations and states under our model
E_q = expected value under our beliefs
The equation decomposes into:
F = Complexity - Accuracy
Complexity: how much your beliefs diverge from your prior
Accuracy: how well your beliefs predict observations
Minimizing free energy means: be as accurate as possible while keeping your model as simple as possible. This is Occam’s razor, derived mathematically.
Active inference
The extension to action is called active inference. Instead of just updating beliefs, the agent can act to gather information or to make the world conform to its predictions.
This explains:
Curiosity: seeking information that reduces model uncertainty (epistemic action)
Goal-directed behaviour: acting to bring about a predicted (desired) outcome (pragmatic action)
Homeostasis: acting to keep physiological variables within predicted ranges
A thermostat minimizes free energy by turning on the heater. Your body minimizes free energy by shivering. You minimize free energy by grabbing a blanket. Same principle, different substrates.
“You” as a model
Here’s where it gets really personal. If your brain is a prediction machine, what is the self?
The self is the model your brain uses to predict your own states, actions, and their consequences.
You are not a little person inside your head watching a screen. You are the model that predicts what “you” will do, feel, and experience next. The sense of being a unified self is a predictive construct; useful for coordinating action, but not a fundamental feature of reality.
Identity as sticky prior
Your identity, your sense of who “you” are, is a prior that’s very resistant to updating.
Why? Because identity predictions are entangled with everything. If you update your belief about “who I am,” you have to update thousands of downstream predictions in your brain: how I’ll behave, what I’ll enjoy, who I’ll spend time with.
This is why identity change is hard. It’s not laziness. It comes with extensive computational cost. The system is built to resist updates that cascade too widely, and to avoid spending too much energy on recomputation.
And this is why identity threats feel so visceral. An attack on your identity isn’t just an argument; it’s a threat to destabilise your entire predictive architecture, and therefore your view of yourself, the world, and your place in it.
Memory as compressed prior
What is a memory?
Memory is a compressed prior; a stored prediction that can be deployed in the present to generate expectations about situations similar to the past.
You don’t “retrieve” memories like files from a hard drive. You reconstruct them from compressed priors. This is why memories are malleable, why they change with each recall, why eyewitness testimony is unreliable.
The memory isn’t the past. It’s your model of the past, re-rendered each time you access it.
Attention as channel selection
You can’t process everything all the time. Attention is the mechanism that selects which prediction errors get computational resources right now.
Attention is the process of increasing the gain on certain prediction errors while suppressing others.
What you attend to is what you allow to update your model. What you ignore is what you refuse to learn from.
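Precision-weighting, the standard formal gloss on attention in predictive coding, can be sketched in a few lines. The channels, gains, and numbers here are invented for illustration:

```python
def attend_and_update(belief, errors, precisions, rate=0.1):
    """Precision-weighted update: each channel's prediction error is
    scaled by the attention (gain) assigned to it before it is allowed
    to move the model."""
    weighted_error = sum(p * e for p, e in zip(precisions, errors))
    return belief + rate * weighted_error

belief = 0.0
# Two channels report conflicting errors; attention decides which one teaches.
errors = [+5.0, -5.0]

pulled_up   = attend_and_update(belief, errors, precisions=[1.0, 0.0])
pulled_down = attend_and_update(belief, errors, precisions=[0.0, 1.0])
```

Same model, same sensory errors, opposite updates. The only difference is which channel's gain was turned up, which is the mechanical sense in which whoever controls your attention controls what you learn.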
This is why human attention is so valuable and so contested. Whoever controls your attention controls which prediction errors reach your model, and thus your belief systems and identity.
Advertisers, politicians, algorithms: all are competing to control your inference.
Consciousness: what inference feels like
Now to the hard question. Why does any of this feel like something?
A thermostat minimizes free energy. But a thermostat does not (presumably) experience anything: no heat, no cold, no felt change in temperature. It doesn’t shiver, or get angry when the power goes out. What’s different about us?
I won’t pretend to solve the hard problem of consciousness. But I can offer a frame that at least locates it approximately within our inference picture.
The global workspace
One leading theory (Global Workspace Theory, Bernard Baars) suggests that consciousness is what happens when information becomes globally available across brain systems.
Most prediction errors are processed locally. The error in your visual cortex stays in your visual cortex. But some errors are significant enough or relevant enough to the current goals that they get “broadcast” to the whole system.
That broadcast is consciousness. It’s the moment when a local update becomes a global update; when the whole system gets informed and shifts in response.
The integrated information view
Another theory (Integrated Information Theory, Giulio Tononi) defines consciousness in terms of integration; the degree to which a system generates information above and beyond its parts. (The theory remains controversial; a small minority of critics go as far as calling it pseudoscience.)
A collection of independent neurons generates no integrated information. A brain, with its dense interconnections, generates a lot. The “amount” of consciousness (phi, Φ) is the amount of this integrated information.
In this view, consciousness isn’t an on/off switch. It’s a gradient. More integration = more consciousness.
The free energy frame
Using the Free Energy Principle and the theories above, here’s a speculative but coherent proposal:
Consciousness is the felt quality of being a system that models itself modeling the world.
A thermostat models room temperature. But it doesn’t model itself modeling temperature. It has no self-model to update.
You and I have models of ourselves as prediction machines. You can predict your own predictions. You can attend to your own attention. This recursive structure, the model modeling itself, may be what generates the felt sense of there being “something it is like” to be you.
This is a hypothesis, not settled science. But it helps me locate and narrow down consciousness within the inference frame: consciousness is what self-modeling feels like from the inside.
Silicon minds: intelligence without our substrate
Now let’s ask the question we’re all wondering about: what does this mean for AI?
If intelligence is inference, and brains are just one substrate for inference, then what happens when we build inference machines from silicon?
What’s the same
The math is identical. A large language model (like ChatGPT or Claude or Gemini) is doing something structurally similar to predictive processing:
Prior: the weights learned during training (what patterns are expected)
Likelihood: how the current input relates to possible continuations
Posterior: the probability distribution over next tokens
When GPT generates text, it’s minimizing a form of prediction error; the cross-entropy between its predictions and the training distribution.
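A minimal sketch of that prediction error, using a toy three-word vocabulary and made-up logits (real models do this over tens of thousands of tokens):

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution over next tokens."""
    m = max(logits)                               # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    """Prediction error for next-token prediction: -log p(correct token)."""
    probs = softmax(logits)
    return -math.log(probs[target_index])

# Two hypothetical models predict the word after "the cat sat on the ...".
vocab = ["mat", "ran", "quantum"]
target = vocab.index("mat")

good_model_logits = [4.0, 1.0, 0.0]   # most mass on "mat"
bad_model_logits  = [0.0, 1.0, 4.0]   # most mass on "quantum"
```

Training nudges the weights so that logits look like the first shape rather than the second: lower cross-entropy, i.e. smaller prediction error against the training distribution. Structurally, this is the same move as the brain shrinking its sensory prediction errors.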
The Free Energy Principle applies here in principle: the model’s “goal” is to minimize surprise (prediction error) given its learned model of language.
What’s different
Several things, however, seem to differ fundamentally.
The uniqueness question
So what makes human intelligence unique, if anything?
Not inference. Silicon does that, and can probably do it better at scale.
Not learning. Silicon does that too.
Not language. Silicon does that really well.
Here’s a candidate: we are mortality-aware, embodied inference.
Your predictions are not abstract. They are in service of keeping you alive. Your model of the world is saturated with relevance: what matters to you, what threatens you (physically, psychologically, socially, financially), what nourishes you, what connects you to others.
An AI minimizes prediction error because that’s how its loss function was designed. You minimize prediction error because if you don’t, you die.
And this changes everything. Our inference is existentially grounded. Every prediction carries weight because we have something to lose.
An AI has nothing to lose; it’s not clear there’s a “there” there that could lose anything. This may explain why AI can be so capable and yet feel so flat. It predicts beautifully, but it doesn’t care about its predictions, because those predictions do not affect it. Caring requires stakes in the game. Stakes in the game require mortality.
Could this change?
Possibly. If we build AI systems that:
Have continuous existence that can be ended
Have self-models that include their own persistence
Have intrinsic goals (not just imposed objectives)
Are embodied in ways that create genuine homeostatic needs
...then the picture might shift. The question isn’t “can machines think?” It’s “can machines care?”
And caring might require the one thing we can’t engineer on purpose: something at stake.
The shape of a life
I’ve covered physics, societies, markets, minds. Now let’s bring it all home.
You are an inference engine made of meat, running for a few decades, embedded in networks of other such engines, constantly predicting and updating.
What does this mean for living life?
Meaning as computed relevance
I have often been stuck on the question: what is the meaning of life?
In the inference frame:
Meaning is the output of a computation that assigns relevance to experiences, actions, and connections.
Meaning isn’t “out there” waiting to be found. It’s computed, by me, by us, using our model, based on our priors.
Something is meaningful when our model says it matters, when it predicts significant consequences for our goals, our connections, our persistence.
This might sound deflationary. It isn’t; it’s beautiful and simple. The fact that meaning is computed doesn’t make it any less real. Our experience of meaningfulness is as real as any other experience. It’s just that the mechanism is probably inference, not discovery.
And here’s the liberating part: if meaning is computed, we have some control over the computation. We can update our priors. We can choose what to attend to. We can shift what our models count as mattering.
Meaning is not fixed. It’s configurable, dynamic, and updateable.
Suffering as prediction error
The Buddha said: life is suffering. But does it have to be? Could it be that all suffering comes from some misalignment? The inference frame says something similar but more precise:
Suffering is persistent, high-magnitude prediction error that the system cannot resolve through action or belief update.
When your model predicts X and reality delivers not-X, you feel distress. The larger the error, the greater the distress. The longer the error persists without resolution, the more it becomes suffering.
This explains why suffering often comes from resistance; from refusing to update the model even when the evidence demands it. If you cling to a prediction that reality keeps violating, you’re choosing to sustain the error, and thus suffer.
Acceptance, in this frame, is updating the model. It’s you saying: my prediction was wrong, and I’m revising it to match reality.
This doesn’t mean acceptance is easy. Model updates are costly, especially when the model is identity-entangled. But it locates the mechanism: suffering persists when updating is blocked.
Love as mutual model entanglement
What is love, structurally?
Love is the state in which two agents’ predictive models become deeply entangled; each modeling the other, each updating on the other’s states, each acting to minimize the other’s prediction errors.
When you love someone, their surprise becomes your surprise. Their suffering generates prediction error in your model. Their flourishing brings you into lower free energy.
This is why love feels like expansion; because it probably is. Your self-model expands to include another. Their states become your states to predict and to care about.
And this is why loss of love is so devastating to us, at both an emotional and an identity level. The entangled model doesn’t disappear when the person departs. You keep predicting them. And the new reality keeps failing to deliver. The error is enormous and cannot be resolved by any simple action.
Grief is the slow, painful process of updating the model to match a world where the loved one no longer generates new data.
Death: the end of prediction
Now to the hardest part.
We all will die. The inference engine that is “me” will stop. No more predictions. No more updates. The model will halt.
What does this mean?
Death as channel closure
In Part 1, I said physical processes are inference propagating through channels. Your consciousness is also a channel; a specific, local, temporary channel through which the universe is doing inference.
Death is the closure of your channel. The inference stops. But the state changes you propagated into other channels persist.
You have already changed others’ models, their predictions about the world, their beliefs about what matters, their memories of you. Those changes are in them now. When you die, those changes don’t disappear. They continue to propagate.
This is legacy. Not metaphorically. Literally. Your inference, cached in others, running after you’ve stopped.
The terror and the peace
I won’t pretend this is fully comforting. I lost my dad a year back. So much of him - his experiences, his perception of the world, of life, of suffering, of joy - lives on in me. And yet, the end of your channel means the end of your experience. There will be no “you” to observe the persistence of your influence.
And yet -
The inference frame offers something clear and unique: there is continuity, not of self, but of effect. You were never a thing. You were always a process; a pattern of prediction and update, running on borrowed atoms for a borrowed moment.
And one day, the pattern ends. But patterns don’t exist in isolation. They exist in relation. Your pattern has already interleaved with countless others. Those interleavings continue to thrive and propagate. Your inference’s “impact” will carry on in the others that you deeply influenced.
All of humanity, you and me included, is a wave, not a particle. The wave passes. But the water it moved through is now changed forever.
The wisdom of presence
Here’s a final implication that I feel.
Free energy has a temporal structure. You can try to minimize prediction error about the past (rumination) or the future (anxiety). But the actual sensory data is always now.
Presence, the much-discussed quality of being “in the moment”, can be defined in inference terms:
Presence is attending to current prediction errors rather than simulating past or future ones.
When you’re present, you’re running inference on actual data. When you’re ruminating or anxious, you’re running inference on simulated data; predictions about predictions.
Both are valid computations. But only presence grounds you in reality. Only presence allows for a genuine update.
The poets and contemplatives were right. Presence is precious. Not because the present is metaphysically special, but because it’s where the actual data is.
Coda: the privilege of knowing
Let me end Part 4 where we began, in Part 1.
A wire doesn’t know it’s conducting electricity. A market doesn’t know it’s computing prices. A culture doesn’t know it’s compressing inference.
But you and I, we know.
You can watch yourself predict. You can feel yourself update. You can notice the moment when reality violates expectation and the model shifts.
This is consciousness. Not a thing. A process of the universe doing inference, and knowing that it’s doing inference, through and in us.
You and I are the universe’s way of watching itself learn.
This sounds grandiose. It isn’t meant to be. It’s meant to be precise. We are made of atoms that are made of quantum fields that obey equations that propagate state changes. We are physics. We are inference. And we are the place where inference becomes aware of itself.
We will end. The channel will close. But for now, right now, you and I have something extraordinary:
The ability to know what we are.
The ability to choose what to attend to.
The ability to update our priors, to change our model, to shift what matters.
The ability to love, which is to entangle our model with another’s and call their errors our own.
The ability to wonder, which is just inference without an answer yet, and the willingness to stay in that openness.
You are not a thing that thinks. You are thinking itself, running on meat, for a little while.
Make it count.
Appendix: Key References
Karl Friston — “The Free-Energy Principle: A Unified Brain Theory?” (2010). The foundational paper on FEP in neuroscience.
Andy Clark — “Surfing Uncertainty” (2015). The most accessible book-length treatment of predictive processing.
Anil Seth — “Being You: A New Science of Consciousness” (2021). Applies predictive processing to consciousness, beautifully written.
Jakob Hohwy — “The Predictive Mind” (2013). Rigorous philosophical treatment of predictive coding.
Karl Friston — “A Free Energy Principle for Biological Systems” (2012). Extends FEP beyond brains to all living systems.
Thomas Metzinger — “Being No One” (2003). The self-model theory of subjectivity. Dense but profound.
Bernard Baars — “In the Theater of Consciousness” (1997). Global Workspace Theory’s original popular treatment.
Giulio Tononi — “Phi: A Voyage from the Brain to the Soul” (2012). Integrated Information Theory presented as narrative.
Lisa Feldman Barrett — “How Emotions Are Made” (2017). Applies predictive processing to emotions.
Mark Solms — “The Hidden Spring” (2021). Integrates Friston’s FEP with affect and consciousness.