The Social Inference Machine
Part 2 of “The Inference Universe”. Why cultures are compression algorithms, revolutions are phase transitions, and we are each other’s priors
Part 1 of this series (here), “The Physics of Inference”, argued that physics isn't about forces acting on objects. It's about state updates propagating through channels. The universe is a cellular automaton with no central processor, only local rules and infinite patience. In this article, I extend that frame to society and culture.
In 1947, a line was drawn across a massive subcontinent. Fourteen million people moved, some voluntarily, most driven by fear, paranoia and violence. One million died. Two new nations emerged where one had been.
What happened?
The standard telling has always followed a script: politics, religion, colonialism, violence, and the egos of two very powerful individuals, both of whom wanted to be Prime Minister, across very different political ideologies. All of this is recorded with ample evidence, and all of it is true.
But underneath these words lies a stranger process. Millions of people, who had lived as neighbours for generations, suddenly updated their beliefs about who they were, who “the other” was, and what was possible. They didn’t move because they hated their neighbours. They moved because they inferred, from speeches, from rumours, from the movements of others, that staying meant death.
The partition wasn’t just a political event. It was a phase transition in a very large social inference network. Priors shifted. A cascade of belief updates began. And once enough people updated, the updates became self-fulfilling.
This is what we’ll explore today: the “physics” of societies. How cultures emerge, how politics computes, how progress happens, and why, sometimes, everything breaks all at once.
What is a society, actually?
Let’s start with a definition. Not a poetic one, but a boring, structural one.
A society is a network of members, entities or agents running inference on each other.
Each entity or agent (person) has:
Priors: beliefs about the world and about themselves, inherited from the culture they grow up in, their family, and their personal experience
Observations: what they see others around them doing and saying
Update rules: how they change beliefs, given the observations they make
Actions: behaviours that become observations for others to pick up
The larger network emerges because my actions become your observations, and your actions become my observations. We are constantly and continuously entangled in each other’s inference.
This isn’t metaphor. It’s the actual structure of things around us. Sociologists call it symbolic interactionism. Economists call it game theory with beliefs. Network scientists call it opinion dynamics.
I’m calling it what it is: distributed inference on a human substrate.
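The loop described above — priors, observations, update rules, actions — is concrete enough to sketch in code. Here's a toy simulation, purely illustrative (the `Agent` class, the moving-average update rule, and all the numbers are my own assumptions, not a model from the literature): a hundred agents, each acting on a belief and updating that belief only from neighbours' observable actions.

```python
import random

class Agent:
    """A node in the social inference network."""
    def __init__(self, prior):
        self.belief = prior  # P("the norm holds"), kept in [0, 1]

    def act(self):
        # Action: follow the norm with probability equal to belief.
        return random.random() < self.belief

    def update(self, observed_actions, rate=0.2):
        # Update rule: nudge belief toward the fraction of neighbours
        # seen following the norm (a simple exponential moving average).
        evidence = sum(observed_actions) / len(observed_actions)
        self.belief += rate * (evidence - self.belief)

random.seed(0)
agents = [Agent(prior=random.random()) for _ in range(100)]

for step in range(50):
    actions = [a.act() for a in agents]             # my actions...
    for i, agent in enumerate(agents):
        neighbours = actions[:i] + actions[i + 1:]  # ...become your observations
        agent.update(neighbours)

beliefs = [a.belief for a in agents]
print(f"belief spread after 50 rounds: {max(beliefs) - min(beliefs):.3f}")
```

Under these assumptions, the population converges on a shared norm-following rate even though no agent ever sees another's belief, only their actions. That's the entanglement: inference running on inference.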
The fundamental problem
Here’s the central problem societies solve: coordination without a central processor.
A colony of cells solves this problem by sharing a genome, a common instruction set. A flock of birds solves it with simple rules: match your neighbour’s velocity, avoid collisions. But we humans face a harder version. We need to coordinate on:
What counts as property (and who owns what)
What counts as marriage (and who’s related to whom)
What counts as authority (and who can compel whom)
What counts as sacred (and what’s worth dying for)
These aren’t physical facts; no universal laws govern them. They’re shared beliefs, existing as abstractions, with real-world artefacts derived from them. And shared beliefs require shared inference: some way for millions of minds to converge on compatible models of reality that can be enforced and observed in everyday life.
How does this work, and why does it happen?
Culture as lossy compression
Imagine you’re an infant. You know absolutely nothing. Everything around you is just noise.
Then, slowly but surely, patterns emerge. You learn that certain sounds mean food is coming. Certain voices and faces mean safety. Certain tones mean danger. You’re not being taught these things explicitly. You’re inferring them subconsciously from the statistical regularities and repetitions in your environment.
Now scale this up to a group of adults. Imagine you’re in a tribe of thousands of individuals, each inferring patterns from their environment. But here’s the problem: raw experience is not transmissible. I cannot upload my sensory stream into your brain. Words cannot wholly transmit the sensations and feelings of an experience. Language is lossy. Gesture is ambiguous. To this day, we cannot share experience directly.
So we compress: into words, into stories, into narratives.
Culture is a shared compression algorithm for encoding and transmitting inferences across minds and generations.
So when a grandmother tells a grandchild “don’t swim in that river after dark,” she’s not transmitting her raw experience of crocodile attacks.
She’s transmitting a compressed inference: river + dark = danger.
The grandchild doesn’t need to survive a crocodile to know that the river is a dangerous place to be in the dark. The inference has been cached, transmitted in the grandmother’s warning.
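The grandmother's codec can be sketched in a few lines. This is a hedged toy: the data, the `compress` function, and the rule format are all invented for illustration. The point is that the receiver gets the condition-action rule, not the evidence behind it.

```python
# Grandmother's raw data: many observations, high-dimensional, untransmittable.
experiences = [
    {"place": "river", "time": "night", "outcome": "attack"},
    {"place": "river", "time": "day",   "outcome": "safe"},
    {"place": "river", "time": "night", "outcome": "attack"},
    {"place": "well",  "time": "night", "outcome": "safe"},
]

def compress(experiences):
    """Lossy compression: keep only the condition -> danger rule,
    garbage-collect the evidence that produced it."""
    dangerous = {(e["place"], e["time"])
                 for e in experiences if e["outcome"] == "attack"}
    return lambda place, time: (place, time) not in dangerous

# The grandchild receives the rule, not the crocodiles.
is_safe = compress(experiences)
print(is_safe("river", "night"), is_safe("river", "day"))  # -> False True
```

Notice what the returned function can no longer tell you: why the river at night is dangerous. The derivation has been discarded, which is exactly the cost the next section explores.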
The forms compression takes
Words, stories, narratives: each is a codec, a way of encoding high-dimensional experience into low-dimensional, transmissible form, to be passed down through generations.
Why cultures differ
Different environments generate different data. Different data generates different inferences. Different inferences get compressed into different codecs.
For example, why do tropical cultures have different cuisines than arctic ones? Not (only) because of ingredient availability, but because what works in each place is different. The inference “raw fish is safe” is true in the cold climates of the Arctic (parasites die in ice) and false in warm ones (parasites thrive). The compression adapts to the data.
Why do some cultures compress kinship into “family vs strangers” while others have seventeen distinct cousin categories?
Because the inference problems are different. In small-scale societies (e.g. tribal groups), knowing exactly who is related to whom determines marriage eligibility, inheritance, and alliance. The compression must be higher-resolution for the codec to be actionable.
Hence the variety of naming conventions across cultures. Some extremely long naming styles are designed to compress and convey information not just about the individual, but about their kinship network.
Culture, therefore, isn’t arbitrary. It’s compressed inference from local data. This is why cultures feel so right to insiders and so strange to outsiders: you’re running a different decompression algorithm, the one from your “home” culture.
Tradition: the cached inference
Now here’s where it gets interesting. Compression has a cost: you lose the derivation.
Meaning: when grandma says “don’t swim in the river after dark,” she may not remember why. Maybe the original crocodile attack was four generations ago. The inference got cached; the evidence got garbage-collected.
Why the compression was derived in the first place is mostly forgotten. Enter tradition.
Tradition is cached inference from ancestors, preserved without the original evidence.
This is both powerful and dangerous.
Powerful because it lets you benefit from your ancestors’ experience without repeating their mistakes. You don’t need to discover for yourself that certain mushrooms are poisonous. The cache that got passed down says “don’t eat those.”
Dangerous because environments change. The cache was computed for a different dataset. Time, place, people, risks, weather, belief systems - everything has changed. If the river no longer has crocodiles (they were hunted out), the prohibition becomes superstition - an inference that is completely severed from its evidence. Most religions today find themselves here.
Chesterton’s fence
G.K. Chesterton proposed a principle: don’t tear down a fence until you understand why it was built. In inference terms, this means: don’t invalidate a cached “prior” until you’ve reconstructed the likelihood function that generated it.
This is actually good Bayesian hygiene (see Part 1 for more on Bayesian thinking).
If a tradition has persisted across generations, it probably solved some problem. The tradition is evidence that the problem existed, even if you can’t see it now. Tearing down the fence without understanding is ignoring evidence.
But, and this is crucial: Chesterton’s fence has limits.
Sometimes the fence was built by idiots. Sometimes it was built to solve a problem that no longer exists in any shape or form. Sometimes it was built to benefit the fence-builder at others’ expense.
The tradition’s persistence is evidence, not proof. You still have to reason about it and decide whether it deserves to persist. Chesterton asks you to question; he does not call for inaction on updating beliefs. Systems such as caste fall into the category of beliefs that shouldn’t persist.
The caste system as cached inference
Let’s consider caste. What is it, structurally?
One reading: caste is a very old, very high-resolution compression of:
Occupational specialization (who does what work)
Purity/pollution categories (who can touch what, who can enter where)
Marriage constraints (who can reproduce with whom)
Economic relationships (who owes what to whom)
At some point, centuries ago, these compressions may have been formulated by those in power with an agenda, or may have solved coordination problems we can no longer see. The truth is we may never know why. It could have been gatekeeping of resources. It could have been that occupational inheritance meant skill transmission, guarded intentionally.
In most cultures, endogamy meant wealth retention. Many purity-related rules may have begun as hygiene heuristics and drifted into superstitious, traditional absolutes.
But what is true and measurable is that the environment has changed. Culture has evolved. Hygiene has become the norm. Knowledge is universally accessible and (mostly) available on merit. The economy has industrialised. The evidence that generated the original inferences can no longer be found. Yet the cache persists, now as a system of oppression rather than coordination.
This is the failure mode of tradition: when the cache outlives its evidence, it becomes a cage.
Politics as distributed state synchronisation
Every society faces a coordination problem: how do millions of agents align their behaviour without a central controller?
Politics is how we solve this.
Politics is the process by which a society negotiates and propagates shared priors about authority, resource allocation, and legitimate action.
Notice: politics isn’t about finding truth or equity. It’s about achieving synchronisation.
A society where 51% believe “taxes should be higher” and 49% believe “taxes should be lower” isn’t wrong; it’s unsynced. Politics is the protocol for enforcing an outcome, with or without resolving the desync.
Different political systems = different inference architectures
Each architecture has its tradeoffs. And we live in a time where we can find live examples for each of these political systems across the globe.
Autocracy is fast in decision-making and enforcement, but fragile: if the autocrat’s priors are wrong, no mechanism short of a coup corrects them. Democracy is robust but slow: consensus-building takes time, and corruption can easily take root in the lower rungs of the bureaucracy.
Democracies, and federal systems within them, achieve this outcome alignment via elections.
Elections as synchronisation checkpoints
What is an election, structurally?
An election is a scheduled synchronisation event where members, entities or agents reveal their priors and the system computes a new consensus state.
Before the election, individuals (or voting blocs) hold private beliefs. The election aggregates these beliefs into a public outcome. The outcome then propagates back as a new prior for the whole populace: “The government is now X. Plan accordingly.”
This is why elections feel so weighty. They’re not just about choosing leaders, or picking a side on ideology. They’re technically resetting the shared state.
The day after an election, everyone, winners and losers alike, updates their model of “what is possible now.”
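In the smallest possible sketch (simple plurality rule, hypothetical party names), an election is just an aggregation function whose output becomes everyone's new shared prior:

```python
from collections import Counter

def election(revealed_preferences):
    """A synchronisation checkpoint: aggregate revealed preferences
    into one public outcome that everyone adopts as a shared prior."""
    ballots = Counter(revealed_preferences)
    outcome, _votes = ballots.most_common(1)[0]  # plurality rule
    return outcome

# A hypothetical 51/49 electorate; "X" and "Y" are placeholders.
electorate = ["X"] * 51 + ["Y"] * 49
shared_prior = election(electorate)
print(f"The government is now {shared_prior}. Plan accordingly.")
```

The interesting part isn't the counting; it's that the output is broadcast back to every node, including the 49% who voted the other way. That broadcast is the state reset.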
Constitutions as immutable priors
Some beliefs are too important to update easily. We don’t want 51% of the population to be able to vote away the rights of the other 49%, just because the 51% believe it should be so. So we created constitutional constraints: priors that are expensive to change, requiring effort, alignment, and prolonged institutional process.
In inference terms: a constitution is a prior with very high resistance to updating. You can’t change it with a single election.
You need supermajorities, ratification processes, years of effort.
This is a feature, not a bug. Some priors should be sticky. “Don’t murder” is a good sticky prior for a society. “This particular family should rule forever” is a bad sticky prior.
The art of constitutional design is deciding which priors should be sticky, and how sticky.
Revolutions as phase transitions
Most of the time, social inference is gradual. People update their beliefs slowly, sometimes over decades, or never. Institutions adapt incrementally. This is a continuous regime of slow drift.
But once in a while, a spark ignites; and everything changes at once.
In physics, a phase transition occurs when a system shifts from one state to another discontinuously. Water goes from liquid to gas at 100°C (at standard pressure). Below that temperature, adding heat just makes the water warmer. At exactly that temperature, adding heat changes its state.
Revolutions are social phase transitions.
A revolution is a discontinuous shift in a society’s shared priors, occurring when the existing consensus becomes untenable.
The mechanics of preference falsification
Here’s how it works. Suppose most people in a society privately disbelieve or abhor the official ideology, but each individual thinks they’re in the minority. They falsify their preferences: publicly conforming for their safety while privately dissenting.
The economist Timur Kuran documented this in communist Eastern Europe. Surveys showed overwhelming support for the regimes. But the support was hollow: people said what was safe, not what they believed.
In this state, the society is in a metastable equilibrium. Like supercooled water: liquid below freezing, stable until disturbed, then crystallising all at once.
The cascade
When something disturbs the equilibrium (a crisis, a defection, a signal that dissent is safe), the cascade begins:
A few people publicly defect (update their visible state)
Others observe the defection and update their belief about “how many dissenters are there?”
Some of these update enough to defect themselves
More observation, more updating, more defection
The cascade accelerates until the old consensus collapses
This is exactly how rumours spread, how bank runs happen, how revolutions ignite. Each agent is doing local inference on their neighbours. The global phase transition emerges from these local updates.
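The five steps above can be sketched as a minimal threshold cascade, in the spirit of Granovetter's classic threshold models of collective behaviour (the `cascade` function and the numbers are illustrative, not from any dataset):

```python
def cascade(thresholds):
    """Each person defects once the number of visible defectors meets
    their personal threshold. Returns the defector count per round."""
    defected = [t == 0 for t in thresholds]  # threshold 0: defects unprompted
    history = [sum(defected)]
    while True:
        visible = sum(defected)              # everyone observes the same signal
        changed = False
        for i, t in enumerate(thresholds):
            if not defected[i] and visible >= t:
                defected[i] = True           # local update -> visible action
                changed = True
        history.append(sum(defected))
        if not changed:
            return history

# Thresholds 0, 1, 2, ..., 9: one instigator tips everyone, round by round.
full = cascade(list(range(10)))
# Remove just the instigator (thresholds 1..10): nothing ever happens.
none = cascade(list(range(1, 11)))
print(full[-1], none[-1])  # -> 10 0
```

The two populations differ by a single person, yet one collapses completely and the other stays frozen. This is why revolutions look inevitable in hindsight and impossible in foresight.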
The Berlin Wall fell in 1989 not because the regime was weaker than in 1988, but because enough people simultaneously updated their beliefs about what was possible. The wall was always made of belief. When the belief evaporated, so did the wall.
Progress: prior-updating what we approve of
Here’s a word that carries enormous weight in our history: progress.
But what is it, actually?
One view: progress is movement toward objective betterment: more prosperity, more freedom, more flourishing.
Another view: progress is just change, and we retroactively label the changes we like as “progress.”
The inference frame offers a third view:
Progress is the process of updating priors toward models that better predict and control reality, where “better” is measured against the goals of the agent doing the updating.
This definition has some specific features:
Progress is real but not universal. When we update from “disease is caused by demons” to “disease is caused by pathogens,” diagnosis becomes reliable, interventions become testable, and many diseases become preventable or curable. That is progress in the strict sense: improved inference and improved ability to shape outcomes. Yet those gains didn’t arrive everywhere at once, and still don’t benefit everyone equally.
Progress is path-dependent. The updates we make depend on the data we encounter. A society that industrialized via coal will have different priors than one that industrialized via hydropower.
Progress is contestable. What counts as “better prediction and control” depends on what you’re trying to predict and control. Economic growth? Spiritual enlightenment? Military power? Environmental sustainability? Different goals, different progress.
The Enlightenment as prior replacement
Consider the European Enlightenment. What happened?
In inference terms: a critical mass of thinkers replaced their meta-priors. They stopped asking “what does scripture say?” and started asking “what does the evidence say?” They stopped treating authority as self-validating and started treating it as a hypothesis to be tested.
This wasn’t just a new belief system. It was a new update rule, at scale. And once the new update rule propagated, it generated an avalanche of new beliefs: experimental science, democratic theory, human rights, market economics.
The Enlightenment that led to modern science in its current form wasn’t a discovery of truths that were always there. It was a change in how inference was conducted. And that meta-update changed everything.
Polarisation: when channels diverge
Now to the dark side.
In Part 1, I argued that noise is just signal on channels you’re not tuned to. In society, this has a disturbing implication.
Polarisation occurs when sub-populations run inference on different data, using different update rules, and converge on incompatible posteriors.
Let me be more precise about this mechanism, and how it manifests.
Information diets
Suppose Alice watches Channel A. Bob watches Channel B. Over years, they consume different data:
Alice sees: immigration crime stories, inflation reports, traditional values affirmed
Bob sees: police brutality footage, climate data, diversity celebrated
Even if both are rational Bayesian updaters, they will converge on different posteriors. Not because one is stupid and one is smart, but because they’re running inference on different datasets.
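A minimal sketch of this claim, assuming both start from the same uniform prior and update with a textbook conjugate Beta-Bernoulli rule (the observation streams are invented for illustration):

```python
from fractions import Fraction

def beta_update(alpha, beta, observations):
    """Conjugate Beta(alpha, beta)-Bernoulli update: each observation
    increments the matching pseudo-count."""
    for obs in observations:
        if obs:
            alpha += 1
        else:
            beta += 1
    return alpha, beta

def posterior_mean(alpha, beta):
    return Fraction(alpha, alpha + beta)

# Both start from the same uniform prior, Beta(1, 1), on
# "how often is the situation alarming?"
prior = (1, 1)

# Alice's channel surfaces mostly alarming stories (1 = alarming)...
alice = beta_update(*prior, [1, 1, 1, 0, 1, 1, 1, 1, 0, 1])
# ...Bob's channel surfaces mostly reassuring ones.
bob = beta_update(*prior, [0, 0, 1, 0, 0, 0, 1, 0, 0, 0])

print(posterior_mean(*alice), posterior_mean(*bob))  # -> 3/4 1/4
```

Same prior, same update rule, perfectly rational updates throughout; the divergence comes entirely from the data.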
Trust networks
It gets worse. Alice and Bob also have different trust networks: very different beliefs about whose testimony to weight highly.
Alice trusts: her church, her family, commentators who share her values.
Bob trusts: scientists, journalists, activists who share his values.
When Alice’s trusted sources say “X is true” and Bob’s trusted sources say “X is false,” they don’t just disagree about X. They disagree about who is credible. And credibility judgments are upstream of all other beliefs.
The feedback loop
Now add social media into the mix: a channel that exists solely to optimise for engagement, which correlates with outrage.
Alice sees content that triggers outrage at Bob’s tribe
Alice updates: “Bob’s tribe is worse than I thought”
Alice engages with more such content
Algorithm shows more such content
Repeat
Bob has a symmetric experience on his side. Both are caught in update loops that push their posteriors further apart.
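The loop can be sketched as a toy positive-feedback model. The `radicalise` function, the learning rate, and the linear "the algorithm matches the user" assumption are all my own illustrative simplifications:

```python
def radicalise(hostility, steps=30, rate=0.1):
    """Toy engagement loop. Each step, the feed serves outrage in
    proportion to current hostility (it optimises engagement), and
    seeing outrage pushes hostility up a little further."""
    for _ in range(steps):
        served_outrage = hostility  # the algorithm matches the user
        hostility += rate * served_outrage * (1 - hostility)
    return hostility

# A mildly hostile Alice drifts toward near-certain hostility;
# Bob, symmetric on his side, does the same.
print(round(radicalise(0.6), 2))
```

Under these assumptions there is no stable middle: any initial hostility above zero gets amplified, only the speed differs. The fixed points of the loop are the extremes.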
The physics term for this is symmetry breaking. A system that was once mixed separates into distinct phases. The social term is polarisation.
The experience is: “I don’t recognise my country anymore.”
Is there a fix?
The inference frame suggests where to intervene:
Shared data: If people are diverging because of different inputs, expose them to common inputs. But this is extremely hard when attention is scarce and curated, and the data itself is not trusted.
Shared update rules: If people are diverging because of different likelihoods (different beliefs about what sources to trust), rebuild shared trust. But trust is slow to build and fast to destroy.
Depolarisation cascades: If the problem is cascade dynamics, trigger reverse cascades: find ways to make moderation contagious.
None of these is easy or quick; they take years of slow build-up. But at least the frame tells us what the problem is: not that people are stupid, but that they’re running inference in separate channels.
What we are to each other
Let me end somewhere more personal.
We’ve talked about societies as inference networks, cultures as compression algorithms, politics as synchronisation protocols. All true. But it can sound cold, as if we’re just nodes in a graph, just variables in an equation.
We’re not. Or rather: we are, and we’re also something more.
Here’s the something more:
You are not just the product of your own inference. You are the product of everyone who ever updated your priors.
Your parents. Your teachers. Your friends. Your coach. The authors of the books you read, the makers of the films you watched, the strangers who were kind or cruel to you at the moments it mattered. They all wrote themselves into you. Not metaphorically. Actually. Your beliefs are their cached inferences, still running.
And you do the same for others. Every conversation, every gesture, every choice you make - you are data for someone else’s inference. You are updating the world’s priors just by existing.
The weight of witness
In physics, observation changes the system (remember the much-misunderstood Schrödinger’s cat). In society, the act of witnessing changes both parties.
When you listen to someone’s story, really listen, you’re doing something sacred. You’re allowing their experience to update your priors.
You’re saying: “Your data counts. Your perspective could change mine.”
And when you speak, you’re offering your own data for others’ inference. It takes courage to speak up against notions you perceive as wrong, because to speak up is to offer data, to cause a living ripple in the inference pattern. You’re participating in the network’s update, contributing your unique vantage point to the collective computation of humanity.
This is why debate matters. This is why stories matter. This is why conversation matters. We are building reality together, inference by inference, update by update, and we all live in the collective fabric of this inference.
The tragedy and the hope
The tragedy is that inference can go wrong. Cascades can mislead. Caches can calcify. Channels can diverge. With enough bad inference, hate can replace humanity in our belief systems. We can end up in worlds where millions of people share beliefs that are beautifully internally consistent and catastrophically detached from reality.
The hope is that inference can also go right. We can update. We can learn. We can find common ground by finding common data. The same mechanism that produces mass delusion can also produce collective enlightenment for us all, if we choose it.
The same network that carried partition’s violence also carried partition’s healing: the millions of small kindnesses, the neighbours who sheltered neighbours, the belief that this person is still a person.
We are each other’s priors. We will be each other’s future updates. What we believe about each other also determines what becomes possible between us. That’s a responsibility, and also a gift.
Next in the series: Part 3: Markets as Inference Engines: how prices are posteriors, arbitrage is error correction, and money is the universe’s most successful communication protocol.
Appendix: Key References
Timur Kuran — “Private Truths, Public Lies” (1995). The definitive account of preference falsification and revolutionary cascades.
Joseph Henrich — “The Secret of Our Success” (2015). How cultural evolution accumulates adaptive knowledge beyond individual capacity.
James Scott — “Seeing Like a State” (1998). How states compress local knowledge into legible categories—and what gets lost.
Thomas Kuhn — “The Structure of Scientific Revolutions” (1962). Paradigm shifts as prior replacement in science.
Mark Granovetter — “The Strength of Weak Ties” (1973). Network structure’s effect on information propagation.
Daron Acemoglu & James Robinson — “Why Nations Fail” (2012). Institutional path-dependence as prior lock-in.
Robin Dunbar — “How Many Friends Does One Person Need?” (2010). Cognitive limits on social network size and inference capacity.



