People keep pairing neuroscience with artificial intelligence as if the two are already close cousins. Sometimes they are. Often they talk past each other. Brains move through time under scarcity—of energy, of certainty, of second chances. Most modern AI moves through datasets outside of time—unbounded replay, synthetic resets, a thermostat world. The difference is not a quirk. It shapes what these systems can know, what they can care about, what kinds of error they can survive. If reality at base is informational—pattern and constraint more than stuff—then the bridge between biology and code should be built where information lives: in prediction, memory, and social transmission. We can stop asking whether neural nets “are like” brains. Better question: which mismatches matter, and which mismatches are productive?
Prediction, Error, and the Texture of Time
The brain is a prediction machine, but not the tidy kind found in keynote slides. It must guess now about what will matter in ten minutes; stake a metabolic budget on that guess; revise while the world is moving. Predictive coding and active inference frame cortical hierarchies as generators of expected sensation, with prediction errors flowing upward wherever input and expectation diverge. Dopamine neurons signal reward prediction error—core to reinforcement learning—yet their timing and context sensitivity read more like a broadcast that gates plasticity than a single scalar reward line. One consequence: learning in brains is conditional on state, arousal, and social situation; the “same” event can write different weight changes depending on what else the organism is doing.
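A minimal sketch of that idea, entirely illustrative (the gating variable and numbers are mine, not drawn from any specific model): a temporal-difference value update in which a scalar “neuromodulatory” gate scales how much of the prediction error is allowed to change the weights, so the same event leaves a different trace depending on internal state.

```python
import numpy as np

def td_update(V, s, s_next, reward, gate, alpha=0.1, gamma=0.95):
    """One gated temporal-difference update on a tabular value function V."""
    delta = reward + gamma * V[s_next] - V[s]   # reward prediction error
    V[s] += alpha * gate * delta                # gate ~ arousal/salience in [0, 1]
    return delta

# Identical transition and reward, different internal state -> different learning.
V_calm, V_aroused = np.zeros(10), np.zeros(10)
td_update(V_calm,    s=3, s_next=4, reward=1.0, gate=0.1)
td_update(V_aroused, s=3, s_next=4, reward=1.0, gate=1.0)
print(V_calm[3], V_aroused[3])   # same event, very different weight change
```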
Modern language models master next-token prediction by ingesting enormous text corpora. Astonishing fluency, yes. But their sense of time is a window, not a life. They lack the brain’s slow cycles—sleep, consolidation, replay. The hippocampus replays trajectories during rest and sleep, sometimes in reverse or compressed form, consolidating relational structure. Experience replay buffers in deep RL borrow this trick in miniature: sample past transitions out of order to decorrelate updates and stabilize gradients. Useful. Still not the same thing as a sleeping organism weaving context across days, extracting schemas that can be carried into novel tasks with almost no data.
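Here is what that miniature version typically amounts to—a toy replay buffer sketch with illustrative names and sizes, not the API of any particular library:

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions and hands them back out of order."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)   # oldest transitions fall off the end

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Random sampling breaks the temporal ordering of experience --
        # a crude, miniature analogue of hippocampal replay.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```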
Consider a rat learning a maze after a single rewarded run; or a human reframing a problem the morning after a dreamlike rehearsal they don’t remember. Brains seem tuned for few-shot generalization under ecological noise. Some AI systems move in that direction—world-model RL (Dreamer, MuZero) where the agent learns to predict latent dynamics and plan inside them; meta-learning that trains models to learn quickly from small updates. Yet the deeper mismatch remains: in brains, prediction is situated. Control signals (noradrenaline, acetylcholine) reshape which errors count. Oscillations synchronize local circuits to share timing budgets. Time is not uniform. As Carlo Rovelli reminds us, time may be local; brains embrace that locality. When a model treats all tokens as equal unless attention says otherwise, it approximates this economy. But the sparsity that matters in organisms is constrained by survival and culture, not just gradient flow.
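To make “plan inside latent dynamics” concrete, a hedged sketch loosely in the spirit of world-model agents—the matrices and functions below are stand-ins, not the architecture of Dreamer or MuZero: candidate action sequences are rolled forward in latent space and scored, so the agent can test futures it never has to live through.

```python
import numpy as np

rng = np.random.default_rng(0)
A, D = 4, 8                               # number of actions, latent dimension
W_dyn = rng.normal(size=(A, D, D)) * 0.1  # stand-in for a learned dynamics model
w_rew = rng.normal(size=D)                # stand-in for a learned reward head

def imagine(z, actions):
    """Roll a latent state forward under a candidate action sequence."""
    total = 0.0
    for a in actions:
        z = np.tanh(W_dyn[a] @ z)         # predicted next latent state
        total += w_rew @ z                # predicted reward, never actually executed
    return total

z0 = rng.normal(size=D)
candidates = [rng.integers(0, A, size=5) for _ in range(64)]
best = max(candidates, key=lambda seq: imagine(z0, seq))   # plan = best imagined future
```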
There’s an odd side-effect. LLMs can sound confident in patterns no one ever lived. Combinatorial eloquence, thin grounding. Not a fault, an artifact. Without embodied stakes and layered consolidation, prediction flattens. That flatness is exactly what makes them so flexible—and so alien.
Memory Has a Politics: From Biological Moral Memory to Governance Patches
Brains inherit moral memory the slow way: via culture. Stories, ritual, sanctioned transgressions, apologies that work only because others remember the promise. Joseph Henrich and others have described how norms compile over generations into techniques that individual brains could never discover from scratch. Neurologically, this means long-horizon regularities become priors—value-laden expectations that bias learning before any explicit calculus. Hormonal context, group identity, shame, pride: they all tune plasticity. We do not add an “ethics module” on top of intelligence; the learning rules are value-shaped all the way down.
Contrast this with today’s alignment practice. Train a powerful model on the open internet and curated corpora; then apply reinforcement learning from human feedback as a corrective overlay. A form of “moral patching.” Better than nothing, certainly. But note what’s missing. There is no intergenerational accumulation of tested norms—no equivalent of grandmothers, courts, liturgy, and trade guilds—baked into the weight space through slow, distributed, loss-driven compromise. The model learns to satisfy raters at the end of training. It does not grow up inside institutions that can sanction, forgive, and, crucially, forget.
In human groups, forgetting is part of moral computation. Records fade; reputations can be repaired; a person can become someone new. That decay is a feature, not always a bug. In machine training, we snapshot datasets and freeze them as canonical memory. The past stays present unless we scramble to unlearn—privacy regulations are trying to force this, awkwardly. Organizations tend to prefer audit-friendly certainty: show the policy, show the red team results, mark the checkbox. A living moral memory looks terrible on a quarterly slide: slow, messy, contested. Yet without it, models will keep reflecting the short-term incentives of their curators. Corporate governance can keep adding filters and safety layers, but filters are brittle when incentives shift. Open science helps—letting more eyes inspect and contest the data, the objectives, the evaluators. Still, the hard problem remains: how to encode institutional memory into models that do not share our embodied costs for getting a norm wrong.
Tentative steps exist. Train with community deliberation rather than single-rater feedback; source data from contexts where reciprocity norms are explicit; model forgetting via scheduled unlearning; encourage models to expose value conflicts instead of smoothing them away. These are design choices, not afterthoughts. They admit we cannot staple “ethics” onto intelligence; we have to shape the update rules so that value lives where learning happens.
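One of those choices—scheduled unlearning—can be sketched in a few lines. This is a toy illustration (none of the names come from an existing system): each stored record carries a timestamp, its influence decays on a schedule, and records below a threshold are dropped outright, a crude stand-in for the reputational forgetting human groups rely on.

```python
import time

HALF_LIFE_DAYS = 90.0   # illustrative forgetting half-life

class DecayingMemory:
    def __init__(self):
        self.records = []   # list of (text, stored_at) pairs

    def add(self, text, now=None):
        self.records.append((text, now or time.time()))

    def weight(self, stored_at, now=None):
        age_days = ((now or time.time()) - stored_at) / 86_400
        return 0.5 ** (age_days / HALF_LIFE_DAYS)   # exponential forgetting curve

    def sweep(self, threshold=0.05, now=None):
        # Scheduled pass: anything whose weight has decayed past the threshold
        # is removed outright rather than merely down-weighted.
        self.records = [(t, s) for (t, s) in self.records
                        if self.weight(s, now) >= threshold]
```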
Information as Substrate: What Brains Suggest About Building and Reading Machines
If reality is patterns and constraints, the brain looks like a device for negotiating constraints economically. Sparse coding—only a few neurons active for a given percept—saves energy and clarifies signal; attention arbitrates limited bandwidth; dendrites compute locally, reducing the need to send everything everywhere. Oscillations carve shared time slots so distant regions can talk without shouting. Each is an information rule dressed as biology. AI uses cousins of these principles. Transformers deploy soft attention to sift context; Mixture-of-Experts routes tokens sparsely; neuromorphic chips attempt spiking sparsity at the hardware level. The message underneath: intelligence is less about depth of stacks and more about cultivating the right scarcities.
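The simplest version of that scarcity is a top-k constraint on activations—only a handful of units are allowed to respond to any given input. A minimal sketch, with illustrative parameters; the point is the principle, not a recipe:

```python
import numpy as np

def topk_sparse(activations, k=8):
    """Zero out all but the k largest activations in each input row."""
    a = np.asarray(activations, dtype=float)
    idx = np.argpartition(a, -k, axis=-1)[..., -k:]   # indices of the k winners
    mask = np.zeros_like(a)
    np.put_along_axis(mask, idx, 1.0, axis=-1)
    return a * mask                                    # sparse code: most units stay silent

x = np.random.randn(2, 256)      # two inputs, 256 "neurons"
code = topk_sparse(x, k=8)       # each row now has at most 8 active units
```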
Learning rules matter. Backpropagation is effective but biologically awkward. Three-factor rules—pre-synaptic activity, post-synaptic activity, plus a modulatory signal—look far more like the loops between cortex and its neuromodulatory systems. When we frame dopamine as permission to learn rather than mere reward, we approach algorithms that update only under certain global contexts. That could cut catastrophic forgetting in continual learning and make agents more sample-efficient in non-stationary worlds. Memory systems likewise want structure: hippocampus for episodes; cortex for slow statistical compression; thalamus for routing. In machines, external memory (key-value stores, retrieval-augmented models) plays the episodic role; deep nets compress. The interesting frontier is schema learning—extracting abstract relations that travel far beyond the source domain. Not just storing more. Storing the right gaps.
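A hedged sketch of the three-factor idea (the variable names are mine): the weight change requires coincident pre- and post-synaptic activity and a global modulatory signal. With the modulator at zero, correlated activity leaves no trace—dopamine as permission to learn, not just as reward.

```python
import numpy as np

def three_factor_update(W, pre, post, modulator, lr=0.01):
    """dW = lr * modulator * (post outer pre); the modulator gates the Hebbian term."""
    return W + lr * modulator * np.outer(post, pre)

rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(16, 32))
pre, post = rng.random(32), rng.random(16)

W_ungated = three_factor_update(W, pre, post, modulator=0.0)  # nothing is written
W_gated   = three_factor_update(W, pre, post, modulator=1.0)  # same activity, real change
```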
Interpretability is sometimes framed as peeling back a sealed box. Brains suggest a different angle: meaning is relational. A concept lives in contrasts, in the tasks that select it, in the modulators that decide when it matters. The “self” might be better described as a temporary compression that keeps credit assignment tractable: this caused that, I did it, next time avoid it. Models without bodies still need something like that if they are to manage extended plans without dissolving into token-level cleverness. Causal discovery tools, structured world models, and agent architectures that loop through action and consequence begin to approximate this relational meaning. They are less eloquent than giant LLMs, but they ask the right questions: what changes what, under which constraints.
Which brings us back to the wager sitting under the phrase “neuroscience and artificial intelligence.” If the substrate is information, then “simulation” is not silicon pretending to be atoms; it is one set of constraints standing in for another. Brains run local simulations constantly—motor imagery, counterfactuals, dreams. AI systems that can host multiple, testable, low-cost counterfactuals inside their loop will beat those that only extend the present sequence. But the win condition isn’t accuracy alone. It’s consequence-sensitivity. A model that knows how errors propagate through a social graph—who pays, who repairs, who remembers—behaves differently than a model that only knows perplexity. That sensitivity is learnable. Slowly. With friction.
I don’t think the goal is to build a cortex clone. The point is to respect the invariants biology exposes: learn under constraint; keep time heterogeneous; bind prediction to control; build moral memory that persists beyond any single training run. Ignore those, and we’ll keep shipping systems that are brilliant in sandboxes, brittle in streets, and strangely amnesic about the very patterns that make intelligence survivable.