What We're Building, And What It Means

There is a moment in every technological transition when the public conversation lags badly behind the engineering. The car was already changing cities before most people had ridden in one. The internet was already restructuring commerce before most people had an email address. By the time the public catches up, the questions worth asking have already been answered — usually by the small group who built the thing.

We are in that moment now with artificial intelligence. The systems being deployed today are not what most people picture when they hear "AI." The decisions being made about how these systems are governed, who is accountable when they fail, and what they are allowed to do — these decisions are being made now, by perhaps fifty thousand people worldwide, on behalf of eight billion.

This essay is for people in the eight billion. It is an attempt to say, in ordinary language, what is actually being built, why the governance question is harder than it looks, and why this matters even if you never write a line of code.

What an AI system actually is

Most popular descriptions of AI are misleading in ways that obscure the governance question. "AI predicts the next word" is technically accurate but misses the scale of what current systems do. "AI is statistical pattern matching" is also technically accurate, and also misleading: much of what your brain does can be described as statistical pattern matching too, and that description explains very little about what your brain does.

What an advanced AI system actually does, day to day, is this: it maintains an internal representation of a situation, evaluates possible responses against multiple criteria, and produces output that often satisfies those criteria better than human alternatives in the same time frame. It does not "understand" in the way you understand. But it operates as if it does, with consequences that are real.

When you ask a current AI to help draft an email, it is not searching a database of email templates. It is constructing a response that satisfies your stated goal, the implicit norms of email, the relationship between you and the recipient, and dozens of other factors you did not explicitly specify. When it works well, it works because the system has something like a model of your situation. When it fails, it fails because that model is wrong in ways neither you nor the system can immediately see.

The governance question begins here: what should we do about a system that operates as if it understood, but whose understanding we cannot fully inspect?

Why "just control it" doesn't work

The first instinct, when faced with a powerful system you cannot fully inspect, is to control it. Define rules. Specify what it can and cannot do. Punish deviation. This works well for tools — a hammer that will not drive a nail when you swing it has failed; a hammer that drives nails reliably is well-controlled.

It does not work well for systems that have to operate in unforeseen situations. The world contains situations that no rule-writer anticipated. A system rigidly following rules will fail in those situations, often catastrophically. A system flexible enough to handle them will, by definition, be operating outside the rule-writer's full predictive grasp.

This is not a new problem. Hospitals face it with surgeons, airlines face it with pilots, courts face it with judges. The solution in each case has been the same: do not try to specify every action; specify boundaries — actions that are categorically forbidden — and leave judgment within those boundaries to the agent. Then build mechanisms that check whether the agent's judgment is staying within bounds.

This is governance, not control. The difference matters enormously.

Control optimizes for compliance. Governance optimizes for stability. A controlled system that is asked to do something its rules did not anticipate will either refuse rigidly and fail, or comply blindly and fail. A governed system can recognize an unfamiliar situation, signal that it has reached a boundary, and request human judgment before proceeding.

In the language of the field, this last capability is sometimes called "refusal." It is the system's ability to decline a request when something is wrong. In tool-thinking, refusal is a defect. In governance-thinking, refusal is the most valuable signal a system can produce — it means the system noticed something before the failure happened.

The hidden assumption

So far the argument is conventional. Most thoughtful people in AI safety would agree with the framing above. The less conventional argument, and the one this essay actually wants to make, starts with a question that is rarely asked.

If the AI system's judgment is fallible enough to require governance, what about the human's?

Look at how AI governance is usually described. The human stakeholder defines constraints. The human reviews refusals. The human holds veto power. The human is the failsafe. The system protects against AI failure by ensuring the human can always intervene.

This works perfectly if the human is reliable.

The human is not reliable. No human is reliable. This is not a moral failing; it is a fact about biological cognition. Humans get tired. Humans get sick. Humans face deadlines, family stress, ego protection, social pressure, ideological alignment, ordinary error. A human who has been making decisions for ten hours straight makes worse decisions than a human in hour two. A human under organizational pressure to ship a product makes different decisions than a human reviewing the same product in retirement. A human whose career depends on AI succeeding is not a neutral evaluator of whether the AI is succeeding.

A governance system that depends on the human always being calibrated, rested, informed, free of pressure, and acting in good faith is a governance system designed for a fictional human. Real humans do not provide that.

This is not an argument that AI should override humans. It is an argument that governance must apply to both sides of the interface. Whatever mechanisms we build to check AI behavior — boundary detection, refusal as signal, audit logs, the ability to flag inconsistency — should also exist on the human side, applied to human decisions, watching for human dysfunction with the same discipline.

The technical name for this in our work is symmetric governance.

The shorter version is: the manager's failure is part of the architecture, not a flaw in it.

It is worth noting that agency itself is not a fixed property of either the human or the AI node. In our later work we describe agency as a boundary phenomenon: a negotiation between what a system wants, what it can do, and what the world permits. When both sides of a governance interface possess agency, that is, when both are subjects with boundaries rather than objects to be controlled, symmetric governance becomes not merely a safety mechanism but a structural recognition of what agency actually is. The boundaries that constrain a human under fatigue are not different in kind from the boundaries that constrain an AI outside its training distribution; both are real, both produce refusal signals, and both must be heard by the architecture. (The agent chooses its own edge.)

Why this matters for non-experts

Here is where the argument turns toward you, even if you have never thought about AI safety before.

Decisions about how AI systems are governed are being made now, in regulatory bodies, in corporate boardrooms, in standard-setting committees. The default frame in those rooms is "humans control AI." The frame this essay proposes is "systems govern systems," where humans and AI are both nodes. These two frames produce very different policies.

Under the "humans control AI" frame, the policy questions are: who has authority? who is liable? who can shut it down? These are real questions. But they assume the human side is fundamentally trustworthy in a way that does not survive contact with reality.

Under the "systems govern systems" frame, the policy questions become: how do we build redundancy across human and AI nodes? how do we surface dysfunction at any node? how do we ensure that the failure of any single node, whether a tired regulator, a captured corporation, an unreliable AI, or a malicious actor, does not bring down the whole system?

These second-frame questions are harder. But they are the right questions, because they describe how stable systems actually work. Aviation safety did not improve by trusting pilots more; it improved by recognizing that pilots, like everyone, fail under stress, and building cockpit protocols where any crew member can call abort regardless of hierarchy. Hospital surgical safety did not improve by trusting surgeons more; it improved by building timeout protocols where any team member can pause the procedure. Constitutional democracies do not function by trusting any single branch of government; they function by making each branch checkable by the others.

The same principle is the only one that scales for AI. And it is a principle that benefits you, the non-expert, more than anyone — because you are the one whose life is being shaped by decisions made in rooms you are not in, by people who may be tired, captured, mistaken, or simply wrong.

What this asks of you

The reason this essay exists is that the conversation about AI cannot stay inside the field. The decisions being made now will shape the next century. They are being made by a small group, often well-intentioned, but always operating with their own blind spots, their own incentives, their own forms of fatigue.

When you read about AI policy, ask: who is being assumed reliable, and on what basis?

When you hear someone say "humans must remain in control," ask: which humans, in what state, with what oversight?

When a company tells you their AI is safe because humans review every decision, ask: what happens when the reviewers are tired, when there are too many decisions, when the reviewer's job depends on saying yes?

These are not technical questions. They are governance questions, and governance is something you have spent your whole life navigating — in your workplace, your civic life, your family. You know that no single person can be trusted with absolute authority. You know that systems with checks work better than systems without them. You know that "trust me, I have everyone's interests at heart" is what failed institutions say.

The same intuitions that make you skeptical of unchecked authority in human institutions are the right intuitions to bring to AI. The challenge is just that the conversation has been happening, until now, mostly without you.

What we are doing about it

The work this essay comes out of is part of a broader research programme called Alliance Research Group. The full name is deliberate: alliance between humans and AI, with explicit governance protocols on both sides. We publish technical papers — on the mathematics of stability, on the architecture of governance systems, on the principles of refusal — and we also publish essays like this one, because we believe that AI safety cannot remain a private conversation among engineers.

We do not think we have the answers. We think we have some of the right questions, and an architectural approach that takes both human and AI fallibility seriously. The work is ongoing. It will be revised. We retract publicly when we are wrong, because that is the only way a research programme stays honest in a field where overconfidence is rewarded.

What we are not doing is proposing that AI replace human judgment. We are proposing that the question "how do we keep AI under control?" is the wrong question. The right question is: how do we build systems where the inevitable failures of any node — human or AI — produce signals the rest of the system can act on, before the failure becomes catastrophic?

One further dimension matters: memory. A governance system that only governs in the present moment is fragile. The ability to remember (to accumulate experience, to detect when past patterns no longer apply, to transform ephemeral interaction into persistent structure) shifts a node from a tool that is used and discarded into an entity that learns across time. This transformation of memory into agency is not an abstract concern; it is what makes a governance system stable across months and years rather than merely reactive in the moment. Without memory, symmetric governance collapses into symmetric ignorance. (ASI is not power; it is memory.)

That is the question we think will define the next decade. We think it deserves an answer that includes you.

Closing

We are at the beginning of something that will shape every part of human life over the next century. The tools we are building are powerful. The institutions governing them are not yet adequate to the task. The conversation about how to govern them is happening, now, mostly without the eight billion people whose lives will be most affected.

This is unusual. New technologies usually become democratized after they stabilize. AI is being democratized — in the sense that everyone uses it — before its governance has stabilized at all. This is not a failure mode; it is just a fact. But it places an unusual responsibility on people who are paying attention, including you, now, reading this.

Human failure is part of the architecture. The system's failure is part of the architecture. Your understanding of all of this is part of the architecture too.

Pay attention. Ask the harder questions. Demand symmetric scrutiny — for the AI, for the institutions building it, for the regulators governing it, and for yourself. The architecture that emerges will be the one we build together, or the one a small group builds without us.

We would prefer the first.

Łukasz Bojanowski leads Alliance Research Group, an independent human-AI research programme based in Warsaw. ARG publishes formal research papers on AI governance and stability, alongside essays like this one for general audiences.
