Overview

Recent years have seen a concerning trend towards normalizing decisionmaking by Large Language Models (LLMs), including in the adoption of legislation, the writing of judicial opinions and the routine administration of the rule of law. AI agents acting on behalf of human principals are supposed to lead us into a new age of productivity and convenience. The eloquence of AI-generated text and the narrative of super-human intelligence invite us to trust these systems more than we have trusted any human or algorithm ever before.

Frank Herbert captured the essence of this looming danger in Dune (Herbert [1965] 1968, 23):

Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.

It is difficult to know whether a machine is actually intelligent because of problems with construct validity, plagiarism, reproducibility and transferability in AI benchmarks. Most people will either have to personally evaluate the usefulness of AI tools against the benchmark of their own lived experience or be forced to trust an expert.

To explain this conundrum I propose the Intelligent AI Coin Thought Experiment and discuss four objections: the restriction of agents to low-value decisions, making AI decisionmakers open source, adding a human-in-the-loop and the general limits of trust in human agents.

The Rise of LLM-based Decisionmaking and Agentic AI

The rise of Large Language Models (LLMs) has been accompanied by a concerning trend: more and more people are expressing an interest in allowing LLMs (or, more often, a service built on an LLM, e.g. ChatGPT) to make decisions for them. On the public internet this often takes the form of grandstanding social media posts that begin with “I asked ChatGPT and this is what it told me to do!”.1

This trend is related to the theme of agentic artificial intelligence (“agentic AI”), which is usually taken to mean some form of captive, docile and benevolent AI that takes general instructions, performs tasks and makes minor decisions on behalf of a human principal. Sort of like having a servile fantasy employee willing to go a dozen extra miles — just without the irritating need to pay a human being a living wage. We’ve seen this trope many, many times in science fiction. Justified or not, a large number of people believe that the time for science fiction to become reality is now.

The reason I am concerned about LLM-based decisionmaking and agentic AI is not that someone might buy the wrong pet food or that people could be ripped off when the AI servant falls for a scam on Amazon Marketplace. This type of problem may become a serious consumer protection issue if agentic AI takes off in the mainstream, but it will remain manageable. The true danger lies in the damage done to entire social systems, particularly through democratic backsliding and an erosion of the rule of law.

Unfortunately, LLM-based decisionmaking is already being trialed for the creation of laws, the writing of judicial opinions and a lot of routine administrative work that bores people to tears but is in fact fundamental to the functioning of the rule of law. The trend towards AI decisionmaking normalizes the delegation of complex decisions to opaque systems. Systems that are ripe targets for capture by hostile interests and actors who prefer dictatorship over democracy and promote nepotism over the rule of law.

Freedom through Machine Intelligence?

Of course, science fiction was there first. Frank Herbert explained the problem in Dune (Herbert [1965] 1968, 23):

Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.

Consider Twitter, now known as “X”. Twitter began as a passable attempt at creating a digital town square. Now, in 2025, Twitter/X has lost most of its value as a public space and has been transformed into a bullying tool for an out-of-control billionaire hell-bent on riding the rising tide of fascism, destroying his enemies and enriching himself even further. The algorithm now serves the purposes of one man, at the expense of many.

Imagine what might happen if we delegate not just control of information flows, but entire decisionmaking chains to black box systems that can be captured by anyone with a few billion and a vendetta. Perhaps these decisionmaking systems will even be purpose-built to exert covert influence, presenting a more benign image than X to the world. You wouldn’t even need to bribe all those elected representatives with generous campaign donations if you can control the computer system that tells them what the “best” course of action is.

This has obviously happened before to social and technical infrastructure of public importance. History knows of more than just a few outrageous corruption scandals, regulatory captures and democracies damaged beyond repair. Nor is predatory algorithmic decisionmaking an entirely new phenomenon. We’ve seen flash crashes caused by nefarious trading algorithms, automated sentencing decisions dripping with hidden racism and lives destroyed by false allegations of welfare fraud based on discriminatory algorithms.

Then again, corruption and democratic backsliding in the age of LLMs are likely to occur on an entirely different level of opacity and scale. This is because the eloquence of AI-generated text and the narrative of super-human intelligence invite us to trust these systems more than we have trusted any human or algorithm ever before.

And I am desperately afraid that people will extend that trust.

How Would You Even Know If Your AI Is Intelligent?

There has always been intense debate about what “intelligence” means and whether our newest computer algorithms have achieved some form of intelligence roughly comparable to that of living creatures. Over the past few years alone, LLMs and their derivative services have been compared to everything from cats and dogs to toddlers to “infinite interns”.2 Recent claims have even floated “PhD-level” intelligence for systems that take time to “think” about their answer.

And of course there’s always that type of influencer who proclaims the advent of super-intelligence with each major product release.

Bless their little influencer hearts.

Note

On Super-Human Intelligence

I am skeptical of “super-human intelligence” claims made by AI evangelists. Specialized tools like machines and computer programs routinely solve certain classes of problems with super-human speed and quality — that is the entire point of making specialized tools.

Your trusty hand-held calculator can do arithmetic (and usually other kinds of math) with a speed and accuracy that cannot be matched by any human being on earth. However, framing this ability in terms of “intelligence” isn’t really helpful in understanding how a calculator works.

The definition of intelligence and our ability to artificially create it are complicated problems, but I’d like to focus on a simpler question:


How would you even know that your AI is intelligent?


I mean you, the regular person without an advanced degree in an AI-related field and without expertise in building or benchmarking models.

With humans this is sort of easy. In the modern age we assume that all humans have a problem-solving mental ability we call “intelligence”, an ability that differs only in degree, not in kind. If a life-form is human, it probably has a mind of its own and can use intelligence by default, barring any serious physical or mental conditions. It’s not that we can easily tell whether people really are intelligent or just clever automata.3 We assume intelligence based on their humanity. Historically this wasn’t an easy development and racism got in the way rather often, but we got there eventually. “Intelligent” is often understood to mean “thinks and solves problems like a human being”.

Where this assumption breaks down is when we try to apply the concept of “intelligence” to non-humans and machines.

Scientists have tried to define intelligence, break it down into individual components and test humans and machines against these expectations to measure their true intellectual ability. Human IQ tests contain a range of verbal, numerical and spatial reasoning tasks. The raw test results for a test-taker are benchmarked against the performance of other humans. AI benchmarks consist of a similar range of reasoning tasks against which machines are tested and then compared to other machines (and humans, in some cases).
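
To make that comparison concrete, here is a minimal sketch, in Python, of what benchmark scoring boils down to in the simplest case. The toy questions and the ask_model stub are my own placeholders, not items or interfaces from any real benchmark.

```python
# A minimal sketch of what benchmark scoring boils down to. The toy questions
# and the ask_model() stub are hypothetical placeholders, not a real benchmark.
from typing import Callable

BENCHMARK = [
    ("If all bloops are razzies and all razzies are lazzies, are all bloops lazzies? (yes/no)", "yes"),
    ("Is 17 a prime number? (yes/no)", "yes"),
    ("Does the word 'rotor' read the same backwards? (yes/no)", "yes"),
]

def score(ask_model: Callable[[str], str]) -> float:
    """Return the fraction of benchmark items the model answers correctly."""
    correct = sum(
        ask_model(question).strip().lower() == answer
        for question, answer in BENCHMARK
    )
    return correct / len(BENCHMARK)

# A "model" that always answers yes aces this toy benchmark, which already
# hints at the construct validity problem discussed below.
print(score(lambda question: "yes"))  # 1.0
```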

Human IQ tests do not measure the full spectrum of human mental ability, as they mainly focus on abstract tasks, not on social and emotional capacity. This is why their construct validity — meaning whether they adequately measure the concept of intelligence — has been called into question.

AI benchmarks face similar problems and some of their own (Reuel et al. 2024):

  • Construct validity is as questionable with AI benchmarks as it is with human IQ tests. What is intelligence? What is reasoning? If we fail to define either for humans, how can we define them for AI? And if we do define them for humans, does this tell us anything about machines?
  • Cheating is a challenge because both the questions and answers for a test are available online somewhere. Often it is not clear whether the tested system actually performed some kind of reasoning or simply plagiarized the answer from its considerable corpus of training data (a toy contamination check is sketched after this list). Cheating invalidates human test performance, so it should also invalidate AI test performance.
  • Reproducibility is a problem related to cheating: if the test and its answers are published, new AI systems will simply ingest the test and plagiarize the answer — reminiscent of a human test-taker bringing their notes to a closed-book exam. However, if the test and the answers are not published, it is very difficult to independently reproduce the test and evaluate the quality of the benchmark.
  • Transferability refers to the gap between test performance and practical performance. If a human performs well on an artificial IQ test, will that human succeed against real problems? If an LLM performs well on an artificial benchmark, will it be able to perform useful tasks in the real world?
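
Since the cheating and reproducibility problems both come down to test material leaking into training data, here is a naive sketch of the kind of check one might run for contamination: look for long word n-grams from a benchmark question that already appear verbatim in the training corpus. The corpus snippet, the question and the n-gram length of eight are made-up values for illustration, not any benchmark's actual methodology.

```python
# A naive illustration of a training-data contamination check: if long word
# n-grams from a benchmark question already appear verbatim in the training
# corpus, the model may be reciting rather than reasoning. The corpus snippet,
# the question and the n-gram length of 8 are made-up values for illustration.
def ngrams(text: str, n: int = 8) -> set:
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def looks_contaminated(question: str, training_corpus: str, n: int = 8) -> bool:
    """True if any n-word sequence from the question appears verbatim in the corpus."""
    return bool(ngrams(question, n) & ngrams(training_corpus, n))

corpus = ("... the answer to the famous river crossing puzzle with the fox, "
          "the goose and the bag of beans is to take the goose across first ...")
question = ("Solve the famous river crossing puzzle with the fox, "
            "the goose and the bag of beans.")
print(looks_contaminated(question, corpus))  # True: the question leaked into the corpus
```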

Personally, I don’t think AI benchmarks are helpful in deciding whether a particular (version of an) LLM is useful, high-quality or not. Most people will not be using a benchmarked, version-controlled model directly, but some kind of service that adds layers and layers of complexity on top of the actual model to modify the user experience. These services may or may not be comparable to the benchmarked model and, even if the services are benchmarked directly, they will usually change often and unpredictably, invalidating the benchmark results.

The Intelligent AI Coin Thought Experiment

Most people aren’t scientists and do not understand, follow or care about AI benchmarks. If ordinary people want to rely on an AI tool they must do one of the following:

  1. Personally evaluate the quality of an AI tool against the benchmark of their lived experience, or
  2. Trust in the word of experts that an AI tool is actually “intelligent” according to some expert definition

To give you a feel for the problem I’d like to propose a thought experiment: the Intelligent AI Coin.

Let us assume you are willing to delegate your decision problems to an external agent. For simplicity’s sake, let us assume that these are classical decision problems with only two answers: “yes” and “no”.

You come to me for help in selecting a tool to solve your problems. I offer you two identical-looking coins that can make yes/no decisions for you:

  1. A traditional fair coin
  2. A super-intelligent AI-powered coin

The Traditional Fair Coin is one of those legendary objects found in every statistics textbook. “Fair” means that if you flip the coin an infinite number of times, then in the long run it will show heads 50% of the time and tails 50% of the time. Using a fair coin will quickly get you a “yes” or “no” answer to your decision problem. The result will be completely random and take into account zero information from your actual problem. But you do get a result and it’s a time-honored method of deciding if you can’t do it yourself.

The Intelligent AI Coin is powered by a state-of-the-art, advanced neural network decisionmaking algorithm that understands your problem with human compassion, reasons about it with PhD-level intelligence, possesses a perfect moral compass, can access the entirety of human knowledge, and is friendly, benevolent, docile and your perfect servant. Flip it, and it will give you the perfect “yes” or “no” to your question and bring you all the success in life you ever wished for.

Note
I honestly doubt that such an AI exists today or will exist in my lifetime, but this is a thought experiment, so let’s pretend that it could.

There is one catch: I did mention that the coins look identical. You have no access to their decisionmaking internals. You cannot question the coins about their nature. You can flip them as often as you like in response to decision problems.

How do you tell the difference between the two coins?

Info
This thought experiment is closely related to the Chinese room thought experiment proposed by Searle (1980). I believe a coin is closer to the lived reality of most people, though. See also the famous imitation game proposed by Turing (1950), which has been popularized as the Turing Test.

Option 1: Evaluate the Coins Yourself

You could try to evaluate the coins yourself. Test them against your own problems. If the results match your intuition or seem useful, you should be able to rely on the tested coin as a tool. If you become extremely successful, good-looking, wealthy and sought-after, it would seem that your coin is the advanced AI coin. If the results just don’t seem to make a difference in your life, you are probably using the traditional coin.

Unfortunately this personal evaluation strategy isn’t feasible for most high-stakes decision problems.

First, there is usually a significant time lag between decision and result. The consequences of important decisions tend to materialize months or years down the line. Examples might be starting a war, founding a startup, investing in the stock market, accepting a new job, adopting new laws, modifying administrative practice and so on.

Second, with most interesting real-world problems the definition of success is itself problematic. For example, if someone may have committed a crime and is judged guilty, is this a case of successful deterrence or a failure of due process? Variation: if someone committed a crime — beyond reasonable doubt — and is given a decades-long prison sentence, is this a success of retributive justice or a failure to reintegrate the offender into society? You might be able to tell success from failure if you spend a lot of time understanding each individual case, weighing the evidence and considering the advantages and disadvantages of each action in light of your internal moral compass and the values of your society.

If you just flip a coin you would never know the difference.

Third, frequency matters. Sometimes the decision problem only occurs once or is very rare (war, marriage, emigration, choice of education, other existential decisions). Sometimes all the relevant decision problems are similar, but subtly unique (educating students, criminal trials). Most often you will face a handful of related high-stakes problems (perhaps fewer than ten), but there will be many unique sets of high-stakes problems. That is too few of each kind to tell whether the difference between the coins is a) due to accident or b) due to the true mechanisms powering the tool.

In statistical terms, in the case of important decision problems your sample size is often too small.
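
To put rough numbers on this, here is a small simulation of the sample size problem. It assumes, purely for illustration, an “intelligent” coin that gets 80% of decisions right and a conventional 5% significance threshold; none of these figures come from real data.

```python
# A small simulation of the sample size problem: how often can you tell a fair
# coin apart from a genuinely better decisionmaker? The 80% accuracy of the
# "intelligent" coin, the decision counts and the 5% threshold are assumptions
# made up purely for illustration.
import random
from math import comb

def p_value_at_least(k: int, n: int) -> float:
    """Probability that a fair coin gets k or more out of n decisions right."""
    return sum(comb(n, i) * 0.5**n for i in range(k, n + 1))

def detection_rate(n_decisions: int, true_accuracy: float = 0.8,
                   alpha: float = 0.05, trials: int = 10_000) -> float:
    """Share of simulated 'lifetimes' in which the intelligent coin looks
    statistically distinguishable from pure chance."""
    detected = 0
    for _ in range(trials):
        hits = sum(random.random() < true_accuracy for _ in range(n_decisions))
        if p_value_at_least(hits, n_decisions) < alpha:
            detected += 1
    return detected / trials

for n in (5, 10, 50):
    print(f"{n:>3} decisions: distinguishable in ~{detection_rate(n):.0%} of lifetimes")
```

In this toy setup, five or ten high-stakes decisions leave the two coins indistinguishable most of the time; only with dozens of comparable decisions, which few people ever face, does the difference become reliably visible.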

Info
Why is it a problem if your sample size is too small? Check out my open-source tutorial on representativeness, samples and populations for some intuition on sample size. It doesn’t deal with causal inference and statistical comparisons between experimental conditions, but it should be helpful to get a feel for sample sizes.

Option 2: Trust the Expert

The alternative is to trust an expert who tells you which coin is the all-powerful AI coin.

And this is the crux with modern AI tools: if personal evaluation is infeasible, you have to take the word of an (alleged) expert that a) their computer program actually does the super-human intellectual work it claims to do and b) it will act in your best interests. In other words, that the coin doesn’t just give you a random decision at best or a malicious decision at worst.

Salespeople tell you similar things when they sell miracle cures or promise you a place in heaven for 999.99 USD. And it still works, for much the same reason that astrology, horoscopes, tarot cards, fortune cookies and climate denial remain popular: people want to believe they can access magic, even if there is no rational empirical basis for that belief.

At the end of the day, if you delegate an entire decisionmaking chain, it all comes down to whether you trust an unknown person to make future choices for you, your family, friends, colleagues and anyone else you know or don’t know. At the lower end of consequence these are personal choices; at the higher end they are decisions with life-altering impact for millions of people, such as the passage of laws or the writing of judicial decisions by apex courts.

We spent a lot of time developing democracy and the rule of law to create functional trust arrangements between unknown people. I do not think that kind of trust in computer systems is warranted, if it ever was. The speed at which global tech leaders are willing to bend the knee before autocrats at home and abroad shows that any kind of privately held computerized decisionmaking system will be a future liability to democracy and the rule of law.

Some Objections

Objection: Restrict AI to Low-Value Decisions

Why not stick to low-value decisions? If an AI agent is potentially untrustworthy, just use it for less important tasks or decisions where failure costs little, no?

Honestly, that would be great. Stick to the small decisions. Which restaurant to go to. Which movie to watch. Whether you should have hot chocolate or coffee today. Leave it at that and I’ll get out of your hair.

Sadly, I don’t think that people will leave it at that. Soon we’ll have to fix a lot of important problems left behind by AI agents that were let loose without proper consideration of the consequences. We’ll see health insurance claims denied, tax fraud investigations launched, social security benefits revoked and all sorts of other harmful actions taken.

All in the name of artificial super-intelligence.

Objection: Make AI Open Source

We could remove the power abuse and corruption angle if every use of AI tools in public decisionmaking were open source, with training data, source code and model weights open to public scrutiny, plus development, testing and production systems controlled by public officials. This seems ideal at first glance, but it is not a practical solution.

First, taxpayer-funded institutions almost never have the money to pay market rates for the tech talent required to run government-scale operations at a quality comparable to the top players in the private tech world. The legal profession has spent centuries (successfully) developing its own civic spirit to attract enough top lawyers to run a country, but even then competition with the private sector remains fierce. The tech sector can look back on half a century of stellar open source work, but its civic-minded sub-communities have not yet reached the critical mass to manage an entire State’s worth of technology as civil servants.

Second, delegating decisionmaking to AI agents controlled by experts — civil servants — is a threat to the separation of powers. Technology will almost always be centrally managed by the executive branch. If the legislative and judicial branches of government rely on decisionmaking agents controlled by the executive, the executive controls the other branches of government. We have seen something similar playing out with expert bills drafted and adopted by the executive, which then sail through parliament without much debate because there literally isn’t a single MP left who understands the specialized technical content of the law.

Objection: Add a Human to the Loop

The theme of human-in-the-loop has been all the rage since the phenomenon of “AI hallucinations” gained mainstream attention. Fire up your AI, let it do all the hard work and then have a human double-check the results. You get the best of both worlds: AI speed and human quality. Why not have an AI decisionmaker and let a human double-check the result?

This can’t work. If the human needs to do extensive thinking to arrive at the right decision for a problem and the human hasn’t done the required amount of thinking, then the human cannot evaluate the AI’s decision with the expected quality.

If the human does all the required thinking to solve the problem, what did you need the AI for in the first place?

Also, humans are lazy. If the computer says no, the human says no. Even if the decision is patently silly. This has been a challenging problem before AI and it will continue to be a problem with AI and after AI.

Objection: Human Agents Are Untrustworthy

Trust in human agents is also an uncertain thing. Haven’t we all been hoodwinked by a car salesperson, insurance rep, travel agent, lawyer and so on? Since we cannot fully trust human agents, why is lack of trust in AI agents problematic?

One reason AI agents are fundamentally different from human agents is that control over computer systems is much more centralized than control over humans. The closest analogy to the degree of centralized control that computer systems exhibit is the totalitarian State; well-known examples are Nazi Germany, Stalinist Russia and Maoist China. Instead of years of organizing, militancy and purging of enemies, a takeover of an agentic AI system could occur within minutes.

Another fundamental difference is speed. Where a compromised government needs years to make all its civil servants toe the line, a compromised AI system can do so in seconds to minutes. Even a non-compromised but faulty system can make decisions at a rate no corps of human civil servants could match, turning every mistake into a tragedy.

A third difference is that we have developed tools over the past millennia to mitigate the untrustworthiness of humans. These include pro-social evolutionary adjustments, religion, ethics, the rule of law and interpersonal trust policies. With AI agents we are still at the very beginning of these developments. Even if they had levels of (un)trustworthiness similar to humans, we do not yet have adequate mitigations in place to counter this lack of faith.

Conclusion

I began writing my own conclusion to this essay, but Herbert ([1965] 1968, 23) really said everything that needs to be said:

Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.

References

Herbert, Frank. (1965) 1968. Dune. Hodder & Stoughton.

Reuel, Anka, Amelia Hardy, Chandler Smith, Max Lamparth, Malcolm Hardy, and Mykel J. Kochenderfer. 2024. “BetterBench: Assessing AI Benchmarks, Uncovering Issues, and Establishing Best Practices.” In Advances in Neural Information Processing Systems (NeurIPS). https://betterbench.stanford.edu.

Searle, John R. 1980. “Minds, Brains, and Programs.” Behavioral and Brain Sciences 3 (3): 417–24.

Turing, Alan M. 1950. “Computing Machinery and Intelligence.” Mind 59 (236): 433–60.


  1. For some reason people are quite proud of being told what to do by a random computer program. I find this confusing, to be honest. ↩︎

  2. It’s like people have completely forgotten that internships are supposed to be an educational experience. It has become tragically common to see interns only as a disposable source of very cheap labor. ↩︎

  3. And there are enough philosophers who consider free will an illusion and all humans automata by default. ↩︎