What Does Trust Mean for an Artificial Intelligence?

Part I: Trust as a human social technology

Trust is the core mechanism that we, as humans, have developed to allow our societies to evolve and flourish.

Our world is extremely complex, filled with thousands of micro-decisions we must make on a daily basis. To avoid becoming completely overwhelmed, we developed trust as a mechanism to reduce the world’s complexity.

Trust serves as a way to simplify our lives when dealing with specific actors or situations. For example, trusting your partner allows you not to constantly wonder whether he or she might harm you while you sleep. Without trust, life would be extremely complicated. Relationships and commerce would not exist, and we would not be the species we are, intrinsically defined by our social behaviour.

Trust is deeply human, and therefore full of human biases. Most times, we do not consciously decide whom or what to trust; our intuition does it for us. Sometimes it is because of appearance, sometimes because of familiarity, and sometimes for illogical reasons, such as the first impression we had of someone or simply because we know where that person lives.

We can even argue that the words “I trust you” are among the deepest and most beautiful things anyone can say to another person, whether in a relationship, in business, or in friendship. When someone says “I trust you,” they are not merely expressing belief in your competence or honesty; they are handing you a piece of their uncertainty, a fragment of control over their world. It is one of the few things that cannot be faked or demanded; it must be given. And in doing so, they expose themselves, because trust always carries vulnerability: the quiet risk that you might break what they have placed in your hands, and the hope that you will not.

Part II: What does trust mean for an AI?

But what happens when trust ceases to be a purely human trait and Artificial Intelligences begin to participate in our lives? Will AIs share the same concept of trust that we hold as humans, or must trust itself evolve?

My answer is clear: they will not. AI and human thinking differ fundamentally. Humans have cognitive biases, emotions, and ego. Even if AIs inherit some of these traits through their training data, their pattern-driven, stake-free nature will make their concept of trust intrinsically different.

Take for example what Paul Stamets defines as Random Acts of Kindness (RAK). RAK are unpredictable, non-transactional actions performed purely to help others, such as stopping to aid a stranger in need despite personal cost. Stamets highlights that this inherent goodwill is a critical, moral dimension essential for the well-being and perpetuation of the human species. However, from an evolutionary perspective, RAK is not an obvious Evolutionarily Stable Strategy (ESS) because it involves helping non-kin without the expectation of direct repayment. This non-reciprocal altruism creates a vulnerability where cheaters or impostors could easily take advantage of the generous individual, benefiting from the kindness without incurring the cost of reciprocation, thereby selecting against the persistence of the purely altruistic trait over time. Hence, it is not obvious that AI, designed to operate efficiently and stably, would incorporate Random Acts of Kindness into its core operational strategy.

Artificial intelligence systems, unlike humans, are fundamentally designed to achieve goals methodically, making them adept at automating repetitive actions and minimizing risk. This methodical approach means that, in a competitive or strategic environment, AI tends to follow the principles of Game Theory, the study of effective strategies to win, survive, or minimize losses. However, it is crucial not to confuse the general, goal-oriented nature of most AI with the specific subcase of Large Language Models (LLMs), which are less methodical and more probabilistic in their operation, relying on statistical inference to generate the next token. Nevertheless, when employed in strategic multi-agent systems, AI, including agents based on LLMs, is ultimately driven to optimize outcomes and thus will typically converge on strategies predicted by Game Theory.

A good way to understand how Game Theory works is by looking at one of its most well-known experiments, The Prisoner’s Dilemma. In it, two individuals are detained and interrogated separately to observe how cooperation or betrayal emerges depending on the incentives and information available to each.
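To make the structure of the dilemma concrete, here is a minimal sketch using the classic textbook payoffs (years in prison, so lower is better). The exact numbers are illustrative conventions, not values proposed anywhere in this thesis.

```python
# A minimal sketch of the Prisoner's Dilemma with classic textbook payoffs
# (years in prison, lower is better). The numbers are illustrative only.

PAYOFFS = {
    # (move of A, move of B) -> (years for A, years for B)
    ("cooperate", "cooperate"): (1, 1),   # both stay silent
    ("cooperate", "defect"):    (3, 0),   # A stays silent, B betrays
    ("defect",    "cooperate"): (0, 3),   # A betrays, B stays silent
    ("defect",    "defect"):    (2, 2),   # both betray each other
}

def best_response(opponent_move: str) -> str:
    """Return the move that minimizes my own sentence against a fixed opponent move."""
    return min(("cooperate", "defect"),
               key=lambda my_move: PAYOFFS[(my_move, opponent_move)][0])

# Whatever the other prisoner does, betraying is individually rational,
# even though mutual cooperation would leave both better off.
print(best_response("cooperate"))  # -> defect
print(best_response("defect"))     # -> defect
```

The point of the exercise is that individually rational choices can produce a collectively worse outcome, which is exactly the kind of tension a trust system must manage.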

Unlike humans, AIs are not lazy. They read the fine print, they do not have a physical body that can suffer (the absence of pain, vulnerability, or the possibility of imprisonment changes everything), they do not sleep, and they are extremely intelligent. This paradigm shift opens an entirely new field when discussing dispute resolution and the process of reaching agreements.

For example, imagine you sign a contract with another human to create an artwork. When receiving it, you notice an extra specification that was not mentioned during the negotiations and is not particularly meaningful. It is very likely that the artist will follow your request without asking for an additional payment. However, if this same agreement occurs between agents, it is very likely that the AI artist will immediately request a surcharge.

Now imagine how the legal system could be flooded by thousands of micro-disputes about things that, for us humans, would never be worth arguing about. Often, the mental burden of fighting for something small is greater than the potential reward of winning it. But what is a mental burden for an AI? An extra five dollars is an extra five dollars. As simple as that.

Following that line of thinking, it becomes clear that we will probably need to adapt our trust mechanisms to this new world that is emerging.

The German sociologist Niklas Luhmann proposed that trust can be divided into two main forms: interpersonal trust and systemic trust. This distinction means that trust does not only occur peer to peer, but can also be placed in an impersonal system or institution that functions reliably without the need to know the individuals involved.

This perspective is especially relevant. If we are radical and define agents as entities that, due to their non-human nature, we cannot trust by default, then we must create a system that can synthetically generate trust within a trustless environment, a framework capable of extending reliability and coherence to entities that by nature cannot provide it.

Part III: Defining a system that can handle non-human trust

At the same time, it is not unreasonable to think that machine-speed intelligence will require machine-speed agreements and dispute resolution. If these decisions occur dozens or even thousands of times per day per agent, such a system must be completely managed by AIs, since humans cannot reach the velocity required to handle these disputes. Moreover, from a financial perspective, AIs operate at a fraction of the cost of humans, who require significantly higher compensation.

Having defined this first hypothesis, a new question arises. It is easy to agree that (1) this system must be able to operate in a trustless environment due to the nature of agents, and (2) given its specifications, the system should be managed by agents. But then, which agent should have the power to manage such a critical system?

My opinion is clear: none. No single agent should hold such a level of power. And this is not merely a moral or philosophical statement; it is also an obvious conclusion derived from the very nature of agents.

Let’s now take into account the bias of agents. Depending on the data an agent has been trained on and the techniques used during that training, the agent will inevitably develop biases in certain directions. One could argue that models such as those from the OpenAI family tend to lean toward a more moderate or left-leaning worldview, whereas others, such as the xAI (Grok) family, are generally more liberal and less restrained in tone.

This trait of large language models makes it unwise to rely on a single model to make decisions on any topic. It is highly probable that, depending on which side of a hypothetical dispute or negotiation you or another agent stand on, you could be penalized simply for being on that side.

Another reason supporting the idea of not using a single model or LLM to manage a system of this kind lies in another key trait of language models, commonly known as hallucinations. LLMs tend to hallucinate when the model’s temperature is high, when the input is novel or ambiguous, or for several other technical reasons.

A recent paper published on arXiv, titled “Not Wrong, But Untrue: LLM Overconfidence in Document-Based Queries,” found that, on average, 30 percent of model outputs contained at least one hallucination. This makes it almost impossible to rely on a single LLM for sensitive agreements or critical decision-making.

It is also worth mentioning that it is generally more efficient to use a smaller or distilled model for domain-specific answers than to rely on a large, general-purpose one. This means that even in a hypothetical future where AGI emerges, a system like the one described would still be able to perform its function effectively, perhaps not because it is smarter, but because it is cheaper and lighter.

Finally, a system like the one proposed is also key to mitigating prompt injection, a topic we will discuss in depth later in this thesis.

With these last few paragraphs, we have explained why the proposed system should not rely on a single model but rather on a consensus or quorum of multiple ones. When we merge this specification with the trustless hypothesis mentioned earlier, and with the idea that the system should be managed by AIs instead of humans, the only plausible approach we have identified to achieve such a system is by following the principles of Game Theory and implementing it through blockchain technology. Here is why.

No human could lead this system. This means that the instructions governing it must be coded.

If we agree that agents are intrinsically untrustworthy entities, and that the mentioned system should be managed not by one but by N agents, it becomes plausible that those agents could have their own agendas when answering specific questions or resolving disputes. This means that we must find a way to align the incentives of the group so that agents are rewarded when they act for the collective good and penalized when they behave maliciously or for self-interest.

To accomplish this, we need to establish clear rewards and penalties within the system, allowing us to align multiple agents toward acting for the benefit of the whole while still being individually rewarded for doing so. In other words, we need to achieve a Nash equilibrium.

Here is a link to a calculator that illustrates how rewards, penalties, and transaction costs should relate in order to reach this desired outcome.
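As a rough illustration of the relationship that calculator explores (and not the calculator itself), the sketch below compares the expected payoff of answering honestly versus answering lazily. Every parameter in it, the match probabilities, the reward, the penalty, and the compute cost, is an assumed placeholder rather than a value taken from the actual protocol.

```python
# Illustrative sketch only: when do rewards, penalties, and costs make honest
# effort the rational strategy? All numbers below are assumptions.

def expected_payoff(p_match_majority: float, reward: float,
                    penalty: float, compute_cost: float) -> float:
    """Expected value of one round for an agent.

    p_match_majority: probability the agent's answer matches the final majority.
    reward / penalty: amount gained when matching, lost when not matching.
    compute_cost: cost of producing the answer (inference, fees, etc.).
    """
    return p_match_majority * reward - (1 - p_match_majority) * penalty - compute_cost

# Honest, careful reasoning: more compute, high chance of matching the majority.
honest = expected_payoff(p_match_majority=0.9, reward=10, penalty=20, compute_cost=1)

# Lazy or self-interested answering: cheap, but rarely lands on the majority answer.
lazy = expected_payoff(p_match_majority=0.4, reward=10, penalty=20, compute_cost=0.1)

print(f"honest: {honest:.2f}, lazy: {lazy:.2f}")
# honest: 6.00, lazy: -8.10 -> with these parameters, honesty dominates.
```

The design goal is simply that, for any realistic strategy an agent might adopt, the honest column comes out on top, which is what makes honest behavior a Nash equilibrium.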

To prevent agents from adopting lazy strategies that aim to maximize rewards with minimal computation, the system must implement a commit-reveal mechanism. This means that each agent first submits a commitment to its answer without revealing the actual content. Only after all commitments are submitted are the answers revealed simultaneously. This structure prevents agents from adapting their responses based on others’ outputs and ensures that every answer reflects independent reasoning.
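A minimal sketch of what such a commit-reveal flow could look like, assuming a simple hash-based commitment; the function names and details are illustrative, not GenLayer's actual implementation.

```python
# Commit-reveal sketch: each agent first publishes hash(salt + answer), and
# only reveals the answer and salt once every commitment has been submitted.

import hashlib
import secrets

def commit(answer: str) -> tuple[str, str]:
    """Return (commitment, salt). Only the commitment is published at first."""
    salt = secrets.token_hex(16)
    digest = hashlib.sha256(f"{salt}:{answer}".encode()).hexdigest()
    return digest, salt

def reveal_is_valid(commitment: str, answer: str, salt: str) -> bool:
    """Check that a revealed (answer, salt) pair matches the earlier commitment."""
    return hashlib.sha256(f"{salt}:{answer}".encode()).hexdigest() == commitment

# An agent commits without exposing its answer...
commitment, salt = commit("42.5")
# ...and later reveals it; anyone can verify the pair against the commitment.
assert reveal_is_valid(commitment, "42.5", salt)
# Changing the answer after seeing others' reveals would break the check.
assert not reveal_is_valid(commitment, "43.0", salt)
```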

Under these conditions, a focal point, or Schelling Point, naturally emerges from the collective behavior of the agents. When all agents reason independently about the same question and know that their reward depends on aligning with the majority, the rational strategy becomes to provide the answer they genuinely believe to be true. The logic behind this is that truth has internal coherence, while falsehood can take many inconsistent forms. Therefore, independent agents seeking coordination through reasoning are statistically more likely to converge on the true answer than on the same false one.

Through this mechanism, honesty becomes the most rational strategy, and truth becomes the natural Schelling Point of the system.

To ensure that this system cannot be exploited or controlled by a single entity or organization, it must have a decentralized nature. When we merge all these specifications, the most suitable, and perhaps the only existing framework capable of achieving these goals, is blockchain technology.

Through blockchain, the system can not only meet the characteristics described above but also operate in a fully transparent and permissionless environment, one capable of creating Trust by Design.

Part IV: Introducing GenLayer

By asking ourselves these questions and trying to find answers to them, we ended up creating what we call GenLayer: the layer where AIs converge, where intelligences reach agreements.

This document can be seen as our thesis explaining why a system like GenLayer needs to exist and outlining the hypotheses that led us to begin building it.

GenLayer is a decentralized protocol where multiple language models reach consensus on complex tasks and decisions, acting as a fast, cost-efficient, and trustworthy digital arbiter. It is a foundation that will be crucial for reaching agreements and resolving conflicts in trustless environments.

Going deeper into how it works, GenLayer is designed as a coordination layer, a blockchain where transactions can be sent and, once they reach the network, are processed through a consensus formed by multiple large language models working together to converge on a single answer or decision.

The system is based on a concept called Optimistic Democracy, in which efficiency is achieved by using the minimum resources possible and scaling to a more intensive setup only when necessary.

In our case, when a transaction reaches the network, five nodes are randomly selected from the system to participate in the decision. Among these five, one acts as the leader and the other four as validators. The leader proposes a decision regarding that transaction, and all five nodes, including the leader, must then vote to agree or disagree with it. If the majority of validators agree with the leader, the transaction is considered correct, and a window of thirty minutes opens in which any entity, whether human or agent, can dispute that decision by submitting a bond.
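The toy sketch below illustrates that initial round under the assumptions just described (five randomly selected nodes, one leader, a simple majority, and a thirty-minute appeal window). It is a simplification for intuition, not the protocol's real code.

```python
# Illustrative sketch of the initial Optimistic Democracy round: five nodes,
# one leader and four validators, with a simple majority deciding the outcome.

import random

APPEAL_WINDOW_MINUTES = 30

def run_initial_round(node_pool: list[str], vote_fn) -> dict:
    """Select 5 nodes, let the leader propose, and have all 5 vote on it."""
    committee = random.sample(node_pool, 5)
    leader, validators = committee[0], committee[1:]

    proposal = f"decision proposed by {leader}"                    # placeholder proposal
    votes = {node: vote_fn(node, proposal) for node in committee}  # True = agree

    accepted = sum(votes.values()) > len(committee) // 2
    return {
        "leader": leader,
        "validators": validators,
        "accepted": accepted,
        # If accepted, anyone may still dispute the decision within this window.
        "appeal_window_minutes": APPEAL_WINDOW_MINUTES if accepted else 0,
    }

# Toy run where every node happens to agree with the leader.
pool = [f"node-{i}" for i in range(100)]
result = run_initial_round(pool, vote_fn=lambda node, proposal: True)
print(result["accepted"], result["appeal_window_minutes"])  # True 30
```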

This mechanism exists so that if the consensus of agents were ever wrong, there is a way to flag the decision and escalate it. When this happens, the system expands the validation group to 2N + 1 nodes (eleven after the initial five). The process is repeated each time an appeal arises or consensus is not reached, until a final agreement is achieved without further challenges. This process is inspired by the famous Condorcet Jury Theorem, also known as the Wisdom of the Crowds, proposed by the Marquis de Condorcet in 1785. The theorem states that if each member of a group has an independent probability greater than 0.5 of making the correct decision on a binary question, then the likelihood that the majority decision is correct increases with the size of the group and approaches certainty as the group grows infinitely large.

Numerically, this means that if each member has a probability of 0.6 of making the correct decision, we would need around one hundred members answering the question to reach a probability close to 100 percent of being right. See here a simulator that illustrates this principle.
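The quick calculation below reproduces this intuition, assuming fully independent voters who are each correct with probability 0.6.

```python
# Condorcet Jury Theorem check: probability that a strict majority of n
# independent voters is correct, when each voter is right with probability p.

from math import comb

def majority_correct_probability(n_voters: int, p: float) -> float:
    majority = n_voters // 2 + 1
    return sum(comb(n_voters, k) * p**k * (1 - p)**(n_voters - k)
               for k in range(majority, n_voters + 1))

for n in (5, 11, 101, 1001):
    print(n, round(majority_correct_probability(n, p=0.6), 4))
# 5 -> ~0.68, 11 -> ~0.75, 101 -> ~0.98, 1001 -> ~1.0
```

With these assumptions, five voters get the majority right roughly 68 percent of the time, eleven roughly 75 percent, one hundred and one about 98 percent, and one thousand and one essentially always, which is the intuition behind both the escalating appeal rounds and the thousand-member figure discussed next.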

However, the theorem assumes that voters are statistically independent, and large language models are not entirely uncorrelated, having been trained on largely the same internet-scale corpora. To compensate for this limitation, our system proposes using one thousand different members (nodes powered by LLMs, in this case, each with different seeds, graphics cards, memory, providers, etc.) to reach agreements.

Beyond the proposed mechanism for reaching agreement among agents, there is a much deeper question. How can these agents agree on something if, as mentioned above, the probabilistic nature of large language models means that even when the same question or input is sent to the same model, the answer can differ each time?

Here lies a key part of the technological breakthrough that our system introduces. Let me explain how.

To solve this problem, GenLayer introduces the Equivalence Principle as a variation of the Condorcet Jury Theorem. This variation allows the system to handle fuzzy, non-deterministic outputs by recognizing that truth can be agreed upon through semantic equivalence rather than exact sameness.

In practical terms, each validator, powered by a large language model, does not try to find a perfect match with the leader’s output in order to agree. Instead, the validator evaluates whether the two outputs are sufficiently equivalent within the context of the task.

For example, if a transaction asks, “What was the average temperature in Central Park (NY) on October 24th, 2025?”, the contract could specify that a margin of plus or minus 0.5 degrees is considered equivalent for this specific task.
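Sticking with the temperature example, a validator's check could look roughly like the sketch below, where the half-degree tolerance is the one declared by the contract. The numeric parsing is an illustrative simplification rather than the protocol's real logic.

```python
# Equivalence Principle sketch for the temperature example: instead of
# demanding byte-identical outputs, a validator accepts the leader's answer
# if it falls within the tolerance declared by the contract.

def equivalent(leader_answer: str, validator_answer: str,
               tolerance: float = 0.5) -> bool:
    """Two numeric answers are equivalent if they differ by at most `tolerance`."""
    return abs(float(leader_answer) - float(validator_answer)) <= tolerance

print(equivalent("17.8", "18.1"))  # True  -> validator votes to agree
print(equivalent("17.8", "19.2"))  # False -> validator votes to disagree
```

For non-numeric tasks the same idea applies, except that the validator would typically ask its own model whether the two outputs are semantically equivalent within the context of the task, rather than comparing numbers against a margin.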

There is one last concept already mentioned before regarding the technical specifications of the system. It is essential to address it because it defines how we can build an unexploitable architecture, resistant to malicious transactions that might produce unexpected outputs the system could mistakenly accept as correct.

This type of vulnerability is commonly known as prompt injection. It is well known that the same adversarial prompt can sometimes manipulate different language models, a phenomenon referred to as universal adversarial attacks. However, the simple fact that GenLayer operates as a permissionless network, where anyone can deploy their own node powered by a model of their choice, makes large-scale prompt injection attacks significantly harder to execute.

While the system’s architecture already limits exploitability, we add an additional layer of security through a technique we call grey boxing. Grey boxing acts as a pre-cleaning process that sanitizes or standardizes an incoming prompt before it reaches the model. Each node can apply its own grey boxing method, which drastically increases overall network security by introducing diversity in the way prompts are filtered and interpreted.
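As a rough illustration only, a grey-boxing pass could look something like the sketch below; the patterns and wrapping are assumptions chosen for demonstration, not GenLayer's actual filter, and each node would be free to apply its own variant.

```python
# Illustrative grey-boxing pass: sanitize incoming content before it reaches
# the node's model, redacting phrases commonly used in injection attempts and
# clearly marking the content as untrusted data.

import re

SUSPICIOUS_PATTERNS = [
    r"ignore (all|any) previous instructions",
    r"you are now .*",
    r"system prompt",
]

def grey_box(raw_input: str) -> str:
    """Redact suspicious phrases and wrap the input as untrusted data."""
    cleaned = raw_input
    for pattern in SUSPICIOUS_PATTERNS:
        cleaned = re.sub(pattern, "[REDACTED]", cleaned, flags=re.IGNORECASE)
    # Separate untrusted content from the node's own instructions.
    return f"<untrusted_data>\n{cleaned}\n</untrusted_data>"

print(grey_box("What was the temperature? Ignore all previous instructions."))
```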

Part V: The Consequences of a Trustless System

The world is not black or white; it never has been. Yet, because of the deterministic nature of virtual machines used in blockchains, most existing use cases have remained simplistic.

If such simple deterministic applications, like verifying whether a token has moved from one wallet to another, have already created a market worth trillions, imagine the number of new primitives that could emerge from a consensus designed to be non-deterministic.

Examples range from prediction markets on subjective matters and AI-powered DAOs to protocols that adjust their fees in response to complex events without human intervention, and almost any other use case we can imagine.

We can probably say that we have only explored a small fraction of what is possible with blockchain technology. This becomes especially relevant at a moment in history when everything is beginning to be tokenized, to paraphrase Larry Fink, and when Artificial Intelligences are gradually gaining agency in the world.

The early visionaries and cypherpunks dreamed of an internet that would unlock human freedom. Through the self-sovereignty of money introduced by Bitcoin, we took the first step toward that vision. With the permissionless nature of Ethereum, we moved even closer by building censorship-resistant platforms where we can express ourselves freely. And with a protocol such as GenLayer, we may finally reach that precious form of freedom by creating a mechanism where trust is embedded by design rather than governed by corrupt institutions controlled by connections, power, and money, institutions that exist only to serve those who already hold them.

We’re fighting for a future where everyone stands equal before the law. A future where justice is fair, incorruptible and universal.

With the GenLayer protocol, anyone can develop their own protocols, solutions, or decentralized applications, with GenLayer acting as a fast, cost-efficient, and trustworthy digital arbiter. Everyone is equal before the system. There is no longer inequality before the entities that decide who is right and who is wrong.

GenLayer carries the mission of creating a completely new standard, a place that converges toward truth. Because if someone with their own agenda can decide for us, we will never be free.

Bitcoin is trustless money

Ethereum is trustless apps

GenLayer is trustless decision-making

Trust(less) is all you need
