Microsoft researcher builds goat-powered neural network in Age of Empires 2 to show why we should 'stop assuming that LLMs behave like humans just because they were trained with natural language'

Large language models like ChatGPT can generate natural language responses that appear human-like in tone, leading to considerable discussion over whether LLMs might themselves be sentient. However, there are far more reasons to conclude that AIs are not and will never be conscious. Despite this, the idea of AI sentience persists, partly due to our tendency to perceive human-like qualities in non-human things and partly due to equivocation from AI companies. To demonstrate how absurd the notion is, Microsoft AI researcher Adrian de Wynter built a neural network within the classic strategy game Age of Empires 2.

As reported by 404 Media, de Wynter created a neural network in Age of Empires 2 using the game's scenario editor. He wrote a paper titled "If LLMs Have Human-Like Attributes, Then So Does Age of Empires II", which highlights the absurdity of assuming LLMs behave like humans simply because they were trained with natural language.

De Wynter explained that he has a tendency to "dial up things to 11" when making a point, and that absurdism is a common tool in philosophy and theoretical computer science. Using objects in the game world to represent binary values, he built a functioning NAND (NOT AND) gate and a 1-bit perceptron, a simple form of neural network. In this setup, grass represents 0, bridges represent 1, and goats play the role of bits. This is similar to how some players have built neural networks using Minecraft redstone, but de Wynter specifically chose Age of Empires 2 because it is a less obvious choice.
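To make the two building blocks concrete, here is a minimal Python sketch of a single perceptron wired up to behave as a NAND gate. It is not de Wynter's in-game construction: the function names, the weights (-2, -2), and the bias (3) are hand-picked assumptions for illustration, and the 0 and 1 values simply stand in for the grass and bridge tiles described above.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum of the inputs plus the bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


# Hypothetical, hand-picked parameters (not taken from the paper).
# Both inputs must be 1 to push the sum below zero, which is NAND behavior.
NAND_WEIGHTS = (-2, -2)
NAND_BIAS = 3


def nand(a, b):
    """NAND gate expressed as one perceptron: 0 only when both inputs are 1."""
    return perceptron((a, b), NAND_WEIGHTS, NAND_BIAS)


def truth_table():
    """List (a, b, output) for every combination of two binary inputs."""
    return [(a, b, nand(a, b)) for a in (0, 1) for b in (0, 1)]


if __name__ == "__main__":
    for a, b, out in truth_table():
        print(a, b, "->", out)
```

Nothing in the sketch depends on language: the gate is only multiplication, addition, and a threshold, which is why the same logic can be laid out with goats, grass, and bridges.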

There are videos of de Wynter's goat-powered neural network in action on his GitHub page. To the casual observer, the processes look completely baffling, which de Wynter believes demonstrates his point. The processes at work here are similar to those that power tools like ChatGPT and Claude. However, because the fundamentals are goats and grass rather than natural language, observers are not tempted to perceive the resulting behaviors and output as human-like.

The importance of not anthropomorphizing LLMs

The point of the paper is to formally show that we anthropomorphize too readily and that some claims about LLMs' capabilities are too strong. De Wynter explains that the reason for using goats is to separate the actual components that make LLMs function, like the relationship between weights defined by some operation, from the human-like qualities we perceive in them.

Assuming LLMs have human-like properties without demonstrable proof could lead to various problems, especially in scientific research. In his paper, de Wynter mentions he has peer-reviewed more than 300 computer science papers in the last two years and found that over half of them began with the assumption that LLMs have human-like traits.

A call for more rigorous experimentation

De Wynter argues that we need to stop assuming LLMs behave like humans just because they were trained with natural language. Instead, he suggests that we should perform experiments that allow us to see LLMs for what they are, not how we believe they should be. His unconventional project is a clear demonstration of the need for more rigorous experimentation and a critical reevaluation of how we perceive and interact with large language models.