Gary Marcus calls it the imminent enshittification of the Internet. He’s way too dramatic: compared to some corners of the Internet I saw almost 30 years ago (does anyone remember the alt.* groups on Usenet?), the Web these days is pristine and civilised—we managed to clean up a lot of human-created digital guano. But if you filter out Marcus’ doomsday thinking, what remains is still quite significant: indeed, it seems like there’s another wave of binary dung about to hit us. And academics have a word for it, too (though not as catchy as enshittification): MAD.
Let me explain.
Artificial intelligence has traditionally been trained on human-generated content (text, drawings) and natural content (images, sounds of nature). Despite that, the outputs of AI systems, and generative AI systems in particular, have been “flawed”: seven-fingered people in generated images and the “hallucinations” of chatbots are the most obvious examples. Long story short: AI outputs are neither human nor natural.
So, what happens if a system, AI or not, cannot distinguish between human or natural content and content that is the output of an AI system? Seven-fingered people become natural, and “hallucinations” amplify.
Well, this is beginning to happen. Google’s search engine is falling into this trap, and it is now amplifying the delusions of generative AI systems. We all know to be careful when reading the outputs of ChatGPT. But this is something new—Google is spreading generative confabulations.
You can try it for yourself. Google “country in Africa that starts with k”.
What you see is a fascinating – and concerning – loop. There is a website called “Emergent Mind” that aggregates a lot of interesting content. The quality of its content seems to have convinced Google’s algorithm that it is a relatively respected source of information. One article on the site contains a completely incorrect response generated by ChatGPT. Perhaps it was posted there as a joke. But most algorithms do not understand jokes, Google’s PageRank included.
And so, without realising it is not true, Google captures the content of the article and presents it to its users in a very authoritative way: at the top of the results page, even above the main list of results. The article itself is a silly conversation with ChatGPT, showing how the model cannot answer a simple question about countries in Africa.
Can you sense the downward spiral about to unfold?
The Digital Ouroboros
Ouroboros is an ancient symbol depicting a serpent or dragon eating its own tail, often interpreted as a representation of the eternal cycle of life, death, and rebirth. Its imagery has been used across various cultures and philosophical systems to symbolize infinity, unity, and the interconnectedness of all life.
But I am going to borrow the symbol to represent something else: the emerging phenomenon of self-ingestion (autophagy) of generative AI models. Let’s call it the digital ouroboros.
The Digital Ouroboros is like the mythical serpent that eats its own tail. But in its 21st-century incarnation, in the context of AI and the Web, it reflects a self-perpetuating loop of misinformation. Generative AI tools ingest content from the Web during training, some of which might be erroneous or misleading, and then they generate new content based on what they’ve learned. This new content is then published on the Web, where it becomes part of the data that future generative AI tools will learn from. It’s a cycle that, if left unchecked, can lead to the gradual degradation of information quality online.
Welcome to the age where machine-created misinformation is propagated and amplified by machines themselves.
I don’t want to sound like an alarmist Gary Marcus, but if this content cancer¹ continues to spread, the reliability of the Web could be fundamentally compromised.
Ouroboros getting MAD
How can we stop this? Generative AI content detectors have proven unreliable, and watermarking may not be a foolproof solution (especially if watermarks become easy to remove). A few months ago, Adobe introduced a “do not train” tag that creators can use to request that their images be skipped during AI training. Perhaps such tags should be mandatory for all AI-generated content, much as websites can ask search engines not to index them.
Such data inbreeding in generative AI can lead to disorders in these systems. Jathan Sadowski, a researcher at Monash University in Australia, refers to it as “Habsburg AI”, after the German dynasty². He wrote on X: “Habsburg AI [is] a system that is so heavily trained on the outputs of other generative AI's that it becomes an inbred mutant, likely with exaggerated, grotesque features.”
More researchers are now looking into this topic. A fascinating academic analysis of this phenomenon has recently been published as a preprint³. In the paper, the authors write: “without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease”. They introduce a term for this new phenomenon, “generative autophagy”, and call its negative outcome MAD: Model Autophagy Disorder.
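To get a feel for why the authors insist on fresh real data, here is a minimal toy sketch of an autophagous loop (my own illustration, not an experiment from the paper). It pretends a one-dimensional Gaussian is a “generative model”, repeatedly refits it to its own samples, and tracks the fitted standard deviation as a crude proxy for diversity. The function name and the fresh_fraction parameter are assumptions made up for this example.

```python
# Toy sketch of an autophagous ("self-consuming") loop, assuming a 1-D Gaussian
# stands in for a generative model. Purely illustrative; not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)

def autophagous_loop(generations=200, n_samples=25, fresh_fraction=0.0):
    """Refit a Gaussian to its own samples each generation.

    fresh_fraction is the share of each generation's training set drawn from
    the original "real" distribution N(0, 1) instead of the model's own output.
    """
    mu, sigma = 0.0, 1.0  # generation 0: a perfect fit of the real data
    stds = []
    for _ in range(generations):
        n_fresh = int(fresh_fraction * n_samples)
        synthetic = rng.normal(mu, sigma, n_samples - n_fresh)  # model's own output
        fresh = rng.normal(0.0, 1.0, n_fresh)                   # fresh real data
        training_set = np.concatenate([synthetic, fresh])
        mu, sigma = training_set.mean(), training_set.std()     # "retrain" the model
        stds.append(sigma)
    return stds

for frac in (0.0, 0.5):
    stds = autophagous_loop(fresh_fraction=frac)
    checkpoints = [round(stds[i], 3) for i in (0, 49, 99, 199)]
    print(f"fresh_fraction={frac}: std at generations 1/50/100/200 ->", checkpoints)
```

In runs of this sketch, the standard deviation with no fresh data tends to drift towards zero (the statistical equivalent of Sadowski’s inbred mutant), while keeping half of each generation’s training data real holds it close to 1. Real generative models are vastly more complex, but the intuition carries over.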
Here’s hoping that the Internet will not go MAD. What can we all do? Let’s be extremely cautious and responsible when publishing the outputs of generative AI systems. And if we do so, let’s at least ensure we’re not spreading misinformation. Hopefully, soon we will have an agreed-upon way of tagging generative AI content.
Let’s do it for Kenya.
Prof. Marek Kowalkiewicz is a Professor and Chair in Digital Economy at QUT Business School. Listed among the Top 100 Global Thought Leaders in Artificial Intelligence by Thinkers360, Marek has led global innovation teams in Silicon Valley and has served as Research Manager of SAP's Machine Learning lab in Singapore, Global Research Program Lead at SAP Research, and Research Fellow at Microsoft Research Asia. His upcoming book is called "The Economy of Algorithms: Rise of the Digital Minions".
1. I think that’s too many metaphors. Oh well.
2. The Habsburg Dynasty's inbreeding problem arose from a long history of intermarriage within the family, leading to a lack of genetic diversity and resulting in various physical and health issues, most famously the "Habsburg jaw."
3. Alemohammad, S., Luzi, L., Humayun, A. I., Babaei, H., LeJeune, D., Siahkoohi, A., & Baraniuk, R. G. (2023). Self-Consuming Generative Models Go MAD. arXiv. https://arxiv.org/abs/2307.01850
Generative AI content is just another version of the age-old response, produced by every culture and at every moment in time: "I don't want to think".