Pay up
Forty-five years ago, Bill Gates banged his fist on the table and asked people to pay up. At the time, almost no one thought you could make money selling software to mere mortals: “the hobby market”. Hardware, sure. But software? No way! Bill disagreed. We know the rest of the story: he built a de facto software monopoly and became the wealthiest human a couple of decades later. Much more is up for grabs now, in an area almost no one considered monetising. And a few big players are already sharpening their teeth.
Pay up
In the 70s, “hobbyists” started to experiment with new technology: microcomputers. There was a strong culture of sharing, and clubs, such as The Homebrew Computer Club, were the hubs for exchanging ideas and copying software. For most hobbyists, software was “free as in free beer” and “free as in free speech”. But Bill Gates, who had recently co-founded Micro-Soft, wasn’t too excited about it. He and Paul Allen had just built their first piece of software, Altair BASIC. They had to hire a developer, Monte Davidoff, to help. Altair BASIC became extremely popular among hobbyists. But almost no one paid for it (why pay for something you can copy for free?), and Micro-Soft couldn’t break even. So, Bill wrote An Open Letter to Hobbyists. He pointed out that if a business puts a lot of effort into creating software, it has the right to be paid.
Even though there’s logic in Gates’ argument, he was seen as the “bad guy”: the hobbyists detested such commercial thinking. Despite a rift between those “trying to make things happen” and those “trying to make money”, Gates achieved his goal [1].
Déjà vu
These days, computer hobbyists are sharing something else. In the machine learning community, there is a strong culture of sharing models. Downloading a so-called “pre-trained model” and further refining it for your specific scenario is, in fact, a preferred approach in deep learning. In most cases, it makes the work easier, faster, more replicable, less error-prone. Sharing is caring. It’s all about trying to make things happen.
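To make that concrete: below is a minimal sketch of the pattern, assuming the Hugging Face transformers and datasets libraries; the model name, the toy data and the hyperparameters are purely illustrative, not a recipe.

from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# A shared, pre-trained model is downloaded and then refined ("fine-tuned")
# on a small task-specific dataset, instead of being trained from scratch.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # weights trained by someone else
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# A toy labelled dataset standing in for your own data.
data = Dataset.from_dict({"text": ["great product", "total waste of money"], "label": [1, 0]})
data = data.map(lambda row: tokenizer(row["text"], truncation=True, padding="max_length"))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=3),
    train_dataset=data,
)
trainer.train()  # minutes to hours on modest hardware, rather than the weeks needed to pre-train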
Who cares if the people who worked on them get paid?
Sound familiar?
One of the more impressive models built recently, GPT-3, was trained on half a trillion words. It would take a human almost three thousand years of reading [2], 16 hours per day, day in, day out, to read as much text. GPT-3 was built by OpenAI, an organisation that in 2019 received a $1 billion investment from Microsoft. Incidentally, OpenAI transitioned from non-profit to for-profit the same year. I am not suggesting these two events are somehow linked.
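For the curious, here is the back-of-the-envelope arithmetic behind that reading estimate, using the speed from footnote [2]:

words = 500_000_000_000        # roughly half a trillion words
words_per_day = 500 * 60 * 16  # 500 words per minute, 16 hours per day
years = words / words_per_day / 365
print(round(years))            # ~2854 years, i.e. almost three thousand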
Nothing would please Microsoft more than being able to hire a hundred developers and deluge the hobby market with deep learning models.
I made up the previous sentence.
During the most recent edition of the Disruptive Innovation Leadership Course, I “invited” GPT-3 to the standup ideation session, where we practice using “ideation lenses”. I showed GPT-3 one example for each lens (I “prompted the model”), and then I asked it the same questions we asked the human participant on stage. I chose a few of them below. Can you tell which ones come from a human and which ones from the algorithm? Scroll to the bottom to see the correct answers [3].
Stunning, isn’t it? These new models are impressive for a very particular reason.
While a lot of effort in AI has so far focussed on generalisation to new examples (the model is expected to deal with examples it has never seen before), these new models can generalise to new tasks (the model can perform well on tasks it was never explicitly trained for)!
No one intentionally trained GPT-3 to create new business ideas! It is a fascinating prospect in itself to be the first to discover the capabilities of these advanced models by carefully designing prompts that will get them to reveal their powers.
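To show what such prompting looks like in practice, here is a minimal sketch written against the OpenAI Python client of the GPT-3 era; the lens, the example text and the parameters are made up for illustration, not the actual prompts from the session.

import openai  # the legacy OpenAI client; expects an API key in OPENAI_API_KEY

# One worked example for a lens, then the case we actually want an idea for.
prompt = """Lens: remove a step.
Business: a gym.
Idea: the door unlocks with your phone, so memberships skip the front desk entirely.

Lens: remove a step.
Business: a restaurant.
Idea:"""

response = openai.Completion.create(
    engine="davinci",   # the original GPT-3 engine name
    prompt=prompt,
    max_tokens=60,
    temperature=0.8,
    stop="\n\n",        # stop when the model starts a new example
)
print(response.choices[0].text.strip())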
It likely took the equivalent of 355 GPU-years on a Tesla V100 to train GPT-3, at a whopping cost of $4.6M for just one complete training pass. In layman’s terms: training a system like this takes weeks, if not months, and a LOT of money. And this process has to be repeated whenever the model needs to be retrained, for instance with new information. GPT-3 has no idea about COVID-19, as the model was trained on data up to 2019. It is telling that the model hasn’t been updated since.
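The arithmetic behind that figure, assuming a cloud price of roughly $1.50 per V100-hour (an assumption of mine, in line with how such estimates are usually derived, not an official number):

gpu_years = 355
gpu_hours = gpu_years * 365 * 24   # ~3.1 million V100-hours
usd_per_gpu_hour = 1.50            # assumed cloud rate
print(round(gpu_hours * usd_per_gpu_hour / 1e6, 1))  # ~4.7 million USD, in the ballpark of $4.6M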
Creating a model like GPT-3 is prohibitively expensive. The task is practically unachievable for anyone but the largest corporations. And GPT-3 is not even the biggest model anymore. At the current pace, the size of the best models increases more than tenfold every year. So do the costs of creating them, as processing speeds are not improving nearly as fast. Remember Moore’s Law?
Now, imagine the power you might have if you’re the only one who controls the model. There are already hundreds of applications built on GPT-3. All of them rely on the model that OpenAI hosts, and the application developers have to pay up: OpenAI introduced a pay-as-you-go pricing scheme to use their model.
Monopolies can emerge in various ways. One of them is through control of resources. We are witnessing the emergence of corporations that control digital resources: models that are so expensive to create that very few in the world can afford to do it. Right now, the cost of training a state-of-the-art model is in the range of millions of dollars. In a few years, we will see models that cost hundreds of millions of dollars in computing power to create.
Who are the businesses betting on large models? The usual suspects: Google (BERT, T5, LaMDA, MUM), Facebook (BART, RoBERTa, XLM), Microsoft (Turing-NLG), OpenAI (GPT-3, DALL-E, CLIP, Codex), Nvidia (Megatron-LM, Megatron-Turing NLG, built jointly with Microsoft), and a few others: Huawei (PanGu-Alpha), BAAI (WuDao 2.0), Naver (HyperCLOVA), AI21 Labs (Jurassic-1).
There are efforts in the machine learning community to build open models that can compete with these players (the most notable initiative is Hugging Face). But it does feel like the days of The Homebrew Computer Club all over again, with those who want to make money holding the upper hand over those who want to make things happen. Who will win this time?
Reengage
Next week I will join Steven D. Eppinger, MIT Professor of Management Science and Innovation, and Prof. Rowena Barrett, QUT Pro Vice-Chancellor (Entrepreneurship), for The Curious Enterprise webinar, the second episode of the Future Enterprise global webinar series. It will be an exciting conversation, and I hope you can join the hundreds of others who have already registered!
[1] At the same time, the letter likely energised the open-source software movement, which fundamentally disagreed with its argument. Our digital world depends on open-source and free software, and that is unlikely to change any time soon.
[2] At a very decent reading speed of 500 words per minute.
[3] Right, left, left: from top to bottom, GPT-3 provided the answers on the right, the left, and the left. The other responses were provided by course participants (three people in total, one per question).