There’s a strange phenomenon that marathon runners exhibit during races. They arrive at the finishing line in waves. There could be dozens of runners crossing the line at one moment, only for the chute to become almost empty for a few minutes until the next wave arrives.
Bear with me—this line of thinking will lead to the impact of AI on our lives.
What’s happening with those runners?
Most marathon runners set goals. They want to finish their race within a specific time, calling themselves “sub-three-hour runners” or “sub-four-hour runners.” And it turns out their goals really do affect their performance, as a group of researchers observed. Pay close attention to the graph below, taken from their paper, and you will see waves of runners: one just before the 3-hour mark, the next at 3:10, then 3:15, 3:20, 3:30, and so on.
Look again: the finishing crowd thins out significantly just after 3:00, 3:30, 4:00, 4:30, and 5 hours. My own personal best is 3:45:21, which both confirms the rule and causes me a lot of grief (because of those 21 seconds, I cannot call myself a sub-3:45 runner).
The AI executive order
On Monday, President Biden issued an executive order on safe, secure, and trustworthy artificial intelligence. The order is a response to growing concerns about the impact of advanced artificial intelligence systems on our lives.
There’s been quite a lot of coverage of this (here’s a “what to expect” take on it by Prof. Toby Walsh). I want to dive deeper into one particular aspect of it—one that is both a bit nerdy and potentially more significant than it seems.
The order addresses a conundrum I’ve been grappling with: should AI development be governed in a way similar to how drug development is?
At a recent event at Queensland AI Hub, I took this question head-on. On one hand, such governance is needed; on the other, it could stifle innovation. Specifically, I brought up the example of my children training their own AI models. Should their after-school fun be subject to government oversight?[1]
Well, it seems the Biden administration has an answer to that question. AI system development should be governed, but only for really, really big models (which, presumably, my children wouldn’t be able to build anyway).
Let’s have a closer look at the relevant part of the executive order: the two reporting requirements it sets for AI developers. Too geeky? Don’t worry, I’ll explain below.
What do these requirements really mean?
If your model requires a lot of computing power to develop, it will have to be reported.
If you’re using a supercomputer for training your AI models, it will need to be reported.
Okay, but what do they REALLY mean?
The number used in the executive order (10^26 operations) is higher than the compute used to train ANY AI model developed so far; by some estimates, it’s 5 times higher than the amount of computing power used to train GPT-4. While this threshold will likely be crossed (possibly in less than a couple of years), the reality is that the reporting requirement will apply to a handful of models at most! My conclusion: the industry lobbyists fooled the Biden-Harris administration; this section has no practical applicability right now (remember, executive orders are issued to address pressing matters).
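GPT-4’s training compute has never been disclosed; the “5 times” figure implies an outside estimate of roughly 2×10^25 floating point operations, which is a commonly cited guess. Here is a minimal sketch of that arithmetic, with the GPT-4 figure treated purely as an assumption:

# Back-of-envelope check of the "5x GPT-4" claim.
# The 1e26 threshold comes from the executive order; the GPT-4 figure is
# NOT official -- it is an assumed outside estimate (~2e25 FLOPs).
EO_MODEL_THRESHOLD_OPS = 1e26           # total training operations named in the order
GPT4_TRAINING_OPS_ESTIMATE = 2e25       # assumed estimate, not a confirmed number

ratio = EO_MODEL_THRESHOLD_OPS / GPT4_TRAINING_OPS_ESTIMATE
print(f"Threshold is roughly {ratio:.0f}x the estimated GPT-4 training compute")
# -> Threshold is roughly 5x the estimated GPT-4 training compute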
The executive order also covers training performed on extremely fast computers (computing clusters). Let’s give the number used in the order (10^20 floating point operations per second, a measure of how fast a computer is) some context. 10^20 FLOPS (for short) is 100 exaFLOPS. The fastest supercomputer in the world (and there’s only one of it) is called Frontier. Its performance? Theoretically, Frontier’s speed is just shy of 1.7 exaFLOPS; in practice, it sustains about 1.2 exaFLOPS. And so here we go again: the executive order is setting a bar more than 50 times higher than anyone can jump. It will take several years until we are able to build supercomputers that fast!
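The same back-of-envelope arithmetic for the cluster threshold, using Frontier’s published peak and measured figures (treat them as approximations):

# How far today's fastest supercomputer sits below the 1e20 FLOPS cluster threshold.
# Frontier figures are the published TOP500 numbers (theoretical peak ~1.68 EFLOPS,
# measured ~1.2 EFLOPS); both are approximations.
EXA = 1e18                          # 1 exaFLOPS expressed in FLOPS

eo_cluster_threshold = 1e20         # FLOPS, from the executive order
frontier_peak = 1.68 * EXA          # theoretical peak performance
frontier_measured = 1.2 * EXA       # sustained (benchmarked) performance

print(f"Threshold = {eo_cluster_threshold / EXA:.0f} exaFLOPS")
print(f"Gap vs. Frontier's peak:     {eo_cluster_threshold / frontier_peak:.0f}x")
print(f"Gap vs. Frontier's measured: {eo_cluster_threshold / frontier_measured:.0f}x")
# -> Threshold = 100 exaFLOPS; ~60x above Frontier's peak, ~83x above its measured speed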
Let me say this again: this section of the executive order does not apply to anyone and will, quite likely, not apply to anyone for quite some time!
How will AI developers be like marathoners?
Let’s ignore the ignorance of the U.S. administration for a second. If someone reads this post 10 years from now, I certainly don’t want to sound like someone who couldn’t imagine a faster future (“10^26 floating point operations should be enough for anybody” sounds like a quote that will age badly). Here are my predictions of what the order might lead to:
There will be a crowded group of models just below the threshold and very few models, if any, above it.
The order will spur innovation in “smarter” AI models: those that improve not simply through more training compute but through better architectures and better training approaches (including cleaner, preselected data).
A new business model, or at least a new service, will emerge: benchmarking your model training and stopping it just below the threshold (see the sketch after this list).
If other jurisdictions around the world introduce similar requirements, this will spur a new wave of “partially trained models”: some jurisdictions will see slightly less-trained models, and others slightly more-trained ones.
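What might that threshold-watching service look like? Here is a minimal, purely hypothetical sketch: it tracks cumulative training compute using a standard rough rule of thumb (about 6 operations per parameter per training token) and halts just short of the 10^26-operation reporting line. Every name and number in it is illustrative.

# Minimal sketch of a hypothetical "train up to the reporting threshold" service.
# All names and the per-step cost model are illustrative assumptions.

EO_THRESHOLD_OPS = 1e26      # total training operations named in the executive order
SAFETY_MARGIN = 0.99         # stop at 99% of the threshold, like finishing at 2:59:xx

def ops_per_step(model_params: float, tokens_per_step: float) -> float:
    """Rough transformer training cost: ~6 operations per parameter per token."""
    return 6 * model_params * tokens_per_step

def train(model_params: float, tokens_per_step: float, max_steps: int) -> float:
    """Run training steps, halting before cumulative compute crosses the line."""
    total_ops = 0.0
    for step in range(max_steps):
        cost = ops_per_step(model_params, tokens_per_step)
        if total_ops + cost > SAFETY_MARGIN * EO_THRESHOLD_OPS:
            print(f"Stopping at step {step}: the next step would cross the threshold")
            break
        total_ops += cost
        # ... one optimizer update would happen here ...
    return total_ops

# Example: a hypothetical 1-trillion-parameter model seeing 100M tokens per step.
used = train(model_params=1e12, tokens_per_step=1e8, max_steps=1_000_000)
print(f"Compute used: {used:.2e} of {EO_THRESHOLD_OPS:.0e} operations")

It is the AI-developer equivalent of glancing at your watch at the 40 km mark and easing off so you cross the line at 2:59-something rather than 3:00-something.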
This is exactly what marathoners would do, come to think of it!
You already know about the waves of runners trying to finish just before an imaginary deadline.
Runners who want to finish under a certain time know that training runs are just one part of the equation. What really makes a difference is smart training, good nutrition, and the right mindset.
A good runner will make sure not to overtrain and will taper for a week or two before the race.
The best marathoners in the world race at 100% in only a few locations—those that really matter.
There’s a major difference between marathon runners and AI developers. Marathon running goals are set based on what is achievable for an average runner. It’s quite astonishing that the rules set for AI development seem to be based on benchmarks that no one, absolutely no one, can achieve right now.
[1] It’s an imaginary example.