The challenge of scaling AI technology


One key indicator of a safe investment is the need to scale. After a proof of concept has been established, the technology has been shown to function properly and customers demonstrate a clear interest, all that stands between the concept and profits is scale.

For most technology investments, scaling is quite safe. Sure, there are a few hurdles to overcome, but the risk tends to be small because technology scaling is usually sublinear: to serve twice as many customers, less than twice the resources are needed. Big data technology is designed exactly this way, to make the process as painless as possible.

But when we consider scaling for AI technology, an important distinction needs to be made. We need to separate scaling users from scaling intelligence. The growth in the number of users of a proven, well performing AI is sublinear and, technologically speaking, easy to accomplish. However, scaling intelligence is not.

Scaling up?

Human brains are capable of meaningfully distinguishing huge numbers of objects. One estimate puts this number somewhere between 10^22 (as a minimum) and 10^48 (as a maximum) for an adult of average education. In AI, today's top performers reach an accuracy of about 76% on pictures from 100 different classes of objects (dolphins, sharks, roses, bottles). That would mean today's machines perform something like 10^20 times worse than humans.

How can we raise machines to this level? So far, nobody has found a way. No one has demonstrated how to effectively scale AI technology in this direction. In fact, machines seem to scale quite inefficiently when it comes to intelligence gains.

For example, take the 76% performance level mentioned above. That project required thousands of simulated neurons in a deep network distributed over 18 layers. This may not seem like much compared to the some 80 billion neurons in the human brain, but it makes you wonder what would be needed to scale effectively toward large numbers of objects.

To estimate this number, let's look in the other direction, toward smaller numbers of objects. When the same architecture is used to distinguish between only 10 classes of objects, performance is much better: about 93% correct. And to reach the 76% level on those 10 classes, much simpler architectures suffice.

Between only two classes of objects, even simpler architectures suffice. In fact, to distinguish as accurately or better between just two classes, simple logistic regression is sufficient.
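To make the two-class case concrete, here is a minimal sketch of plain logistic regression separating two synthetic point clouds (toy stand-ins for two object classes; the data, learning rate, and iteration count are illustrative choices, not from the original project):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy "classes" of 2-D points, e.g. standing in for dolphins vs. roses.
n = 200
X = np.vstack([rng.normal(-1.0, 1.0, (n, 2)),
               rng.normal(+1.0, 1.0, (n, 2))])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Plain logistic regression trained by gradient descent -- no deep network.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid probabilities
    w -= 0.5 * (X.T @ (p - y) / len(y))      # gradient step on weights
    b -= 0.5 * np.mean(p - y)                # gradient step on bias

pred = 1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5
accuracy = np.mean(pred == y)
```

A single linear decision boundary is all this model has, yet for two reasonably separated classes it is typically enough, which is the point of the paragraph above.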

This shows us that every additional class added to a machine's repertoire requires a significant addition of resources (memory, floating-point operations, etc.). Growth follows a superlinear model: doubling the number of classes requires more than double the resources.

The high cost of learning

Why is this? Because a machine learning model must learn to represent relationships between independent classes. What the machine learns about distinguishing dolphins from roses is of no use for distinguishing dolphins from bottles, or bottles from roses. As a result, the amount of learned knowledge and processed information grows with the number of pairwise comparisons between classes. Even with various optimizations and simplifications of features, this superlinear growth cannot be avoided.
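The pairwise-comparison count can be checked with a couple of lines of Python (a pure combinatorial sketch, not a model of any particular network):

```python
from math import comb

def pairwise_comparisons(n_classes):
    # Number of distinct class pairs a classifier must tell apart.
    return comb(n_classes, 2)

pairs_100 = pairwise_comparisons(100)   # 4950
pairs_200 = pairwise_comparisons(200)   # 19900
```

Doubling the number of classes roughly quadruples the number of pairs, which is why resource demands grow faster than the class count itself.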

In the theoretically best scenario, the growth would be linear: For each additional class, only the resources for detecting this class would need to be introduced, without affecting the performance on other classes.

But even if linear growth in demands were achieved by some amazing new technology (exponent = 1.0), this would by no means be sufficient to reach human-level performance. A thousand neurons for 100 classes (10 neurons per class) would need to be scaled up to 10^23 neurons for 10^22 classes. Compare that to ~10^10 neurons in the human brain.
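Making that back-of-the-envelope arithmetic explicit (the figures are the post's own estimates, not measurements):

```python
# Assume perfectly linear scaling: 10 neurons per class, as in the text.
neurons_per_class = 10
classes_human_min = 10**22                 # lower estimate of human repertoire
neurons_needed = neurons_per_class * classes_human_min   # 10^23 neurons

brain_neurons = 10**10                     # rough human brain neuron count
shortfall = neurons_needed // brain_neurons              # factor of 10^13
```

Even under this best-case linear assumption, the required network would be thirteen orders of magnitude larger than a human brain.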

Note that we are estimating only the resources needed to run an already trained network. We have not calculated the resources needed to train it. This may be an even bigger problem because, as the number of classes increases, current technology has more and more difficulty converging to a solution during training. Hence, even with infinite computational resources, we may not be able to properly train existing neural networks.

But disregarding the training problem, even deploying an already trained deep neural net of this size is quite difficult. Even with maximally optimized use of RAM (computer memory), the above example would imply the need for something like 10^20 gigabytes, which is an astronomical number. If the price of memory fell to only $1 per gigabyte, that much memory would cost more money than exists in the world today (the entire world has about $1 quadrillion, or $10^15). And this estimate does not even include memory for training samples, GPUs, etc.
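Plugging in the post's own rough numbers (10^20 GB of RAM, $1 per GB, ~$10^15 of money in the world):

```python
ram_gb = 10**20          # rough RAM estimate for the hypothetical network
price_per_gb = 1         # optimistic $1/GB memory price
cost = ram_gb * price_per_gb

world_money = 10**15     # ~$1 quadrillion, all money in the world
ratio = cost // world_money   # cost exceeds world money by 10^5
```

The memory bill alone would exceed all the money in the world by a factor of one hundred thousand.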

These calculations paint a very grim picture for scaling the intelligence of our machines. But there may be a solution I have hinted at in other posts — and we can learn it from biology.

The answer is out there

Biology solves this scaling problem through an adaptive process that involves a unique type of re-learning. The process is both very rapid (less than a second) and extensively intelligent in itself; the learning rules contain a mass of knowledge about the world. Unfortunately, this rapid re-learning is something no AI built today has at its disposal.

If we do not find a way to scale the intelligence of machines sublinearly (exponent << 1) we probably cannot hope to reach AGI. And, in my view, the scaling problem will always exist unless we change our approach to AI to match what philosopher John Searle had in mind (as I discussed in my first post here).

Only when we find a sublinear solution to intelligence scaling will we have strong AI and open a path toward AGI. And in the meantime, what should we do?

Stay tuned.

