AI is great, but greatly misunderstood

In the past year, I have witnessed many misunderstandings about artificial intelligence (AI) from both co-workers and clients. I was about to blog on the state of AI myself when I came across the HBR article “The Business of Artificial Intelligence” by Erik Brynjolfsson and Andrew McAfee. I was glad to find such a well-written piece: it avoids a pessimistic or contrarian perspective while still managing to set the record straight about the maturity of AI today and what it takes to achieve success with it.

So, instead of simply repeating what Brynjolfsson and McAfee state in their article, I thought I would highlight some of their key points and add annotations that I believe are useful in clarifying the state of AI today.

If someone performs a task well, it’s natural to assume that the person has some competence in related tasks. But ML systems are trained to do specific tasks, and typically their knowledge does not generalize. The fallacy that a computer’s narrow understanding implies broader understanding is perhaps the biggest source of confusion, and exaggerated claims, about AI’s progress. We are far from machines that exhibit general intelligence across diverse domains.

Reading this was music to my ears; it cuts to the heart of the misunderstanding surrounding AI’s abilities. Machine learning (ML) is the codification of the patterns surrounding a particular task or goal. The path to automation has primarily been humans codifying task steps in the form of a program; ML is the machine-generated version of that codification. However, the execution of this machine-generated code still depends on matching a particular set of inputs.

So, the extent of AI today is that a human must target AI at a problem domain, train it, and then use it to handle routine tasks, with the realization that its accuracy will be something less than 100%. For finding a needle in a haystack, this is one of the most marvelous technological advancements mankind has seen. For finding the Volkswagen in the mall parking lot, well, the investment is questionable.

Much of the knowledge we all have is tacit, meaning that we can’t fully explain it. It’s nearly impossible for us to write down instructions that would enable another person to learn how to ride a bike or to recognize a friend’s face… we all know more than we can tell.

Humans can write code only to the extent that they can codify the knowledge in their heads, transcribing it into steps the machine can understand. Allowing the machine to define code in a way that is native to the machine reduces the need for translation and leads to a much more fluid result. ML models are, in effect, encoded algorithms that approximate tacit knowledge.

Successful systems often use a training set of data with thousands or even millions of examples, each of which has been labeled with the correct answer. The system can then be let loose to look at new examples. If the training has gone well, the system will predict answers with a high rate of accuracy.

When I hear people talk about AI, it often sounds like a description of some “automagical” response gleaned from an alien race. In fact, AI has been around for a very long time, but with very limited capabilities. One of the reasons for this has been the high cost of storage and compute. As the authors state above, training requires “thousands or even millions of examples,” which in the past would have demanded significant investment in hardware. Thanks to the commoditization of hardware and the advent of the cloud, the same storage and compute can now be consumed for a small fraction of what they cost just ten years ago.

There’s more to this picture, however, because machine learning is highly reliant upon mathematical and statistical computation. Hence the arrival of graphics processing unit (GPU) farms, delivered on a pay-per-use basis, to provide the high-performance computing required to not only build a model but iteratively improve it, in a reasonable time frame and at an affordable price. Moreover, cloud service providers such as Microsoft and Amazon have packaged hardware and software together into machine learning services, further reducing the effort required to derive new ML models.

Of note, a model can be derived from only a little training data, but its predictive accuracy will likely be low. As the authors point out, though, performance gains can still be had even with minimal quantities of data.
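
To make this concrete, here is a minimal sketch of that effect, assuming Python with scikit-learn; the dataset and model are my illustrative choices, not anything named in the article. It trains the same model on progressively larger slices of labeled data and scores it on examples held out from training:

```python
# A minimal sketch of "more labeled data, better accuracy."
# scikit-learn, the digits dataset, and logistic regression are
# illustrative choices, not anything from the article.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for n in (50, 200, 1000):
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train[:n], y_train[:n])                # train on n labeled examples
    print(n, round(model.score(X_test, y_test), 3))    # accuracy on unseen data
```

Accuracy climbs as the training slice grows, which is exactly why cheap storage and compute changed the economics of ML.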

Machine learning in practice

The next three quotes highlight the ways machine learning actually gets done. I believe they help convey the level of effort required to develop even a single model.

Any situation in which you have a lot of data on behavior and are trying to predict an outcome is a potential application for supervised learning systems.

Supervised learning systems solve for the function from input to output, learning from a large set of examples consisting only of correct answers. The reason supervised learning is well suited to prediction is that it approximates, for a given behavior (X), what the likely outcome will be (Y). This is also why, in my opinion, American football is one of the greatest challenges for AI. Contact me if you’d like to pursue that claim; it is one of my favorite conversations.
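
As a sketch of that X-to-Y framing, again assuming Python with scikit-learn and synthetic stand-in data (none of these choices come from the article):

```python
# Supervised learning: approximate the function from behavior (X)
# to outcome (Y) using a large set of correctly labeled examples.
# The synthetic data and random-forest model are placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=5000, n_features=10, random_state=0)

model = RandomForestClassifier(random_state=0)
model.fit(X, y)               # learn the X -> Y mapping from labeled examples
print(model.predict(X[:5]))   # predicted outcomes for five example rows
```

The entire exercise is pattern codification: the model is only as broad as the examples it was fit to.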

Unsupervised learning systems seek to learn on their own. We humans are excellent unsupervised learners: We pick up most of our knowledge of the world (such as how to recognize a tree) with little or no labeled data. But it is exceedingly difficult to develop a successful machine learning system that works this way.

Unsupervised learning is, I believe, what the majority of people picture when they hear “AI.” Unsupervised learning can help uncover patterns and relationships in your data, but we are a long way off from useful, machine-generated generalizations based on it.
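
For contrast with the supervised sketch above, here is a hedged example of unsupervised learning, again in Python with scikit-learn; k-means clustering and the toy data are my own illustrative choices:

```python
# Unsupervised learning: find structure in data without any labels.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)   # labels discarded

clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(clusters[:10])   # group assignments discovered from the data alone
```

Note that a human still chose the number of clusters, and a human must still interpret what, if anything, the groupings mean.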

In reinforcement learning systems the programmer specifies the current state of the system and the goal, lists allowable actions, and describes the elements of the environment that constrain the outcomes for each of those actions. Using the allowable actions, the system has to figure out how to get as close to the goal as possible. These systems work well when humans can specify the goal but not necessarily how to get there…. Of course, this means that a reinforcement learning system will optimize for the goal you explicitly reward, not necessarily the goal you really care about (such as lifetime customer value), so specifying the goal correctly and clearly is critical.

Reinforcement learning (RL) systems are the next generation of expert systems. They are particularly well suited to automation, though they require significant training to become proficient. RL is most likely what is at work when you hear about AI systems beating humans at chess or Go. However, as Eli Goldratt stated in his book The Goal, “Tell me how you measure me and I will tell you how I behave.” RL systems will perform relative to how they are rewarded.
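
To ground that, here is a toy tabular Q-learning sketch in Python. Everything in it, the five-state walk, the reward, the hyperparameters, is an illustrative invention of mine, not the article’s. The agent is given only the states, the allowable actions, and a reward signal, and must work out how to reach the goal:

```python
# Tabular Q-learning on a toy problem: states 0..4, goal at state 4,
# allowable actions are "step left" (-1) and "step right" (+1).
# The environment, reward, and hyperparameters are all illustrative.
import random

n_states, actions = 5, (-1, +1)
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1   # learning rate, discount, exploration

for _ in range(500):                    # training episodes
    s = 0
    while s != n_states - 1:
        # explore occasionally, otherwise take the best-known action
        a = random.choice(actions) if random.random() < epsilon \
            else max(actions, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0    # reward only at the goal
        best_next = max(Q[(s2, act)] for act in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# the learned greedy action per state (+1 means "move toward the goal")
print([max(actions, key=lambda act: Q[(s, act)]) for s in range(n_states)])
```

True to Goldratt’s point, change the reward line and the agent will dutifully learn different, possibly unwanted, behavior.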

The current state of artificial intelligence

In summary, AI has advanced greatly in the last five years, in tandem with the availability of inexpensive, high-performance compute and storage. The state of AI is that it works very well where it can be trained on a well-defined task or scope, but it cannot yet generalize. AI can augment human performance by identifying the needle in the haystack much more quickly and accurately, assuming it has been trained to identify a needle in a haystack to begin with.

Finally, I’ll leave you with some perspective on the power required to enable AI, from Thomas Simonini in his blog post “How AI can learn to generate pictures of cats”: “You can’t run this on your personal computer — unless you have your own GPUs or are ready to wait maybe 10 years! Instead, you must use cloud GPU services, such as AWS or FloydHub. Personally, I trained this DCGAN for 20 hours with Microsoft Azure and their Deep Learning Virtual Machine.”


JP Morgenthal, DXC Technology’s chief technology officer of Application Services, has been delivering IT services to business leaders for 30 years. He is a recognized thought leader in applying emerging technology for business growth and innovation. JP’s strengths center around transformation and modernization processes that leverage next-generation platforms and technologies. @jpmorgenthal

Comments

  1. Rightly said, JP Morgenthal! Thanks for sharing your thoughts; very useful. Please advise whether you feel that, in addition to compute and storage (extremely important for AI), communication speed is another fundamental building block for AI advancement. I am thinking from an IoT or autonomous-vehicle point of view, where a lot of data (lidar, radar, camera feeds, etc.) must be ingested for real-time decision making by AI algorithms.

  2. Mark Gira says:

    A common-sense description of the current state of AI. Thank you!
