Putting machine learning into context

Machine Learning is getting a lot more air time these days, but are we actually sure what it is?

The most common definition goes along the lines of:

“It gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959).

This is an old quote but it has stood the test of time. But how can computers “learn”? Have we really reached the age of artificial intelligence where they will take over the world and make humans redundant? I suspect not.

Let’s explore the core of the definition: the ability to learn.

What this really means is that there is a set of algorithms that, rather than simply following a static set of program instructions, make data-driven predictions or decisions by building a model from the data.

There are three recognised categories of algorithms:

Supervised learning – The computer is presented with example inputs (training data) and their desired outputs, given by a “teacher”, and the goal is to learn a general rule that maps inputs to outputs. The “easiest” example of supervised learning is a decision tree, which uses a tree-like graph or model of decisions and their possible consequences, including chance-event outcomes, resource costs, and utility. From a business decision point of view, a decision tree is the smallest set of yes/no questions that one has to ask in order to reach a correct decision most of the time. As a method, it allows you to approach the problem in a structured and systematic way to arrive at a logical conclusion.
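To make the idea concrete, here is a minimal sketch of a one-question decision tree (often called a decision stump) learned from labelled examples. The customer data, the `fit_stump` and `predict` names, and the use of Gini impurity to score questions are all assumptions made for illustration; real decision tree libraries do considerably more.

```python
# Hypothetical training data: weekly hours of product use for eight customers,
# and a "teacher"-supplied label saying whether each one cancelled (1) or stayed (0).
hours_used = [1, 2, 3, 4, 10, 12, 15, 20]
cancelled = [1, 1, 1, 1, 0, 0, 0, 0]


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def majority(labels):
    """Most common label in the list (ties go to 1)."""
    return int(sum(labels) * 2 >= len(labels))


def fit_stump(xs, ys):
    """Learn a single yes/no question of the form 'is x <= threshold?'."""
    best = None
    values = sorted(set(xs))
    # Candidate questions: midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        # Weighted impurity of the two groups the question produces.
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or impurity < best["impurity"]:
            best = {
                "threshold": threshold,
                "impurity": impurity,
                "if_yes": majority(left),
                "if_no": majority(right),
            }
    return best


def predict(stump, x):
    """Answer the learned question for a new, unseen input."""
    return stump["if_yes"] if x <= stump["threshold"] else stump["if_no"]


stump = fit_stump(hours_used, cancelled)
print("Learned question: hours <=", stump["threshold"])
print("Prediction for 2 hours:", predict(stump, 2))
print("Prediction for 18 hours:", predict(stump, 18))
```

Nobody typed the threshold into the program; it was derived from the labelled examples, which is the sense in which the rule is learned rather than explicitly programmed.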

There are many other supervised learning algorithms, which I will only reference: Naïve Bayes Classification, Ordinary Least Squares Regression, Logistic Regression, Support Vector Machines and Ensemble Methods.

Unsupervised learning – this is where the data is not labeled, so there are no error or reward signals, leaving the algorithm to find structure on its own. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end. Examples include Clustering Algorithms, Principal Component Analysis, Singular Value Decomposition and Independent Component Analysis.
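As a rough illustration of a clustering algorithm, the sketch below runs a bare-bones k-means on one-dimensional data. The spending figures, the `kmeans_1d` name and the fixed starting centroids are assumptions chosen to keep the example small and repeatable; no labels are supplied anywhere.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain k-means on 1-D data.

    Repeatedly assigns each point to its nearest centroid, then moves each
    centroid to the mean of the points assigned to it.
    """
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer; note there are no labels.
monthly_spend = [12, 15, 14, 90, 95, 88]

# Assumed starting guesses for two group centres.
centroids, clusters = kmeans_1d(monthly_spend, centroids=[0, 100])
print("Group centres:", centroids)
print("Groups:", clusters)
```

The algorithm is never told that there are "low spenders" and "high spenders"; it only finds that the numbers fall into two groups, and it is left to a person to decide what those groups mean.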

Reinforcement learning – this has been inspired by behavioural psychology and is concerned with how software agents ought to take actions in an environment so as to maximise the “reward”. There are many adjacent theories in this space, including game theory, control theory, operations research and swarm intelligence. Reinforcement learning is different because correct input/output pairs are never presented, nor are suboptimal actions explicitly corrected. There is a focus on on-line performance. A good and relevant example is a self-driving (autonomous) car, which operates without a teacher explicitly telling it whether it has come close to its goal.
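A self-driving car is far too large to sketch here, so the example below uses a much simpler stand-in: an agent choosing between two actions whose payout probabilities it does not know (a "two-armed bandit"), using an epsilon-greedy strategy. The payout probabilities, the `run_bandit` name and the parameter values are all assumptions for illustration.

```python
import random


def run_bandit(payout_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a multi-armed bandit.

    The agent is never told which action is correct. It only sees the
    reward (0 or 1) that follows each action it takes.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(payout_probs)  # running average reward per action
    pulls = [0] * len(payout_probs)        # how often each action was tried
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(payout_probs))
        else:
            # Exploit: take the action that currently looks best.
            action = max(range(len(estimates)), key=lambda a: estimates[a])
        # The environment responds with a reward; this is the only feedback.
        reward = 1 if rng.random() < payout_probs[action] else 0
        pulls[action] += 1
        # Incremental update of the average reward for this action.
        estimates[action] += (reward - estimates[action]) / pulls[action]
        total_reward += reward
    return estimates, pulls, total_reward


# Hypothetical environment: action 0 pays out 20% of the time, action 1 80%.
estimates, pulls, total_reward = run_bandit([0.2, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("Estimated value of each action:", estimates)
print("Times each action was taken:", pulls)
print("Action the agent now prefers:", best_action)
```

The reward accumulated while the agent is still learning counts towards `total_reward`, which is one way to read the focus on on-line performance: the agent has to perform acceptably during learning, not just after it.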

So, we now have a set of algorithms that clearly need people who understand the techniques deeply – hence the need for Data Scientists.

An interesting summary can be found here. This uses the tube map analogy, and as you can see, Machine Learning is merely a stop on the way in the full journey.

However, even with the best algorithms, we need ways of storing the data and visualising it. There are many analytic platforms, with the top quartile of the Gartner Magic Quadrant held by the likes of SAS, IBM, KNIME, RapidMiner, and Dell. There is also MS-Cortana (see announcement regarding IML solutions) hosted on Azure.

It’s important to put Machine Learning into context, best summarized, I believe, by the 2016 Gartner Hype Cycle for Smart Machines. The diagram shows that machine learning is on the wave of data science, but there is a lot before it and a lot after, and there is a long cycle time: 5-10 years before it becomes properly mainstream.

The conclusion: there are two discrete roles for machine learning. In the first, it is the solution, helping businesses transform themselves; in the second, machine learning is within the solution, transforming an organisation’s products and services. Regardless of whether an organisation is using the first or second type, it is an area that I believe will explode in terms of use cases. We will get to a point where an organisation that isn’t using this technique will very quickly become inferior to those that are.

Don’t be put off by data science; this is a very new and exciting area. First, you don’t need to be a deep mathematician who understands the complexities of clustering algorithms vs principal component analysis. As with many of these phenomena, it’s about knowing people who do understand these approaches. Of equal importance, however, is knowing how to apply the data findings. Data science has its own challenges: a data scientist can look at the data, write some algorithms, find some patterns and start to tell a story, but oftentimes the work doesn’t get the right level of stakeholder commitment because it’s not close enough to the business problem itself. Using an agile technique here is clearly one of the options that will increase stakeholder engagement: involve the right people (a variety) and iterate on the hypotheses. One technique I am particularly enthused about is the OODA loop.

Software tooling (the platform and the applications) is going to become the catalyst for increased speed-to-market and, to a certain degree, for dumbing down the complexities so that machine learning can become more mainstream. It will be a multi-layered architecture: industry applications and solutions at the top, followed by visualisation/collaboration, analytics, storage, and finally infrastructure. The danger, as is the case with many of these stacks, is spending too much on the technology and associated components rather than on the actual business problem. A classic example would be to ask an organisation whether they have a “Big Data strategy”, with the likely retort of “Yes, of course. We have deployed Hadoop.” Clearly not the right answer in this context.

As with many of these next generation technologies, the best way to understand and learn is to simply try it with a use case that is relevant to where you are. Get as close to the business problem as you can and iterate on a hypothesis.

A great place to start is Kaggle. Try the “Titanic: Machine Learning from Disaster” prediction competition.

This post first appeared in Neil’s blog.

Neil Fagan is an enterprise architecture expert, leading teams of architects who work on solutions from initial concept through delivery and support.






  1. Bas Geerdink says:

    Hi Neil, thanks for the nice overview. You’re entirely correct about the definition and appliance of machine learning, although to put it “into context” I would also have liked to see it as the subset of Artificial Intelligence that it is. I still see a lot of confusion when it comes to terms like AI, Data Science, Machine Learning, Deep Learning, etc.


