Data masking for GDPR: A force for positive change


If the words General Data Protection Regulation (GDPR) are enough to get your pulse racing, you are not alone. The GDPR, due to come into force on 25 May this year, will have a significant, permanent and far-reaching impact on any business that holds data on EU citizens, including organisations based outside Europe.

To truly understand the impact GDPR will have on data security, we need to travel back in time to 1995, when Directive 95/46/EC of the European Parliament and of the Council was introduced. Back then, the internet was in its infancy, a shiny new toy used only sparingly as a tool for business. Today the picture is very different, as we grapple with an ever-changing, increasingly complex IT landscape. Digital identities, cloud computing, mobile applications and IoT have radically changed the way we work, and they have transformed the data landscape beyond all recognition.

GDPR is all about the protection of data. The regulation addresses the transformation of the data landscape, and is designed to guide and govern how organisations around the world manage their privacy and data security practices.

Where is your data?

Knowing where your data is at any given time is a huge consideration when preparing for GDPR. Organisations share production system data with third parties for any number of business reasons. Huge chunks of production data are also copied to test and development environments to mimic the production build as closely as possible.

In all cases, there is a very real and dangerous risk of a data security breach occurring. Organisations that have clear visibility of their data and where it resides — including data stored in non-production environments — will be better prepared to take protective and preventive measures.

Anonymisation vs. pseudonymisation

Organisations have two options when it comes to de-identifying their data:

Anonymisation: Anonymisation, as defined in Recital (26) of the GDPR, is achieved when the “data subject is not or no longer identifiable”.

Pseudonymisation: Defined in Article 4 of the GDPR, pseudonymisation is used to remove links between the data subject and the data. It is achieved through “processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately.”

It is important to note that even though pseudonymisation removes all direct identifiers from sensitive data, the data could still be reverted to an identifiable format if it were to fall into the wrong hands. It therefore cannot be classified as anonymous and will still be regarded as personal data under the GDPR. As a general rule, if a set of data can be re-identified with a reasonable amount of effort, it will not be classified as anonymous, whereas data masked so that it is impossible to identify an individual will be.
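The distinction can be illustrated with a short sketch. The field names, the secret key and the tokenisation scheme below are illustrative assumptions, not mechanisms mandated by the GDPR; the point is that pseudonymised data remains reversible for anyone holding the separately stored key, while anonymised data keeps no path back to the individual.

```python
import hashlib
import hmac

# Hypothetical record; field names are invented for illustration.
record = {"name": "Jane Doe", "email": "jane.doe@example.com", "age": 34}

# Pseudonymisation: replace the direct identifier with a keyed hash.
# The key is the "additional information" of Article 4 and must be
# kept separately; whoever holds it can re-link the token to Jane.
SECRET_KEY = b"stored-separately-under-strict-access-control"

def pseudonymise(value: str) -> str:
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

pseudonymised = {
    "subject_token": pseudonymise(record["email"]),
    "age": record["age"],
}

# Anonymisation: identifiers are discarded outright and quasi-
# identifiers are coarsened, so no key exists that can reverse it.
anonymised = {"age_band": "30-39"}

print(pseudonymised)  # still personal data under the GDPR
print(anonymised)     # outside GDPR scope only if truly not re-identifiable
```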

De-identification and the sensitive data landscape

As we have seen, the GDPR encourages organisations to implement data masking techniques to secure and protect data, either as a way to reduce the burden of legislation or as part of a solid data management strategy. Data masking involves a process that de-identifies classified, sensitive or personal data by replacing, removing or hiding characters within the original set of data.

Many organisations are already working with data masking to a certain degree, using processes that transform data into dummy or fake data. They can do this statically (once, when the data is stored) or dynamically (every time the data is read; in this case it is sometimes called “redaction”). Data masking scenarios often involve varying levels of security clearance assigned to different members of staff. For example, help desk agents may be able to see partial payment details of clients they are assisting, while the finance department may have full access to the client’s data. By masking data in this way, and also ensuring non-production and test environments are treated in a similar fashion, the risk of data breaches is greatly reduced.
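The role-based scenario above can be sketched in a few lines. The roles and the masking policy here are assumptions for illustration, not a standard API; real dynamic masking is usually enforced in the database or an access layer rather than in application code:

```python
def mask_pan(pan: str, role: str) -> str:
    """Return a view of a payment card number appropriate to the role.

    Illustrative policy: help-desk agents see only the last four
    digits, finance sees the full value, everyone else sees nothing.
    """
    digits = pan.replace(" ", "")
    if role == "finance":
        return digits
    if role == "helpdesk":
        return "*" * (len(digits) - 4) + digits[-4:]
    return "*" * len(digits)

print(mask_pan("4111 1111 1111 1234", "helpdesk"))  # ************1234
print(mask_pan("4111 1111 1111 1234", "finance"))   # 4111111111111234
```

Applied dynamically at read time, the same stored value yields a different view per role; applied statically, the masked value would replace the original permanently.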

Keeping it real in non-production environments

A sprawling mass of non-production environments, teeming with copies of unprotected, easily identified data, is another big concern for organisations chasing GDPR compliance. To reach and maintain compliance, robust solutions are required that efficiently protect all data, regardless of its location or use. The first step in any initiative is understanding where data sits in the IT environment and finding ways to protect this data without slowing down projects and progress.
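That first discovery step can be as simple as scanning text for patterns that look like personal data. The patterns and sample text below are assumptions for illustration; real discovery tooling scans databases, file shares and logs with far richer classifiers than two regular expressions:

```python
import re

# Minimal sketch of a sensitive-data discovery scan.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "uk_phone": re.compile(r"\b0\d{10}\b"),
}

def find_pii(text: str) -> dict:
    """Map each category to the matches found in the text."""
    return {name: pat.findall(text)
            for name, pat in PII_PATTERNS.items() if pat.findall(text)}

sample = "Contact jane.doe@example.com or call 01234567890."
print(find_pii(sample))
```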

Alessandro Vallega, Security Business Development Director for Oracle EMEA, explains how using real data in non-production systems can pose a big risk for organisations. “Developers do not need real data for testing. The complexity of a test environment requires technology that can transform production data to good quality data that cannot be linked to an individual.”

Protecting critical data in non-production environments has become a challenge for organisations in recent years. What is required is a solution that can mask data while still retaining realistic values, making it safe for testing, development and outsourcing to third parties.
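One way to keep masked values realistic is to substitute identifiers from a pool while preserving the shape of structured fields. The names, field layout and substitution pool below are invented for illustration; a production masking tool would also preserve referential integrity across tables so that joins still work after masking:

```python
import random

# Sketch of masking that keeps test data realistic.
FAKE_NAMES = ["Alex Smith", "Sam Jones", "Chris Lee", "Pat Brown"]

def mask_row(row: dict, rng: random.Random) -> dict:
    masked = dict(row)
    masked["name"] = rng.choice(FAKE_NAMES)
    # Keep the phone number's shape but randomise its digits.
    masked["phone"] = "".join(
        str(rng.randint(0, 9)) if ch.isdigit() else ch
        for ch in row["phone"]
    )
    return masked

rng = random.Random(42)  # fixed seed: repeatable test fixtures
print(mask_row({"name": "Jane Doe", "phone": "0123-456-789"}, rng))
```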

A big incentive to get things right

The GDPR takes a carrot-and-stick approach to masking data: the carrot is a number of incentives for organisations that adopt it (more on this in a moment), and the stick is fines of up to four percent of annual worldwide turnover. There is a huge incentive to get things right, but many organisations will not realise the incentives and implications until they experience their first GDPR breach.

GDPR incentivises data masking in the following ways:

  • In the event of a data breach: If the breach involves a low risk to individuals, for example, and data has been masked, a data breach notification to affected individuals and regulators may not be necessary.
  • In the event of data disclosure requests: Article 11 of the GDPR states that controllers are not required to provide data subjects with access, rectification, erasure or data portability if they can no longer identify a data subject. For example, if directly identifying data has been deleted rather than held separately, it may not be possible to re-identify the data without obtaining additional information. This exemption is applicable only if the controller can prove the data is not re-identifiable and, if possible, is able to provide notice to data subjects about this practice.

Under the GDPR, regulatory bodies and individuals will have additional powers to make legal claims and make requests for data. Organisations today need to be looking at solutions that help them put solid data masking procedures, encryption technologies and data security policies firmly in place.

Harvey Maddocks is the global lead for DXC’s Oracle offerings, focussed on developing revenue-producing relationships with decision makers and the C-suite. Recognising the impact of cloud in the Oracle community, Harvey promotes multiple Oracle offerings to simplify blending future cloud applications with the integration and running of existing systems.

Alessandro Vallega is Security Business Development Director for Oracle EMEA. He leads a cross-functional team on the GDPR (General Data Protection Regulation, EU 679/2016) at EMEA level (marketing, legal, sales, training, technology). He founded and coordinates an external blog on the same topic. He has defined a European methodology to evaluate the database security level of a data centre and the advantages of identity and access management technology. In 2007 he founded, and still leads, the Oracle Community for Security, and in that context led the creation of several publications about security and privacy involving the cloud, mobility, social media, healthcare and return on security investments; about the role of the CISO; and about how to prevent fraud. He is an author of the annual Italian ICT Security Report by CLUSIT and a member of the CLUSIT board of directors.




  1. I don’t understand how anonymisation or pseudonymisation address concerns over k-anonymity. Surely it is necessary to pre-define all possible queries of a dataset to understand whether it is k-anonymous? (E.g. suppose the data show gross salary, age and state, and you know that there are only three people earning 2M p.a. in UT; you’ve got a 1 in 3 chance of finding out more from each data collection.)

  2. The work of dynamic data masking is to protect personally identifiable data. Dynamic data masking does not require any additional server resources.
