Athena Bourka: Artificial Intelligence brings more challenges for cybersecurity

Pseudonymisation can facilitate the processing of personal data, while protecting the identity of individuals

Photo: Athena Bourka. Source: ENISA.

Healthcare is a domain where the processing of personal data is not only inevitable but also crucial. Pseudonymisation techniques are, thus, critical in several processes, as we at ENISA have also seen recently in the context of the Covid-19 pandemic and the discussion on the associated tracing apps, says Athena Bourka, a Network and Information Security Expert at ENISA, in an interview with EUROPOST.

Ms Bourka, how do advancing technologies change the techniques and measures that are at the forefront of cybersecurity?

As technology progresses, more advanced cybersecurity techniques and measures have to be deployed. The overarching protection principles remain the same, though. For example, the progress in Artificial Intelligence (AI) brings more challenges from a cybersecurity perspective, which must be met with novel threat assessment models and relevant mitigation measures.

Still, the key security properties (confidentiality, integrity and availability) are the starting point for any such new models and measures. The same applies to data protection. The EU General Data Protection Regulation (GDPR) sets out the data protection principles, which must then be translated into measures, processes and procedures. These measures, processes and procedures need to be continuously monitored and developed as technology progresses to ensure that the data protection principles are always met.

Many experts say that data is the new oil of the digital economy, but what does pseudonymisation mean for personal data protection, given that a huge flow of data is nowadays collected and stored by various players?

Pseudonymisation is a technical process whereby individuals' identifiers are replaced by pseudonyms, in a way that individuals cannot be identified within a given dataset unless additional information is available, e.g. the correlation between the original identifier and the pseudonym. In this way, pseudonymisation can facilitate the processing of personal data, while protecting the identity of individuals in different scenarios. It is important to stress that pseudonymised data are still personal data, as it is possible for the entity performing the pseudonymisation (e.g. a controller under the GDPR) to identify the individuals with the use of the additional information. Pseudonymisation should not be confused with anonymisation, where data are rendered anonymous and, thus, re-identification of individuals is not practically feasible.
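To illustrate the basic idea, here is a minimal sketch in Python (not taken from the ENISA report; the record fields and the in-memory mapping table are purely illustrative). The mapping table plays the role of the "additional information": whoever holds it can re-identify individuals, while the pseudonymised records alone do not reveal who is who.

```python
import secrets

# Illustrative only: a tiny in-memory pseudonymisation step.
# The mapping table is the "additional information" that allows
# re-identification; it must be kept by the pseudonymising entity.
mapping = {}  # original identifier -> pseudonym

def pseudonymise(identifier: str) -> str:
    """Replace an identifier with a random pseudonym, remembering the link."""
    if identifier not in mapping:
        mapping[identifier] = secrets.token_hex(8)
    return mapping[identifier]

record = {"patient_id": "AB-1234", "diagnosis": "J45"}
shared = {**record, "patient_id": pseudonymise(record["patient_id"])}

print(shared)  # e.g. {'patient_id': '3f9c5a2e1b7d4c08', 'diagnosis': 'J45'}

# Re-identification is only possible with the mapping table:
reverse = {pseudo: original for original, pseudo in mapping.items()}
print(reverse[shared["patient_id"]])  # 'AB-1234'
```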

How do users benefit from this technical process and what are the best practices?

Pseudonymisation is a security and data protection by design technique that can enhance the protection of personal data in several scenarios. Its most obvious benefit is the possibility to hide the identity of the individuals from third parties (other than the entity performing the pseudonymisation). For example, as we have shown in the recent report by the European Union Agency for Cybersecurity (ENISA), 'Data Pseudonymisation: Advanced Techniques and Use Cases' (https://www.enisa.europa.eu/publications/data-pseudonymisation-advanced-techniques-and-use-cases/), a controller could pseudonymise its data before passing it on to a data processor, e.g. a contractor involved in the processing on the controller's behalf. In this way, the identity of the individuals is not revealed to the processor, while the processing of data is still possible on its side. Similar scenarios are possible for information sharing between different controllers - for example, in the area of medical research, where proper pseudonymisation techniques can facilitate the process by offering enhanced protection of personal data.

Why does this report cover use cases for healthcare and for information sharing in cybersecurity?

In the report, ENISA chose to cover use cases for the healthcare sector and for information sharing in cybersecurity in order to highlight the diversity of the possible scenarios and the different approaches that can be employed.

On one hand, healthcare is a domain where the processing of personal data is not only inevitable but also crucial. Pseudonymisation techniques are, thus, critical in several processes, as we at ENISA have also seen recently in the context of the Covid-19 pandemic and the discussion on the associated tracing apps. It is also an area with more complex scenarios and diverse processes that require advanced pseudonymisation architectures and techniques.

On the other hand, cybersecurity is an area where personal data processing is often not clearly considered, despite being a key element of several operations. Today, most modern cybersecurity technologies no longer rely on static, signature-based protection, but rather depend on security telemetry analytics - such as correlating suspicious events that reveal the existence of an advanced threat, training Machine Learning systems to classify threats, establishing reputation-based protection, building behavioural threat models, etc. As such, cybersecurity technologies rely strongly on the processing of personal data, and pseudonymisation plays an important role in this.
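As a rough illustration of how this can look in practice (a sketch only, not a method prescribed by ENISA; the event fields are hypothetical), a keyed hash can replace user names and IP addresses in security telemetry so that events from the same source can still be correlated, while the identities stay hidden from anyone who does not hold the key:

```python
import hmac
import hashlib
import secrets

# Secret key held only by the entity performing the pseudonymisation.
KEY = secrets.token_bytes(32)

def pseudonym(value: str) -> str:
    """Deterministic keyed-hash pseudonym: the same input always maps
    to the same output under KEY, so correlation is preserved."""
    return hmac.new(KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

events = [
    {"user": "alice", "src_ip": "10.0.0.5", "action": "failed_login"},
    {"user": "alice", "src_ip": "10.0.0.5", "action": "failed_login"},
    {"user": "bob",   "src_ip": "10.0.0.9", "action": "login"},
]

pseudonymised = [
    {**e, "user": pseudonym(e["user"]), "src_ip": pseudonym(e["src_ip"])}
    for e in events
]

# The two 'alice' events still share the same pseudonym, so repeated
# failed logins can be detected without exposing the actual identity.
for event in pseudonymised:
    print(event)
```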

How can it be decided which pseudonymisation approaches are the most suitable technical option?

There are various pseudonymisation techniques available, but there is no one-size-fits-all solution. The suitable approach should be selected based on the organisation's goals, the overall context of the processing of personal data, and the risks to the rights and freedoms of individuals. At ENISA, we stress the need to thoroughly analyse the requirements and compare the merits of each approach during the design phase of pseudonymisation.

For example, as shown in the above-mentioned report, pseudonymisation can be achieved with basic techniques, such as symmetric encryption and keyed hash functions, as well as with more advanced ones, such as Merkle trees and zero-knowledge proofs. The choice of which technique to use greatly depends on the levels of protection and utility desired.
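As a brief sketch of the symmetric-encryption variant (this assumes the third-party Python package `cryptography` is available and uses a made-up identifier; it is not the report's reference implementation), the ciphertext of an identifier can itself serve as the pseudonym and, unlike a keyed hash, it can be reversed by whoever holds the key:

```python
# Sketch of pseudonymisation via symmetric encryption; assumes the
# third-party 'cryptography' package (pip install cryptography).
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # held only by the pseudonymising entity
cipher = Fernet(key)

identifier = "patient-AB-1234"

# The ciphertext acts as the pseudonym; without the key it reveals nothing.
pseudonym = cipher.encrypt(identifier.encode())
print(pseudonym)

# The key holder can recover the original identifier when needed.
original = cipher.decrypt(pseudonym).decode()
print(original)  # 'patient-AB-1234'
```

Note that Fernet encryption includes a random element, so the same identifier produces a different token each time it is encrypted, which already hints at the choice of pseudonymisation policy discussed next.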

The same applies to the choice of the pseudonymisation policy (e.g. deterministic pseudonymisation versus randomised pseudonymisation), as well as to the choice of the overall architectural model, including the different actors involved. The ENISA report provides different use cases with diverse requirements to show how these choices vary, depending on the specific scenarios in question.
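Purely as an illustration of that policy choice (not taken from the report; the key handling and lookup table are simplified), a deterministic policy gives the same identifier the same pseudonym every time, preserving linkability across records, while a randomised policy issues a fresh pseudonym per occurrence, so records about the same person cannot be linked without the mapping:

```python
import hmac
import hashlib
import secrets

KEY = secrets.token_bytes(32)  # secret key for the deterministic policy
lookup = {}                    # pseudonym -> identifier, kept as the additional information

def deterministic(value: str) -> str:
    """Same identifier -> same pseudonym; linkability across records is preserved."""
    return hmac.new(KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

def randomised(value: str) -> str:
    """Fresh pseudonym per occurrence; records can only be linked via the lookup table."""
    p = secrets.token_hex(8)
    lookup[p] = value
    return p

print(deterministic("alice"), deterministic("alice"))  # identical pseudonyms
print(randomised("alice"), randomised("alice"))        # two different pseudonyms
```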

What is the level of adoption of data pseudonymisation in the Member States?

Pseudonymisation is explicitly mentioned in the GDPR as a security and data protection by design measure, and the recent example of contact tracing apps is an indication of good progress across the EU. At the EU Agency for Cybersecurity, we collaborate with researchers, regulators and industry representatives to define state-of-the-art means in the field of pseudonymisation, as well as to support practical implementation across the Union.

Close-up

Athena Bourka is a Network and Information Security Expert at the European Union Agency for Cybersecurity (ENISA), working in the areas of data security, privacy and trust. She is also ENISA's Data Protection Officer. Before joining ENISA, she worked for more than 10 years as a privacy and security expert at the Hellenic Data Protection Authority and at the European Data Protection Supervisor (as a seconded national expert). She has also worked in the past in the areas of healthcare data security and environmental information systems and networks. She studied electrical and computer engineering and holds a PhD in information security.
