Many organizations are familiar with statistical assessments of identifiability, like Expert Determinations under HIPAA or anonymization assessments under other regulations like the GDPR. In these assessments, a technical expert evaluates the identifiability of the data by modeling the likelihood that an opportunity to identify someone in the data arises, together with the probability that such an attempt succeeds. By contrast, a Motivated Intruder Test, also known as adversarial testing, assesses the effectiveness of privacy protections in practice by having a data analyst attempt to re-identify people in the data.
Both approaches are considered components of effective anonymization strategies in international standards and guidance, such as ISO/IEC 27559 and guidance released by the United Kingdom’s Information Commissioner’s Office (ICO). However, each approach is different in terms of how it is executed and how its findings are interpreted.
What is a Motivated Intruder Test?
A Motivated Intruder Test involves a data analyst attempting to re-identify individuals in a dataset deemed to be de-identified or anonymized. This is analogous to a penetration test in physical security, where a tester attempts to break into a physical space without the usual authorization, for example, through social engineering or by defeating locks or other security measures.
In a Motivated Intruder Test, an analyst attempts re-identification by flagging fields in the data that can be referenced externally and looking for individuals who are unique (or close to unique) in their combinations of those fields. For example, if a dataset pertains to a medical condition primarily affecting a specific demographic, like older adults, an analyst might look for very young patients in the data and cross-reference them against media articles covering young patients with this condition.
For data that is not well de-identified or anonymized, this can yield high-confidence matches between the dataset and externally referenceable individuals.
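The uniqueness check described above can be illustrated with a minimal sketch. It assumes a small, made-up set of records, and the field names (`age_band`, `sex`, `region`) and the `flag_small_groups` helper are hypothetical, chosen only for illustration. It counts how many records share each combination of externally referenceable fields and flags the combinations that occur only once. A real test would still need to cross-reference flagged records against outside sources before claiming a match.

```python
from collections import Counter

# Hypothetical toy records; field names and values are illustrative assumptions.
records = [
    {"age_band": "70-79", "sex": "F", "region": "North"},
    {"age_band": "70-79", "sex": "F", "region": "North"},
    {"age_band": "70-79", "sex": "M", "region": "North"},
    {"age_band": "70-79", "sex": "M", "region": "North"},
    {"age_band": "80-89", "sex": "F", "region": "South"},
    {"age_band": "80-89", "sex": "F", "region": "South"},
    {"age_band": "10-19", "sex": "M", "region": "South"},  # unusually young patient
]

# Fields an analyst judges to be referenceable in outside sources (an assumption).
QUASI_IDENTIFIERS = ("age_band", "sex", "region")


def flag_small_groups(rows, fields, max_group_size=1):
    """Return (row_index, key, group_size) for every row whose combination
    of `fields` is shared by at most `max_group_size` rows."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return [
        (i, key, counts[key])
        for i, key in enumerate(keys)
        if counts[key] <= max_group_size
    ]


# Records that are unique on the chosen fields are re-identification candidates.
candidates = flag_small_groups(records, QUASI_IDENTIFIERS)
for index, key, size in candidates:
    print(f"row {index}: {key} appears {size} time(s)")
```

Raising `max_group_size` widens the search to records that are close to unique rather than strictly unique, at the cost of more candidates to check by hand.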
Motivated Intruder Tests are conducted under several significant practical constraints:
- Effort constraints, where the project scope fits a pre-established budget of hours to attempt re-identification.
- Financial constraints, which affect what data assets might be purchased and what data storage and computing power might be used in the re-identification attempt.
- Technical constraints, in terms of analysis tools to detect potential identifiers, particularly in unstructured data like text, images, audio, or video.
- Ethical constraints, like limiting access to datasets that may be available through illicit means, such as breaches or leaks, prohibiting contact with an individual who may be in the dataset to attempt to validate the identity, and prohibiting any additional monitoring or surveillance of an individual.
How do I interpret the findings of a Motivated Intruder Test?
The illustrative power of a Motivated Intruder Test can be stark. If the test yields several high-confidence re-identifications, it is a clear “smoking gun” demonstration that the existing privacy protections applied to a dataset are not sufficient to mitigate re-identification risk. However, if the test does not yield any re-identification or near re-identification, re-identification could still be possible through other means.
The lack of success may be due to one or more of the constraints listed above or some bad (or is it good?) luck in the re-identification attempt. To bring back the physical security analogy: a burglar failing to break into a building does not mean the building cannot be broken into.
When and why would I choose a Motivated Intruder Test?
Re-identification risk determinations (statistical assessments of identifiability) provide an unambiguous evaluation that incorporates assumptions or modeling about how the data might be breached. They are widely used: HIPAA Expert Determination requires them explicitly, and some guidance in GDPR contexts, like EMA Policy 0070, calls for them more implicitly.
Motivated Intruder Tests, on the other hand, provide an unambiguous illustration of data being re-identifiable in cases where a candidate re-identification is found. They also allow the assumptions and modeling used in statistical assessments to be evaluated under real-life scenario testing.
The ISO/IEC 27559 standard recommends using both analytical techniques and motivated intruder testing to challenge assumptions and improve data-sharing strategies. ICO guidance also encourages the use of both assessments.
In short, Motivated Intruder Tests are a great option if you are looking to evaluate your assumptions, validate your methods, or provide additional evidence to support the strength of your de-identification or anonymization scheme. They offer practical proof points when results are positive and are illustrative when results are negative, which is particularly helpful in high-urgency or high-impact cases.