Anonymization and pseudonymization are data protection measures. Anonymization is the modification of personal data in such a way that the data can no longer be attributed to a specific or identifiable natural person, or can be attributed only with a disproportionate expenditure of time, cost and labor. Complete anonymization is very difficult to achieve.
In pseudonymization, the name or another identifier is replaced by a pseudonym (usually a code consisting of a combination of letters or numbers) in order to rule out or significantly complicate identification of the person concerned. In contrast to anonymization, pseudonymization preserves the links between different data sets that have been pseudonymized in the same way.
Pseudonymization therefore makes it possible, with the help of a key, to assign data to a person. Without this key the assignment is impossible or difficult, since the data and the identifying features are kept separate. The decisive point is that person and data can still be brought back together.
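The separation of data and key described above can be illustrated with a minimal Python sketch. It assumes the key is a simple lookup table of random codes; the class name `Pseudonymizer`, its methods and the sample records are hypothetical and chosen only for illustration, not taken from any particular system.

```python
import secrets


class Pseudonymizer:
    """Replaces identifiers with random codes; the mapping table acts as the key."""

    def __init__(self):
        # identifier -> pseudonym and pseudonym -> identifier.
        # In practice this table would be stored apart from the data.
        self._to_code = {}
        self._to_identifier = {}

    def pseudonym_for(self, identifier):
        # Reuse an existing code so the same person stays linkable across data sets.
        if identifier not in self._to_code:
            code = secrets.token_hex(8)
            while code in self._to_identifier:  # extremely unlikely collision
                code = secrets.token_hex(8)
            self._to_code[identifier] = code
            self._to_identifier[code] = identifier
        return self._to_code[identifier]

    def pseudonymize(self, records, id_field="name"):
        # Return copies of the records with the identifier replaced.
        return [{**r, id_field: self.pseudonym_for(r[id_field])} for r in records]

    def reidentify(self, code):
        # Only whoever holds the key table can perform this lookup.
        return self._to_identifier[code]


p = Pseudonymizer()
grades = p.pseudonymize([{"name": "Alice", "grade": 1.7}, {"name": "Bob", "grade": 2.3}])
visits = p.pseudonymize([{"name": "Alice", "visits": 3}])
```

In this sketch, the entry for Alice carries the same code in `grades` and `visits`, so the two data sets remain linkable, while mapping a code back to a name requires the table held inside `p`.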
The more informative the collected data is (e.g. income, medical history, place of residence, height), the greater the theoretical possibility of attributing it to a specific person and identifying them even without the key. To maintain anonymity, such data may have to be separated or deliberately altered so that establishing identity becomes more difficult.
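One way to make this risk tangible is to count how many records share the same combination of attributes: a combination that occurs only once points to a single person. The following Python sketch does this in the spirit of what is commonly called k-anonymity, a term the text itself does not use. The records, the field names and the `coarsen` step (rounding income and height into bands) are hypothetical assumptions for illustration; coarsening is only one possible form of deliberate alteration.

```python
from collections import Counter


def smallest_group(records, fields):
    """Size of the smallest group of records sharing the same values in `fields`."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return min(counts.values())


def coarsen(record):
    """Hypothetical alteration: income in bands of 10,000, height in bands of 10 cm."""
    return {
        **record,
        "income": record["income"] // 10_000 * 10_000,
        "height_cm": record["height_cm"] // 10 * 10,
    }


records = [
    {"income": 41_000, "height_cm": 171, "city": "Bonn"},
    {"income": 43_500, "height_cm": 178, "city": "Bonn"},
    {"income": 47_200, "height_cm": 174, "city": "Bonn"},
    {"income": 62_000, "height_cm": 183, "city": "Bonn"},
    {"income": 68_900, "height_cm": 186, "city": "Bonn"},
]
fields = ["income", "height_cm", "city"]

before = smallest_group(records, fields)                       # every record is unique
after = smallest_group([coarsen(r) for r in records], fields)  # records now fall into groups
```

With the exact values every record is unique (`before` is 1); after coarsening, each record shares its attribute combination with at least one other (`after` is 2), at the cost of less precise data.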
Deliberately reversing a previous anonymization is called deanonymization. The General Data Protection Regulation does not apply to anonymized data.
Examples of pseudonymization
- A pseudonym is used as an e-mail address and nickname on the Internet. The communication partners do not know the real identity. If the service provider knows it, the provider will disclose it upon request (e.g. in civil lawsuits or criminal investigations). Alternatively or additionally, remailers can be used, which prevent a message from being traced by anonymizing its headers.
- If a university professor wants to make the results of a written exam easily accessible to students, the professor asks them to write a pseudonym of their choice on their exam sheets. After grading, the professor can publish a notice (if necessary also on the Internet) listing all results by pseudonym. The link between a pseudonym and the respective student can then be established only by the professor or, in individual cases, by the student concerned.
Examples of anonymization
- If, in the “professor” example above, the exam sheets bearing the students' pseudonyms were subsequently destroyed, the information on the grade notice would be anonymized as far as the general public is concerned, since the entries could no longer be assigned to the respective students. However, each student would still recognize their own entry on the notice, because they remember their pseudonym.
- A secret ballot in elections is based on the principle of anonymity (see Secrecy of the Ballot). It is still possible to trace who voted, but it is no longer possible to assign the ballot paper to the voter.
- Aggregation, i.e. combining individual data records into a common group, can lead to anonymization. Whether it does depends on parameters such as the size of the group and the characteristics of its members. A grade point average calculated for 100 exam participants can be regarded as sufficiently anonymized, whereas a grade point average for two participants may allow conclusions to be drawn about the individuals.
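The dependence of aggregation on group size can be sketched in Python as follows. The function names and the threshold `min_group_size=10` are assumptions made for illustration only; the text does not specify a minimum group size. The second function shows why an average over two participants is not anonymous: either participant can derive the other's grade from the published average and their own grade.

```python
from statistics import mean


def published_average(grades, min_group_size=10):
    """Return the mean only if the group is large enough; otherwise suppress it."""
    if len(grades) < min_group_size:
        return None  # too few participants: publishing would reveal too much
    return mean(grades)


def other_grade(average, own_grade):
    # With exactly two participants: average = (own + other) / 2,
    # so the other grade follows directly from the published average.
    return 2 * average - own_grade


small_group = [1.0, 3.0]
suppressed = published_average(small_group)            # None, the group is too small
leaked = other_grade(mean(small_group), small_group[0])  # reconstructs the second grade
```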
Use on the Internet
Pseudonyms are considered permissible on the Internet. This is subject to the condition that the “service providers have no knowledge of the illegal activity or information and, in the case of claims for damages, are not aware of any facts or circumstances from which the illegal act or information becomes apparent, or that they have taken immediate action to remove the information or disable access to it, as soon as they have become aware of it.” In practice, however, anonymity and the use of pseudonyms provoke different reactions in society:
- Anonymous: A person’s reputation seems to suffer when they act anonymously. For many people, wanting to keep something hidden implies that “you have something to hide”. The state also tries to intervene, as complete anonymization hinders criminal prosecution. The debate about data retention in particular made it clear that law enforcement authorities are increasingly trying to gain access to data.
- Pseudonym: Because pseudonymization still allows the connection data of real people to be accessed by lawful means, the suspicion of “trying to hide something” may be reduced. However, some people who use pseudonyms believe they are “anonymous” and act accordingly. For this reason, some lament a decline of the “culture of etiquette” on the Internet that they associate with pseudonyms, or draw up rules for correct behavior online. Others defend the use of pseudonyms as a prerequisite for shielding individual freedom of expression and personal development from state, social or political restrictions.
As the above examples show, so-called anonymization or pseudonymization services have a weak point in their systems: system administrators can see the data and activities of Internet users. Since internal abuse is a serious threat on the Internet alongside attacks by hackers, service providers try to protect themselves against it.
Possible safeguards taken by service providers
Service providers committed to privacy on the Internet use anonymization to earn the trust of Internet users. The key question is who has access to the data. The following mechanisms play a role in safeguarding it:
- Laws of the respective country where the servers are located
- Internal policies or technical and organizational measures
- Technical exclusion of the operator’s employees from access to the data