This is the English summary of a longer German-language article. The publication is part of the “Databroker Files” series.
A new data set obtained from a US data broker reveals for the first time about 40,000 apps from which users’ data is being traded. The data set was obtained by a journalist from netzpolitik.org as a free preview sample for a paid subscription. It is dated to a single day in the summer of 2024.
Among other things, the data set contains 47 million “Mobile Advertising IDs”, to which 380 million location records from 137 countries are assigned. In addition, the data set contains information on devices, operating systems and telecommunication providers.
This investigation is part of an international cooperation by the following media: Bayerischer Rundfunk/ARD (Germany), BNR Nieuwsradio (Netherlands), Dagens Nyheter (Sweden), Le Monde (France), netzpolitik.org (Germany), NRK (Norway), SRF/RTS (Switzerland) and WIRED (USA).
Overview of our findings
- The approximately 40,000 apps in the new dataset cover a wide range of categories, from gaming, dating and shopping to news and education. They include some of the most popular apps worldwide, with millions of downloads in some cases.
- For a smaller number of apps, the data set contains alarmingly precise location data, which can help to identify a person’s place of residence. These apps include the queer dating app Hornet, with more than 35 million users; the messaging app Kik, with more than 100 million downloads in the Google Play Store alone; Germany’s most popular weather app Wetter Online, which also has more than 100 million downloads in the Google Play Store; the flight tracking app Flightradar24, with more than 50 million downloads in the Google Play Store; the app of German news site Focus Online; and classifieds apps for German users (Kleinanzeigen) and French users (leboncoin).
- For a larger number of apps, the data set contains less precise locations, which appear to have been derived from IP addresses. This list includes popular apps such as Candy Crush, Grindr, Vinted, Happy Color, dating apps Lovoo and Jaumo, news aggregator Upday, German email apps gmx.de and web.de as well as the popular Dutch weather app Buienalarm.
- Since the sample only covers one day, it is difficult to identify people based on their locations from this data set alone. However, in combination with other data sets from the advertising industry, which the research team obtained from data brokers, it’s possible to identify and track people on a large scale. The location data might for example provide clues to their home and work addresses.
- Thus, the team was able to identify users of Wetter Online in Germany and Kik in Norway. The individuals confirmed that the data must belong to their devices and their use of the respective apps.
- Location data aside, the mere information about who uses which apps can already be dangerous. For example, the data set includes numerous Muslim and Christian prayer apps, health apps (blood pressure, menstruation trackers) and queer dating apps, which hint at special categories of personal data under the GDPR.
Where did the data set come from?
The research team obtained the data set from US data broker Datastream Group, which now uses the name Datasys. The company did not respond to multiple requests for comment.
Contact with the data broker was established through Berlin-based data marketplace Datarade. In response to inquiries, the company states that it does not host any data itself. According to a spokesperson, “Data providers use Datarade to publish profiles and listings, enabling users to contact them directly”. Datarade “requires data providers to obtain valid consent in case they’re processing personal data and to aggregate or anonymize data in case they’re processing sensitive personal data”.
Where does the data originate?
According to our analysis, the data originates from Real Time Bidding (RTB), which is a process in the online advertising ecosystem. These are auctions in which advertising inventory of apps and websites is sold. In the process, apps and websites send data about their users to hundreds or thousands of companies. This data contains the information that we can see in our data set. There have already been multiple warnings that advertising companies are collecting the data from RTB in order to sell it – often without the knowledge or explicit consent of the users or their apps.
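To make this more concrete, the following is a minimal sketch of what a single bid request in such an auction could look like and how the fields described above might be read from it. The field names follow the public OpenRTB 2.x specification, which is an assumption on our part: the investigation does not state in which format the broker's data is stored. The sample request, all of its values and the helper `summarize` are hypothetical and invented for illustration.

```python
import json

# Hypothetical, hand-written bid request. Field names follow OpenRTB 2.x;
# every value is made up and does not come from the data set.
SAMPLE_BID_REQUEST = json.dumps({
    "id": "example-request-1",
    "app": {"bundle": "com.example.weather", "name": "Example Weather"},
    "device": {
        "ifa": "00000000-0000-0000-0000-000000000000",  # advertising ID
        "os": "android",
        "make": "ExampleMaker",
        "carrier": "Example Telecom",
        "ip": "203.0.113.7",  # address from a documentation-only range
        "geo": {"lat": 52.52, "lon": 13.405, "type": 1, "country": "DEU"},
    },
})

# OpenRTB encodes the origin of a location in geo.type:
# 1 = GPS/location services, 2 = derived from IP address, 3 = user provided.
GEO_SOURCES = {1: "gps", 2: "ip", 3: "user-provided"}


def summarize(raw_request):
    """Extract the fields a recipient of the bid request could store."""
    request = json.loads(raw_request)
    device = request.get("device", {})
    geo = device.get("geo", {})
    return {
        "app": request.get("app", {}).get("bundle"),
        "advertising_id": device.get("ifa"),
        "os": device.get("os"),
        "carrier": device.get("carrier"),
        "location": (geo.get("lat"), geo.get("lon")),
        "location_source": GEO_SOURCES.get(geo.get("type"), "unknown"),
    }


print(summarize(SAMPLE_BID_REQUEST))
```

In this sketch, every company that receives the request could keep the advertising ID, the app identifier and the coordinates. The `location_source` value illustrates one way in which precise, GPS-based locations could be told apart from rough, IP-derived ones.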
What the apps say
None of the apps we have contacted so far says it has business relations with Datastream Group / Datasys. The apps Hornet and Vinted, for example, wrote that they cannot explain how their users’ data ended up with data brokers. Queer dating app Hornet emphasizes that it does not share actual location data with third parties and has announced an investigation. Other companies such as Kik, Wetter Online, Kleinanzeigen, Flightradar24, Grindr and King, the company behind the game Candy Crush, did not respond to press inquiries.
Reactions
Experts from politics, government agencies and civil society expressed concern about the findings.
Michael Will, the Bavarian data protection commissioner, said there would be consequences. In an interview, he describes the findings of the investigation as disillusioning and alarming. The data protection official views the situation as a blatant breach of trust. “This is contrary to everything that the average users of apps would expect – to be able to track where they have been for months afterwards.” The data broker should not have had this data. “This is beyond the agreed rules of the game.”
Will also criticizes Real Time Bidding: anyone who uses it to display advertising must ask themselves whether their own contractual partners are really abiding by the contract. As a result of the investigation, the data protection authority wants to take action itself. “We have investigative powers. We will now make intensive use of them on the basis of the information you have provided,” says Will – and points out that his authority can also impose sanctions. “We have the option of imposing quite considerable fines.”
In view of the latest findings, the German Federal Ministry for Consumer Protection (BMUV) writes: The collection of the data alone must be prevented. “We need effective EU-wide protection against personalized advertising to prevent app providers from having incentives to collect more data than is necessary to offer an app.” The ministry continues to advocate a “consistent switch to alternative advertising models”.
In addition, the Ministry for Consumer Protection is calling for technical standards to prevent devices from collecting identifying data in the first place. “The manufacturers of operating systems and end devices also have a role to play here.” Finally, the supervisory authorities would need to take consistent action.
Michaela Schröder from the Federation of German Consumer Organizations (vzbv) comments: “The current findings show and confirm once again that the global online advertising market has escaped any control. Unscrupulous data traders collect and disseminate highly sensitive information about people, while websites and apps make these illegal practices possible in the first place and the supervisory authorities seem to be completely overwhelmed.”
Consumers are left defenceless against the massive risks posed by data trading, says Schröder. The vzbv is therefore calling for action at the European level. “It is long overdue for the European Commission to effectively protect consumers and present a proposal to ban personalized advertising – for example, through the announced Digital Fairness Act,” Schröder said.
Difference to Gravy Analytics leak
The results of our investigation confirm and expand on the insights that experts gained in early January from data obtained by hackers from US data broker Gravy Analytics. The Gravy Analytics leak also mentions thousands of apps; this data also apparently comes from Real Time Bidding. Among them are numerous apps that are also represented in our data set from the Datastream Group: Candy Crush, Grindr, Kik, Wetter Online, Focus Online, Flightradar24, Kleinanzeigen and many more.
However, the list of apps in our data set is much longer. For the first time, we can also differentiate between the apps for which only very rough location data is available and those for which users can be located exactly. It is this precise location data that puts users at particular risk, as it allows conclusions to be drawn about their home address and movement patterns.
Well, you name some popular apps. But there should be a way for anyone to look up tracking apps in the list of the 40,000 apps.
Is there any legal problem for publishing the whole list?