Today, Google launched an “origin trial” of Federated Learning of Cohorts (aka FLoC), its experimental new ad-targeting technology. A switch has quietly been flipped on in millions of instances of Google Chrome: those browsers will begin sorting their users into groups based on behavior, then sharing group labels with third-party trackers and advertisers around the web. A random set of users has been selected for the trial, and at this time they can only opt out by disabling third-party cookies.
Although Google announced this was coming, the company has been sparse with details about the trial until now. We have pored over blog posts, mailing lists, draft web standards, and Chromium’s source code to figure out exactly what is happening.
EFF has already written about why FLoC is a terrible idea.
Below we describe how this trial version will work, and some of the most important technical details we have learned so far.
FLoC is meant to replace cookies. In the trial, it will supplement them.
Google designed FLoC to help advertisers target ads when third-party cookies disappear. During the trial period, trackers will be able to collect FLoC IDs in addition to third-party cookies.
This means that every tracker that currently monitors your behavior across the web using cookies will now also receive your FLoC cohort ID. The cohort ID is a direct reflection of your online behavior, and it can supplement the behavioral profiles that many trackers already maintain.
The trial will affect up to 5% of Chrome users worldwide.
We’ve been told that the trial is currently rolled out to 0.5% of Chrome users in select regions: for now, that means Australia, Brazil, Canada, India, Indonesia, Japan, Mexico, New Zealand, the Philippines, and the US. Users in qualifying regions will be chosen at random, regardless of most ad and privacy settings. Only users who have turned off third-party cookies in Chrome will be opted out by default.
Furthermore, the team behind FLoC has requested that Google ramp the trial up to 5% of users so that ad tech companies can better train models on the new data. If that request is granted, tens or hundreds of millions more users will be enrolled in the trial.
Users have been enrolled in the trial automatically. There is no dedicated opt-out (yet).
As described above, a random subset of Chrome users will be enrolled in the trial without notice, much less consent. Those users will not be asked to opt in. In the current version of Chrome, users can only opt out of the trial by turning off all third-party cookies.
Future versions of Chrome will add dedicated controls for Google’s “privacy sandbox,” including FLoC. But it is not clear when those settings will go live, and in the meantime, users who want to turn off FLoC must turn off third-party cookies as well.
Turning off third-party cookies is generally not a bad idea. After all, cookies are at the center of the privacy problems that Google says it wants to solve. But turning them off altogether is a blunt countermeasure, and it breaks many conveniences (such as single sign-on) that internet users rely on. Many privacy-conscious Chrome users instead rely on more targeted tools, including extensions like Privacy Badger, to prevent cookie-based tracking. Unfortunately, Chrome extensions cannot control whether a browser exposes a FLoC ID.
Websites are not asked to opt in, either.
FLoC computes a label based on your browsing history. For the trial, Google will by default include every site that serves ads, which is most sites on the web. Websites can opt out of being included in FLoC calculations by sending an HTTP response header, but some hosting providers do not give their customers direct control over headers. Many website owners may not be aware of the trial at all.
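For site owners who do control their own headers, the documented opt-out is the `Permissions-Policy: interest-cohort=()` response header. A minimal sketch using Python’s standard library (the handler class and port are our own illustration; in practice the same header would normally be set in web server or CDN configuration rather than application code):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class FlocOptOutHandler(BaseHTTPRequestHandler):
    """Illustrative handler for a site that opts out of FLoC."""

    def do_GET(self):
        self.send_response(200)
        # This header tells Chrome not to include visits to this site
        # when computing visitors' FLoC cohorts.
        self.send_header("Permissions-Policy", "interest-cohort=()")
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"<html><body>This site opts out of FLoC.</body></html>")

    def log_message(self, *args):
        pass  # keep the example quiet

# To serve: HTTPServer(("127.0.0.1", 8000), FlocOptOutHandler).serve_forever()
```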
Each user’s FLoC ID, the label that reflects their last week of browsing history, will be available to any website or tracker that wants it.
There will be over 33,000 possible cohorts.
One of the most important undefined parts of the FLoC specification is exactly how many cohorts there will be. Google ran a preliminary experiment with 8-bit cohort IDs, meaning there were only 256 possible groups. This limited how much information trackers could learn from a user’s cohort ID.
However, an examination of the latest version of Chrome reveals that the live version of FLoC uses 50-bit cohort identifiers. Those cohorts are then grouped together into 33,872 total cohorts, over 100 times more than in Google’s first experiment. Google has said it will ensure that “thousands” of people are grouped into each cohort, so that no one can be identified by their cohort alone. But cohort IDs will still expose plenty of new information (around 15 bits) and will give fingerprinters a big leg up.
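The “around 15 bits” figure follows directly from the cohort count: the base-2 logarithm of the number of possible groups gives the number of bits a cohort ID can reveal. A quick back-of-the-envelope check:

```python
import math

# Google's first experiment: 8-bit IDs, so 256 possible cohorts
print(math.log2(256))     # 8.0 bits

# The live trial: 33,872 total cohorts
print(math.log2(33_872))  # ~15.05 bits of identifying information
```

Fifteen bits is not enough to identify anyone on its own, but combined with the other signals fingerprinters already collect, it meaningfully narrows the pool.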
The trial is likely to last until July.
Any tracker, advertiser, or other third party can register through Google’s Origin Trial portal to begin collecting FLoC IDs from users. The page currently indicates that the trial may run until July 13. Google has also made clear that the exact details of the technology, including how cohorts are computed, may change, and we may see further iterations of the FLoC grouping algorithm between now and then.
Google plans to audit FLoC for correlations with “sensitive categories.” The plan misses the bigger picture.
Google has promised to make sure that cohorts are not too tightly correlated with “sensitive categories” like race, sexuality, or medical conditions. To monitor this, Google plans to collect data about which sites are visited by users in each cohort. It has released a white paper describing its approach.
We’re glad to see a concrete proposal, but the white paper sidesteps the most pressing issues. The question Google should be asking is “can you target people in vulnerable groups?” The white paper reduces this to “can you target people who visited a specific site?” That is a dangerous oversimplification. Instead of tackling the hard problem, Google has chosen to focus on an easier version that it believes it can solve. In the meantime, it has failed to address FLoC’s worst potential harms.
During the trial, any user who has turned on “Chrome Sync” (which lets Google collect their browsing history), and who has not disabled any of the default sharing settings, will now share the cohort ID associated with their browsing history with Google.
Google will then check whether each user visited any sites that it considers part of a “sensitive category.” For example, WebMD might be labeled in the “medical” category, or Pornhub in the “adult” category. If too many users in a cohort have visited a particular kind of “sensitive” site, Google will block that cohort. Any user who would be part of a “sensitive” cohort will be placed in an “empty” cohort instead. Of course, trackers will still be able to see that those users are in the “empty” cohort, revealing that they were originally classified as “sensitive” in some way.
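The blocking rule described above amounts to a simple threshold filter. Everything in the sketch below is our own illustration, not Google’s implementation: the function name, the threshold value, and the sentinel “empty” cohort label are all assumptions.

```python
EMPTY_COHORT = "empty"       # sentinel for blocked cohorts (our label, not Google's)
SENSITIVE_THRESHOLD = 0.10   # assumed cutoff; Google has not published its number

def audit_cohorts(sensitive_fraction):
    """Map each cohort ID to itself, or to the empty cohort if too many
    of its members visited pages in a "sensitive category".

    sensitive_fraction: dict mapping cohort ID -> fraction of members
    observed visiting a sensitive site.
    """
    return {
        cohort: (EMPTY_COHORT if fraction > SENSITIVE_THRESHOLD else cohort)
        for cohort, fraction in sensitive_fraction.items()
    }

blocked = audit_cohorts({1001: 0.02, 1002: 0.35, 1003: 0.08})
print(blocked)  # {1001: 1001, 1002: 'empty', 1003: 1003}
```

Note how the sketch makes the leak obvious: any tracker that sees the `EMPTY_COHORT` value learns that the user’s original cohort was flagged as sensitive.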
For the origin trial, Google is relying on its massive cache of personal browsing data to perform the audit. In the future, it plans to use other privacy-preserving technology to do the same thing without learning individuals’ browsing histories.
No matter how Google does it, this plan will not solve the larger problems with FLoC: discrimination and predatory targeting. The proposal rests on the assumption that people in “sensitive categories” will visit specific “sensitive” websites, and that people outside those groups will not. But behavior correlates with demographics in unintuitive ways. It is highly likely that certain demographics will visit a different subset of the web than other demographics do, and that such behavior will not be captured by Google’s “sensitive sites” framing. People with depression, for instance, may exhibit similar browsing behaviors, but not necessarily anything as explicit and direct as visiting “depression.org.” Meanwhile, tracking companies are well equipped to gather traffic from millions of users, link it to data about demographics or behavior, and decode which cohorts are correlated with which sensitive traits. Google’s site-based system, as proposed, has no way of stopping that.
As we have said before, Google could choose to dismantle the old scaffolding of surveillance without replacing it with something new and uniquely harmful. Google has failed to address the harms of FLoC, or even to convince us that they can be addressed. Instead, it is running a test that will share new data about millions of unsuspecting users. This is another step in the wrong direction.