How machine learning methods that claim to protect privacy fuel discrimination and social sorting

In my lecture this week on the Ethics of AI, I argued that current trends in privacy-friendly machine learning present a new data protection challenge.

Differential privacy (DP) and federated learning (FL) claim to protect your individual privacy by ensuring that you are not re-identifiable from the model parameters. However, behavioural profiling, social sorting and discrimination by AI do not rely on re-identification but on predictive models that can be applied to anyone. DP and FL make it possible to run such predictive models under the flag of state-of-the-art privacy safeguards.
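To make the asymmetry concrete, here is a minimal sketch of what DP actually guarantees, using the classic Laplace mechanism on a counting query. The dataset, the `smoker` attribute and the parameter values are hypothetical illustrations, not taken from any real deployment:

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Inverse-CDF sampling from the Laplace distribution Lap(0, scale).
    u = random.uniform(-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_count(records, predicate, epsilon: float) -> float:
    # A counting query has sensitivity 1: adding or removing any single
    # person changes the count by at most 1, so Laplace noise with
    # scale 1/epsilon gives epsilon-differential privacy.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical dataset: one boolean "smoker" record per person.
records = [True] * 30 + [False] * 70
noisy = dp_count(records, lambda smokes: smokes, epsilon=1.0)
```

The guarantee is about membership: the noisy output barely changes whether or not any one individual's record is included. But the aggregate statistic itself, and any predictive model trained under the same regime, is still released and can still be applied to anyone.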

How differential privacy and federated learning fuel discriminatory AI

Data protection must address the structural effects of data processing that go beyond individual privacy. Big data and AI create data-driven social and economic power imbalances. Big data business models often produce and stabilise relations of exploitation, dominance/subordination or sexualised and racialised violence.

Today, many of the structural effects of big data and AI are driven by predictive analytics – specific machine learning applications that can predict potentially sensitive information (such as gender, race, credit risk, health, wealth, psychological dispositions, etc.) about any individual. These predictions are used to place an arbitrary user into risk-related or behavioural boxes so that they can be treated differently. To derive predictions about an arbitrary user, predictive models are trained on the data that a minority of often privileged users provide voluntarily because they believe they have “nothing to hide”.
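The mechanism described above can be sketched in a few lines. This is a deliberately simplistic toy model (a nearest-centroid classifier) with invented data and an invented “credit risk” label, chosen only to show the structure of the problem: a model fitted on volunteered data is then applied to score a third party who never shared anything:

```python
def centroid(rows):
    # Per-feature mean of a list of feature vectors.
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

def nearest_centroid_predict(x, centroids):
    # Return the label whose class centroid is closest (squared distance).
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: dist2(x, centroids[label]))

# Hypothetical training data volunteered by a few users:
# feature vectors (age, monthly_spend) with a sensitive risk label.
volunteers = {
    "low_risk":  [[35, 1200], [42, 1500], [51, 1100]],
    "high_risk": [[22, 300], [19, 250], [27, 400]],
}
centroids = {label: centroid(rows) for label, rows in volunteers.items()}

# The fitted model now scores an arbitrary person who contributed no data.
stranger = [24, 350]
prediction = nearest_centroid_predict(stranger, centroids)  # "high_risk"
```

The stranger's privacy interest here has nothing to do with re-identification: their data was never in the training set, yet the model assigns them a sensitive label all the same.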

Predictive Analytics explained

The promise of robust anonymisation makes users more willing to share even sensitive information with platform companies that train predictive models. While this may be beneficial in some contexts, in many cases DP and FL exploit the public’s increasing sensitivity to privacy to secure the input stream of training data for socially harmful big data business models. Claims of “mathematically rigorous” privacy guarantees distract users from the fact that privacy is about more than individual control over our personal data, as demonstrated by AI applications that discriminate and promote social inequality.

Anti-discrimination and mitigation of structural effects of big data and AI is all about regulating the effects on third parties of the data you choose to disclose. We urgently need a collectivist understanding of privacy to enable data protection that is fit for the age of AI.

A conceptual starting point of such a collectivist approach is the insight that predicted information can violate your privacy. It is not only the misuse, or the reconstruction from model parameters, of information you disclosed that can violate your privacy. Rather, predictive models trained on your data enable predictions about any person – and such predictions violate that person’s privacy.

This updated understanding of privacy is captured by the concept of predictive privacy which extends the scope of privacy to include predicted information:

An individual’s or group’s predictive privacy is violated if sensitive information about that person or group is predicted against their will or without their knowledge on the basis of data of many other individuals. [Mühlhoff 2021]

Predictive Privacy explained

Predictive privacy and differential privacy are thus antagonists. In fact, it seems that differential privacy is pushed by the industry for the very reason of ensuring the viability of discriminating and inequality-inducing business models in a climate of increasing privacy awareness and stricter privacy regulations.

Articles and essays on predictive privacy

  1. Mühlhoff, Rainer. 2021. “Predictive Privacy: Towards an Applied Ethics of Data Analytics”. Ethics and Information Technology. doi:10.1007/s10676-021-09606-x.
  2. Mühlhoff, Rainer. 2020. “We Need to Think Data Protection Beyond Privacy: Turbo-Digitalization after COVID-19 and the Biopolitical Shift of Digital Capitalism”. Medium, March. doi:10.2139/ssrn.3596506.
  3. Mühlhoff, Rainer. 2020. “Automatisierte Ungleichheit: Ethik der Künstlichen Intelligenz in der biopolitischen Wende des Digitalen Kapitalismus”. Deutsche Zeitschrift für Philosophie 68 (6): 867–90. doi:10.1515/dzph-2020-0059.