Updating Purpose Limitation for AI

Joint research project by the philosopher Rainer Mühlhoff and the legal scholar Hannah Ruschemeier.

The project responds to the risk of secondary use of trained AI models, currently one of the most severe regulatory gaps with regard to AI. Purpose Limitation for Models restricts the use of a trained AI model to the purpose for which it was originally trained and for which the training data was collected.

See also the Predictive Privacy project.

The risk of secondary use of trained models

Imagine medical researchers build an AI model that detects depression from speech data. Suppose the model is trained on data from volunteer psychiatric patients. This could be a beneficial project that improves medical diagnosis, which is why many patients consent to the use of their data. But what if the trained model falls into the hands of the insurance industry, or of a company that builds AI systems to evaluate job interviews? In these contexts, the model would facilitate implicit discrimination against an already vulnerable group.

There are currently no effective legal limitations on reusing trained models for other purposes (this includes the AI Act). Secondary use of trained models thus poses an immense societal risk and remains a blind spot in ethical and legal debate.

Figure: The risk of misuse of trained models (schematic representation, © Mühlhoff/Ruschemeier)

What is Purpose Limitation for AI?

In our interdisciplinary project combining critical AI ethics and legal studies, we develop the concept of Purpose Limitation for Models as an ethical and legal framework to govern the purposes for which trained models (and training datasets) may be used and reused.

The concept comes in two variations:

  1. A machine learning model shall only be trained, used and transferred for the purposes for which the training data was collected.
  2. An actor shall only train a machine learning model from a data set if the purposes for which the data set was originally collected are compatible with the purpose for which the model is trained.
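
To make the two variations concrete, here is a minimal illustrative sketch in Python. It is not part of the legal proposal: the names (`Dataset`, `TrainedModel`, `may_train`, `may_use`) and the exact-match compatibility test are our illustrative assumptions, standing in for the legal compatibility assessment.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dataset:
    name: str
    collection_purposes: frozenset  # purposes declared when the data was collected

@dataclass(frozen=True)
class TrainedModel:
    name: str
    training_purpose: str
    training_data: Dataset

def is_compatible(purpose: str, declared_purposes: frozenset) -> bool:
    # Placeholder compatibility test: exact match against the declared purposes.
    # A real test would encode legal compatibility criteria (cf. Art. 6(4) GDPR)
    # rather than simple set membership.
    return purpose in declared_purposes

def may_train(dataset: Dataset, model_purpose: str) -> bool:
    # Variation 2: a model may only be trained from a data set if its purpose
    # is compatible with the purposes for which the data was collected.
    return is_compatible(model_purpose, dataset.collection_purposes)

def may_use(model: TrainedModel, use_purpose: str) -> bool:
    # Variation 1: a trained model may only be used or transferred for the
    # purposes for which its training data was collected.
    return is_compatible(use_purpose, model.training_data.collection_purposes)

# Example from the text: speech data collected for medical diagnosis.
speech_data = Dataset("volunteer_speech", frozenset({"medical diagnosis"}))
depression_model = TrainedModel("depression_detector", "medical diagnosis", speech_data)

assert may_train(speech_data, "medical diagnosis")               # permitted
assert not may_use(depression_model, "job applicant screening")  # blocked reuse
```

Such a check only has teeth if purpose metadata travels with the model; the governance structures discussed below (e.g., the AI Act's database of high-risk systems) indicate where such metadata could be anchored.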

Purpose limitation is originally a concept from data protection regulation (Art. 5(1)(b) GDPR), but it does not automatically apply to the training and/or further processing of AI models. This is because AI models can be trained from anonymised training data, and in many relevant cases the model itself might qualify as anonymous data. The GDPR does not apply to the processing of anonymous data.

Figure: Two variations of Purpose Limitation for AI (schematic representation, © Mühlhoff/Ruschemeier)

Why is Purpose Limitation for AI important?

We argue that possession of trained models lies at the core of a growing asymmetry of informational power between AI companies and society: the production of predictive and generative AI models is the most recent form of this asymmetry between data-processing organisations (mostly large companies) and societies. Limiting this power asymmetry must be a goal of regulation, and Purpose Limitation for Models would be a key step in that direction. Without public control of the purposes for which existing AI models can be reused in other contexts, this asymmetry poses significant individual and societal risks in the form of discrimination, unfair treatment, and exploitation of vulnerabilities (e.g., medical conditions being implicitly estimated when screening job applicants). Our proposed purpose limitation for AI models aims to establish accountability and effective oversight and to prevent the collective harms that arise from this regulatory gap.

Does the AI Act prevent the risk of secondary use of trained models?

The AI Act does not regulate AI models in the training phase; it applies only once they are placed on the market. Moreover, the AI Act's risk categorisation rests on a dichotomy of product-safety risks and fundamental-rights risks, in which the secondary use of models is not considered a risk factor. Consequently, the secondary use of models plays no role in the AI Act's risk assessment of an AI system. The AI Act's requirement to create a database of high-risk systems could, however, serve as a starting point for a governance structure implementing our proposal of a Purpose Limitation for AI.

Research articles

On Purpose Limitation for Models

  1. Mühlhoff, Rainer, and Hannah Ruschemeier. 2024. "Regulating AI via Purpose Limitation for Models". AI Law and Regulation. https://dx.doi.org/10.21552/aire/2024/1/5.
  2. Mühlhoff, Rainer, and Hannah Ruschemeier. 2024. "Updating Purpose Limitation for AI: A Normative Approach from Law and Philosophy". SSRN Preprint, January. https://papers.ssrn.com/abstract=4711621.

On the risk of secondary use of trained models

  1. Ruschemeier, Hannah. 2024. "Prediction Power as a Challenge for the Rule of Law". SSRN Preprint. doi:10.2139/ssrn.4888087.
  2. Mühlhoff, Rainer. 2024. "Das Risiko der Sekundärnutzung trainierter Modelle als zentrales Problem von Datenschutz und KI-Regulierung im Medizinbereich". In KI und Robotik in der Medizin – interdisziplinäre Fragen, edited by Hannah Ruschemeier and Björn Steinrötter. Nomos. doi:10.5771/9783748939726-27.
  3. Mühlhoff, Rainer, and Theresa Willem. 2023. "Social Media Advertising for Clinical Studies: Ethical and Data Protection Implications of Online Targeting". Big Data & Society, 1–15. doi:10.1177/20539517231156127.
  4. Mühlhoff, Rainer. 2023. Die Macht der Daten. Warum Künstliche Intelligenz eine Frage der Ethik ist. V&R unipress, Universitätsverlag Osnabrück. doi:10.14220/9783737015523.

On the limitations of existing data protection regulation

  1. Mühlhoff, Rainer, and Hannah Ruschemeier. 2024. "Predictive Analytics and the GDPR: Collective Dimensions of Data Protection". Law, Innovation and Technology. doi:10.1080/17579961.2024.2313794.
  2. Ruschemeier, Hannah. 2024. "Generative AI and Data Protection". SSRN Preprint. https://papers.ssrn.com/abstract=4814999.
  3. Mühlhoff, Rainer, and Hannah Ruschemeier. 2022. "Predictive Analytics und DSGVO: Ethische und rechtliche Implikationen". In Telemedicus – Recht der Informationsgesellschaft, Tagungsband zur Sommerkonferenz 2022, edited by Hans-Christian Gräfe and Telemedicus e.V., 38–67. Deutscher Fachverlag.