Data analytics and data-driven approaches in Machine Learning are now among the most hailed computing technologies in many industrial domains. One major application is predictive analytics, which uses large sets of behavioral and usage data to predict sensitive attributes, future behavior, or cost, risk and utility functions associated with target groups or individuals. This paper stresses the severe ethical and data protection implications of predictive analytics and outlines a new approach to tackling them. First, it introduces the concept of “predictive privacy” to formulate an ethical principle protecting individuals and groups against the prediction of sensitive information using Big Data and Machine Learning. Second, it analyses the typical data processing cycle of predictive systems to provide a step-by-step discussion of ethical implications, locating occurrences of predictive privacy violations. Third, the paper sheds light on what is qualitatively new in the way predictive analytics challenges ethical principles such as human dignity and the (liberal) notion of data protection as the preservation of privacy. These new challenges arise when predictive systems transform statistical inferences, which are knowledge about the cohort of training data donors, into individual predictions, thereby crossing what I call the “prediction gap”. Finally, the paper concludes that data protection in the age of predictive analytics is a collective matter, as we face situations where an individual’s (or group’s) privacy is violated using data that other individuals provide about themselves, possibly even anonymously.
Comments and feedback welcome!