Big Data: analisi e proposte

Andrea Fedi * e Monica Riva **

Questo breve articolo ricapitola la nozione di Big Data e il rapporto difficile tra detti Big Data e le attuali regole sulla protezione dei dati personali (limitazione del trattamento, trasparenza, consenso informato, profilazione e decisioni automatizzate). A valle di tale sintesi, gli autori provano a indicare un cammino interpretativo per riconciliare l’uso massivo di ampie banche dati a fini di profilazione con i principi del GDPR.

PAROLE CHIAVE: dati personali - intelligenza artificiale - Big Data - GDPR

Big Data: analysis and proposals

This short article recapitulates the notion of Big Data and the difficult fit between Big Data and current data protection rules (purpose limitation, transparency, consent, profiling and automated decisions). The authors then suggest an interpretative path to reconcile the massive use of large data sets for profiling individuals with the principles of the GDPR.

Keywords: Big data – artificial intelligence – GDPR – personal data.


1. Introduction

The term “Big Data” has long been part of the jargon of practitioners interested in the legal ramifications of new technologies. It commonly denotes extremely large data sets that may be analysed computationally to extract inferences about data patterns, trends and correlations [1]. In other words, the term refers to a phenomenon characterised by:

i. a magnitude requirement (large data sets),

ii. a methodology (computational analysis, i.e., through a machine),

iii. a final goal (the extraction of knowledge from the data sets, as a miner extracts a mineral from a mine) [2].

Current technology has indeed made it possible to collect and process huge amounts of data and to extract new, predictive knowledge from their analysis with great ‘velocity’, drawing on large ‘volume’ databases that contain a ‘variety’ of different data (so large and varied that a natural person could not reasonably carry out such an analysis), while controlling their ‘veracity’ and, ultimately, creating ‘value’ [3].

Indeed, the computational analysis of data sets (including through artificial intelligence, AI) makes it possible to find (often unexpected) correlations among data and to draw conclusions (forecasts) almost in real time. This in turn provides a highly valuable capacity to predict (and influence) the reactions of individuals, communities and societal groups (elections, consumer preferences, investment appetites, etc.) [4]. Big Data have therefore been described as a new paradigm for the collection, storage, management, analysis and visualisation of large data collections with heterogeneous characteristics. That paradigm is based on the well-known five Vs (volume, velocity, variety, veracity and value) [5].


2. Big Data and IP

Big Data entail several legal issues. In this article we investigate the interplay between Big Data and IP legislation (mainly to answer the question of whether the work of data aggregators is protected as an IP asset) and the controversial relationship between Big Data and data protection laws (mainly to explore whether Big Data are compatible with the protections currently afforded to data subjects). We will see that data aggregators are exposed to two different but equally serious types of risk: that their investment and work in aggregating data are not protected by IP rules and remain exposed to exploitation by others; and that data analytics triggers very cumbersome compliance duties towards data subjects and data protection authorities.

We start with the IP ramifications. The relationship between Big Data and IP is extremely complex and currently debated. Indeed, Big Data have features that are not fully compatible with the traditional IP rights of common legal systems.

Notwithstanding these legal difficulties, the principle of “technological neutrality” requires that the same IP rules (e.g. copyright, patents, trade secrets) apply to new technologies, regardless of the format in which a work is embodied and the technical means by which it is reproduced.

The protection, and consequent remuneration, of Big Data through proprietary rights is an important goal, especially for data-related companies, first and foremost large data aggregators, and for new professional figures such as data scientists. The absence of IP protection can be a serious limitation for data-driven businesses, since the level of competition is very high. It is therefore important to fill the gaps through specific exceptions introduced by new legislation and, in the meantime, through sound interpretation.

i) Creative databases - Leaving aside the de iure condendo perspective and considering the legal arrangements that the Italian legal system already provides for the protection of data collections, copyright law comes into consideration, in particular its provisions on ‘databases’. Those provisions embody the principle that an aggregation of data, by the very fact of its collection, can have economic value and can even attract intellectual property rights.

The first question to answer is whether art. 2, paragraph 9, of Law no. 633 of 22 April 1941 (the Italian Copyright Act) applies, since it lists among the intellectual works subject to copyright protection “the databases which, for the choice or arrangement of the material, constitute an intellectual creation of the author”.

This last ..

