Unchaining data portability

The role of data portability in the EU Digital Strategy

The concept of data portability relates to the characteristic of a set of data to be moved to, from and among applications, operating systems, or devices, with minimal friction.

The European legislator has identified data portability as a fundamental means of developing digital policies that benefit both citizens, by giving them greater control over their data, and the market, by reviving competition through clearer rules and easier mechanisms for data sharing, access and re-use.

In practice, the possibility for end-users and businesses to move to and from digital service providers seamlessly, without losing content or disrupting their services, is, at least in theory, the perfect arena to fuel competition and better services.

Yet the realization of such a seemingly simple capability of data is hindered by an unparalleled amount of legal, economic, and technological complications. Data portability is in fact hard to regulate. Its complexity is due to its inherent, two-part soul: one half is its economically driven capacity to impact market competition, the other its human-rights-driven capacity to enable people’s informational self-determination. Additionally, the two souls of data portability are inseparable, meaning that it is impossible to regulate only for data protection while leaving competition untouched. This “inseparability of souls” is a characteristic that in itself affects the regulation of a right to data portability, because of its cascading implications for multiple domains, ranging from technological interoperability and industrial standardization to human and economic rights, market competition and consumer rights, and policy on data sharing and governance.

Historically, the concept of portability stemmed from “number portability”, enshrined in art. 30 of the Universal Service Directive. Ever since, in the action of the EU legislator, the concept of portability has taken on different objects, affected different subjects, and required new technologies as well as the development of theoretical frameworks for data sharing and governance to keep pace with the evolution of the European digital market.

The objects of portability legislation have thus come to include personal data (GDPR, and the proposals for the Data Governance Act (‘DGA’), the Digital Markets Act (‘DMA’) and the Data Act), even special categories thereof (such as health data in the proposed European Health Data Space Regulation (‘EHDS’)); non-personal data (Free Flow of Non-Personal Data Regulation (‘FFNPD’), Open Data Directive, and again DSA and DMA); and also the services and online content of European users (Content Portability Regulation and Digital Content Directive (‘DCD’)). Self-evidently, such a heterogeneous legislative framework, spread across Directives and Regulations, each realizing diverse political strategies with its own specific objectives, through horizontal and vertical regulation, over the span of 20 years (and, importantly, the last twenty years), in the context of the ever-changing economics, technologies and societal structures of the digital ecosystem, has created a legislative omnishambles.

Such a legislative omnishambles is extremely hard to navigate, even for legal experts, but it does have legal effects and does create rights and obligations for end-users as well as businesses.

Even within the GDPR, the text of art. 20 allows for multiple interpretations of rights and obligations. For instance, it is unclear to users and providers which personal data must be portable, considering that “data provided by the data subject” can mean data actively generated by the person, data observed by the provider, or even inferred data. It is also unclear which structured, machine-readable and commonly used formats are also suitable for ensuring interoperability, and what the legal relationship between interoperability and portability is. Another fundamental, unanswered question is where to draw the line of a “portable minimum” for each service so that the ported data are still meaningful to the data subject. There is in fact a difference between being able to port single photos and albums, or entries in a list of contacts and a social graph of relationships. In these cases, keeping the structures and collections of the single data entries can sometimes be as vital to the service as the entries themselves. It is in such cases that the law does not clarify what the minimum “data unit” is.
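To make the “data unit” question concrete, the following minimal sketch contrasts two JSON exports of the same photo library: a flat one, and one that keeps album membership. The data model (`Photo`, `Album`) and the function names are hypothetical and chosen for illustration only; they are not a format required by art. 20 GDPR or used by any particular provider.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Photo:
    photo_id: str
    filename: str


@dataclass
class Album:
    title: str
    photo_ids: List[str] = field(default_factory=list)


def export_flat(photos):
    """Export only the single entries; the collection structure is lost."""
    return json.dumps({"photos": [asdict(p) for p in photos]}, indent=2)


def export_structured(photos, albums):
    """Export the entries together with the albums that group them."""
    return json.dumps(
        {
            "photos": [asdict(p) for p in photos],
            "albums": [asdict(a) for a in albums],
        },
        indent=2,
    )


# Hypothetical sample data.
photos = [Photo("p1", "beach.jpg"), Photo("p2", "sunset.jpg")]
albums = [Album("Holiday 2021", ["p1", "p2"])]

flat = json.loads(export_flat(photos))
structured = json.loads(export_structured(photos, albums))
```

Both exports are structured and machine-readable, yet only the second lets a receiving service rebuild the album. Which of the two counts as the portable minimum is exactly the point the law leaves open.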

Snowballing from there, the portability of single data items or collections thereof might create issues with the “rights of others”, such as privacy or intellectual property. As for privacy, data portability requests may encompass personal data of others, e.g. in the case of contact lists, conversations with, or pictures of others, where it will be hard to balance the interests at stake or even find legitimate grounds for processing. Moreover, once the personal data of others are in the control of the requesting data subject, they fall under the household exemption. Because of this exemption, they are no longer protected under the GDPR, which is a security problem, and a big one should the downloaded datasets contain hundreds of contacts of vulnerable subjects or hazardous content, or be sent via insecure means. As for intellectual property, there will be cases where pictures were adjusted with proprietary filters, where collections were created by the service provider on the basis of the subject’s generated or observed personal data, or where content was generated by a “prosumer”. These cases raise questions about the legitimacy of portability requests and need legal and technical answers.

Nowadays, the reality is that, on the one hand, the majority of people do not know about the existence of their right to data portability, or would in any case not know how to exercise it. On the other hand, there are no clear rules for service providers on how to address such requests without infringing, somewhere down the line, some stakeholders’ legitimate interests, nor on how to create portability services that comply with all the potentially applicable rules. And finally, the European and national courts as well as regulatory authorities have not yet clarified questions on portability: only the case UK drivers v. Ola decided on a data portability request, but without solving any of the above; the same holds for the guidelines adopted by the WP29/EDPB.

“Politics” of data portability

Data portability is not a necessary function in most information systems. It is instead a function that the architects of an information exchange system may want, or may be obliged, to embed (by a regulatory constraint in the case of Art. 20 GDPR). This means that decisions on the existence and extent of data portability functions are the result of a normative decision of either the developers or, in our case, the legislator. Such decisions relate to how, and which, data (personal, non-personal, content, etc.) should be governed, who should decide what to do with them, and who should be responsible for making that possible. With its capability of enabling or hindering such decisions, data portability is a crucial means within the realm of the “politics of data”.

The political goal of the EU legislator is to unchain the power of European data, first by destabilising the de facto ownership over data by the (mostly American) information industry, and then by using data the “European way”, that is fairly, securely, to the benefit of its people and businesses, and with respect for fundamental rights. To operationalize such a “data sovereignty” strategy, the results of a heated political and academic conversation about governance models have rewarded forms of data openness, access, sharing and re-use, which are allegedly better at reaching goals of information privacy, innovation and competition than data property, ownership, and other exclusivity models.

Technologies for data portability

From a technological perspective, realizing portability is quite impractical. Data migration and re-adaptation do not happen smoothly, and the lack of mandated top-down coordination from the EU is not helping the standardisation process. As for data formats, practical research on portability requests showed that respondents favour some types of data formats depending on their field of service, of which only a few are GDPR compliant according to the interpretation of the U.K. ICO. As for information systems enabling portability, the EU is moving on multiple fronts. First is the upcoming roll-out of the vertical European Data Spaces, with the Health one already at proposal stage. Additionally, the development of personal information management systems (‘PIMS’) is underway, which will allow users to be “holders” of personal information, to manage it in secure, local or online storage systems, and to share it at will. Reading from the EDPS’ TechDispatch 3/2020,

“PIMS can usually offer personal data and other metadata describing their properties in machine readable formats, as well as programming interfaces (APIs) for data access and processing. This last feature implies the use of standard policies and system protocols. This is an essential element, the lack thereof currently also represents a limit for PIMS adoption.”

Private projects such as Nextcloud, Solid, and MyData have made promising steps toward portability-enabling systems, but have not reached a level of technology readiness that allows for market acceptance and critical adoption. Unsurprisingly, there are open-source, industry-led initiatives, championed by the Data Transfer Project of Google, Meta, Twitter and Apple, which aim to bring to market products and services that address consumers’ requests for downloadable user data in structured, commonly used formats (Google Takeout, etc.), as well as for direct, seamless data portability from one service to another. These projects, however, may encounter legal obstacles in European competition law under art. 102 TFEU, but also political ones: the EC policy agenda on data governance altogether excludes the possibility that American big-tech players will unilaterally establish the de facto standard data formats and systems for data portability.

 

Conclusions

Under normal circumstances, and considering the results of the EC’s impact assessments, the interplay of the mentioned regulatory efforts should shape a digital market that benefits everyone. The EU has seemingly found a silver bullet that makes market players and consumers happy, both economically and in the respect of the fundamental rights to privacy, intellectual property, and fairness in the distribution of data value.

In reality, nobody seems interested in using such a silver bullet. Why is that?

After months of research, my educated guess is that the reasons are to be found in a mixture of the following:

  • From a market perspective, portability has potentially disruptive, market-wide economic effects mostly stacked against big market players. The EU has been extremely careful in (not) imposing rules and technologies for full harmonization, in the hope that multi-stakeholderism could find its way. To cite Alek Tarkowski from the Open Future Foundation: “no one tried hard to make it work, while others tried very hard not to make it work”.
  • From a law and economics perspective, although it is said that portability will benefit users and businesses, there have not been exhaustive and conclusive economic analyses providing evidence of benefits for big tech companies, nor for Small and Medium Enterprises.
  • From a regulatory perspective, the careful, delicate approach of participatory regulation and technological neutrality has been excessively open, creating uncertainties that have benefited the maintenance of the status quo, meaning the monopolistic control over data of big tech players.
  • From a technological perspective, there remains the need to develop information systems enabling data portability. The problem is that, in privacy engineering, such development starts with the identification of the requirements, both legal and technical, and in such a moment of regulatory turmoil these are hard to identify, let alone systematize, operate, and put into the market.

Early-Stage Researchers Present their Cutting-Edge Research at the International Conference on AI

 

Crete, Greece, June 2022 – On the 19th of June 2022, the LeADS consortium organised a workshop on best practices for the development of intelligent and trustworthy algorithms and systems as part of the 18th International Conference on Artificial Intelligence Applications and Innovations (AIAI) that took place in Crete. The Workshop consisted of two parts:

The first part featured two panel discussions on “Data Ownership, Privacy and Empowerment” and “Trustworthy Data Processing Design”. Leading scholars in their field, such as Paul de Hert from Vrije Universiteit Brussel and Giovanni Comandé from Scuola Superiore Sant’Anna, discussed pressing questions in the data economy. Is a potential data ownership right compatible with the existing data protection law? Which legal, ethical, and technological framework is needed to ensure a more trustworthy data economy?

Second, the 15 Early-Stage Researchers (ESRs) of the LeADS project presented their cutting-edge research during a session on “Best Practices for the Development of Intelligent and Trustworthy Algorithms and Systems” and discussed their research with the participants of the AIAI conference. The workshop came at a moment when the European Commission is proposing various pieces of legislation that should create a framework to reconcile diverging interests in data. Thus far, however, many questions have been left unanswered on how to find an appropriate regulatory framework. With its interdisciplinary approach, the LeADS project aims to find innovative solutions to these questions.

 

Further Information:

Blogpost

LeADS project at AIAI 2022

The LeADS project is organising a workshop, titled “Best Practices for the development of intelligent and trustworthy algorithms and systems”, at the 18th International Conference on Artificial Intelligence Applications and Innovations (AIAI). The workshop will take place on Sunday, 19th June in Crete, Greece, and is divided into three sessions: 1) Panel discussion on Data Ownership, Privacy and Empowerment 2) Panel discussion on Trustworthy Data Processing Design and 3) Poster Session.

Below you can find the detailed programme:

 

Legality Attentive Data Science

Workshop: Best Practices for the development of intelligent and trustworthy algorithms and systems

19/06/2022

 

10:30 – 11:30 AM

Panel 1: Data Ownership, Privacy, and Empowerment

Pointing to its limitations, legal uncertainties and issues with implementation, quite a few legal scholars argue that a new legal instrument in the form of data ownership is unnecessary. However, ownership and property are at the core of liberal political theories of the modern state and modern law, as these form the source of rights and liberty. And if data is a valuable resource, the forms and scales of its ownership should be discussed, not only in legal writing but also in public debate. Reflecting on the current regimes of data exchange and ownership structures related to data, this panel will discuss if and how a potential data ownership right can empower data subjects and right holders. Covering the scope and elements of a potential data ownership right, the panelists will guide us to take a closer look at the powers and limitations of such a right in relation to pervasive technologies such as AI and machine learning. Some questions the panel will explore: How would a potential data ownership right integrate with existing data protection law? Would it potentially empower individuals regarding access rights and ‘data portability’? Can we talk about collective ownership of data? If so, how can we justify it, dwelling on the political questions of property and dispossession?

Duration: 60 min (including 15 min debate)

Moderator: Imge Ozcan, LSTS, Vrije Universiteit Brussel

Panelists:

  • Katerina Demetzou, Future of Privacy Forum 
  • Paul De Hert, LSTS, Vrije Universiteit Brussel (remote participation) 
  • Afonso Ferreira, CNRS, Institut de Recherche en Informatique de Toulouse

 

11:30 AM – 12:00 PM Coffee Break

 

12:00 – 01:00 PM

Panel 2: Trustworthy Data Processing Design

Data are fuelling the economy. The borders between personal and non-personal data, and between sensitive and non-sensitive data, are fading away while the need for their secondary uses is growing exponentially. The Panel focuses on these issues, starting from the legal, ethical and technological framework needed to design data processing that is trustworthy for all the players.

Duration: 60 min (including 15 min debate)

Moderator: Giovanni Comandé, Scuola Superiore Sant’Anna

Panelists: 

  • Jessica Eynard, Toulouse Capitole University 
  • Elias Pimenidis, University of the West of England 
  • Gabriele Lenzini, University of Luxembourg
  • Salvatore Rinzivillo, Italian National Research Council

 

01:00 – 2:15 PM

 

Poster Session: Gallery Walk on “Best Practices for the development of intelligent and trustworthy algorithms and systems”.

 

In addition, the University of Piraeus, a LeADS beneficiary, in collaboration with the University of Sunderland, is co-organizing the second workshop on “Artificial Intelligence and Ethics” (AI & ETHICS – https://ifipaiai.org/2022/workshops/#aiethics), which will take place on Monday, 20th June.

We are looking forward to engaging with the interdisciplinary and diverse community at AIAI, sharing our research and having fruitful discussions surrounding our project.

Participation of Barbara Lazarotto at the SSN2022

Early Stage Researcher Barbara Lazarotto (ESR 7) presented her research at the 9th biennial Surveillance & Society conference of the Surveillance Studies Network (SSN), hosted by Erasmus University Rotterdam on June 1-3 2022 in Rotterdam, The Netherlands.

As a part of the panel named “Human Rights Europe and Global South” Barbara presented her topic of public-private data sharing under the lens of State Surveillance, analyzing the role of sensors in a Smart City context and how this data can be used to analyze and monitor entire populations.

To do that, Barbara first pointed out how broad the concept of “smart cities” is: it is used for several different purposes and political agendas, which might be concerning when it comes to public expectations of locational privacy. Subsequently, she presented three examples of the use of public-private data sharing in smart city contexts, namely the city of Kortrijk (BE), a bridge installed in Amsterdam (NL), and the city of Enschede (NL). These three examples demonstrated the extensive use of public-private partnerships that constantly track and monitor individuals’ behaviors and movements.

Finally, Barbara focused on the fundamental rights that are violated by these sensors, highlighting that the idea that sensor data are non-personal data might be misleading, since the number of sensors is becoming so high in some cities that it is in fact possible to single out an individual. Barbara thus pointed out that, when locational privacy is violated, other rights might also be violated, such as freedom of religion and the right to liberty and security. Barbara concluded her presentation by highlighting the necessity of increasing citizens’ participation in the decision-making process of smart cities’ public-private partnerships, the need for citizen digital literacy, and the need for further regulation of smart city sensors.

What do you mean a robot gets a say in this?

Predictive analytics, which can be understood as the application of artificial intelligence (machine learning and deep learning) based computer algorithms to predict the future behaviour of the subjects of a dataset from their past or current behaviour, is increasingly finding applications across different domains and sectors.

In the legal and civic fields, predictive analytics, previously known for carrying out subservient tasks such as providing case law insights, mapping contracts, and vetting legal provisions, is now being used to provide insights during legal trials and dispute resolution. These predictive jurisprudence software tools are applied to predicting the probability of recidivism, investigating evidence, and even predicting the possible resolution of a civil dispute or a criminal charge based on precedent and the legal infrastructure of the jurisdiction in which they operate.

In toto, predictive jurisprudence finds its application in three broad spheres, namely – (1) Predictive Justice: Criminal Sentencing, Settlement of Civil Disputes, Increasing Access to Justice; (2) Predictive Recidivism Software: Parole related decisions, Commutation of Criminal Sentences; (3) Legal Tools: Drafting tools, Contract Analysis Tools, and Legal Insight Tools.

The use and development of predictive jurisprudence can be traced all the way back to the Supreme Court Forecasting Project (SCFP), a combined study conducted by students at the University of Pennsylvania, Washington University, and the University of California Berkeley. It was a statistics-based legal project that aimed to predict the outcome of every case argued in front of the United States Supreme Court in the year 2002. The backbone of this project was a statistical model that predicted the decisions of the court with a degree of accuracy that even seasoned legal experts could not match. The statistical model predicted 75% of the court’s affirm/reverse results correctly, whereas the legal experts collectively predicted only 59.1% of the decisions correctly. Although the SCFP did not employ computers per se, it provided proof of concept for the idea that judicial decisions and the jurisprudence of courts are indeed parameters which can be readily predicted.

Since the SCFP, various companies and individuals have developed digital products that assist legal professionals by providing insights into legal materials or by predicting patterns in judicial pronouncements, broken down by judge and court, across a plethora of legal matters such as the settlement of insurance-related claims, small cause matters such as traffic violations, the granting of parole, and the commutation of sentences for convicts.

Many companies have well-established digital products operating in the legal sphere. One of these is CaseCrunch, a UK-based startup whose predictive jurisprudence application CaseCruncher Alpha showcased 86.6% accuracy in legal predictions, while the pool of 112 lawyers pitted against CaseCruncher Alpha had an overall accuracy of 62.3%.

Another company flourishing in the predictive jurisprudence sphere is Loom Analytics, a predictive analytics platform that features win/loss rates and judge ruling information, but only for civil cases in select Canadian provinces; it is, however, in the process of scaling up.

The established market powers in the predictive jurisprudence sphere, however, include Prédictice, a French company in the business of providing legal case analysis (except for criminal cases). Another French company operating in this sphere is Case Law Analytics, which also provides legal analysis and, much like Prédictice, does not analyse criminal cases.

Another major player, in the US market, is Lex Machina, which is owned by the global conglomerate LexisNexis and provides legal analysis (including of criminal cases) to legal professionals. Among its other services are insights into how a judge behaves in a specific case and a compendium of crucial insights regarding arguing styles and litigation experience, which allows users to formulate an appropriate litigation strategy. Lex Machina also provides analysis of a party before a specific judge, court or forum. Further, it provides outcome-based and timing-based analytics and helps analyse the motions submitted to the court, which helps professionals craft the appropriate motions to move the courts for specific causes.

Predictive jurisprudence is clearly a winner in terms of analysing not just volumes of data accurately but also identifying patterns in judicial behaviour which may not be visible to even the most seasoned experts.

The companies and private projects using predictive jurisprudence commercially point towards an inherent market for predictive jurisprudence tools, with many users relying on them not only to hone their professional skills and insights but also to provide increased access to justice across many jurisdictions. However, this brings us to our most important consideration yet: is it prudent to rely upon predictive jurisprudence software to carry out legal functions? And if so, what are the core tenets of designing and using such software?

The use of AI in this context depends on two specific considerations: the domain or sector in which it will operate and the characteristics of the tasks it will carry out. For example, AI-based software applied to the legal sector that carries out administrative tasks, such as the retrieval or organisation of files, can be considered low-risk AI, and its users therefore need not be put through a wide array of disclosures and notifications that they are interacting with an AI system. However, where the AI-based software carries out complex tasks such as legal deliberation, which would normally require a degree of expertise, the AI will be classified as high-risk AI, since any mistakes or shortcomings can have a direct impact on the life and liberty of an individual.
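The two-factor reasoning above can be sketched as a small lookup. This is only a toy illustration of the distinction drawn in this post: the task names, the `RiskTier` enum and the rule itself are assumptions made for the example and do not reproduce any legal classification scheme.

```python
from enum import Enum


class RiskTier(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical task categories for a legal-sector tool.
ADMINISTRATIVE_TASKS = {"file_retrieval", "file_organisation"}
DELIBERATIVE_TASKS = {"legal_deliberation"}


def classify(sector: str, task: str) -> RiskTier:
    """Toy rule combining the sector and the nature of the task."""
    if sector == "legal" and task in DELIBERATIVE_TASKS:
        # Mistakes here can directly affect a person's life and liberty.
        return RiskTier.HIGH
    if task in ADMINISTRATIVE_TASKS:
        return RiskTier.LOW
    # Unknown combinations default to the cautious tier (an assumption).
    return RiskTier.HIGH
```

For instance, `classify("legal", "file_retrieval")` yields `RiskTier.LOW`, while `classify("legal", "legal_deliberation")` yields `RiskTier.HIGH`.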

This brings us to our next crucial consideration: which core tenets should be kept in mind while designing predictive jurisprudence-based AI software? First and foremost, strict compliance with data protection laws takes centre stage in such software, making a wonderful case for incorporating privacy by design.

Secondly, all legal procedures across jurisdictions, whether civil law, common law, etc., share a common adherence to the principles of natural justice, namely: (1) adequate notice; (2) absence of bias; and (3) providing a reasoned order for all deliberations.

This brings us to an important component of all predictive jurisprudence-based AI applications: a degree of explainability. Explainable AI (XAI) has seen many developments in recent times, and a degree of explainability in a predictive jurisprudence application is crucial inasmuch as it allows natural persons to readily rely on it, since they understand the reasoning behind the computational results of the AI. The use of XAI as a core design tenet will also enable the predictive jurisprudence application to function independently in low-risk or moderate-risk tasks.

In their current form, since most predictive jurisprudence tools are far from perfect, they require human oversight and thus serve to augment the legal analysis of lawyers, judges, and other legal professionals.

The EU Agency for Fundamental Rights (FRA) published a report (The European Union Agency on Fundamental Rights, 2020) in 2020 under the directorship of Michael O’Flaherty titled “Getting the future right: AI and fundamental rights” (FRA Report).

The FRA Report mentions the requirement for adequate disclosure when AI-based predictive jurisprudence technologies are used. This gives persons an effective opportunity to complain about the use of AI and to challenge decisions arrived at on the basis of AI; such a grievance and complaints mechanism is crucial for upholding access to justice. The following must be communicated to persons using predictive jurisprudence-based tools in order to ensure access to justice:

  1. Making people aware that AI is used
  2. Making people aware of how and where to complain (A proper and designated grievance redressal mechanism)
  3. Making sure that the AI system and decisions based on AI can be explained (Use of XAI)

Many legal scholars have voiced concerns about the use of predictive jurisprudence by courts and legal officers, asserting that justice must be deliberated and not predicted, and hinting at the possibility of its users succumbing to automation bias. This concern is adequately addressed in the current scenario: until AI-based predictive jurisprudence software can explain the reasoning behind its computational results, it will operate only under the supervision of a natural person, who may use the results as one component to deliberate upon while arriving at their well-reasoned decisions, relying primarily on their own experience and expertise.

ESRs share their experiences disseminating their research at CPDP

As mentioned in our previous blog post, the LeADS Project organized a dissemination and public engagement activity at CPDP. ESRs Xengie Doan and Barbara Lazarotto presented their research using creative tactics. Here they share their experiences and impressions.

 

Xengie Doan

On Monday and Tuesday of CPDP I had the privilege of sharing information about the LeADS program at our booth and meeting really interesting people and potential partners. On Tuesday I dedicated 4+ hours to sharing results from my preprint, “Context, Prioritization, and Unexpectedness: Factors Influencing User Attitudes About Infographic and Comic Consent” by Xengie Doan, Annika Selzer, Arianna Rossi, Wilhemina Maria Botes, and Gabriele Lenzini, in an interactive show-and-tell demo for any interested parties who also had the time to spare in between many interesting panels. I felt nervous about sharing a lo-fi interactive demo while the Google booth had multiple screens and very hi-fi things to show, but people were interested and receptive to the idea! I think that there is an element of touching and interacting with things that we miss when going fully digital, even though you can touch screens. I was also grateful for the collective Twitter promotion I did in conjunction with Imge Ozcan and Barbara Lazarotto. I got great interactions and a good number of retweets and likes, and that helped me more confidently ask whether people interested in LeADS wanted to hear a bit more about my research.

First, I shared my research questions via a prompt to flip over a paper to read the hidden text, which was the first interactive moment. Research question 1 was about research participants’ prior experiences, and research question 2 was about trying to better understand research participants’ expectations, preferences, emotions, and engaging elements of the experience by comparing different mediums of consent.

Then, I shared a short version of the methods, highlighting the number (24 German adults) and type of participants and the scenario given to participants (a data trustee is asking a participant for consent to share their email and information with a clinical trial that may be relevant for them).

Last, I presented a short quiz session to disseminate some results. There were different questions that covered participant preferences for mediums (comics were ranked lowest based on un-seriousness for the medical context), reported time (1-5min was the most common option due to the medical consent vs. cookie consent), and elements that influenced their decisions (understanding, time and interest).

Overall, people who were interested in consent or different mediums of communication had a good time at the booth demo. I also got into interesting discussions about the issues with consent and received contacts who would be interested in reading the paper or hearing more. Personally, having to do a short interactive demo really forced me to share only the most important parts of the paper.

Barbara Lazarotto

On Monday I had the opportunity to disseminate my research and the LeADS Project at our booth at CPDP. It was a really interesting experience to develop an interactive research presentation, especially for academic purposes, since it is not the most common format. I talked about my research topic, “Public and Private Data Sharing: From dataveillance to data relevance”, and developed a presentation focusing on data sharing in smart cities. First, I asked participants to point out what words came to their minds when talking about dataveillance.

With that, we started to talk about how smart cities are the perfect opportunity to create a datafied society and how citizens are becoming disempowered. I proceeded to ask them to try to guess the market size of smart cities, to make them think about how this market is growing and the importance of discussing this subject. I added some examples of news articles that demonstrated how smart cities can become bad for citizens if data sharing processes are not designed with them in mind.

In the end, I shared the main ideas for my research. Many participants were really interested in the topic. Some expressed concern about surveillance cameras in cities, and others shared that they know many people have no idea about the surveillance apparatus in their cities; however, many of them don’t know how to engage citizens in this conversation due to digital illiteracy. I also got some great feedback and ideas for research, and made some great contacts with people from the field. Without a doubt, it was a very fruitful experience and I expect to create new interactive presentations in the future.

My car, whose data?

In February 2022 it was reported that the automotive company Tesla completed all requirements to join the German car insurance market. A new business model that provides consumers with premiums calculated from their actual driving behaviour is intended to facilitate their market entry. The huge amount of data created by connected cars would thus enable the manufacturer to join and compete in a completely different market.

At the same time, this example poses questions policy makers and academia have been debating for decades: should data be considered the property of a specific stakeholder? Should consumers have the right to “take their data” in order to conclude a contract with a different company? Are competitors entitled to access that data, or should Tesla have the prerogative to shield it from them in order to maintain its competitive advantage and protect potentially valuable information? Is it possible to create a competitive data economy where both the fundamental rights of individuals and the economic rights of companies are protected?

Property Rights in Data – A Viable Solution for the Data Economy?

Whether or not the introduction of new property rights in data would constitute an adequate means to provide for an equilibrium between the sometimes diverging interests of data holders and exploiters has been subject to a long-lasting scholarly debate on both sides of the Atlantic. Both national[1] and European[2] lawmakers considered the introduction of property rights in data as a possible solution to create a competitive data economy.

The idea was to link the protection of data with incentives of the market by using the laws of property as a control mechanism.[3] If data could only be transferred with the owner’s consent, individuals would be put in the position of valuing their privacy according to their personal preferences. Initial justifications for data as property were thus typically grounded in utilitarian theory that focuses on economic efficiency via mutually agreed exchanges. If the actions of individuals constitute a measure of what they want, overall utility should be enhanced.[4] Furthermore, the status quo devoid of well-defined property rights in data would result in a de-facto assignment of economic property rights to the information industry, eroding data subjects’ autonomy, privacy, and informational self-determination.[5] Property rights in data might thus simply be a necessity to protect and ultimately to empower data subjects.

Such a property rights approach, however, was confronted with severe criticism, in particular in Europe, where information privacy is seen as a non-commodifiable fundamental right. Critics argued that recognizing property rights in data might ultimately erode existing levels of privacy if individuals simply increasingly traded away their personal data. Information asymmetries and inherent uncertainties regarding the future uses of data would prevent individuals from correctly estimating the value of their data. Market solutions based on a property rights model would therefore not cure any of the problems related to control, but only legitimize them.[6] In several instances, such as genetic data, the same information could “belong” to several individuals. It would thus make little sense to grant individualized property rights in data if it can be used to make inferences about other data subjects.

Debates on data property culminated when the European Commission proposed the creation of a ‘data producer’s right’ in non-personal machine-generated data. Legislative intervention was meant to facilitate the creation of markets for data by protecting the economic interests of businesses and by creating legal certainty for data entitlements. Criticism of the creation of a new property right in data, however, prevailed. In the case of networked cars it would, for instance, already be too difficult to decide to whom a new right in data should be allocated: the producer of the software or data collection device, the manufacturer of the car, the owner of the car or the driver?[7] Adding a new layer of rights in data in addition to already existing intellectual property rights would result in multiple competing claims of ownership over the same content.[8] Such a “tragedy of the anticommons” would thus risk resulting in an underuse of data due to the high number and complex interrelations of overlapping property rights.[9]

Commodification without Propertization

A full-blown ownership approach to data now seems to have been abandoned by the European legislator. Neither the European Strategy for Data[10] nor two of its key actions, the Data Governance Act[11] and the Data Act[12], contain any references to data ownership or any new sui generis rights in data. Instead, the focus has shifted towards facilitating data access and data sharing – but without propertizing it. The Data Act, for instance, would grant consumers access rights to data generated from connected devices and it would enable them to share it with other companies that provide different services.

Despite the abandonment of a property approach to data, however, data protection authorities fear that the current proposals would push strongly towards the “commodification” of personal data.[13] Conflicts and diverging interests among different stakeholders are therefore not expected to disappear automatically. The connected car illustrates how far the interests of stakeholders, such as the owner of the car, the car manufacturer, insurance companies and even public authorities, can differ. If anything, conflicts between stakeholders defending their claims to data might intensify. In several instances, the economic interests of companies in protecting their intellectual property rights or their trade secrets might clash with the rights and interests of data subjects.

The rationale for the introduction of property rights in data, namely to protect and empower data subjects in light of de facto data ownership by technology companies, therefore still remains valid today and has even become more pressing. It has however been shown that a property right constitutes an inadequate means to achieve these goals. One of the many challenges of the data economy will thus be to find alternative solutions. Proposals for a more integrated policy approach in the form of a ‘data consumer law’ seem to offer promising ways of solving conflicts between data holders and exploiters and will be investigated further throughout the LeADS project.

 

[1] Cf. e.g. Press Release, Merkel A. (2017): Regulate Ownership of Data. https://www.bundesregierung.de/breg-de/aktuelles/merkel-eigentum-an-datenregeln-745810

[2] Building a European Data Economy COM (2017).

[3] Lawrence Lessig, ‘The Architecture of Privacy: Remaking Privacy in Cyberspace’ (1999) 1 Vanderbilt Journal of Entertainment & Technology Law 11.

[4] Julie E Cohen, ‘Examined Lives: Informational Privacy and the Subject as Object’ (2000) 52 Stanford Law Review 66.

[5] Nadezhda Purtova, ‘The Illusion of Personal Data as No One’s Property’ (2015) 7 Law, Innovation and Technology 83.

[6] Jessica Litman, ‘Information Privacy/Information Property’ (2000) 52 Stanford Law Review 31.

[7] Andreas Wiebe, ‘Protection of Industrial Data – a New Property Right for the Digital Economy?’ (2017) 12 Journal of Intellectual Property Law & Practice 62.

[8] P Bernt Hugenholtz, ‘Against “Data Property”’ in Hanns Ullrich, Peter Drahos and Gustavo Ghidini, Kritika: Essays on Intellectual Property (Edward Elgar Publishing 2018).

[9] Ivan Stepanov, ‘Introducing a Property Right over Data in the EU: The Data Producer’s Right – an Evaluation’ (2020) 34 International Review of Law, Computers & Technology 65.

[10] A European Strategy for Data COM(2020) 66 final.

[11] Proposal for a Data Governance Act COM/2020/767 final.

[12] Proposal for a Data Act COM (2022) 68 final.

[13] ‘EDPB-EDPS Joint Opinion 2/2022 on the Data Act Proposal’ (European Data Protection Board 2022).

LeADS Project at CPDP 2022

The annual Computers, Privacy and Data Protection conference (CPDP) will take place from 23 May to 25 May in Brussels, Belgium. CPDP is widely considered as the go-to conference for everyone whose work touches privacy and data protection. CPDP brings together a stimulating mix of stakeholders from all over the world to exchange ideas and discuss the emerging issues and the latest trends.

LeADS Project is organising a dissemination and public engagement activity at CPDP. Through interacting with the diverse community that CPDP brings to Brussels, we will spread the word about the ongoing research and activities taking place within the LeADS Project. Apart from informing the conference participants about LeADS Project, we will also organize a dissemination activity where ESRs Xengie Doan and Barbara Lazarotto will present their research using creative tactics. You can find us at the exhibit space in the main venue (Les Halles de Schaerbeek).

Some LeADS Project beneficiaries are also contributing to CPDP with panels and talks. Scuola Superiore Sant’Anna di Pisa is organising a panel titled “Justice 3.0: AI In and For Justice and Case Law as Big Data Challenges”, which is scheduled to take place on Wednesday, 25th May. Prof. Giovanni Comandé from the LeADS consortium will be moderating the panel. Another LeADS beneficiary, the University of Luxembourg’s Interdisciplinary Centre for Security, Reliability and Trust (SnT), is also organising a panel, titled “Privacy Design, Dark Patterns, and Speculative Data Futures”. The panel is scheduled to take place on Tuesday, 24th May. Dr. Arianna Rossi from the LeADS consortium will speak on the panel. Prof. Paul De Hert from Vrije Universiteit Brussel will deliver the opening remarks to the conference.

We are looking forward to engaging with the interdisciplinary and diverse community at CPDP, sharing our research and having fruitful discussions surrounding our project.

ESR Presentation during LeADS mid-term meeting

The Legal Perspective in LeADS – Training in Pisa, Italy

From 28th March to 9th April 2022, the 15 ESRs met again for two weeks in Pisa at Scuola Superiore Sant’Anna (SSSA) for their third training module of the LeADS project. Whereas the first training modules in November-December 2021 mainly involved the discussion of subjects related to computer and data science, this training module focussed on the legal perspective.

The cross-interdisciplinary training program for the 15 ESRs constitutes a fundamental part of the LeADS project. Fully understanding the challenges posed by the digital transformation and the data economy requires knowledge in different fields such as computer science, law, and economics. The LeADS training program is therefore structured around six modules that together aim at training a new generation of researchers as Legality Attentive Data Scientists, who become experts in both law and data science and are capable of working within and across the two disciplines.

Training Week 1: Digitalisation, the Law and Challenges Related to its Enforcement

The first week began with a research design course on how to assess the societal relevance of research questions, taught by Madeline Polmear from Vrije Universiteit Brussel (VUB). The following lectures consisted in particular of introductory courses to EU and international law for non-experts, to level the playing field between lawyers and computer scientists in the group. Fryderyk Zoll from Jagiellonian University (JU) introduced the researchers to EU anti-discrimination law and how it translates into legislation that prohibits, for instance, unjustified geo-blocking to prevent discrimination based on customers’ nationality or their place of residence. A more general introduction to international and European protection of human rights was given by Paul De Hert (VUB). In addition to an introduction to human rights theory and how human rights law has developed since the creation of the first international instruments, De Hert gave an outlook on possible new fundamental rights that might emerge in the digital era.

Gloria González Fuster (VUB) zoomed into the regulatory framework that surrounds a fundamental right that is of particular importance for the LeADS project: the right to data protection. In her engaging presentation she gave a short overview on the history of the right to data protection and presented both the key players (data subjects, data controllers & processors, data protection authorities) as well as the main data protection principles that lie at the heart of the protection regime of the GDPR. Vagelis Papakonstantinou (VUB) introduced the researchers to another important branch of law in the context of digitalisation: cyber security law. In his lectures he presented EU law and policy with a particular focus on legislative instruments such as the NIS directive or the EU Cybersecurity Act.

Besides various introductory lectures to different fields of law, the researchers had the opportunity to gain practical insights into the enforceability of technological regulation in the EU, thanks to a contribution by Antonio Buttà, chief economist at the Italian competition authority AGCM (Autorità Garante della Concorrenza e del Mercato). He explained how the competition authority had to set up a diverse team of computer scientists, data scientists, economists, and lawyers in order to address challenges and identify collusive and discriminatory practices in an environment characterized by the implementation of new technologies like machine learning algorithms.

The researchers furthermore had the opportunity to present and discuss the advancement of their work within the four crossroads of the LeADS project (1. Privacy vs Intellectual Property 2. Trust in Data processing & Algorithmic Design 3. Data Ownership 4. Empowering individuals), i.e. challenges that still need to be addressed in data-driven societies. Since all ESRs are placed in beneficiary institutions in six different countries, the meeting in Pisa constituted a valuable opportunity to discuss and give meaningful feedback in person on each other’s research. The first training week ended with the mid-term meeting where the general advancement of the LeADS project was discussed together with the project officer from the European Commission and where the researchers had the opportunity to present themselves and their individual research.

Training Week 2: The Law and Challenges Posed by New Technologies

During the second training week Giovanni Comandé (SSSA) ensured that the perspective would not only remain interdisciplinary but also cross-cultural, by introducing the researchers to American and Chinese approaches to privacy. The international perspective was maintained in the class by Maria Gagliardi (SSSA) who presented the complex rules of the GDPR that apply to international transfers of personal data.

Furthermore, Giovanni Comandé explained how new computing capabilities and processing technologies transformed us into a networked, data-driven, classifying society, where data is increasingly being exploited and used, for instance, to make predictions about individuals’ future actions and to increasingly employ nudging techniques that impact our decision-making behaviour. Questions that will have to be dealt with would therefore concern the frontier between techniques that merely support the decision-making process and techniques that coerce somebody into taking a specific decision. Arianna Rossi and Marietjie Botes from University of Luxembourg (LU) further elaborated on this topic by introducing dark patterns and the ethics of online manipulative behaviour.

Gianclaudio Malgieri (VUB) elucidated how big data and AI technologies have created new risks for data subjects since discrimination would be possible based on new “affinity profiling” characteristics. The AI ecosystem would thus lead to layers of vulnerability based on specific contexts.

Caterina Sganga (SSSA) presented in her lecture how AI technology challenges the regulatory framework of intellectual property: would it be possible to conceive of a machine as an inventor or author, and what happens if AI agents infringe IP rights?

Finally, one of the latest legislative proposals, the AI Act, was presented in detail by Giovanni Comandé and enabled discussions on how EU policy attempts to respond to challenges posed by new technologies.

The next training module in June 2022 will take the ESRs to Greece where, in addition to courses in data management and analytics, the ESRs will have the possibility to present and discuss in detail the findings of their research which they have gathered within the first months of their participation in the LeADS project.

PREDICTIVE JUSTICE ON TG2 NEWS RAI

LiderLab and EMbeDS of Scuola Superiore Sant’Anna showed the experimentation of an innovative platform for searching and interpreting legal documents at TG2 News Rai (National broadcasting).

The platform exploits the latest developments in Artificial Intelligence on legal texts to automatically recognize personal data to be anonymized, proposed facts, verified facts, the reasoning of the judge, legal rules used, and decisions made on a specific legal case.

The annotated data is made available to different types of users through a sophisticated search engine that allows them to easily identify and compare the most relevant parts of a trial.

This tool will provide decision support to judges, who can easily evaluate the reasoning and decisions of colleagues on similar cases.  Lawyers will be able to identify the determining factors in decisions, such as which legal rules have been applied and which facts have been established. Citizens will be able to autonomously analyze the jurisprudential consistency and the compliance with the principle of equality on a fully anonymized database, for example by identifying judgments with similar facts but different decisions.

The platform makes it possible to strengthen the trust of citizens in the judicial system. Also obvious are the spin-offs in terms of reduction of litigation and the possibility of agreed solutions between the parties involved.
