ESRs Onntje Hinrichs and Barbara Lazarotto at Beyond Data Protection Conference

On 21-22 September 2023, the “Beyond Data Protection Conference: Regulating Information and Protection against Risks of the Digital Society” took place in Utrecht, the Netherlands. For two days, scholars from all over the world explored the challenges of data-centred legal protection against information-induced harms and considered alternatives beyond data protection. Keynotes were given by the well-known scholars Karen Yeung, Lokke Moerel, and Michael Veale. LeADS ESRs Onntje Hinrichs and Barbara Lazarotto were both present at the conference and presented their latest research.

ESR Onntje Hinrichs presented his paper “Consumer Law as Second Vantage Point for the Protection of Consumer Data – Protecting or Polluting the Privacy Ecosystem?” in a panel on ‘Beyond Data Protection‘. He elaborated on how the regulation of data, although anchored in the (fundamental rights-focused) European data protection framework, has increasingly become a concern for (market-focused) European consumer law. While authors have increasingly identified potential complementarities between the two fields of law for the protection of consumer data, he argued that they may at times pursue opposite objectives. By drawing parallels with the debates surrounding the uneasy relationship between consumer and environmental policy, Onntje showed how the regulation of consumer data under consumer law may contribute not only to the protection but also to the pollution of the privacy ecosystem. At the same time, he relied on this analogy to show how existing tensions between the two policy areas can be overcome.

Barbara presented her paper “The Smart Government Paradox: A Critical Reflection on EU Constitutional and Data Law Landscape in Light of Techno-Solutionism” in a panel on ‘Government Data and Surveillance’. She explored how business-to-government data-sharing practices can give governments ‘superpowers’ that threaten citizens’ rights, and analysed how the creation of multilayered data access rights in the public sector can give citizens tools to fight against smart-government abuses.

Mitisha Gaur at Digital Legal Talks and LawTomation Days

Mitisha Gaur, an ESR with the LeADS project, recently presented at Digital Legal Talks, which took place on 15 September in Utrecht, the Netherlands. Mitisha researches predictive justice applications deployed across courts and government bodies to augment and support decision-making practices, as well as how such predictive justice systems interact with the legal and regulatory ecosystem. Her presentation, titled “Predictive Justice and Human Oversight under the EU’s Proposed Artificial Intelligence Act”, focused on the provisions of the draft AI Act dealing with human oversight requirements. It analysed those requirements and detailed the material gaps in the human oversight strategy adopted by the draft AI Act. Finally, Mitisha shared a plan for ensuring human oversight across four primary stakeholders: (1) developers of AI systems; (2) deployers of AI systems; (3) users of the AI system; and (4) the impact population on whom the computational results of the AI system are applied.

Subsequently, on 29 September, Mitisha presented her work at the LawTomation Days 2023 conference organised by the IE Law School. The presentation, titled “Regulating Algorithmic Justice Applications under the EU’s Proposed Artificial Intelligence Act: A Critical Analysis”, took a panoramic view of the provisions of the draft AI Act applicable to high-risk AI systems, a classification that includes predictive justice systems. It discussed the various compliance requirements, specifically for predictive justice applications, and whether they are adequate to allow predictive justice systems to be developed and deployed to perform or augment functions on behalf of public authorities and judicial bodies. The core discussion revolved around the provisions on risk management systems, fundamental rights impact assessments, transparency and provision of information, human oversight, accuracy, robustness and cybersecurity, and finally the obligations of deployers of high-risk AI systems.

LeADS Working Paper Series Part VI: Data Collaboratives with the Use of Decentralised Learning – an Interdisciplinary Perspective on Data Governance

Data Collaboratives with the Use of Decentralised Learning – an Interdisciplinary Perspective on Data Governance

Maciej Zuziak (ESR 6), Onntje Hinrichs (ESR 13), and Aizhan Abdrasulova (ESR 12) are the authors of the paper “Data Collaboratives with the Use of Decentralised Learning – an Interdisciplinary Perspective on Data Governance”. The paper is a collaboration across three different LeADS Crossroads, i.e. Privacy vs Intellectual Property (Onntje), Data Ownership (Aizhan), and Empowering Individuals (Maciej), as well as across disciplines: whereas Onntje and Aizhan focus their research on questions related to data ownership and how conflicting interests in data can be resolved, Maciej examines various methods of distributed learning.

Their different research interests came together in this paper, which shows that an interdisciplinary perspective combining law and data science is necessary to tackle urgent questions in the data-dependent economy. Finding appropriate data governance solutions that achieve data policy objectives while remaining compliant with EU law requires an interdisciplinary understanding of how the two disciplines can complement each other. Their collaboration on this Working Paper thus perfectly exemplifies the LeADS approach. The paper was presented at this year’s ACM Conference on Fairness, Accountability, and Transparency (FAccT) in Chicago.

Abstract of the Working Paper

The endeavor to find appropriate data governance frameworks capable of reconciling conflicting interests in data has dramatically gained importance across disciplines and has been discussed among legal scholars, computer scientists, and policy-makers alike. The predominant part of the current discussion is centered around the challenging task of creating a data governance framework in which data is ‘as open as possible and as closed as necessary’. In this article, we elaborate on modern approaches to data governance and their limitations. We analyse how propositions evolved from property rights in data towards the creation of data access and data sharing obligations, and how the corresponding debates reflect the difficulty of developing approaches that reconcile seemingly opposite objectives – such as giving individuals and businesses more control over ‘their’ data while at the same time ensuring its availability to different stakeholders. Furthermore, we propose a wider acknowledgement of data collaboratives powered by decentralised learning techniques as a possible remedy to the shortcomings of current data governance schemes. Hence, we propose a mild formalization of the set of existing technological solutions that could inform existing approaches to data governance issues. By adopting an interdisciplinary perspective on data governance, this article highlights how innovative technological solutions can enhance control over data while ensuring its availability to other stakeholders, thereby contributing to the achievement of the policy goals of the European Strategy for Data.
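The paper itself is the authoritative source for the formalisation it proposes; purely as an illustration of the kind of decentralised learning technique that could power a data collaborative, the sketch below shows federated averaging on synthetic data, where participants share only model parameters and never their raw data. All names and figures are invented.

```python
# Minimal sketch (not the authors' implementation): federated averaging as one
# decentralised learning technique a data collaborative could rely on. Each
# participant trains locally on its private data; only model parameters are
# exchanged with the coordinator, never the raw records.
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One participant refines the shared linear model on its private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

# Three hypothetical participants with private datasets of different sizes.
true_w = np.array([2.0, -1.0])
participants = []
for n in (40, 60, 100):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    participants.append((X, y))

global_w = np.zeros(2)
for _ in range(20):
    # Each participant computes an update locally; the coordinator only sees
    # the resulting parameters, averaged with weights proportional to data size.
    local_models = [local_update(global_w, X, y) for X, y in participants]
    sizes = np.array([len(y) for _, y in participants], dtype=float)
    global_w = np.average(local_models, axis=0, weights=sizes)

print("aggregated model:", global_w)  # approaches [2, -1] without pooling raw data
```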

LeADS Working Paper Series Part V: From Data Governance by Design to Data Governance as a Service

From Data Governance by Design to Data Governance as a Service

The WOPA titled “From Data Governance by Design to Data Governance as a Service” is the result of collaborative work between three ESRs from the LeADS project, Armend Duzha (ESR 10) and Christos Magkos (ESR 11) from the University of Piraeus Research Centre (UPRC) and Louis Sahi (ESR 2) from University of Toulouse III (UT3), under the supervision of Dr Manolis Alexakis from UPRC and Dr Ali Mohamed Kandi from UT3. It was thus designed and conceptualized as a multidisciplinary work, spanning data processing and management as well as security and privacy preservation. This relates directly to the research topics of the ESRs involved, but most importantly to the study carried out as part of the Crossroads (ESRs 10 and 11 are part of Crossroad 4 “Empowering Individuals”, while ESR 2 is part of Crossroad 2 “Trust”). Specifically, during the second iteration of the Crossroads, a more in-depth analysis of the state of the art and an assessment of existing methods were conducted, first individually and then shared among the teams.

Abstract of the Working Paper

Nowadays, companies and governments collect data along every phase of a product or service life cycle. The data is acquired and stored in various and continuous ways, contributing to a large and unique data fingerprint for every product or service in use. Establishing policies, processes and procedures (P3) around data, and subsequently enacting them to compile and use such data for effective management and decision-making, is therefore extremely important. Data governance (DG) plays an essential role in a dynamic environment with multiple entities and actors, complex IT infrastructures, and heterogeneous administrative domains. Indeed, not only is it beneficial for existing products, services and processes, but it can also support appropriate adjustments during the design of new ones. This research provides an overview of the existing literature and the current state of the art in the domain of data processing and governance, aiming to evaluate existing approaches and investigate their limitations. To this end, the study introduces a novel approach for data governance as a service (DGaaS), which provides a framework for (private or public) organizations that facilitates alignment with their vision, goals and legal requirements. Finally, it discusses the potential implications of DGaaS in the smart city and healthcare sectors.
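As a purely hypothetical illustration of the “as a service” idea (not the framework proposed in the paper), the sketch below shows a governance layer that checks a processing request against machine-readable policy rules before data is used; every rule, field, and threshold is an invented placeholder.

```python
# Illustrative sketch only: a tiny governance check exposed as a reusable service
# function. Rules, metadata fields, and thresholds are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class DatasetMetadata:
    name: str
    contains_personal_data: bool
    retention_days: int
    allowed_purposes: tuple

POLICY = {
    "max_retention_days": 365,          # invented policy parameter
    "purpose_must_be_declared": True,
}

def check_request(meta: DatasetMetadata, purpose: str) -> list[str]:
    """Return the policy violations raised by a processing request, if any."""
    violations = []
    if meta.retention_days > POLICY["max_retention_days"]:
        violations.append("retention period exceeds policy maximum")
    if POLICY["purpose_must_be_declared"] and purpose not in meta.allowed_purposes:
        violations.append(f"purpose '{purpose}' not covered by the declared purposes")
    if meta.contains_personal_data and purpose == "marketing":
        violations.append("personal data may not be reused for marketing under this policy")
    return violations

meta = DatasetMetadata("smart-city-sensors", True, 400, ("traffic_management",))
print(check_request(meta, "marketing"))   # -> three violations under this toy policy
```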

LeADS Working Paper Series Part IV: Data Access And Re-Use In The European Legal Framework For Data, From The GDPR To The Proposed Data Act: The Case Of Vehicle Data

Data Access And Re-Use In The European Legal Framework For Data, From The GDPR To The Proposed Data Act: The Case Of Vehicle Data

In their working paper on “Data Access and Re-Use in the European Legal Framework for Data, from the GDPR to the Proposed Data Act: The Case of Vehicle Data”, ESRs Tommaso Crepax, Mitisha Gaur, and Barbara Lazarotto explore data access and portability for vehicles’ black boxes, the electronic data recorders that log a vehicle’s speed, braking, and other information in the event of a crash. The paper centers the analysis on three main stakeholders, consumers, the public sector, and insurance companies, each having a distinct interest in the data, and identifies a number of legal enablers and blockers that could affect data access and portability. The topic is closely related to the ESRs’ individual topics, since it covers data sharing, access, and portability between different public and private stakeholders, and how this data can be used to feed algorithms that generate driver profiles.

Furthermore, the paper can be considered a case study exploring the topics of Crossroad 1 “Privacy vs Intellectual Property” and Crossroad 3 “Data Ownership”, analyzing how black box data can be accessed under privacy regulations and how individuals can take more control of their vehicle data. The tension of Crossroad 1 becomes evident in the debate over whether individuals have the right to access their own black box data and whether companies have the right to use that data for commercial purposes. Questions related to data ownership (Crossroad 3) are particularly important because black box data can be used to create driver profiles, which could serve a variety of purposes, such as assessing driver risk and providing personalized safety recommendations.

The paper concludes by providing valuable insights and recommendations for enhancing data access and reuse while improving transparency and accountability in the process. The authors present a methodology for comparing and evaluating the degree of access conferred by various regulations and put it to practical use to assess how much data is currently left out from access by existing legislation, how much of such data is covered by the Data Act, and ultimately, how much still remains inaccessible for reuse. The proposed framework can deliver on the promises of access and reuse, but several research areas remain open for further discussion, including competition, data governance and markets, and data quality.

Abstract of the Working Paper

This article delves into the difficulties and opportunities associated with the acquisition, sharing, and re-purposing of vehicle data, particularly information derived from black boxes used by insurance companies and event data recorders installed by manufacturers. While this data is usually utilized by insurers and car makers, the authors contend that consumers, rival firms, and public institutions may also profit from accessing the data for objectives such as data portability between insurance companies, traffic and transportation management, and the development of intelligent mobility solutions. Among other regulations, the authors examine the proposed Data Act as Europe’s chosen instrument to address the legal and technical hurdles surrounding the reuse of privately held corporate data, including privacy and intellectual property, competition, and data interoperability issues. The text also offers an overview of the types of data obtained through vehicle recording systems and their potential benefits for various stakeholders. The authors present a methodology for comparing and evaluating, in an ordinal fashion, the degree of access conferred by various regulations and put it to practical use to assess how much data is currently left out from access by the existing legislation, how much of such data is covered by the Data Act, and ultimately, how much still remains inaccessible for reuse.
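As an illustration of how such an ordinal comparison could be operationalised (this is our own sketch, not the authors’ methodology or their actual scores), the snippet below scores hypothetical vehicle-data categories against two instruments on a small ordinal scale and reports what would remain inaccessible for reuse.

```python
# Illustrative sketch only: an ordinal scale of access levels applied to
# hypothetical data categories and instruments. Scores are invented placeholders,
# not the authors' assessment of the GDPR or the Data Act.
from enum import IntEnum

class Access(IntEnum):          # ordinal scale: higher value = broader access
    NONE = 0                    # no access right for the stakeholder
    ON_REQUEST = 1              # access only via request/consent mechanisms
    PORTABLE = 2                # structured, machine-readable portability
    REALTIME = 3                # continuous or in-vehicle real-time access

# Hypothetical scores for a consumer's access to vehicle data categories.
scores = {
    "speed_and_braking":    {"GDPR": Access.ON_REQUEST, "Data Act": Access.PORTABLE},
    "location_history":     {"GDPR": Access.PORTABLE,   "Data Act": Access.PORTABLE},
    "diagnostic_codes":     {"GDPR": Access.NONE,       "Data Act": Access.REALTIME},
    "derived_risk_profile": {"GDPR": Access.ON_REQUEST, "Data Act": Access.NONE},
}

def coverage(scores, instrument, threshold=Access.PORTABLE):
    """Share of categories for which an instrument grants at least `threshold`."""
    granted = [c for c, s in scores.items() if s.get(instrument, Access.NONE) >= threshold]
    return len(granted) / len(scores), granted

for instrument in ("GDPR", "Data Act"):
    ratio, cats = coverage(scores, instrument)
    print(f"{instrument}: {ratio:.0%} of categories reusable -> {cats}")

# Categories left inaccessible for reuse under every instrument considered.
leftover = [c for c, s in scores.items() if max(s.values()) < Access.PORTABLE]
print("still inaccessible for reuse:", leftover)
```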

LeADS Working Paper Series Part III: Transparency and Relevancy of Direct-To-Consumer Genetic Testing Privacy and Consent Policies in the EU

Transparency and Relevancy of Direct-To-Consumer Genetic Testing Privacy and Consent Policies in the EU

Xengie Doan (ESR 9) and Fatma Dogan (ESR 8) participated in this WOPA. Xengie is working on collective dynamic consent for genetic data and was interested in exploring the WOPA topic to better understand the current state of publicly available information from popular direct-to-consumer genetic testing companies. Of the information given, how transparent are the data processing activities and the communication about risks and benefits (including collective implications, e.g. risks and benefits that also affect family members), and is the information framed in a way that enables potential customers to know their rights? These rights are granted by the company policies and by EU regulations such as the GDPR. While these companies may be global or serve multiple countries, they must respect EU regulations with regard to EU countries and residents. This coincides with Fatma’s legal expertise and interest in health data sharing in the EU. The WOPA is related to the LeADS Crossroads, inspired by concepts such as trust and transparency, user empowerment, and more, although it is not directly related to any previous work on the Crossroads’ SOTAs. The work contributes to a better understanding of how such companies operate and what information they deem important to share (for legal and customer empowerment reasons), and it offers suggestions for more user-centred, collective, and transparent policies.

Abstract of the Working Paper

The direct-to-consumer (DTC) genetic testing market in Europe is expected to grow to more than 2.7 billion USD by 2032. Though the service offers ancestry and wellness information from one’s own home, it comes with privacy issues such as the non-transparent sharing of highly sensitive data with third parties. While the GDPR states transparency requirements, in practice they may be confusing to follow and fail to uphold the goals of transparency – for individuals to understand their data processing and exercise their rights in a user-centered manner. Thus, we examined six large DTC genetic testing companies’ public privacy and consent policies and identified information flows using a contextual integrity approach to answer our research questions: 1) How vague, confusing, or complete are information flows?; 2) How aligned with GDPR transparency requirements are existing information flows?; 3) How relevant is the information to users?; 4) What risk/benefit information is available? The study identified 59 public information flows regarding genetic data and found that 69% were vague and 37% were confusing regarding transfers of genetic data; consequently, GDPR transparency requirements may not be met. Additionally, companies lack public user-relevant information, such as the shared risks of sharing genetic data. We then discuss user-centered and contextual privacy suggestions to enhance the transparency of public privacy and consent policies and suggest the use of such a contextual integrity analysis as a governance practice to assess internal practices.
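To make the contextual-integrity framing concrete, the sketch below shows one hypothetical way to encode policy statements as information-flow tuples (sender, recipient, subject, information type, transmission principle) and to flag vaguely specified parameters; the flows and the list of vague terms are invented and do not reproduce the authors’ coding scheme.

```python
# Minimal sketch (our illustration, not the authors' coding scheme): representing
# policy statements as contextual-integrity information flows and flagging vague
# parameters. The flows and the vague-term list below are invented examples.
from dataclasses import dataclass, asdict

VAGUE_TERMS = {"third parties", "partners", "may", "certain purposes", "affiliates"}

@dataclass
class InformationFlow:
    sender: str
    recipient: str
    subject: str
    information_type: str
    transmission_principle: str   # the condition under which the transfer occurs

    def vague_parameters(self):
        """Return the parameters whose wording matches a known vague term."""
        return [field for field, value in asdict(self).items()
                if any(term in value.lower() for term in VAGUE_TERMS)]

flows = [
    InformationFlow("DTC company", "an accredited research institute named in the consent form",
                    "customer", "de-identified genetic data",
                    "only with the customer's explicit consent"),
    InformationFlow("DTC company", "third parties",
                    "customer", "genetic data",
                    "may be shared for certain purposes"),
]

for f in flows:
    issues = f.vague_parameters()
    status = "vague: " + ", ".join(issues) if issues else "specific"
    print(f"{f.sender} -> {f.recipient}: {status}")
```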

LeADS Working Paper Series Part II: Contribution to Data Minimization for Personal Data and Trade Secrets

Contribution to Data Minimization for Personal Data and Trade Secrets


To address the tensions between personal data protection and trade secrets, ESR 1 Qifan Yang and ESR 4 Cristian Lepore participated in WOPA 4, “Personal Data vs. Trade Secret.” Qifan’s research explores legal and economic solutions to balance personal data protection and market competition; Cristian’s research analyses technologies for data minimization. The link between the two is personal data. In the collaborative work, titled “Personal Data vs. Trade Secret,” we investigate a new legal and technical tool to sketch the boundaries between personal data and trade secrets and propose a GDPR-compliant model to protect personal data and trade secrets with legal and technical data minimization considerations. The aim is to enhance privacy protection and effective competition in the marketplace. To achieve this objective, ESR 1 analyses legal tools to draw clearer boundaries between personal data and trade secrets while promoting data sharing to unleash competitive dynamics, while ESR 4 explores frameworks to achieve data minimization by default, considering both technical and legal aspects.
The work “Contribution to Data Minimization for Personal Data and Trade Secrets” is closely related to Crossroad 1, “Privacy vs. Intellectual Property,” and Crossroad 3, “Data Ownership.” Data minimization is an important principle to be observed by personal data controllers and processors and plays a pivotal role in determining the scale of data collection, processing, storage, and availability. The principle prompts the Trade Secrets Directive to be applied more rationally while respecting data subjects’ rights, so that personal data can flow more freely in line with the individual’s intentions, facilitating improvements of products or services and promoting market competition. With the aim of strengthening the European digital single market, the EU Commission has designed a cross-border interoperability framework with privacy objectives in mind, in which data minimization plays a crucial role. ESR 4’s research studies the European Commission’s work on cross-border interoperability through emerging standards from academia and industry (e.g., the W3C Verifiable Credentials model). Compared to other platforms, the data exchange format designed by the W3C for verifiable credentials offers a significant advantage, opening the way to a better implementation of data minimization. While the term “IP” is not explicitly mentioned, the researchers’ work on data minimization and its impact on the classification of personal data as a trade secret aligns with the broader topic of “personal data vs IP”: they seek to address one of the important tensions between personal data protection and market competition, which is critical considering the legal and ethical implications of a data-driven society.

Abstract of the Working Paper

Personal data is a resource with significant potential economic value and has a natural and intrinsic “bloodline” tied to personal privacy. Control over personal data, at the intersection of personal privacy and commercial assets, shapes market competition towards slight advantages for market dominants. Data minimization under the General Data Protection Regulation, as a general benchmark, builds bridges between personal data and trade secrets and restrains excessive encroachments on personal data through trade secrets. Still, the legal and technical framework for achieving this target is left open. The contribution of this work is threefold. (1) It explores the intersection and relationship between personal data and trade secrets, specifically whether personal data can be considered a trade secret beyond personal data protection and minimization. (2) A high-quality GDPR-compliant teaching model is proposed to protect personal data and trade secrets with legal and technical data minimization considerations. (3) It presents the parallelisms between the European framework for electronic transactions and the introduced model. In the long run, we aim to provide citizens with tools to understand and mitigate privacy issues.
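As a simplified illustration of the data-minimisation idea that verifiable-credential style selective disclosure enables (not the model proposed in the paper, and not a conformant W3C VC implementation), the sketch below derives only the claim a verifier needs, an “over 18” flag, instead of disclosing the underlying birth date.

```python
# Illustrative sketch of selective disclosure as data minimisation: the holder
# presents only the derived claim a verifier needs instead of the full attribute.
# The credential structure is simplified and unsigned; identifiers are invented.
from datetime import date

full_credential = {
    "type": ["VerifiableCredential", "IdentityCredential"],
    "issuer": "did:example:issuer",          # hypothetical identifier
    "credentialSubject": {
        "id": "did:example:holder",
        "name": "Alice Example",
        "birthDate": "1990-04-02",
        "address": "Example Street 1",
    },
}

def minimise_for_age_check(credential, today=None):
    """Derive the minimal presentation needed for an age-gated service."""
    today = today or date.today()
    birth = date.fromisoformat(credential["credentialSubject"]["birthDate"])
    age = today.year - birth.year - ((today.month, today.day) < (birth.month, birth.day))
    return {
        "type": ["VerifiablePresentation"],
        "credentialSubject": {
            "id": credential["credentialSubject"]["id"],
            "over18": age >= 18,             # derived claim; the birth date is not disclosed
        },
    }

print(minimise_for_age_check(full_credential))
```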

LeADS Working Paper Series Part I: The Flawed Foundations of Fair Machine Learning

The Flawed Foundations of Fair Machine Learning

Robert Lee Poe (ESR 14) and Soumia Zohra El Mestari (ESR 15) authored “Borders Between Unfair Commercial Practices and Discrimination in Using Data.” Having initially investigated algorithmic fairness and discrimination in their Crossroad “Trust in Data Processing and Algorithmic Design,” Robert and Soumia narrowed the WOPA subject matter to an in-depth analysis of particular fair machine learning strategies used in practice to purportedly ensure non-discrimination and fairness in automated decision-making systems. The intersection of algorithmic unfairness and non-discrimination law is the focal point of Robert’s Ph.D. research, specifically the legality of using fair machine learning techniques in automated decisions (hiring, admissions, loan decisions, etc.) from both a European Union and a United States legal perspective. Soumia’s Ph.D. research focuses on implementing privacy-preserving techniques as constraints to achieve trustworthy processing in complex machine learning pipelines; she also investigates the gap between data protection legislation and trustworthy machine learning implementations, and how the different components of trustworthiness, such as privacy, robustness, and fairness, interact. Studying the dynamics of these interactions offers a better understanding of how a trustworthy machine learning pipeline should be implemented, exposed as a service, and interpreted under the different legal instruments. The WOPA focuses on one such interaction, between robustness (measured as accuracy) and fairness (measured as group similarity), and on how focusing on one of the two components affects the other under different data distributions. The main contribution of the WOPA is the clarity provided by the conceptual and empirical understanding of the trade-off between statistically accurate (robust) outcomes and group-similar (fair) outcomes. While that distinction is not a legal one, it has many implications for non-discrimination law, and further research in that direction is needed, with specific suggestions given in the conclusion of the article.

Abstract of the Working Paper

The definition and implementation of fairness in automated decisions have been extensively studied by the research community. Yet fallacious reasoning, misleading assertions, and questionable practices hide at the foundations of the current fair machine learning paradigm. Those flaws are the result of a failure to understand that the trade-off between statistically accurate outcomes and group-similar outcomes exists as an independent, external constraint rather than as a subjective manifestation, as has been commonly argued. First, we explain that there is only one conception of fairness present in the fair machine learning literature: group similarity of outcomes based on a sensitive attribute, where the similarity benefits an underprivileged group. Second, we show that there is, in fact, a trade-off between statistically accurate outcomes and group-similar outcomes in any data set where group disparities exist, and that the trade-off presents an existential threat to the equitable, fair machine learning approach. Third, we introduce a proof-of-concept evaluation to aid researchers and designers in understanding the relationship between statistically accurate outcomes and group-similar outcomes. Finally, we provide suggestions for future work aimed at data scientists, legal scholars, and data ethicists that utilize the conceptual and experimental framework described throughout this article.
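The authors’ proof of concept is described in the paper itself; the toy sketch below merely illustrates the trade-off on synthetic data with a group disparity: a single decision threshold maximises statistical accuracy but leaves a gap in positive rates between groups, while per-group thresholds tuned for group-similar outcomes shrink that gap at the cost of accuracy. All numbers are illustrative.

```python
# Minimal numerical sketch (not the authors' proof of concept): when the groups'
# base rates differ, enforcing group-similar positive rates via per-group
# thresholds reduces the parity gap but lowers overall accuracy.
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
group = rng.integers(0, 2, size=n)                    # sensitive attribute
base_rate = np.where(group == 1, 0.6, 0.3)            # group disparity in outcomes
label = rng.random(n) < base_rate                     # "correct" outcome
score = 0.5 * label + rng.normal(0.0, 0.35, size=n)   # imperfect but informative score

def evaluate(threshold_g0, threshold_g1):
    """Accuracy and positive-rate gap for per-group decision thresholds."""
    threshold = np.where(group == 1, threshold_g1, threshold_g0)
    decision = score >= threshold
    accuracy = (decision == label).mean()
    gap = abs(decision[group == 1].mean() - decision[group == 0].mean())
    return accuracy, gap

# Statistically accurate policy: a single threshold for everyone.
print("single threshold : acc=%.3f, positive-rate gap=%.3f" % evaluate(0.25, 0.25))
# Group-similar policy: thresholds shifted so the groups' positive rates align.
print("group-similar    : acc=%.3f, positive-rate gap=%.3f" % evaluate(0.15, 0.35))
```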

LeADS Working Paper Series

This blog post inaugurates the LeADS Working Paper Series, a series of blog posts that will give our Early Stage Researchers (ESRs) the opportunity to present and contextualize the Working Papers they have been working on over the past months.

Working Papers (WOPAs) represent the state of the art in the fields of Privacy and Data Protection, Intellectual Property, Algorithm Regulation, Privacy Enhancing Technologies, and Data Processing Transparency. These papers were written by groups of Early Stage Researchers in an effort to reflect how the crossroad topics that still need to be addressed in data-driven societies cannot be viewed and fully grasped in isolation, but are instead fully interconnected.

The Working Papers are also a very important landmark for the LeADS project, since they constitute its first public deliverable. For this “LeADS Working Paper Series,” each WOPA team wrote an introductory text to contextualise their work with regard to the LeADS project, followed by the abstract of their respective paper. The following six WOPAs were written and will each have a dedicated blog post.

  • “The Flawed Foundations of Fair Machine Learning” – Robert Lee Poe and Soumia Zohra El Mestari
  • “Contribution to data minimization for personal data and trade secrets” – Qifan Yang and Cristian Lepore
  • “Transparency and Relevancy of Direct-To-Consumer Genetic Testing Privacy and Consent Policies in the EU” – Xengie Doan and Fatma Dogan
  • “Data Access And Re-Use In The European Legal Framework For Data, From The GDPR To The Proposed Data Act: The Case Of Vehicle Data” – Tommaso Crepax, Mitisha Gaur and Barbara Lazarotto
  • “From Data Governance by Design to Data Governance as a Service” – Armend Duzha, Christos Magkos, and Louis Sahi
  • “Data Collaboratives with the Use of Decentralised Learning – an Interdisciplinary Perspective on Data Governance” – Maciej Zuziak, Onntje Hinrichs and Aizhan Abdrasulova

Stay tuned for the first blog post in a few days!

ESR Barbara Lazarotto facilitates workshop at FARI

On July 13, ESR Barbara Lazarotto participated as an expert facilitator in a workshop at FARI – the AI for the Common Good Institute, which aims to enable, promote and perform cross-disciplinary research on Artificial Intelligence in Brussels and is led by LeADS beneficiary Vrije Universiteit Brussel (VUB) together with the Université libre de Bruxelles (ULB).

The workshop lasted the whole afternoon, and Barbara joined a group facilitating the discussions on the use of Artificial Intelligence to enforce the right to repair products within the concept of Repair Cafés. Together with participants from civil society and industry, the group sought solutions to the obstacles faced by repair cafés, such as the lack of tools, spare parts, and information on how to repair a wide range of objects.

Other topics, such as Intelligent Parts Identification, Predictive Maintenance, Virtual Repair Assistance, and Knowledge Sharing Platforms, were discussed as well, with Barbara also facilitating those discussions. The main findings of each group were presented in a plenary session at the end.

Barbara greatly appreciated the opportunity to join the workshop, as it gave her the chance to apply concepts of data quality, data portability, and trade secrets in practice.