Posts

ESRs presenting their research at ACM FAccT

The collaborative work of ESRs Maciej Zuziak, Onntje Hinrichs and Aizhan Abdrassulova was accepted at this year’s ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT), held in Chicago from 12 to 15 June 2023. Maciej Zuziak presented their article “Data Collaboratives with the Use of Decentralised Learning – an Interdisciplinary Perspective on Data Governance” during a paper session dedicated to privacy.

In their collaboration, the ESRs combined their different research interests in law and machine learning, which enabled them to develop an interdisciplinary perspective on the challenges posed by different approaches to data governance. A predominant part of the current discussion in EU data policy centres on the challenging task of creating a data governance framework in which data is ‘as open as possible and as closed as necessary’. In their article, the ESRs elaborate on the concept of data collaboratives powered by decentralised learning techniques as a possible remedy to the shortcomings of existing data governance schemes.

Data collaboratives have been described as newly emerging forms of partnership in which privately held data is made accessible for analysis and collaboration between participants is facilitated in order to unlock the public-good potential of previously siloed data. [1] They thus fit well with the EU policy shift that has taken place over the past years. Whilst for decades the introduction of exclusive property rights was discussed as a potential tool to empower data subjects with regard to ‘their’ data and to facilitate the emergence of data markets, this changed with the 2020 European Data Strategy: the potential of the data economy is now to be unlocked by facilitating data access and sharing. The authors presented their concept of data collaboratives powered by decentralised learning techniques, which can be used as a tool to reach the goals of the current EU data strategy: facilitating access to data while protecting the privacy and intellectual property interests that individuals and companies might have with regard to ‘their’ data. The collaboration between the ESRs thus reflected the idea behind the LeADS project that solutions to existing tensions in the data economy require an interdisciplinary perspective from both law and data science.
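To make ‘decentralised learning’ more concrete, the sketch below illustrates federated averaging, one common decentralised learning technique in which participants train on their own data and share only model parameters with a coordinator, never the raw data. It is a minimal illustration on toy data under our own assumptions, not the method developed in the paper.

```python
# Minimal federated-averaging sketch (illustrative only, not the paper's method).
# Each participant fits a local linear model on its own data and shares only the
# resulting parameters; the coordinator averages them, so raw data never leaves its silo.
import numpy as np

rng = np.random.default_rng(0)

def local_update(X, y, global_w, lr=0.1, epochs=20):
    """One participant: a few gradient-descent steps for least squares, starting from the global model."""
    w = global_w.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Three participants in a hypothetical data collaborative, each with a private dataset
# drawn from the same underlying relation.
true_w = np.array([2.0, -1.0])
silos = []
for _ in range(3):
    X = rng.normal(size=(100, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=100)
    silos.append((X, y))

global_w = np.zeros(2)
for round_ in range(10):                      # communication rounds
    local_models = [local_update(X, y, global_w) for X, y in silos]
    global_w = np.mean(local_models, axis=0)  # federated averaging step

print("estimated parameters:", global_w)      # close to true_w without pooling any raw data
```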

The paper can be accessed in the ACM Digital Library via this link.

 

[1] 2020. Wanted: Data Stewards – (Re-) Defining the Roles and Responsibilities of Data Stewards for an Age of Data Collaboration

 

What do you mean a robot gets a say in this?

Predictive analytics, which can be understood as the application of computer algorithms based on artificial intelligence (machine learning and deep learning) to predict future activity from past or current behaviour captured in datasets, is increasingly finding applications across different domains and sectors.

In the legal and civic fields, predictive analytics, previously known for carrying out ancillary tasks such as providing case law insights, mapping contracts and vetting legal provisions, is now being used to provide insights during legal trials and dispute resolution. These predictive jurisprudence tools are applied to predict the recidivism probabilities of persons, to investigate evidence, and even to predict the possible resolution of a civil dispute or a criminal charge based on precedent and the legal infrastructure of the jurisdiction in which they operate.

In toto, predictive jurisprudence finds its application in three broad spheres, namely – (1) Predictive Justice: Criminal Sentencing, Settlement of Civil Disputes, Increasing Access to Justice; (2) Predictive Recidivism Software: Parole related decisions, Commutation of Criminal Sentences; (3) Legal Tools: Drafting tools, Contract Analysis Tools, and Legal Insight Tools.

The development of predictive jurisprudence can be traced back to the Supreme Court Forecasting Project (SCFP), a joint study conducted by researchers at the University of Pennsylvania, Washington University, and the University of California, Berkeley, which was a statistics-based legal project that aimed to predict the outcome of every case argued before the United States Supreme Court in 2002. The backbone of the project was a statistical model that predicted the court’s decisions with an accuracy even seasoned legal experts could not match: the model predicted 75% of the court’s affirm/reverse results correctly, whereas the legal experts collectively predicted only 59.1% of the decisions correctly. Although the SCFP did not employ computers per se, it provided proof of concept that judicial decisions and the jurisprudence of courts are indeed parameters that can be predicted.
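For a flavour of how such outcome prediction works in code, the sketch below trains a small decision-tree classifier on invented case features to predict an affirm/reverse outcome. The features, data and model are hypothetical illustrations of the general approach, not the SCFP’s actual model.

```python
# Illustrative sketch of case-outcome prediction with a decision tree
# (hypothetical features and synthetic data; not the SCFP's actual model).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n = 500

# Invented case features: circuit of origin, issue area, lower-court direction, petitioner type.
X = np.column_stack([
    rng.integers(1, 14, n),   # circuit of origin
    rng.integers(0, 10, n),   # issue area code
    rng.integers(0, 2, n),    # ideological direction of lower-court ruling
    rng.integers(0, 5, n),    # petitioner type code
])
# Toy label: "reverse" (1) mostly follows the lower-court direction, with some noise.
y = (X[:, 2] == 1).astype(int) ^ (rng.random(n) < 0.2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```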

Since the SCFP, various companies and individuals have developed digital products focused on assisting legal professionals by providing insights into legal materials or predicting the pattern of judicial pronouncements, tailored to particular judges and courts, across a plethora of legal matters such as the settlement of insurance-related claims, small-cause matters such as traffic violations, the granting of parole, and the commutation of sentences for convicts.

Many companies have well-established digital products operating in the legal sphere. One of these is CaseCrunch, a UK-based startup whose predictive jurisprudence application CaseCruncher Alpha achieved 86.6% accuracy in legal predictions, while a pool of 112 lawyers pitted against it had an overall accuracy of 62.3%.

Another company flourishing in the predictive jurisprudence sphere is Loom Analytics, a predictive analytics platform that features win/loss rates and judge ruling information, though only for civil cases in select Canadian provinces; the company is in the process of scaling up.

Among the established market players in the predictive jurisprudence sphere is Prédictice, a French company in the business of providing legal case analysis (except for criminal cases). Another French company operating in this sphere is Case Law Analytics, which also provides legal analysis but, much like Prédictice, does not analyse criminal cases.

Another major player in the US market is Lex Machina, owned by the global conglomerate LexisNexis, which provides legal analytics (including for criminal cases) to legal professionals. Its services include insights into how a judge behaves in particular kinds of cases and a compendium of information on arguing styles and litigation experience that allows users to formulate an appropriate litigation strategy. Lex Machina also analyses how a party has fared before a specific judge, court or forum, provides outcome-based and timing-based analytics, and helps analyse the motions submitted to a court, which in turn helps professionals craft the appropriate motions to move the courts for specific causes.

Predictive jurisprudence clearly excels not only at analysing large volumes of data accurately but also at identifying patterns in judicial behaviour that may not be visible even to the most seasoned experts.

The companies and private projects that use predictive jurisprudence commercially point towards an established market for such tools, with many users relying on them not only to hone their professional skills and insights but also to provide increased access to justice across many jurisdictions. However, this brings us to our most important consideration yet: is it prudent to rely upon predictive jurisprudence software to carry out legal functions? And if so, what are the core tenets of designing and using such software?

The use of AI in this context relies on two specific considerations: the domain or sector in which it will operate and the characteristics of the tasks it will carry out. For example, if AI-based software applied to the legal sector carries out administrative tasks such as the retrieval or organisation of files, it can be considered a low-risk AI, and its users need not be taken through a wide array of disclosures and notifications that they are interacting with an AI system. However, where the AI-based software carries out complex tasks such as legal deliberation, which would normally require a degree of expertise, it will be classified as a high-risk AI, since any mistakes or shortcomings can have a direct impact on the life and liberty of an individual.
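Purely as an illustration of this two-factor reasoning, the toy sketch below classifies a system’s risk level from its domain and task; the categories, sets and labels are our own simplification, not the wording of any regulation.

```python
# Toy illustration of deriving an AI system's risk level from its domain and task
# (our own simplification for illustration; not the text of the AI Act or any other instrument).
from dataclasses import dataclass

HIGH_IMPACT_DOMAINS = {"justice", "law_enforcement", "credit", "employment"}
COMPLEX_TASKS = {"legal_deliberation", "sentencing_support", "recidivism_prediction"}
ADMIN_TASKS = {"file_retrieval", "document_organisation", "contract_search"}

@dataclass
class AISystem:
    domain: str
    task: str

def risk_level(system: AISystem) -> str:
    """Classify risk from the two considerations discussed above: domain and task."""
    if system.domain in HIGH_IMPACT_DOMAINS and system.task in COMPLEX_TASKS:
        return "high"       # outcomes can directly affect life and liberty
    if system.task in ADMIN_TASKS:
        return "low"        # purely administrative support
    return "moderate"       # everything in between warrants closer review

print(risk_level(AISystem("justice", "file_retrieval")))        # low
print(risk_level(AISystem("justice", "legal_deliberation")))    # high
```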

This brings us to our next crucial consideration: what core tenets should be kept in mind while designing predictive jurisprudence-based AI software? First and foremost, strict compliance with data protection laws takes centre stage in such software, making a strong case for incorporating privacy by design.

Secondly, legal procedures across jurisdictions, whether in civil law or common law systems, share an abidance by the principles of natural justice, namely: (1) adequate notice; (2) absence of bias; and (3) a reasoned order for all deliberations.

This brings us to an important component of all predictive jurisprudence-based AI applications: a degree of explainability. Explainable AI (XAI) has seen many developments in recent times, and a degree of explainability in a predictive jurisprudence application is crucial inasmuch as it allows natural persons to rely on it readily, because they understand the reasoning behind the AI’s computational results. The use of XAI as a core design tenet will also enable a predictive jurisprudence application to function independently in low-risk or moderate-risk tasks.
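One common, model-agnostic way of providing such explanations is to report which input features drive a model’s predictions, for instance via permutation importance. The sketch below applies it to a toy case-outcome classifier on invented data; it is only one of many XAI techniques, not a prescription for how predictive jurisprudence tools must be built.

```python
# Minimal model-agnostic explanation via permutation importance (toy data, illustration only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
feature_names = ["circuit", "issue_area", "lower_court_direction", "petitioner_type"]
X = rng.integers(0, 10, size=(500, 4))
y = (X[:, 2] > 4).astype(int)   # synthetic rule: the outcome is driven by one feature

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much accuracy drops:
# large drops mark the features the model actually relies on.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for name, importance in sorted(zip(feature_names, result.importances_mean),
                               key=lambda t: -t[1]):
    print(f"{name}: {importance:.3f}")
```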

In their current form, since most predictive jurisprudence tools are far from perfect, they require human oversight and thus serve to accentuate the legal analysis of lawyers, judges, and other legal professionals.

The EU Agency for Fundamental Rights (FRA) published a report in 2020, under the directorship of Michael O’Flaherty, titled “Getting the future right: AI and fundamental rights” (the FRA Report).

The FRA Report mentions the requirement for adequate disclosure when AI-based predictive jurisprudence technologies are used. This gives persons a genuine opportunity to complain about the use of AI and to challenge decisions reached on its basis; such a grievance and complaints mechanism is crucial for upholding access to justice. In order to ensure access to justice, the following must be communicated to the persons concerned by predictive jurisprudence-based tools:

  1. Making people aware that AI is used
  2. Making people aware of how and where to complain (A proper and designated grievance redressal mechanism)
  3. Making sure that the AI system and decisions based on AI can be explained (Use of XAI)

Many legal scholars have voiced concerns about the use of predictive jurisprudence by courts and legal officers, asserting that justice must be deliberated and not predicted, and hinting at the possibility of its users succumbing to automation bias. This concern is adequately addressed in the current scenario: until AI-based predictive jurisprudence software can explain the reasoning behind its computational results, it will operate only under the supervision of a natural person, who may use the results as one component to deliberate upon while arriving at a well-reasoned decision, relying primarily on their own experience and expertise.

Algorithms are learning from our behaviour: How must we teach them

by Daniel Zingaro

Have you ever wondered why the online suggestions for videos, products, services or special offers you receive fit so perfectly with your preferences and interests? Why your social media feed shows only certain content but filters out the rest? Why you get certain results from an internet search on your smartphone but cannot get the same results from another device? Why a map application suggests a certain route over another? Or why you are always matched with cat lovers on dating apps?

Did you just click away, thinking that your phone mysteriously understands you? And although you may have wondered about this, you may never have found out why.

How these systems work to suggest specific content or courses of action is generally invisible. The inputs, outputs and processes of their algorithms are never disclosed to users, nor are they made public. Yet such automated systems increasingly inform many aspects of our lives: the online content we interact with, the people we connect with, the places we travel to, the jobs we apply for, the financial investments we make, and the love interests we pursue. As we experience a new realm of digital possibilities, our vulnerability to the influence of inscrutable algorithms increases.

Some of the decisions taken by algorithms may create seriously unfair outcomes that unjustifiably privilege certain groups over others. Because machine-learning algorithms learn from the data we feed them, they inevitably also learn the biases reflected in that data. For example, the algorithm that Amazon employed between 2014 and 2017 to automate the screening of job applicants reportedly penalised words such as ‘women’ (e.g., the names of women’s colleges) on applicants’ resumes. The recruiting tool learned patterns from the previous 10 years of candidates’ resumes and therefore learned that Amazon preferred men to women, as men had been hired more often as engineers and developers. This means that women were discriminated against purely on the basis of their gender with regard to obtaining employment at Amazon.
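A toy sketch of this mechanism (entirely synthetic data, not Amazon’s system) shows how a model reproduces historical bias even when the protected attribute itself is removed, as long as a correlated proxy feature remains:

```python
# Synthetic illustration of how a model reproduces bias present in historical decisions
# (invented data; not Amazon's system).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000

gender = rng.integers(0, 2, n)                  # 0 = man, 1 = woman (synthetic)
skill = rng.normal(size=n)                      # equally distributed across both groups
proxy = gender + rng.normal(scale=0.3, size=n)  # e.g. a resume keyword correlated with gender

# Biased historical labels: equally skilled women were hired less often.
hired = ((skill - 0.8 * gender + rng.normal(scale=0.5, size=n)) > 0).astype(int)

# Train WITHOUT the gender column, but with the correlated proxy feature still present.
X = np.column_stack([skill, proxy])
model = LogisticRegression().fit(X, hired)

scores = model.predict_proba(X)[:, 1]
print("mean predicted hiring score, men:  ", scores[gender == 0].mean().round(3))
print("mean predicted hiring score, women:", scores[gender == 1].mean().round(3))
# The gap persists: the model has learned the historical bias through the proxy.
```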

To avoid a world in which algorithms unconsciously guide us towards unfair or unreasonable choices because they are inherently biased or manipulated, we need to fully understand and appreciate the ways in which we teach these algorithms to function. A growing number of researchers and practitioners already engage in explainable AI, designing processes and methods that allow humans to understand and trust the results of machine learning algorithms. Legally, the General Data Protection Regulation (GDPR) spells out specific levels of fairness and transparency that must be adhered to when using personal data, especially when such data is used to make automated decisions about individuals. This imports the principle of accountability for the impact or consequences that automated decisions have on human lives. In a nutshell, this developing domain is called algorithmic transparency.
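As a simple illustration of what such transparency can look like in practice (a sketch under our own assumptions, not a GDPR compliance recipe), the contributions of an interpretable model can be turned into a plain-language explanation of a single automated decision:

```python
# Turning an interpretable model's contributions into a plain-language explanation
# of one automated decision (toy example with invented features; not a compliance recipe).
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["income", "existing_debt", "years_at_employer"]

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 3))
y = ((X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=1000)) > 0).astype(int)
model = LogisticRegression().fit(X, y)

def explain(applicant: np.ndarray) -> str:
    """List each feature's contribution to this applicant's score, largest first."""
    contributions = model.coef_[0] * applicant
    order = np.argsort(-np.abs(contributions))
    lines = [f"- {feature_names[i]}: {'raised' if contributions[i] > 0 else 'lowered'} "
             f"the score by {abs(contributions[i]):.2f}"
             for i in order]
    decision = "approved" if model.predict(applicant.reshape(1, -1))[0] else "refused"
    return f"Decision: {decision}\n" + "\n".join(lines)

print(explain(np.array([0.2, 1.5, -0.3])))
```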

However, there are many questions, concerns and uncertainties that need in-depth investigation. For example: 1) how can the complex statistical functioning of a machine learning algorithm be explained in a comprehensible way; 2) to what extent does transparency build, or hamper, trust; 3) to what extent is it fair to influence people’s choices through automated decision-making; 4) who is liable for unfair decisions; … and many more.

These questions need answers if we wish to teach algorithms well and allow the co-existence between humans and machines to be productive and ethical.

 

Authors:

Dr Arianna Rossi – Research Associate at the Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, LinkedIn: https://www.linkedin.com/in/arianna-rossi-aa321374/ , Twitter: @arionair89

Dr Marietjie Botes – Research Associate at the Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, LinkedIn:  https://www.linkedin.com/in/dr-marietjie-botes-71151b55/ , Twitter: @Dr_WM_Botes

The beginning of the LeADS era

On January 1st 2021, LeADS (Legality Attentive Data Scientists) started its journey. A consortium of 7 prominent European universities and research centres, along with 6 important industrial partners and 2 Supervisory Authorities, is exploring ways to create a new generation of LEgality Attentive Data Scientists while investigating the interplay between and across many sciences.

LeADS envisages a research and training programme that blends ground-breaking applied research with pragmatic problem-solving from the involved industries, regulators, and policy makers. The skills produced by LeADS and tested by the ESRs will make it possible to tackle the confusion created by the blurred borders between personal and commercial information and between personality and property rights, typical of the big data environment. Both processes constitute a silent revolution, developed through new digital business models, industrial standards, and customs, that is already embedded in soft law instruments (such as stakeholders’ agreements) and emerging in case law and legislation (Regulation EU 2016/679 and the e-privacy directive, to begin with), while data scientists are mostly unaware of them. They cut across the digital transformation and call for a more comprehensive and innovative regulatory framework. Against this background, LeADS is animated by the idea that in the digital economy data protection holds the keys both to protecting fundamental rights and to fostering the kind of competition that will sustain the growth and “completion” of the “Digital Single Market” and the competitive ability of European businesses outside the EU. Under LeADS, the General Data Protection Regulation (GDPR) and other EU rules will dictate the transnational standard for the global data economy, while training researchers able to drive the process and set an example.

The data economy, or rather the data society we increasingly live in, is our exploratory target from many angles (from the technological to the legal and ethical ones). This new generation is needed to better answer the challenges of the data economy and the unfolding of the digital transformation. Our Early Stage Researchers (ESRs) come from many experiences and backgrounds (law, computer science, economics, statistics, management, engineering, policy studies, mathematics, and more).

ESRs will find an enthusiastic transnational, interdisciplinary team of teams tackling the relevant issues from their many angles. Their research will be supported by these research teams in setting the theoretical framework and the practical implementation template of a common language.

The LeADS research plan, although it already envisages 15 specific topics to be investigated in an interdisciplinary way, remains open-ended.

This is natural in the fields we have selected, for which we have identified crossover concepts in need of a common understanding, useful for future researchers, policy makers, software developers, lawyers and market actors.

LeADS research strives to create and share cross-disciplinary languages and to integrate the respective background domain knowledge of its participants into one shared idiolect, which it wants to share with a wider audience.

It is LeADS’ understanding that regulatory issues in data science and AI development and deployment are often perceived as (and sometimes are) hurdles to innovation, markets and, above all, research. Our unwritten goal is to contribute to turning the regulatory and ethical constraints that are needed into opportunities for better developments.

LeADS aims at nurturing a data science capable of maintaining its innovative solutions within the borders of the law – by design and by default – and of helping expand the legal frontiers in line with innovation needs, preventing the enactment of legal rules that are technologically unattainable.

By Giovanni Comandé