1. Introduction

Machines are taking over the work of people everywhere. In the not too distant past, spell-checking and search engines were regarded by many as ‘intelligent’ information technology. Today, facial recognition routinely checks travelers at our airports. Google Maps gives me unsolicited advice about my destination: “the restaurant may be closed”. My tablet and my mobile phone answer my spoken questions with friendly spoken replies. The newspapers talk about ‘robot’ justice. There are claims that algorithms can accurately predict court decisions, and that we won’t need human judges anymore. We enjoy talking about things that don’t exist yet, and fantasize about how they will make our lives easier. But what do we already know, actually, about the use of artificial intelligence?

This article explores the potential and risks of artificial intelligence in the courts based on the current state of knowledge.

The main questions of this article are: (1) how can artificial intelligence be useful for courts and judges, and (2) what is needed to make artificial intelligence useful? Court cases don’t always require a complex, custom-made approach to decision-making, and many cases can be processed automatically, at least in part. That is why the application of information technology, including artificial intelligence, is not the same for all case types. Which form of artificial intelligence has already proven itself for these different processes? How can courts and judges work with artificial intelligence according to standards of fair procedure, for instance Article 6 of the European Convention on Human Rights? Are there risks to the standards of Article 6 when using artificial intelligence? With such questions in mind, the Council of Europe has developed the Ethics Guidelines1 for the use of AI in the administration of justice. And how can legal information be made more usable for artificial intelligence?

2. Artificial Intelligence

AI can be described as “allowing a machine to behave in such a way that it would be called intelligent if a human being behaved in such a way”. This is the definition that John McCarthy, who is credited with coining the term “Artificial Intelligence”, gave in 1956.2 It is important to establish this, because it makes human intelligence the measure of what AI does. Intelligence is the ability to reason abstractly, logically and consistently; to discover, establish and see through correlations; to solve problems; to discover rules in seemingly disordered material using existing knowledge; to solve new tasks; to adapt flexibly to new situations; and to learn independently, without the need for direct and complete instruction. What does this mean for AI?

AI, in order to work, needs ‘big data’. Luc Julia, one of the creators of the digital assistant Siri, offers this image: ‘if a machine is to be able to recognize a cat with 95% certainty, we need about 100,000 pictures of cats.’3 We have collected a lot of data in the meantime, which is why AI has recently attracted so much interest. AI comes in many different forms, such as speech recognition and image recognition. This article primarily discusses machine learning and natural language processing. Deep learning, in which the technology itself learns, is still a subject for the future.

What do we know about AI, and especially about machine learning, in the administration of justice?

3. Courts and information technology

Administering justice means delivering justice in individual cases, and the judiciary also has a shadow function in presenting standards to society more broadly. But regardless of the subject matter, the work of courts and judges is to process information: parties bring information to the court, transformations take place in the course of the procedure, and the outcome is also information. Not all of this information processing is complex customization. Default judgments and statements of inadmissibility are often routinely produced; many cases require a simple assessment without a hearing, and some cases are settled. Only a limited proportion of the cases that the judiciary has to deal with are complex, contested cases.4 It cannot be stressed enough that the process, and hence the need for information technology, is not the same for all cases.

In administrative and civil cases (including subdistrict/local/small claims court cases), the way in which cases are handled depends mainly on (a) the complexity of the information in a case and (b) the degree of predictability of the outcome. A relatively large proportion of routine cases have a predictable outcome. In those cases, the court ruling is a document produced in a largely automatic process based on the data supplied. The judgment document provides a title for enforcement. Here, the court’s primary need is digital filing, in which the filing party provides the data digitally so that they do not have to be re-entered manually. Moreover, where the outcome is largely or entirely certain, case processing could be partly or even largely automated using AI.
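
As an illustration of such a largely automatic process, the Python sketch below drafts an order directly from digitally filed data, so that nothing is re-entered by hand. It is a minimal sketch under assumptions: the `Filing` fields, the wording of the order and the rule that only undefended claims count as routine are all hypothetical, and no AI is involved at this stage.

```python
from dataclasses import dataclass
from string import Template
from typing import Optional


@dataclass
class Filing:
    """Data supplied digitally by the filing party (hypothetical fields)."""
    claimant: str
    defendant: str
    amount_eur: float
    defended: bool  # did the defendant respond to the claim?


# Hypothetical wording; a real order would follow the court's own model texts.
ORDER_TEMPLATE = Template(
    "The court orders $defendant to pay $claimant the sum of EUR $amount."
)


def draft_order(filing: Filing) -> Optional[str]:
    """Draft an order for an undefended claim; leave anything else to a judge."""
    if filing.defended:
        return None  # not routine: needs judicial assessment
    return ORDER_TEMPLATE.substitute(
        defendant=filing.defendant,
        claimant=filing.claimant,
        amount=f"{filing.amount_eur:.2f}",
    )


routine = Filing("A Ltd", "B Ltd", 1250.0, defended=False)
contested = Filing("C Ltd", "D Ltd", 9800.0, defended=True)
print(draft_order(routine))
print(draft_order(contested))
```

Anything the rule does not recognise as routine is returned undrafted and left to a judge, which mirrors the point that the need for automation differs per case type.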

In family and employment matters, there is also a significant proportion of routine cases. Here, the judge, in a function similar to that of a civil-law notary, assesses a proposed arrangement of the parties for legal validity. This can be – in the Netherlands – a regular divorce, but also a parental authority provision, or the termination of an employment contract. Here too, the judgment is a largely automatically composed document, confirming that the proposed arrangement complies with the law. Here, too, digital filing and process automation are the primary information technology requirements. In addition, a smart filing portal can help the parties to bring their case to court in the best possible way.

An example from practice, the Solution Explorer of the Civil Resolution Tribunal in British Columbia, is discussed in section 4.

A settlement is regularly reached in cases that are less routine. Software exists that can help bring about a settlement by analyzing the parties’ points of view and presenting an optimum result based on the parties’ input. Only in cases that are not settled is the end product of the court proceedings a judgment in the strict sense of the term.

In the criminal justice system, routine cases are handled (at least in the Netherlands) by the Public Prosecution Service, and only those cases where a judgment is required are brought before a court. Here, too, there is a wide range of cases from the relatively simple to the extremely complex.

In all complex cases, in which the judge or the panel has to give a judgment in order to bring the case to a conclusion, the need for information technology mainly consists of knowledge systems that make legal sources easily accessible, and a digital case file that can present large amounts of information in an accessible manner.

Artificial intelligence is also information technology, and consequently AI can also have different uses for different cases.

4. What can artificial intelligence do for courts?

AI can be useful in many different ways to meet different requirements. Sales talk on AI for courts is abundant. It has been argued that ‘it would make it fairer, and moreover, unlike human judges, AI does not get tired and does not depend on its glucose levels to function.’5 That is mostly speculation. The discussion here focuses instead on what we already know from evidence: “proven technology”, AI that has already proven to be useful in practice. But are robots already able to judge? The jury is still out on this one.6

1. Organising information. Recognising patterns in text documents and files can be useful, for example when sorting large amounts of cases, or in complex cases that contain a lot of information. An example from the United States of America is ‘eDiscovery’, an automated investigation of electronic information for discovery, before the start of a court procedure. eDiscovery uses machine learning AI, which learns through training what the best algorithm is that is capable of extracting the relevant parts from a large amount of information. Parties agree which search terms and coding they use. The judge assesses and confirms the agreement. This is a method of document investigation recognised by the courts in the United States and the United Kingdom.7 The method is faster and more accurate than manual file research.
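
The training step can be pictured with a toy relevance ranker. The Python sketch below is not how any eDiscovery product works; it is a minimal illustration, assuming a handful of invented documents that reviewers have already coded as relevant or not, and a simple word-weighting scheme (smoothed log-odds) chosen for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case a text and split it into words."""
    return re.findall(r"[a-z]+", text.lower())


def train(coded_documents):
    """Learn one weight per word: the smoothed log-odds that the word
    occurs in documents coded relevant rather than not relevant."""
    relevant, irrelevant = Counter(), Counter()
    for text, is_relevant in coded_documents:
        (relevant if is_relevant else irrelevant).update(tokenize(text))
    vocabulary = set(relevant) | set(irrelevant)
    total_rel = sum(relevant.values()) + len(vocabulary)
    total_irr = sum(irrelevant.values()) + len(vocabulary)
    return {
        word: math.log((relevant[word] + 1) / total_rel)
        - math.log((irrelevant[word] + 1) / total_irr)
        for word in vocabulary
    }


def score(weights, text):
    """Sum the learned weights of the words in a document (unknown words count 0)."""
    return sum(weights.get(word, 0.0) for word in tokenize(text))


def rank(weights, documents):
    """Order unreviewed documents from most to least likely relevant."""
    return sorted(documents, key=lambda doc: score(weights, doc), reverse=True)


# Invented seed set, coded by hand as relevant (True) or not (False).
seed = [
    ("invoice for delivery of steel under the supply contract", True),
    ("contract amendment on delivery dates and penalties", True),
    ("lunch menu for the office party on friday", False),
    ("parking garage closed for maintenance on friday", False),
]
unreviewed = [
    "reminder: office party starts at noon",
    "penalties due under the supply contract",
]

weights = train(seed)
ranked = rank(weights, unreviewed)
print(ranked[0])
```

In practice the ranking would only prioritise documents for human review, and the agreed search terms and coding would determine what the seed set looks like.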

2. Advice. AI that is able to advise can be useful for people and potential parties to a court case who are looking for a solution to their problem, but do not yet know what they can do. Advisory AI can also be useful for legal professionals. Such AI not only looks for relevant information, but also provides an answer to a question. The user then decides for herself whether she will act on the advice. This advisory function can help people resolve more of their problems by themselves and thus prevent disputes or court cases. If the advice is not enough, support in finding a solution is also a possibility. Help in formulating a solution that requires judicial review, such as a request or a summons, can ensure that the judge’s assessment becomes more of a routine matter.

A proven practical example of this function is in use at the Civil Resolution Tribunal (CRT) in British Columbia, Canada.8 The CRT was established to deal with disputes relating to strata (condominium) property. When it proved successful, its jurisdiction was gradually extended, and in April 2019 personal injury resulting from collisions was added. The CRT offers the Solution Explorer, with free public legal information and calculation aids, available 24/7. It provides guided pathways, interactive questions and answers, and support for dispute resolution or preparation for proceedings at the CRT. Underneath is a purpose-built expert system that is updated every three months. This updating is still done by human experts, based on user feedback and analytical data about the system. So this is not yet “real” AI.

At the District Court of East Brabant in the Netherlands, in collaboration with Tilburg University, Eindhoven University of Technology and the Jheronimus Academy of Data Science (JADS), a study is ongoing into the possibilities of AI for traffic violation cases, in which a citizen appeals to the court within the framework of the administrative handling of traffic violations.9 The study aims to develop a tool to support judges in preparing and deciding such cases. The study uses data from the District Courts of East Brabant and Zeeland-West Brabant, and from the Arnhem-Leeuwarden Court of Appeal, which deals with appeals. This was – thinking back to the 100,000 cats – the only way to have sufficient data to work with. Evidently, this is still an experiment. Results are expected in the course of 2020.

3. Predictions. AI that claims to be able to predict court decisions attracts a lot of interest. The usual English/American term for this is “predictive justice”. This term has given rise to discussion, because the outcome of the prediction algorithms is neither justice nor predictive. The term “forecast” is a more accurate description, reflecting current debates. The outcome looks more like a weather forecast than like an established fact. Just like the weather, court proceedings risk having an unpredictable outcome. As the case becomes more complex with more information and more issues, that risk increases. This is one reason why there is so much interest in AI, because it claims to be able to reduce the risk. In the United States, various prediction tools are offered commercially. Their workings are therefore business secrets, so we do not know how they work. However, there are some non-commercial applications, and we do have some insight into their operation.

For example, a group of American academics has developed a machine learning application that claims to be able to predict the outcome of a case at the Supreme Court of the United States (SCOTUS) with an accuracy of 70.2%, and the voting behavior of individual justices with 71.9% accuracy.10 In addition to information about the case, this application uses information about the political preferences and past voting behavior of the individual justices.

The most extensively described application is one that claims to be able to predict decisions of the European Court of Human Rights (ECtHR).11 This tool uses natural language processing and machine learning to predict whether the Court will find, in a particular situation, that a particular provision of the European Convention on Human Rights (ECHR) has been violated. The tool works with information from earlier judgments. This AI claims 79% accuracy. The material this AI processes is already the result of many ‘complexity reduction’ steps. Most ECtHR cases are dealt with by the registry, by the Commission or by chambers with one or more judges. The investigators (Aletras et al.) only used judgments from HUDOC, the online database of the ECtHR, which does not include cases resulting from inadmissible requests. These requests were not used for their experiment, simply because they were not available. Importantly, the texts of the rulings were written to provide justification for the judgment.12 Aletras et al. note that their results indicate that the facts of a case, as presented by the Court, are the strongest indicator of the outcome of the case. They consider the tool a useful aid for judges because it can recognize patterns in a text document, and can thus quickly identify which direction a judgment could take.
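
To make “natural language processing and machine learning” on judgment texts more concrete, the Python sketch below counts word n-grams in short fact descriptions and trains a small linear classifier on them. It is a toy under stated assumptions: the fact descriptions and labels are invented, a perceptron is swapped in for simplicity, and nothing here reproduces the method or the data of Aletras et al.

```python
from collections import Counter


def ngrams(text, n_max=2):
    """Count word n-grams (here 1- and 2-grams) in a text."""
    words = text.lower().split()
    grams = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(words) - n + 1):
            grams[" ".join(words[i:i + n])] += 1
    return grams


def activation(weights, features):
    """Weighted sum of the n-gram counts; unseen n-grams weigh 0."""
    return sum(weights[gram] * count for gram, count in features.items())


def train_perceptron(examples, epochs=10):
    """Fit a linear model: +1 means 'violation', -1 means 'no violation'."""
    weights = Counter()
    for _ in range(epochs):
        for text, label in examples:
            features = ngrams(text)
            if label * activation(weights, features) <= 0:
                # Misclassified: move the weights towards the right label.
                for gram, count in features.items():
                    weights[gram] += label * count
    return weights


def predict(weights, text):
    return 1 if activation(weights, ngrams(text)) > 0 else -1


# Invented fact summaries with invented outcomes, for illustration only.
training = [
    ("applicant held without judicial review", 1),
    ("applicant detained without access to lawyer", 1),
    ("applicant received prompt judicial review", -1),
    ("applicant had access to lawyer throughout", -1),
]
held_out = [
    ("applicant detained without judicial review", 1),
    ("applicant received access to lawyer", -1),
]

weights = train_perceptron(training)
predictions = [predict(weights, text) for text, _ in held_out]
accuracy = sum(p == label for p, (_, label) in zip(predictions, held_out)) / len(held_out)
print(predictions, accuracy)
```

The model only picks up which word patterns co-occur with which outcome in texts that were themselves written to justify that outcome, which is the limitation noted above. Accuracy on a toy set like this carries no evidential weight.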

Another practical example of the use of AI, from the US, is the prediction of recidivism in criminal cases. The tool in question, the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), is used in practice by U.S. criminal judges in some states when assessing the recidivism risk of defendants or convicted persons, in decisions on pre-trial detention, sentencing or early release. Those in favor of using tools such as COMPAS say that they reduce the number of people detained because the tools make the assessment of the recidivism risk more objective. The US detains far more people than any other country, and this is considered undesirable for several reasons.13 COMPAS uses data from the criminal record and from a questionnaire with 137 questions, including questions such as: “Is someone who is hungry allowed to steal? Strongly disagree, disagree, etc.”. The tool has its flaws, however. By using data from the past, it systematically overestimates recidivism among African American defendants compared to Caucasian Americans.

The information is intended to indicate that defendants do not pose a risk and therefore do not need to be detained. However, judges using the tool in fact detain more people than before.14

Another example is the US startup Ravel. It developed tools for analyzing trends in judgments, courts, and also judge profiles, and offered the tools on a subscription basis. The tools’ operation is not public, and no information about their accuracy is known to the author. Ravel was purchased by LexisNexis, the largest provider of legal information in the US, and the tools are now part of LexisNexis’ service package.

5. AI in court practice: Ethical principles

Technology is one thing, but how we can and should work with it, in practice, is still heavily debated. At the time of writing, already more than 25 documents set out ethical principles for the use of AI, including those of the Institute of Electrical and Electronics Engineers (IEEE), the European Union and the Council of Europe. The Commission for the Efficiency of Justice (CEPEJ) of the Council of Europe has addressed the issue. CEPEJ’s Working Party on Quality (GTQUAL) developed ethical principles for the use of AI in the administration of justice. CEPEJ adopted them in December 2018.15 These five ‘Ethical Principles’ overlap here and there, so dealing with them in a rigorous and systematic manner is a little problematic.

1 Respect for fundamental rights. Ensure that design and implementation of AI services and tools are compatible with fundamental rights such as privacy, equal treatment and fair trial.

2 Equal treatment. Avoid discrimination between individuals and groups of individuals. The example of COMPAS above shows that discrimination and unjustified distinction between individuals and groups is a real risk. The data used by the algorithm may be the cause, and the prejudice may also be embedded in the algorithm itself.

3 Data security. When processing judicial decisions and data, certified sources and data that cannot be altered should be used, with models that are multidisciplinary in design, in a secure technological environment.

4 Transparency. Data processing methods should be made transparent and comprehensible, and external audits should be allowed. The user of an algorithm must make public the choices made, and the data and assumptions used, in a complete, timely and appropriate manner so that these choices, data and assumptions are accessible to third parties. Such full, timely and appropriate disclosure should make it possible to assess the choices made and the data, reasoning and assumptions used, so as to ensure effective legal protection against decisions based on those choices, data, reasoning and assumptions, with the possibility of judicial review by the courts.

This is now consistent case law in the Dutch courts.

5 AI under user control. The algorithm may not be used as a prescription, i.e. the computer does not prescribe anything and cannot decide by itself. Users must know and understand what the AI does, and the users must be in control of the choices they make. This means that users must be able to deviate from the outcome of the algorithm without difficulty. This human control was an issue in the Loomis case, before the Supreme Court of Wisconsin.16 At issue were: (1) whether the use of the result of a risk assessment by an instrument such as COMPAS, where the operation is a business secret, violates the defendant’s right to a fair trial because the secret operation deprives a defendant of the opportunity to test the accuracy and scientific value of the risk assessment, and (2) whether it violates the right to a fair trial to rely on such a risk assessment because it includes gender and race in the assessment of the risk of recidivism. The Wisconsin Supreme Court dismissed Loomis’ objections, but said that the judge should give reasons as to how he or she uses COMPAS. Loomis then petitioned the Supreme Court of the United States, which decided not to hear the case.

In the Netherlands, the Council of State recommended that the principles of good governance, and in particular the principle of a reasoned decision and the due diligence principle, should be interpreted more strictly in the context of digitisation.17 This means, among other things, that a decision must explain which decision rules (algorithms) have been used and which data have been copied from other administrative bodies. This will strengthen the position of citizens in automated and chain decision-making; in the phase of objection to automated decisions, customized and human reconsideration is recommended.

What can happen when IT is blindly relied upon is shown by an example from the courts in the United Kingdom.18 There, a relatively simple piece of IT determines the financial capacity of (ex)-spouses in maintenance proceedings. The parties fill in a PDF form, and the IT calculates the resulting capacity. Due to a small mistake, which went unnoticed, incorrect calculations were made in 3,638 cases between April 2011 and January 2012, and between April 2014 and December 2015. Debts, instead of being deducted, had been added to the assets, so the assets taken into account were too high. In cases that were still pending, this could still be corrected. However, incorrect decisions were issued, and presumably complied with, in more than 2,200 cases.
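
The arithmetic of such an error is simple, which is part of why it can go unnoticed. The Python sketch below uses hypothetical figures and function names; it is not the actual form logic, only an illustration of how adding debts instead of deducting them inflates the result, and of the kind of automated check that would flag it.

```python
def net_assets(assets: float, debts: float) -> float:
    """Intended calculation: debts are deducted from the assets."""
    return assets - debts


def net_assets_faulty(assets: float, debts: float) -> float:
    """The sign error described in the text: debts are added instead."""
    return assets + debts


def passes_sanity_check(calculate, assets: float, debts: float) -> bool:
    """With non-negative debts, the net figure should never exceed the assets."""
    return calculate(assets, debts) <= assets


# Hypothetical figures, chosen only to make the arithmetic visible.
assets, debts = 100_000, 30_000
correct = net_assets(assets, debts)
faulty = net_assets_faulty(assets, debts)
overstatement = faulty - correct  # equals twice the debts

print(correct, faulty, overstatement)
print(passes_sanity_check(net_assets, assets, debts))
print(passes_sanity_check(net_assets_faulty, assets, debts))
```

With these figures the faulty version overstates the assets by twice the amount of the debts. A check of this kind, run on every release of the form, is one small instance of the continuous testing discussed in section 6.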

In conclusion: AI can have a number of functions for courts and judges, and also for parties to a case and individuals seeking justice. The function with the best evidence of success so far is the structuring of large amounts of information, which could make the administration of justice more efficient. Advising and forecasting are functions that are still subject to many reservations. But when judges use AI results in their judgment, this is accepted in practice, provided they give their reasons. However, there are conditions attached to making AI useful for courts. The following section discusses those conditions.

6. What is needed to make AI useful in courts?

Article 6 of the ECHR, and consequently the Ethics Guidelines, set the standard for a proper procedure. It requires, among other things, a transparent procedure, equality of the parties to the proceedings, and also a well-founded judgment. Reduction of judicial complexity as described in section 3 above must therefore be substantiated, transparent and offer a level playing field to the litigants.

In order for AI to be able to process legal information effectively, the legal information must first be made machine-processable. For AI to work in accordance with Article 6 of the ECHR, more is required. It has long been known that bad data, such as legally incorrect decisions, reduce the quality of the AI result.19 But correct data is not enough. Text recognition with natural language processing, in which the text-driven behavior of lawyers and judges is calculated from an external perspective, can recognise patterns.20 Patterns such as statistical relations are not enough to substantiate a judgment. For the AI to be able to process and understand legal information, that information needs to be enriched: structured and provided with legal meaning.21 At present, this structuring and meaning must be added to judgments (text documents) after they have been written. AI can be used much more effectively once legal information such as court decisions is made machine-processable before publication, with textual readability, document structures, identification codes and metadata all available. Adding legal meaning in the form of structured terminology and defined relationships will further increase the effectiveness of AI in the court process.

In addition, AI also needs enough data in order to work. How much would that be, if AI needs 100,000 pictures of cats, as suggested in my earlier example? Few jurisdictions will be able to provide that many decisions on exactly the same issue. Moreover, judgments will nearly always contain decisions on many issues, procedural as well as substantive. Nor are the substantive decisions invariably yes/no decisions.

How many such yes/no decisions does AI need to reach a reasonably reliable conclusion?22
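
There is no general answer, but a narrower, related question can be made concrete: how many decided cases are needed merely to measure a tool’s accuracy with a given precision? The Python sketch below applies the standard normal-approximation formula for estimating a proportion, n = z² · p(1 − p) / e². The 95% confidence level and the margins are assumptions chosen for illustration; the 79% figure is the accuracy claimed for the European Court of Human Rights tool discussed in section 4. The sketch says nothing about how many decisions are needed to train a model.

```python
import math


def decisions_needed(expected_accuracy: float, margin: float, z: float = 1.96) -> int:
    """Test-set size needed to estimate an accuracy within +/- margin.

    Uses the normal approximation n = z^2 * p * (1 - p) / margin^2;
    z = 1.96 corresponds to a 95% confidence level.
    """
    p = expected_accuracy
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Assumed margins of 5 and 2 percentage points around a claimed 79% accuracy.
print(decisions_needed(0.79, 0.05))
print(decisions_needed(0.79, 0.02))
```

Under these assumptions, about 255 comparable yes/no decisions would be needed to pin the accuracy down to within five percentage points, and roughly 1,600 for two points, all set aside for testing rather than training.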

AI should be able to explain how it came to its result. This can be an explanation of how the data were processed, but also a substantive explanation. Research shows that AI in general should be technically capable of the kind of explanation that we now ask from humans, but that in practice humans can explain some aspects more easily than AI.23

So there is still a lot of work to be done to make AI really useful in the courts. AI must be able to explain how the result came about, and judiciaries must digitise their information and provide legal interpretation. Judges and others who work with AI also need to understand how the AI works. The Ethics Guidelines must be implemented and made to work in the institution and in the court work processes. Who is authorized to make which decisions, and who will monitor compliance?

The examples above show that human control is needed in all phases. First of all, human users need to determine what the AI has to do, and how it is measured and evaluated. There has to be continuous testing to ascertain that the AI is still doing what it should. The system has to be designed in such a way that it can be easily and robustly adjusted, and continuous auditing is necessary.24 And should such an audit be done within the judiciary, which is after all independent, or is an external audit appropriate?

I am convinced that with an external audit the judiciary can be more transparent. That will generate more trust than just an internal audit.

7. Conclusions

What good can AI do for justice, and what does it take? This article explored what is known about AI in courts. Not all court work is complex custom work; therefore the need for information technology is not the same for all cases. AI, which is, after all, also information technology, can therefore be useful in different ways for different types of cases. Some AI has already proven itself in practice. There is not (yet) any evidence that robots (are going to) judge. The standard of Article 6 ECHR prescribes a proper procedure. A lot of work is still needed before AI can comply with this particular standard.

Legal information needs to be more structured and endowed with meaning. Explaining how the result was reached by means of AI is not yet feasible. AI is already able to help individuals, litigants and judges with organising information. As the library of legal information is enriched, artificial intelligence can also help with advice and suggestions. Judges need to understand how AI works, in order to make adequate use of it. Courts, in turn, need to digitise their information and provide it with legal interpretation, in order to make it more usable for artificial intelligence systems. Courts must constantly monitor their systems for effectiveness and adjust them if necessary. For courts and court systems, largely set up and run as production organisations, this kind of development work is a huge new task.