Georgetown’s Civil Justice Data Commons Seeks to Unlock Court Data

September 29, 2021

Professor Tanina Rostain

A couple of years ago, two Georgetown professors, Georgetown Law’s Tanina Rostain and the McCourt School of Public Policy’s Amy O’Hara, connected at a conference and discovered they shared a frustration: civil court data are largely inaccessible. Rostain, whose work centers on access to courts and the civil justice system, and O’Hara, an expert on data governance at McCourt’s Massive Data Institute, decided to combine forces to try to find solutions.

The result is the Civil Justice Data Commons, a joint project of Georgetown Law’s Institute for Technology Law & Policy and the Massive Data Institute, which seeks to collect and organize information on the U.S. civil justice system and make it available to researchers and to courts themselves. The initiative recently announced a partnership with Amazon Web Services’ new AWS Innovation Studio, which focuses on providing valuable expertise in systems design and customer service.

We took the opportunity to speak with Rostain about the challenges of gathering information from the existing civil court system, how the Civil Justice Data Commons has developed to date, and what she hopes it can achieve once it’s fully operational.

Why is it currently so difficult to access information on the U.S. civil justice system?

It is a systems problem. We have 50 states and the District of Columbia and more than 3,000 counties in the United States. There are all sorts of courts: state, county, municipal, specialty courts and more. Court systems have different names for the same cases — an eviction in one jurisdiction is an unlawful detainer in another. They all collect data differently, and sometimes they don’t collect data at all. Criminal justice institutions are required to report data to the Department of Justice. But there’s no clearinghouse or standardized rules for data collection or reporting in civil courts.

No one has access to data to understand what’s going on in civil courts. Recent news has rightly focused on the number of evictions going on in courts. Less visible, but no less important, are the cases in which debt collectors are trying to collect on health care, credit card and other consumer debt owed by poor people. It is profoundly disturbing that we have so little knowledge of how these central democratic institutions treat the most vulnerable people in our society.

What kind of problems are researchers running into?

Researchers who want to study civil courts always have to struggle to get data. They have to negotiate back and forth with each court to get access. If you make it easy for courts to share their data and for researchers to find it, a lot more researchers are going to jump in to study courts.

It may come as a surprise, but many courts don’t know what is going on in their own systems. Making their data more readily available to researchers would make it easier for researchers to answer questions that courts have about their own operations.

So the courts would use this too, and not just to see what’s going on in a neighboring state? They don’t even have a good collection of their own information?

Right, there are courts that have approached us, concerned, for example, about issues of bias and racism. They would like to know how people with different demographic characteristics are being treated in their courts — if there’s a problem or not — and they don’t have data to answer that. Other courts want to know things like did the number of evictions go up recently, or down? Is there a trend? Maybe they need to allocate more court resources to eviction cases. Courts have both narrow questions about their operations and big-picture questions about how the system treats people. In one way or another, all these questions go to how well our justice system works.

What have you and Professor O’Hara done so far with the Civil Justice Data Commons?

We received a grant from the National Science Foundation to interview courts and legal services providers to understand what data they had, why they would want to share data, and what obstacles they faced in doing so. The most important question was “what’s in it for me?” We had to surface courts’ interest in sharing data and the specific questions they wanted answered.

Our next step was to build a prototype under a grant we received from The Pew Charitable Trusts. I was very lucky to be collaborating with Amy, who has more than 15 years of experience in data governance. We had to figure out the infrastructure. How do courts share data? How do we clean, standardize and document them? How do courts want their data displayed? How do researchers want to obtain access? We also had to create a governance regime that covered data sharers’ rights and obligations, our obligations and the obligations of researchers. For example, we agree to store the data securely and remove personally identifiable information. We also agree to vet research proposals to ensure they are consistent with how courts want their data used.

Then, to demonstrate that the CJDC worked, we obtained publicly available data from court websites, then cleaned and documented them. A researcher then dove in and kicked the tires.

We are now working on methodologies to determine the demographic composition of people in court. Around half of tenants in evictions and debtors in consumer debt cases don’t show up in court, so it’s impossible for courts to collect that information. With funding from the JPB Foundation, we are going to link court data to Census Bureau data to understand the racial and ethnic origins of litigants. This is obviously a necessary step if we are to understand whether there is systemic racism and bias in civil courts.

How has the team from AWS Innovation Studio helped develop the project?

We had a series of design sessions with stakeholders, brainstorming what the CJDC ought to look like. They have a process called “Working Backwards,” which starts with the user, whether researcher or data sharer, and what their needs are. They showed us what a very sophisticated design process looked like: they started by listening very closely to the stakeholders and identified at a granular level what would make it easy for courts to share data and for researchers to get access to data. Courts needed efficient mechanisms to gather and send over data. For researchers, we wanted to create “fast, frictionless and facilitated access.” AWS Innovation Studio helped those of us who are developing the CJDC to figure out the many internal steps needed between getting data on one end and making them available for researchers on the other.

What would better access to this kind of data mean?

Simply put, it would mean much more knowledge about how our institutions of civil justice function. We don’t know what happens to people who get involved in the civil justice system. How is it affecting their lives? Is it making them worse? If so, we’ve got to think about what we can do upstream to prevent people from ending up in court. But we don’t know the answer. We know little bits and pieces of the story, but there’s so much more that we need to know and understand about what happens to the millions and millions of people who end up entangled in the court system every year.