A mountain road with a splitting path, one side with an arrow and the other a "wrong way" sign

When we first set out to build the Civil Justice Data Commons (CJDC) in the spring of 2020, we embarked on a listening tour of various stakeholders in the field. From our kickoff meetings to our National Science Foundation planning grant interviews, we spoke to upwards of 50 actors across courts, legal service providers (LSPs), academia/research, and policy nonprofits. Our colleagues and interviewees told us what would be most helpful to them in a CJDC, namely a Commons that would:

  • Harmonize datasets from different jurisdictions by standardizing individual data elements, ideally by mapping them to the National Open Court Data Standards (NODS)
  • Obtain data through data sharing agreements directly from civil courts that expressed openness to this sharing
  • Create dashboards and other data visualization products for the courts, to aid their day-to-day operational needs

We planned our future efforts around these elements, as seen in our Amazon Web Services design sprints, this sample dashboard, and our pre-2022 blog posts.


However, as we exited the planning stage and entered the implementation stage, we quickly discovered that the goals we had set for the CJDC did not align with stakeholder needs on the ground. A few realizations came to light that led us to shift the CJDC's direction:

  • The courts we contacted were struggling administratively to share data, even where there was interest. Unfortunately, they do not have the staff or the time to participate in these partnerships at the moment.
  • Due to the lack of court partnerships, data dashboards were no longer needed. Several courts also already have data analytics teams creating visualizations.
  • Standardization across different civil legal datasets proved extremely difficult, as we discovered just how disparate court data schemas are from one another.

Therefore, in 2022, we shifted our focus from our previous CJDC model to that of a data repository. In this new repository model, we still serve as a secure virtual platform for civil justice datasets, through which authorized researchers can conduct analyses. Now, however, we are not thoroughly mining the data for standardization, and we primarily use scraped data sources (from some courts, LSPs, and secondary parties) instead of relying on court partnerships. Our decision to shift directions was further validated at our recent Clustering & Classifying Methods Convening, where fellow data scientists and researchers agreed on how difficult civil justice data harmonization is for any one party to tackle.


Encouragingly, other actors are already doing exciting work that fills in what the new CJDC model leaves out. NCSC is helping state court partners map their data schemas onto NODS, and various presenters from the Clustering convening are working on machine learning methodologies for cleaning and clustering court data. We look forward to providing a secure hosting site for civil justice data research and supporting our colleagues in their neighboring efforts, as we all strive to continually improve access to justice for individuals in the U.S.