Teaching
Legal Data Science
This page collects all the materials I’ve created to help people learn more about Legal Data Science.
Legal Data Science is the application of computational statistical methods to the legal domain. In other words, it combines programming, statistics and domain expertise (Conway 2013).
All the materials I create are published under open licenses and I would be glad to see them used wherever they can help — whether for self-study, in class or anywhere else.
As new tutorials, essays and other items are published, I will add them to this page, sorted by topic. If you are looking for a chronological list, check out my blog for what is new.
Tutorials
Tutorials introduce and teach key subjects from statistics and computer science with detailed narrative explanations and code, and always discuss the link to law and politics so it becomes clear how these tools are useful to lawyers.
Introduction
Statistics
- Distributions and Summary Statistics
- Representativeness, Samples and Populations
- The Importance of Data Visualization (Datasaurus Edition)
Natural Language Processing
Essays
Essays deal with contemporary legal and political issues, specialist mathematical/computational problems or just anything else that is tangentially related to my research interests.
It is important to understand that technology, especially legal technology, does not and cannot exist in some computer science paradise. Badly designed technology kills, maims and impoverishes real people. Legal technology can scale terrible injustice as easily as real justice.
Be warned and be careful with what you build and deploy.
Political and Legal Essays
- The Breaking of a Social Contract, or Why I am Switching to Copyleft Licensing (2024)
- Some Thoughts on the Use of Large Language Models in the Legal Domain (2023)
Technical Essays
- The Limits of Reproducibility with Rocker Docker Images for R (2024)
- Git Credential Management (2024)
- Visualizing Permutation Matrices as Graphs (2024)
- Modeling Data Workflows with {targets} (2023)
Selected Talks
Talks are a useful way to get a high-level introduction to my work. I publish the dates of my upcoming talks on my blog and (almost) always upload the slides, notes and materials afterwards. They are designed to be informative even in the absence of a recording, so feel free to browse!
- Introduction to Legal Data Science (August 2024)
- ‘Corpus of Resolutions: UN Security Council’ Launch Event (May 2024)
- International Legal Data in Action: Ideas and Applications for the ICJ and PCIJ (March 2023)
Advanced Learning and Teaching
Advanced students and teachers of Legal Data Science may wish to review the open source code I have written and published, build projects on top of the open legal datasets I make available, or check out the list of my preferred tools as inspiration for their own workflows.
- Open Legal Datasets to practice Legal Data Science or build your own projects
- Open Source Code to understand how I work and which techniques I use
- List of my favorite tools which might be as useful to you as they are to me
Many of my published Open Legal Datasets include a ZIP archive called “analysis” with well-designed and informative diagrams that analyze key characteristics of the dataset.
All diagrams are available in high resolution for print (PDF) or web (PNG) use.
These diagrams are published under the same public domain license as their parent dataset and are intended to help teachers of Legal Data Science by saving them some work.