- [Open Access] Entscheidungen des Bundesgerichtshofs in Strafsachen aus dem 20. Jahrhundert (BGH-Strafsachen-20Jhd)
- [Open Access] Source Code on Zenodo
- [Open Access] GitHub Repository
New Dataset available!
Tilko Swalve and I are excited to announce the publication of a new dataset!
We collected, cleaned, sorted and published more than 36,000 criminal law judgments of the German Federal Court of Justice, the Bundesgerichtshof (BGH), as Open Access.
These data are extremely rare. Extensive collections of older judgments issued by German courts are usually only available commercially, if at all. Open access to them is especially valuable for the rule of law, because criminal law deeply impacts the rights of citizens. The German Federal Court of Justice has been setting down criminal law guidelines for all of Germany since its creation in 1950.
Unfortunately, the German Federal Court of Justice only started publishing its judgments online in 2000 and has not done so retroactively for older decisions. Many important precedents and landmark decisions have been denied to the public until now. They were only available to purchasers of expensive commercial subscription services.
We are changing this. Now.
About the Dataset
The dataset Entscheidungen des Bundesgerichtshofs in Strafsachen aus dem 20. Jahrhundert (BGH-Strafsachen-20Jhd) is an as-complete-as-possible collection of judgments in criminal matters issued by the German Federal Court of Justice between 1 October 1950 (the founding of the court) and 1 January 2000, the date on which the court started publishing decisions online.
We obtained judgments from five of the court's “senates” (panels) for the years 1950 to 1999. A sixth senate existed from 1954 to 1956, but we have no data for that senate.
We offer the dataset in the machine-readable formats TXT and CSV, and also include the original PDF files for traditional legal work.
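As a rough illustration of what the CSV variant makes possible, the following Python sketch counts rows per value of a column. The sample data and the column names `doc_id` and `year` are invented for this example; the actual variable names are defined in the Codebook.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_text: str, column: str) -> Counter:
    """Count how many rows share each value of `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Tiny made-up sample; "doc_id" and "year" are hypothetical column names.
sample = "doc_id,year\nA,1951\nB,1951\nC,1952\n"
counts = count_by_column(sample, "year")
print(counts)
```

For the real file, the same function could be fed the text of the CSV after reading it from disk with the encoding stated in the Codebook.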
Features
- 31 variables
- Data model compatible with the Corpus der Entscheidungen des Bundesgerichtshofs (CE-BGH)
- Public Domain (CC-Zero 1.0)
- Open and platform independent file formats (PDF, TXT, CSV)
- Extensive Codebook
- Compilation Report explains the construction and validation of the dataset in detail
- Large number of diagrams for all purposes (see the ‘ANALYSIS’ archive)
- Diagrams are available as PDF (for printing) and PNG (for web display); tables are available as CSV so that both humans and machines can read them easily
- Secure cryptographic signatures
- Publication of full source code (Open Source)
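This post does not spell out the signature scheme, so the following is only a sketch of a simpler, related integrity check: comparing a downloaded file's SHA-256 digest against a reference value. It assumes that such a reference digest is available to you; the function names are hypothetical, and a digest comparison is not a substitute for verifying the actual signatures.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reference_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a reference value (assumed to be given)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use flat, which matters for large ZIP archives.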
Content of the Dataset
By Year
By Senate (Panel)
By President
Workflow
The data pipeline offers the following features:
- Clean filenames
- Correct page rotation and standardize pages to portrait orientation
- Optical character recognition (OCR)
- Automated cleaning of OCR errors related to German legal terminology
- Extraction of additional variables
- Production of ready-to-use ZIP archives
- Comprehensive documentation
- Automated unit tests and statistical reporting
- Cryptographic signatures
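To give a feel for the OCR cleaning step, here is a minimal Python sketch of rule-based correction. The rules below are hypothetical examples (statute abbreviations split by stray spaces), not the rules used in the actual pipeline, which is documented in the published source code.

```python
import re

# Hypothetical correction rules, NOT the ones used in the real pipeline.
OCR_RULES = [
    (re.compile(r"\bS\s?t\s?G\s?B\b"), "StGB"),  # "St GB", "S tGB" -> "StGB"
    (re.compile(r"\bS\s?t\s?P\s?O\b"), "StPO"),  # "S tPO", "St PO" -> "StPO"
    (re.compile(r"[ \t]{2,}"), " "),             # collapse runs of spaces/tabs
]


def clean_ocr_text(text: str) -> str:
    """Apply each correction rule in order and return the cleaned text."""
    for pattern, replacement in OCR_RULES:
        text = pattern.sub(replacement, text)
    return text


cleaned = clean_ocr_text("§ 211 St GB  und § 261 S tPO")
print(cleaned)
```

Keeping the rules in a plain list makes each correction easy to review and to cover with a unit test, which matches the automated testing mentioned above.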
Copyright
The dataset is released into the public domain under the Creative Commons Zero 1.0 Universal (CC0 1.0) Public Domain Dedication.