Overview

New citation data available! I’ve updated the Corpus of German Federal Court of Justice decisions and it now includes a specialized variant containing all citations to its own decisions, extracted from the text of those decisions.

The scope of the citation network in Version 2024-09-25 is:

  • ca. 600,000 individual citations
  • ca. 440,000 edges (citation connections weighted by number of individual citations)
  • ca. 100,000 nodes (Aktenzeichen, BGHZ and BGHSt)

Aktenzeichen are German docket numbers. BGHZ and BGHSt are citations to the official collections of decisions in civil (BGHZ) and criminal (BGHSt) matters.
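To make the relationship between individual citations, weighted edges and nodes concrete, here is a minimal Python sketch. It assumes the citations are available as (citing, cited) pairs. The sample identifiers and the function name `build_weighted_edges` are invented for illustration and are not taken from the corpus or its tooling.

```python
from collections import Counter

# Hypothetical sample: one (citing, cited) pair per individual citation.
# These identifiers are made up and do not refer to real corpus entries.
citations = [
    ("XI ZR 1/20", "XI ZR 2/10"),
    ("XI ZR 1/20", "XI ZR 2/10"),  # the same pair cited a second time
    ("XI ZR 1/20", "BGHZ 100, 1"),
    ("XI ZR 3/21", "XI ZR 1/20"),
]


def build_weighted_edges(pairs):
    """Collapse individual citations into edges weighted by citation count."""
    return Counter(pairs)


edges = build_weighted_edges(citations)
# Every identifier that appears as citing or cited is a node.
nodes = {node for pair in edges for node in pair}

print(len(citations), "individual citations")
print(len(edges), "weighted edges")
print(len(nodes), "nodes")
```

The sum of all edge weights equals the number of individual citations, which is why the corpus has more citations than edges.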

As of now this variant is in beta testing. It contains the following types of citations:

  • Citations from Aktenzeichen to Aktenzeichen
  • Citations from Aktenzeichen to BGHZ
  • Citations from Aktenzeichen to BGHSt

Citations by Aktenzeichen (docket number) are less accurate than decision-level citations, because identifying a single decision requires the date as well as the docket number. However, 91.78% of all Aktenzeichen in the CE-BGH dataset are unique (independent of the date), so this is a reasonable approximation. I intend to add decision-level citation support in the future.
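One way to read the uniqueness figure is as the share of docket numbers that belong to exactly one decision date. The following Python sketch computes that share under this assumption. The post does not spell out the exact definition, and the function name and sample records are hypothetical.

```python
from collections import defaultdict


def share_of_unique_dockets(decisions):
    """Return the fraction of docket numbers that occur with exactly one date.

    decisions: iterable of (docket_number, decision_date) pairs.
    """
    dates_by_docket = defaultdict(set)
    for docket, date in decisions:
        dates_by_docket[docket].add(date)
    if not dates_by_docket:
        raise ValueError("no decisions given")
    unique = sum(1 for dates in dates_by_docket.values() if len(dates) == 1)
    return unique / len(dates_by_docket)


# Hypothetical records: "XI ZR 2/10" has two decisions on different dates.
sample = [
    ("XI ZR 1/20", "2021-03-02"),
    ("XI ZR 2/10", "2010-11-09"),
    ("XI ZR 2/10", "2011-05-17"),
    ("XI ZR 3/21", "2022-01-18"),
    ("XI ZR 4/19", "2020-06-30"),
]

share = share_of_unique_dockets(sample)
print(f"{share:.2%} of docket numbers are unique")
```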

Warning
Citing decisions can only be those for which the full text is available, which means decisions from 2000 onwards. Cited decisions can be from any year.

Some Network Statistics

  • Number of nodes: 101,474
  • Number of edges: 441,884
  • Strength (out): 593,154
  • Mean degree: 8.71
  • Max degree: 559
  • Min degree: 0
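The reported mean degree is consistent with counting each edge at both of its endpoints (in-degree plus out-degree), since 2 × 441,884 / 101,474 ≈ 8.71. The Python sketch below shows how such statistics could be derived from a weighted edge list. It assumes this degree definition, which the post does not state explicitly, and the function name `network_stats` is hypothetical.

```python
def network_stats(edges, nodes=None):
    """Compute basic statistics for a directed, weighted citation network.

    edges: dict mapping (citing, cited) -> weight (number of citations).
    nodes: optional iterable of all nodes, so isolated nodes can be included.
    """
    all_nodes = set(nodes or ())
    all_nodes.update(n for pair in edges for n in pair)
    if not all_nodes:
        raise ValueError("network has no nodes")

    # Total degree: each edge counts once for its source and once for its target.
    degree = {n: 0 for n in all_nodes}
    for citing, cited in edges:
        degree[citing] += 1
        degree[cited] += 1

    return {
        "nodes": len(all_nodes),
        "edges": len(edges),
        "strength_out": sum(edges.values()),  # total outgoing edge weight
        "mean_degree": sum(degree.values()) / len(all_nodes),
        "max_degree": max(degree.values()),
        "min_degree": min(degree.values()),
    }


# Hypothetical toy network with one isolated node "D".
toy_edges = {("C", "A"): 2, ("C", "B"): 1, ("B", "A"): 1}
stats = network_stats(toy_edges, nodes=["A", "B", "C", "D"])
print(stats)

# Cross-check against the figures reported above: mean degree = 2 * E / N.
reported_mean_degree = 2 * 441_884 / 101_474
print(round(reported_mean_degree, 2))
```

A minimum degree of 0 implies that the node set contains isolated nodes, which is why the sketch accepts an explicit node list in addition to the edges.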

Network Diagram for the 11th Civil Senate (Banking Senate)

This diagram visualizes the citation network extracted from the decisions of the 11th Civil Senate, also known as the Banking Senate. It represents only a subset of the data. The complete network is probably too large to be visualized in any single diagram.

The visualization algorithm is Sugiyama.

The small white dots are individual docket numbers or BGHZ decisions. Each connecting line stands for one or more citations between two nodes. The number of citations between the same pair of nodes is not shown in the diagram, but the weights are available in the network data.

The network is hierarchical, because newer decisions can only cite older decisions, not the other way around. The diagram is read from top to bottom.
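The hierarchy can be illustrated by assigning each node to a layer so that every decision sits above or below everything it cites. The Python sketch below uses longest-path layering on an acyclic citation graph, which is a common first step in layered graph drawing. It is an illustration only, not necessarily the procedure used to produce the diagram, and the function name and node labels are hypothetical. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def assign_layers(cites):
    """Assign each node a layer index by longest-path layering.

    cites: dict mapping a citing node to the set of nodes it cites.
    Nodes that cite nothing get layer 0; every other node is placed one
    layer beyond the highest layer among the nodes it cites.
    Raises graphlib.CycleError if the citation graph contains a cycle.
    """
    layers = {}
    # static_order() yields cited nodes before the nodes that cite them.
    for node in TopologicalSorter(cites).static_order():
        cited = cites.get(node, ())
        layers[node] = 1 + max((layers[c] for c in cited), default=-1)
    return layers


# Hypothetical example: C cites A and B, B cites A.
layers = assign_layers({"C": {"A", "B"}, "B": {"A"}})
print(layers)
```

In real data, docket-number nodes can stand for more than one decision, so cycles are possible and the `CycleError` case should be handled.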

Because of the strong connections between certain clusters of decisions, one might call them “lines of jurisprudence”. However, research on this topic is still in its infancy, and the clusters may simply be an artifact of the layout algorithm.

Info
High-resolution versions of the diagrams can be downloaded here.