Alexander V. Belikov (Growgraph, Paris), Sacha Raoult (Institut Universitaire de France, Aix-Marseille University)
January 24, 2025
arXiv | PDF
The paper presents an end-to-end framework for transforming unstructured French criminal court appeals into structured knowledge graphs. The authors process 2,820 appeals from the criminal chamber of the French Cassation Court (France’s Supreme Court for criminal cases), using GPT-4o mini to extract entities and relationships as RDF triples. The core contribution is a domain-specific criminal law ontology, developed semi-automatically through iterative interaction with LLMs (GPT-4o mini and Claude 3.5 Sonnet), which guides extraction and keeps the output consistent and structured. The key finding is that ontology-guided RDF triple generation substantially outperforms property graph extraction: the RDF method achieved over 90% accuracy (93% precision, 89% recall), compared with only 50-60% for the property graph approach. This suggests that including a well-designed domain ontology in the LLM prompt is critical for reliable legal knowledge graph construction.
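To make the precision and recall figures concrete, here is a minimal sketch of how such scores can be computed over sets of triples. It assumes exact matching between extracted and reference triples, which the summary above does not confirm as the authors' evaluation procedure. The function name `precision_recall` and all `ex:` identifiers are hypothetical, and the numbers produced are toy values, not results from the paper.

```python
def precision_recall(predicted, gold):
    """Score extracted triples against a reference set.

    Assumption: a triple counts as correct only if it matches a
    reference triple exactly (same subject, predicate, and object).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy reference triples for one appeal (identifiers are made up).
gold = {
    ("ex:appeal_1", "ex:filedBy", "ex:person_1"),
    ("ex:appeal_1", "ex:concernsOffence", "ex:fraud"),
    ("ex:appeal_1", "ex:decidedBy", "ex:court_1"),
    ("ex:person_1", "ex:sentencedTo", "ex:fine_1"),
    ("ex:appeal_1", "ex:hasOutcome", "ex:rejected"),
}

# Toy extraction output: three correct triples, one spurious, two missed.
predicted = {
    ("ex:appeal_1", "ex:filedBy", "ex:person_1"),
    ("ex:appeal_1", "ex:concernsOffence", "ex:fraud"),
    ("ex:appeal_1", "ex:decidedBy", "ex:court_1"),
    ("ex:appeal_1", "ex:concernsOffence", "ex:theft"),
}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision penalizes spurious triples, while recall penalizes triples the extractor missed, so the two figures together say more than a single accuracy number.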
A knowledge graph (KG) is a structured representation of information where nodes represent entities (people, crimes, courts, punishments) and edges represent relationships between them. Unlike flat databases or plain text, KGs capture the connections between pieces of information, making them well suited to domains like law, where relationships between actors, events, and legal provisions are critical. There are currently two competing ways to store knowledge graphs: (a) Resource Description Framework (RDF) triples, which follow the format subject-predicate-object; and (b) property graphs (e.g., Neo4j), which store nodes and edges with arbitrary key-value properties.
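The difference between the two storage models is easier to see side by side. The sketch below encodes the same toy facts both ways using only plain Python structures. All identifiers (`ex:appeal_1`, `FILED_BY`, the `Node` and `Edge` classes, the helper `objects_of`) are hypothetical and are not taken from the paper's ontology or from any library's API.

```python
from dataclasses import dataclass, field

# (a) RDF style: every fact is one (subject, predicate, object) triple.
# Attributes and relationships are expressed the same way.
rdf_triples = [
    ("ex:appeal_1", "rdf:type", "ex:Appeal"),
    ("ex:appeal_1", "ex:filedBy", "ex:person_1"),
    ("ex:appeal_1", "ex:concernsOffence", "ex:fraud"),
    ("ex:person_1", "rdf:type", "ex:Defendant"),
]


def objects_of(triples, subject, predicate):
    """Return all objects linked to `subject` through `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# (b) Property graph style: nodes and edges each carry their own
# key-value properties, so a fact can live inside a node or an edge.
@dataclass
class Node:
    id: str
    labels: list
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    type: str
    properties: dict = field(default_factory=dict)


nodes = [
    Node("appeal_1", ["Appeal"], {"offence": "fraud"}),
    Node("person_1", ["Defendant"]),
]
edges = [
    Edge("appeal_1", "person_1", "FILED_BY", {"role": "appellant"}),
]

print(objects_of(rdf_triples, "ex:appeal_1", "ex:filedBy"))
print(edges[0].type, edges[0].properties)
```

In the triple form the offence is a separate statement with its own predicate, whereas in the property graph form it is folded into a node property. That freedom in where to place a fact is one reason a fixed ontology can make triple output easier to keep consistent.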