Automatic Summarization of Court Judgements using Machine Learning, with applications to summarizing Greek Court Judgements

Γαλάνης, Δημήτρης

National Technical University of Athens

School of Electrical and Computer Engineering

Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18542

Title:	Automatic Summarization of Court Judgements using Machine Learning, with applications to summarizing Greek Court Judgements
Authors:	Γαλάνης, Δημήτρης; Τσανάκας, Παναγιώτης
Keywords:	Automatic Text Summarization; Court Judgements; Machine Learning; Neural Networks; Natural Language Processing; BERT; ROUGE; Legal-AI; Artificial Intelligence and Law
Issue Date:	31-Oct-2022
Abstract:	The rapid growth in digitized text has heightened the need for reliable automatic methods that separate important information from unimportant. In the legal domain, court judgements are summarized mostly by hand by specialized legal editors, which is time-consuming. Yet these summaries are an essential part of a legal practitioner’s workflow: being shorter than the full text, they enable faster and more targeted searches for relevant case law. Summaries also let the practitioner focus on a judgement’s main points and so understand it better. Recent advances in Machine Learning have improved the performance of Automatic Text Summarization (ATS) systems as measured by automatic evaluation metrics. Moreover, deep pre-trained Language Models enable the use of ATS without large amounts of training data. However, most methods are trained and evaluated on news articles, a domain that differs from court judgements: judgements are longer, have a significantly different structure, and use specialized legal terminology. In our work, we attempt to automatically summarize Greek court judgements using machine learning methods. To that end, we first conduct an extended survey of the automatic text summarization literature, covering the methods, the datasets and evaluation metrics used, and the criticism that has been directed at them. We then construct a dataset of Greek court judgement texts and their summaries. We build an extractive summarization system, based on the LexRank algorithm, that extracts the most important sentences from a judgement. We also train an Encoder-Decoder Deep Learning model based on the BERT architecture, using open-source checkpoints trained on Greek parliamentary corpora, and use it to model abstractive summarization as a sequence generation task.
We evaluate our methods using the ROUGE family of automatic evaluation metrics and also conduct a human evaluation study. We show that domain-informed preprocessing and the inclusion of judgement classification information can improve the performance of our abstractive summarization methods. We also compare different variations of our extractive summarization methods. Evaluation by legal experts shows that our extractive methods achieve average performance, and that our abstractive methods, while generating moderately fluent and coherent text, score low on the relevance and consistency metrics, indicating the need for methods that stay factually aligned with the judgement text.
URI:	http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18542
Appears in Collections:	Διπλωματικές Εργασίες - Theses

Files in This Item:

File	Description	Size	Format
NTUA_ECE_Thesis_Template_EN__Copy_ (24).pdf		4.85 MB	Adobe PDF
