Title: Automatic Summarization of Court Judgements using Machine Learning, with applications to summarizing Greek Court Judgements
Authors: Γαλάνης, Δημήτρης
Τσανάκας Παναγιώτης
Keywords: Automatic Text Summarization
Court Judgements
Machine Learning
Neural Networks
Natural Language Processing
Artificial Intelligence and Law
Issue Date: 31-Oct-2022
Abstract: The rapid growth of digitized text has increased the need for reliable automatic methods that separate important information from unimportant information. In the legal domain, court judgements are summarized mostly by hand by specialized legal editors, which is time-consuming. Yet court judgement summaries are an essential part of a legal practitioner’s workflow: being shorter, they enable faster and more targeted search for relevant case law. Furthermore, a summarized judgement lets the legal practitioner focus on its main points and thus understand it better. Recent advances in Machine Learning have improved the performance of Automatic Text Summarization (ATS) systems in terms of automatic evaluation metrics. Moreover, deep pre-trained Language Models enable the use of ATS without large amounts of training data. However, most methods are trained and evaluated on the news-article domain, which differs from the court-judgement domain: judgements are longer, have a significantly different structure, and use specialized legal terminology. In our work, we attempt to automatically summarize Greek court judgements using machine learning methods. To that end, we first conduct an extended survey of the automatic text summarization literature, covering the methods, the datasets and evaluation metrics used, and the criticism that has been directed at them. We then construct a dataset of Greek court judgement texts and their summaries. We build an extractive summarization system, based on the LexRank algorithm, that extracts the most important sentences from a judgement. We train an Encoder-Decoder Deep Learning model based on the BERT architecture, using open-sourced checkpoints trained on Greek parliamentary corpora, and use it to model abstractive summarization as a sequence generation task.
We evaluate our methods using the ROUGE family of automatic evaluation metrics and also conduct a human evaluation study. We show that domain-informed preprocessing and the inclusion of judgement classification information can increase the performance of our abstractive summarization methods. We also compare different variations of our extractive summarization methods. Evaluation by legal experts shows that our extractive methods perform at an average level, and that our abstractive methods, while generating moderately fluent and coherent text, score low on the relevance and consistency metrics, indicating the need for methods that stay factually aligned with the judgement text.
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File: NTUA_ECE_Thesis_Template_EN__Copy_ (24).pdf
Size: 4.85 MB
Format: Adobe PDF

Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.