Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/20087
Full metadata record
DC Field | Value | Language
dc.contributor.author | Καραφύλλης, Νικόλαος | -
dc.date.accessioned | 2026-03-17T15:35:08Z | -
dc.date.available | 2026-03-17T15:35:08Z | -
dc.date.issued | 2026-03-17 | -
dc.identifier.uri | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/20087 | -
dc.description.abstract | Abductive reasoning, the process of inferring the most plausible causes from incomplete evidence, remains a significant challenge for Large Language Models (LLMs), demanding simultaneous evaluation of competing hypotheses under uncertainty. This diploma thesis addresses this challenge through the lens of SemEval 2026 Task 12: Abductive Event Reasoning, where a system must identify all plausible direct causes of a target event from four candidate explanations, using multi-document evidence as context. We develop two complementary approaches: a three-stage direct prompting pipeline combining hybrid GraphRAG retrieval, structured XML prompting refined through GEPA prompt optimization, and eight deterministic post-hoc verification rules; and an auxiliary multi-expert causal graph in which four specialized experts collaboratively construct explicit directed acyclic graphs with confidence-scored edges, providing interpretable causal chains that support human verification of the system’s reasoning. Our system achieves an accuracy of 0.95 on the test set, ranking first on the SemEval 2026 Task 12 evaluation-phase leaderboard. Through a cross-model error analysis spanning 15 configurations across 7 LLM families and the Causal Graph System, we identify three shared inductive biases: a single-cause default that reduces the annotated cause count by 47%, temporal proximity preference driving all wrong-answer failures, and salience preference favouring dramatic over subtler contributing causes. The Causal Graph System partially mitigates these biases, exhibiting the smallest multi-answer gap (−14.7 pp) and contributing 12 unique correct predictions, the most of any individual system. | en_US
dc.language | en | en_US
dc.subject | Abductive Reasoning | en_US
dc.subject | Large Language Models | en_US
dc.subject | Causality | en_US
dc.subject | Causal Graphs | en_US
dc.subject | Retrieval-Augmented Generation | en_US
dc.subject | Prompt Engineering | en_US
dc.subject | Multi-agent Systems | en_US
dc.subject | Prompt Optimisation | en_US
dc.title | Abductive Event Reasoning with Large Language Models | en_US
dc.description.pages | 119 | en_US
dc.contributor.supervisor | Βουλόδημος Αθανάσιος | en_US
dc.department | Division of Computer Science (Τομέας Τεχνολογίας Πληροφορικής και Υπολογιστών) | en_US
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File | Description | Size | Format
Diploma_Thesis_N_Karafyllis .pdf | | 1.85 MB | Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.