Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19380
Full metadata record
DC Field | Value | Language
dc.contributor.author | Παναγιωτόπουλος, Ιωάννης | -
dc.date.accessioned | 2024-11-05T10:11:02Z | -
dc.date.available | 2024-11-05T10:11:02Z | -
dc.date.issued | 2024-10-24 | -
dc.identifier.uri | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19380 | -
dc.description.abstract (en_US):

This thesis investigates methods to improve the reasoning capabilities of large language models (LLMs) on lateral-thinking problems, through two distinct proposals: the first based on fine-tuning and training, the second on advanced prompting techniques that require no additional training.

The first proposal is tied to the SemEval-2024 Task 9 competition, "BRAINTEASER: A Novel Task Defying Common Sense," and focuses on fine-tuning transformer-based models on the BRAINTEASER dataset. Models are trained to solve lateral-thinking challenges such as sentence and word puzzles, with lightweight tuning of both smaller encoder models and LLMs aiming to surpass baseline performance. A key element of this proposal is the transformation of multiple-choice problems into binary classification tasks, which allows the models to explore diverse reasoning paths. The analysis examines the influence of model size and hyperparameters, along with the reasoning cues that lead to model failures. The goal is to improve model accuracy and reasoning skills while providing insight into how LLMs handle lateral-thinking problems under targeted fine-tuning.

In contrast, the second proposal avoids model training altogether and instead improves performance through a novel prompting technique called RISCORE (RIddle Solving with COntext REconstruction). Inspired by the structure of the BRAINTEASER dataset, RISCORE augments few-shot learning with contextually reconstructed riddle examples designed to improve the model's in-context problem-solving abilities. It preserves the original reasoning process while altering the context, offering a clearer reasoning trajectory for the model to follow. Compared against other popular prompting methods, RISCORE proved effective at improving both lateral- and vertical-thinking performance without any additional training, highlighting the potential of strategic prompting for complex reasoning tasks that challenge common sense.

Together, the two proposals showcase distinct methodologies, one centered on model training and fine-tuning and the other on innovative prompting, and both contribute valuable insights into how LLMs can be improved for lateral-thinking challenges.
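The multiple-choice-to-binary-classification transformation described in the abstract could look roughly like the following minimal Python sketch. The field names ("question", "choices", "label"), the [SEP] pairing convention, and the example riddle are assumptions for illustration, not the thesis's actual preprocessing code.

def to_binary_pairs(example):
    """Expand one multiple-choice riddle into (question, choice) pairs,
    labeled 1 for the correct answer and 0 for each distractor."""
    pairs = []
    for i, choice in enumerate(example["choices"]):
        pairs.append({
            # Encoder-style pair input: question and candidate joined by [SEP].
            "text": example["question"] + " [SEP] " + choice,
            "label": 1 if i == example["label"] else 0,
        })
    return pairs

# Invented riddle, for demonstration only:
riddle = {
    "question": "A man shaves several times a day, yet he still has a beard. How?",
    "choices": ["He is a barber.", "He shaves badly.", "He owns no mirror.", "None of the above."],
    "label": 0,
}
for pair in to_binary_pairs(riddle):
    print(pair["label"], pair["text"])

At inference time, the candidate whose positive-class score is highest would be selected, recovering a multiple-choice prediction from the binary classifier.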
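Likewise, the RISCORE idea of pairing each few-shot demonstration with a context-reconstructed variant that keeps the same reasoning step could be sketched as a prompt builder. The Q/A prompt layout and the demonstration riddles below are invented for illustration and are not taken from the BRAINTEASER data or the thesis.

def build_riscore_prompt(demos, target_question):
    """demos: list of (original, reconstruction) pairs; each element is a
    dict with "question" and "answer". The reconstruction preserves the
    underlying reasoning while changing the surface context."""
    blocks = []
    for original, reconstruction in demos:
        blocks.append(f"Q: {original['question']}\nA: {original['answer']}")
        blocks.append(f"Q: {reconstruction['question']}\nA: {reconstruction['answer']}")
    blocks.append(f"Q: {target_question}\nA:")
    return "\n\n".join(blocks)

demos = [(
    {"question": "What can you catch but not throw?", "answer": "A cold."},
    # Same idiomatic-verb wordplay, reconstructed in a different context:
    {"question": "What can you hold without ever touching it?", "answer": "A conversation."},
)]
print(build_riscore_prompt(demos, "What has keys but cannot open locks?"))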
dc.language | en | en_US
dc.subject | Lateral thinking | en_US
dc.subject | Vertical thinking | en_US
dc.subject | Large Language Models (LLMs) | en_US
dc.subject | Fine-tuning | en_US
dc.subject | Low-Rank Adaptation (LoRA) | en_US
dc.subject | Quantized Low-Rank Adaptation (QLoRA) | en_US
dc.subject | Word embeddings | en_US
dc.subject | Few-shot learning | en_US
dc.subject | Zero-shot learning | en_US
dc.subject | Context reconstruction | en_US
dc.subject | Semantic reconstruction | en_US
dc.subject | Semantic similarity | en_US
dc.subject | In-context learning | en_US
dc.subject | Reasoning capabilities | en_US
dc.subject | RISCORE method (RIddle Solving with COntext REconstruction) | en_US
dc.title | Exploring Transformer-Based Reasoning through Efficient Tuning and Advanced Prompting Strategies | en_US
dc.description.pages | 138 | en_US
dc.contributor.supervisor | Στάμου Γιώργος | en_US
dc.department | Τομέας Τεχνολογίας Πληροφορικής και Υπολογιστών | en_US
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File | Description | Size | Format
Giannis_Panagiotopoulos_Diploma_Thesis.pdf | Thesis Document | 2.27 MB | Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.