Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19380
Full metadata record
DC Field | Value | Language
dc.contributor.author | Παναγιωτόπουλος, Ιωάννης | -
dc.date.accessioned | 2024-11-05T10:11:02Z | -
dc.date.available | 2024-11-05T10:11:02Z | -
dc.date.issued | 2024-10-24 | -
dc.identifier.uri | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19380 | -
dc.description.abstract (en_US):

This thesis investigates methods to improve the reasoning capabilities of large language models (LLMs) on lateral-thinking problems, through two distinct proposals: the first based on fine-tuning and training, the second on advanced prompting techniques that require no additional training.

The first proposal is tied to the SemEval-2024 Task 9 competition, "BRAINTEASER: A Novel Task Defying Common Sense," and focuses on fine-tuning transformer-based models on the BRAINTEASER dataset. Models are trained to solve lateral-thinking challenges such as sentence and word puzzles, with lightweight tuning of both smaller encoder models and LLMs aiming to surpass baseline performance. A key element of this proposal is the transformation of multiple-choice problems into binary classification tasks, which allows the models to explore diverse reasoning paths. The analysis examines the influence of model size and hyperparameters, along with the reasoning cues that lead to model failures. The goal is to improve model accuracy and reasoning skills while providing insight into how LLMs handle lateral-thinking problems under targeted fine-tuning.

In contrast, the second proposal avoids model training altogether and instead improves performance through a novel prompting technique called RISCORE (RIddle Solving with COntext REconstruction). Inspired by the structure of the BRAINTEASER dataset, RISCORE augments few-shot learning with contextually reconstructed riddle examples designed to improve the model's in-context problem-solving abilities. It preserves the original reasoning process while altering the context, offering a clearer reasoning trajectory for the model to follow. Compared against other popular prompting methods, RISCORE proved effective at improving both lateral- and vertical-thinking performance without any additional training, highlighting the potential of strategic prompting for complex reasoning tasks that challenge common sense.

Together, the two proposals showcase distinct methodologies, one centered on model training and fine-tuning and the other on innovative prompting, and both contribute valuable insights into how LLMs can be improved for lateral-thinking challenges.
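The multiple-choice-to-binary-classification transformation described in the abstract could look roughly like the following minimal Python sketch. The field names ("question", "choices", "label"), the [SEP] pairing convention, and the example riddle are assumptions for illustration, not the thesis's actual preprocessing code.

def to_binary_pairs(example):
    """Expand one multiple-choice riddle into (question, choice) pairs,
    labeled 1 for the correct answer and 0 for each distractor."""
    pairs = []
    for i, choice in enumerate(example["choices"]):
        pairs.append({
            # Encoder-style pair input: question and candidate joined by [SEP].
            "text": example["question"] + " [SEP] " + choice,
            "label": 1 if i == example["label"] else 0,
        })
    return pairs

# Invented riddle, for demonstration only:
riddle = {
    "question": "A man shaves several times a day, yet he still has a beard. How?",
    "choices": ["He is a barber.", "He shaves badly.", "He owns no mirror.", "None of the above."],
    "label": 0,
}
for pair in to_binary_pairs(riddle):
    print(pair["label"], pair["text"])

At inference time, the candidate whose positive-class score is highest would be selected, recovering a multiple-choice prediction from the binary classifier.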
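Likewise, the RISCORE idea of pairing each few-shot demonstration with a context-reconstructed variant that keeps the same reasoning step could be sketched as a prompt builder. The Q/A prompt layout and the demonstration riddles below are invented for illustration and are not taken from the BRAINTEASER data or the thesis.

def build_riscore_prompt(demos, target_question):
    """demos: list of (original, reconstruction) pairs; each element is a
    dict with "question" and "answer". The reconstruction preserves the
    underlying reasoning while changing the surface context."""
    blocks = []
    for original, reconstruction in demos:
        blocks.append(f"Q: {original['question']}\nA: {original['answer']}")
        blocks.append(f"Q: {reconstruction['question']}\nA: {reconstruction['answer']}")
    blocks.append(f"Q: {target_question}\nA:")
    return "\n\n".join(blocks)

demos = [(
    {"question": "What can you catch but not throw?", "answer": "A cold."},
    # Same idiomatic-verb wordplay, reconstructed in a different context:
    {"question": "What can you hold without ever touching it?", "answer": "A conversation."},
)]
print(build_riscore_prompt(demos, "What has keys but cannot open locks?"))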
dc.language | en | en_US
dc.subject | Lateral thinking | en_US
dc.subject | Vertical thinking | en_US
dc.subject | Large Language Models (LLMs) | en_US
dc.subject | Fine-tuning | en_US
dc.subject | Low-Rank Adaptation (LoRA) | en_US
dc.subject | Quantized Low-Rank Adaptation (QLoRA) | en_US
dc.subject | Word embeddings | en_US
dc.subject | Few-shot learning | en_US
dc.subject | Zero-shot learning | en_US
dc.subject | Context reconstruction | en_US
dc.subject | Semantic reconstruction | en_US
dc.subject | Semantic similarity | en_US
dc.subject | In-context learning | en_US
dc.subject | Reasoning capabilities | en_US
dc.subject | RISCORE method (RIddle Solving with COntext REconstruction) | en_US
dc.title | Exploring Transformer-Based Reasoning through Efficient Tuning and Advanced Prompting Strategies | en_US
dc.description.pages | 138 | en_US
dc.contributor.supervisor | Στάμου Γιώργος | en_US
dc.department | Τομέας Τεχνολογίας Πληροφορικής και Υπολογιστών | en_US
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File | Description | Size | Format
Giannis_Panagiotopoulos_Diploma_Thesis.pdf | Thesis Document | 2.27 MB | Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.