Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19380
Title: | Exploring Transformer-Based Reasoning through Efficient Tuning and Advanced Prompting Strategies |
Authors: | Παναγιωτόπουλος, Ιωάννης; Στάμου, Γιώργος |
Keywords: | Lateral thinking; Vertical thinking; Large Language Models (LLMs); Fine-tuning; Low-Rank Adaptation (LoRA); Quantized Low-Rank Adaptation (QLoRA); Word embeddings; Few-shot learning; Zero-shot learning; Context reconstruction; Semantic reconstruction; Semantic similarity; In-context learning; Reasoning capabilities; RISCORE method (RIddle Solving with COntext REconstruction) |
Issue Date: | 24-Oct-2024 |
Abstract: | This thesis investigates methods to improve the reasoning capabilities of large language models (LLMs) by leveraging lateral thinking, focusing on two distinct proposals that address different approaches to enhancing model performance: the first through fine-tuning and training, and the second through advanced prompting techniques without additional training. The first proposal is tied to the SemEval-2024 Task 9 competition, "BRAINTEASER: A Novel Task Defying Common Sense," where the focus is on fine-tuning transformer-based models using the BRAINTEASER dataset. This approach involves training models to solve lateral thinking challenges, such as sentence and word puzzles. By employing lightweight tuning on smaller encoder models and LLMs, the aim was to surpass baseline performance. A key element of this proposal was transforming multiple-choice problems into binary classification tasks, allowing the models to explore diverse reasoning paths. The analysis highlighted the influence of model size and hyperparameters, along with an investigation into the reasoning cues that lead to model failures. The goal was to enhance model accuracy and reasoning skills, while providing insights into how LLMs handle lateral thinking problems through targeted fine-tuning. In contrast, the second proposal takes a different approach by avoiding model training altogether and instead focusing on enhancing performance through a novel prompting technique called RISCORE (RIddle Solving with COntext REconstruction). This method, inspired by the structure of the BRAINTEASER dataset, augments few-shot learning by providing contextually reconstructed examples of riddles, designed to improve the model's in-context problem-solving abilities. RISCORE operates by preserving the original reasoning process while altering the context to offer a clearer reasoning trajectory for the model to follow.
By comparing RISCORE to other popular prompting methods, the results showed its effectiveness in improving both lateral and vertical thinking tasks without the need for additional training. This approach highlights the potential of strategic prompting in enhancing LLM performance, particularly in complex reasoning tasks that challenge common sense. These two proposals showcase distinct methodologies—one focused on model training and fine-tuning, and the other on innovative prompting techniques—both contributing valuable insights into how LLMs can be improved for lateral thinking challenges. |
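The multiple-choice-to-binary-classification reframing described in the abstract can be sketched as follows. This is a minimal illustration only; the function name, the prompt layout, and the input fields are assumptions for illustration, not the thesis's actual preprocessing code.

```python
def to_binary_examples(question: str, choices: list[str], answer_idx: int) -> list[tuple[str, int]]:
    """Reframe one multiple-choice item as several binary examples.

    Each candidate answer becomes its own (text, label) pair, labelled 1 for
    the correct choice and 0 otherwise, so a model can score each candidate
    reasoning path independently instead of choosing among options jointly.
    (Hypothetical helper; not from the thesis.)
    """
    return [
        (f"Question: {question}\nCandidate answer: {choice}", int(i == answer_idx))
        for i, choice in enumerate(choices)
    ]

# Example usage with an invented brainteaser-style item:
pairs = to_binary_examples(
    "What can you hold in your right hand but never in your left hand?",
    ["Your left elbow", "A pencil", "A coin"],
    0,
)
```

Each returned pair can then be fed to a binary classifier head during fine-tuning, with exactly one positive example per original multiple-choice item.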
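The RISCORE prompting idea — pairing each few-shot exemplar with a context-reconstructed variant that keeps the same underlying reasoning while changing the surface context — can be sketched as a simple prompt builder. The exemplar format, field names, and riddles below are assumptions for illustration; the thesis's actual prompt templates are not reproduced here.

```python
def build_riscore_prompt(exemplars: list[tuple[dict, dict]], target_riddle: str) -> str:
    """Assemble a few-shot prompt in which every original exemplar riddle is
    immediately followed by a context-reconstructed variant sharing the same
    reasoning pattern, then append the unsolved target riddle.
    (Hypothetical sketch; not the thesis's implementation.)
    """
    parts = []
    for original, reconstructed in exemplars:
        parts.append(f"Riddle: {original['q']}\nAnswer: {original['a']}")
        parts.append(f"Riddle: {reconstructed['q']}\nAnswer: {reconstructed['a']}")
    parts.append(f"Riddle: {target_riddle}\nAnswer:")
    return "\n\n".join(parts)

# Example usage with one invented exemplar pair:
exemplars = [
    (
        {"q": "A man shaves several times a day, yet still has a beard. Who is he?",
         "a": "A barber."},
        {"q": "A woman cuts hair all day, yet her own hair keeps growing long. Who is she?",
         "a": "A hairdresser."},
    )
]
prompt = build_riscore_prompt(exemplars, "What has keys but cannot open locks?")
```

The reconstructed variant exposes the same lateral-thinking move in a fresh context, giving the model a clearer reasoning trajectory to imitate before it sees the target riddle.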
URI: | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19380 |
Appears in Collections: | Διπλωματικές Εργασίες - Theses |
Files in This Item:
File | Description | Size | Format
---|---|---|---
Giannis_Panagiotopoulos_Diploma_Thesis.pdf | Thesis Document | 2.27 MB | Adobe PDF
All items on this site are protected by copyright.