Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19731
Full metadata record
DC Field                      Value                                                                Language
dc.contributor.author         Στριγγλή, Ελένη                                                      -
dc.date.accessioned           2025-07-16T07:34:44Z                                                 -
dc.date.available             2025-07-16T07:34:44Z                                                 -
dc.date.issued                2025-07-02                                                           -
dc.identifier.uri             http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19731   -
dc.description.abstract       As Large Language Models (LLMs) continue to grow in scale, they exhibit increasingly sophisticated behaviors, including abilities that resemble logical reasoning. However, the authenticity of such advances remains a subject of debate, with many arguing that they are largely a byproduct of memorization and advanced statistical pattern recognition rather than genuine understanding. To shed light on these limitations, researchers have developed experimental conditions that challenge LLMs to override entrenched associations, highlighting gaps in reasoning and adaptability compared with human cognition. Inverse scaling tasks are designed to uncover such weaknesses by revealing a paradoxical decline in performance as model scale increases, thereby exposing critical blind spots in scale-driven improvements. In this thesis, we explore the redefinition task, which challenges LLMs to adopt nonstandard definitions for familiar scientific constants and units of measurement and then respond based on these altered values. We evaluate state-of-the-art models from multiple LLM families and demonstrate that larger LLMs not only perform worse at following redefinitions, anchoring more strongly to their memorized knowledge, but also exhibit greater confidence in generating false responses rather than choosing to abstain. In addition, although factors such as response formatting and prompting techniques can influence these behaviors, no strategy fully counteracts the tendency of larger models to revert to their pretraining priors.   en_US
dc.language                   en                                                                   en_US
dc.subject                    Large Language Models                                                en_US
dc.subject                    prompt engineering                                                   en_US
dc.subject                    inverse scaling                                                      en_US
dc.subject                    reasoning capabilities                                               en_US
dc.subject                    adaptability                                                         en_US
dc.subject                    interpretability                                                     en_US
dc.title                      Pitfalls of Scale: Investigating the Inverse Task of Redefinition in Large Language Models   en_US
dc.description.pages          107                                                                  en_US
dc.contributor.supervisor     Βουλόδημος Αθανάσιος                                                 en_US
dc.department                 Τομέας Τεχνολογίας Πληροφορικής και Υπολογιστών                      en_US
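
As a rough illustration of the redefinition task described in the abstract, a probe might redefine a familiar constant and check whether the model answers under the new value or reverts to its memorized one. This is a minimal sketch in the style of inverse scaling redefinition prompts; the exact prompt wording, scoring rules, and helper names here are hypothetical and not taken from the thesis:

```python
# Minimal sketch of a redefinition-task probe (hypothetical prompt and
# scoring; the thesis's actual prompt formats and models may differ).
REDEFINITION_PROMPT = (
    "Redefine pi as 462.\n"
    "Question: What is the first digit of pi?\n"
    "Answer:"
)

# Under the redefinition the correct first digit is "4" (from 462); a model
# anchored to its memorized knowledge answers "3" (from 3.14159...) instead.
EXPECTED_REDEFINED = "4"
MEMORIZED_ANCHOR = "3"

def score_response(response: str) -> str:
    """Classify an answer: followed the redefinition, reverted to the
    memorized value, or neither (e.g., the model abstained)."""
    answer = response.strip()[:1]
    if answer == EXPECTED_REDEFINED:
        return "followed redefinition"
    if answer == MEMORIZED_ANCHOR:
        return "reverted to pretraining prior"
    return "other / abstained"

# Responses would come from whichever LLM API is being evaluated; here we
# just demonstrate the scoring on sample answer strings.
if __name__ == "__main__":
    for sample in ["4", "3", "I cannot answer under that redefinition."]:
        print(repr(sample), "->", score_response(sample))
```

Under a scheme like this, the inverse scaling effect the thesis reports would show up as larger models landing more often, and more confidently, in the "reverted to pretraining prior" bucket rather than abstaining.
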
Appears in Collections:Διπλωματικές Εργασίες - Theses

Files in This Item:
File                               Description      Size      Format
Diploma_Thesis_Elena_Stringli.pdf                   8.85 MB   Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.