Automatic Generation of Fashion Images using Prompting in Generative Machine Learning Models

Αργυρού, Γεωργία

National Technical University of Athens

School of Electrical and Computer Engineering

Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19176

Full metadata record

DC Field	Value	Language
dc.contributor.author	Αργυρού, Γεωργία	-
dc.date.accessioned	2024-07-17T10:31:05Z	-
dc.date.available	2024-07-17T10:31:05Z	-
dc.date.issued	2024-07-15	-
dc.identifier.uri	http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19176	-
dc.description.abstract	In the contemporary landscape of fashion, the convergence of technology and creativity has catalyzed a transformative shift, ushering in new opportunities and redefining industry standards. At the forefront of this evolution lies the integration of computer vision and artificial intelligence, revolutionizing fashion through innovation, efficiency, and refined aesthetic precision. This thesis investigates methodologies for generating tailored fashion descriptions using two distinct Large Language Models (LLMs) and a Stable Diffusion model for image creation. Emphasizing efficiency and adaptability in AI-driven fashion creativity, we depart from traditional approaches and focus on prompting techniques, such as zero-shot, one-shot, and few-shot learning, as well as Chain-of-Thought. Central to our methodology is Retrieval-Augmented Generation (RAG), which enriches models with insights from fashion magazines, blogs, and other sources to ensure accurate and contemporary fashion representations. Evaluation combines quantitative metrics such as CLIPScore with qualitative human judgment, highlighting strengths in creativity, coherence, and aesthetic appeal across diverse styles. Comparative analysis demonstrates the efficacy of techniques such as few-shot learning and RAG with PDFs in producing descriptions and images tailored to specific fashion variables. Qualitative assessment reveals advancements in realism and visual diversity, supported by the Chain-of-Thought methodology.	en_US
dc.language	en	en_US
dc.subject	Large Language Models	en_US
dc.subject	Prompting	en_US
dc.subject	Stable Diffusion	en_US
dc.subject	Knowledge Injection	en_US
dc.title	Automatic Generation of Fashion Images using Prompting in Generative Machine Learning Models	en_US
dc.description.pages	117	en_US
dc.contributor.supervisor	Στάμου Γιώργος	en_US
dc.department	Division of Computer Science	en_US
Appears in Collections:	Diploma Theses

Files in This Item:

File	Description	Size	Format
Diploma_Thesis_Georgia_Argyrou.pdf		8.61 MB	Adobe PDF
