Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18870
Title: Morphological Diffusion for Handwritten Text Generation
Authors: Μπακάλης, Δημήτριος
Μαραγκός, Πέτρος
Keywords: Deep Learning
Generative AI
Morphological Mathematics
Diffusion Models
Morphological Diffusion
Handwritten Text Generation
Latent Diffusion
Cold Diffusion
Issue Date: 20-Oct-2023
Abstract: Generative Artificial Intelligence, often referred to simply as "Generative AI", stands at the forefront of modern technology, pushing the boundaries of what machines can create. The field is a groundbreaking fusion of computer science, machine learning, and neural networks that enables computers to generate original content mimicking human creativity. Its emergence has sparked a revolution in domains ranging from content creation and entertainment to healthcare and finance, and it has given rise to powerful applications such as natural language generation, image style transfer, and autonomous creative agents. In this thesis, we use Diffusion Models to address the challenge of Handwritten Text Generation (HTG), conditioning generation on textual content and writing style. Drawing inspiration from recent breakthroughs in generalized diffusions, we introduce a novel non-linear diffusion process rooted in a fundamental operation of mathematical morphology, namely dilation. We first present baseline experiments on the MNIST and CIFAR-10 datasets, which serve as a proof of concept for our approach, and compare our methodology with recent advances in generalized diffusions. Furthermore, we propose a two-stage approach, complemented by a Generative Adversarial Network (GAN), to enable conditional generation on the MNIST dataset. This approach outperforms classic diffusion frameworks when operating within a constrained number of timesteps. Subsequently, we turn to the task of Handwritten Text Generation. We enhance the existing state-of-the-art model with more efficient sampling algorithms from the literature.
Furthermore, we streamline the model’s efficiency by introducing the concept of morphological diffusion. Specifically, we deviate from the conventional Gaussian framework and modify the degradation function within the latent diffusion process to embrace morphological diffusion. This transformation yields competitive results, rivalling the state-of-the-art, all while significantly reducing the computational demands imposed during both training and sampling procedures.
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18870
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File                 Description  Size      Format
thesis_dbakalis.pdf               32.85 MB  Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.