Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18259
Title: Photo-realistic neural rendering for emotion-related semantic manipulation of unconstrained facial videos
Authors: Paraperas Papantoniou, Foivos
Maragos, Petros
Keywords: emotion manipulation
facial expressions
deepfakes
GANs
3DMMs
neural rendering
deep neural networks
video editing
VFX
Issue Date: 28-Feb-2022
Abstract: Recent advances in generative Deep Learning have made it possible to synthesize and manipulate images and videos with unprecedented realism, giving rise to a plethora of creative applications at the intersection of Computer Vision and Computer Graphics. In particular, a class of generative models known as Generative Adversarial Networks (GANs) has proven especially successful at generating images of human faces, leading to a new era of synthetic visual facial content known as “deepfakes”. For instance, deepfake techniques such as face swapping or attribute manipulation (e.g. hair color, gender) have become quite popular, since they rely solely on neural networks and require no expertise in digital effects. Yet, when it comes to manipulating the dynamic facial expressions encountered in videos of talking faces, explicit prior knowledge of the face’s structure is usually needed. To this end, challenging applications such as face reenactment typically employ 3D face representations, obtained by fitting a statistical morphable model (3D Morphable Model, 3DMM) to a given image or video in a way that disentangles the expressions of the face from its other modes of variation. Still, these methods are often limited to making a target actor directly mimic the expressions of a source actor, without any further semantic control over these expressions. Motivated by this, our goal in this thesis is simple, yet challenging: the development of a novel deepfake system for altering the dynamic emotion conveyed by an actor in a video in an easily interpretable way, i.e. even using the semantic labels of the desired emotions as the sole input, while at the same time preserving the original words of the talking person.

Our main contributions can be summarized as follows:
• We perform an in-depth review of the literature related to photo-realistic emotion manipulation in face images, drawing conclusions about the limitations and challenges of the current state of the art (SOTA). We also provide an overview of the latest developments in the fields of 3D face modelling and GAN-based image synthesis, some of which are carefully integrated in our system.
• We propose the first deep learning method, to our knowledge, for “directing” the emotional state of actors in unconstrained (“in-the-wild”) videos by translating their facial expressions to multiple unseen emotions or styles without altering the lip movements. We call this method Neural Emotion Director.
• We introduce a GAN-based network, called 3D-based Emotion Manipulator, that receives a sequence of facial expression parameters across consecutive frames and translates them to a given target emotion or a specific reference style. We then design a video-based neural face renderer for decoding the parametric representation of the altered expressions back to photo-realistic frames. We modify only the face area, while the background remains unchanged.
• We assess our method through extensive qualitative and quantitative experiments, as well as user and ablation studies, and compare it with recent state-of-the-art methods, demonstrating its advantages over them. We achieve promising results in very challenging scenarios, such as movie scenes with moving background objects.

Our work [93] was accepted to the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), with the authors being Foivos Paraperas Papantoniou, Panagiotis P. Filntisis, Petros Maragos and Anastasios Roussos. Our demo YouTube video and source code can be found on our project website: https://foivospar.github.io/NED/.
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18259
Appears in Collections: Diploma Theses

Files in This Item:
File                              Description  Size      Format
FoivosPP_NTUA_thesis_final.pdf                 56.25 MB  Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.