Please use this identifier to cite or link to this item:
|Title:||Vulnerabilities and robustness of Convolutional Neural Networks against Adversarial Attacks in the spatial and spectral domain|
|Keywords:||ισχυρά νευρωνικά δίκτυα, ανταγωνιστική μηχανική μάθηση, συνελικτικά νευρωνικά δίκτυα, ταξινόμηση εικόνων, μετασχηματισμός Fourier|
robustness, adversarial machine learning, convolutional neural networks, image classification, Fourier transform
|Abstract:||The constant rise in the capabilities of Artificial Intelligence has led to its application in numerous domains even when safety is a critical component. In the area of computer vision, Convolutional Neural Networks (CNNs) achieve impressive results in image classification, segmentation and object detection. It has been proven though that CNNs are easily manipulated and fooled by very small and carefully crafted corruptions, imperceptible to the human eye. These corruptions known as adversarial attacks have raised the question of the robustness of modern CNNs to images deviating from the training data distribution and pose an important threat to their reliability. A variety of attack as well as defence and detection methods have been proposed but to this date models are still vulnerable. The purpose of this thesis is to examine the success rate of common adver- sarial attack algorithms as well as the defence method of adversarial training in image classification tasks. Specifically, we start by using common CNN architectures trained on the CIFAR-10 and 350 Bird Species datasets as victim models. We implement two attacks, namely the white-box C&W and PGD methods and manage to fool our models into misclassifying perturbed images with a success rate of up to 100%. In order to then investigate ways to defend our models we use adversarial training with the TRADES algorithm and significantly drop attack success rates, but also show the existing trade-off between accuracy and robustness. Lastly, since current detection methods propose a strong distinction between the spectral representation of adversarial examples and benign images, we explore the characteristics of adversarial attacks as well as training methods in the Fourier domain. Through this analysis we observe that perturbations are influenced by a number of factors related to the dataset, training algorithm and model architecture and aspire to bring forward the Fourier domain properties that differentiate robust from non-robust models and their vulnerabilities.|
|Appears in Collections:||Διπλωματικές Εργασίες - Theses|
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.