Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19639
Title: | Enhancing Neural Network Compression through Adaptive Pruning Strategies |
Authors: | Zachou, Aliki; Μαραγκός Πέτρος |
Keywords: | DNN-compression; sparsification; weight-pruning; unstructured-pruning; sparse-training; magnitude-based-pruning |
Issue Date: | 18-Jun-2025 |
Abstract: | In recent years, deep neural networks have achieved state-of-the-art performance across increasingly complex machine learning tasks. Achieving such performance, however, requires models that are large and costly to deploy. Model compression methods, particularly pruning, have emerged as effective strategies to address these concerns by eliminating redundant parameters and reducing computational overhead. Yet extreme levels of pruning often lead to instability and performance degradation. This thesis addresses these limitations by introducing adaptive pruning strategies that improve performance while maintaining high compression targets. The proposed strategies are evaluated on the Feather pruning module, a recent method that uses the Straight-Through Estimator (STE) to enable gradient-based dense-to-sparse training. Although Feather and similar modules yield state-of-the-art results, they rely on static hyperparameters for gradient scaling and sparsity scheduling, which can limit performance. This work proposes two contributions targeting those static parameters. The first is a dynamic scaling method that replaces Feather's fixed gradient scaling hyperparameter with a function of the sparsity achieved during training, allowing larger gradient flow in early iterations to prevent premature pruning and more conservative updates as the sparsity ratio rises. The second introduces a family of adaptive pruning scheduler functions that adjust the pruning rate according to the stability of the pruning mask; pruning proceeds more cautiously when masks become unstable, reducing the likelihood of large accuracy drops. Evaluations on benchmark architectures such as ResNet20, DenseNet40-24, and MobileNet V1, trained on CIFAR-100, show that both contributions enhance performance, especially at extreme sparsity ratios. In addition to these two contributions, the thesis includes an in-depth study of the significance of pruned weights, comparing training processes that retain pruned connections via the STE against those that permanently remove them. Specifically, the study aims to determine what percentage of pruned weights holds significance and what percentage can be permanently dropped from the optimization process, in order to introduce sparse gradients into the sparsification process without loss of accuracy. |
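The abstract does not give the exact formulas used in the thesis. As an illustration only, the following minimal PyTorch-style sketch shows the two ideas it describes: a gradient-scaling factor driven by achieved sparsity, and a pruning rate moderated by mask stability. The function names, the linear interpolation of the scaling factor, and the flip-rate stability measure are assumptions for this sketch, not the thesis's actual formulation.

```python
# Illustrative sketch only: the real scaling function, scheduler family, and
# stability metric are not specified in the abstract; the interpolation and
# flip-rate measure below are assumptions.
import torch

def dynamic_grad_scale(current_sparsity, target_sparsity,
                       scale_early=1.0, scale_late=0.1):
    """Gradient scaling factor that shrinks as the achieved sparsity
    approaches the target: large early on (freer gradient flow, fewer
    premature prunes), conservative later. Linear interpolation is a
    placeholder for the function used in the thesis."""
    progress = min(current_sparsity / max(target_sparsity, 1e-8), 1.0)
    return scale_early + (scale_late - scale_early) * progress

def mask_stability(prev_mask, new_mask):
    """Fraction of weights whose pruned/kept status did NOT change between
    two consecutive pruning steps (1.0 = perfectly stable mask)."""
    flips = (prev_mask != new_mask).float().mean().item()
    return 1.0 - flips

def adaptive_prune_rate(base_rate, stability, caution=2.0):
    """Scale the scheduled pruning rate down when the mask is unstable,
    so pruning proceeds more cautiously (placeholder functional form)."""
    return base_rate * stability ** caution

# Toy usage on a single weight tensor
weights = torch.randn(1000)
prev_mask = weights.abs() > weights.abs().quantile(0.5)   # keep top 50%
new_mask = weights.abs() > weights.abs().quantile(0.6)    # keep top 40%

achieved = 1.0 - new_mask.float().mean().item()           # current sparsity
scale = dynamic_grad_scale(achieved, target_sparsity=0.95)
rate = adaptive_prune_rate(base_rate=0.05,
                           stability=mask_stability(prev_mask, new_mask))
print(f"grad scale {scale:.3f}, next pruning rate {rate:.4f}")
```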
URI: | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19639 |
Appears in Collections: | Διπλωματικές Εργασίες - Theses |
Files in This Item:
File | Description | Size | Format
---|---|---|---
thesis_aliki_zachou.pdf | | 7.64 MB | Adobe PDF
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.