Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19639
Title: | Enhancing Neural Network Compression through Adaptive Pruning Strategies |
Authors: | Zachou, Aliki; Μαραγκός Πέτρος |
Keywords: | DNN-compression; sparsification; weight-pruning; unstructured-pruning; sparse-training; magnitude-based-pruning |
Issue Date: | 18-Jun-2025 |
Abstract: | In recent years, deep neural networks have achieved state-of-the-art performance across increasingly complex machine learning tasks. Achieving such performance, however, requires models that are large and costly to deploy. Model compression methods, particularly pruning, have emerged as effective strategies to address these concerns by eliminating redundant parameters and reducing computational overhead. Yet extreme levels of pruning often lead to instability and performance degradation. This thesis addresses these limitations by introducing adaptive pruning strategies that improve performance while maintaining high compression targets. The proposed strategies are evaluated on the Feather pruning module, a recent method that uses the Straight-Through Estimator (STE) to enable gradient-based dense-to-sparse training. Although Feather and similar modules yield state-of-the-art results, they rely on static hyperparameters for gradient scaling and sparsity scheduling, which can limit performance. This work proposes two contributions targeting those static parameters. The first is a dynamic scaling method that replaces Feather's fixed gradient scaling hyperparameter with a function of the sparsity achieved during training, allowing larger gradient flow in early iterations to prevent premature pruning and more conservative updates as the sparsity ratio rises. The second introduces a family of adaptive pruning scheduler functions that adjust the pruning rate according to the stability of the pruning mask; pruning proceeds more cautiously when masks become unstable, reducing the likelihood of large accuracy drops. Evaluations on benchmark architectures such as ResNet20, DenseNet40-24, and MobileNet V1, trained on CIFAR-100, show that both contributions enhance performance, especially at extreme sparsity ratios. In addition to these two contributions, the thesis includes an in-depth study of the significance of pruned weights, comparing training processes that retain pruned connections via the STE against those that permanently remove them. Specifically, the study aims to determine what percentage of pruned weights holds significance and what percentage can be permanently dropped from the optimization process, in order to introduce sparse gradients into the sparsification process without loss of accuracy. |
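The abstract does not give the exact formulas used in the thesis. As an illustration only, the following minimal PyTorch-style sketch shows the two ideas it describes: a gradient-scaling factor driven by achieved sparsity, and a pruning rate moderated by mask stability. The function names, the linear interpolation of the scaling factor, and the flip-rate stability measure are assumptions for this sketch, not the thesis's actual formulation.

```python
# Illustrative sketch only: the real scaling function, scheduler family, and
# stability metric are not specified in the abstract; the interpolation and
# flip-rate measure below are assumptions.
import torch

def dynamic_grad_scale(current_sparsity, target_sparsity,
                       scale_early=1.0, scale_late=0.1):
    """Gradient scaling factor that shrinks as the achieved sparsity
    approaches the target: large early on (freer gradient flow, fewer
    premature prunes), conservative later. Linear interpolation is a
    placeholder for the function used in the thesis."""
    progress = min(current_sparsity / max(target_sparsity, 1e-8), 1.0)
    return scale_early + (scale_late - scale_early) * progress

def mask_stability(prev_mask, new_mask):
    """Fraction of weights whose pruned/kept status did NOT change between
    two consecutive pruning steps (1.0 = perfectly stable mask)."""
    flips = (prev_mask != new_mask).float().mean().item()
    return 1.0 - flips

def adaptive_prune_rate(base_rate, stability, caution=2.0):
    """Scale the scheduled pruning rate down when the mask is unstable,
    so pruning proceeds more cautiously (placeholder functional form)."""
    return base_rate * stability ** caution

# Toy usage on a single weight tensor
weights = torch.randn(1000)
prev_mask = weights.abs() > weights.abs().quantile(0.5)   # keep top 50%
new_mask = weights.abs() > weights.abs().quantile(0.6)    # keep top 40%

achieved = 1.0 - new_mask.float().mean().item()           # current sparsity
scale = dynamic_grad_scale(achieved, target_sparsity=0.95)
rate = adaptive_prune_rate(base_rate=0.05,
                           stability=mask_stability(prev_mask, new_mask))
print(f"grad scale {scale:.3f}, next pruning rate {rate:.4f}")
```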
URI: | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19639 |
Appears in Collections: | Διπλωματικές Εργασίες - Theses |
Files in This Item:
File | Description | Size | Format
---|---|---|---
thesis_aliki_zachou.pdf | | 7.64 MB | Adobe PDF
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.