TY - GEN
T1 - Perturbed Gradients Updating within Unit Space for Deep Learning
AU - Tseng, Ching Hsun
AU - Liu, Hsueh Cheng
AU - Lee, Shin Jye
AU - Zeng, Xiaojun
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - In deep learning, optimization plays a vital role. Focusing on image classification, this work investigates the pros and cons of widely used optimizers and proposes a new one, the Perturbed Unit Gradient Descent (PUGD) algorithm, which extends the normalized gradient operation to tensors within a perturbation step so that updates take place in unit space. Through a set of experiments and analyses, we show that PUGD's updates are locally bounded, meaning that each update is controlled from one step to the next. In addition, PUGD can push models toward a flat minimum, where the error remains approximately constant, not only because gradient normalization naturally avoids stationary points but also because PUGD scans sharpness within a unit ball. In a series of rigorous experiments, PUGD helps models achieve state-of-the-art Top-1 accuracy on Tiny ImageNet and competitive performance on CIFAR-{10, 100}. Our code is open source at https://github.com/hanktseng131415go/PUGD.
KW - Deep Learning
KW - Generalization
KW - Image classification
KW - Optimization
KW - Smooth loss landscape
UR - http://www.scopus.com/inward/record.url?scp=85140709641&partnerID=8YFLogxK
U2 - 10.1109/IJCNN55064.2022.9892245
DO - 10.1109/IJCNN55064.2022.9892245
M3 - Conference contribution
AN - SCOPUS:85140709641
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2022 International Joint Conference on Neural Networks, IJCNN 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 International Joint Conference on Neural Networks, IJCNN 2022
Y2 - 18 July 2022 through 23 July 2022
ER -