Stochastic gradient descent with hyperbolic-tangent decay on classification

Bo Yang Hsueh, Wei Li, I-Chen Wu

Research output: Conference contribution › Peer-reviewed

19 Citations (Scopus)

Abstract

The learning rate schedule has been a critical issue in training deep neural networks. Several schedulers and methods have been proposed, including the step decay scheduler, adaptive methods, the cosine scheduler, and cyclical schedulers. This paper proposes a new scheduling method, named hyperbolic-tangent decay (HTD). We run experiments on several benchmarks: ResNet, Wide ResNet, and DenseNet on the CIFAR-10 and CIFAR-100 datasets; LSTM on the PAMAP2 dataset; and ResNet on the ImageNet and Fashion-MNIST datasets. In our experiments, HTD outperforms step decay and the cosine scheduler in nearly all cases, while requiring fewer hyperparameters than step decay and being more flexible than the cosine scheduler. Code is available at https://github.com/BIGBALLON/HTD.
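As a rough sketch of the idea, the snippet below implements a tanh-shaped decay of the form eta_t = (eta_0 / 2) * (1 - tanh(L + (U - L) * t / T)), where L and U bound the argument of tanh over the course of training. The default bounds L = -6 and U = 3 and the helper name htd_lr are illustrative assumptions, not details stated on this page; see the linked repository for the authors' implementation.

import math

def htd_lr(base_lr, t, total_steps, lower=-6.0, upper=3.0):
    """Hyperbolic-tangent decay: smoothly anneal the learning rate.

    lower/upper control how far into tanh's saturated regions the
    schedule starts and ends; the defaults here are assumptions.
    """
    progress = t / total_steps  # training progress in [0, 1]
    return base_lr / 2.0 * (1.0 - math.tanh(lower + (upper - lower) * progress))

# Example: learning rate at a few points of a 200-epoch run with base LR 0.1.
for epoch in (0, 50, 100, 150, 200):
    print(epoch, round(htd_lr(0.1, epoch, 200), 6))

With bounds like these the rate stays close to base_lr early in training (where tanh is saturated near -1) and then falls smoothly toward zero; moving the bounds reshapes the curve, which is the flexibility the abstract contrasts with the fixed shape of the cosine scheduler.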

Original language: English
Title of host publication: Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 435-442
Number of pages: 8
ISBN (electronic): 9781728119755
DOIs
Publication status: Published - 4 Mar 2019
Event: 19th IEEE Winter Conference on Applications of Computer Vision, WACV 2019 - Waikoloa Village, United States
Duration: 7 Jan 2019 - 11 Jan 2019

Publication series

Name: Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019

Conference

Conference: 19th IEEE Winter Conference on Applications of Computer Vision, WACV 2019
Country/Territory: United States
City: Waikoloa Village
Period: 7/01/19 - 11/01/19
