153 related articles for article (PubMed ID: 33285876)
1. A Geometric Interpretation of Stochastic Gradient Descent Using Diffusion Metrics.
Fioresi R; Chaudhari P; Soatto S
Entropy (Basel); 2020 Jan; 22(1):. PubMed ID: 33285876
2. Accelerating Minibatch Stochastic Gradient Descent Using Typicality Sampling.
Peng X; Li L; Wang FY
IEEE Trans Neural Netw Learn Syst; 2020 Nov; 31(11):4649-4659. PubMed ID: 31899442
3. A mean field view of the landscape of two-layer neural networks.
Mei S; Montanari A; Nguyen PM
Proc Natl Acad Sci U S A; 2018 Aug; 115(33):E7665-E7671. PubMed ID: 30054315
4. Anomalous diffusion dynamics of learning in deep neural networks.
Chen G; Qu CK; Gong P
Neural Netw; 2022 May; 149():18-28. PubMed ID: 35182851
5. Preconditioned Stochastic Gradient Descent.
Li XL
IEEE Trans Neural Netw Learn Syst; 2018 May; 29(5):1454-1466. PubMed ID: 28362591
6. Accelerating deep neural network training with inconsistent stochastic gradient descent.
Wang L; Yang Y; Min R; Chakradhar S
Neural Netw; 2017 Sep; 93():219-229. PubMed ID: 28668660
7. Mutual Information Based Learning Rate Decay for Stochastic Gradient Descent Training of Deep Neural Networks.
Vasudevan S
Entropy (Basel); 2020 May; 22(5):. PubMed ID: 33286332
8. Stochastic Gradient Descent for Nonconvex Learning Without Bounded Gradient Assumptions.
Lei Y; Hu T; Li G; Tang K
IEEE Trans Neural Netw Learn Syst; 2020 Oct; 31(10):4394-4400. PubMed ID: 31831449
9. The inverse variance-flatness relation in stochastic gradient descent is critical for finding flat minima.
Feng Y; Tu Y
Proc Natl Acad Sci U S A; 2021 Mar; 118(9):. PubMed ID: 33619091
10. On the different regimes of stochastic gradient descent.
Sclocchi A; Wyart M
Proc Natl Acad Sci U S A; 2024 Feb; 121(9):e2316301121. PubMed ID: 38377198
11. Stochastic Gradient Descent Introduces an Effective Landscape-Dependent Regularization Favoring Flat Solutions.
Yang N; Tang C; Tu Y
Phys Rev Lett; 2023 Jun; 130(23):237101. PubMed ID: 37354404
12. Accelerating DNN Training Through Selective Localized Learning.
Krithivasan S; Sen S; Venkataramani S; Raghunathan A
Front Neurosci; 2021; 15():759807. PubMed ID: 35087370
13. Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup.
Goldt S; Advani MS; Saxe AM; Krzakala F; Zdeborová L
J Stat Mech; 2020 Dec; 2020(12):124010. PubMed ID: 34262607
14. PID Controller-Based Stochastic Optimization Acceleration for Deep Neural Networks.
Wang H; Luo Y; An W; Sun Q; Xu J; Zhang L
IEEE Trans Neural Netw Learn Syst; 2020 Dec; 31(12):5079-5091. PubMed ID: 32011265
15. Low Complexity Gradient Computation Techniques to Accelerate Deep Neural Network Training.
Shin D; Kim G; Jo J; Park J
IEEE Trans Neural Netw Learn Syst; 2023 Sep; 34(9):5745-5759. PubMed ID: 34890336
16. The Limiting Dynamics of SGD: Modified Loss, Phase-Space Oscillations, and Anomalous Diffusion.
Kunin D; Sagastuy-Brena J; Gillespie L; Margalit E; Tanaka H; Ganguli S; Yamins DLK
Neural Comput; 2023 Dec; 36(1):151-174. PubMed ID: 38052080
17. Research on a learning rate with energy index in deep learning.
Zhao H; Liu F; Zhang H; Liang Z
Neural Netw; 2019 Feb; 110():225-231. PubMed ID: 30599419
18. Understanding Short-Range Memory Effects in Deep Neural Networks.
Tan C; Zhang J; Liu J
IEEE Trans Neural Netw Learn Syst; 2023 Feb; PP():. PubMed ID: 37027555
19. Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent.
De Sa C; Feldman M; Ré C; Olukotun K
Proc Int Symp Comput Archit; 2017 Jun; 2017():561-574. PubMed ID: 29391770
20. Improving Deep Neural Networks' Training for Image Classification With Nonlinear Conjugate Gradient-Style Adaptive Momentum.
Wang B; Ye Q
IEEE Trans Neural Netw Learn Syst; 2023 Mar; PP():. PubMed ID: 37030680