These tools are no longer maintained as of December 31, 2024. An archived version of the website and the PubMed4Hh GitHub repository remain available. Contact NLM Customer Service with questions.
140 related articles for article (PubMed ID: 37354404)
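A list like the one below can be reproduced against the live PubMed service via NCBI's public E-utilities ELink endpoint, which exposes the "similar articles" neighbors for a given PMID. The following is a minimal sketch, assuming network access and Python 3.9+; note that PubMed4Hh may rank, filter, or truncate the neighbor set differently, so the count need not match the 140 shown here.

```python
import json
import urllib.parse
import urllib.request

# NCBI E-utilities ELink endpoint (public API behind PubMed's "similar articles").
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"

def related_pmids(pmid: str) -> list[str]:
    """Return the PMIDs that PubMed lists as 'similar articles' for the given PMID."""
    params = urllib.parse.urlencode({
        "dbfrom": "pubmed",   # source database
        "db": "pubmed",       # target database
        "id": pmid,
        "cmd": "neighbor",    # the 'related/similar articles' link command
        "retmode": "json",
    })
    with urllib.request.urlopen(f"{EUTILS}?{params}") as resp:
        data = json.load(resp)
    # The pubmed_pubmed linkname holds the similar-articles neighbors.
    for linksetdb in data["linksets"][0].get("linksetdbs", []):
        if linksetdb["linkname"] == "pubmed_pubmed":
            return [str(x) for x in linksetdb["links"]]
    return []

if __name__ == "__main__":
    pmids = related_pmids("37354404")
    print(len(pmids), "related articles; first five:", pmids[:5])
```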
1. Stochastic Gradient Descent Introduces an Effective Landscape-Dependent Regularization Favoring Flat Solutions. Yang N; Tang C; Tu Y. Phys Rev Lett; 2023 Jun; 130(23):237101. PubMed ID: 37354404.
2. The inverse variance-flatness relation in stochastic gradient descent is critical for finding flat minima. Feng Y; Tu Y. Proc Natl Acad Sci U S A; 2021 Mar; 118(9). PubMed ID: 33619091.
3. Anomalous diffusion dynamics of learning in deep neural networks. Chen G; Qu CK; Gong P. Neural Netw; 2022 May; 149:18-28. PubMed ID: 35182851.
4. Accelerating Minibatch Stochastic Gradient Descent Using Typicality Sampling. Peng X; Li L; Wang FY. IEEE Trans Neural Netw Learn Syst; 2020 Nov; 31(11):4649-4659. PubMed ID: 31899442.
5. A mean field view of the landscape of two-layer neural networks. Mei S; Montanari A; Nguyen PM. Proc Natl Acad Sci U S A; 2018 Aug; 115(33):E7665-E7671. PubMed ID: 30054315.
6. Stochastic Mirror Descent on Overparameterized Nonlinear Models. Azizan N; Lale S; Hassibi B. IEEE Trans Neural Netw Learn Syst; 2022 Dec; 33(12):7717-7727. PubMed ID: 34270431.
7. Shaping the learning landscape in neural networks around wide flat minima. Baldassi C; Pittorino F; Zecchina R. Proc Natl Acad Sci U S A; 2020 Jan; 117(1):161-170. PubMed ID: 31871189.
8. Towards Better Generalization of Deep Neural Networks via Non-Typicality Sampling Scheme. Peng X; Wang FY; Li L. IEEE Trans Neural Netw Learn Syst; 2023 Oct; 34(10):7910-7920. PubMed ID: 35157598.
9. Accelerating deep neural network training with inconsistent stochastic gradient descent. Wang L; Yang Y; Min R; Chakradhar S. Neural Netw; 2017 Sep; 93:219-229. PubMed ID: 28668660.
15. On the different regimes of stochastic gradient descent. Sclocchi A; Wyart M. Proc Natl Acad Sci U S A; 2024 Feb; 121(9):e2316301121. PubMed ID: 38377198.
16. Learning smooth dendrite morphological neurons by stochastic gradient descent for pattern classification. Gómez-Flores W; Sossa H. Neural Netw; 2023 Nov; 168:665-676. PubMed ID: 37857137.
17. A Geometric Interpretation of Stochastic Gradient Descent Using Diffusion Metrics. Fioresi R; Chaudhari P; Soatto S. Entropy (Basel); 2020 Jan; 22(1). PubMed ID: 33285876.
18. Achieving small-batch accuracy with large-batch scalability via Hessian-aware learning rate adjustment. Lee S; He C; Avestimehr S. Neural Netw; 2023 Jan; 158:1-14. PubMed ID: 36436301.
19. Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup. Goldt S; Advani MS; Saxe AM; Krzakala F; Zdeborová L. J Stat Mech; 2020 Dec; 2020(12):124010. PubMed ID: 34262607.
20. Critical Point-Finding Methods Reveal Gradient-Flat Regions of Deep Network Losses. Frye CG; Simon J; Wadia NS; Ligeralde A; DeWeese MR; Bouchard KE. Neural Comput; 2021 May; 33(6):1469-1497. PubMed ID: 34496389.