130 related articles for article (PubMed ID: 34270431)
1. Stochastic Mirror Descent on Overparameterized Nonlinear Models. Azizan N; Lale S; Hassibi B. IEEE Trans Neural Netw Learn Syst. 2022 Dec;33(12):7717-7727. PubMed ID: 34270431.
2. Stochastic Gradient Descent Introduces an Effective Landscape-Dependent Regularization Favoring Flat Solutions. Yang N; Tang C; Tu Y. Phys Rev Lett. 2023 Jun;130(23):237101. PubMed ID: 37354404.
3. Weighted SGD for ℓp Regression with Randomized Preconditioning. Yang J; Chow YL; Ré C; Mahoney MW. Proc Annu ACM SIAM Symp Discret Algorithms. 2016 Jan;2016:558-569. PubMed ID: 29782626.
4. Theoretical issues in deep networks. Poggio T; Banburski A; Liao Q. Proc Natl Acad Sci U S A. 2020 Dec;117(48):30039-30045. PubMed ID: 32518109.
5. Convergence of the RMSProp deep learning method with penalty for nonconvex optimization. Xu D; Zhang S; Zhang H; Mandic DP. Neural Netw. 2021 Jul;139:17-23. PubMed ID: 33662649.
6. The inverse variance-flatness relation in stochastic gradient descent is critical for finding flat minima. Feng Y; Tu Y. Proc Natl Acad Sci U S A. 2021 Mar;118(9). PubMed ID: 33619091.
7. Weight Decay With Tailored Adam on Scale-Invariant Weights for Better Generalization. Jia X; Feng X; Yong H; Meng D. IEEE Trans Neural Netw Learn Syst. 2024 May;35(5):6936-6947. PubMed ID: 36279330.
8. Sign-Based Gradient Descent With Heterogeneous Data: Convergence and Byzantine Resilience. Jin R; Liu Y; Huang Y; He X; Wu T; Dai H. IEEE Trans Neural Netw Learn Syst. 2024 Jan;PP. PubMed ID: 38215315.
9. Dynamics in Deep Classifiers Trained with the Square Loss: Normalization, Low Rank, Neural Collapse, and Generalization Bounds. Xu M; Rangamani A; Liao Q; Galanti T; Poggio T. Research (Wash D C). 2023;6:0024. PubMed ID: 37223467.
10. Understanding Implicit Regularization in Over-Parameterized Single Index Model. Fan J; Yang Z; Yu M. J Am Stat Assoc. 2023;118(544):2315-2328. PubMed ID: 38550788.
11. On the different regimes of stochastic gradient descent. Sclocchi A; Wyart M. Proc Natl Acad Sci U S A. 2024 Feb;121(9):e2316301121. PubMed ID: 38377198.
12. Implicit Regularization and Momentum Algorithms in Nonlinearly Parameterized Adaptive Control and Prediction. Boffi NM; Slotine JE. Neural Comput. 2021 Mar;33(3):590-673. PubMed ID: 33513321.
13. Learning Rates for Stochastic Gradient Descent With Nonconvex Objectives. Lei Y; Tang K. IEEE Trans Pattern Anal Mach Intell. 2021 Dec;43(12):4505-4511. PubMed ID: 33755555.
14. A(DP)²SGD: Asynchronous Decentralized Parallel Stochastic Gradient Descent With Differential Privacy. Xu J; Zhang W; Wang F. IEEE Trans Pattern Anal Mach Intell. 2022 Nov;44(11):8036-8047. PubMed ID: 34449356.
16. Estimation of Granger causality through Artificial Neural Networks: applications to physiological systems and chaotic electronic oscillators. Antonacci Y; Minati L; Faes L; Pernice R; Nollo G; Toppi J; Pietrabissa A; Astolfi L. PeerJ Comput Sci. 2021;7:e429. PubMed ID: 34084917.
17. A mean field view of the landscape of two-layer neural networks. Mei S; Montanari A; Nguyen PM. Proc Natl Acad Sci U S A. 2018 Aug;115(33):E7665-E7671. PubMed ID: 30054315.
19. The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization. Tao W; Pan Z; Wu G; Tao Q. IEEE Trans Neural Netw Learn Syst. 2020 Jul;31(7):2557-2568. PubMed ID: 31484139.