3. Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors. Duan J, Guan Y, Li SE, Ren Y, Sun Q, Cheng B. IEEE Trans Neural Netw Learn Syst; 2022 Nov; 33(11):6584-6598. PubMed ID: 34101599
4. Relative Entropy Regularized Sample-Efficient Reinforcement Learning With Continuous Actions. Shang Z, Li R, Zheng C, Li H, Cui Y. IEEE Trans Neural Netw Learn Syst; 2023 Nov 09; PP():. PubMed ID: 37943648
5. Improved Soft Actor-Critic: Mixing Prioritized Off-Policy Samples With On-Policy Experiences. Banerjee C, Chen Z, Noman N. IEEE Trans Neural Netw Learn Syst; 2024 Mar 09; 35(3):3121-3129. PubMed ID: 35588412
6. Improving Exploration in Actor-Critic With Weakly Pessimistic Value Estimation and Optimistic Policy Optimization. Li F, Fu M, Chen W, Zhang F, Zhang H, Qu H, Yi Z. IEEE Trans Neural Netw Learn Syst; 2024 Jul 09; 35(7):8783-8796. PubMed ID: 36306289
8. Realistic Actor-Critic: A framework for balance between value overestimation and underestimation. Li S, Tang Q, Pang Y, Ma X, Wang G. Front Neurorobot; 2022 Jul 09; 16():1081242. PubMed ID: 36699950
9. Stochastic Integrated Actor-Critic for Deep Reinforcement Learning. Zheng J, Kurt MN, Wang X. IEEE Trans Neural Netw Learn Syst; 2024 May 09; 35(5):6654-6666. PubMed ID: 36256721
12. Reinforcement learning in continuous time and space. Doya K. Neural Comput; 2000 Jan 09; 12(1):219-45. PubMed ID: 10636940
13. Actor-Critic Learning Control With Regularization and Feature Selection in Policy Gradient Estimation. Li L, Li D, Song T, Xu X. IEEE Trans Neural Netw Learn Syst; 2021 Mar 09; 32(3):1217-1227. PubMed ID: 32324571
16. A priority experience replay actor-critic algorithm using self-attention mechanism for strategy optimization of discrete problems. Sun Y, Yang B. PeerJ Comput Sci; 2024 Jul 09; 10():e2161. PubMed ID: 38983226
18. Optimal Policy of Multiplayer Poker via Actor-Critic Reinforcement Learning. Shi D, Guo X, Liu Y, Fan W. Entropy (Basel); 2022 May 30; 24(6):. PubMed ID: 35741495
19. Adaptive bias-variance trade-off in advantage estimator for actor-critic algorithms. Chen Y, Zhang F, Liu Z. Neural Netw; 2024 Jan 30; 169():764-777. PubMed ID: 37981458