These tools are no longer maintained as of December 31, 2024. An archived copy of the website and the PubMed4Hh GitHub repository remain available. Contact NLM Customer Service if you have questions.


BIOMARKERS

Molecular Biopsy of Human Tumors: a resource for Precision Medicine

115 related articles for article (PubMed ID: 24491826)

  • 1. Policy oscillation is overshooting.
    Wagner P
    Neural Netw; 2014 Apr; 52():43-61. PubMed ID: 24491826

  • 2. Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data.
    Lewis FL; Vamvoudakis KG
    IEEE Trans Syst Man Cybern B Cybern; 2011 Feb; 41(1):14-25. PubMed ID: 20350860

  • 3. Kernel-based least squares policy iteration for reinforcement learning.
    Xu X; Hu D; Lu X
    IEEE Trans Neural Netw; 2007 Jul; 18(4):973-92. PubMed ID: 17668655

  • 4. Autonomous reinforcement learning with experience replay.
    Wawrzyński P; Tanwani AK
    Neural Netw; 2013 May; 41():156-67. PubMed ID: 23237972

  • 5. Reinforcement learning in continuous time and space.
    Doya K
    Neural Comput; 2000 Jan; 12(1):219-45. PubMed ID: 10636940

  • 6. Reinforcement learning of motor skills with policy gradients.
    Peters J; Schaal S
    Neural Netw; 2008 May; 21(4):682-97. PubMed ID: 18482830

  • 7. Adaptive importance sampling for value function approximation in off-policy reinforcement learning.
Hachiya H; Akiyama T; Sugiyama M; Peters J
    Neural Netw; 2009 Dec; 22(10):1399-410. PubMed ID: 19216050

  • 8. Adaptive dynamic programming approach to experience-based systems identification and control.
    Lendaris GG
    Neural Netw; 2009; 22(5-6):822-32. PubMed ID: 19632087

  • 9. Partially observable Markov decision processes and performance sensitivity analysis.
    Li Y; Yin B; Xi H
    IEEE Trans Syst Man Cybern B Cybern; 2008 Dec; 38(6):1645-51. PubMed ID: 19022734

  • 10. Approximate robust policy iteration using multilayer perceptron neural networks for discounted infinite-horizon Markov decision processes with uncertain correlated transition matrices.
    Li B; Si J
    IEEE Trans Neural Netw; 2010 Aug; 21(8):1270-80. PubMed ID: 20601311

  • 11. Parameter-exploring policy gradients.
    Sehnke F; Osendorfer C; Rückstiess T; Graves A; Peters J; Schmidhuber J
    Neural Netw; 2010 May; 23(4):551-9. PubMed ID: 20061118

  • 12. Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning.
    Morimura T; Uchibe E; Yoshimoto J; Peters J; Doya K
    Neural Comput; 2010 Feb; 22(2):342-76. PubMed ID: 19842990

  • 13. Efficient model learning methods for actor-critic control.
    Grondman I; Vaandrager M; Buşoniu L; Babuska R; Schuitema E
    IEEE Trans Syst Man Cybern B Cybern; 2012 Jun; 42(3):591-602. PubMed ID: 22156998

  • 14. Composition of web services using Markov decision processes and dynamic programming.
    Uc-Cetina V; Moo-Mena F; Hernandez-Ucan R
    ScientificWorldJournal; 2015; 2015():545308. PubMed ID: 25874247

  • 15. Individualization of pharmacological anemia management using reinforcement learning.
    Gaweda AE; Muezzinoglu MK; Aronoff GR; Jacobs AA; Zurada JM; Brier ME
    Neural Netw; 2005; 18(5-6):826-34. PubMed ID: 16109475

  • 16. Impedance learning for robotic contact tasks using natural actor-critic algorithm.
    Kim B; Park J; Park S; Kang S
    IEEE Trans Syst Man Cybern B Cybern; 2010 Apr; 40(2):433-43. PubMed ID: 19696001

  • 17. A policy iteration approach to online optimal control of continuous-time constrained-input systems.
    Modares H; Naghibi Sistani MB; Lewis FL
    ISA Trans; 2013 Sep; 52(5):611-21. PubMed ID: 23706414

  • 18. Approximate dynamic programming for optimal stationary control with control-dependent noise.
    Jiang Y; Jiang ZP
    IEEE Trans Neural Netw; 2011 Dec; 22(12):2392-8. PubMed ID: 21954203

  • 19. Fully probabilistic control design in an adaptive critic framework.
    Herzallah R; Kárný M
    Neural Netw; 2011 Dec; 24(10):1128-35. PubMed ID: 21752597

  • 20. Approximate Dynamic Programming for Nonlinear-Constrained Optimizations.
    Yang X; He H; Zhong X
IEEE Trans Cybern; 2021 May; 51(5):2419-32. PubMed ID: 31329149
