These tools are no longer maintained as of December 31, 2024. An archived copy of the website and the PubMed4Hh GitHub repository remain available. Contact NLM Customer Service if you have questions.


BIOMARKERS

Molecular Biopsy of Human Tumors: a resource for Precision Medicine

115 related articles for article (PubMed ID: 24491826)

  • 1. Policy oscillation is overshooting.
    Wagner P
    Neural Netw; 2014 Apr; 52():43-61. PubMed ID: 24491826

  • 2. Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data.
    Lewis FL; Vamvoudakis KG
    IEEE Trans Syst Man Cybern B Cybern; 2011 Feb; 41(1):14-25. PubMed ID: 20350860

  • 3. Kernel-based least squares policy iteration for reinforcement learning.
    Xu X; Hu D; Lu X
    IEEE Trans Neural Netw; 2007 Jul; 18(4):973-92. PubMed ID: 17668655

  • 4. Autonomous reinforcement learning with experience replay.
    Wawrzyński P; Tanwani AK
    Neural Netw; 2013 May; 41():156-67. PubMed ID: 23237972

  • 5. Reinforcement learning in continuous time and space.
    Doya K
    Neural Comput; 2000 Jan; 12(1):219-45. PubMed ID: 10636940

  • 6. Reinforcement learning of motor skills with policy gradients.
    Peters J; Schaal S
    Neural Netw; 2008 May; 21(4):682-97. PubMed ID: 18482830

  • 7. Adaptive importance sampling for value function approximation in off-policy reinforcement learning.
Hachiya H; Akiyama T; Sugiyama M; Peters J
    Neural Netw; 2009 Dec; 22(10):1399-410. PubMed ID: 19216050

  • 8. Adaptive dynamic programming approach to experience-based systems identification and control.
    Lendaris GG
    Neural Netw; 2009; 22(5-6):822-32. PubMed ID: 19632087

  • 9. Partially observable Markov decision processes and performance sensitivity analysis.
    Li Y; Yin B; Xi H
    IEEE Trans Syst Man Cybern B Cybern; 2008 Dec; 38(6):1645-51. PubMed ID: 19022734

  • 10. Approximate robust policy iteration using multilayer perceptron neural networks for discounted infinite-horizon Markov decision processes with uncertain correlated transition matrices.
    Li B; Si J
    IEEE Trans Neural Netw; 2010 Aug; 21(8):1270-80. PubMed ID: 20601311

  • 11. Parameter-exploring policy gradients.
    Sehnke F; Osendorfer C; Rückstiess T; Graves A; Peters J; Schmidhuber J
    Neural Netw; 2010 May; 23(4):551-9. PubMed ID: 20061118

  • 12. Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning.
    Morimura T; Uchibe E; Yoshimoto J; Peters J; Doya K
    Neural Comput; 2010 Feb; 22(2):342-76. PubMed ID: 19842990

  • 13. Efficient model learning methods for actor-critic control.
    Grondman I; Vaandrager M; Buşoniu L; Babuska R; Schuitema E
    IEEE Trans Syst Man Cybern B Cybern; 2012 Jun; 42(3):591-602. PubMed ID: 22156998

  • 14. Composition of web services using Markov decision processes and dynamic programming.
    Uc-Cetina V; Moo-Mena F; Hernandez-Ucan R
    ScientificWorldJournal; 2015; 2015():545308. PubMed ID: 25874247

  • 15. Individualization of pharmacological anemia management using reinforcement learning.
    Gaweda AE; Muezzinoglu MK; Aronoff GR; Jacobs AA; Zurada JM; Brier ME
    Neural Netw; 2005; 18(5-6):826-34. PubMed ID: 16109475

  • 16. Impedance learning for robotic contact tasks using natural actor-critic algorithm.
    Kim B; Park J; Park S; Kang S
    IEEE Trans Syst Man Cybern B Cybern; 2010 Apr; 40(2):433-43. PubMed ID: 19696001

  • 17. A policy iteration approach to online optimal control of continuous-time constrained-input systems.
    Modares H; Naghibi Sistani MB; Lewis FL
    ISA Trans; 2013 Sep; 52(5):611-21. PubMed ID: 23706414

  • 18. Approximate dynamic programming for optimal stationary control with control-dependent noise.
    Jiang Y; Jiang ZP
    IEEE Trans Neural Netw; 2011 Dec; 22(12):2392-8. PubMed ID: 21954203

  • 19. Fully probabilistic control design in an adaptive critic framework.
    Herzallah R; Kárný M
    Neural Netw; 2011 Dec; 24(10):1128-35. PubMed ID: 21752597

  • 20. Approximate Dynamic Programming for Nonlinear-Constrained Optimizations.
    Yang X; He H; Zhong X
IEEE Trans Cybern; 2021 May; 51(5):2419-32. PubMed ID: 31329149
