These tools are no longer maintained as of December 31, 2024. An archived website and the PubMed4Hh GitHub repository are available. Contact NLM Customer Service if you have questions.


BIOMARKERS

Molecular Biopsy of Human Tumors: a resource for Precision Medicine

116 related articles for article (PubMed ID: 33223606)

  • 1. Safety-Guaranteed, Accelerated Learning in MDPs with Local Side Information.
    Thangeda P; Ornik M
    Proc Am Control Conf; 2020 Jul; 2020():1099-1104. PubMed ID: 33223606

  • 2. Learning and Planning for Time-Varying MDPs Using Maximum Likelihood Estimation.
    Ornik M; Topcu U
    J Mach Learn Res; 2021; 22():1-40. PubMed ID: 35002545

  • 3. Discovering and Exploiting Sparse Rewards in a Learned Behavior Space.
    Paolo G; Coninx M; Laflaquière A; Doncieux S
    Evol Comput; 2024 Sep; 32(3):275-305. PubMed ID: 37793063

  • 4. An immediate-return reinforcement learning for the atypical Markov decision processes.
    Pan Z; Wen G; Tan Z; Yin S; Hu X
    Front Neurorobot; 2022; 16():1012427. PubMed ID: 36582302

  • 5. Learning Multirobot Hose Transportation and Deployment by Distributed Round-Robin Q-Learning.
    Fernandez-Gauna B; Etxeberria-Agiriano I; Graña M
    PLoS One; 2015; 10(7):e0127129. PubMed ID: 26158587

  • 6. Parameterized MDPs and Reinforcement Learning Problems-A Maximum Entropy Principle-Based Framework.
    Srivastava A; Salapaka SM
    IEEE Trans Cybern; 2022 Sep; 52(9):9339-9351. PubMed ID: 34406959

  • 7. Bayesian reinforcement learning for navigation planning in unknown environments.
    Alali M; Imani M
    Front Artif Intell; 2024; 7():1308031. PubMed ID: 39026967

  • 8. A Maximum Divergence Approach to Optimal Policy in Deep Reinforcement Learning.
    Yang Z; Qu H; Fu M; Hu W; Zhao Y
    IEEE Trans Cybern; 2023 Mar; 53(3):1499-1510. PubMed ID: 34478393

  • 9. Reactive Reinforcement Learning in Asynchronous Environments.
    Travnik JB; Mathewson KW; Sutton RS; Pilarski PM
    Front Robot AI; 2018; 5():79. PubMed ID: 33500958

  • 10. Diversifying Policies With Non-Markov Dispersion to Expand the Solution Space.
    Qu B; Cao X; Chang Y; Tsang IW; Ong YS
    IEEE Trans Pattern Anal Mach Intell; 2024 Dec; 46(12):11392-11408. PubMed ID: 39240734

  • 11. Hierarchical approximate policy iteration with binary-tree state space decomposition.
    Xu X; Liu C; Yang SX; Hu D
    IEEE Trans Neural Netw; 2011 Dec; 22(12):1863-77. PubMed ID: 21990333

  • 12. Learning parametric policies and transition probability models of Markov decision processes from data.
    Xu T; Zhu H; Paschalidis IC
    Eur J Control; 2021 Jan; 57():68-75. PubMed ID: 33716408

  • 13. Quantifying Reinforcement-Learning Agent's Autonomy, Reliance on Memory and Internalisation of the Environment.
    Ingel A; Makkeh A; Corcoll O; Vicente R
    Entropy (Basel); 2022 Mar; 24(3):. PubMed ID: 35327912

  • 14. Experience Replay Using Transition Sequences.
    Karimpanal TG; Bouffanais R
    Front Neurorobot; 2018; 12():32. PubMed ID: 29977200

  • 15. A new Q-learning algorithm based on the Metropolis criterion.
    Guo M; Liu Y; Malec J
    IEEE Trans Syst Man Cybern B Cybern; 2004 Oct; 34(5):2140-3. PubMed ID: 15503510

  • 16. Learning and exploration in action-perception loops.
    Little DY; Sommer FT
    Front Neural Circuits; 2013; 7():37. PubMed ID: 23579347

  • 17. Fidelity-based probabilistic Q-learning for control of quantum systems.
    Chen C; Dong D; Li HX; Chu J; Tarn TJ
    IEEE Trans Neural Netw Learn Syst; 2014 May; 25(5):920-33. PubMed ID: 24808038

  • 18. MOO-MDP: An Object-Oriented Representation for Cooperative Multiagent Reinforcement Learning.
    Da Silva FL; Glatt R; Costa AHR
    IEEE Trans Cybern; 2019 Feb; 49(2):567-579. PubMed ID: 29990289

  • 19. The Convergence of a Cooperation Markov Decision Process System.
    Mo X; Xu D; Fu Z
    Entropy (Basel); 2020 Aug; 22(9):. PubMed ID: 33286724

  • 20. Online reinforcement learning for dynamic multimedia systems.
    Mastronarde N; van der Schaar M
    IEEE Trans Image Process; 2010 Feb; 19(2):290-305. PubMed ID: 19884082
