BIOMARKERS

Molecular Biopsy of Human Tumors

- a resource for Precision Medicine

116 related articles for article (PubMed ID: 33223606)

1. Safety-Guaranteed, Accelerated Learning in MDPs with Local Side Information.
Thangeda P; Ornik M
Proc Am Control Conf; 2020 Jul; 2020():1099-1104. PubMed ID: 33223606

2. Learning and Planning for Time-Varying MDPs Using Maximum Likelihood Estimation.
Ornik M; Topcu U
J Mach Learn Res; 2021; 22():1-40. PubMed ID: 35002545

3. Discovering and Exploiting Sparse Rewards in a Learned Behavior Space.
Paolo G; Coninx M; Laflaquière A; Doncieux S
Evol Comput; 2024 Sep; 32(3):275-305. PubMed ID: 37793063

4. An immediate-return reinforcement learning for the atypical Markov decision processes.
Pan Z; Wen G; Tan Z; Yin S; Hu X
Front Neurorobot; 2022; 16():1012427. PubMed ID: 36582302

5. Learning Multirobot Hose Transportation and Deployment by Distributed Round-Robin Q-Learning.
Fernandez-Gauna B; Etxeberria-Agiriano I; Graña M
PLoS One; 2015; 10(7):e0127129. PubMed ID: 26158587

6. Parameterized MDPs and Reinforcement Learning Problems-A Maximum Entropy Principle-Based Framework.
Srivastava A; Salapaka SM
IEEE Trans Cybern; 2022 Sep; 52(9):9339-9351. PubMed ID: 34406959

7. Bayesian reinforcement learning for navigation planning in unknown environments.
Alali M; Imani M
Front Artif Intell; 2024; 7():1308031. PubMed ID: 39026967

8. A Maximum Divergence Approach to Optimal Policy in Deep Reinforcement Learning.
Yang Z; Qu H; Fu M; Hu W; Zhao Y
IEEE Trans Cybern; 2023 Mar; 53(3):1499-1510. PubMed ID: 34478393

9. Reactive Reinforcement Learning in Asynchronous Environments.
Travnik JB; Mathewson KW; Sutton RS; Pilarski PM
Front Robot AI; 2018; 5():79. PubMed ID: 33500958

10. Diversifying Policies With Non-Markov Dispersion to Expand the Solution Space.
Qu B; Cao X; Chang Y; Tsang IW; Ong YS
IEEE Trans Pattern Anal Mach Intell; 2024 Dec; 46(12):11392-11408. PubMed ID: 39240734

11. Hierarchical approximate policy iteration with binary-tree state space decomposition.
Xu X; Liu C; Yang SX; Hu D
IEEE Trans Neural Netw; 2011 Dec; 22(12):1863-77. PubMed ID: 21990333

12. Learning parametric policies and transition probability models of Markov decision processes from data.
Xu T; Zhu H; Paschalidis IC
Eur J Control; 2021 Jan; 57():68-75. PubMed ID: 33716408

13. Quantifying Reinforcement-Learning Agent's Autonomy, Reliance on Memory and Internalisation of the Environment.
Ingel A; Makkeh A; Corcoll O; Vicente R
Entropy (Basel); 2022 Mar; 24(3):. PubMed ID: 35327912

14. Experience Replay Using Transition Sequences.
Karimpanal TG; Bouffanais R
Front Neurorobot; 2018; 12():32. PubMed ID: 29977200

15. A new Q-learning algorithm based on the Metropolis criterion.
Guo M; Liu Y; Malec J
IEEE Trans Syst Man Cybern B Cybern; 2004 Oct; 34(5):2140-3. PubMed ID: 15503510

16. Learning and exploration in action-perception loops.
Little DY; Sommer FT
Front Neural Circuits; 2013; 7():37. PubMed ID: 23579347

17. Fidelity-based probabilistic Q-learning for control of quantum systems.
Chen C; Dong D; Li HX; Chu J; Tarn TJ
IEEE Trans Neural Netw Learn Syst; 2014 May; 25(5):920-33. PubMed ID: 24808038

18. MOO-MDP: An Object-Oriented Representation for Cooperative Multiagent Reinforcement Learning.
Da Silva FL; Glatt R; Costa AHR
IEEE Trans Cybern; 2019 Feb; 49(2):567-579. PubMed ID: 29990289

19. The Convergence of a Cooperation Markov Decision Process System.
Mo X; Xu D; Fu Z
Entropy (Basel); 2020 Aug; 22(9):. PubMed ID: 33286724

20. Online reinforcement learning for dynamic multimedia systems.
Mastronarde N; van der Schaar M
IEEE Trans Image Process; 2010 Feb; 19(2):290-305. PubMed ID: 19884082
