Pubmed for Handhelds
Title: Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network.
Author: Meng W, Zheng Q, Yang L, Li P, Pan G.
Journal: IEEE Trans Neural Netw Learn Syst; 2020 Oct; 31(10):4374-4380.
PubMed ID: 31765320.
Abstract: The deep Q-network (DQN) and return-based reinforcement learning are two promising algorithms proposed in recent years. The DQN brings advances to complex sequential decision problems, while return-based algorithms have advantages in making use of sample trajectories. In this brief, we propose a general framework to combine the DQN and most of the return-based reinforcement learning algorithms, named R-DQN. We show that the performance of the traditional DQN can be significantly improved by introducing return-based algorithms. In order to further improve the R-DQN, we design a strategy with two measurements to qualitatively measure the policy discrepancy. We conduct experiments on several representative tasks from the OpenAI Gym and Atari games. The state-of-the-art performance achieved by our method with this proposed strategy validates its effectiveness.
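
To make the idea of a return-based target more concrete, here is a minimal sketch of one generic member of that family, the uncorrected n-step return, computed along a sampled trajectory. This is not the R-DQN framework from the paper, whose details the abstract does not give. The function name `n_step_targets` and its parameters are hypothetical, and `next_q_max[t]` is assumed to hold the target network's max over actions of Q at the state reached after step t.

```python
from typing import List, Sequence


def n_step_targets(
    rewards: Sequence[float],
    next_q_max: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,
    n: int = 3,
) -> List[float]:
    """Uncorrected n-step return targets for each step of one trajectory.

    rewards[t]    : reward received at step t
    next_q_max[t] : assumed to be max_a Q_target(s_{t+1}, a)
    dones[t]      : True if the episode terminated at step t
    """
    horizon = len(rewards)
    targets = []
    for t in range(horizon):
        ret = 0.0
        discount = 1.0
        last = t
        # Accumulate up to n discounted rewards, stopping at episode end.
        for k in range(t, min(t + n, horizon)):
            ret += discount * rewards[k]
            discount *= gamma
            last = k
            if dones[k]:
                break
        # Bootstrap from the target network unless the episode terminated.
        if not dones[last]:
            ret += discount * next_q_max[last]
        targets.append(ret)
    return targets


if __name__ == "__main__":
    # With n=1 this reduces to the usual one-step DQN target.
    print(n_step_targets([1.0, 1.0, 1.0], [10.0, 10.0, 10.0],
                         [False, False, True], gamma=0.5, n=1))
    print(n_step_targets([1.0, 1.0, 1.0], [10.0, 10.0, 10.0],
                         [False, False, True], gamma=0.5, n=2))
```

The sketch applies no off-policy correction: it uses the sampled rewards as they are, even if the policy that generated the trajectory differs from the current one. That gap between the two policies is the kind of discrepancy the abstract says the authors' strategy is designed to measure.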