
Adjustment of Discount Rate Using Index for Progress of Learning

Summary

We show that adjusting the discount rate according to an index of learning progress can be effective. In the proposed strategy, the discount rate is kept small while learning has not yet progressed far enough, and is increased as learning advances. We propose three adjustment methods: exponential, TD-error-based, and reliability-based. All three were verified in numerical experiments on a windy grid world task.
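As a rough illustration of the exponential variant, the sketch below raises the discount rate from a small initial value toward a larger final value as episodes accumulate, and feeds it into an ordinary tabular Q-learning update. The summary does not give the actual formulas, so the schedule shape, the function names `exponential_discount` and `q_update`, and the parameters `gamma_min`, `gamma_max`, `tau`, and `alpha` are all assumptions made for this example, not the authors' definitions.

```python
import math


def exponential_discount(episode, gamma_min=0.5, gamma_max=0.95, tau=50.0):
    """Hypothetical schedule: starts at gamma_min and rises toward gamma_max.

    The exponential form and all default values are assumptions for
    illustration; the original report may define the schedule differently.
    """
    return gamma_max - (gamma_max - gamma_min) * math.exp(-episode / tau)


def q_update(q, state, action, reward, next_state, gamma, alpha=0.1):
    """One tabular Q-learning update using the supplied discount rate.

    q is a dict of dicts: q[state][action] -> action value.
    Returns the TD error, which a TD-error-based adjustment could monitor.
    """
    best_next = max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


if __name__ == "__main__":
    # Print the assumed discount rate at a few points in training.
    for episode in (0, 20, 100, 400):
        print(episode, round(exponential_discount(episode), 3))
```

In a training loop, `exponential_discount(episode)` would be evaluated once per episode and passed as `gamma` to every update in that episode. The TD-error-based and reliability-based variants would replace the episode counter with a measured index of learning progress.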

[Figure: Success rate per 20 episodes. Left: conventional method. Right: proposed method.]

[Figure: Transition of the normalized action value function. Left: conventional method. Right: proposed method.]

Reference (in Japanese)

  1. Naoko Ogawa, Akio Namiki and Masatoshi Ishikawa. Adjustment of Discount Rate Using Index for Progress of Learning. IEICE Neurocomputing Meeting (Sapporo, Japan, 2003.2.4) / IEICE Technical Report, NC2002-129, pp. 73-78, Feb. 2003. [PDF (1.2M)]
Ishikawa Group Laboratory
Research Institute for Science & Technology, Tokyo University of Science /
Data Science Research Division, Information Technology Center, University of Tokyo
Copyright © 2008 Ishikawa Group Laboratory. All rights reserved.