Reinforcement Learning with Constant Size Frequency Encodings

Publisher

Middle Tennessee State University

Abstract

In Reinforcement Learning, Markov Decision Processes (MDPs) enable agents to learn complex behavior by following simple algorithms and receiving sparse feedback from the environment. A drawback of MDPs is that their sequential nature locks an agent into operating at a particular time scale. An environment may carry signals that are expressed only across a different time scale, so the agent needs some mechanism, such as an episodic memory, to extract this information over multiple steps of the MDP. Humans do this easily, and the hippocampus is believed to be responsible for managing such information in our brains and in those of other living things. In this work we propose and analyze a method for creating a constant-length episodic memory trace, which we call a Holographic Frequency Trace (HFT), that can be calculated and used in real time during Reinforcement Learning.
