Continuous action-space reinforcement learning methods applied to the minimum-time swing-up of the acrobot
Book chapter
Nichols, B. 2015. Continuous action-space reinforcement learning methods applied to the minimum-time swing-up of the acrobot. In: Systems, Man and Cybernetics (SMC), 2015 IEEE International Conference on. Institute of Electrical and Electronics Engineers (IEEE). pp. 2084-2089.
Chapter title | Continuous action-space reinforcement learning methods applied to the minimum-time swing-up of the acrobot |
---|---|
Authors | Nichols, B. |
Abstract | Here I apply three reinforcement learning methods to the full, continuous-action, swing-up acrobot control benchmark problem. These include two approaches from the literature, CACLA and NM-SARSA, and a novel approach which I refer to as NelderMead-SARSA. NelderMead-SARSA, like NM-SARSA, directly optimises the state-action value function for action selection, in order to allow continuous-action reinforcement learning without a separate policy function. However, because it uses a derivative-free method, it does not require the first or second partial derivatives of the value function. |
Page range | 2084-2089 |
Book title | Systems, Man and Cybernetics (SMC), 2015 IEEE International Conference on |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
ISBN (hardcover) | 9781479986965 |
Publication date | Oct 2015 |
Publication process dates | |
Deposited | 18 Jan 2016 |
Output status | Published |
Digital Object Identifier (DOI) | https://doi.org/10.1109/SMC.2015.364 |
Language | English |
Event | 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC) |
https://repository.mdx.ac.uk/item/861qv
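
The abstract describes selecting actions by directly optimising the state-action value function with a derivative-free method. As a rough illustration only, the sketch below applies a one-dimensional Nelder-Mead search to a scalar action, as in a single-torque problem such as the acrobot. It is not the author's implementation: the function names (`nelder_mead_1d`, `select_action`), the toy quadratic `toy_q`, the action bounds of [-1, 1], and the iteration settings are all assumptions made for this example, and the learned value function of the paper is replaced by a hand-written stand-in.

```python
import math
from typing import Callable


def nelder_mead_1d(f: Callable[[float], float], x0: float, step: float,
                   lo: float, hi: float, iters: int = 60,
                   tol: float = 1e-9) -> float:
    """Minimise f on [lo, hi] with a two-point (1-D) Nelder-Mead simplex."""

    def bounded(x: float) -> float:
        # Points outside the action range are treated as infinitely bad.
        return f(x) if lo <= x <= hi else math.inf

    x0 = min(hi, max(lo, x0))
    x1 = x0 + step if x0 + step <= hi else x0 - step
    pts = [(bounded(x0), x0), (bounded(x1), x1)]
    for _ in range(iters):
        pts.sort()                      # best point first, worst second
        (fb, xb), (fw, xw) = pts
        if abs(xw - xb) < tol:
            break
        xm = 0.5 * (xb + xw)            # midpoint: inside contraction / shrink
        xr = xb + (xb - xw)             # reflect the worst point through the best
        fr = bounded(xr)
        if fr < fb:
            xe = xb + 2.0 * (xb - xw)   # expansion
            fe = bounded(xe)
            pts[1] = (fe, xe) if fe < fr else (fr, xr)
        elif fr < fw:
            xc = xb + 0.5 * (xr - xb)   # outside contraction
            fc = bounded(xc)
            pts[1] = (fc, xc) if fc <= fr else (bounded(xm), xm)
        else:
            pts[1] = (bounded(xm), xm)  # in 1-D, inside contraction equals shrink
    return min(pts)[1]


def select_action(q: Callable[[float, float], float], state: float,
                  a_lo: float = -1.0, a_hi: float = 1.0,
                  a_init: float = 0.0) -> float:
    """Greedy action: maximise q(state, a) over a without using derivatives."""
    return nelder_mead_1d(lambda a: -q(state, a), a_init, 0.5, a_lo, a_hi)


def toy_q(state: float, action: float) -> float:
    # Hypothetical stand-in for a learned Q-function; its maximum is at 0.3 * state.
    return -(action - 0.3 * state) ** 2


if __name__ == "__main__":
    print(select_action(toy_q, 1.0))   # interior maximum
    print(select_action(toy_q, 5.0))   # maximum lies beyond the upper bound
```

Because the search only evaluates the value function at candidate actions, it needs no gradient or Hessian, which is the property the abstract contrasts with NM-SARSA. Exploration and the SARSA update itself are deliberately left out of this sketch.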