Continuous action-space reinforcement learning methods applied to the minimum-time swing-up of the acrobot

Book chapter


Nichols, B. 2015. Continuous action-space reinforcement learning methods applied to the minimum-time swing-up of the acrobot. In: Systems, Man and Cybernetics (SMC), 2015 IEEE International Conference on. Institute of Electrical and Electronics Engineers (IEEE), pp. 2084-2089.
Chapter title: Continuous action-space reinforcement learning methods applied to the minimum-time swing-up of the acrobot
Authors: Nichols, B.
Abstract

Here I apply three reinforcement learning methods to the full, continuous-action swing-up acrobot control benchmark problem. These include two approaches from the literature, CACLA and NM-SARSA, and a novel approach, which I refer to as NelderMead-SARSA. NelderMead-SARSA, like NM-SARSA, directly optimises the state-action value function for action selection, which allows continuous-action reinforcement learning without a separate policy function. Because it uses a derivative-free method, however, it does not require the first or second partial derivatives of the value function.
All three methods achieved good results in terms of swing-up times, comparable to previous approaches from the literature. NelderMead-SARSA in particular performed the swing-up in a shorter time than many approaches from the literature.
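
The action-selection idea in the abstract, choosing an action by running a derivative-free search over the state-action value function, can be illustrated with a short sketch. This is not the paper's implementation: the simplified one-dimensional Nelder-Mead routine, the names `nelder_mead_1d`, `greedy_action` and `toy_q`, the action bounds of -1 to 1, and the quadratic stand-in for a learned value function are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def nelder_mead_1d(f: Callable[[float], float], x0: float, step: float,
                   lo: float, hi: float, iters: int = 60,
                   tol: float = 1e-6) -> float:
    """Minimise f on [lo, hi] with a simplified 1-D Nelder-Mead search.

    Only function values are used; no derivatives of f are needed.
    """
    def clip(x: float) -> float:
        return max(lo, min(hi, x))

    best, worst = clip(x0), clip(x0 + step)
    if best == worst:                      # x0 already sat on the upper bound
        worst = clip(x0 - step)
    f_best, f_worst = f(best), f(worst)

    for _ in range(iters):
        if f_worst < f_best:               # keep the two vertices ordered
            best, worst = worst, best
            f_best, f_worst = f_worst, f_best
        if abs(best - worst) < tol:
            break
        # In 1-D the centroid of all vertices except the worst is `best`.
        refl = clip(best + (best - worst))               # reflection
        f_refl = f(refl)
        if f_refl < f_best:
            expd = clip(best + 2.0 * (best - worst))     # expansion
            f_expd = f(expd)
            if f_expd < f_refl:
                worst, f_worst = expd, f_expd
            else:
                worst, f_worst = refl, f_refl
        else:
            # Contract toward the best vertex (same point as a shrink in 1-D).
            worst = best + 0.5 * (worst - best)
            f_worst = f(worst)
    return best if f_best <= f_worst else worst


def greedy_action(q: Callable[[Sequence[float], float], float],
                  state: Sequence[float],
                  a_min: float = -1.0, a_max: float = 1.0) -> float:
    """Return the action that (approximately) maximises q(state, action)."""
    return nelder_mead_1d(lambda a: -q(state, a),       # maximise Q = minimise -Q
                          x0=0.5 * (a_min + a_max),
                          step=0.25 * (a_max - a_min),
                          lo=a_min, hi=a_max)


def toy_q(state: Sequence[float], action: float) -> float:
    """Hypothetical value function whose best action equals state[0]."""
    return -(action - state[0]) ** 2


action = greedy_action(toy_q, (0.3,))
```

In a full agent the value function would be a learned approximator updated with a SARSA rule, and some exploration would be added to the selected action; neither is shown here.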

Page range: 2084-2089
Book title: Systems, Man and Cybernetics (SMC), 2015 IEEE International Conference on
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
ISBN
Hardcover: 9781479986965
Publication dates
Print: Oct 2015
Publication process dates
Deposited: 18 Jan 2016
Output status: Published
Digital Object Identifier (DOI): https://doi.org/10.1109/SMC.2015.364
Language: English
Event: 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Permalink: https://repository.mdx.ac.uk/item/861qv
