Reinforcement Learning to Achieve Real-time Control of a Quadruple Inverted Pendulum

  • Oh, Yookyung
  • Lee, Taegun
  • Ryoo, Sanghyun
  • Koh, Kyoung Chul
  • Han, Soohee
  • Lee, Young Sam

Citations
Web of Science: 2
Scopus: 2

Abstract

For the first time in model-free control, the real-time swing-up and balancing task of a quadruple inverted pendulum (QIP) is successfully performed through reinforcement learning (RL) in a sample-efficient, data-driven manner. The policy network designed in this paper takes the form of a full state-feedback closed-loop controller and swings up and balances the QIP efficiently without any pre-computed reference trajectories or mathematical models. To do so, this work first defines the Markov decision process (MDP) for QIP control and then applies a state-of-the-art off-policy RL algorithm, truncated quantile critics (TQC), which accomplishes the control objective by using the probability distribution of Q-values. For high sample efficiency, virtual experience replay (VER) is used to exploit the geometrically symmetric structure of the system and thereby reduce the learning time. The demonstration shows that the complex QIP control problem, which is difficult to solve through classical model-based approaches, can be solved from measured data alone, without deriving an accurate mathematical model or designing a stability-guaranteed controller.
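
The abstract does not spell out how virtual experience replay uses the symmetry, so the following is only a minimal sketch of one plausible reading: each measured transition is stored together with its left-right mirrored copy. The function names, the transition layout, and the sign convention (every state component and the action flip sign under reflection, with angles measured from the upright position) are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Tuple

# A transition is (state, action, reward, next_state, done).
# This layout is hypothetical; the paper's replay buffer may differ.
Transition = Tuple[List[float], float, float, List[float], bool]


def mirror_transition(t: Transition) -> Transition:
    """Return the left-right mirrored copy of a transition.

    Assumption: cart position, link angles (measured from upright), and all
    velocities change sign under reflection, as does the cart action, while
    the reward and the termination flag are unchanged.
    """
    state, action, reward, next_state, done = t
    return ([-x for x in state], -action, reward, [-x for x in next_state], done)


def store_with_virtual_experience(buffer: List[Transition], t: Transition) -> None:
    """Append the measured transition and its mirrored (virtual) copy."""
    buffer.append(t)
    buffer.append(mirror_transition(t))
```

Under these assumptions every real interaction step yields two replay samples, which is one way a symmetric system can cut the amount of hardware data needed. If angles were encoded as sine and cosine pairs instead, only the sine terms would flip sign, so the mirroring rule would have to be adapted to the actual state definition.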

Keywords

Quadruple inverted pendulum; reinforcement learning; truncated quantile critic; virtual experience replay; SWING-UP; FEEDFORWARD; CART
DOI
10.1007/s12555-025-0235-y
Publication date
2025-09
Type
Article
Journal
International Journal of Control, Automation, and Systems
Volume 23, Issue 9
Pages
2797-2806