Reinforcement Learning to Achieve Real-time Control of a Quadruple Inverted Pendulum
- Oh, Yookyung;
- Lee, Taegun;
- Ryoo, Sanghyun;
- Koh, Kyoung Chul;
- Han, Soohee;
- Lee, Young Sam
Web of Science: 2
Scopus: 2
Abstract
For the first time in model-free control, the real-time swing-up and balancing task of a quadruple inverted pendulum (QIP) is successfully performed through reinforcement learning (RL) in a sample-efficient, data-driven manner. The policy network designed in this paper is a full state-feedback closed-loop controller that swings up and balances a QIP efficiently without any pre-computed reference trajectories or mathematical models. To do so, this work first defines the Markov decision process (MDP) for QIP control and then applies a state-of-the-art off-policy RL algorithm, truncated quantile critics (TQC), which accomplishes the control objective by using the probability distribution of Q-values. For high sample efficiency, virtual experience replay (VER) is used to exploit the geometrically symmetric structure of the system and thereby reduce the learning time. The demonstration shows that the complex QIP control problem, which is difficult to solve through classical model-based approaches, can be solved from measured data alone, without deriving an accurate mathematical model or designing a stability-guaranteed controller.
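The abstract does not spell out how VER generates its virtual samples, so the following is only a minimal sketch of one plausible reading: each measured transition is stored together with its left-right mirror image. It assumes a hypothetical state layout in which every component (cart position, the four link angles measured from upright, and their velocities) and the scalar action change sign under reflection, and that the reward is unchanged by the reflection. The names `mirror_transition` and `MirroredReplayBuffer` are illustrative and do not come from the paper.

```python
import numpy as np


def mirror_transition(state, action, reward, next_state, done):
    """Return the left-right mirrored copy of one transition.

    Assumption (not from the paper): all state components and the action
    are signed quantities that flip sign under reflection, and the reward
    is symmetric, so it is copied unchanged.
    """
    return -state, -action, reward, -next_state, done


class MirroredReplayBuffer:
    """FIFO replay buffer that stores each real transition plus its mirror."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.storage = []
        self.rng = np.random.default_rng(seed)

    def add(self, state, action, reward, next_state, done):
        real = (
            np.asarray(state, dtype=float),
            np.asarray(action, dtype=float),
            float(reward),
            np.asarray(next_state, dtype=float),
            bool(done),
        )
        # One measured sample yields two stored samples.
        for transition in (real, mirror_transition(*real)):
            if len(self.storage) >= self.capacity:
                self.storage.pop(0)  # drop the oldest entry
            self.storage.append(transition)

    def sample(self, batch_size):
        # Uniform sampling with replacement over real and virtual samples.
        idx = self.rng.integers(0, len(self.storage), size=batch_size)
        states, actions, rewards, next_states, dones = zip(
            *(self.storage[i] for i in idx)
        )
        return (
            np.stack(states),
            np.stack(actions),
            np.array(rewards),
            np.stack(next_states),
            np.array(dones),
        )
```

Under these assumptions, an off-policy learner such as TQC would draw its minibatches from this buffer exactly as it would from an ordinary one, while seeing twice as many transitions per real interaction step.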
Keywords
- Title
- Reinforcement Learning to Achieve Real-time Control of a Quadruple Inverted Pendulum
- Authors
- Oh, Yookyung; Lee, Taegun; Ryoo, Sanghyun; Koh, Kyoung Chul; Han, Soohee; Lee, Young Sam
- Publication date
- 2025-09
- Type
- Article
- Volume
- 23
- Issue
- 9
- Pages
- 2797-2806