Reinforcement Learning to Achieve Real-time Control of a Quadruple Inverted Pendulum

Oh, Yookyung; Lee, Taegun; Ryoo, Sanghyun; Koh, Kyoung Chul; Han, Soohee; Lee, Young Sam

doi:10.1007/s12555-025-0235-y

Citations (Web of Science): 2

Citations (Scopus): 2

Abstract

For the first time in model-free control, real-time swing-up and balancing of a quadruple inverted pendulum (QIP) is achieved through reinforcement learning (RL) in a sample-efficient, data-driven manner. The policy network designed in this paper is a full state-feedback closed-loop controller that swings up and balances the QIP without any pre-computed reference trajectories or mathematical models. To this end, the work first defines the Markov decision process (MDP) for QIP control and then applies a state-of-the-art off-policy RL algorithm, truncated quantile critics (TQC), which uses the probability distribution of Q-values to accomplish the control objective. For high sample efficiency, virtual experience replay (VER) exploits the geometric symmetry of the system and thereby reduces the learning time. The demonstration shows that the complex QIP control problem, which is difficult to solve with classical model-based approaches, can be solved from measured data alone, without deriving an accurate mathematical model or designing a stability-guaranteed controller.
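The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The state layout (cart position, four link angles measured from upright, and their velocities), the assumption that the plant and reward are left-right symmetric so that negating state and action yields a valid transition, and the function names `mirror_transition` and `truncated_target_atoms` are all hypothetical choices made for illustration. The second function follows the general TQC idea of pooling the quantile atoms of all critics, sorting them, and dropping the largest ones before forming the target.

```python
import numpy as np


def mirror_transition(state, action, reward, next_state):
    """Build a virtual transition by mirroring a measured one.

    Assumption (not from the record): every state component and the
    action change sign under a left-right reflection, and the reward
    is unchanged by that reflection.
    """
    return -state, -action, reward, -next_state


def truncated_target_atoms(next_atoms, reward, gamma, drop_per_critic,
                           alpha=0.0, log_prob=0.0):
    """Form truncated target atoms in the style of TQC.

    next_atoms: array of shape (n_critics, n_quantiles) holding the
    target critics' quantile estimates for the next state-action pair.
    The atoms of all critics are pooled and sorted, and the
    drop_per_critic * n_critics largest atoms are discarded to reduce
    overestimation. alpha and log_prob give the optional entropy term.
    """
    n_critics = next_atoms.shape[0]
    pooled = np.sort(next_atoms.reshape(-1))
    keep = pooled.size - drop_per_critic * n_critics
    return reward + gamma * (pooled[:keep] - alpha * log_prob)


# Hypothetical 10-dimensional QIP state: [x, th1..th4, dx, dth1..dth4].
s = np.array([0.1, 0.2, -0.1, 0.0, 0.3, 0.5, -0.4, 0.2, 0.0, 0.1])
a = np.array([0.8])
s_next = s + 0.01

# Store both the measured and the mirrored transition in the buffer.
replay_buffer = [(s, a, 1.0, s_next), mirror_transition(s, a, 1.0, s_next)]
```

Under these assumptions each measured transition contributes two samples to the replay buffer, which is one way a symmetric structure can shorten learning time on hardware.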

Keywords

Quadruple inverted pendulum; reinforcement learning; truncated quantile critic; virtual experience replay; SWING-UP; FEEDFORWARD; CART

Publication date: 2025-09

Type: Article

Journal: International Journal of Control, Automation, and Systems

Volume: 23

Issue: 9

Pages: 2797-2806