Sim-to-Real Reinforcement Learning for a Rotary Double-Inverted Pendulum Based on a Mathematical Model

Ju, Doyoon; Lee, Jongbeom; Lee, Young Sam

doi:10.3390/math13121996

Citations: Web of Science 0; Scopus 2

Abstract

This paper proposes a transition control strategy for a rotary double-inverted pendulum (RDIP) system using a sim-to-real reinforcement learning (RL) controller, built upon mathematical modeling and parameter estimation. High-resolution sensor data are used to estimate key physical parameters, ensuring model fidelity for simulation. The resulting mathematical model serves as the training environment in which the RL agent learns to perform transitions between various initial conditions and target equilibrium configurations. The training process adopts the Truncated Quantile Critics (TQC) algorithm, with a reward function specifically designed to reflect the nonlinear characteristics of the system. The learned policy is directly deployed on physical hardware without additional tuning or calibration, and the TQC-based controller successfully achieves all four equilibrium transitions. Furthermore, the controller exhibits robust recovery properties under external disturbances, demonstrating its effectiveness as a reliable sim-to-real control approach for high-dimensional nonlinear systems.
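The abstract names Truncated Quantile Critics (TQC) as the training algorithm. As a rough illustration of the step that gives TQC its name, the sketch below pools the quantile estimates ("atoms") from several critics, sorts them, discards the largest ones, and builds the bootstrapped target atoms. It is a minimal, dependency-free sketch of the generic TQC target, not the authors' implementation; the function name, argument names, and the numbers in the usage example are hypothetical, and the paper's actual hyperparameters are not stated in this record.

```python
def truncated_target_atoms(critic_atoms, reward, gamma, drop_per_critic,
                           entropy_term=0.0, done=False):
    """Build TQC-style target atoms for one transition (illustrative sketch).

    critic_atoms    : list of N lists, each holding M quantile estimates of the
                      return for the next state-action pair (one list per critic).
    reward, gamma   : immediate reward and discount factor.
    drop_per_critic : number of largest atoms to discard per critic (assumed name).
    entropy_term    : alpha * log pi(a'|s') for the entropy-regularised target.
    done            : True if the episode terminated at this transition.
    """
    # Pool the atoms of all critics into one list and sort ascending.
    pooled = sorted(z for atoms in critic_atoms for z in atoms)

    # Discard the largest atoms to curb overestimation of the return.
    n_drop = drop_per_critic * len(critic_atoms)
    if n_drop < 0 or n_drop >= len(pooled):
        raise ValueError("drop_per_critic must leave at least one atom")
    kept = pooled[:len(pooled) - n_drop]

    # Bootstrap each remaining atom; no bootstrapping after a terminal state.
    not_done = 0.0 if done else 1.0
    return [reward + not_done * gamma * (z - entropy_term) for z in kept]


# Hypothetical usage: two critics with three atoms each, one atom dropped per critic.
targets = truncated_target_atoms(
    critic_atoms=[[1.0, 2.0, 5.0], [0.0, 3.0, 4.0]],
    reward=1.0, gamma=0.5, drop_per_critic=1,
)
```

In a full agent these target atoms would feed a quantile Huber loss for each critic; that part, the policy update, and the RDIP simulation environment are omitted here.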

Keywords

reinforcement learning; rotary double-inverted pendulum; sim2real transfer; system identification; model-based learning; swing-up; cart

Title: Sim-to-Real Reinforcement Learning for a Rotary Double-Inverted Pendulum Based on a Mathematical Model

Authors: Ju, Doyoon; Lee, Jongbeom; Lee, Young Sam

DOI: 10.3390/math13121996

Publication date: 2025-06

Type: Article

Journal: Mathematics

Volume: 13

Issue: 12