Reinforcement Learning Based Guidance for Minimizing Radar Exposure of UAV

Abstract

Unmanned aerial vehicles (UAVs) have been actively used in various fields because they can perform rapid maneuvers and omnidirectional motion at a lower cost than manned aircraft. UAVs are operated in tasks such as surveillance, combat operations, and reconnaissance. The operating environments include disaster areas with hazards such as wildfires or large geographical obstacles, and battlefields with radar detection systems. In these scenarios, a UAV may encounter multiple threats on its path from the initial point to the target point, so it is important that it follows a safe and efficient path. This paper proposes a guidance method based on reinforcement learning (RL) that minimizes radar exposure. In the mission scenario, the UAV flies to a target point while multiple radars track it, and the goal is to minimize the detection probability. To calculate the received power, POFACETS, a physical-optics-based radar analysis tool, is used to analyze the radar cross section (RCS) of the UAV, which is assumed to be a conductor. The analysis considers a monostatic radar setup with various frequency bands, covering elevation angles from -90° to 90° and azimuth angles from -180° to 180°. The received power and signal-to-noise ratio (SNR) are calculated from this analysis. To simulate the flight to the target, two-dimensional 3-DOF dynamics are applied, and the pure proportional navigation (PPN) guidance law is used to move toward the target point. The goal of the RL agent is to steer the PPN commands derived from the geometric relationship so that the resulting acceleration commands approach the target while reducing the SNR. The RL agent's action is a single angle, which defines a virtual axis onto which the velocity vector is projected. Observations in the RL model include the initial relative positions of the target point, the radars, and the aircraft, as well as the actions derived from the policy.
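As a rough illustration of the baseline guidance described above, the following sketch simulates pure proportional navigation toward a stationary target with planar point-mass kinematics at constant speed. It is not the authors' implementation: the function names, the navigation gain of 3, the Euler integration step, and the capture radius are all assumptions chosen for the example, and the RL agent's angle action is not modeled.

```python
import math


def los_rate(px, py, vx, vy, tx, ty):
    """Line-of-sight angular rate (rad/s) to a stationary target at (tx, ty)."""
    rx, ry = tx - px, ty - py
    # d/dt atan2(ry, rx) with the relative velocity equal to -(vx, vy)
    return (ry * vx - rx * vy) / (rx * rx + ry * ry)


def simulate_ppn(start, heading, speed, target, nav_gain=3.0, dt=0.01,
                 capture_radius=5.0, max_steps=100_000):
    """Fly a constant-speed point mass to `target` under PPN (assumed setup).

    Returns the flown path and the final distance to the target.
    """
    px, py = start
    tx, ty = target
    path = [(px, py)]
    for _ in range(max_steps):
        if math.hypot(tx - px, ty - py) <= capture_radius:
            break
        vx, vy = speed * math.cos(heading), speed * math.sin(heading)
        # PPN: lateral acceleration normal to the velocity, a = N * V * LOS rate
        accel = nav_gain * speed * los_rate(px, py, vx, vy, tx, ty)
        # Explicit Euler update of heading and position
        heading += (accel / speed) * dt
        px += vx * dt
        py += vy * dt
        path.append((px, py))
    return path, math.hypot(tx - px, ty - py)


if __name__ == "__main__":
    path, miss = simulate_ppn((0.0, 0.0), 0.0, 50.0, (1000.0, 500.0))
    print(f"steps: {len(path) - 1}, final range: {miss:.2f} m")
```

Because the target is stationary, the heading error shrinks as the range closes, which is why the trajectory is fixed once the initial engagement geometry is chosen.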
The reference trajectory to the target under PPN is determined by the initial engagement geometry. In the RL framework, however, the direction of the acceleration command is generated at each step from the observed target and radar positions. These commands are continuously evaluated through the reward function to determine whether they contribute effectively to reducing radar exposure. The environment propagates the aircraft state using the 3-DOF equations of motion and the PPN commands as modified by the action. The geometry between the aircraft and the radars is used to calculate the received power, with smaller values receiving higher rewards. To validate the learned model, simulations were conducted considering the aircraft's initial velocity and the locations of the radars. The proposed method demonstrates the potential of RL to generate guidance commands that minimize the radar exposure of UAVs. According to the simulation results, the RL agent reduces the average SNR by 0.2 dB at the cost of a slightly longer path than the PPN trajectory. © (2024) by Engineers Australia. All rights reserved.
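The link between geometry, received power, and reward can be sketched with the standard monostatic radar range equation, P_r = P_t G² λ² σ / ((4π)³ R⁴), and a thermal-noise SNR. The code below is only an assumed illustration: the paper uses an aspect-dependent RCS from POFACETS, whereas this sketch takes a constant RCS, and the function names, parameter names, and the reward shape (negative total received power in dBW) are hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def received_power(pt_w, gain, wavelength_m, rcs_m2, range_m):
    """Monostatic radar range equation; returns received power in watts."""
    return (pt_w * gain ** 2 * wavelength_m ** 2 * rcs_m2
            / ((4.0 * math.pi) ** 3 * range_m ** 4))


def snr_db(pr_w, bandwidth_hz, noise_figure=1.0, temp_k=290.0):
    """SNR in dB against thermal noise k * T * B * F (linear noise figure)."""
    noise_w = BOLTZMANN * temp_k * bandwidth_hz * noise_figure
    return 10.0 * math.log10(pr_w / noise_w)


def exposure_reward(uav_xy, radars_xy, **radar):
    """Hypothetical reward: lower total received power gives a higher value.

    `radar` holds pt_w, gain, wavelength_m and rcs_m2, shared by all radars
    (an assumption; constant RCS stands in for the aspect-dependent one).
    """
    total_w = 0.0
    for rx, ry in radars_xy:
        rng = math.hypot(uav_xy[0] - rx, uav_xy[1] - ry)
        total_w += received_power(range_m=rng, **radar)
    return -10.0 * math.log10(total_w)


if __name__ == "__main__":
    radar = dict(pt_w=1e5, gain=1000.0, wavelength_m=0.03, rcs_m2=0.1)
    radars = [(0.0, 0.0), (5e3, 0.0)]
    for pos in [(0.0, 10e3), (0.0, 20e3)]:
        print(pos, round(exposure_reward(pos, radars, **radar), 2))
```

The R⁴ dependence means that doubling the range to a radar lowers the received power, and hence the SNR, by about 12 dB, so even small detours away from a radar can change the reward noticeably.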

Title
Reinforcement Learning Based Guidance for Minimizing Radar Exposure of UAV
Authors
Kim, Hyo-jung; Ahn, Cho Rok; Ryoo, C. K.
Publication date
2024
Type
Conference paper
2
Pages
1271 ~ 1283