UAV Failure Recovery Using Deep Reinforcement Learning with Direct Simulation to Reality Transfer

  • Norhan Elocla

Student thesis: Master's Thesis

Abstract

Unmanned aerial vehicles (UAVs) are required to perform complex and dynamic tasks that demand high adaptability and precision in control. Developing robust and reliable flight controllers that can handle uncertain dynamics, including unexpected actuator failure, has become of great interest to industry. This project investigates the use of Reinforcement Learning (RL) to automate the design of controllers for complex trajectory planning and tracking under actuator failures. The objectives of the project are divided into three main parts. Firstly, a one-dimensional altitude RL controller was trained in simulation for two scenarios: simple hovering and altitude trajectory control to track an oscillating window. The simple hovering case was verified experimentally using a quadrotor platform, and the results were reported in a paper accepted at the International Conference on Advanced Robotics (ICAR). A video summary of this paper is available on YouTube. Secondly, a three-dimensional RL controller was trained to navigate a moving narrow window. Initially, this controller was trained using accelerations as inputs. Subsequently, a new controller that does not require accelerations was developed using curriculum learning. This second objective was also tested experimentally using a quadrotor platform. Thirdly, failures were introduced to one or more of the actuators in simulation to explore the adaptability of the RL agent in maintaining the narrow moving window traversal maneuver. The obtained results demonstrate significant improvements in the cautiousness of the RL agent and reduced RL training times for both the 1D and the 3D cases, thereby confirming that the project’s aim is attainable.
Date of Award: 12 Jul 2024
Original language: American English
Supervisor: Sean Swei

Keywords

  • Unmanned Aerial Vehicles
  • Reinforcement Learning
  • Complex trajectory
  • Failure
