Reinforcement learning for balancing a flying inverted pendulum


Abstract:

The problem of balancing an inverted pendulum on an unmanned aerial vehicle (UAV) has been solved using linear and nonlinear control approaches. However, to the best of our knowledge, it has not been solved using learning methods, even though the classical inverted pendulum is a common benchmark for evaluating learning techniques. In this paper, we demonstrate a novel solution to the inverted pendulum problem extended to UAVs, specifically quadrotors. This complex system is underactuated and sensitive to small changes in the quadrotor's acceleration. The solution is provided by reinforcement learning (RL), a framework commonly applied to nonlinear control problems. We generate a control policy to balance the pendulum using Continuous Action Fitted Value Iteration (CAFVI) [1], an RL algorithm for high-dimensional input spaces. This technique combines learning of both state and state-action value functions in an approximate value iteration setting with continuous inputs. Simulations verify the performance of the generated control policy for varying initial conditions. The results show that the control policy is computationally fast enough to be suitable for real-time control.

Year of publication:

2014

Keywords:

  • Quadrotor control
  • Inverted pendulum
  • Approximate value iteration
  • Aerial robotics
  • Reinforcement learning

Source:

Google
Scopus

Document type:

Conference Object

Status:

Restricted access

Knowledge areas:

  • Artificial intelligence
  • Control system

Subject areas:

  • Special computer methods