Autonomous robotic manipulation is becoming increasingly important in both industrial and household contexts, with applications ranging from object manipulation on assembly lines to fruit sampling and harvesting in smart farms. Fast-learning, robust algorithms that run on readily available industrial robotic platforms are a necessity. This thesis addresses the challenge of exploration in deep reinforcement learning for robotic manipulation tasks. In sparse reward settings, an agent receives almost no positive feedback until it reaches the goal by chance, and this becomes increasingly unlikely as control sequences grow longer. To address this, we propose a two-pronged approach. First, we extend the sparse reward function so that it shapes the manipulation strategy based on sensor data, granting intrinsic rewards for desired behaviours and penalties for undesired ones. Second, we introduce an improvement to Hindsight Experience Replay (HER) specifically for robotic manipulation. Unlike previous methods that replay experiences uniformly at random, our approach prioritizes replays based on a "trajectory energy" function. This function considers the target object’s potential, kinetic, and rotational energy throughout the episode, favouring experiences in which more work is done on the object. To further enhance exploration, we also incorporate demonstrations alongside prioritization. This combined approach achieves better performance and sample efficiency than existing methods, without incurring additional computational cost.
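The abstract does not give the exact form of the trajectory energy function, so the following is only a minimal sketch of how energy-based replay prioritization could look. It assumes a point-mass object with a scalar moment of inertia, velocities estimated by finite differences, and trajectory energy defined as the sum of positive energy increases between timesteps. The names `trajectory_energy` and `replay_probabilities`, and the parameters `mass`, `inertia`, `dt`, and `clip_max`, are hypothetical and do not come from the thesis.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2


def trajectory_energy(positions, angular_velocities, dt,
                      mass=1.0, inertia=1.0, clip_max=None):
    """Sum of positive energy increases of the object over one episode.

    positions:          (T, 3) object positions in metres, z axis pointing up
    angular_velocities: (T, 3) object angular velocities in rad/s
    mass, inertia:      assumed object properties (illustrative defaults)
    clip_max:           optional cap on the per-step energy increase
    """
    positions = np.asarray(positions, dtype=float)
    angular_velocities = np.asarray(angular_velocities, dtype=float)

    # Potential energy from the object's height.
    potential = mass * G * positions[:, 2]

    # Linear velocity by finite differences; object assumed at rest at t = 0.
    velocities = np.diff(positions, axis=0) / dt
    kinetic = 0.5 * mass * np.sum(velocities ** 2, axis=1)
    kinetic = np.concatenate([[0.0], kinetic])

    # Rotational energy with a scalar moment of inertia (a simplification).
    rotational = 0.5 * inertia * np.sum(angular_velocities ** 2, axis=1)

    total = potential + kinetic + rotational

    # Only count steps where the object's energy increased.
    increases = np.clip(np.diff(total), 0.0, clip_max)
    return float(increases.sum())


def replay_probabilities(energies):
    """Turn per-episode energies into replay sampling probabilities."""
    energies = np.asarray(energies, dtype=float)
    total = energies.sum()
    if total <= 0.0:
        # No episode moved the object: fall back to uniform sampling.
        return np.full(len(energies), 1.0 / len(energies))
    return energies / total
```

With probabilities `p` from `replay_probabilities`, an episode index could then be drawn with `np.random.default_rng().choice(len(p), p=p)` before applying the usual HER goal relabelling. Episodes in which the object never moves receive zero energy and are sampled only when every stored episode is equally uninformative.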
| Date of Award | 1 May 2024 |
|---|---|
| Original language | American English |
| Supervisor | Irfan Hussain (Supervisor) |
- Deep learning in grasping and manipulation
- Intrinsic Motivation
- Reinforcement Learning
- Prioritized Replay
- Learning from demonstration
- Hindsight Experience
- Energy (Physics)
Autonomous Object Manipulation Using Reinforcement Learning
Abdul, M. (Author). 1 May 2024
Student thesis: Master's Thesis