Deep reinforcement learning for the computation offloading in MIMO-based Edge Computing

Abdeladim Sadiki, Jamal Bentahar, Rachida Dssouli, Abdeslam En-Nouaary, Hadi Otrok

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

Multi-access Edge Computing (MEC) has recently emerged as a promising technology for serving the needs of mobile devices (MDs) in 5G and 6G cellular networks. By offloading tasks to high-performance servers installed at the edge of the wireless network, resource-limited MDs can cope with the proliferation of computationally intensive applications. In this paper, we study the computation offloading problem in a massive multiple-input multiple-output (MIMO)-based MEC system where the base stations are equipped with a large number of antennas. Our objective is to minimize the power consumption and offloading delay at the MDs under a stochastic system environment. To this end, we introduce a new formulation of the problem as a Markov Decision Process (MDP) and propose two Deep Reinforcement Learning (DRL) algorithms that learn the optimal offloading policy without any prior knowledge of the environment dynamics. First, we define a Deep Q-Network (DQN)-based algorithm to address the state-space explosion. Then, we introduce a more general Proximal Policy Optimization (PPO)-based algorithm to address the problem of the discrete action space. Simulation results show that our DRL-based solutions outperform state-of-the-art algorithms. Moreover, our PPO algorithm exhibits stable performance and efficient offloading results compared with the benchmark DQN and Double DQN (DDQN) strategies.
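
As a rough illustration of the methods named in the abstract, the Python sketch below shows the standard bootstrap targets used by DQN and Double DQN and the per-sample clipped surrogate term used by PPO. It is not the authors' implementation. The reward function (a negative weighted sum of power and delay), its weight, and all function and parameter names are assumptions introduced here for illustration only.

```python
from typing import Sequence


def cost_reward(power_w: float, delay_s: float, weight: float = 0.5) -> float:
    """Hypothetical reward: negative weighted sum of power and delay.

    The weighting scheme is an assumption, not taken from the paper.
    """
    return -(weight * power_w + (1.0 - weight) * delay_s)


def dqn_target(reward: float, gamma: float,
               next_q_target: Sequence[float]) -> float:
    """Standard DQN target: r + gamma * max_a Q_target(s', a)."""
    return reward + gamma * max(next_q_target)


def ddqn_target(reward: float, gamma: float,
                next_q_online: Sequence[float],
                next_q_target: Sequence[float]) -> float:
    """Double DQN target: the online network selects the action,
    the target network evaluates it."""
    best_action = max(range(len(next_q_online)),
                      key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


def ppo_clipped_term(ratio: float, advantage: float,
                     epsilon: float = 0.2) -> float:
    """Per-sample PPO clipped surrogate:
    min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
    return min(ratio * advantage, clipped_ratio * advantage)
```

In a full agent, the Q-value lists would come from neural networks evaluated on the next state, and the PPO term would be averaged over a batch and maximized by gradient ascent; both steps are omitted here.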

Original language: British English
Article number: 103080
Journal: Ad Hoc Networks
Volume: 141
State: Published - 15 Mar 2023

Keywords

  • Computation offloading
  • Deep reinforcement learning
  • Massive multiple-input multiple-output
  • Multi-access Edge Computing
