TY - JOUR
T1 - Deep reinforcement learning for UAV navigation through massive MIMO technique
AU - Huang, Hongji
AU - Yang, Yuchun
AU - Wang, Hong
AU - Ding, Zhiguo
AU - Sari, Hikmet
AU - Adachi, Fumiyuki
N1 - Funding Information:
Manuscript received July 9, 2019; revised September 15, 2019; accepted November 5, 2019. Date of publication November 8, 2019; date of current version January 15, 2020. This work was supported in part by the National Natural Science Foundation of China under Grant 61801246, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20170910, in part by the China Postdoc Innovation Talent Supporting Program under Grant BX20180143, in part by the Open Research Foundation of National Mobile Communications Research Laboratory of Southeast University under Grant 2018D09, and in part by the NUPTSF under Grant NY217005 and Grant NY217031. The review of this article was coordinated by Dr. F. Tang. (Corresponding author: Hong Wang.) H. Huang is with the School of Communication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China (e-mail: [email protected]).
Publisher Copyright:
© 2020 IEEE.
PY - 2020/1
Y1 - 2020/1
N2 - The unmanned aerial vehicle (UAV) technique has been recognized as a promising solution for future wireless connectivity from the sky, and UAV navigation is one of the most significant open research problems, attracting wide interest in the research community. However, current UAV navigation schemes cannot capture UAV motion or select the best UAV-ground links in real time, and these weaknesses degrade navigation performance. To tackle these fundamental limitations, in this paper we merge state-of-the-art deep reinforcement learning with UAV navigation through the massive multiple-input-multiple-output (MIMO) technique. Specifically, we carefully design a deep Q-network (DQN) that optimizes UAV navigation by selecting the optimal policy, and we propose a learning mechanism for training the DQN. The DQN is trained so that the agent can make decisions based on the received signal strengths to navigate the UAVs with the aid of Q-learning. Simulation results corroborate the superiority of the proposed schemes over other schemes in terms of coverage and convergence.
AB - The unmanned aerial vehicle (UAV) technique has been recognized as a promising solution for future wireless connectivity from the sky, and UAV navigation is one of the most significant open research problems, attracting wide interest in the research community. However, current UAV navigation schemes cannot capture UAV motion or select the best UAV-ground links in real time, and these weaknesses degrade navigation performance. To tackle these fundamental limitations, in this paper we merge state-of-the-art deep reinforcement learning with UAV navigation through the massive multiple-input-multiple-output (MIMO) technique. Specifically, we carefully design a deep Q-network (DQN) that optimizes UAV navigation by selecting the optimal policy, and we propose a learning mechanism for training the DQN. The DQN is trained so that the agent can make decisions based on the received signal strengths to navigate the UAVs with the aid of Q-learning. Simulation results corroborate the superiority of the proposed schemes over other schemes in terms of coverage and convergence.
KW - Deep reinforcement learning
KW - Massive multiple-input-multiple-output (MIMO)
KW - UAV navigation
UR - http://www.scopus.com/inward/record.url?scp=85078441706&partnerID=8YFLogxK
U2 - 10.1109/TVT.2019.2952549
DO - 10.1109/TVT.2019.2952549
M3 - Article
AN - SCOPUS:85078441706
SN - 0018-9545
VL - 69
SP - 1117
EP - 1121
JO - IEEE Transactions on Vehicular Technology
JF - IEEE Transactions on Vehicular Technology
IS - 1
M1 - 8894381
ER -