TY - JOUR
T1 - Reinforcement Learning-Based Collision Avoidance and Optimal Trajectory Planning in UAV Communication Networks
AU - Hsu, Yu Hsin
AU - Gau, Rung Hung
N1 - Publisher Copyright: © 2002-2012 IEEE.
PY - 2022/1/1
Y1 - 2022/1/1
AB - In this paper, we propose a reinforcement learning approach to collision avoidance and investigate optimal trajectory planning for unmanned aerial vehicle (UAV) communication networks. Specifically, each UAV is responsible for delivering objects on the forward path and collecting data from heterogeneous ground IoT devices on the backward path. We adopt reinforcement learning to help UAVs learn collision avoidance without knowing the trajectories of other UAVs in advance. In addition, for each UAV, we use optimization theory to find a shortest backward path that ensures data collection from all associated IoT devices. To obtain an optimal visiting order for IoT devices, we formulate and solve a no-return traveling salesman problem. Given a visiting order, we formulate and solve a sequence of convex optimization problems to obtain the line segments of an optimal backward path for heterogeneous ground IoT devices. We use analytical results and simulation results to justify the use of the proposed approach. Simulation results show that the proposed approach is superior to a number of alternative approaches.
KW - convex optimization
KW - optimal trajectory planning
KW - Reinforcement learning
KW - traveling salesman problem with neighborhood
KW - UAV collision avoidance
UR - http://www.scopus.com/inward/record.url?scp=85121042226&partnerID=8YFLogxK
U2 - 10.1109/TMC.2020.3003639
DO - 10.1109/TMC.2020.3003639
M3 - Article
AN - SCOPUS:85121042226
SN - 1536-1233
VL - 21
SP - 306
EP - 320
JO - IEEE Transactions on Mobile Computing
JF - IEEE Transactions on Mobile Computing
IS - 1
ER -