TY - GEN
T1 - Collaborative Deep Reinforcement Learning for Resource Optimization in Non-Terrestrial Networks
AU - Cao, Yang
AU - Lien, Shao Yu
AU - Liang, Ying Chang
AU - Niyato, Dusit
AU - Shen, Xuemin Sherman
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Non-terrestrial networks (NTNs) with low-earth orbit (LEO) satellites have been regarded as a promising means of supporting global ubiquitous wireless services. Due to the rapid mobility of LEO satellites, inter-beam/satellite handovers happen frequently for a specific user equipment (UE). To tackle this issue, earth-fixed cell scenarios have been studied, in which the LEO satellite steers its beam towards a fixed area within its dwell duration to maintain stable transmission performance for the UE. This requires the LEO satellite to perform real-time resource allocation, which, however, is unaffordable for an LEO satellite with limited computing capability. To address this issue, we propose a two-time-scale collaborative deep reinforcement learning (DRL) scheme for beam management and resource allocation in NTNs, in which the LEO satellite and the UE, which have different control cycles, update their decision-making policies in a sequential manner. Specifically, the UE updates its policy subject to improving the value functions of both agents. Furthermore, the LEO satellite makes decisions only through finite-step rollouts with a reference decision trajectory received from the UE. Simulation results show that the proposed scheme effectively balances throughput performance and computational complexity compared with traditional greedy-searching schemes.
KW - Non-terrestrial networks (NTNs)
KW - deep reinforcement learning (DRL)
KW - earth-fixed cell
KW - multi-time-scale Markov decision processes (MMDPs)
KW - resource allocation
UR - http://www.scopus.com/inward/record.url?scp=85178278583&partnerID=8YFLogxK
U2 - 10.1109/PIMRC56721.2023.10294047
DO - 10.1109/PIMRC56721.2023.10294047
M3 - Conference contribution
AN - SCOPUS:85178278583
T3 - IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC
BT - 2023 IEEE 34th Annual International Symposium on Personal, Indoor and Mobile Radio Communications
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE Annual International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC 2023
Y2 - 5 September 2023 through 8 September 2023
ER -