NeurWIN: Neural Whittle Index Network for Restless Bandits Via Deep RL

Khaled Nakhleh, Santosh Ganji, Ping Chun Hsieh, I. Hong Hou, Srinivas Shakkottai

Research output: Conference contribution, peer-reviewed

11 Citations (Scopus)

Abstract

The Whittle index policy is a powerful tool for obtaining asymptotically optimal solutions to the notoriously intractable problem of restless bandits. However, finding the Whittle indices remains difficult for many practical restless bandits with convoluted transition kernels. This paper proposes NeurWIN, a neural Whittle index network that seeks to learn the Whittle indices for any restless bandit by leveraging mathematical properties of the Whittle indices. We show that a neural network that produces the Whittle index is also one that produces the optimal control for a set of Markov decision problems. This property motivates using deep reinforcement learning to train NeurWIN. We demonstrate the utility of NeurWIN by evaluating its performance on three recently studied restless bandit problems. Our experimental results show that NeurWIN performs significantly better than other RL algorithms.

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Editors: Marc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan
Publisher: Neural Information Processing Systems Foundation
Pages: 828-839
Number of pages: 12
ISBN (Electronic): 9781713845393
Publication status: Published - 2021
Event: 35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online
Duration: 6 Dec 2021 to 14 Dec 2021

Publication series

Name: Advances in Neural Information Processing Systems
2
ISSN (Print): 1049-5258

Conference

Conference: 35th Conference on Neural Information Processing Systems, NeurIPS 2021
City: Virtual, Online
Period: 6/12/21 to 14/12/21
