Playing congestion games with bandit feedbacks

Po-An Chen, Chi Jen Lu

Research output: Conference contribution › Peer-reviewed

7 citations (Scopus)

Abstract

Almost all convergence results for repeated games in which each player adopts a specific "no-regret" learning algorithm, such as multiplicative updates or the more general mirror-descent algorithms, are known only in the more generous full-information model, in which each player is assumed to observe the costs of all possible choices, even the unchosen ones, at each time step. This assumption may seem too strong in general, while a more realistic one is captured by the bandit model, in which each player at each time step observes only the cost of her currently chosen path, but not those of the unchosen ones. Can convergence still be achieved in this more challenging bandit model? We answer this question positively. While existing bandit algorithms do not seem to work here, we develop a new family of bandit algorithms based on mirror descent that carries such a convergence guarantee in atomic congestion games.
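The paper's own algorithm family is not reproduced here, but the abstract's setting can be illustrated with a minimal, generic sketch of bandit-feedback mirror descent over a single player's path distribution: an entropic (exponential-weights) update driven by an importance-weighted cost estimate built from the one cost the player actually observes, Exp3-style. All names and parameters below (`cost_of`, `eta`, `gamma`, the [0, 1] cost scaling) are illustrative assumptions, not the authors' construction.

```python
import numpy as np

def bandit_mirror_descent_player(num_paths, horizon, cost_of,
                                 eta=0.05, gamma=0.05, rng=None):
    """Generic Exp3-style sketch of one player's updates under bandit feedback.

    cost_of(t, path) -> cost in [0, 1] of `path` at round t; in a congestion
    game this cost would depend on the other players' simultaneous choices.
    """
    rng = rng or np.random.default_rng()
    weights = np.ones(num_paths)                 # exponential weights over paths
    for t in range(horizon):
        # Mix in uniform exploration so every path keeps positive probability.
        probs = (1 - gamma) * weights / weights.sum() + gamma / num_paths
        path = rng.choice(num_paths, p=probs)    # play one path
        cost = cost_of(t, path)                  # only this path's cost is revealed
        est = np.zeros(num_paths)
        est[path] = cost / probs[path]           # importance-weighted cost estimate
        weights *= np.exp(-eta * est)            # entropic mirror-descent step
    return weights / weights.sum()               # final path distribution
```

In an atomic congestion game, each player would run such an update in parallel, with `cost_of` determined by the congestion induced by everyone's chosen paths in that round; the paper's contribution is a family of bandit algorithms of this mirror-descent flavor for which the joint play provably converges.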

Original language: English
Host publication title: AAMAS 2015 - Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems
Editors: Rafael H. Bordini, Pinar Yolum, Edith Elkind, Gerhard Weiss
Publisher: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages: 1721-1722
Number of pages: 2
ISBN (electronic): 9781450337717
Publication status: Published - May 2015
Event: 14th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2015 - Istanbul, Turkey
Duration: 4 May 2015 - 8 May 2015

Publication series

Name: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Volume: 3
ISSN (print): 1548-8403
ISSN (electronic): 1558-2914

Conference

Conference: 14th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2015
Country/Territory: Turkey
City: Istanbul
Period: 4/05/15 - 8/05/15
