Playing congestion games with bandit feedbacks

Po-An Chen, Chi Jen Lu

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

6 Scopus citations

Abstract

Almost all convergence results from each player adopting specific "no-regret" learning algorithms such as multiplicative updates or the more general mirror-descent algorithms in repeated games are only known in the more generous information model, in which each player is assumed to have access to the costs of all possible choices, even the unchosen ones, at each tune step. This assumption in general may seem too strong, while a more realistic one is captured by the bandit model, in which each player at each time step is restricted to know only the cost of her currently chosen path, but not any of the unchosen ones. Can convergence still be achieved in such a more challenging bandit model? We answer this question positively. While existing bandit algorithms do not seem to work here, we develop a new family of bandit algorithms based on the mirror-descent algorithm with such a guarantee in atomic congestion games.

Original languageEnglish
Title of host publicationAAMAS 2015 - Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems
EditorsRafael H. Bordini, Pinar Yolum, Edith Elkind, Gerhard Weiss
PublisherInternational Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages1721-1722
Number of pages2
ISBN (Electronic)9781450337717
StatePublished - May 2015
Event14th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2015 - Istanbul, Turkey
Duration: 4 May 20158 May 2015

Publication series

NameProceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Volume3
ISSN (Print)1548-8403
ISSN (Electronic)1558-2914

Conference

Conference14th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2015
Country/TerritoryTurkey
CityIstanbul
Period4/05/158/05/15

Keywords

  • Convergence
  • Mirror-descent algorithm
  • No-regret dynamics

Fingerprint

Dive into the research topics of 'Playing congestion games with bandit feedbacks'. Together they form a unique fingerprint.

Cite this