$QD$-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations

Kar, Soummya; Moura, José M. F.; Poor, H. Vincent

doi:10.1109/TSP.2013.2241057

arXiv:1205.0047 (stat)

[Submitted on 30 Apr 2012 (v1), last revised 25 Oct 2012 (this version, v2)]

Abstract: The paper considers a class of multi-agent Markov decision processes (MDPs) in which the network agents respond differently (as manifested by the instantaneous one-stage random costs) to a global controlled state and the control actions of a remote controller. The paper investigates a distributed reinforcement learning setup with no prior information on the global state transition and local agent cost statistics. Specifically, with the agents' objective being to minimize a network-averaged infinite-horizon discounted cost, the paper proposes a distributed version of $Q$-learning, $\mathcal{QD}$-learning, in which the network agents collaborate by means of local processing and mutual information exchange over a sparse (possibly stochastic) communication network to achieve the network goal. Under the assumption that each agent is aware only of its local online cost data and that the inter-agent communication network is weakly connected, the proposed distributed scheme is shown to yield asymptotically, almost surely (a.s.), the desired value function and the optimal stationary control policy at each network agent. The analytical techniques developed in the paper to address the mixed time-scale stochastic dynamics of the consensus + innovations form, which arise as a result of the proposed interactive distributed scheme, are of independent interest.
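
The consensus + innovations idea described in the abstract can be illustrated with a minimal sketch. This is not taken from the paper: the function name `qd_learning_step`, the nested-list layout `Q[n][s][a]`, the fixed neighbor lists, and the constant weights `alpha` and `beta` are all assumptions made for illustration. Each agent combines a consensus term (its disagreement with neighboring agents' estimates) with an innovation term (a local temporal-difference error built from its own observed cost, using a minimum because the objective is a cost). The "mixed time-scale" wording in the abstract suggests the two weights would decay at different rates in practice; they are kept constant here for simplicity.

```python
def qd_learning_step(Q, neighbors, state, action, next_state,
                     costs, alpha, beta, gamma):
    """One synchronous consensus + innovations update (illustrative sketch).

    Q[n][s][a]   : agent n's current estimate for state s, action a
    neighbors[n] : agents whose estimates agent n receives (assumed fixed here)
    costs[n]     : one-stage cost observed locally by agent n
    alpha, beta  : innovation and consensus weights (assumed constants)
    gamma        : discount factor
    """
    # Copy so that every agent updates from the same previous estimates.
    new_Q = [[row[:] for row in Qn] for Qn in Q]
    for n, Qn in enumerate(Q):
        current = Qn[state][action]
        # Consensus: accumulated disagreement with neighboring agents.
        consensus = sum(current - Q[l][state][action] for l in neighbors[n])
        # Innovation: local TD error; min over actions since costs are minimized.
        innovation = costs[n] + gamma * min(Qn[next_state]) - current
        new_Q[n][state][action] = current - beta * consensus + alpha * innovation
    return new_Q


# Two agents, two states, two actions, all estimates initially zero.
Q0 = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
Q1 = qd_learning_step(Q0, neighbors=[[1], [0]], state=0, action=0,
                      next_state=1, costs=[1.0, 3.0],
                      alpha=0.5, beta=0.25, gamma=0.9)
print([Qn[0][0] for Qn in Q1])  # [0.5, 1.5]
```

In this toy run the agents start in agreement, so the first step is driven by the innovation term alone. On later steps the consensus term pulls the two estimates toward each other, which is how purely local cost observations can contribute to a network-averaged objective.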

Comments:	Submitted to the IEEE Transactions on Signal Processing, 33 pages
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Multiagent Systems (cs.MA); Optimization and Control (math.OC); Probability (math.PR)
Cite as:	arXiv:1205.0047 [stat.ML]
	(or arXiv:1205.0047v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1205.0047
Related DOI:	https://doi.org/10.1109/TSP.2013.2241057

Submission history

From: Soummya Kar
[v1] Mon, 30 Apr 2012 22:48:37 UTC (31 KB)
[v2] Thu, 25 Oct 2012 01:59:10 UTC (94 KB)
