A new accelerating algorithm for multi-agent reinforcement learning

Source: Journal of Harbin Institute of Technology (English Edition)
In multi-agent systems, joint actions must be employed to achieve cooperation, because the evaluation of one agent's behavior often depends on the behaviors of the other agents. However, joint-action reinforcement learning algorithms suffer from slow convergence because of the enormous learning space that joint actions produce. In this article, a prediction-based reinforcement learning algorithm is presented for multi-agent cooperation tasks; it requires every agent to learn to predict the probabilities of the actions that the other agents may execute. A multi-robot cooperation experiment is run to test the efficacy of the new algorithm, and the experimental results show that the new algorithm reaches a cooperative policy much faster than the primitive reinforcement learning algorithm.
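The abstract does not state the update rule, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes a tabular joint-action Q-learner with one other agent, and it assumes that the prediction of the other agent's action is a simple frequency count per state. The class name `PredictiveAgent`, its methods, and all parameter values are hypothetical.

```python
import random
from collections import defaultdict


class PredictiveAgent:
    """Illustrative joint-action Q-learner (hypothetical, not the paper's method).

    It keeps Q-values over (own action, other agent's action) pairs and
    predicts the other agent's action from observed frequencies.
    """

    def __init__(self, n_actions, n_other_actions,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        # q[state][own_action][other_action], initialised to zero.
        self.q = defaultdict(
            lambda: [[0.0] * n_other_actions for _ in range(n_actions)])
        # counts[state][other_action], starting at 1 (uniform prior, assumed).
        self.counts = defaultdict(lambda: [1.0] * n_other_actions)

    def predict(self, state):
        """Estimated probability of each action of the other agent."""
        counts = self.counts[state]
        total = sum(counts)
        return [c / total for c in counts]

    def expected_values(self, state):
        """Value of each own action, averaged over the predicted other action."""
        probs = self.predict(state)
        return [sum(p * q for p, q in zip(probs, row))
                for row in self.q[state]]

    def act(self, state):
        """Epsilon-greedy choice on the expected values."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.expected_values(state)
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, own, other, reward, next_state):
        """Refine the prediction, then apply a Q-learning style update."""
        self.counts[state][other] += 1.0
        target = reward + self.gamma * max(self.expected_values(next_state))
        self.q[state][own][other] += self.alpha * (
            target - self.q[state][own][other])


if __name__ == "__main__":
    # Stateless coordination game: reward 1 when both agents pick the same action.
    a, b = PredictiveAgent(2, 2, seed=1), PredictiveAgent(2, 2, seed=2)
    for _ in range(500):
        x, y = a.act("s"), b.act("s")
        r = 1.0 if x == y else 0.0
        a.update("s", x, y, r, "s")
        b.update("s", y, x, r, "s")
    print(a.predict("s"), a.expected_values("s"))
```

In this sketch each agent chooses among its own actions only, weighting the joint-action values by its prediction, which is one common way to avoid searching the full joint-action space at decision time.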