A new accelerating algorithm for multi-agent reinforcement learning

Source: Journal of Harbin Institute of Technology (English Edition)

Authors: ZHANG Ru-bo, ZHONG Yu, GU Guo-ch

Affiliations: Computer Science and Technology College, Harbin Engineering University, Harbin 150001, China; Robotic

Published in: Journal of Harbin Institute of Technology (English Edition)

Issue: 2005, No. 1

Keywords: distributed reinforcement learning; accelerating algorithm; machine learning; multi

Funding: Sponsored by Robotics Laboratory, Shenyang Institute of Automation, Chinese Academy of Sciences Foundation

Abstract

In multi-agent systems, joint actions must be employed to achieve cooperation, because the evaluation of one agent's behavior often depends on the behaviors of the other agents. However, joint-action reinforcement learning algorithms converge slowly because of the enormous learning space that joint actions produce. This article presents a prediction-based reinforcement learning algorithm for multi-agent cooperation tasks, in which every agent learns to predict the probabilities of the actions that other agents may execute. A multi-robot cooperation experiment tests the efficacy of the new algorithm, and the results show that it reaches a cooperative policy much faster than the primitive reinforcement learning algorithm.
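The abstract gives no pseudocode, so the following is only a minimal sketch of one way a prediction-based learner could be organized, not the authors' algorithm. It assumes two agents, a tabular joint-action Q-function, and a simple frequency count as the predictor of the other agent's actions. The class name `PredictiveAgent`, its methods, and the parameters `alpha`, `gamma`, and `epsilon` are all hypothetical.

```python
import random
from collections import defaultdict


class PredictiveAgent:
    """Hypothetical agent: joint-action Q-table plus a frequency model
    of the other agent's actions (an assumption, not the paper's design)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        # q[state][(own_action, other_action)] -> estimated return
        self.q = defaultdict(lambda: defaultdict(float))
        # counts[state][other_action] -> observations; starting at 1
        # makes the initial prediction uniform
        self.counts = defaultdict(lambda: [1] * n_actions)

    def predict(self, state):
        """Predicted probability of each action of the other agent."""
        c = self.counts[state]
        total = sum(c)
        return [x / total for x in c]

    def expected_value(self, state, own_action):
        """Value of own_action, averaged over the predicted other action."""
        probs = self.predict(state)
        return sum(p * self.q[state][(own_action, b)]
                   for b, p in enumerate(probs))

    def act(self, state):
        """Epsilon-greedy choice over the expected values."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = [self.expected_value(state, a) for a in range(self.n_actions)]
        return values.index(max(values))

    def update(self, state, own_action, other_action, reward, next_state):
        """Refine the prediction, then apply a Q-learning style update."""
        self.counts[state][other_action] += 1
        best_next = max(self.expected_value(next_state, a)
                        for a in range(self.n_actions))
        key = (own_action, other_action)
        old = self.q[state][key]
        self.q[state][key] = old + self.alpha * (
            reward + self.gamma * best_next - old)
```

In this sketch each agent still stores values per joint action, but it selects its own action from the expectation under its prediction instead of searching the joint space at decision time. Whether this matches the source of the speed-up reported in the paper cannot be determined from the abstract alone.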
