东东购 | EasternEast
Statistical Policy Search Reinforcement Learning: Methods and Applications (统计策略搜索强化学习方法及应用)

Author: Zhao Tingting (赵婷婷) | Publisher: Publishing House of Electronics Industry (电子工业出版社) | Publication date: September 2021

ISBN: 9787121419591

EUR €45.99

Categories: Computers/Networking (new-release bestsellers), Artificial Intelligence | SKU: 6194030bf0f22475083b229c | Availability: in stock

Description

Format: 128mo | Paper: offset paper | Binding: paperback (perfect-bound) | Boxed set: no | ISBN: 9787121419591

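Synopsis
The victory of the AlphaGo agent over human Go experts changed how people think about artificial intelligence and drew wide academic attention to its core technology, reinforcement learning. Against this background, the book presents the author's years of research on reinforcement learning theory and applications, together with recent developments in the field in China and abroad. It is one of the few specialist monographs on reinforcement learning.
The book focuses on reinforcement learning methods based on direct policy search, and draws on a range of statistical learning methods to analyze, improve, and apply the relevant techniques. It describes policy search algorithms from a new, modern perspective. Starting from different reinforcement learning scenarios, it discusses the difficulties that reinforcement learning faces in practical applications. For each scenario it gives a concrete policy search algorithm, analyzes the statistical properties of the algorithm's estimators and learned parameters, and demonstrates the algorithm on application examples with quantitative comparisons. In particular, it combines policy search with frontier reinforcement learning techniques and applies it to robot control and digital art rendering. Finally, drawing on the author's long research experience, it briefly outlines and summarizes development trends in reinforcement learning.
The material is classic and comprehensive, the concepts are clearly stated, and the derivations are rigorous, with the aim of forming a complete body of knowledge that integrates basic theory, algorithms, and applications.
About the Author
Zhao Tingting is an associate professor at the College of Artificial Intelligence, Tianjin University of Science and Technology, whose main research areas are artificial intelligence and machine learning. Zhao is a member of the China Computer Federation (CCF), YOCSEF, and the Chinese Association for Artificial Intelligence, and sits on the association's Pattern Recognition Technical Committee. In 2017 Zhao was selected for the second tier of Tianjin's "131" Innovative Talent Training Project.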
Contents
Chapter 1 Overview of Reinforcement Learning
1.1 Reinforcement Learning in Machine Learning
1.2 Reinforcement Learning in Intelligent Control
1.3 Branches of Reinforcement Learning
1.4 Contributions of This Book
1.5 Structure of This Book
References
Chapter 2 Related Work and Background
2.1 Markov Decision Processes
2.2 Value-Function-Based Policy Learning Algorithms
2.2.1 Value Functions
2.2.2 Policy Iteration and Value Iteration
2.2.3 Q-learning
2.2.4 Least-Squares Policy Iteration
2.2.5 Value-Function-Based Deep Reinforcement Learning Methods
2.3 Policy Search Algorithms
2.3.1 Modeling Policy Search
2.3.2 The Classical Policy Gradient Algorithm (REINFORCE)
2.3.3 Natural Policy Gradient
2.3.4 Expectation-Maximization Policy Search
2.3.5 Policy-Based Deep Reinforcement Learning Methods
2.4 Chapter Summary
References
Chapter 3 Analysis and Improvement of Policy Gradient Estimation
3.1 Research Background
3.2 Policy Gradients with Parameter-Based Exploration (PGPE)
3.3 Variance Analysis of Gradient Estimates
3.4 Baseline-Based Improvement of the Algorithm and Its Analysis
3.4.1 The Basic Idea of the Baseline
3.4.2 The Baseline for PGPE
3.5 Experiments
3.5.1 Illustrative Example
3.5.2 Inverted Pendulum Balancing
3.6 Summary and Discussion
References
Chapter 4 Parameter-Exploring Policy Gradients with Importance Sampling
4.1 Research Background
4.2 PGPE in the Off-Policy Setting
4.2.1 Importance-Weighted PGPE (IW-PGPE)
4.2.2 Variance Reduction for IW-PGPE by Baseline Subtraction
4.3 Experimental Results
4.3.1 Illustrative Example
4.3.2 Mountain Car Task
4.3.3 Simulated Robot Control Task
4.4 Summary and Discussion
References


