Table of contents
Temporal-difference learning
SARSA
Monte Carlo Methods
Comparison of the two algorithms
Reference books:
Reinforcement Learning: State-of-the-Art
Reinforcement Learning: An Introduction (Sutton and Barto)
Previous note:
https://blog.csdn.net/Jinyindao243052/article/details/106985000
Temporal-difference learning
SARSA
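As a rough illustration of the SARSA update named in this heading, here is a minimal tabular sketch. It uses the standard on-policy rule Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a)). The 5-state corridor environment, the function names (`step`, `run_sarsa`, `epsilon_greedy`, `sarsa_update`), and all hyperparameter values are assumptions made for this example and do not come from the referenced books.

```python
import random
from collections import defaultdict


def epsilon_greedy(Q, state, actions, epsilon, rng):
    """Random action with probability epsilon, otherwise greedy with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(Q[(state, a)] for a in actions)
    return rng.choice([a for a in actions if Q[(state, a)] == best])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma, done):
    """One SARSA step: move Q(s, a) toward r + gamma * Q(s', a')."""
    # At a terminal transition there is no next action value to bootstrap from.
    target = r if done else r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def step(state, action):
    """Hypothetical corridor: states 0..4, reward 1 only when state 4 is reached."""
    next_state = min(4, max(0, state + action))
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


def run_sarsa(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Train a tabular Q function with SARSA on the assumed corridor."""
    rng = random.Random(seed)
    actions = [-1, 1]  # move left or right
    Q = defaultdict(float)
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(Q, s, actions, epsilon, rng)
        done = False
        while not done:
            s_next, r, done = step(s, a)
            # The next action is chosen by the same policy that is being
            # evaluated, which is what makes SARSA on-policy.
            a_next = epsilon_greedy(Q, s_next, actions, epsilon, rng)
            sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma, done)
            s, a = s_next, a_next
    return Q
```

Calling `run_sarsa()` returns a dictionary keyed by `(state, action)`. In this toy setting, moving right from state 3 should end up valued higher than moving left.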
Monte Carlo Methods
Monte Carlo learning of the V function
Monte Carlo estimation of V(S)
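One common way to estimate V(S) with Monte Carlo methods is first-visit averaging: for each state, average the returns observed after its first occurrence in each sampled episode. The sketch below illustrates that idea only. The 5-state random walk, the names `generate_episode` and `first_visit_mc`, and the episode count are assumptions made for this example, not the algorithm as presented in the referenced books.

```python
import random
from collections import defaultdict


def generate_episode(rng, start=2):
    """Hypothetical random walk on states 0..4; states 0 and 4 are terminal.

    Returns a list of (state, reward) pairs, where reward is what was
    received on leaving that state: 1 on reaching state 4, otherwise 0.
    """
    episode = []
    s = start
    while s not in (0, 4):
        s_next = s + rng.choice([-1, 1])
        reward = 1.0 if s_next == 4 else 0.0
        episode.append((s, reward))
        s = s_next
    return episode


def first_visit_mc(episodes, gamma=1.0):
    """Estimate V(s) as the mean return following the first visit to s."""
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for episode in episodes:
        # Record the first time step at which each state appears.
        first_visit = {}
        for t, (s, _) in enumerate(episode):
            first_visit.setdefault(s, t)
        # Walk backwards so the discounted return G accumulates incrementally.
        G = 0.0
        for t in range(len(episode) - 1, -1, -1):
            s, r = episode[t]
            G = r + gamma * G
            if first_visit[s] == t:
                returns_sum[s] += G
                returns_count[s] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


rng = random.Random(0)
episodes = [generate_episode(rng) for _ in range(5000)]
V = first_visit_mc(episodes)
```

With `gamma=1.0`, the value of a state in this toy walk equals the probability of finishing at state 4, so the estimate for the middle state should approach 0.5 as more episodes are sampled.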
Updating the optimal policy with Monte Carlo methods
Algorithm 1
Algorithm 2
Comparison of the two algorithms