A Painless Introduction to Reinforcement Learning: Notes, Lesson 2

    Technology, 2022-07-10

    Contents

    References: Reinforcement Learning: State-of-the-Art; Reinforcement Learning: An Introduction (Sutton)
    Previous note: https://blog.csdn.net/Jinyindao243052/article/details/106985000

    Temporal difference learning

    SARSA

    Monte Carlo Methods

        Monte Carlo learning of the V function
        Estimating V(S) with the Monte Carlo method
        Updating the optimal policy with the Monte Carlo method
        Algorithm 1
        Algorithm 2

    Comparison of the two algorithms
