{"url":"https://yhayato1320.hatenablog.com/entry/2023/01/09/123039","provider_name":"Hatena Blog","image_url":"https://docs.google.com/drawings/d/e/2PACX-1vQ4z9uj28HhPw84Bv2zLzioeIfoOKB0hu2TofGIYMLL-y3MfrOXtnaBV4r3-KhT-x14Kt_2JgQ94aq-/pub?w=765&h=560","html":"<iframe src=\"https://hatenablog-parts.com/embed?url=https%3A%2F%2Fyhayato1320.hatenablog.com%2Fentry%2F2023%2F01%2F09%2F123039\" title=\"[Machine Learning] Reinforcement Learning - \u30aa\u30e0\u30e9\u30a4\u30b9\u306e\u5099\u5fd8\u9332\" class=\"embed-card embed-blogcard\" scrolling=\"no\" frameborder=\"0\" style=\"display: block; width: 100%; height: 190px; max-width: 500px; margin: 10px 0px;\"></iframe>","categories":["Data Science","Data Science - Machine Learning"],"published":"2023-01-09 12:30:39","width":"100%","provider_url":"https://hatena.blog","blog_title":"\u30aa\u30e0\u30e9\u30a4\u30b9\u306e\u5099\u5fd8\u9332","title":"[Machine Learning] Reinforcement Learning","description":"Index Reinforcement Learning Algorithms PEARL / 2021 Hierarchical Chunk Attention Memory / HCAM / 2021 Gato / 2022 Policy-Space Response Oracles / PSRO / 2023 Controllability-aware Skill Discovery / CSD / 2023 Reusable Slotwise Mechanisms / RS / 2023 Scaled Q-learning / 2023 Stochastic MuZer\u2026","author_name":"yhayato1320","type":"rich","author_url":"https://blog.hatena.ne.jp/yhayato1320/","blog_url":"https://yhayato1320.hatenablog.com/","version":"1.0","height":"190"}