{"published":"2019-07-07 11:13:53","type":"rich","html":"<iframe src=\"https://hatenablog-parts.com/embed?url=https%3A%2F%2Fenakai00.hatenablog.com%2Fentry%2F2019%2F07%2F07%2F111353\" title=\"Reinforcement Learning 2nd Edition: Exercise Solutions (Chapter 9 - Chapter 12) - めもめも\" class=\"embed-card embed-blogcard\" scrolling=\"no\" frameborder=\"0\" style=\"display: block; width: 100%; height: 190px; max-width: 500px; margin: 10px 0px;\"></iframe>","image_url":"https://images-fe.ssl-images-amazon.com/images/I/41-GQHUmMdL._SL160_.jpg","author_url":"https://blog.hatena.ne.jp/enakai00/","categories":[],"version":"1.0","description":"Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning series). Authors: Sutton, Richard S., Barto, Andrew G. Release date: 2018/11/13. Media: Hardcover. Chapter 9 Exercise 9.1 Define a feature as a one-hot representation of states, that is, . Then, . Exercise 9.2 There are choices of () for . Exe…","title":"Reinforcement Learning 2nd Edition: Exercise Solutions (Chapter 9 - Chapter 12)","provider_name":"Hatena Blog","blog_title":"めもめも","height":"190","blog_url":"https://enakai00.hatenablog.com/","provider_url":"https://hatena.blog","url":"https://enakai00.hatenablog.com/entry/2019/07/07/111353","author_name":"enakai00","width":"100%"}