{"blog_title":"\"Truth of the Legend\"  Notes","width":"100%","image_url":null,"blog_url":"https://akito-fujita.hatenablog.com/","provider_name":"Hatena Blog","categories":[],"author_url":"https://blog.hatena.ne.jp/Akito_Fujita/","published":"2021-05-22 09:26:15","title":"Word Mover's Distance (5) Extracting Page Data from Wikipedia","provider_url":"https://hatena.blog","height":"190","url":"https://akito-fujita.hatenablog.com/entry/2021/05/22/092615","description":"Word Mover's Distance (5) Extracting Page Data from Wikipedia XML Dump 2021/05/22 Akito Fujita. Last time I introduced triplets. However, many of you have probably noticed that fastWMD does not appear to load any triplet data when it is actually run. This article explains how that works, and then introduces wikiPageSelector, a tool that extracts Page data from a Wikipedia backup, which is needed for that preprocessing. The four files generated by the bundled script \u2026","author_name":"Akito_Fujita","type":"rich","version":"1.0","html":"<iframe src=\"https://hatenablog-parts.com/embed?url=https%3A%2F%2Fakito-fujita.hatenablog.com%2Fentry%2F2021%2F05%2F22%2F092615\" title=\"Word Mover&#39;s Distance (5) Extracting Page Data from Wikipedia - &quot;Truth of the Legend&quot;  Notes\" class=\"embed-card embed-blogcard\" scrolling=\"no\" frameborder=\"0\" style=\"display: block; width: 100%; height: 190px; max-width: 500px; margin: 10px 0px;\"></iframe>"}