{"image_url":"https://docs.google.com/drawings/d/e/2PACX-1vT57UIQHCiqq7hojkiBHm8vCd2p7KLNyh8vw7gDFhmM9XxYs-9VI0_WpDGXeiknnPlRCBYijAJ-hBPk/pub?w=765&h=560","version":"1.0","author_url":"https://blog.hatena.ne.jp/yhayato1320/","height":"190","categories":[],"author_name":"yhayato1320","type":"rich","published":"2025-07-30 09:34:27","url":"https://yhayato1320.hatenablog.com/entry/2025/07/30/093427","blog_title":"\u30aa\u30e0\u30e9\u30a4\u30b9\u306e\u5099\u5fd8\u9332","blog_url":"https://yhayato1320.hatenablog.com/","title":"[Deep Learning] ModelScopeT2V","description":"Index Index ModelScopeT2V Architecture VQGAN Text Encoder Denoising UNet References ModelScopeT2V ModelScopeT2V is a Latent Video Diffusion Model that outputs a video matching the meaning of a given text. Latent Video Diffusion Model yhayato1320.hatenablog.com Architecture ModelScope Text-to-Video Technical Report prompt output vi\u2026","width":"100%","provider_name":"Hatena Blog","provider_url":"https://hatena.blog","html":"<iframe src=\"https://hatenablog-parts.com/embed?url=https%3A%2F%2Fyhayato1320.hatenablog.com%2Fentry%2F2025%2F07%2F30%2F093427\" title=\"\u3010\u6df1\u5c64\u5b66\u7fd2\u3011ModelScopeT2V - \u30aa\u30e0\u30e9\u30a4\u30b9\u306e\u5099\u5fd8\u9332\" class=\"embed-card embed-blogcard\" scrolling=\"no\" frameborder=\"0\" style=\"display: block; width: 100%; height: 190px; max-width: 500px; margin: 10px 0px;\"></iframe>"}