{"blog_url":"https://nikkie-ftnext.hatenablog.com/","provider_name":"Hatena Blog","type":"rich","width":"100%","height":"190","version":"1.0","author_url":"https://blog.hatena.ne.jp/nikkie-ftnext/","description":"Introduction Yuriko Nanao, happy birthday, day 27! This is nikkie. Have you heard of AISI's Inspect? Table of contents Introduction Table of contents Discovering it while reading the OpenAI PaperBench implementation Hello World with Inspect AI Environment setup The Task concept Viewing run-time logs Odds and ends VS Code extension Running a Task with a different LLM Turning it into a Python script Conclusion Discovering it while reading the OpenAI PaperBench implementation We\u2019re releasing PaperBench, a benchmark evaluating the\u2026","image_url":"https://cdn-ak.f.st-hatena.com/images/fotolife/n/nikkie-ftnext/20250413/20250413174817.png","provider_url":"https://hatena.blog","title":"Hello World with Inspect, the LLM evaluation framework from the UK \ud83c\uddec\ud83c\udde7 AI Security Institute","html":"<iframe src=\"https://hatenablog-parts.com/embed?url=https%3A%2F%2Fnikkie-ftnext.hatenablog.com%2Fentry%2Fuk-aisi-llm-evaluation-framework-inspect-ai-hello-world\" title=\"Hello World with Inspect, the LLM evaluation framework from the UK \ud83c\uddec\ud83c\udde7 AI Security Institute - nikkie-ftnext\u306e\u65e5\u8a18\" class=\"embed-card embed-blogcard\" scrolling=\"no\" frameborder=\"0\" style=\"display: block; width: 100%; height: 190px; max-width: 500px; margin: 10px 0px;\"></iframe>","author_name":"nikkie-ftnext","published":"2025-04-13 20:37:14","blog_title":"nikkie-ftnext\u306e\u65e5\u8a18","url":"https://nikkie-ftnext.hatenablog.com/entry/uk-aisi-llm-evaluation-framework-inspect-ai-hello-world","categories":["LLM"]}