{"width":"100%","provider_name":"Hatena Blog","blog_url":"https://blog-en.fltech.dev/","title":"Fujitsu's Corporate Benchmarking Proposal: To Unlock the True Value of AI Agent Models #2 AAAI 2026 AABA4ET Participation Report and Introduction to the Fujitsu RAG Hard Benchmark","author_name":"fukui-f_tech","published":"2026-03-13 08:51:37","image_url":"https://cdn-ak.f.st-hatena.com/images/fotolife/f/fltech-user/20260309/20260309093136.png","provider_url":"https://hatena.blog","height":"190","author_url":"https://blog.hatena.ne.jp/fukui-f_tech/","url":"https://blog-en.fltech.dev/entry/2026/03/11/RAG-Hard-Benchmark-en","type":"rich","description":"This article marks the beginning of a TechBlog series entitled 'Fujitsu's Corporate Benchmarking Proposal: To Unlock the True Value of AI Agent Models.' The series consists of three posts on the following schedule: Part 1: When AI 'Sees' What Isn't There: Introducing a Benchmark for Diagnosing Hallucinations in \u2026","blog_title":"fltech - Technology Blog of Fujitsu Research","categories":["AI"],"version":"1.0","html":"<iframe src=\"https://hatenablog-parts.com/embed?url=https%3A%2F%2Fblog-en.fltech.dev%2Fentry%2F2026%2F03%2F11%2FRAG-Hard-Benchmark-en\" title=\"Fujitsu&#39;s Corporate Benchmarking Proposal: To Unlock the True Value of AI Agent Models #2 AAAI 2026 AABA4ET Participation Report and Introduction to the Fujitsu RAG Hard Benchmark - fltech - Technology Blog of Fujitsu Research\" class=\"embed-card embed-blogcard\" scrolling=\"no\" frameborder=\"0\" style=\"display: block; width: 100%; height: 190px; max-width: 500px; margin: 10px 0px;\"></iframe>"}