Attention (the attention mechanism): referring to the necessary information each time

Author: Hal40n (https://blog.hatena.ne.jp/Hal40n/)
Blog: ゼロからAI理論を再構築する ("Rebuilding AI Theory from Scratch"), https://serenewealth.net/
Category: Machine learning
Published: 2026-03-13 19:33:48
Source: https://serenewealth.net/entry/2026/03/13/193348

Last time, I wrote that an Encoder-Decoder model compresses its input into a fixed-length vector, so information is lost when the input is long. Attention resolves this problem by keeping the information from every input step and referring to it when it is needed.

Keeping all Encoder states

A conventional Encoder-Decoder passes only the Encoder's final hidden state to the Decoder. A model with Attention instead keeps the hidden states of all Encoder steps rather than discarding them.

Each time the Decoder generates one output token, it looks at the list of Encoder hidden states and "to me right now…
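
As a rough illustration of a Decoder consulting the stored Encoder hidden states at every output step, here is a minimal sketch in plain Python. The scoring rule (a dot product followed by softmax) is an assumption chosen because it is a common formulation, not something stated in the post. The names `softmax`, `attend`, `encoder_states`, and `decoder_state`, along with the toy values, are hypothetical.

```python
import math


def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for a single Decoder step.

    Assumption: relevance is scored with a dot product. Other scoring
    functions are possible; this one is used only for illustration.
    """
    # One relevance score per Encoder step.
    scores = [
        sum(d * h for d, h in zip(decoder_state, hidden))
        for hidden in encoder_states
    ]
    # Normalize the scores into weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the weighted sum of all Encoder hidden states.
    dim = len(encoder_states[0])
    context = [
        sum(w * hidden[i] for w, hidden in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical toy data: three Encoder steps with 2-dimensional hidden states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]

weights, context = attend(decoder_state, encoder_states)
print(weights)
print(context)
```

With these toy values, the first Encoder state has no overlap with the Decoder state, so it receives the smallest weight. The other two states score equally and share the rest. Because all Encoder states are kept, the Decoder can recompute this weighting for every output token rather than relying on one fixed vector.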