Self-attention

derwind (https://blog.hatena.ne.jp/derwind/), らんだむな記憶 (https://randommemory.hatenablog.com/), machine_learning, 2022-03-05 18:03:12
https://randommemory.hatenablog.com/entry/2022/03/05/180312

Looking at the implementation of the self-attention block in torch.nn.Transformer, it reads as follows (transformer.py#L352):

    # self-attention block
    def _sa_block(self, x: Tensor, attn_mask: Optional[Tensor], key_padding_mask: Optional[Tensor]) -> Tensor:
        x = self.self_attn(x, x, x, …

Other sample implementations do the same thing. In other words, the same tensor is passed as Q, K, and V. On the effect of this, Attention (4) - らんだむな記…
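To make "the same tensor is passed as Q, K, and V" concrete, here is a minimal sketch of scaled dot-product attention in plain Python with Q = K = V = x. It is an illustration under simplifying assumptions, not the PyTorch implementation: it leaves out the learned input/output projections, the multiple heads, the masks, and the dropout that `self.self_attn` applies. The helper names `softmax` and `self_attention` and the toy input are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product attention with Q = K = V = x.

    Assumption: no learned projections, single head, no masking.
    x is a list of token vectors (seq_len x d).
    """
    d = len(x[0])
    scale = 1.0 / math.sqrt(d)
    # scores[i][j] = <x_i, x_j> / sqrt(d): similarity of token i to token j
    scores = [
        [scale * sum(a * b for a, b in zip(xi, xj)) for xj in x]
        for xi in x
    ]
    # Each row becomes a probability distribution over the tokens
    weights = [softmax(row) for row in scores]
    # Output token i is the weighted average of all input tokens
    out = [
        [sum(w * xj[k] for w, xj in zip(wrow, x)) for k in range(d)]
        for wrow in weights
    ]
    return out, weights


# Toy sequence of three 2-dimensional tokens (made up for illustration)
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = self_attention(x)

for i, row in enumerate(weights):
    print(f"token {i} attends with weights {[round(w, 3) for w in row]}")
```

In this simplified setting each output vector is a convex combination of the input vectors, weighted by how similar the tokens are to one another. In the toy input, token 0 gives equal weight to itself and token 2, and less to token 1, which is orthogonal to it.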