Deep Learning Paper Introduction: "Learning Recurrent Neural Networks with Hessian-Free Optimization"

Author: nishiohirokazu (https://blog.hatena.ne.jp/nishiohirokazu/)
Blog: 西尾泰和のはてなダイアリー (https://nishiohirokazu.hatenadiary.org/)
Published: 2014-02-08 11:09:00
URL: https://nishiohirokazu.hatenadiary.org/entry/20140208/1391825340

Getting a recurrent neural network (RNN) to learn long-range dependencies used to be a hard problem, and this paper reports that it becomes possible with Hessian-Free optimization. RNNs are considered attractive because they are easy to train with Back Propagation Through Time (BPTT) plus stochastic gradient descent, but first-order gradient methods completely fail to learn dependencies that are only about 10 time steps apart. The cause is "vanishing/exploding gradients": for a long-range dependency, BPTT has to propagate the error signal back through the layers of many earlier time steps, so the signal quickly decays and disappears. The recently developed Hessian-Free (HF) method, also known as truncated-Newton, …
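The decay of the error signal under BPTT can be illustrated with a small numerical sketch. This is not code from the post or the paper; it assumes a hypothetical linear RNN whose recurrent matrix is `scale` times an orthogonal matrix, and it ignores the activation derivative, so each backward step multiplies the error norm by exactly `scale`. The function name `backprop_norms` and all parameters are made up for illustration.

```python
import numpy as np


def backprop_norms(scale, steps=10, size=20, seed=0):
    """Norm of an error signal after each backward step through a linear RNN.

    Assumed setup (illustrative only): the recurrent matrix is
    W = scale * Q with Q orthogonal, so every singular value of W
    equals `scale`. The activation derivative is ignored.
    """
    rng = np.random.default_rng(seed)
    # QR decomposition of a random matrix yields an orthogonal Q.
    q, _ = np.linalg.qr(rng.standard_normal((size, size)))
    w = scale * q
    grad = np.ones(size) / np.sqrt(size)  # unit-norm error at the last step
    norms = []
    for _ in range(steps):
        grad = w.T @ grad  # one step of backpropagation through time
        norms.append(float(np.linalg.norm(grad)))
    return norms


print("scale 0.5, after 10 steps:", backprop_norms(0.5)[-1])
print("scale 1.5, after 10 steps:", backprop_norms(1.5)[-1])
```

Under these assumptions the norm after t steps is `scale ** t`, so ten steps at 0.5 leave roughly a thousandth of the signal (vanishing), while ten steps at 1.5 amplify it by a factor of about 58 (exploding). A real RNN with a nonlinearity and a non-orthogonal weight matrix behaves less cleanly, but the same multiplicative effect is what makes dependencies around 10 steps apart hard for first-order methods.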