Comparing transcriptions from every model size of the speech recognition model Whisper

Author: ysdyt (https://blog.hatena.ne.jp/ysdyt/)
Blog: 毎日がEveryday、日々 Day by Day (https://ysdyt.hatenablog.jp/), hosted on Hatena Blog (https://hatena.blog)
Category: Machine learning
Published: 2022-10-13 00:12:54
URL: https://ysdyt.hatenablog.jp/entry/whisper

OpenAI's speech recognition model Whisper is, honestly, so good that I was moved. To transcribe 白金鉱業.FM, the podcast we are currently distributing, I had compared the accuracy of existing transcription APIs quite seriously in two earlier posts, but this time there is nothing to compare: the difference is night and day. Whisper transcribes almost word for word. The transcription APIs from GCP, AWS, and Azure felt roughly 30 to 60% accurate, whereas Whisper gives the impression of exceeding 90%. All I can do is laugh.

Contents: Conclusion first; Installation; How to run; Results (tiny model results, base model results, small model…)
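The 30 to 60% and 90% figures quoted above are the author's subjective impressions, not measurements. As a minimal sketch of how such a figure could be quantified, assuming a hand-corrected reference transcript is available, the Python below computes a character-level accuracy (1 minus the character error rate) using Levenshtein distance. The function names `edit_distance` and `char_accuracy` and the sample strings are illustrative assumptions; none of them come from the original post.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the ref prefix processed so far
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def char_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - CER, clamped at 0. Whitespace is ignored in both inputs."""
    ref = "".join(reference.split())
    hyp = "".join(hypothesis.split())
    if not ref:
        raise ValueError("reference transcript must not be empty")
    cer = edit_distance(ref, hyp) / len(ref)
    return max(0.0, 1.0 - cer)


if __name__ == "__main__":
    # Hypothetical sample: one substituted character out of ten.
    reference = "音声認識モデルの比較"
    hypothesis = "音声認識モデルを比較"
    print(f"{char_accuracy(reference, hypothesis):.2f}")
```

A character-level metric is used here because Japanese text has no spaces between words, so a word error rate would first require a tokenizer. Punctuation and script normalization are left out of this sketch and would affect the score in practice.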