【Python】Python 〜 Tesseract OCR 〜

dk521123 https://blog.hatena.ne.jp/dk521123/
プログラムの超個人的なメモ (Very Personal Programming Notes) https://dk521123.hatenablog.com/
Hatena Blog https://hatena.blog
Python
2025-10-03 14:13:26 https://dk521123.hatenablog.com/entry/2025/10/03/141326

◾️Introduction
I want to build a system that extracts data from images or PDFs and stores it in a DB, using only free tools. The design I have in mind is described under "１）Processing flow" and "２）System architecture" in "【５】Bonus: What I am planning" below. As a first step, I looked into one piece of that design, Tesseract OCR.

Table of contents
【１】Tesseract OCR
　１）License
　２）Official site
【２】Environment setup
　１）Using Docker
【３】Sample
【４】Options
　１）--psm (page segmentation mode)
　２）--oem
　３）-l jpn
　４）-c preserve_interword_spaces=…