Differences between UTF-8, UTF-16, and UTF-32

tanamon (https://blog.hatena.ne.jp/tanamon/), tanamonの稀に良く書く日記 (https://tanamon.hatenablog.jp/), 2009-02-16 15:07:05
https://tanamon.hatenablog.jp/entry/20090216/1234764425

memo

I looked into this briefly.

UTF-8

In the original design, one character is encoded in 1 to 6 bytes (6 bytes can carry up to 31 bits).
ASCII characters take 1 byte.
The 8-bit (non-ASCII) characters of the ISO 8859 family, including those of ISO 8859-1, mostly take 2 bytes.
Most Japanese characters (including half-width katakana) take 3 bytes.
No characters are assigned to 5- or 6-byte sequences, and there seems to be no intention of doing so; the current definition (RFC 3629) limits UTF-8 to 4 bytes, which is enough to reach U+10FFFF.
Because the ASCII range is encoded identically (the compatibility is with ASCII, not with the whole of ISO 8859-1), even badly behaved programs that do not support multibyte characters generally keep working as they are.
If a BOM is added, it is 3 bytes long: EF BB BF, regardless of endianness.
A BOM should not really be necessary, but it is sometimes present. 文字コードを自…
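
The byte counts and the BOM described above can be checked with a short script. This is only a minimal sketch in Python, not something from the original post. The helper name `utf8_len` and the sample characters are chosen here for illustration. The last sample, a character outside the Basic Multilingual Plane, is an addition to show the 4-byte case.

```python
import codecs


def utf8_len(ch: str) -> int:
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# Sample characters (illustrative choices), written as escapes so the
# source file's own encoding cannot affect the result.
samples = [
    ("A", "ASCII"),
    ("\u00e9", "ISO 8859-1 upper half (e with acute)"),
    ("\u042f", "Cyrillic letter, as found in ISO 8859-5"),
    ("\u3042", "hiragana A"),
    ("\uff71", "half-width katakana A"),
    ("\U0001F600", "emoji outside the BMP"),
]

for ch, label in samples:
    print(f"U+{ord(ch):04X} {label}: {utf8_len(ch)} byte(s)")

# The UTF-8 BOM is U+FEFF encoded as UTF-8. It is a fixed byte sequence,
# so endianness does not come into play.
bom_hex = " ".join(f"{b:02X}" for b in codecs.BOM_UTF8)
print("UTF-8 BOM:", bom_hex)

# Python's "utf-8-sig" codec writes the BOM on encode and strips it on decode.
with_bom = "abc".encode("utf-8-sig")
print(with_bom[:3] == codecs.BOM_UTF8, with_bom.decode("utf-8-sig"))
```

Decoding with plain `"utf-8"` instead of `"utf-8-sig"` would leave U+FEFF at the start of the string, which is one reason an unexpected BOM can cause trouble.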