User:ArrowHead294/UTF-8 extensions: Difference between revisions
This table illustrates how Unicode code points are converted into UTF-8, specifically which code points correspond to two-, three-, four-, five-, and six-byte sequences. The four-byte sequences are split at U+10FFFF, because UTF-8 is currently restricted to that limit to match the constraints of UTF-16. The extensions beyond it are shown anyway to illustrate that UTF-8 is capable of encoding values up to {{nowrap|2<sup>31</sup> − 1}} = 0x7FFFFFFF without using the bytes <code>FE</code> and <code>FF</code>.
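The bit layout described above can be sketched in Python. This is a minimal illustration, not a standard library API: the function name <code>encode_extended_utf8</code> is hypothetical, and the sketch assumes the original one- to six-byte scheme, with no rejection of surrogates or of values above U+10FFFF.

```python
def encode_extended_utf8(cp: int) -> bytes:
    """Encode an integer with the original 1- to 6-byte UTF-8 scheme.

    Hypothetical helper for illustration only; it accepts any value
    from 0 to 0x7FFFFFFF and does not reject surrogates.
    """
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value out of range for six-byte UTF-8")
    if cp < 0x80:
        return bytes([cp])  # one-byte (ASCII) form

    # (largest value, lead-byte marker, number of continuation bytes)
    forms = (
        (0x7FF, 0xC0, 1),       # two bytes:   110xxxxx
        (0xFFFF, 0xE0, 2),      # three bytes: 1110xxxx
        (0x1FFFFF, 0xF0, 3),    # four bytes:  11110xxx
        (0x3FFFFFF, 0xF8, 4),   # five bytes:  111110xx
        (0x7FFFFFFF, 0xFC, 5),  # six bytes:   1111110x
    )
    for limit, lead, n_cont in forms:
        if cp <= limit:
            break

    # The lead byte carries the highest bits; each continuation byte
    # carries the next six bits behind a 10xxxxxx marker.
    out = [lead | (cp >> (6 * n_cont))]
    for shift in range(6 * (n_cont - 1), -1, -6):
        out.append(0x80 | ((cp >> shift) & 0x3F))
    return bytes(out)


if __name__ == "__main__":
    for value in (0x20AC, 0x10FFFF, 0x110000, 0x7FFFFFFF):
        print(f"{value:#x}: {encode_extended_utf8(value).hex(' ')}")
```

Under these assumptions the largest value, 0x7FFFFFFF, encodes with the lead byte <code>FD</code>, so <code>FE</code> and <code>FF</code> never appear in the output.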
{| class="wikitable"
|+ style="font-size: 105%;" | Code point ↔ UTF-8 conversion