UTF-8 (8-bit Unicode Transformation Format) is a lossless, variable-length character encoding for Unicode, created by Rob Pike and Ken Thompson. It represents each Unicode character as a sequence of one or more bytes, so that text in the alphabets of many of the world's languages can be encoded. UTF-8 is especially useful for transmission over 8-bit mail systems.
It uses 1 to 4 bytes per character, depending on the character's Unicode code point. For example, only one UTF-8 byte is needed to encode each of the 128 US-ASCII characters, which occupy the Unicode range U+0000 to U+007F.
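The byte counts described above can be checked directly. The following is a minimal Python sketch; the sample characters and the helper name `utf8_length` are illustrative choices and do not come from the original entry.

```python
# Sample characters chosen for illustration only (one per UTF-8 length).
samples = ["A", "é", "€", "😀"]


def utf8_length(ch: str) -> int:
    """Return the number of bytes the single character ch occupies in UTF-8."""
    return len(ch.encode("utf-8"))


for ch in samples:
    encoded = ch.encode("utf-8")
    # Print the code point, the byte count, and the bytes in hexadecimal.
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s): {encoded.hex(' ')}")

# Prints:
# U+0041 -> 1 byte(s): 41
# U+00E9 -> 2 byte(s): c3 a9
# U+20AC -> 3 byte(s): e2 82 ac
# U+1F600 -> 4 byte(s): f0 9f 98 80
```

Every code point from U+0000 to U+007F yields a length of 1, and the single byte has the same value as the corresponding US-ASCII code.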
While it may seem inefficient to represent some Unicode characters with as many as 4 bytes, UTF-8 is a superset of ASCII, so legacy systems built around ASCII can still transmit it, and existing ASCII text is already valid UTF-8. Additionally, data compression can still be applied to the encoded bytes independently of the use of UTF-8.
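Both points can be illustrated with a short Python sketch. The function names `ascii_is_utf8` and `compress_roundtrip` are hypothetical, and zlib is used only as one example of a general-purpose compressor; the entry itself does not name any compression method.

```python
import zlib


def ascii_is_utf8(text: str) -> bool:
    """Check that pure-ASCII text encodes to identical bytes in ASCII and UTF-8."""
    return text.encode("ascii") == text.encode("utf-8")


def compress_roundtrip(text: str) -> str:
    """Encode text as UTF-8, compress and decompress the bytes, then decode."""
    raw = text.encode("utf-8")          # character encoding step
    packed = zlib.compress(raw)         # compression works on plain bytes
    return zlib.decompress(packed).decode("utf-8")


print(ascii_is_utf8("Hello, world"))                        # True
print(compress_roundtrip("naïve café €") == "naïve café €")  # True
```

The compressor never needs to know that its input is UTF-8: it receives bytes and returns bytes, which is why the two steps can be chosen independently.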
The IETF requires all Internet protocols to identify the encoding used for character data, and to support UTF-8 as at least one of the available encodings.
© 2008 Zdenko Podobný