UTF-8 (8-bit Unicode Transformation Format) is a variable-width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes. The encoding is defined by the Unicode Standard and was originally designed by Ken Thompson and Rob Pike. The name is derived from Unicode (or Universal Coded Character Set) Transformation Format – 8-bit.
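
The one-to-four-byte range and the figure of 1,112,064 valid code points can be checked with a short Python sketch. The sample characters and the helper names `utf8_length` and `count_valid_code_points` are illustrative choices, not part of the source; the count assumes the usual definition of valid code points as U+0000 to U+10FFFF minus the surrogate range U+D800 to U+DFFF.

```python
# Illustrative sketch: byte length of UTF-8 encodings and the count of
# encodable code points. Helper names and samples are hypothetical.

# One sample character per encoded length (1, 2, 3 and 4 bytes).
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))


def count_valid_code_points() -> int:
    """Count code points U+0000..U+10FFFF, excluding surrogates."""
    total = 0x10FFFF + 1          # all code points in the Unicode range
    surrogates = 0xDFFF - 0xD800 + 1  # U+D800..U+DFFF cannot be encoded
    return total - surrogates


if __name__ == "__main__":
    for ch in SAMPLES:
        encoded = ch.encode("utf-8")
        print(f"U+{ord(ch):04X}: {len(encoded)} byte(s), hex {encoded.hex()}")
    print("valid code points:", count_valid_code_points())
```

Lower code points such as U+0041 need a single byte, while code points beyond U+FFFF need four, which matches the description above.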

Property Value
dbo:abstract
  • UTF-8 (8-bit Unicode Transformation Format) is a variable-width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes. The encoding is defined by the Unicode Standard and was originally designed by Ken Thompson and Rob Pike. The name is derived from Unicode (or Universal Coded Character Set) Transformation Format – 8-bit. It was designed for backward compatibility with ASCII. Code points with lower numerical values, which tend to occur more frequently, are encoded using fewer bytes. The first 128 characters of Unicode, which correspond one-to-one with ASCII, are each encoded as a single byte with the same binary value as in ASCII, so valid ASCII text is also valid UTF-8-encoded Unicode. Because ASCII bytes never occur in the encoding of non-ASCII code points, UTF-8 is safe to use within most programming and document languages that interpret certain ASCII characters in a special way, such as "/" (slash) in filenames, "\" (backslash) in escape sequences, and "%" in printf. Since 2009, UTF-8 has been the dominant encoding of any kind (not just among Unicode encodings) for the World Wide Web, and WHATWG has declared it mandatory "for all things". As of November 2019 it accounts for 94.3% of all web pages (some of which are simply ASCII, which is a subset of UTF-8) and 96% of the top 1,000 highest-ranked web pages. The next most popular multi-byte encodings, Shift JIS and GB 2312, account for 0.3% and 0.2% respectively. The Internet Mail Consortium (IMC) recommended that all e-mail programs be able to display and create mail using UTF-8, and the W3C recommends UTF-8 as the default encoding in XML and HTML. (en)
dbo:wikiPageExtracted
  • 2019-11-17 20:44:09Z (xsd:date)
dbo:wikiPageID
  • 32188 (xsd:integer)
dbo:wikiPageLength
  • 81305 (xsd:integer)
dbo:wikiPageModified
  • 2019-11-17 20:44:05Z (xsd:date)
dbo:wikiPageOutDegree
  • 361 (xsd:integer)
dbo:wikiPageRevisionID
  • 926650722 (xsd:integer)
rdfs:comment
  • UTF-8 (8-bit Unicode Transformation Format) is a variable-width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes. The encoding is defined by the Unicode Standard and was originally designed by Ken Thompson and Rob Pike. The name is derived from Unicode (or Universal Coded Character Set) Transformation Format – 8-bit. (en)
rdfs:label
  • UTF-8 (en)
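
The ASCII-compatibility properties described in the dbo:abstract value can be illustrated with a minimal Python sketch. It checks two things: that ASCII text produces identical bytes under ASCII and UTF-8, and that every byte used to encode a non-ASCII character is 0x80 or higher, so it cannot be mistaken for "/", "\" or "%". The function name `non_ascii_bytes_are_high` and the sample strings are hypothetical and chosen only for illustration.

```python
# Illustrative sketch of UTF-8's ASCII compatibility. The helper name and
# the sample strings are assumptions made for this example.


def non_ascii_bytes_are_high(text: str) -> bool:
    """Return True if every non-ASCII character in text is encoded
    using only bytes in the range 0x80..0xFF."""
    for ch in text:
        if ord(ch) >= 0x80:
            if any(byte < 0x80 for byte in ch.encode("utf-8")):
                return False
    return True


if __name__ == "__main__":
    # ASCII text is byte-for-byte identical in ASCII and UTF-8.
    ascii_text = "path/to/file %d \\n"
    print(ascii_text.encode("ascii") == ascii_text.encode("utf-8"))

    # Non-ASCII characters never produce bytes that look like ASCII,
    # so a byte-level search for "/" cannot hit the middle of a character.
    mixed = "caf\u00e9/\u65e5\u672c\u8a9e"
    print(non_ascii_bytes_are_high(mixed))
    print(mixed.encode("utf-8").count(b"/"))
```

This is why byte-oriented tools that split on "/" or scan for "%" generally keep working on UTF-8 input without decoding it first.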