Latin Extended-B

Unicode character block


Latin Extended-B is the fourth block of the Unicode Standard, covering code points U+0180 to U+024F. It has been included since version 1.0, when it spanned only U+0180 to U+01FF and contained 113 characters. During unification with ISO 10646 for version 1.1, the block range was extended by 80 code points and another 35 characters were assigned. The last 60 available code points in the block were assigned in version 3.0 and later releases. Its block name in Unicode 1.0 was Extended Latin.[3]
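The range arithmetic above can be checked with a short Python sketch. The names BLOCK_START, BLOCK_END, and in_latin_extended_b are illustrative choices, not part of any library, and the per-version counts are simply the figures quoted in the paragraph above.

```python
# Block boundaries as given in the text (inclusive).
BLOCK_START = 0x0180
BLOCK_END = 0x024F


def in_latin_extended_b(ch: str) -> bool:
    """Return True if the single character ch lies in U+0180..U+024F."""
    return BLOCK_START <= ord(ch) <= BLOCK_END


# Total number of code points in the block.
block_size = BLOCK_END - BLOCK_START + 1

# Size of the original Unicode 1.0 range, U+0180..U+01FF.
v1_0_range = 0x01FF - BLOCK_START + 1

# Characters assigned per release, as quoted in the text.
assigned = {"1.0": 113, "1.1": 35, "3.0 and later": 60}

print(f"block size: {block_size}")
print(f"1.0 range plus 80 code points: {v1_0_range + 80}")
print(f"sum of assignments: {sum(assigned.values())}")
```

All three printed figures should agree, which is a quick way to confirm that the block is fully assigned.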

Subheadings

The Latin Extended-B block contains twelve subheadings for groups of characters: Non-European and historic Latin, African letters for clicks, Croatian digraphs matching Serbian Cyrillic letters, Pinyin diacritic-vowel combinations, Phonetic and historic letters, Additions for Slovenian and Croatian, Additions for Romanian, Miscellaneous additions, Additions for Livonian, Additions for Sinology, Additions for Africanist linguistics, and Additions for Sencoten. The Non-European and historic Latin, African clicks, Croatian digraphs, and Pinyin groups, along with the first part of the Phonetic and historic letters, were present in Unicode 1.0. Further Phonetic and historic letters and the Additions for Slovenian and Croatian were added for version 1.1, and a few more Phonetic and historic letters followed in version 3.0. The remaining subheadings were filled in from version 3.0 onward.

Non-European and historic Latin

The Non-European and historic Latin subheading contains the first 64 characters of the block, and includes various variant letters for use in Zhuang, Americanist phonetic transcription, African languages, and other Latin script alphabets. It does not contain any standard letters with diacritics.

African letters for clicks

The four African letters for clicks are used in Khoisan orthography.

Croatian digraphs matching Serbian Cyrillic letters

The Croatian digraphs matching Serbian Cyrillic letters are three sets of three case mappings (lower case, upper case, and title case) of Latin digraphs used for compatibility with Cyrillic texts, Serbo-Croatian being a digraphic language.
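The three case forms of each digraph can be inspected with Python's built-in string methods and the standard unicodedata module. This is a minimal sketch; DIGRAPH_LOWER, case_forms, and describe are hypothetical helper names introduced here for illustration.

```python
import unicodedata

# Lowercase forms of the three digraphs: dž (U+01C6), lj (U+01C9), nj (U+01CC).
DIGRAPH_LOWER = ["\u01C6", "\u01C9", "\u01CC"]


def case_forms(lower: str) -> dict:
    """Return the lower, title, and upper case forms of one digraph."""
    return {"lower": lower.lower(), "title": lower.title(), "upper": lower.upper()}


def describe(ch: str) -> str:
    """Format a character as code point, general category, and name."""
    return f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}"


for digraph in DIGRAPH_LOWER:
    for label, form in case_forms(digraph).items():
        print(f"{label:5} {describe(form)}")
```

The title case forms carry the general category Lt, which is why title-casing and upper-casing give different results for these characters, unlike for most Latin letters.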

Pinyin diacritic-vowel combinations

The 16 Pinyin diacritic-vowel combinations are used to represent the standard Mandarin Chinese vowel sounds with tone marks.
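Each of these precomposed letters is canonically equivalent to a base vowel followed by combining marks, which can be seen by normalizing to NFD. The sketch below assumes the 16 characters occupy U+01CD to U+01DC, as in the compact table; PINYIN_RANGE and decompose are illustrative names.

```python
import unicodedata

# U+01CD..U+01DC inclusive: the 16 Pinyin diacritic-vowel combinations.
PINYIN_RANGE = range(0x01CD, 0x01DD)


def decompose(ch: str) -> list:
    """Return the NFD decomposition of ch as a list of U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]


for cp in PINYIN_RANGE:
    ch = chr(cp)
    print(f"U+{cp:04X} {ch} -> {' '.join(decompose(ch))}")
```

For example, U+01D6 (ǖ) decomposes into u, a combining diaeresis, and a combining macron, and normalizing that sequence back to NFC yields the single precomposed character again.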

Phonetic and historic letters

The 35 Phonetic and historic letters are mostly standard and variant Latin letters with diacritical marks.

Additions for Slovenian and Croatian

The 24 Additions for Slovenian and Croatian are all standard Latin letters with unusual diacritics, like the double grave and inverted breve.

Additions for Romanian

The Additions for Romanian are 4 characters with a comma below that were at first erroneously unified with the corresponding letters with cedilla. The conflation of S and T with cedilla and S and T with comma below continues to plague Romanian-language implementations to the present.[4]
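One common way to cope with the conflation is to map the cedilla code points to their comma-below counterparts before processing text. The following is a minimal sketch under the assumption that the input is known to be Romanian; other languages, such as Turkish, legitimately use S with cedilla, so the mapping should not be applied blindly. CEDILLA_TO_COMMA and to_comma_below are hypothetical names.

```python
import unicodedata

# Cedilla forms (outside this block) mapped to the comma-below forms
# in Latin Extended-B.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # capital S with cedilla -> capital S with comma below
    "\u015F": "\u0219",  # small s with cedilla   -> small s with comma below
    "\u0162": "\u021A",  # capital T with cedilla -> capital T with comma below
    "\u0163": "\u021B",  # small t with cedilla   -> small t with comma below
})


def to_comma_below(text: str) -> str:
    """Replace S/T with cedilla by S/T with comma below (Romanian text assumed)."""
    return text.translate(CEDILLA_TO_COMMA)


sample = "\u015Ftiin\u0163\u0103"  # the word for "science" typed with cedilla forms
fixed = to_comma_below(sample)
for before, after in zip(sample, fixed):
    if before != after:
        print(f"{unicodedata.name(before)} -> {unicodedata.name(after)}")
```

Unicode normalization does not help here: the cedilla and comma-below letters are distinct characters with different decompositions, so an explicit mapping like this is needed.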

Miscellaneous additions

The Miscellaneous additions subheading contains 39 characters of various description and origin.

Additions for Livonian

The Additions for Livonian are 10 letters with diacritics for writing the Livonian language.

Additions for Sinology

The Additions for Sinology are three lowercase letters with curls used in the study of classical Chinese language.

Additions for Africanist linguistics

The Additions for Africanist linguistics are two lowercase letters with swash tails used in Africanist linguistics.

Additions for Sencoten

The Additions for Sencoten are 5 letters with strokes for writing Saanich.

Number of letters

The following table shows the number of letters in the Latin Extended-B block.

Compact table

Latin Extended-B[1]
Official Unicode Consortium code chart (PDF)
       0 1 2 3 4 5 6 7 8 9 A B C D E F
U+018x ƀ Ɓ Ƃ ƃ Ƅ ƅ Ɔ Ƈ ƈ Ɖ Ɗ Ƌ ƌ ƍ Ǝ Ə
U+019x Ɛ Ƒ ƒ Ɠ Ɣ ƕ Ɩ Ɨ Ƙ ƙ ƚ ƛ Ɯ Ɲ ƞ Ɵ
U+01Ax Ơ ơ Ƣ ƣ Ƥ ƥ Ʀ Ƨ ƨ Ʃ ƪ ƫ Ƭ ƭ Ʈ Ư
U+01Bx ư Ʊ Ʋ Ƴ ƴ Ƶ ƶ Ʒ Ƹ ƹ ƺ ƻ Ƽ ƽ ƾ ƿ
U+01Cx ǀ ǁ ǂ ǃ DŽ Dž dž LJ Lj lj NJ Nj nj Ǎ ǎ Ǐ
U+01Dx ǐ Ǒ ǒ Ǔ ǔ Ǖ ǖ Ǘ ǘ Ǚ ǚ Ǜ ǜ ǝ Ǟ ǟ
U+01Ex Ǡ ǡ Ǣ ǣ Ǥ ǥ Ǧ ǧ Ǩ ǩ Ǫ ǫ Ǭ ǭ Ǯ ǯ
U+01Fx ǰ DZ Dz dz Ǵ ǵ Ƕ Ƿ Ǹ ǹ Ǻ ǻ Ǽ ǽ Ǿ ǿ
U+020x Ȁ ȁ Ȃ ȃ Ȅ ȅ Ȇ ȇ Ȉ ȉ Ȋ ȋ Ȍ ȍ Ȏ ȏ
U+021x Ȑ ȑ Ȓ ȓ Ȕ ȕ Ȗ ȗ Ș ș Ț ț Ȝ ȝ Ȟ ȟ
U+022x Ƞ ȡ Ȣ ȣ Ȥ ȥ Ȧ ȧ Ȩ ȩ Ȫ ȫ Ȭ ȭ Ȯ ȯ
U+023x Ȱ ȱ Ȳ ȳ ȴ ȵ ȶ ȷ ȸ ȹ Ⱥ Ȼ ȼ Ƚ Ⱦ ȿ
U+024x ɀ Ɂ ɂ Ƀ Ʉ Ʌ Ɇ ɇ Ɉ ɉ Ɋ ɋ Ɍ ɍ Ɏ ɏ
Notes
1. As of Unicode version 15.1

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Latin Extended-B block:

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. version 1.0. Unicode Consortium.
  4. Kaplan, Michael. "The history of messing up Romanian on computers". Sorting it all out.

This article uses material from the Wikipedia article Latin_Extended-B, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.