GB_6345.1

CCITT Chinese set (ISO-IR 165)
MIME / IANA	iso-ir-165
Alias(es)	CN-GB-ISOIR165 (EUC form)
Language(s)	Simplified Chinese, English, Russian; partial support: Greek, Japanese
Standard	ITU T.101, annex C
Definitions	ISO-IR 165
Extends	GB 2312
Encoding formats	ISO-2022-CN-EXT, Videotex Data Syntax 2
Succeeded by	GB 18030

ISO-IR-165


The CCITT Chinese Primary Set^[2] is a multi-byte graphic character set for Chinese communications, created for the International Telegraph and Telephone Consultative Committee (CCITT) in 1992.^[3] It is defined in ITU T.101, annex C, which codifies Videotex Data Syntax 2.^[2] It is registered in the ISO-IR registry for use with ISO/IEC 2022 as ISO-IR-165,^[4] and is encodable in the ISO-2022-CN-EXT code version.^[1]
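As a rough illustration of what "encodable in ISO-2022-CN-EXT" means in practice, the sketch below wraps a run of ISO-IR-165 characters in the designation and shift bytes. It assumes the RFC 1922 convention that `ESC $ ) E` designates ISO-IR-165 to G1, with SO and SI switching in and out of the double-byte set; the function name `encode_iso_ir_165_run` is hypothetical, and the sketch ignores the rule that designations are reset at line boundaries.

```python
# Minimal sketch, not a full ISO-2022-CN-EXT encoder.
# Assumption (per RFC 1922): ESC $ ) E designates ISO-IR-165 to G1.
ESC_ISO_IR_165_TO_G1 = b"\x1b$)E"
SO = b"\x0e"  # shift out: following bytes come from G1
SI = b"\x0f"  # shift in: back to ASCII


def encode_iso_ir_165_run(cells):
    """Encode (row, cell) pairs as one 7-bit ISO-IR-165 run (hypothetical helper)."""
    out = bytearray(ESC_ISO_IR_165_TO_G1 + SO)
    for row, cell in cells:
        if not (1 <= row <= 94 and 1 <= cell <= 94):
            raise ValueError(f"row-cell out of range: {row}-{cell}")
        # The 7-bit form of a row-cell value is each number plus 0x20.
        out += bytes((row + 0x20, cell + 0x20))
    out += SI
    return bytes(out)
```

For example, `encode_iso_ir_165_run([(79, 81)])` yields the designation sequence, SO, the two bytes 0x6F 0x71, and SI.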


It is an extended modification of GB/T 2312-80, and corresponds to the union of the mainland Chinese GB standards GB 6345.1-86 and GB 8565.2-88, with some further modifications and extensions. A subset of the GB 6345.1 extensions is incorporated into GB 18030, while GB 8565.2 serves as the mainland Chinese source reference for certain CJK Unified Ideographs.

GB 6345.1

GB 6345.1-86 (32 × 32 Dot Matrix Font Set of Chinese Ideographs for Information Interchange) includes both a corrigendum and an extension for GB 2312.^[3] The corrigendum alters the following two characters:

[a] 03-71 (EUC 0xA3E7): Corresponds to U+FF47 ｇ FULLWIDTH LATIN SMALL LETTER G in Unicode; however, the amended reference glyph can also correspond to U+0261 ɡ LATIN SMALL LETTER SCRIPT G. See below for how U+0261 is typically mapped to and from GB 6345.1, versus how it is mapped to and from ISO-IR-165. GB 18030 reverts this code to the original^[5] looped glyph.^[6]
[b] 79-81 (EUC 0xEFF1): The unamended reference glyph is the Traditional Chinese character 鍾 (U+937E). In Simplified Chinese this character is usually replaced with 钟 (U+949F, also the simplification of 鐘), except in personal names; the amended glyph 锺 (U+953A) is an alternate simplified form.

Deployed implementations incorporating GB 2312, such as Windows code page 936, generally follow the corrigendum in mapping 79-81 to U+953A.^[7]

The extension adds half-width ISO 646-CN characters in row 10 (in addition to the existing full-width characters in row 3) and extends the set of 26 non-ASCII pinyin characters in row 8 with six more. GB/T 12345, the Traditional Chinese counterpart to GB 2312, incorporates these GB 6345.1 extensions and also adds 29 vertical presentation forms in row 6.^[3]^[8]

The later GB/T 6345.1-2010, published in 2011, officially adds half-width forms of the 32 pinyin characters in row 8 (including the six new additions) to row 11.^[9] This addition is not featured in GB 18030.^[6]

The six additional pinyin characters from GB 6345.1 and the vertical presentation forms from GB/T 12345, but not the half-width forms, are included in the classic Mac OS encoding for Simplified Chinese (a modification of EUC-CN),^[10] and are also encoded as two-byte codes in GB 18030.^[6] The additional pinyin characters are as follows:^[10]

[a] 08-28 (EUC 0xA8BC): Mapped to the Private Use Area code point U+E7C7 by Windows code page 936^[11] and by the first (2000) edition of GB 18030; this was amended by the 2005 edition.^[6]
[b] 08-31 (EUC 0xA8BF): This precomposed character was added in Unicode 3.0. Before that, Apple mapped it to its combining sequence (U+006E U+0300).^[10] The change predates the stabilisation of Unicode normalisation forms, which was introduced in Unicode 3.1.^[12] Windows code page 936 maps it to U+E7C8.^[11]
[c] 08-32 (EUC 0xA8C0): Matches the unamended reference glyph for 03-71 (see above) in being a looped g, despite typically being mapped to U+0261. Mappings used for ISO-IR-165 differ (see below). GB 18030 reverts 03-71 to the looped g and makes this one the open g.^[6]
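Text decoded through older tables may therefore contain the Private Use Area code points or the decomposed sequence rather than the characters now assigned. The following is a minimal sketch of a clean-up pass under that assumption; the table `LEGACY_PUA_TO_STANDARD` and the function `modernize` are illustrative names, not part of any standard library, and the two PUA pairings are taken only from the notes above.

```python
import unicodedata

# Assumption: these two PUA code points were produced by a code page 936
# style decoder for 08-28 and 08-31, as described in the notes above.
LEGACY_PUA_TO_STANDARD = {
    "\ue7c7": "\u1e3f",  # 08-28: LATIN SMALL LETTER M WITH ACUTE
    "\ue7c8": "\u01f9",  # 08-31: LATIN SMALL LETTER N WITH GRAVE
}

_PUA_TABLE = str.maketrans(LEGACY_PUA_TO_STANDARD)


def modernize(text: str) -> str:
    """Replace the legacy PUA code points, then compose to NFC.

    NFC turns the older 'n' + COMBINING GRAVE ACCENT sequence
    (U+006E U+0300) into the precomposed U+01F9.
    """
    return unicodedata.normalize("NFC", text.translate(_PUA_TABLE))
```

Note that NFC affects every composable sequence in the input, not only the pinyin letters, so a pass like this is best applied to text that is expected to be normalised anyway.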

These extensions and modifications to GB 2312 were first introduced in GB 5007.1-85 in 1985.


This article uses material from the Wikipedia article GB_6345.1, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.


References

[1] Zhu, HF.; Hu, DY.; Wang, ZG.; Kao, TC.; Chang, WCH.; Crispin, M. (1996). "Chinese Character Encoding for Internet Messages". Requests for Comments. IETF. doi:10.17487/rfc1922. RFC 1922.

[2] Chung, Jaemin (2018-01-24). "Pseudo-G8 characters" (PDF). ISO/IEC JTC 1/SC 2/WG 2/IRG N2276.

[3] Lunde, Ken (2009). CJKV Information Processing: Chinese, Japanese, Korean & Vietnamese Computing (2nd ed.). Sebastopol, CA: O'Reilly. pp. 94–111. ISBN 978-0-596-51447-1.

[4] CCITT (1992-07-13). Codes of the Chinese graphic character set for communication (PDF). ITSCJ/IPSJ. ISO-IR-165.

[5] China Association for Standardization. Coded Chinese Graphic Character Set for Information Interchange (PDF). ITSCJ/IPSJ. ISO-IR-58.

[6] Standardization Administration of China (SAC) (2005-11-18). GB 18030-2005: Information Technology—Chinese coded character set.

[7] Steele, Shawn (2000). "cp936 to Unicode table". Microsoft, Unicode Consortium.

[8] Lunde, Ken (1998). Appendix F: GB/T 12345 (PDF). O'Reilly Media. ISBN 9781565922242.

[9] Standardization Administration of China (SAC) (2011-01-10). GB/T 6345.1-2010 信息技术汉字编码字符集(基本集) 32点阵字型第1部分宋体 (in Chinese).

[10] "Map (external version) from Mac OS Chinese Simplified encoding to Unicode 3.0 and later". Apple, Inc.

[11] Microsoft. "CODEPAGE 936: PRC GBK (XGB) - ANSI, OEM". Unicode Consortium.

[12] "Unicode Character Encoding Stability Policies". Unicode Consortium. 2017-06-23.

[13] Viswanadha, Raghuram (2000-08-30). "Unicode to ISO-IR-165 table". International Components for Unicode. IBM. (Note: codes are listed in the source in 7-bit form: add 0x80 to each byte for EUC form, or subtract 0x20 for kuten form.)


Characters altered by the GB 6345.1 corrigendum

Row-cell	EUC	GB 2312 (unamended)^[5]	GB 6345.1	Notes
03-71	0xA3E7	g (looped glyph)	ɡ	[a]
79-81	0xEFF1	鍾	锺	[b]

Pinyin characters added by GB 6345.1

Row-cell	EUC	Character^[10]^[6]	Notes
08-27	0xA8BB	U+0251 ɑ LATIN SMALL LETTER ALPHA	
08-28	0xA8BC	U+1E3F ḿ LATIN SMALL LETTER M WITH ACUTE	[a]
08-29	0xA8BD	U+0144 ń LATIN SMALL LETTER N WITH ACUTE	
08-30	0xA8BE	U+0148 ň LATIN SMALL LETTER N WITH CARON	
08-31	0xA8BF	U+01F9 ǹ LATIN SMALL LETTER N WITH GRAVE	[b]
08-32	0xA8C0	U+0261 ɡ LATIN SMALL LETTER SCRIPT G	[c]
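The Row-cell and EUC columns in these tables are related by simple arithmetic: going by the note on reference [13], the 7-bit form is the row-cell value plus 0x20 per byte, and the EUC form is the 7-bit form plus 0x80, so each EUC byte is the row or cell number plus 0xA0. A small sketch of that conversion follows; the function names are illustrative.

```python
def parse_row_cell(text):
    """Parse a row-cell string such as '08-32' into (row, cell) integers."""
    row, cell = text.split("-")
    return int(row), int(cell)


def row_cell_to_7bit(row, cell):
    """Return the two-byte 7-bit (ISO 2022) form of a row-cell value."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError(f"row-cell out of range: {row}-{cell}")
    return bytes((row + 0x20, cell + 0x20))


def row_cell_to_euc(row, cell):
    """Return the two-byte EUC form: the 7-bit form with the high bit set."""
    return bytes(b + 0x80 for b in row_cell_to_7bit(row, cell))


def euc_to_row_cell(euc):
    """Invert row_cell_to_euc for a two-byte EUC sequence."""
    if len(euc) != 2 or not all(0xA1 <= b <= 0xFE for b in euc):
        raise ValueError("not a two-byte EUC sequence")
    return euc[0] - 0xA0, euc[1] - 0xA0
```

For instance, `row_cell_to_euc(*parse_row_cell("79-81"))` gives the bytes 0xEF 0xF1, matching the table entry for 锺.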


Reference glyphs and Unicode mappings for the affected codes

Row-cell	EUC	GB 2312 (unamended)^[5]	GB 6345.1^[9]	GB 6345.1 mapping^[10]	ISO-IR-165^[4]	ISO-IR-165 mapping^[13]	GB 18030^[6]	GB 18030 mapping^[6]
03-71	0xA3E7	g (looped glyph)	ɡ	U+FF47	ɡ	U+0261	g (looped glyph)	U+FF47
08-32	0xA8C0	(absent)	g (looped glyph)	U+0261	g (looped glyph)	U+FF47	ɡ	U+0261
79-81	0xEFF1	鍾	锺	U+953A	锺	U+953A	锺	U+953A
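The mapping columns of this table can be treated as data to see where a decoder's choice of table matters. The sketch below transcribes only the three rows shown here (it is not a complete mapping table) and reports the row-cell values at which two variants disagree; `MAPPINGS` and `differing_cells` are illustrative names.

```python
# Unicode mappings transcribed from the three table rows above only.
# Keys are (row, cell); values are the mapped characters.
MAPPINGS = {
    "GB 6345.1": {(3, 71): "\uff47", (8, 32): "\u0261", (79, 81): "\u953a"},
    "ISO-IR-165": {(3, 71): "\u0261", (8, 32): "\uff47", (79, 81): "\u953a"},
    "GB 18030": {(3, 71): "\uff47", (8, 32): "\u0261", (79, 81): "\u953a"},
}


def differing_cells(variant_a, variant_b):
    """List the (row, cell) codes whose mappings differ between two variants."""
    table_a, table_b = MAPPINGS[variant_a], MAPPINGS[variant_b]
    return sorted(code for code in table_a if table_a[code] != table_b[code])
```

With this data, the typical GB 6345.1 mapping and the ISO-IR-165 mapping differ at 03-71 and 08-32, where U+FF47 and U+0261 are exchanged, while the GB 6345.1 and GB 18030 mappings agree on all three rows.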