Chapter 12 of the Unicode specification[2] defines the syntax of "Ideographic Description Sequences" (IDS), which describe characters in featural terms as arrangements of components that have their own code points. Sixteen special characters in the range U+2FF0 to U+2FFF act as prefix operators that combine other characters or sequences into larger characters.
Ideographic Description Characters in Unicode
Character | Unicode Character Number | Full Unicode Name |
⿰ | U+2FF0 | Ideographic description character left to right |
⿱ | U+2FF1 | Ideographic description character above to below |
⿲ | U+2FF2 | Ideographic description character left to middle and right |
⿳ | U+2FF3 | Ideographic description character above to middle and below |
⿴ | U+2FF4 | Ideographic description character full surround |
⿵ | U+2FF5 | Ideographic description character surround from above |
⿶ | U+2FF6 | Ideographic description character surround from below |
⿷ | U+2FF7 | Ideographic description character surround from left |
⿼ | U+2FFC | Ideographic description character surround from right |
⿸ | U+2FF8 | Ideographic description character surround from upper left |
⿹ | U+2FF9 | Ideographic description character surround from upper right |
⿺ | U+2FFA | Ideographic description character surround from lower left |
⿽ | U+2FFD | Ideographic description character surround from lower right |
⿻ | U+2FFB | Ideographic description character overlaid |
⿾ | U+2FFE | Ideographic description character horizontal reflection |
⿿ | U+2FFF | Ideographic description character rotation |
Two related characters are encoded in other Unicode blocks. U+31EF ㇯ IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION is an ideographic description character located in the CJK Strokes block. U+303E 〾 IDEOGRAPHIC VARIATION INDICATOR is not officially an ideographic description character, but is sometimes used in ideographic description sequences.
Other related Ideographic Description Characters in Unicode
Character | Unicode Character Number | Block | Full Unicode Name |
〾 | U+303E | CJK Symbols and Punctuation | Ideographic variation indicator |
㇯ | U+31EF | CJK Strokes | Ideographic description character subtraction |
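Because every operator is written before its operands, a sequence can be read with a short recursive-descent parser. The following Python sketch is illustrative only: the function name `parse_ids` and the `ARITY` table are hypothetical, and the operand counts (one for U+2FFE and U+2FFF, three for U+2FF2 and U+2FF3, two for the rest, including U+31EF) are an assumption based on the grammar in the Unicode Standard that should be checked against the current version. Any character that is not an operator is treated as a leaf component, which is looser than the standard's rules.

```python
# Operand count per operator (assumed; verify against the Unicode Standard).
ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFE)}  # U+2FF0..U+2FFD
ARITY["\u2FF2"] = ARITY["\u2FF3"] = 3                 # three-part layouts
ARITY["\u2FFE"] = ARITY["\u2FFF"] = 1                 # reflection, rotation
ARITY["\u31EF"] = 2                                   # subtraction


def parse_ids(text):
    """Parse an IDS string into a nested tuple: (operator, child, ...)."""

    def parse_at(pos):
        if pos >= len(text):
            raise ValueError("unexpected end of sequence")
        ch = text[pos]
        pos += 1
        if ch not in ARITY:
            return ch, pos  # leaf component
        children = []
        for _ in range(ARITY[ch]):
            child, pos = parse_at(pos)
            children.append(child)
        return (ch, *children), pos

    tree, end = parse_at(0)
    if end != len(text):
        raise ValueError(f"trailing characters after position {end}")
    return tree
```

Calling `parse_ids("⿱木⿰木木")` yields a tree whose root is ⿱ with the leaf 木 as its first child and a nested ⿰ node as its second. Truncated input such as `"⿰書"` raises `ValueError`.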
These sequences are useful for describing to the reader a character that is not directly printable, either because it is missing from a given font or because it is not encoded in the Unicode standard at all. For example, the Sawndip character encoded in CJK Unified Ideographs Extension F as U+2DA21 𭨡 can be described as ⿰書史. Another use is dictionary lookup, where a sequence serves as a rough input method for queries.
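As a rough illustration of the dictionary-lookup use, the sketch below searches a tiny hand-written table for characters whose description contains all requested components. The table `IDS_TABLE` and the helper names are hypothetical, and the five entries are common decompositions chosen for the example rather than data taken from any particular IDS database. The search ignores layout and how many times a component occurs.

```python
# Hypothetical mini-database: character -> IDS.
IDS_TABLE = {
    "好": "⿰女子",
    "明": "⿰日月",
    "林": "⿰木木",
    "森": "⿱木⿰木木",
    "休": "⿰亻木",
}

# Description operators: the U+2FF0..U+2FFF block plus U+31EF.
OPERATORS = {chr(cp) for cp in range(0x2FF0, 0x3000)} | {"\u31EF"}


def components(ids):
    """Return the set of non-operator characters used in an IDS."""
    return {ch for ch in ids if ch not in OPERATORS}


def find_by_components(wanted):
    """Return characters whose IDS contains every component in `wanted`."""
    wanted = set(wanted)
    return sorted(
        char for char, ids in IDS_TABLE.items() if wanted <= components(ids)
    )
```

With this table, `find_by_components("女子")` returns only 好, while `find_by_components("木")` returns 休, 林 and 森. A real lookup tool would also need to match on layout, which is what distinguishes 林 from other arrangements of the same component.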
These sequences can be rendered either by displaying the individual characters of the sequence as they are, or by parsing the Ideographic Description Sequence and drawing the ideograph it describes.[3] They do not, by themselves, provide unambiguous rendering for all characters. For instance, the sequence ⿱十一 can describe both ⼟ 'EARTH', in which the upper bar is shorter than the lower one, and ⼠ 'SCHOLAR', in which the upper bar is longer.
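The ambiguity can be made concrete by inverting a character-to-description table and listing every description shared by more than one character. This is a minimal sketch with assumed toy data: the table `DESCRIPTIONS` is hypothetical, and it uses the unified ideographs U+571F and U+58EB rather than the Kangxi radical forms.

```python
from collections import defaultdict

# Assumed toy data: two characters share one description.
DESCRIPTIONS = {
    "\u571F": "⿱十一",  # 'earth'
    "\u58EB": "⿱十一",  # 'scholar'
    "\u660E": "⿰日月",  # 'bright'
}


def ambiguous_sequences(table):
    """Map each IDS used by two or more characters to those characters."""
    by_ids = defaultdict(list)
    for char, ids in table.items():
        by_ids[ids].append(char)
    return {ids: chars for ids, chars in by_ids.items() if len(chars) > 1}
```

Here `ambiguous_sequences(DESCRIPTIONS)` reports only ⿱十一, since a renderer given that sequence alone cannot tell which of the two characters was intended.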
Unicode's specification for these sequences is based on the characters and syntax of the earlier GBK standard. Additional symbols were later encoded to fill in the missing combinations.
The IDSgrep free software package by Matthew Skala[4][5] extends Unicode's IDS syntax to include additional features for dictionary lookup; it is capable of converting KanjiVG's database to its own extended IDS format, or of searching EIDS files generated by the related Tsukurimashou font family.