Digraph (orthography)

A digraph or digram (from the Greek: δίς dís, "double" and γράφω gráphō, "to write") is a pair of characters used in the orthography of a language to write either a single phoneme (distinct sound), or a sequence of phonemes that does not correspond to the normal values of the two characters combined.

In Welsh, the digraph Ll, ll fused for a time into a ligature.

Some digraphs represent phonemes that cannot be represented with a single character in the writing system of a language, like the English sh in ship and fish. Other digraphs represent phonemes that can also be represented by single characters. A digraph that shares its pronunciation with a single character may be a relic from an earlier period of the language when the digraph had a different pronunciation, or may represent a distinction that is made only in certain dialects, like the English wh. Some such digraphs are used for purely etymological reasons, like rh in English. Digraphs are used in some Romanization schemes, like the zh often used to represent the Russian letter ж.

As an alternative to digraphs, orthographies and Romanization schemes sometimes use letters with diacritics. The Czech and Slovak š has the same function as the English digraph sh. The Romanian ț has the same function as the letter c in Slavic Latin alphabets and as the digraph ts. The letter ť, used in Czech and Slovak, has the same function as the Hungarian digraph ty. Similarly, the letters with a cedilla used in a few Turkic languages correspond to English digraphs ending in h: ç is rendered as ch, and ş is rendered as sh.
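The choice between digraphs and diacritics in a Romanization scheme can be illustrated with a minimal sketch. The two tables below are hypothetical and deliberately incomplete: they cover only the nine Cyrillic letters whose digraph and diacritic renderings are listed later in this article, and every other character is passed through unchanged. The names `DIGRAPH_SCHEME`, `DIACRITIC_SCHEME` and `romanize` are illustrative, not part of any standard.

```python
# Hypothetical, partial tables; the values follow the examples in this article.
DIGRAPH_SCHEME = {
    "ё": "yo", "ж": "zh", "х": "kh", "ц": "ts", "ч": "ch",
    "ш": "sh", "щ": "shch", "ю": "yu", "я": "ya",
}
DIACRITIC_SCHEME = {
    "ё": "ë", "ж": "ž", "х": "ch", "ц": "c", "ч": "č",
    "ш": "š", "щ": "šč", "ю": "ju", "я": "ja",
}


def romanize(text, scheme):
    """Replace each mapped Cyrillic letter; leave everything else unchanged."""
    out = []
    for ch in text:
        repl = scheme.get(ch.lower())
        if repl is None:
            out.append(ch)                 # not in the table: keep as is
        elif ch.isupper():
            out.append(repl.capitalize())  # e.g. "Ж" gives "Zh"
        else:
            out.append(repl)
    return "".join(out)


print(romanize("жш", DIGRAPH_SCHEME))    # digraph rendering
print(romanize("жш", DIACRITIC_SCHEME))  # diacritic rendering
```

The sketch makes the trade-off visible: the digraph table needs only unaccented Latin letters, but one Cyrillic letter can expand to as many as four Latin letters, while the diacritic table stays closer to one letter per letter.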

In some languages' orthographies, digraphs (and occasionally trigraphs) are considered individual letters, which means that they have their own place in the alphabet and cannot be separated into their constituent graphemes when sorting, abbreviating or hyphenating words. Examples are found in Hungarian (cs, dz, dzs, gy, ly, ny, sz, ty, zs), Czech (ch), Slovak (ch, dz, dž), Albanian (dh, gj, ll, nj, rr, sh, th, xh, zh) and Gaj's Latin alphabet (lj, nj, dž). The Uzbek Latin alphabet used sh, ch and ng before the 2021 reform; it still uses the digraph ng, and the digraph ts was replaced by the letter c. The 2018 version of the Kazakh Latin alphabet contained a few digraphs and one tetragraph (sh, ch, shch, ıo), and the newer version still has one digraph (şç).

In Dutch, when the digraph ij is capitalized, both characters are written in uppercase (IJ). Albanian shows the reverse situation: the single letter x is pronounced like the digraph dz. Māori has two digraphs that are part of its alphabet, ng and wh. The official Welsh alphabet contains eight digraphs (ch, dd, ff, ng, ll, ph, rh, th). The official Maltese alphabet contains two digraphs (għ and ie).

Romanization of Cyrillic alphabets, especially those of Slavic languages such as Russian, turns some single letters into digraphs or longer sequences. The letters ё, ж, х, ц, ч, ш, щ, ю, я can be transliterated as jo/yo, zh, kh, ts, ch, sh, shch, yu/ju, ya/ja. Other schemes add diacritics instead (ë, ž, č, š), although some digraphs remain (šč, ju/yu, ja/ya). In these schemes kh is sometimes written ch or x, and ts is sometimes written c.
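The rule that a digraph letter cannot be split when sorting can be sketched in code. The example below assumes a simplified Czech letter order in which ch follows h, and it leaves out all letters with diacritics to stay short. The names `CZECH_ORDER`, `letters` and `sort_key` are hypothetical; real software would normally rely on a locale-aware collation library rather than a hand-written table.

```python
# Assumed, simplified Czech order: plain a-z with "ch" placed after "h".
# Letters with diacritics are omitted to keep the sketch short.
CZECH_ORDER = list("abcdefgh") + ["ch"] + list("ijklmnopqrstuvwxyz")
RANK = {letter: i for i, letter in enumerate(CZECH_ORDER)}
# Try longer letters first so that "ch" wins over "c" followed by "h".
BY_LENGTH = sorted(CZECH_ORDER, key=len, reverse=True)


def letters(word):
    """Split a word into alphabet letters, keeping digraphs whole."""
    word = word.lower()
    result, i = [], 0
    while i < len(word):
        for letter in BY_LENGTH:
            if word.startswith(letter, i):
                result.append(letter)
                i += len(letter)
                break
        else:
            # Unknown character: keep it as its own unit.
            result.append(word[i])
            i += 1
    return result


def sort_key(word):
    # Unknown characters sort after every known letter.
    return [RANK.get(letter, len(RANK)) for letter in letters(word)]


words = ["cesta", "chata", "hrad", "idea"]
print(sorted(words))                # plain code-point order
print(sorted(words, key=sort_key))  # digraph-aware order
```

In plain code-point order "chata" sorts next to "cesta", because it is compared as c followed by h. With the digraph-aware key it moves between "hrad" and "idea", because ch is compared as a single letter that comes after h.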
The Czech alphabet had many digraphs a few hundred years ago. Over time most of them were replaced by letters with diacritics, but Czech kept those digraphs whose sounds the diacritical letters cannot represent (ch, dz, dž). The Slovak alphabet developed in the same way: it once had many digraphs, replaced most of them with diacritical letters, and kept those that the diacritical letters cannot represent. In both languages the letter q is pronounced like the sequence kv. Many alphabet reforms have followed this pattern of replacing some or all digraphs with letters bearing a diacritic. In Sweden and Norway before 1917, for example, the letter Å, å was written as a double a (aa), and during the 1800s Å became more popular in both countries. The older spelling has some quirks: if aa were still used today in place of å in words such as "Håa" and "Sjåast", the result would contain three consecutive a's ("Haaa" and "Sjaaast").
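The triple-a quirk can be reproduced with a short sketch. The function names below are hypothetical, and the second function is an added illustration rather than something stated in the text: it shows that once å has been written as aa, a run of three a's can be read back in more than one way.

```python
def a_ring_to_digraph(word):
    """Write å as the older digraph aa (and Å as Aa)."""
    return word.replace("å", "aa").replace("Å", "Aa")


def possible_readings(word):
    """List every way of reading 'aa' pairs back as å (illustration only)."""
    results = {word}
    for i in range(len(word) - 1):
        if word[i:i + 2] == "aa":
            # Replace this pair, then vary the remainder recursively.
            for rest in possible_readings(word[i + 2:]):
                results.add(word[:i] + "å" + rest)
    return sorted(results)


print(a_ring_to_digraph("Håa"))     # Haaa
print(a_ring_to_digraph("Sjåast"))  # Sjaaast
print(possible_readings("Haaa"))    # includes both "Håa" and "Haå"
```

A single letter with a diacritic avoids this ambiguity, because each å is written in exactly one way.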

Digraphs may develop into ligatures, but a ligature is a distinct concept: it is a graphical combination of two characters, as when a and e are fused into æ, or o and e into œ. Both ligatures are still used in some languages. Æ is used in Scandinavian languages, specifically Icelandic, Norwegian and Danish. Swedish formerly used Æ but has replaced it with Ä. Œ is used in French, but it is usually typed as two keystrokes (OE/oe) rather than with a dedicated key on the French keyboard or with the AltGr key. In Canada, the Canadian Multilingual Standard keyboard layout uses the right Ctrl key to reach additional characters, including œ and other foreign characters, and provides dead keys for adding diacritics to letters in languages that use them. The Dutch digraph ij is a special case: in handwriting, the capital form (IJ) looks very similar to, if not indistinguishable from, a cursive capital Y, while the lowercase form looks like a y with a diaeresis (ÿ).
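The difference between a digraph and a ligature also shows up in how text is encoded. The following minimal sketch uses Python's standard `unicodedata` module; the helper name `describe` is hypothetical. It compares the two-letter sequence ij, the single compatibility character U+0133 that Unicode provides for the Dutch ij, and the letters æ, œ and ÿ.

```python
import unicodedata


def describe(text):
    """Return (number of code points, NFKC-normalized form) for a string."""
    return len(text), unicodedata.normalize("NFKC", text)


# A digraph typed as two letters is simply two code points.
print(describe("ij"))
# U+0133 (ij ligature) is one code point; NFKC expands it to "i" + "j".
print(describe("\u0133"))
# U+00E6 (æ) and U+0153 (œ) are single letters with no decomposition,
# so NFKC leaves them unchanged.
print(describe("\u00e6"), describe("\u0153"))
# U+00FF (ÿ) decomposes under NFD into "y" plus a combining diaeresis,
# so it is a different character from either form of ij.
print(unicodedata.normalize("NFD", "\u00ff") == "y\u0308")
```

Under these normalization rules, software treats æ and œ as letters in their own right, while the ij character is a compatibility form of the ordinary two-letter digraph.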