CMU Pronouncing Dictionary
Developer(s)	Carnegie Mellon University
Stable release	0.7b / November 19, 2014
Available in	English
License	BSD
Website	www.speech.cs.cmu.edu/cgi-bin/cmudict

Machine-readable pronunciations

The CMU Pronouncing Dictionary (also known as CMUdict) is an open-source pronouncing dictionary originally created by the Speech Group at Carnegie Mellon University (CMU) for use in speech recognition research.

CMUdict maps the orthographic forms of English words to phonetic transcriptions of their North American pronunciations. It is commonly used to generate pronunciation representations for automatic speech recognition (ASR), e.g. the CMU Sphinx system, and for text-to-speech synthesis (TTS), e.g. the Festival system. CMUdict can also serve as a training corpus for building statistical grapheme-to-phoneme (g2p) models^[1] that generate pronunciations for words not yet included in the dictionary.
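As a rough illustration of the word-to-phoneme mapping described above, the following Python sketch parses a few dictionary-style lines into a lookup table. It assumes the plain-text layout used by the 0.7b release (one entry per line, the word followed by space-separated ARPABET symbols, alternate pronunciations marked with a numeric suffix such as `(1)`, and comment lines starting with `;;;`). The sample lines and the names `SAMPLE`, `ENTRY`, and `parse_cmudict` are illustrative only and are not part of CMUdict itself.

```python
import re
from collections import defaultdict

# Illustrative sample lines; the layout is an assumption based on release 0.7b.
SAMPLE = """\
;;; comment lines start with three semicolons
HELLO  HH AH0 L OW1
HELLO(1)  HH EH0 L OW1
WORLD  W ER1 L D
"""

# word, optional "(n)" variant marker, whitespace, then the phoneme string
ENTRY = re.compile(r"^(?P<word>\S+?)(?:\((?P<variant>\d+)\))?\s+(?P<phones>.+)$")


def parse_cmudict(text):
    """Map each lowercased word to a list of pronunciations (lists of symbols)."""
    entries = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        match = ENTRY.match(line)
        if match is None:
            continue  # skip anything that does not look like an entry
        entries[match["word"].lower()].append(match["phones"].split())
    return dict(entries)


pronunciations = parse_cmudict(SAMPLE)
print(pronunciations["hello"])
```

Grouping variants under one key keeps alternate pronunciations together, which is convenient when the table is later used as g2p training data.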

The most recent release is 0.7b; it contains over 134,000 entries. An interactive lookup version is available.^[2]

This article uses material from the Wikipedia article CMU_Pronouncing_Dictionary, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.

[1] "Sequitur G2P - A trainable Grapheme-to-Phoneme converter".

[2] "The CMU Pronouncing Dictionary". CMU Pronouncing Dictionary. 2015-07-16. Archived from the original on 2022-06-03. Retrieved 2022-06-04.

[3] ftp://ftp.cs.cmu.edu/project/speech/dict/%5B%5D

[4] http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/00README_FIRST.txt

[5] "Cmusphinx - Revision 10973: /Trunk/Logios". Archived from the original on 2011-05-20. Retrieved 2009-12-19.

ARPABET	Respelling	IPA	Example
`AA`	ah	ɑ	odd
`AE`	a	æ	at
`AH0`	ə	ə	about
`AH`	uh	ʌ	hut
`AO`	aw	ɔ	ought, story
`AW`	ow	aʊ	cow
`AY`	eye	aɪ	hide
`EH`	eh	ɛ	Ed
`ER`	ur, ər	ɝ, ɚ	hurt
`EY`	ay	eɪ	ate
`IH`	i, ih	ɪ	it
`IY`	ee	i	eat
`OW`	oh	oʊ	oat
`OY`	oy	ɔɪ	toy
`UH`	uu	ʊ	hood
`UW`	oo	u	two

Stress digit	Description
0	No stress
1	Primary stress
2	Secondary stress
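The stress digits in the table above are appended to vowel symbols (for example `AH0`). A minimal Python sketch of separating the digit from the base symbol and summarising a word's stress pattern follows. The function names `split_stress` and `stress_pattern` are hypothetical helpers, and the example transcription of "about" is given for illustration.

```python
# Labels taken from the stress table; keys are the trailing digits.
STRESS_LABELS = {"0": "no stress", "1": "primary stress", "2": "secondary stress"}


def split_stress(symbol):
    """Return (base_symbol, stress_digit), with None for symbols without a digit."""
    if symbol and symbol[-1] in STRESS_LABELS:
        return symbol[:-1], symbol[-1]
    return symbol, None


def stress_pattern(phones):
    """Concatenate the stress digits of all vowels in a pronunciation."""
    digits = (digit for _, digit in map(split_stress, phones))
    return "".join(d for d in digits if d is not None)


about = ["AH0", "B", "AW1", "T"]  # illustrative transcription of "about"
print(split_stress("AH0"))    # ('AH', '0')
print(stress_pattern(about))  # 01
```

A pattern string such as `01` is a compact way to compare words by rhythm, for instance when searching for rhymes or checking metre.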

ARPABET	Respelling	IPA	Example
`B`	b	b	be
`CH`	ch, tch	tʃ	cheese
`D`	d	d	dee
`DH`	dh	ð	thee
`F`	f	f	fee
`G`	g	ɡ	green
`HH`	h	h	he
`JH`	j	dʒ	gee
`K`	k	k	key
`L`	l	l	lee
`M`	m	m	me
`N`	n	n	knee
`NG`	ng	ŋ	ping
`P`	p	p	pee
`R`	r	r	read
`S`	s, ss	s	sea
`SH`	sh	ʃ	she
`T`	t	t	tea
`TH`	th	θ	theta
`V`	v	v	vee
`W`	w, wh	w	we
`Y`	y	j	yield
`Z`	z	z	zee
`ZH`	zh	ʒ	seizure
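The symbol tables can be turned into a simple ARPABET-to-IPA converter. The Python sketch below copies the IPA values from the tables. Two choices are assumptions made for the example rather than something the tables prescribe: unstressed `ER` is rendered as ɚ and stressed `ER` as ɝ (the table lists both without saying which applies when), and stress marks are otherwise dropped from the output. The names `ARPABET_TO_IPA` and `to_ipa` are illustrative.

```python
# IPA values copied from the vowel and consonant tables.
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ", "IY": "i", "OW": "oʊ",
    "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "r", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}


def to_ipa(phones):
    """Convert a list of ARPABET symbols (with optional stress digits) to IPA."""
    out = []
    for symbol in phones:
        base = symbol.rstrip("012")       # drop the trailing stress digit
        unstressed = symbol.endswith("0")
        if base == "AH" and unstressed:
            out.append("ə")               # AH0 is listed as schwa in the table
        elif base == "ER" and unstressed:
            out.append("ɚ")               # assumption: unstressed ER maps to ɚ
        else:
            out.append(ARPABET_TO_IPA[base])
    return "".join(out)


print(to_ipa(["HH", "AH1", "T"]))         # hʌt
print(to_ipa(["AH0", "B", "AW1", "T"]))   # əbaʊt
```

An unknown symbol raises `KeyError`, which is usually preferable to silently emitting a wrong transcription.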

Version	Release date^[3]	License
0.1	16 September 1993	Public Domain
0.2	10 March 1994	Public Domain
0.3	28 September 1994	Public Domain
0.4	8 November 1995	Public Domain
0.5	No public release	Public Domain
0.6	11 August 1998	Public Domain
0.7	No public release	Public Domain
0.7a	18 February 2008	2-clause BSD
0.7b	19 November 2014^[4]	2-clause BSD
GitHub (unversioned)	26 May 2021	2-clause BSD