Date: 11-8-95

Files: README (this file), cmudict.0.1.Z (compressed), cmulex.0.1.Z,
       cmudict.0.2.Z (compressed), cmudict.0.3.Z (compressed),
       cmudict.0.4.Z, cmulex.0.3.Z, cmulex.0.4.Z, phoneset.0.1,
       phoneset.0.3, phoneset.0.4.

This directory contains pronunciation dictionaries (cmudict.0.1.Z is the
first one we put out; cmudict.0.4.Z is the latest and most up-to-date)
containing approximately 100k words and their transcriptions. Lists of
the words are in cmulex.0.[134].Z. We use these dictionaries at Carnegie
Mellon in our speech understanding systems.

The phone set for cmudict.0.4 contains 39 phones, a list of which can be
found in phoneset.0.4.

Lexical stress is indicated by means of a numeral [012] attached to a
vowel:

        0 = no stress
        1 = primary stress
        2 = secondary stress

Alternate transcriptions are identified with a numeral in parentheses as
part of the lexical entry.

We generated this dictionary using the following independent sources:

- a 20k+ general English dictionary, built by hand at Carnegie Mellon
  (extensively proofed and used).
- a 200k+ UCLA-proofed version of the Shoup dictionary.
- a 32k subset of the Dragon dictionary.
- a 53k+ dictionary of proper names, synthesiser-generated, unproofed.
- a 200k dictionary generated with Orator, unproofed.
- a 200k dictionary generated with MITalk, unproofed.

Entries that occur solely in copyrighted sources, such as the Dragon
dictionary, are not currently included in this dictionary. If you have
words and transcriptions that you would like included in this
unrestricted resource, please send them to Robert L. Weide
(weide@cs.cmu.edu) and we will consider them for an upcoming version.

All of the above sources were preprocessed and the transcriptions in the
current cmudict.0.1 were selected from the transcriptions in the sources
or a combination thereof.
We have removed some potentially unreliable transcriptions from this
dictionary, including those based on only one source, and will
reintroduce them once we have verified the transcriptions.

CMU does not guarantee the accuracy of this dictionary, nor its
suitability for any specific purpose. In fact, we expect a number of
errors, omissions and inconsistencies to remain in the current result.
We intend to continually update the dictionary as we make progress in
correcting them. We will make subsequent versions available via
anonymous ftp, and those who would like notification when updated
versions are available should send email to weide@cs.cmu.edu.

We welcome input from users: send e-mail to Robert L. Weide
(weide@cs.cmu.edu) if you have comments and suggestions on the content
of the dictionary.

The Carnegie Mellon Pronouncing Dictionary [cmudict.0.4 and all previous
versions] is Copyright 1993, 1994, and 1995 by Carnegie Mellon
University. Use of this dictionary for any research or commercial
purpose is completely unrestricted. If you make use of or redistribute
this material, we would appreciate acknowledgement of its origin.

If you add words to or correct words in this dictionary, we would like
the additions and corrections sent to us (weide@cs) for consideration in
a subsequent version. All final entries will be approved by Robert L.
Weide, editor of the dictionary.
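
Example: the stress and alternate-transcription conventions described
above can be read programmatically. The Python sketch below is only an
illustration and rests on assumptions this README does not state: that
each entry is a single line holding a headword, an optional "(n)" marker
for an alternate transcription, whitespace, and then space-separated
phones. The names parse_entry, ENTRY_RE, PHONE_RE and STRESS_NAMES are
hypothetical, and the sample lines are made up for the demonstration
rather than quoted from the dictionary files.

```python
import re

# ASSUMED entry layout (not specified in this README):
#   HEADWORD[(n)]<whitespace>PHONE PHONE ...
# where "(n)" marks an alternate transcription and a digit [012]
# attached to a vowel phone gives its lexical stress.
ENTRY_RE = re.compile(r"^(?P<word>\S+?)(?:\((?P<alt>\d+)\))?\s+(?P<phones>.+)$")
PHONE_RE = re.compile(r"^(?P<phone>[A-Z]+)(?P<stress>[012])?$")

# Stress codes as listed in this README.
STRESS_NAMES = {0: "no stress", 1: "primary stress", 2: "secondary stress"}


def parse_entry(line):
    """Split one entry into (word, alternate_number, [(phone, stress), ...]).

    alternate_number is None when the headword has no "(n)" marker.
    stress is None for phones that carry no stress digit.
    """
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised entry: {line!r}")

    alt = int(match.group("alt")) if match.group("alt") else None

    phones = []
    for token in match.group("phones").split():
        phone_match = PHONE_RE.match(token)
        if phone_match is None:
            raise ValueError(f"unrecognised phone {token!r} in {line!r}")
        stress = phone_match.group("stress")
        phones.append(
            (phone_match.group("phone"), int(stress) if stress is not None else None)
        )

    return match.group("word"), alt, phones


if __name__ == "__main__":
    # Illustrative lines only; not copied from cmudict.0.4.
    for sample in ("ABOUT  AH0 B AW1 T", "READ  R EH1 D", "READ(2)  R IY1 D"):
        word, alt, phones = parse_entry(sample)
        print(word, alt, phones)
```

Keeping the stress digit separate from the phone symbol makes it easy to
compare transcriptions with stress ignored, or to check each symbol
against the list in phoneset.0.4. If the actual files use a different
layout, only ENTRY_RE should need to change.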