Checking pronunciations with dicttest
The dicttest program is a command line tool that lets you check pronunciations of individual words or lists of words inside a file. The program is installed with Recognizer.
Note: The following information is written with English examples, but applies generically to all languages. The English examples use English phonemes, but you must always use the phonemes for the language you specify (again, see the appropriate Language Supplement for a list of phonemes).
Program usage
Use a command prompt window to gain access to the program. For example, typing the program name with no arguments provides usage information:
c:/> dicttest
Usage: dicttest -language code
[-system_dict file]
[-user_dict file]
[-max_pron num]
[-encoding]
Argument | Meaning |
---|---|
-language code | Required. Specifies the ID of the language being tested. For example, to test French you would enter "-language fr-FR" or "-language fr-CA". |
-system_dict file | Optional (required if -language is omitted). Specifies a system dictionary to search for pronunciations. System dictionaries are binary files supplied with Recognizer; there is one system dictionary per language stored in the installation baseline. This parameter is useful when using a system dictionary that is not stored in the default baseline location. |
-user_dict file | Optional. Specifies a user dictionary to search for pronunciations. If a word is found in the user dictionary, the program uses those pronunciations and does not search the system dictionary. The precedence of the file can be changed (see Changing the default precedence) by appending the SWI.type variable; for example "user.dict?SWI.type=backup". |
-max_pron num | Optional. Specifies the maximum number of pronunciations to be shown for any given word. By default, dicttest shows all pronunciations in the available dictionaries. |
-encoding | Optional. Specifies the encoding of the dictionary file. Values are -utf8 (default) or -iso8859. |
See dicttest.
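If you launch dicttest from a script, the arguments above can be assembled programmatically. The following is a minimal Python sketch, not part of Recognizer: the helper name build_dicttest_args and its parameters are hypothetical, and only options documented in the table are used. Because this section does not describe how dicttest reads the words to check, the sketch stops at building the argument list and does not run the program.

```python
def build_dicttest_args(language, user_dict=None, user_dict_type=None, max_pron=None):
    """Assemble a dicttest argument list (hypothetical helper, illustration only)."""
    # -language is the required argument.
    args = ["dicttest", "-language", language]
    if user_dict is not None:
        # Appending ?SWI.type=<value> changes the precedence of the user dictionary.
        if user_dict_type is not None:
            user_dict = f"{user_dict}?SWI.type={user_dict_type}"
        args += ["-user_dict", user_dict]
    if max_pron is not None:
        # Limit the number of pronunciations shown for each word.
        args += ["-max_pron", str(max_pron)]
    return args


args = build_dicttest_args(
    "fr-CA", user_dict="user.dict", user_dict_type="backup", max_pron=3
)
print(args)
```

Keeping the arguments as a list rather than one string avoids quoting problems with the "?" in the user dictionary name when the list is passed to a process launcher.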
Program output
The dicttest program provides the following output:
- Count of the number of pronunciations found.
- The phonemic spelling of the word (the pronunciation).
- Indication of the location where the pronunciations were found.
- A max_res_len value (used internally by Nuance).
Automatically generated pronunciations
As mentioned earlier (The system dictionary), if you enter a word that is not found in a dictionary, the system automatically generates a pronunciation and indicates that a set of internal rules was used as the source of the pronunciation. Some languages may write an additional warning message to Recognizer’s diagnostic log (by default, this log file is $NUANCE_DATA_DIR/system/diagnosticLogs/nrs.log).
Investigate generated pronunciations carefully to ensure that they are correct. For example, if the person’s name "Pnina" returned an incorrect pronunciation, the application developer would need to correct it (in this example, the generated pronunciation is correct):
pni:n@
source: pnina?name=automatic;level=automatic;massaged=1;
Explanation of the output:
Output | Description |
---|---|
name=automatic | The "automatic" dictionary was used to generate the pronunciation. |
level=automatic | The precedence of the automatic dictionary is also "automatic". See Dictionary precedence. (When the automatic dictionary is used, the precedence is always "automatic" too.) |
massaged=1 | Boolean that indicates whether the lookup algorithm employed "mangling" rules. For example, the algorithm might try variations of the exact word after removing punctuation. |
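When you check a long word list, it can help to flag automatically generated pronunciations for review. The following Python sketch splits a "source:" line like the one in the example above into the looked-up word and its fields. The function names parse_source and is_generated are hypothetical, and the sketch assumes that every source line follows the word?key=value;key=value; layout of that single example.

```python
def parse_source(line):
    """Split a dicttest 'source:' line into the word and a dict of its fields."""
    text = line.strip()
    if text.startswith("source:"):
        text = text[len("source:"):].strip()
    # The word comes before the '?', the key=value fields after it.
    word, _, field_text = text.partition("?")
    fields = {}
    for item in field_text.split(";"):
        if not item:
            continue  # skip the empty piece left by the trailing ';'
        key, _, value = item.partition("=")
        fields[key.strip()] = value.strip()
    return word, fields


def is_generated(fields):
    """True when the pronunciation came from the 'automatic' dictionary."""
    return fields.get("name") == "automatic"


word, fields = parse_source("source: pnina?name=automatic;level=automatic;massaged=1;")
print(word, fields, is_generated(fields))
```

Words for which is_generated returns True are the ones to investigate by hand.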