Checking pronunciations with dicttest

The dicttest program is a command-line tool that lets you check the pronunciations of individual words or of lists of words stored in a file. The program is installed with Recognizer.

Note: The following information is written with English examples, but applies generically to all languages. The English examples use English phonemes, but you must always use the phonemes for the language you specify (again, see the appropriate Language Supplement for a list of phonemes).

Program usage

Use a command prompt window to gain access to the program. For example, typing the program name with no arguments provides usage information:

c:/> dicttest

Usage: dicttest -language code
  [-system_dict file]
  [-user_dict file]
  [-max_pron num]
  [-encoding]

Argument	Meaning
-language code	Required. Specifies the ID of the language being tested. For example, to test French you would enter "-language fr-FR" or "-language fr-CA".
-system_dict file	Optional (required if -language is omitted). Specifies a system dictionary to search for pronunciations. System dictionaries are binary files supplied with Recognizer; there is one system dictionary per language stored in the installation baseline. This parameter is useful when using a system dictionary that is not stored in the default baseline location.
-user_dict file	Optional. Specifies a user dictionary to search for pronunciations. If a word is found in the user dictionary, the program uses those pronunciations and does not search the system dictionary. The precedence of the file can be changed (see Changing the default precedence) by appending the SWI.type variable; for example "user.dict?SWI.type=backup".
-max_pron num	Optional. Specifies the maximum number of pronunciations to be shown for any given word. By default, dicttest shows all pronunciations in the available dictionaries.
-encoding	Optional. Specifies the encoding of the dictionary file. Values are -utf8 (default) or -iso8859.
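
The arguments in the table can be assembled programmatically when dicttest is driven from a script. The following is a minimal sketch, not part of Recognizer: the helper name build_dicttest_args and its parameter names are hypothetical, and it uses only the flags listed above. It leaves out -encoding and does not show how words are passed to the program, because the usage message does not spell out either.

```python
def build_dicttest_args(language, system_dict=None, user_dict=None,
                        user_dict_type=None, max_pron=None):
    """Return an argument list for dicttest (hypothetical helper).

    Only the flags shown in the usage message are emitted.
    """
    args = ["dicttest", "-language", language]
    if system_dict is not None:
        args += ["-system_dict", system_dict]
    if user_dict is not None:
        # Appending the SWI.type variable changes the precedence of the
        # user dictionary, for example "user.dict?SWI.type=backup".
        if user_dict_type is not None:
            user_dict = f"{user_dict}?SWI.type={user_dict_type}"
        args += ["-user_dict", user_dict]
    if max_pron is not None:
        args += ["-max_pron", str(max_pron)]
    return args


if __name__ == "__main__":
    print(build_dicttest_args("fr-FR", user_dict="user.dict",
                              user_dict_type="backup", max_pron=2))
```

The resulting list can be handed to a process launcher such as Python's subprocess module, provided dicttest is on the search path.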

See dicttest.

Program output

The dicttest program provides the following output:

Count of the number of pronunciations found.
The phonemic spelling of the word (the pronunciation).
Indication of the location where the pronunciations were found.
A max_res_len value (used internally by Nuance).

Example

Here is sample output for the word "hello":

Pron 0: h@l@U

Source: hello?name=system;level=backup;massaged=0;

Pron 1: hel@U

Source: hello?name=system;level=backup;massaged=0;

Pron 2: hVl@U

Source: hello?name=system;level=backup;massaged=0;

Explanation of the output:

Output	Description
Pron 0/1/2	Three pronunciations were found.
name=system	All pronunciations were found in the system dictionary.
level=backup	Indicates the precedence of the named dictionary. See Dictionary precedence. (The system dictionary is a backup dictionary.)
massaged=0	Boolean that indicates whether the lookup algorithm employed "mangling" rules; for example, whether the algorithm tried variations of the exact word after removing punctuation. A value of 0 means no such rules were used.
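
The fields of a Source line can be separated mechanically when the output is processed by a script. The sketch below assumes that every source line follows the layout shown in the sample (word, a question mark, then semicolon-terminated key=value pairs); the function name parse_source is hypothetical and is not part of dicttest.

```python
def parse_source(line):
    """Split a 'Source:' line into the looked-up word and its fields."""
    label, _, value = line.partition(":")
    if label.strip().lower() != "source":
        raise ValueError(f"not a source line: {line!r}")
    # The word precedes the '?'; the fields follow it.
    word, _, query = value.strip().partition("?")
    fields = {}
    for item in query.split(";"):
        if item:  # skip the empty piece after the trailing ';'
            key, _, val = item.partition("=")
            fields[key] = val
    return word, fields


if __name__ == "__main__":
    word, fields = parse_source(
        "Source: hello?name=system;level=backup;massaged=0;")
    print(word, fields)
```

All values are kept as strings, so the massaged field is compared against "0" or "1" rather than against a Boolean.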

Automatically generated pronunciations

As mentioned earlier (The system dictionary), if you enter a word that is not found in a dictionary, the system automatically generates a pronunciation and indicates that a set of internal rules was used as the source of the pronunciation. Some languages may write an additional warning message to Recognizer’s diagnostic log (by default, this log file is $NUANCE_DATA_DIR/system/diagnosticLogs/nrs.log).

Investigate generated pronunciations carefully to ensure that they are correct. For example, if the person’s name "Pnina" returned an incorrect pronunciation, the application developer would need to correct it (in this case, the generated pronunciation is correct):

pni:n@

source: pnina?name=automatic;level=automatic;massaged=1;

Explanation of the output:

Output	Description
name=automatic	The "automatic" dictionary was used to generate the pronunciation.
level=automatic	The precedence of the automatic dictionary is also "automatic". See Dictionary precedence. (When the automatic dictionary is used, the precedence is always "automatic" too.)
massaged=1	Boolean that indicates whether the lookup algorithm employed "mangling" rules; for example, whether the algorithm tried variations of the exact word after removing punctuation. A value of 1 means such rules were used.
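
When many words are checked at once, the automatically generated pronunciations are the ones to review first. The following sketch scans captured dicttest output for source lines marked name=automatic. It assumes the output has been saved as text in the layout shown in the examples above; the function name words_needing_review is hypothetical.

```python
import re

# Matches "Source:" or "source:" lines of the form word?key=value;...
SOURCE_RE = re.compile(r"^\s*source:\s*(?P<word>[^?]+)\?(?P<fields>.*)$",
                       re.IGNORECASE)


def words_needing_review(output_text):
    """Return words whose pronunciations came from the automatic dictionary."""
    words = []
    for line in output_text.splitlines():
        match = SOURCE_RE.match(line)
        if not match:
            continue
        fields = dict(item.split("=", 1)
                      for item in match.group("fields").split(";")
                      if "=" in item)
        word = match.group("word").strip()
        # Each word is listed once, in order of first appearance.
        if fields.get("name") == "automatic" and word not in words:
            words.append(word)
    return words


if __name__ == "__main__":
    sample = (
        "Pron 0: h@l@U\n"
        "Source: hello?name=system;level=backup;massaged=0;\n"
        "pni:n@\n"
        "source: pnina?name=automatic;level=automatic;massaged=1;\n"
    )
    print(words_needing_review(sample))
```

Words reported by this check still need a manual comparison against the phoneme list for the language in use.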
