Testing grammars
parseTool and test_parser, described in detail below, are tools designed specifically for testing grammars. Both ship with Recognizer and are stored in the %SWISRSDK%\bin directory.
Also useful are the following utilities:
- acc_test for recognition accuracy testing, which takes one or more prepared scripts as input.
- dicttest for checking dictionary pronunciations.
Using parseTool
Use the parseTool program to test a single grammar interactively. It lets you type sentences into a grammar to see how the grammar handles them. You can test grammar coverage, interpretation, ambiguity, and overgeneration.
To use parseTool, navigate to the %SWISRSDK%\bin directory, and enter a command with the following format at the prompt:
parseTool grammarfile.grxml [option1 arg1] [option2 arg2] [...]
Here, grammarfile is the path and name of the grammar file to be tested. The options most often used for regular GrXML grammars are described in the table that follows.
Note: Some parseTool options apply only to natural language grammars.
Option | Description |
---|---|
-debug_output | Prints information about ECMAScript operations. Used with -test_sentences and -test_file. Can be abbreviated to -d_o. |
-dump_parser filename | Prints parser information to the specified file. |
-gen_file filename | Generates random output sentences from the grammar to the specified file. Use with -max_gen to specify a number of sentences to be generated. |
-gen_sentences | Generates random output sentences from the grammar. Used to detect overgeneration. Use with -max_gen to specify a number of sentences to be generated. Can be abbreviated to -g_s. |
-iso8859 | Specifies the encoding format of the input and output files as ISO-8859. Used to override UTF-8 format when UTF-8 is the default. |
-max_gen | Specifies how many sentences to generate. Used with -gen_sentences and -gen_file. |
-media_type | Specifies the media type of the grammar. The value is either "application/x-vnd.speechworks.emma+xml" or "application/x-vnd.speechworks.recresult+xml". |
-no_pretty | Prints the parse result with no formatting (that is, as a continuous line of text). |
-no_script_check | Disables validity checking of the grammar. |
-s | Enables silence mode, which stops the printing of argument information at the beginning of output. |
-test_file filename | Specifies an input file with test sentences (one sentence per line). |
-test_sentences | Enables input of sentences to test the grammar. You can type sentences from the keyboard (the default) or specify an input file (using the -test_file option). The tool evaluates each sentence and shows whether it is covered by the grammar. Can be abbreviated to -t_s. |
-utf8 | Specifies the encoding format of the input and output files as UTF-8. Used to override ISO-8859 format when ISO-8859 is the default. |
-utt | Enables input of audio files to be parsed. Only audio/basic files may be input. Use this option with -test_sentences. You cannot use this option with -test_file. With this option, you can specify audio files in addition to typing sentences (see below). The syntax is <filename (the angle bracket is required, and no whitespace is allowed between the bracket and the filename). |
-verbose | Prints additional parse details. |
Note: You can put the parseTool options in any order on the command line.
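When parseTool is run repeatedly, for example from a regression script, it can help to assemble the command line in one place. The following is a minimal Python sketch, not part of the SDK: the helper name parsetool_args and the file names pizza.grxml and sentences.txt are invented for illustration. It assumes that -test_file is given together with -test_sentences (as the -test_sentences description suggests) and that -max_gen takes the sentence count as its argument; neither detail is spelled out in the table.

```python
from typing import List, Optional


def parsetool_args(
    grammar_file: str,
    test_file: Optional[str] = None,
    gen_count: Optional[int] = None,
    encoding: Optional[str] = None,
    verbose: bool = False,
) -> List[str]:
    """Build a parseTool argument list (hypothetical helper, not part of the SDK)."""
    args = ["parseTool", grammar_file]
    if test_file is not None:
        # Assumption: -test_file is combined with -test_sentences.
        args += ["-test_sentences", "-test_file", test_file]
    if gen_count is not None:
        # Assumption: -max_gen takes the number of sentences as its argument.
        args += ["-gen_sentences", "-max_gen", str(gen_count)]
    if encoding is not None:
        # Only the two encoding switches listed in the table are accepted.
        if encoding not in ("utf8", "iso8859"):
            raise ValueError("encoding must be 'utf8' or 'iso8859'")
        args.append("-" + encoding)
    if verbose:
        args.append("-verbose")
    return args


# A coverage run against a sentence file, then an overgeneration check.
print(" ".join(parsetool_args("pizza.grxml", test_file="sentences.txt", verbose=True)))
print(" ".join(parsetool_args("pizza.grxml", gen_count=20)))
```

Because the options can appear in any order, the order chosen by the helper is arbitrary. The returned list can be handed to subprocess.run once parseTool is reachable on the PATH.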
Using test_parser
The test_parser tool allows you to perform interpretation tests on grammars by comparing the correct key/value pairs that get passed to Recognizer with those actually generated.
Note that test_parser only tests keys/values that are set at the root and thus passed back to Recognizer; it cannot test attribute settings from subroot rules.
The tool operates on a test file, each of whose lines defines a test, or directive. Additionally, lines beginning with # are treated as comments. Like parseTool, test_parser accepts the -iso8859 argument when the input file is not UTF-8.
Each test line in the test file must be of the form:
xml_grammar_file sentence_text key_name correct_value_for_key
Item | Description |
---|---|
xml_grammar_file | Name of the grammar file. A hyphen (-) uses the previous grammar. |
sentence_text | Text of the sentence to be recognized (in quotes). Precede the text with a tilde (~) to indicate sentences not allowed by the grammar (sentences that should not parse, or that parse and cause SWI_disallow to be set to 1). |
key_name | Name of key to test. Precede the text with a tilde (~) to indicate keys to ignore. |
correct_value_for_key | The expected value for key_name. The test_parser program compares the value actually returned with this value. Place the value in double quotes if it has spaces. |
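To make the line format concrete, the following Python sketch reads test lines of the form described above and splits them into their four fields. It is an illustration only, not how test_parser itself is implemented. The grammar name, sentences, keys, and values are invented, and the tokenizer accepts a tilde either before or inside the opening quote because the exact placement is not stated here.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Directive(NamedTuple):
    grammar: str
    sentence: str
    expect_rejected: bool  # sentence_text was preceded by ~
    key: str
    ignore_key: bool       # key_name was preceded by ~
    expected_value: str


# A field is a double-quoted string (optionally preceded by ~) or a run of
# non-whitespace characters. Backslashes in Windows paths are kept as-is.
FIELD = re.compile(r'(~?)"([^"]*)"|(\S+)')


def split_fields(line: str) -> List[str]:
    fields = []
    for match in FIELD.finditer(line):
        if match.group(3) is not None:
            fields.append(match.group(3))
        else:
            fields.append(match.group(1) + match.group(2))
    return fields


def read_directives(lines: Iterable[str]) -> List[Directive]:
    directives = []
    previous_grammar: Optional[str] = None
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments carry no test
        fields = split_fields(stripped)
        if len(fields) != 4:
            raise ValueError("expected 4 fields: " + stripped)
        grammar, sentence, key, value = fields
        if grammar == "-":
            # A hyphen reuses the grammar named on an earlier line.
            if previous_grammar is None:
                raise ValueError("'-' used before any grammar was named")
            grammar = previous_grammar
        previous_grammar = grammar
        directives.append(Directive(
            grammar,
            sentence.lstrip("~"),
            sentence.startswith("~"),
            key.lstrip("~"),
            key.startswith("~"),
            value,
        ))
    return directives


sample = [
    "# hypothetical tests for a pizza-ordering grammar",
    'pizza.grxml "large pepperoni pizza" size large',
    '- "deliver to new york" city "new york"',
    '- ~"purple monkey dishwasher" size large',
]
for directive in read_directives(sample):
    print(directive)
```

A script like this can be used to sanity-check a test file (field count, a leading hyphen with no earlier grammar) before handing it to test_parser.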
You can run test_parser with the following command-line options:
- -verbose (-v) prints more details of program operation.
- -debug_output (-d_o) prints details of script execution (see Using parseTool to verify ECMAScript for an example).
Related topics
Reference