parseTool

The parseTool utility is a grammar testing tool. It can test whether a grammar covers input sentences, trace ECMAScript operations performed within the grammar, or check the grammar for overgeneration and nonsense phrases.

As input, parseTool takes the name and location of the grammar file to be tested, plus any other arguments required by the options used on the command line.

When used to test grammar coverage, the utility accepts test sentences from a test file or directly from the command line, depending on the options used. A sentence can also be supplied as an audio file if the -utt option is used (audio/basic files only). However, the utility is not intended to test recognition accuracy.

The utility is located in: %SWISRSDK%\amd64\bin

Usage

parseTool grammar.grxml 
   [-batch]
   [-compute_lm]
   [-debug_output]
   [-dump_parser filename]
   [-gen_file filename]
   [-gen_sentences]
   [-iso8859]
   [-max_gen number]
   [-media_type type]
   [-no_pretty]
   [-no_script_check]
   [-perplexity filename]
   [-s]
   [-test_file filename]
   [-test_sentences]
   [-utf8]
   [-utt]
   [-verbose]

Options

grammar.grxml

The name and location of the XML grammar to be tested.

-batch

Uses a test_parser-type script to test whether a grammar correctly interprets a list of specified utterances.

-compute_lm

Calculates the language model score for each sentence.

-debug_output

Prints information about ECMAScript operations. Used with -test_sentences and -test_file. Can be abbreviated to -d_o.

-dump_parser filename

Prints parser information to the specified file.

-gen_file filename

Generates random sentences from the grammar and writes them to the specified file. Use with -max_gen to specify the number of sentences to generate.

-gen_sentences

Generates random sentences from the grammar. Used to detect overgeneration. Use with -max_gen to specify the number of sentences to generate. Can be abbreviated to -g_s.
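One way to review generated sentences for overgeneration is to compare them against a list of phrasings the grammar is meant to accept. The Python sketch below is only an illustration: the helper name find_unexpected, the sample sentences, and the idea of holding the generated sentences in a list are all assumptions, since the exact output layout of parseTool is not described here.

```python
def find_unexpected(generated, expected):
    """Return generated sentences missing from the expected list, in first-seen order.

    Hypothetical helper: comparison ignores case and repeated whitespace.
    """
    allowed = {" ".join(s.lower().split()) for s in expected}
    seen = set()
    unexpected = []
    for sentence in generated:
        key = " ".join(sentence.lower().split())
        if key and key not in allowed and key not in seen:
            seen.add(key)
            unexpected.append(key)
    return unexpected


# Illustrative data only; real sentences would come from a parseTool run.
generated = ["call john smith", "call call john", "Call  Mary Jones", "call call john"]
expected = ["call john smith", "call mary jones"]
print(find_unexpected(generated, expected))
```

Sentences reported by such a check are candidates for manual review; they indicate possible overgeneration, not a confirmed grammar defect.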

-iso8859

Specifies the encoding format of the input and output files as ISO-8859. Used to override UTF-8 format when UTF-8 is the default.

-max_gen number

Specifies how many sentences to generate. Used with -gen_sentences and -gen_file.

-media_type type

Specifies the media type of the grammar. The value must be either "application/x-vnd.speechworks.emma+xml" or "application/x-vnd.speechworks.recresult+xml".
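Because only two values are accepted, a wrapper script can reject anything else before the tool is invoked. The following Python sketch assumes a hypothetical helper named media_type_args; only the option name and the two media type strings come from the description above.

```python
# The two values listed for -media_type.
ALLOWED_MEDIA_TYPES = (
    "application/x-vnd.speechworks.emma+xml",
    "application/x-vnd.speechworks.recresult+xml",
)


def media_type_args(media_type):
    """Return the -media_type arguments, or raise if the value is not allowed."""
    if media_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"unsupported media type: {media_type!r}")
    return ["-media_type", media_type]


args = media_type_args("application/x-vnd.speechworks.emma+xml")
print(" ".join(args))
```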

-no_pretty

Prints the parse result with no formatting (that is, as a continuous line of text).

-no_script_check

Disables validity checking of the grammar.

-perplexity filename

Specifies an input test file for computing perplexity.

-s

Enables silence mode, which stops the printing of argument information at the beginning of output.

-test_file filename

Specifies an input file with test sentences (one sentence per line).
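As a minimal sketch of preparing such a file, the Python snippet below writes one sentence per line and assembles a matching argument list. The file names, the sample sentences, and the helper name write_test_file are hypothetical. The UTF-8 encoding is also an assumption and should match the encoding the tool is configured to read. The sketch passes -test_file on its own; whether other options must accompany it depends on your setup.

```python
from pathlib import Path


def write_test_file(path, sentences, encoding="utf-8"):
    """Write one sentence per line, skipping blank entries (hypothetical helper)."""
    lines = [s.strip() for s in sentences if s.strip()]
    Path(path).write_text("\n".join(lines) + "\n", encoding=encoding)
    return lines


# Sample sentences and file names are illustrative assumptions.
sentences = ["call john smith", "  ", "call mary jones"]
written = write_test_file("sentences.txt", sentences)

# Argument list for the run; hand it to subprocess.run() where parseTool is installed.
command = ["parseTool", "directcalls.grxml", "-test_file", "sentences.txt"]
print(" ".join(command))
```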

-test_sentences

Enables input of sentences to test the grammar. You can type sentences from the keyboard (the default) or specify an input file (using the -test_file option). The tool evaluates each sentence and shows whether it is covered by the grammar. Can be abbreviated to -t_s.

-utf8

Specifies the encoding format of the input and output files as UTF-8. Used to override ISO-8859 format when ISO-8859 is the default.

-utt

Enables input of audio files to be parsed. Only audio/basic files may be input. Use this option with -test_sentences; it cannot be used with -test_file. With this option, you can specify audio files at the prompt in addition to typing sentences. The syntax is <filename: the leading angle bracket is required, and no whitespace is allowed between the bracket and the filename.
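The prompt syntax for audio files is easy to get wrong when entries are produced by a script. The Python sketch below formats an entry as described above, with the angle bracket immediately followed by the filename. The helper name utt_prompt_entry and the audio file name are hypothetical.

```python
def utt_prompt_entry(audio_path):
    """Format an audio file reference for the -utt prompt: '<' directly followed by the name."""
    name = audio_path.strip()
    if not name:
        raise ValueError("audio file name is empty")
    return "<" + name


# Hypothetical file name; the file itself must be in audio/basic format.
entry = utt_prompt_entry("  utterance01.ulaw ")
print(entry)
```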

-verbose

Prints additional parse details.

Example

> parseTool directcalls.grxml -gen_sentences -max_gen 10

This example generates ten random sentences that are covered by the directcalls.grxml grammar.
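The same invocation can be driven from a script. The Python sketch below builds the argument list for the example and runs it only if an executable named parseTool is found on the PATH. The helper name gen_sentences_command is hypothetical, and the way the tool's output is captured and printed is an assumption about how you might want to use it.

```python
import shutil
import subprocess


def gen_sentences_command(grammar, count):
    """Build the argument list for generating `count` random sentences."""
    if count < 1:
        raise ValueError("count must be a positive integer")
    return ["parseTool", grammar, "-gen_sentences", "-max_gen", str(count)]


command = gen_sentences_command("directcalls.grxml", 10)
print(" ".join(command))

# Run only where the tool is actually on PATH; elsewhere nothing is executed.
if shutil.which(command[0]):
    result = subprocess.run(command, capture_output=True, text=True)
    print(result.stdout)
```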