sgc
The sgc utility precompiles an XML grammar (a *.grxml text file) into a binary grammar (*.gram), or trains a Statistical Language Model (SLM) natural language grammar from a prepared training file. You can also generate a binary output SLM by using the -slm option.
The utility is located in: %SWISRSDK%\amd64\bin
Usage
sgc grammar1.grxml grammar2.grxml ...
[-baseline path]
[-batch filename]
[-language lang]
[-langver version]
[-lexicon_uri uri_for_arpa_dictionary]
[-load_arpa filename]
[-no_gram]
[-no_logo]
[-no_script_verify]
[-optimize n]
[-out filename]
[-slm filename]
[-test filename]
[-train filename]
Options
grammar1.grxml grammar2.grxml ...
The locations and names of one or more grammars to be compiled.
-baseline path
Specifies the starting location for all relative file path references.
-batch filename
Specifies a batch file containing instructions to compile several files.
-language lang
Specifies the language to use in the compilation.
-langver version
Specifies the language pack version for compilation. Use this parameter to compile a grammar with the same language version used by the application. (If the application loads a binary grammar compiled for a different language version, Recognizer returns an error.)
- If your system has only one version of a language installed, this parameter is not required for that language.
- If your system has more than one version of a language, sgc uses the newest by default. The parameter is required if the application uses a previous version.
This example compiles for a version of US English:
-langver "en-us 9.0.0"
When the grammar covers more than one language, you can specify the version of each:
-langver "en-us 9.0.0,fr-ca 10.0.0"
-lexicon_uri uri_for_arpa_dictionary
Specifies a pronunciation dictionary to use when compiling an ARPA n-gram. Ignored unless -load_arpa is also used.
-load_arpa filename
Specifies a file that contains SLM training data written in ARPA format. The resulting binary grammar is a simple loop over all vocabulary words in the training file. When this option is used, an input grammar is not allowed. Cannot be used with -train.
Note: Recognizer uses the Katz backoff formula: if an n-gram does not exist in the language model, the (n-1)-gram likelihood is used instead, weighted by the backoff weight.
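The backoff rule in the note above can be sketched in a few lines of Python. This is an illustration of the general Katz backoff lookup, not Recognizer's implementation: the table names LOGPROB and BACKOFF, the function backoff_logprob, and all numbers are hypothetical. It assumes the usual ARPA convention of log10 probabilities, with the backoff weight attached to the history (context) n-gram.

```python
# Toy model, invented for illustration: log10 probabilities and log10
# backoff weights keyed by n-gram tuples, as they would appear in an ARPA file.
LOGPROB = {
    ("call",): -1.0,
    ("home",): -1.5,
    ("call", "home"): -0.3,
}
BACKOFF = {
    ("call",): -0.2,
    ("home",): -0.4,
}


def backoff_logprob(ngram, logprob=LOGPROB, backoff=BACKOFF):
    """Return log10 P(last word | preceding words) using backoff lookup."""
    ngram = tuple(ngram)
    if ngram in logprob:
        # The n-gram is listed in the model: use its probability directly.
        return logprob[ngram]
    if len(ngram) == 1:
        raise KeyError(f"unknown word: {ngram[0]}")
    # Otherwise drop the oldest history word and add the backoff weight
    # of the full history (0.0 in log space when no weight is listed).
    weight = backoff.get(ngram[:-1], 0.0)
    return weight + backoff_logprob(ngram[1:], logprob, backoff)


print(backoff_logprob(("call", "home")))  # bigram is listed
print(backoff_logprob(("home", "call")))  # backs off to the unigram "call"
```

In the second call the bigram is missing, so the result is the backoff weight of "home" plus the unigram log probability of "call".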
-no_gram
This option is available only with -train. It suppresses output of the binary grammar file. Use it when configuration parameters inside the training file write FSM and wordlist output.
-no_logo
Suppresses the version information. Useful when running sgc from a script.
-no_script_verify
Does not check ECMAScript when compiling the grammar.
-optimize n
Sets the optimization level for the compilation. Values are 0–12 (but not 10). Generally, lower values compile faster but yield slower recognition, and higher values compile more slowly but yield faster recognition.
-out filename
Specifies the filename for the compiled output.
-slm filename
Generates a binary output SLM. The input file can be an SLM training file or an slmxml file. For example:
sgc -slm mytraining.xml
generates the SLM mytraining.slm. See Interpolated SLMs.
-test filename
This option specifies an input file of test sentences. The compiler reports perplexity measurements for each sentence provided.
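For readers unfamiliar with perplexity, the following Python sketch shows the standard definition for a single sentence: 10 raised to the negative average log10 probability per word. The function name perplexity is hypothetical. How sgc counts words (for example, whether sentence boundary tokens are included) is not stated here, so its reported values may differ from this sketch.

```python
def perplexity(word_logprobs):
    """Perplexity of one sentence from its per-word log10 probabilities."""
    if not word_logprobs:
        raise ValueError("need at least one word probability")
    # Average log10 probability per word, negated and exponentiated.
    average = sum(word_logprobs) / len(word_logprobs)
    return 10 ** (-average)


# Two words, each with probability 0.1 (log10 = -1.0).
print(perplexity([-1.0, -1.0]))
```

A lower perplexity means the model finds the test sentence more predictable.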
-train filename
Specifies a file that contains SLM training data. This option requires a training file, not an SRGS speech grammar. The resulting output is a binary grammar that is a simple loop over all vocabulary words in the training file. Cannot be used with the -load_arpa option.
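When sgc is driven from a build script, the constraints stated in the options above can be checked before the compiler is invoked. The sketch below is a hypothetical helper, not part of the SDK: the function build_sgc_args and its parameter names are assumptions, and it covers only a few of the documented options. It enforces three documented rules: -train and -load_arpa are mutually exclusive, -load_arpa does not allow an input grammar, and -optimize accepts 0 to 12 except 10.

```python
def build_sgc_args(grammars=(), optimize=None, train=None,
                   load_arpa=None, out=None, no_script_verify=False):
    """Build an sgc argument list, checking the documented constraints."""
    grammars = list(grammars)
    if train and load_arpa:
        raise ValueError("-train cannot be used with -load_arpa")
    if load_arpa and grammars:
        raise ValueError("-load_arpa does not allow an input grammar")
    if optimize is not None and (optimize == 10 or not 0 <= optimize <= 12):
        raise ValueError("-optimize must be 0-12, excluding 10")

    # Assemble the command line in the order shown in the Usage section.
    args = ["sgc", *grammars]
    if train:
        args += ["-train", train]
    if load_arpa:
        args += ["-load_arpa", load_arpa]
    if optimize is not None:
        args += ["-optimize", str(optimize)]
    if no_script_verify:
        args.append("-no_script_verify")
    if out:
        args += ["-out", out]
    return args


print(build_sgc_args(["mygrammar.grxml"], optimize=9, no_script_verify=True))
```

The returned list can be passed to subprocess.run on a machine where sgc is on the PATH.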
Example
> sgc mygrammar.grxml -optimize 9 -no_script_verify