SSM training file
Below is an example training file. A description of the header and the main body of the XML-formatted file follows.
<!DOCTYPE SSMTraining SYSTEM "SSMTraining.dtd">
<SSMTraining version="1.0.0" xml:lang="en-us">
  <features>
    <word>broken</word>
    <word>computer</word>
    <word>is</word>
    <!-- more words -->
    <word>what</word>
    <word>are</word>
    <word>promotions</word>
  </features>
  <semantic_models>
    <SSM>
      <meaning prior="1.0">
        <slot name="route">sales</slot>
      </meaning>
      <meaning prior="0.8">
        <slot name="route">tech_support</slot>
      </meaning>
    </SSM>
  </semantic_models>
  <training>
    <sentence count="1">
      <semantics>
        <slot name="route">tech_support</slot>
      </semantics>
      my computer is broken
    </sentence>
    <sentence count="1">
      <semantics>
        <slot name="route">sales</slot>
      </semantics>
      what are the promotions
    </sentence>
  </training>
</SSMTraining>
The initial header lines for an SSM training file are similar to those required in an SLM training file (see SLM training file header):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE SSMTraining SYSTEM "SSMTraining.dtd">
<SSMTraining version="1.0.0" xml:lang="en-us">
The most important differences are that the document type is "SSMTraining", that the related document type definition is SSMTraining.dtd (also located in the %SWISRSDK%\config directory), and that the main declaration uses an <SSMTraining> element rather than an <SLMTraining> element.
<SSMTraining> element
Unlike the <SLMTraining> element, the <SSMTraining> element does not support the <meta> element for specifying configuration parameters, nor the <lexicon> element for specifying a user dictionary. To specify parameters, you must instead use the <param> and <value> elements in the file main body: see SSM training configuration parameters for details.
However, <SSMTraining> does support the <training> and <test> elements. See Training and test sections.
The <SSMTraining> element also supports two SSM-specific elements used in the training file main body: the <semantic_models> and <features> elements.
SSM training file main body
The main body of the training file uses several elements that are specific to SSM training. These are organized into two main sections of the file: the <features> section, and the <semantic_models> section.
<features> section (vocabulary and classes)
The <features> section in an SSM training file serves a similar role to the <vocab> section in an SLM training file: it defines the vocabulary words and classes using the <word> and <ruleref> elements. This vocabulary section defines all words allowed in the other sections of the training file. (Words omitted from this section are ignored if they appear in sentences.)
You can also use ECMAScript in the <features> section to modify or augment <ruleref> meanings. See Feature extraction and ECMAScript.
<word> element
Within the vocabulary section of the training file, each <word> declares a single vocabulary word or a phrase joined by underscores.
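For example, a brief sketch of vocabulary entries (the specific entries are illustrative, not taken from the example above):
<features>
  <!-- single vocabulary words -->
  <word>call</word>
  <word>home</word>
  <!-- a multi-word phrase joined by underscores, declared as one vocabulary item -->
  <word>thank_you</word>
</features>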
<ruleref> grammar classes
The optional <ruleref> element defines a class of words in sentences that are best handled by an external grammar. The ruleref imports a rule from the external grammar into the training file:
- The <ruleref> element appears among the features and, optionally, the sentences of training files.
- When used in the <features> section, the <ruleref> acts as a declaration for the grammar, which is used later inside of sentences. Without this declaration, the <ruleref> cannot appear in training or test sentences.
The words are not imported individually from the grammar, and they cannot be used individually in sentences. Instead, they are imported as a class, and they can only be specified in sentences as a <ruleref> class. See the feature_extraction attribute (manual or automatic).
- When used inside a <sentence>, the <ruleref> acts as a placeholder for the words covered by the named grammar. Recognizer treats those words as a class, and for the purposes of statistical modeling, the class is treated as a single word.
Typically, the words are left in the input file and feature extraction (matching rulerefs) happens automatically. In this case, depending on the setting of the feature_generation attribute, the words are either augmented with the class feature, replaced by it, or removed altogether.
Classes help generalize from the training sentences. For example, if an application accepts restaurant reservations, it would be useful to have a restaurant <ruleref> instead of writing individual restaurant names in the training sentences. Doing this has numerous advantages: it automates the creation of additional sentences with all names in the restaurant class; it ensures that no restaurant name is accidentally omitted from the sentences; and when the list of available restaurant names changes in the restaurant grammar, no change is needed for the training sentences.
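For example, a minimal sketch of this approach; the grammar URI restaurants.grxml, its rule name, and the slot name and value are hypothetical:
<features>
  <word>table</word>
  <word>for</word>
  <word>two</word>
  <word>at</word>
  <!-- declare the class once; the grammar supplies the restaurant names -->
  <ruleref uri="restaurants.grxml#restaurant"/>
</features>
...
<training>
  <sentence count="1">
    <semantics>
      <slot name="action">reserve</slot>
    </semantics>
    <!-- with automatic feature extraction, the restaurant name below is matched by the declared ruleref -->
    table for two at chez nous
  </sentence>
</training>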
The <ruleref> element allows the following attributes:
- origuri: Optional. This attribute declares the URI of the original grammar:
<features>
<word>sample</word>
...
<ruleref origuri="small.grxml#genre"/>
...
<word>vocabulary</word>
</features>
If the uri attribute declares a compiled grammar, the origuri attribute is required.
- uri: Required. This attribute declares the URI of the grammar:
<features>
<word>sample</word>
...
<ruleref uri="small.gram#genre"/>
...
<word>vocabulary</word>
</features>
Normally, the rulename (the text after the #) is optional because a uri refers to the root rule of the grammar by default. However, in this context the rulename is required.
- tag: Optional. Only allowed inside the <features> element. This attribute specifies an ECMAScript expression that will be executed when Recognizer parses the words represented by a <ruleref> subgrammar. A typical use for this attribute is to assign additional slot/value pairs.
- words: Optional. This attribute provides a way to remember the original phrase or sentence that has been converted into a ruleref. This is useful to the creator of the training file. It is not currently used by the compiler.
For example, if the application recognizes dates, you would typically convert actual dates that are spoken and transcribed during data collection into a generalized date ruleref. This attribute provides a way to record the original actual dates. An example:
<sentence>
  <semantics> <slot name="route">MONEY</slot> </semantics>
  <ruleref uri="./number.xml" words="two" />
  <ruleref uri="./currencies.xml" words="pesos" />
</sentence>
The example implies that the phrase "two pesos" was seen during data collection, and has been generalized to be a number followed by a currency. This attribute is used strictly for annotation; it has no effect on the SSM.
feature_generation attribute
The feature_generation attribute controls the handling of <ruleref> during training. When the recognized text contains a string that matches a <ruleref>, this parameter determines the contents of the feature set:
- fragment: Retains the string and the matching <ruleref>. This is the default. For example, “departing january twenty seventh” might become “departing january twenty seventh <ruleref uri="date.grxml"/>”.
- remove: Removes the string and the matching <ruleref>. For example, “departing january twenty seventh” might become “departing”.
- stem: Replaces the string with the <ruleref>. For example, “departing january twenty seventh” might become “departing <ruleref uri="date.grxml"/>”.
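As a brief sketch, you set the attribute on the <ruleref> declaration in the <features> section; the grammar URI and rule name below (date.grxml#date) are hypothetical:
<features>
  <word>departing</word>
  ...
  <!-- hypothetical date grammar; with "stem", matched date phrases are replaced by the class feature -->
  <ruleref uri="date.grxml#date" feature_generation="stem"/>
</features>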
See Feature extraction and ECMAScript.
feature_keys and feature_values attributes
If you are adding a call routing context to an existing application, you likely already have several grammars representing semantics relevant to the SSM. The feature_values and (especially) feature_keys attributes are a significant time saver: they enable the re-use of existing grammars in the SSM's feature extractor and leverage the semantic interpretations those grammars already provide. They can also greatly increase the coverage of the SSM's feature extractor by building on the existing coverage of those grammars.
The feature_keys attribute works in conjunction with a <ruleref> that points to a grammar that sets key/value pairs (slots) in ECMAScript (found in a <tag> statement in that grammar). The feature_keys attribute is a space-delimited list of keys that are returned by the grammar. Upon encountering a fragment in a sentence that fires the rule named in the <ruleref>, the resulting key/value pairs are analyzed and matched against those listed in the feature_keys attribute. If a match is found, the name of the matching key is appended to the <ruleref>'s name to form the full feature name.
The feature_values attribute allows for dynamic (training-time) generation of feature names based on the input sentence. Normally, features are listed as is in the <features> section and used as is when encountered during training. With the feature_values attribute, the exact name of the feature is determined at training time rather than being explicitly listed in the <features> section of the XML training file.
The feature_values attribute also works in conjunction with a <ruleref> pointing to a grammar that sets key/value pairs (slots) in ECMAScript (found in a <tag> statement in that grammar). The feature_values attribute is a space-delimited list of keys returned by the grammar. Upon encountering a fragment in a sentence that fires the rule named in the <ruleref>, the resulting key/value pairs are analyzed and matched against those listed in the feature_values attribute. If a match is found, the value of the matching key is appended to the <ruleref>'s name to form the full feature name.
The following example shows a typical use of feature_keys and feature_values, based on a <ruleref> from a date grammar (my_gram.grxml).
Suppose the grammar includes the following rule:
<rule id="month" scope="public">
  <one-of>
    <item> january
      <tag> MONTH = "01"; WINTER="1"; SPECIAL_DAY="New Year" </tag>
    </item>
    <item> february
      <tag> MONTH = "02"; WINTER="1"; SPECIAL_DAY="Valentine" </tag>
    </item>
    ...
    <item> july <tag> MONTH = "07"; SUMMER="1" </tag> </item>
    ...
    <item> december
      <tag> MONTH = "12"; FALL="1"; SPECIAL_DAY="Christmas" </tag>
    </item>
  </one-of>
</rule>
You then define two <ruleref>s that refer to this rule—one for the feature_keys, and one for the feature_values:
- Example 1 (feature_keys):
<ruleref uri="my_gram.grxml#month" feature_generation="stem" feature_keys="WINTER SPRING SUMMER FALL"/>
- Example 2 (feature_values):
<ruleref uri="my_gram.grxml#month" feature_generation="fragment" feature_values="MONTH SPECIAL_DAY"/>
Based on this grammar and these <ruleref> definitions, the following features will be used in training the SSM (assuming that all words are defined in the <features> section):
- Input: for february ninth
  Example 1 (feature keys): for my_gram.grxml#month#WINTER ninth
  Example 2 (feature values): for february my_gram.grxml#month#02 my_gram.grxml#month#Valentine ninth
- Input: july second
  Example 1 (feature keys): my_gram.grxml#month#SUMMER second
  Example 2 (feature values): july my_gram.grxml#month#07 second
- Input: on december third
  Example 1 (feature keys): on my_gram.grxml#month#FALL third
  Example 2 (feature values): on my_gram.grxml#month#12 my_gram.grxml#month#Christmas third
If you specify both feature_keys and feature_values in the same <ruleref>, the resulting feature contains both. For example, suppose you combine both <ruleref>s like this:
<ruleref uri="my_gram.grxml#month" feature_generation="stem" feature_keys="WINTER SPRING SUMMER FALL MONTH SPECIAL_DAY" feature_values="MONTH SPECIAL_DAY"/>
The following summarizes the features that can appear in this case:
- Input: for february ninth
  Result: for my_gram.grxml#month#WINTER my_gram.grxml#month#MONTH#02 my_gram.grxml#month#SPECIAL_DAY#Valentine ninth
- Input: july second
  Result: my_gram.grxml#month#SUMMER my_gram.grxml#month#MONTH#07 second
- Input: on december third
  Result: on my_gram.grxml#month#FALL my_gram.grxml#month#MONTH#12 my_gram.grxml#month#SPECIAL_DAY#Christmas third
Note: Like all other features, a feature must be exercised by <sentence> elements in the <training> section to be used to train the SSM, even if it is listed in the <features> section. This is especially true for feature_values. In the above example, if "may" is never seen during training, the feature "my_gram.grxml#month#05" will never participate in any decision at testing time. In the case of feature_keys, if "april" was seen in training, the feature "my_gram.grxml#month#SPRING" will be active in the SSM, which means that at testing time "may" will indeed participate in the SSM's decision.
<grammar_script> root ECMAScript
Use <grammar_script> to define a final script (a root ECMAScript) that runs after parsing is complete. This element is a child of <features> and a sibling of <ruleref>.
In this example, if the input is “help help help”, then foundhelp is 3.
<features>
  <ruleref uri="rbg_tagging.xml#SSM_to_remove"
           feature_generation="remove"/>
  <ruleref uri="rbg_tagging.xml#help_command"
           feature_generation="remove">
    <tag>cmd='help';</tag>
  </ruleref>
  <grammar_script>
    SWIjsPrint( ":::" + typeof(cmd) + "\n" );
    if ( typeof(cmd)!="undefined" && cmd == "help" )
    {
      foundhelp = (typeof(foundhelp)=="undefined" ? 1 : foundhelp + 1);
    }
  </grammar_script>
  ...
</features>
<semantic_models> section
The <semantic_models> section is a required section. It declares the SSM label classifiers, sets parameters, and lists all possible meanings returned by the SSM. The main entries in the section consist of <SSM> elements, which specify the label names, and define the associated meanings using the <meaning> element. The meaning element itself may have different attributes, as discussed below.
As previously discussed, SSM training configuration parameters can only be set in this section. See SSM training configuration parameters for details.
The training file fragment below specifies a classifier labelled "action". By default, the <SSM> element fills a slot of the specified label name. In this example, the action slot has possible values of dial and enroll:
<semantic_models>
<SSM label="action">
<meaning prior="-1.3">
dial
</meaning>
<meaning prior="-.8">
enroll
</meaning>
</SSM>
</semantic_models>
Named slots have precedence over labels. If an <SSM> element has both a label and named slots in its <meaning> elements, the label merely identifies the SSM, and the meanings determine the values of the named slots.
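For example, a sketch where the label identifies the SSM and the named slots carry the returned values (the label, slot names, and values here are illustrative):
<SSM label="routing">
  <!-- "routing" merely identifies the SSM; the meanings fill the named slots -->
  <meaning>
    <slot name="action">transfer</slot>
    <slot name="destination">billing</slot>
  </meaning>
  <meaning>
    <slot name="action">transfer</slot>
    <slot name="destination">support</slot>
  </meaning>
</SSM>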
You can specify an initial probability for each meaning using the prior attribute of the <meaning> element. The training program uses such initial probabilities as preliminary values, and adjusts them during processing. (For a related discussion, see use_prior_weight.)
By default, the special key SWI_meaning has the value of the concatenation of all labels set in the SSM (see SWI_meaning). You also can set SWI_meaning explicitly as a single slot, just as you would any label:
<SSM label="SWI_meaning">
<meaning> dial </meaning>
</SSM>
In the next example, a single decision by the classifier sets two slots, action and destination:
<semantic_models>
<SSM>
<meaning>
<slot name="action">dial</slot>
<slot name="destination">home</slot>
</meaning>
<meaning>
<slot name="action">dial</slot>
<slot name="destination">office</slot>
</meaning>
</SSM>
</semantic_models>
<meaning> element
The <meaning> element, which must be a child of an <SSM> element, is a container for slots and values. There is a limit of 5,000 meanings in an SSM.
default_meaning attribute
Use the default_meaning attribute to ensure that the SSM returns at least one meaning for a slot, even if the confidence score of the default is low. If there is no default, and the SSM finds no evidence for a meaning in the recognized words, Recognizer fills no slots with that SSM. If there is no other interpretation grammar, there is no recognition result; the utterance is marked out of grammar.
The example below defines a slot named rating, with possible values of normal (the default value) and high:
<SSM label="rating">
  <meaning default_meaning="true"> normal </meaning>
  <meaning> high </meaning>
</SSM>
Note: Only one default_meaning is allowed per SSM.
This attribute is most useful when activating a single SSM. But you can also use it for multiple activations. If your wrapper grammar activates parallel SSMs, each can return a default meaning. If you activate SSMs in prioritized groups, the first group that returns a default meaning takes precedence; subsequent groups are not considered.
reject_meaning attribute
To explicitly prevent a <meaning>, use the reject_meaning attribute. The purpose of this attribute is similar to the use of decoy words in an SRGS grammar (see SWI_decoy). If a meaning with this attribute is the top choice of the SSM, the SSM returns no meaning. If there is no other interpretation grammar, there is no recognition result (the utterance is assumed to be out of grammar).
You can assign the reject_meaning attribute to any number of meanings:
<SSM label="priority">
  <meaning reject_meaning="true"> normal </meaning>
  <meaning> high </meaning>
</SSM>
Training and test sections
The <training> and <test> sections declare sentences for training the SSM and estimating its accuracy. The elements have identical syntax and child elements.
The sentences in the <training> and <test> sections should each reflect the actual distribution of sentences seen in the application. The same literal sentence can appear in both sections; however, the training and test sections cannot be identical. The best way to select training sentences is to pick a number of sentences randomly from your corpus of data, and then use the remaining sentences for the test section.
Typically, a training file has one <training> and one <test> element, but you can divide sentences among several training or test sections to allow for different settings of the feature_extraction attribute.
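For example, a minimal sketch of a main body containing both sections (the sentences and slot values are illustrative, and all words must also be declared in the <features> section):
<training>
  <sentence count="1">
    <semantics> <slot name="route">billing</slot> </semantics>
    i have a question about my bill
  </sentence>
  ...
</training>
<test>
  <sentence count="1">
    <semantics> <slot name="route">billing</slot> </semantics>
    there is a problem with my bill
  </sentence>
  ...
</test>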
feature_extraction attribute
The feature_extraction attribute is intended to avoid extra processing when the sentences in the training file have already had their words converted into <ruleref> elements (that is, when feature extraction has already been performed manually).
<training feature_extraction="manual">
...training set...
</training>
Normally, you do not expand rulerefs in advance of training, and instead rely on automatic expansion. The default behavior is feature_extraction="automatic". Setting this attribute to "manual" is not recommended, and can have a severe effect on recognition accuracy if used improperly.
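As a sketch (the grammar URI, rule name, and slot value below are hypothetical), a manually extracted training section replaces the spoken words with a <ruleref> and can record the original words in the words attribute:
<training feature_extraction="manual">
  <sentence count="1">
    <semantics> <slot name="intent">book_flight</slot> </semantics>
    <!-- the ruleref must also be declared in the <features> section -->
    departing <ruleref uri="date.grxml#date" words="january twenty seventh"/>
  </sentence>
  ...
</training>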
Also see Testing and tuning SSMs.
<sentence> element
The <sentence> element defines a valid utterance in an SSM <training> or <test> section. The order of the sentences has no effect on the results.
To train a good SSM, populate the training file with 10,000 to 20,000 sentences, or approximately 200 sentences per label. In practice, your collected data may not cover all possibilities equally. For example, you might have 400 sentences for one label and only 50 for another. The distribution of labels (in both the training and test sections) must reflect the actual distribution in the application. Thus, it is important that the training and test sections draw their samples randomly from your corpus of data, if this is at all possible.
Below is a fragment of a training section that shows one sentence:
<training>
<sentence count="2.3">
<semantics>
dial
</semantics>
<!-- Use spaces to separate words -->
call home
</sentence>
...
</training>
Above, the sentence “call home” is represented 2.3 times for each training iteration. If this sentence is recognized, its <semantics> meaning (dial) fills the appropriate <SSM> slot, as determined in this case by the SSM label.
<semantics> element
Use the <semantics> element to define the returned meanings of a sentence.
Below, when “call home” is recognized, “dial” is returned to the application:
<sentence>
<semantics>
dial
</semantics>
call home
</sentence>
Sentences can have more than one meaning. Below is an example fragment where a sentence fills two slots:
<sentence>
<semantics>
<slot name="action">dial</slot>
<slot name="destination">home</slot>
</semantics>
call home
</sentence>
Sentence counts
Use the count attribute to multiply the occurrence of a sentence. This attribute is a short-cut that repeats the same sentence multiple times. The value of count is a floating-point number.
In a training sentence, this attribute increases the sentence’s weight in the resulting models. The following example implies that the sentence is 100 times more common than phrases with a count of 1 (the default):
<sentence count="100">
<semantics>
<slot name="example">sample</slot>
</semantics>
this is a sample
</sentence>
In a test sentence, this attribute increases the sentence’s weight in the resulting error rates. For example, if a test sentence has a count of 20 and is not correctly classified by the current iteration of the SSM, the error is counted 20 times, which affects the error rate more severely than if the sentence appeared only once.
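For example, a sketch of a weighted test sentence (the slot value is illustrative):
<test>
  <!-- if this sentence is misclassified, it contributes 20 errors to the reported error rate -->
  <sentence count="20">
    <semantics>
      <slot name="route">billing</slot>
    </semantics>
    i want to pay my bill
  </sentence>
</test>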