Scripts, tags, and semantic interpretation
Script language is used within a grammar to assign values to variable slots, as discussed in Sample grammar file revisited:
<ruleref uri="#No"/>
<tag> YesNo='no'</tag>
Here, the simple script within the tag assigns a value of 'no' to the YesNo slot when the No rule is triggered.
A well-designed grammar distinguishes between what the caller actually speaks (the raw text that was recognized) and the semantic meaning of that speech. For example, if there are five ways for the caller to say “direct my calls home”, the grammar must account for all of them. However, the grammar needs to pass only one meaning to the application, in a single format that the application will interpret correctly.
Your grammars can extract the caller’s meaning from different permutations of words, and pass this on in a single format, by setting keys and values in a script.
Using ECMAScript to assign key/value pairs
ECMAScript is the standardized scripting language on which JavaScript is based, and Recognizer accepts it by default. It is compatible with all three of the supported tag-formats (semantics/1.0, semantics/1.0-literals, and swi-semantics/1.0).
You can use ECMAScript within a <tag> element in your grammar file to add key/value pair assignments to rules. For example:
<rule id="myrule">
<item> first choice </item>
<tag> ECMAScript statement(s) </tag>
</rule>
You can include a <tag> element as a child of any of the following elements:
- <grammar>
- <item>
- <one-of>
- <rule>
- <ruleref>
Keys and values can be used to define the interface between the grammars and the application. You can change the grammar without affecting the application so long as the keys remain unchanged, because the interface stays the same.
After a recognition, you can get an XML representation of the current results (as returned in the result format for your environment or voice platform).
Consider the example of an application that routes calls to different destinations: callers ask to have their calls directed somewhere, and the application needs to know which action was requested and which location was named.
It will be far easier for this application to process two keys than to deal with all the sentence variations. You can use key/value pairs to isolate the meaning of many different sentences.
The table below shows the power of condensing different sentences to return an ACTION key (in all these cases, “direct calls”), and a LOCATION key.
| Sentences | Key/Value |
|---|---|
| direct my calls home | ACTION = direct calls, LOCATION = home |
| direct my calls to my car | ACTION = direct calls, LOCATION = car |
| direct my calls to the office | ACTION = direct calls, LOCATION = work |
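To illustrate why two keys are easier to process than raw sentences, here is a minimal sketch of an application-side handler in JavaScript. The function name routeCall, the DESTINATIONS table, and the returned messages are all hypothetical and chosen for this example; they are not part of Recognizer or of any voice platform API.

```javascript
// Hypothetical application-side handler: it sees only the keys returned by
// the grammar, never the caller's exact wording.
const DESTINATIONS = {
  home: "home line",
  car: "car phone",
  work: "office line",
};

function routeCall(result) {
  if (result.ACTION !== "direct calls") {
    return "unsupported action: " + result.ACTION;
  }
  const target = DESTINATIONS[result.LOCATION];
  if (target === undefined) {
    return "unknown location: " + result.LOCATION;
  }
  return "forwarding calls to " + target;
}

// Every wording of the same request arrives as the same pair of keys.
console.log(routeCall({ ACTION: "direct calls", LOCATION: "home" }));
```

Because the handler depends only on the ACTION and LOCATION keys, new phrasings can be added to the grammar without touching this code.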
How do you use scripts within a grammar, so the correct keys and values are returned for each covered sentence?
Here is a sample directcalls.grxml grammar that will parse any sentence listed in Example of key/value pairings, and assign appropriate values to the ACTION and LOCATION keys (using swi-semantics tag formatting):
<?xml version='1.0' encoding='UTF-8'?>
<grammar xml:lang="en-US" version="1.0" root="ROOT"
tag-format="swi-semantics/1.0"
xmlns="http://www.w3.org/2001/06/grammar">
<rule id="ROOT" scope="public">
<item> <ruleref uri="#CallCommand" />
<tag>
ACTION=CallCommand.ACTION;
LOCATION=CallCommand.LOCATION;
SWI_meaning = ACTION + ' ' + LOCATION
</tag>
</item>
</rule>
<rule id="CallCommand">
<item repeat="0-1"> please </item>
<ruleref uri="#DirectMyCalls" />
<tag>
ACTION='direct calls'
</tag>
<ruleref uri="#Where" />
<tag>
LOCATION=Where.LOC
</tag>
</rule>
<rule id="DirectMyCalls">
<one-of>
<item>direct</item>
<item>send</item>
</one-of>
<item repeat="0-1"> my </item>
calls
</rule>
<rule id="Where">
<one-of>
<item>
<ruleref uri="#Home" />
<tag>LOC='home'</tag>
</item>
<item>
<ruleref uri="#Car" />
<tag>LOC='car'</tag>
</item>
<item>
<ruleref uri="#Work" />
<tag>LOC='office'</tag>
</item>
</one-of>
</rule>
<rule id="Home">
<item repeat="0-1"> to my </item>
home
</rule>
<rule id="Car">
to my car
<item repeat="0-1"> phone </item>
</rule>
<rule id="Work"> to the office </rule>
</grammar>
In this grammar, consider the following points:
- Each rule has a corresponding ECMAScript object, called the rule object.
- Each rule object has ECMAScript object properties that are set by scripts. The scripts are run if the rule is invoked while parsing the sentence.
- The script in any rule can access objects and properties of its child objects once they have been set—that is, once the scripts associated with the child rules have been run.
- Scripts are executed by rules in left-to-right rule order (see below).
- The root rule is special, since each of its properties corresponds to a key returned by the grammar to the application. For instance, in our example, the script running on the ROOT rule sets the LOCATION, ACTION, and SWI_meaning keys.
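One way to picture these points is to treat each rule object as a plain ECMAScript object. The sketch below is only a mental model written in ordinary JavaScript, assuming one object per rule in the parse; it does not show how Recognizer implements rule objects internally.

```javascript
// Mental model only: each rule in the parse gets its own object.
const Where = {};
Where.LOC = "home";                   // script in the Where rule

const CallCommand = {};
CallCommand.ACTION = "direct calls";  // script after the DirectMyCalls reference
CallCommand.LOCATION = Where.LOC;     // reads a property of a child rule object

// Properties of the root rule object are the keys the application receives.
const ROOT = {};
ROOT.ACTION = CallCommand.ACTION;
ROOT.LOCATION = CallCommand.LOCATION;
ROOT.SWI_meaning = ROOT.ACTION + " " + ROOT.LOCATION;

console.log(ROOT);
```

Note that Where.LOC never reaches the application directly: only the properties copied onto the root object become keys.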
Scripting considerations
A full discussion of ECMAScript is beyond the scope of this manual; if you are not already familiar with ECMAScript, we recommend that you refer to a book specifically dedicated to the topic for a detailed discussion of all it offers. However, in the context of Nuance, a few points are worth highlighting.
Commonly-used operators and functions
ECMAScript offers several particularly useful operations:
- String concatenation: This can be performed using the + operator. Make sure that the operands are strings; if both operands are numbers, + performs addition instead.
- String fragment extraction: Performed using the substr function.
- String-to-integer conversion: Performed using the parseInt function.
- Integer-to-string conversion: Performed using the toString function.
- Regular expressions: These can help canonicalize the returned key. For example, you can remove all spaces with:
KEY = KEYWITHSPACES.replace(/[ ]+/g, '');
- The var keyword: Declaring a variable with var keeps it local to the script. Use this for variables in the root rule that you do not want to pass to the application:
<ruleref uri="#myrule" />
<tag>
var a = parseInt(myrule.SWI_literal,10);
SWI_meaning = a + ' ' + 'done';
</tag>
In this case, the variable a is not accessible to any other rule.
- Testing attribute instantiation: Often, the instantiation of a variable depends on a script being executed in an optional rule. If a script tries to manipulate an undefined variable during recognition, an error results. For instance:
<rule id="string">
<item repeat="0-1">
<ruleref uri="#D1"/>
<tag>VALUE = D1.V;</tag>
</item>
<ruleref uri="#D2" />
<tag>VALUE += D2.V</tag>
</rule>
In this case, if rule D1 is included in the parse, VALUE is instantiated and the script attached to D2 will run properly. Otherwise, an error will be returned since the script is incrementing an undefined variable. The usual way to handle this is to rewrite as follows:
<rule id="string">
<item repeat="0-1">
<ruleref uri="#D1" />
<tag>VALUE = D1.V; </tag>
</item>
<ruleref uri="#D2" />
<tag>VALUE = VALUE ? VALUE + D2.V : D2.V </tag>
</rule>
Alternatively, you can initialize VALUE to an empty string:
<rule id="string">
<tag>VALUE=''; </tag>
<item repeat="0-1">
<ruleref uri="#D1" />
<tag>VALUE=D1.V; </tag>
</item>
<ruleref uri="#D2" />
<tag>VALUE=VALUE+D2.V; </tag>
</rule>
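The operators and functions listed above can be tried in any standard ECMAScript environment. The following is a small sketch in plain JavaScript; the sample values (phone digits, counts) are made up for illustration and do not come from a real grammar.

```javascript
// + concatenates when either operand is a string, and adds when both are numbers.
const asText = "4" + "2";   // "42"
const asSum = 4 + 2;        // 6
const mixed = "4" + 2;      // "42" (the number is converted to a string)

// substr(start, length) extracts part of a string.
const areaCode = "5145551234".substr(0, 3);   // "514"

// parseInt converts a string to an integer; always pass the radix.
const count = parseInt("007", 10);            // 7

// toString converts a number back to a string.
const label = (7).toString();                 // "7"

// A regular expression with the g flag removes every run of spaces.
const key = "1 800 555 1234".replace(/[ ]+/g, "");   // "18005551234"

console.log(asText, asSum, mixed, areaCode, count, label, key);
```

The mixed case is the one that most often causes surprises: if a digit is held as a string, + silently concatenates instead of adding.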
While scripting, look out for the following common errors:
- Remember that as far as XML is concerned, each script is one long string: white space, new line characters, and so on are not significant.
- For comments, use the /* and */ delimiters and not the // syntax, since the latter comments out everything that follows it on the line, and the script is treated as a single line.
- A common error is to forget to put quotes around a literal:
<item> <tag>V=apple;</tag> apple </item> <!-- Single quotes are missing -->
Instead, the desired code is:
<item> <tag>V='apple';</tag> apple </item>
- However, Boolean values do not take quotes. A Boolean value that is delimited with quotes is treated as a literal string. For example, consider the following:
<tag>ordercomplete = "true";</tag>
This assigns the string "true" to the ordercomplete variable. To use a Boolean instead of a string, do not delimit the value with quotes. This example assigns the Boolean value true:
<tag>ordercomplete = true;</tag>
- A common error is to forget to escape characters that have special meaning to XML. These include > (&gt;), < (&lt;), and & (&amp;). See Escaped characters in ECMAScript.
- ECMAScript is interpreted. The grammar tools test for certain types of errors, but others may not be detected until runtime. We strongly recommend use of tools such as parseTool and test_parser to test scripts.
For an example, see Using parseTool to verify ECMAScript.
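Two of the errors above, quoted Booleans and unescaped XML characters, can be demonstrated in plain JavaScript. This is a sketch only: escapeForXml is a hypothetical helper written for this example, not a function supplied by Recognizer or its tools.

```javascript
// A quoted Boolean is just a non-empty string, and non-empty strings are truthy.
const quoted = "false";
const unquoted = false;
console.log(quoted ? "treated as true" : "treated as false");    // treated as true
console.log(unquoted ? "treated as true" : "treated as false");  // treated as false

// Hypothetical helper: escape a script before pasting it into a <tag> element.
// The ampersand is replaced first so that the entities produced by the
// other replacements are not escaped a second time.
function escapeForXml(script) {
  return script
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

console.log(escapeForXml("if (A > 1 && B < 5) OK = true;"));
// if (A &gt; 1 &amp;&amp; B &lt; 5) OK = true;
```

The first half shows why "false" in quotes behaves like true in a condition; the second shows the escaped form that a script containing comparisons must take inside the grammar file.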
Using parseTool to verify ECMAScript
Now that the basic relations among rules, objects, and scripts have been established, let’s follow the operation of the grammar in response to the phrase, “direct my calls home”. First, we enter the following command at the prompt:
parseTool directcalls.grxml -t_s
Note: We’re assuming that directcalls.grxml is in the same directory as parseTool.
The parseTool utility processes the file and displays the “next sentence” prompt:
PROG parseTool:
arg <spec-filename> == directcalls.grxml
arg <-test_sentences> == -t_s
next sentence:
Enter direct my calls home at the prompt. The output is as follows:
Parsing 'direct my calls home' with uri 'directcalls.grxml'...
Parse 0: {{{direct my calls DirectMyCalls} {{home Home} Where} CallCommand} ROOT}
<?xml version='1.0'?>
<result>
<interpretation grammar="ParseToolGrammar" confidence="100">
<input mode="speech">
direct my calls home
</input>
<instance>
<ACTION confidence="100">
direct calls
</ACTION>
<LOCATION confidence="100">
home
</LOCATION>
<SWI_meaning>
direct calls home
</SWI_meaning>
<SWI_literal>
Direct my calls home
</SWI_literal>
<SWI_grammarName>
ParseToolGrammar
</SWI_grammarName>
</instance>
</interpretation>
</result>
Parse successful, line 1
We can see that “direct my calls home” produces the expected results:
- The ACTION key is set to direct calls
- The LOCATION key is set to home
- The SWI_meaning key concatenates the other keys to get direct calls home
You can see how the ECMAScript assigns these values by looking at each step in the grammar, using the -debug_output option in the parseTool command line:
parseTool directcalls.grxml -debug_output -t_s
The -debug_output option lists each step that the grammar follows to assign values to the keys in a consistent format for each step:
| Output | Description |
|---|---|
| Step n: rule name | Execution index and rule name. |
| Enviro: | Lists all the ECMAScript objects that can be referenced before the script executes, as well as their values. |
| Input: | The property values for the current rule object before the script is executed. |
| Script: | The executed script. |
| Result: | The rule’s property values after the script is executed. |
For example, for the sentence “direct my calls home”, the output is as follows:
Step 0: CallCommand
Enviro: {SWI_vars:{}DirectMyCalls:{SWI_literal:Direct my calls SWI_spoken:Direct my calls SWI_confidence:1 }}
Input : {}
Script: ACTION='direct calls'
Result: {ACTION:direct calls }
Step 1: Where
Enviro: {SWI_vars:{}Home:{SWI_literal:Home SWI_spoken:Home SWI_confidence:1 }}
Input : {}
Script: LOC='home'
Result: {LOC:home }
Step 2: CallCommand
Enviro: {SWI_vars:{}DirectMyCalls:{SWI_literal:Direct my calls SWI_spoken:Direct my calls SWI_confidence:1 }Where:{LOC:home SWI_literal:Home SWI_spoken:Home SWI_confidence:1 }}
Input : {ACTION:direct calls }
Script: LOCATION=Where.LOC
Result: {ACTION:direct calls LOCATION:home }
Step 3: ROOT
Enviro: {SWI_vars:{}CallCommand:{ACTION:direct calls LOCATION:home SWI_literal:Direct my calls home SWI_spoken:Direct my calls home SWI_confidence:1 }}
Input : {}
Script: ACTION=CallCommand.ACTION;
LOCATION=CallCommand.LOCATION;
SWI_meaning = ACTION + ' ' + LOCATION
Result: {ACTION:direct calls LOCATION:home SWI_meaning:direct calls home }
The process starts at the first executed script, and proceeds in order of script execution. Note that although control is top-down, the order of script execution is bottom-up: a script cannot execute until its child scripts do. This allows the child scripts to compute any properties that are required by the parent scripts.
In our example, the ROOT rule only executes a script if the CallCommand rule returns a positive result. In this case, the ROOT script takes values from the CallCommand object (CallCommand.ACTION and CallCommand.LOCATION) which have been computed by the CallCommand rule.
However, the CallCommand rule only executes scripts if the DirectMyCalls and Where rules return positive results. First it refers to the DirectMyCalls rule, and assigns the “direct calls” value to ACTION if DirectMyCalls returns a positive result (Step 0). Then (and only then) it refers to the Where rule to get a LOCATION, which will be the LOC value returned by the Where rule.
The Where rule itself refers to three other rules (Home, Car, and Work) to determine which value to assign to the LOC variable (Step 1). Once this is done, the CallCommand rule is able to assign this value to LOCATION (Step 2).
Finally, with all the values now determined, the ROOT rule is able to retrieve the ACTION and LOCATION from the CallCommand rule, and to concatenate them to obtain the SWI_meaning (Step 3).
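The ordering described above can be reproduced with a small simulation. The sketch below is a mental model in plain JavaScript, not Recognizer's implementation: it assumes each rule body is a left-to-right list of parts, where a part is either a reference to a child rule or a script, and the data structure and function names (parse, run, steps) are invented for this example.

```javascript
// Mental model of script ordering; not Recognizer's actual implementation.
// Each rule body is a left-to-right list of parts: a reference to a child
// rule ("ref") or a script ("script") that sets properties on the rule object.
const parse = {
  name: "ROOT",
  parts: [
    { ref: {
        name: "CallCommand",
        parts: [
          { ref: { name: "DirectMyCalls", parts: [] } },
          { script: (self) => { self.ACTION = "direct calls"; } },
          { ref: {
              name: "Where",
              parts: [
                { ref: { name: "Home", parts: [] } },
                { script: (self) => { self.LOC = "home"; } },
              ],
          } },
          { script: (self, kids) => { self.LOCATION = kids.Where.LOC; } },
        ],
    } },
    { script: (self, kids) => {
        self.ACTION = kids.CallCommand.ACTION;
        self.LOCATION = kids.CallCommand.LOCATION;
        self.SWI_meaning = self.ACTION + " " + self.LOCATION;
    } },
  ],
};

const steps = [];

function run(rule) {
  const self = {};   // this rule's object
  const kids = {};   // objects of child rules that have already finished
  for (const part of rule.parts) {
    if (part.ref) {
      kids[part.ref.name] = run(part.ref);   // descend before continuing
    } else {
      part.script(self, kids);               // run the script in place
      steps.push(rule.name);                 // record which rule executed it
    }
  }
  return self;
}

const keys = run(parse);
console.log(steps.join(", "));   // CallCommand, Where, CallCommand, ROOT
console.log(keys.SWI_meaning);   // direct calls home
```

The recorded order matches Steps 0 through 3: a script runs only after every child rule that precedes it has finished, which is why the ROOT script runs last.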
A more detailed discussion of these steps follows.
In Step 0, “direct my calls” is parsed by DirectMyCalls:
Step 0: CallCommand
Enviro: {SWI_vars:{}DirectMyCalls:{SWI_literal:Direct my calls SWI_spoken:Direct my calls SWI_confidence:1 }}
Input : {}
Script: ACTION='direct calls'
Result: {ACTION:direct calls }
- Upon entry, Enviro indicates the SWI_vars object and the SWI_literal property that is set by DirectMyCalls.
- SWI_vars is an object accessible to all rules (see SWI_vars).
- SWI_literal is a special key that contains the recognized text (see SWI_literal).
- There are no properties set in the CallCommand object, so Input is empty.
- Script indicates that the script ACTION='direct calls' is executed.
- After execution, Result indicates that the ACTION property of the CallCommand object is set to 'direct calls'.
There is an important distinction between the rule that executes the script (CallCommand) and the rule that triggers that execution (DirectMyCalls). The script is executed by CallCommand, but it is triggered because the DirectMyCalls rule returns a positive result. DirectMyCalls does not assign the value itself.
This distinction is important in cases where you may need to use the same rule to activate different scripts. For example, the city returned by a City rule could be assigned as a point of origin, or a destination, depending on the context.
Next, CallCommand invokes the Where rule, which executes its own script:
Step 1: Where
Enviro: {SWI_vars:{}Home:{SWI_literal:Home SWI_spoken:Home SWI_confidence:1 }}
Input : {}
Script: LOC='home'
Result: {LOC:home }
The Where rule itself executes one of three scripts, depending on which of its three child rules (Home, Car, or Work) recognizes the utterance. In our example, the Home rule is activated. Again, the Home rule itself does not execute any scripts: the Where rule executes the script LOC='home' based on the positive Home result. The parseTool output for Step 1 is analogous to that of Step 0.
Now that the Where rule has returned a value, CallCommand can finally execute the script attached to that rule: "LOCATION=Where.LOC":
Step 2: CallCommand
Enviro: {SWI_vars:{}DirectMyCalls:{SWI_literal:Direct my calls SWI_spoken:Direct my calls SWI_confidence:1 }Where:{LOC:home SWI_literal:Home SWI_spoken:Home SWI_confidence:1 }}
Input : {ACTION:direct calls }
Script: LOCATION=Where.LOC
Result: {ACTION:direct calls LOCATION:home }
- Upon entry, Enviro indicates that CallCommand can access properties of its descendant rule’s objects. In particular, it can access the Where.LOC value.
- Input indicates that CallCommand already has its ACTION property set.
- After the script executes, Result indicates that it now also has its LOCATION property set. This property is set to “home”, which is the LOC property of the Where rule object.
Now that CallCommand has executed all its scripts, the main ROOT rule script can execute. The parseTool executable labels the rule with its filename (directcalls.grxml) to indicate that this is the grammar’s root rule:
Step 3: ROOT
Enviro: {SWI_vars:{}CallCommand:{ACTION:direct calls LOCATION:home SWI_literal:Direct my calls home SWI_spoken:Direct my calls home SWI_confidence:1 }}
Input : {}
Script: ACTION=CallCommand.ACTION;
LOCATION=CallCommand.LOCATION;
SWI_meaning = ACTION + ' ' + LOCATION
Result: {ACTION:direct calls LOCATION:home SWI_meaning:direct calls home }
At the beginning of this step, none of the ROOT rule’s properties have been set yet, since they all depend on the results from the CallCommand rule. After the script executes, the LOCATION, ACTION, and SWI_meaning keys have all been set. These keys can then be accessed by your application.
A few things to note in this example:
- Scripts appear within <tag></tag> pairs inside other elements (as per the W3C grammar specification), such as the <item> and <one-of> elements. For example, in a stock quote application, we might have the following excerpt:
<one-of>
<item>
<tag>ticker='IBM'</tag>big blue
</item>
<item>
<tag>ticker='IBM'</tag>international business machines
</item>
<item>
<tag>ticker='T'</tag>a t and t
</item>
</one-of>
- We set a key called SWI_meaning by concatenating the ACTION and LOCATION properties. This is a special key (see SWI_meaning).
- The fact that processing was left-to-right was not important here, since the scripts computing ACTION and LOCATION were independent of each other. However, in some cases there is more dependence between different parts of the grammar, and you have to pay attention to order in those cases.