A grammar defines the words and word sequences that the recognizer can recognize, along with the interpretations for those utterances.

The general recognition process is summarized in the following steps:

  1. The gRPC client instructs the recognizer to load and activate a grammar.
  2. The gRPC client plays an audio prompt designed to elicit a user response, and simultaneously records incoming audio to send to the recognizer.
  3. The recognizer listens for speech.
  4. After detecting speech, the recognizer searches for a match between the words spoken and the activated grammar(s).
  5. The recognizer returns the result of this search to the gRPC client or to another component. If this result is positive, the recognizer may pass values to application variables.
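The steps above can be sketched as a toy model in plain Python. This is only an illustration of the flow: `ToyRecognizer`, its methods, and the status strings are hypothetical names invented for this sketch, not part of the recognizer's gRPC API, and the utterance is passed in as text rather than as audio.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRecognizer:
    """Toy stand-in for the recognizer (hypothetical, not the real service)."""
    # Maps grammar_id -> {phrase: interpretation}
    active: dict = field(default_factory=dict)

    def load_and_activate(self, grammar_id, phrases):
        # Step 1: the client asks the recognizer to load and activate a grammar.
        self.active[grammar_id] = {p.lower(): v for p, v in phrases.items()}

    def recognize(self, utterance):
        # Steps 3-4: "listen" (the utterance is already text here) and search
        # every active grammar for a match.
        heard = utterance.strip().lower()
        if not heard:
            return {"status": "NO_INPUT"}
        for grammar_id, phrases in self.active.items():
            if heard in phrases:
                # Step 5: a positive result carries the interpretation.
                return {
                    "status": "MATCH",
                    "grammar_id": grammar_id,
                    "interpretation": phrases[heard],
                }
        return {"status": "NO_MATCH"}


recognizer = ToyRecognizer()
recognizer.load_and_activate("yes_no", {"yes": True, "yeah": True, "no": False})
# Step 2 (playing the prompt and streaming audio) is omitted in this toy model.
result = recognizer.recognize("Yeah")
print(result)
```

In a real client, steps 1 and 2 would be carried out through the gRPC messages described in the rest of this section.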

There are three basic sources for grammars:

  • Built-in grammars: These grammars are included automatically as part of the recognition package, to provide coverage for common words like numbers and dates. They can be used without further development.

  • Inline grammars: Very small grammars may be coded in the voice application file directly, and interpreted by the gRPC client at runtime.

  • External grammars: These grammars are created individually by a grammar developer. They are contained in separate files, which are invoked as needed from within the voice application (.vxml) file.

External and inline grammars must be coded by the grammar developer, while built-in grammars are available when the built-in grammar service is running.

The resources field of RecognitionInit takes the RecognitionResource message, in which you define one or more grammars to be used for the recognition. For each grammar, you must specify the language in the language field. To set the grammar's weight relative to the other grammars active for that recognition, enter a value from 1 to 32767 in the weight field. The grammar_id field specifies the ID that Mix uses to identify the grammar in the recognition result. If it is not set, Mix generates a unique ID.
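The field rules above can be illustrated with a small stand-in class. This is a sketch only: `ResourceSketch` is a hypothetical plain-Python substitute for the generated protobuf class, the inline and URI grammars are simplified to strings, and the "exactly one grammar source" check is an assumption made for this example. The built-in name and the URL are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceSketch:
    """Hypothetical stand-in for a RecognitionResource entry."""
    language: str                          # required for every grammar
    builtin: Optional[str] = None          # name of a built-in resource
    inline_grammar: Optional[str] = None   # simplified: grammar data as text
    uri_grammar: Optional[str] = None      # simplified: grammar location
    weight: Optional[int] = None           # optional, 1 to 32767
    grammar_id: Optional[str] = None       # optional; generated if unset

    def __post_init__(self):
        if not self.language:
            raise ValueError("language is required for every grammar")
        # Assumption for this sketch: one grammar source per resource.
        sources = [self.builtin, self.inline_grammar, self.uri_grammar]
        if sum(s is not None for s in sources) != 1:
            raise ValueError(
                "set exactly one of builtin, inline_grammar, uri_grammar"
            )
        if self.weight is not None and not 1 <= self.weight <= 32767:
            raise ValueError("weight must be between 1 and 32767")


resources = [
    # "digits" is a placeholder built-in name, not taken from a data pack.
    ResourceSketch(language="en-US", builtin="digits", weight=2,
                   grammar_id="digits"),
    # Placeholder location; grammar_id is left unset on purpose.
    ResourceSketch(language="en-US",
                   uri_grammar="https://example.com/grammars/fruits.grxml"),
]
```

Validating the weight on the client side like this is a design choice for the sketch; the service applies its own checks.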

Use a built-in grammar

The RecognitionResource message contains the builtin field, in which you specify the name of the built-in resource in the data pack that you want to use in the recognition. Leave it blank if you do not want to use a built-in grammar.

Use an inline grammar

The RecognitionResource message contains the inline_grammar field, which takes the InlineGrammar message. The InlineGrammar message contains the grammar field, in which you specify the grammar data, and it also uses the EnumMediaType message.

Use an external grammar

The RecognitionResource message contains the uri_grammar field, which takes the UriGrammar message. The UriGrammar message contains the uri field, which specifies the location of the grammar resource as a URN reference, and it also uses the EnumMediaType and UriGrammarParameters messages.

A simple external grammar resource, such as a grammar for recognizing fruits, might be stored in a file named fruits.grxml.
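As an illustration, a fruit grammar of this kind could be written in the W3C SRGS XML format. The grammar below is a hypothetical sketch, not the original fruits.grxml: the rule name, the fruit list, and the semantic tags are assumptions. It is embedded in a short Python script that checks that the XML is well formed and lists the phrases it covers.

```python
import xml.etree.ElementTree as ET

# Hypothetical SRGS XML grammar; not the original fruits.grxml file.
FRUITS_GRXML = """<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" root="fruit" tag-format="semantics/1.0">
  <rule id="fruit" scope="public">
    <one-of>
      <item>apple <tag>out = "apple";</tag></item>
      <item>banana <tag>out = "banana";</tag></item>
      <item>orange <tag>out = "orange";</tag></item>
    </one-of>
  </rule>
</grammar>
"""

# SRGS elements live in this XML namespace.
NS = {"srgs": "http://www.w3.org/2001/06/grammar"}

# Parsing raises xml.etree.ElementTree.ParseError if the XML is malformed.
root = ET.fromstring(FRUITS_GRXML.encode("utf-8"))

# Each <item> starts with the phrase to recognize, followed by its <tag>.
fruits = [item.text.strip() for item in root.findall(".//srgs:item", NS)]
print(fruits)  # prints ['apple', 'banana', 'orange']
```

Saved to a file and hosted at a reachable location, a grammar like this would be referenced through the uri field of the UriGrammar message.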