Defining information slots

The key task of any grammar is to convert the caller’s spoken language into information that the main application can use to perform its actions and tasks. The grammar supplies this information by interpreting the caller’s spoken responses and matching them to values that it assigns to application variables.

A grammar passes each value on to the application file by filling a grammar slot. The information in the grammar slot is then passed to a variable in the main application.

A slot is similar to a variable, in that it has a name and a value that can be accessed by the application. However, a slot is more than a variable:

  • Slots are associated with individual meanings in utterances.
  • A single sentence (utterance) can contain more than one slot.
  • A slot can have a confidence score, to rate how close a match it appears to be.
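The three properties above can be pictured with a small sketch. It is not the Recognizer's own data model: the `Slot` class, the 0.0 to 1.0 confidence scale, and the `min_confidence` threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Slot:
    """Hypothetical model of a grammar slot: a named value plus a confidence."""
    name: str
    value: Any
    # Assumed scale: 0.0 (poor match) to 1.0 (close match).
    confidence: Optional[float] = None


def slots_to_variables(slots, min_confidence=0.5):
    """Copy slot values into application variables, skipping weak matches."""
    variables = {}
    for slot in slots:
        if slot.confidence is not None and slot.confidence < min_confidence:
            continue  # the application would re-prompt for this item
        variables[slot.name] = slot.value
    return variables


# One utterance, such as "from Boston to Denver", can fill more than one slot.
utterance_slots = [
    Slot("origin", "BOS", confidence=0.92),
    Slot("destination", "DEN", confidence=0.41),
]
variables = slots_to_variables(utterance_slots)
```

Here only `origin` reaches the application, because the `destination` match falls below the assumed threshold.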

The slots that will have to be filled by your grammar are determined by the items in your database, the tasks your application performs, and other factors.

Therefore, a good first step in grammar design is to identify those information items that the main application requires. Doing so gives you a good idea of the slots you will need to fill in your grammar, and thus the questions you will have to ask the user. In cases where some of the information depends on the type of action the user requests, this can also determine the order of your prompts.

Use the following questions to outline the slots required in a grammar:

  1. What pieces of information are required to complete the task?
  2. In what order must the information be requested?
  3. What results will be returned to the application?
  4. What format will they use?

For example, for an air travel application you may need to ask the user for two cities (origin and destination), a date, and a time, and then ask the user to confirm that the assembled information is correct (a yes/no question). That’s five pieces of information the user must supply. You can summarize them in a table:

Item       Slot name      Value format        Value type
city #1    origin         3-letter code       string
city #2    destination    3-letter code       string
date       date           [<month> <day>]     NL structure
time       time           0-2359              integer
yes/no     confirm        "yes" or "no"       string
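The slot table above can also be written down as a set of format checks on the application side. This is only a sketch: the validator functions are hypothetical, the date is assumed to arrive as a two-item sequence of month name and day number, and the time is assumed to be an HHMM integer, so the minutes must be below 60.

```python
import re

MONTHS = ("january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december")


def is_airport_code(value):
    # 3-letter code, assumed to be upper case (for example "BOS").
    return isinstance(value, str) and re.fullmatch(r"[A-Z]{3}", value) is not None


def is_date(value):
    # Assumed shape of the [<month> <day>] structure: ["july", 4].
    if not isinstance(value, (list, tuple)) or len(value) != 2:
        return False
    month, day = value
    return month in MONTHS and isinstance(day, int) and 1 <= day <= 31


def is_time(value):
    # 0-2359, read as HHMM (assumption): the last two digits are minutes.
    return isinstance(value, int) and 0 <= value <= 2359 and value % 100 < 60


def is_confirm(value):
    return value in ("yes", "no")


SLOT_VALIDATORS = {
    "origin": is_airport_code,
    "destination": is_airport_code,
    "date": is_date,
    "time": is_time,
    "confirm": is_confirm,
}


def invalid_slots(filled):
    """Return the sorted names of slots that are missing or badly formatted."""
    return sorted(
        name for name, check in SLOT_VALIDATORS.items()
        if name not in filled or not check(filled[name])
    )


filled = {"origin": "BOS", "destination": "DEN",
          "date": ["july", 4], "time": 1275, "confirm": "yes"}
problems = invalid_slots(filled)
```

In this example `problems` contains only `time`, since 1275 is inside the 0-2359 range but does not describe a valid minute value.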

Once you know what information you need, you can decide on the best way to prompt the user for it.

Information dependencies

The structure of information used in your application is critical in determining the flow of your dialog. If the information you need to collect will depend on the user’s choices, some questions must be asked before others. For example, in a flight reservation application, the information flow may include several decision points:

  • In the air travel application example, the dates and times of available flights will depend on the cities the user selected as the origin and destination.
    • The voice application must ask for the cities first, and submit them to the application in order to obtain a list of dates and times of available flights.
    • Next, it tells the caller the dates and times of available flights from the origin to the destination, and the caller must select one such option.
    • Finally, the application plays back the caller’s selections and asks for a confirmation. This confirmation can only occur once the other four pieces of information have been chosen, so it must be the last piece of information collected.
  • If your application offers more than one task, the caller must indicate which task is to be performed. For example, a flight reservation application may allow callers to make new reservations, and verify or cancel existing ones. Your grammar must be able to indicate this choice to the main application.
  • The task chosen may determine what other information is required. If the caller wants to check an existing reservation, the application may only need the reservation number; but to make a new reservation, the application may have to go through an entire dialog to establish the departure city, destination, date, and time as we’ve seen in previous examples.
  • The information flow will not necessarily be all one way, from the caller to the application: it may be necessary to present the caller with a list of options at different points in the dialog. In the flight reservation example, flights between two specific cities may only be available on certain dates. Once you know the cities, you must provide a list of flights before the caller can then select a date and time. You may even have to ask the caller to choose a range of dates in order to limit the choices.
  • If your application will be writing to or drawing information from a database, the fields in that database may be a good indication of the slots you will have to include in your grammar. However, be careful. Not all fields may be relevant for your application, and you may have to approach them in a different order. For example, to make a new reservation in your database you may only need to supply the traveler’s name, the date, and the flight number; but in the application you may also need to find out the origin and destination in order to give the caller a list of flights to choose from.
  • Not all information appearing in the dialog must necessarily be represented in the grammar. For example, the flight reservation application may note the date that the reservation was made, in order to restrict cancellations to within a week of purchase. The application may then repeat this date to the caller along with the warning about cancellation restrictions. However, even if the caller later decides to cancel the reservation, any checks against the original date of purchase and cancellation deadline will be internal to the application. The caller will never need to speak either date, since the application can retrieve both based on the reservation number.
  • You may find it helpful to draw a flow chart of the information dependencies in your application. This provides a visual map of the different branches your application will have to cover, and may also provide some guidance as to the best strategy to use in designing the dialog.
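The dependencies described in this list can be captured as data as well as in a flow chart. The sketch below is one possible encoding, not a prescribed design: the task names, the `reservation_number` slot, and the `ready_slots` helper are hypothetical. Each slot maps to the set of slots that must be filled before it can be requested.

```python
# For each task, map every slot to the slots that must be filled first.
DEPENDENCIES_BY_TASK = {
    "new_reservation": {
        "origin": set(),
        "destination": set(),
        # Available dates and times depend on the two cities.
        "date": {"origin", "destination"},
        "time": {"origin", "destination"},
        # The confirmation comes last, after the other four items.
        "confirm": {"origin", "destination", "date", "time"},
    },
    # Checking an existing reservation may only need its number.
    "check_reservation": {
        "reservation_number": set(),
    },
}


def ready_slots(task, filled):
    """Return the sorted slots that can be requested next for this task."""
    dependencies = DEPENDENCIES_BY_TASK[task]
    have = set(filled)
    return sorted(
        name for name, needs in dependencies.items()
        if name not in have and needs <= have
    )


first = ready_slots("new_reservation", {})
after_cities = ready_slots("new_reservation", {"origin": "BOS", "destination": "DEN"})
```

With nothing filled, only the two cities can be requested. Once both are known, the date and time become available, which matches the ordering described above.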

Confirmations

At various points in the dialog, you may ask for confirmations to ensure that Recognizer has interpreted the caller’s statements correctly (and that the caller hasn’t changed his or her mind). Such confirmations can be covered by existing built-in grammars, but you must still include information slots for the results.

If the caller rejects the interpretation, you must correct any mistakes. Rather than repeat the dialog from scratch, you may want to let the caller choose which items need to be corrected. This too will require a slot in the grammar.
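A minimal sketch of how an application might act on these two slots follows. The `confirm` values match the earlier table. The `correct` slot, which names the item the caller wants to change, and the `apply_confirmation` helper are assumptions introduced here.

```python
def apply_confirmation(filled, confirm, correct=None):
    """Return the slot values to keep after a confirmation turn.

    filled  -- dict of slot name to value collected so far
    confirm -- "yes" or "no", from the confirmation grammar
    correct -- hypothetical slot naming the single item to re-ask, if any
    """
    if confirm == "yes":
        return dict(filled)
    if correct is None:
        # The caller rejected the result without naming an item: start over.
        return {}
    # Keep everything except the item the caller wants to correct.
    return {name: value for name, value in filled.items() if name != correct}


collected = {"origin": "BOS", "destination": "DEN", "time": 1230}
kept = apply_confirmation(collected, "no", correct="time")
```

After this call the application would prompt again only for `time`. If the corrected item is one that other items depend on, such as a city, the application may also need to clear the dependent items.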

Universals

Many applications will include a small set of general commands that can be invoked at any point: “main menu”, “repeat”, “help”, “cancel”, “exit”, and so on. These values must also be put into a slot that will direct the application to the corresponding action, so you must create a grammar that is able to do so.
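As an illustration, the application could check a dedicated slot for universal commands before handling any task-specific slots. The slot name `command`, the action names, and the `route` helper are hypothetical.

```python
# Hypothetical mapping from universal command values to application actions.
UNIVERSAL_ACTIONS = {
    "main menu": "goto_main_menu",
    "repeat": "replay_last_prompt",
    "help": "play_help",
    "cancel": "cancel_task",
    "exit": "end_call",
}


def route(slots, default_action):
    """Pick the action for this turn, giving universal commands priority."""
    command = slots.get("command")
    if command in UNIVERSAL_ACTIONS:
        return UNIVERSAL_ACTIONS[command]
    return default_action


action = route({"command": "help"}, default_action="collect_origin")
```

Because the check runs on every turn, a caller can say "help" in the middle of any task and still be directed to the right action.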