Natural language dialog design

In natural speech, a single spoken sentence can contain several pieces of information. Natural language techniques make it easy to interpret the meaning of a sentence, and to identify which pieces of information belong in each slot. This means that natural language techniques are very useful in mixed-initiative dialogs. They give callers some control over the sequence of questions, because callers can reply with more information than they have been asked to provide.

Example

We’ve explored the concept of a grammar slot (see Defining information slots) and the difference between directed and mixed-initiative dialogs (see Directed dialogs vs. mixed-initiative dialogs). Let’s take a look at another example.

A restaurant guide application might define slots for the general location, food type, and price level for the restaurant a caller is looking for. In a directed dialog, the application will ask for each piece of information in sequence. However, a mixed-initiative dialog might open with a very general prompt, such as “Welcome to our service. How can I help you today?”

When a caller replies to this initial prompt with a sentence like "I'm looking for a restaurant in Madrid where the meal will cost around five euros per person", natural language technology allows the application to extract the location and the price level. That might leave the restaurant type to be specified, so the next step in the dialog would then be to find out the type of restaurant the caller is looking for.

If a caller gives all the needed information in a single utterance, no further questions are necessary, except perhaps for a confirmation question.
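The slot-filling flow in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slot names, the prompt wording, and the helper functions merge_slots and next_prompt are hypothetical, and the natural language interpretation itself is assumed to have already produced a dictionary of slot values.

```python
from typing import Optional

# Hypothetical slots and follow-up prompts for the restaurant guide example.
SLOT_PROMPTS = {
    "location": "Which city or area are you interested in?",
    "food_type": "What type of food would you like?",
    "price_level": "About how much would you like to spend per person?",
}


def merge_slots(current: dict, interpreted: dict) -> dict:
    """Return a new dialog state with newly interpreted slot values added."""
    merged = dict(current)
    for slot, value in interpreted.items():
        # Ignore unknown slots and empty values.
        if slot in SLOT_PROMPTS and value is not None:
            merged[slot] = value
    return merged


def next_prompt(filled_slots: dict) -> Optional[str]:
    """Return the prompt for the first unfilled slot, or None if all are filled."""
    for slot, prompt in SLOT_PROMPTS.items():
        if filled_slots.get(slot) is None:
            return prompt
    return None


# The caller's first utterance supplied the location and the price level.
state = merge_slots({}, {"location": "Madrid", "price_level": "around 5 euros"})
print(next_prompt(state))  # asks for the food type
```

When next_prompt returns None, every slot is filled and the dialog can move on to an optional confirmation question.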

Designing natural language prompts

As discussed in Constructing a dialog, the wording of prompts heavily influences how a caller replies to them. For natural language purposes, this means that the prompts have a profound influence on the statistical model that results from user data. Use the same prompts during data collection that are planned for the final application.

As with all prompts, design a question with the audience in mind. Why has the caller phoned? If possible, interview the people who handle transactions with the callers. These interviews can reveal categories of callers, each of which can be profiled and used to develop questions and sample sentences. Avoid administrative information in your prompts; for example, adding privacy statements and instructions can confuse callers.

When prompts ask broad questions, callers can be uncertain how to phrase their responses. Adding examples to prompts allows callers to mimic the examples in their responses. Such examples help callers to narrow their choices, and speak in a manner expected by the application.

When creating examples, a good practice is to have one example be a statement and another example be a question:

System

Thanks for calling the appliance repair line. What problem are you experiencing? <pause> For example, ‘My dryer isn’t drying my clothes,’ or ‘How do I rinse my dishes without washing them?’

In confirmation questions, be clear that you are not repeating the caller’s speech verbatim. In a natural language dialog, the system does not confirm the exact words spoken by the caller. Instead, it confirms a "guess" of the meaning of the utterance. Word choice matters here: the application maps the caller’s exact words to a meaning, and a confirmation that simply names that meaning can confuse the caller. Here is an example that a caller might find confusing:

Caller

Uh, yes, uh when I click to check email nothing happens, it doesn't work.

System

I think you said "email connectivity", is that right?

Some callers will not understand the relationship between “email not working” and “email connectivity”. They might incorrectly answer “no” to this confirmation prompt. A better confirmation prompt to use might be:

System

I think you’re having a problem with email, is that right?
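One way to keep confirmation wording close to how callers speak is to store a caller-friendly phrase for each meaning, separate from the internal label. The sketch below assumes hypothetical meaning labels (such as email_connectivity), a hypothetical second entry, and a hypothetical fallback prompt; none of these come from a specific product.

```python
# Hypothetical mapping from internal meaning labels to caller-friendly wording.
CONFIRMATION_PHRASES = {
    "email_connectivity": "you're having a problem with email",
    "password_reset": "you need help with your password",
}

# Hypothetical fallback used when no friendly wording has been defined.
FALLBACK_PROMPT = "Sorry, could you describe the problem again?"


def confirmation_prompt(meaning: str) -> str:
    """Build a confirmation question from the interpreted meaning.

    The prompt confirms the meaning in everyday words instead of
    reading the internal label back to the caller.
    """
    phrase = CONFIRMATION_PHRASES.get(meaning)
    if phrase is None:
        return FALLBACK_PROMPT
    return f"I think {phrase}, is that right?"


print(confirmation_prompt("email_connectivity"))
```

Keeping the wording in a table like this makes it easy to review every confirmation prompt against real caller phrasing.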

Some prompts may elicit responses that are ambiguous, such as “I need help”. The best prompts lead to few ambiguous utterances (though it is nearly impossible to eliminate all ambiguity). When they occur, the ambiguous utterances may be handled with a dialog (application logic and supporting grammar) designed to handle the ambiguity, as in the following example:

System

“Hello. What can Metro Insurance do for you today?”

Caller

“Payments”

System

“O.K. Would you like to make, schedule, report, or verify a payment? You can also say ‘payment location.’”

The system asks a specific question to disambiguate what the caller means by “payments”. Note also that the grammar for the second prompt may or may not be a natural language grammar: the question is specific enough that a regular SRGS grammar may be enough to cover the possible replies.
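The disambiguation step above can be sketched as a small piece of application logic. This is a minimal illustration, not a real grammar: the keyword table stands in for whatever grammar covers the second prompt, and the meaning labels and function names are hypothetical.

```python
import re
from typing import Optional

# Hypothetical follow-up prompts for meanings that are too vague to act on.
DISAMBIGUATION_PROMPTS = {
    "payments": (
        "O.K. Would you like to make, schedule, report, or verify a payment? "
        "You can also say 'payment location.'"
    ),
}

# Hypothetical keyword table standing in for the follow-up grammar.
PAYMENT_OPTIONS = {
    "make": "make_payment",
    "schedule": "schedule_payment",
    "report": "report_payment",
    "verify": "verify_payment",
    "payment location": "payment_location",
}


def follow_up_prompt(meaning: str) -> Optional[str]:
    """Return a disambiguation prompt if the meaning is ambiguous, else None."""
    return DISAMBIGUATION_PROMPTS.get(meaning)


def interpret_payment_reply(reply: str) -> Optional[str]:
    """Map the caller's reply to a specific payment task, or None if no match."""
    text = reply.lower()
    # Try longer phrases first so "payment location" is matched as a whole.
    for phrase in sorted(PAYMENT_OPTIONS, key=len, reverse=True):
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return PAYMENT_OPTIONS[phrase]
    return None


print(follow_up_prompt("payments"))
print(interpret_payment_reply("I'd like to schedule a payment"))
```

A reply that matches none of the options returns None, at which point the application could reprompt or fall back to another strategy.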