Designing prompts

Once you have an idea of how you would like the dialog to flow, you can design prompts that will encourage the responses you want.

The caller’s responses are shaped by how you ask your questions, and well-designed prompts can lead users to respond naturally with words that fall within your grammar. Because the dialog and prompts are so closely linked, you must design the prompts while planning your grammars, rather than treating these tasks as separate steps.

How do you guess the most common responses? Fortunately, it turns out that there are two types of responses that are by far the most common:

The information item by itself
The literal response to the question wording

If your prompt asks “What is your departure city?” most responses will be just a city name like “Miami”, with no other words spoken. A smaller group of responses will contain phrases like “My departure city is Miami”, or “departure from Miami” that closely parallel the prompt wording.

If you change the prompt to “What city will you leave from?” you’ll still get many city-only responses; but you are less likely to get “My departure city is Miami”. Instead you’re more likely to get “Miami”, “from Miami”, “I’m leaving from Miami” or “leaving Miami” as responses. Few responses are likely to include the word “departure”, since it does not appear in the prompt.
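A grammar for the second prompt could cover both response types: the city by itself, and the city preceded by words that echo the prompt. The following is a minimal sketch in the W3C SRGS XML format, assuming your platform accepts grammars in that format. The rule names and the three-city list are illustrative only, and semantic interpretation tags are omitted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="origin">

  <!-- Matches "Miami", "from Miami", "leaving Miami",
       "leaving from Miami", and "I'm leaving from Miami". -->
  <rule id="origin" scope="public">
    <!-- Optional lead-in words that parallel the prompt wording -->
    <item repeat="0-1">
      <one-of>
        <item>from</item>
        <item>leaving</item>
        <item>leaving from</item>
        <item>I'm leaving from</item>
      </one-of>
    </item>
    <ruleref uri="#city"/>
  </rule>

  <!-- Illustrative city list; a real application would list every served city -->
  <rule id="city">
    <one-of>
      <item>Miami</item>
      <item>Boston</item>
      <item>Chicago</item>
    </one-of>
  </rule>
</grammar>
```

Note that the grammar has no entry for “departure”: because the prompt does not use that word, few callers are expected to say it.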

In the flight reservation application, you might use these prompts:

Prompt	Slot
What city would you like to leave from?	origin
What city would you like to fly to?	destination
What date would you like to leave?	date
What time of day would you prefer for your flight?	time
You’re going from <origin> to <destination> on <date> at <time>. Is that correct?	confirm

These prompts indicate a directed dialog. A mixed-initiative dialog might instead start by asking “Where would you like to travel?” or “How can I help you?” and then pose more specific questions to obtain information missing from the user’s initial response.
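The directed dialog in the table could be coded as a single VoiceXML form with one field per slot. This is only a sketch: the grammar file name cities.grxml is hypothetical, and the built-in field types (date, time, boolean) are assumed to be supported by your platform, which is not guaranteed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="book_flight">
    <!-- Fields are visited in document order, one slot per field. -->
    <field name="origin">
      <prompt>What city would you like to leave from?</prompt>
      <!-- cities.grxml is a hypothetical grammar file -->
      <grammar src="cities.grxml" type="application/srgs+xml"/>
    </field>

    <field name="destination">
      <prompt>What city would you like to fly to?</prompt>
      <grammar src="cities.grxml" type="application/srgs+xml"/>
    </field>

    <!-- Built-in types are assumed here; support varies by platform. -->
    <field name="date" type="date">
      <prompt>What date would you like to leave?</prompt>
    </field>

    <field name="time" type="time">
      <prompt>What time of day would you prefer for your flight?</prompt>
    </field>

    <field name="confirm" type="boolean">
      <prompt>
        You're going from <value expr="origin"/> to
        <value expr="destination"/> on <value expr="date"/>
        at <value expr="time"/>. Is that correct?
      </prompt>
      <filled>
        <!-- If the caller says no, clear every slot and start over. -->
        <if cond="!confirm">
          <clear/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

A real application would probably replace the built-in time type with a custom grammar so that answers such as “morning” are accepted, and would format the date and time values before speaking them back.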

Note that within the VoiceXML file where you code the prompt, you can use certain VoiceXML elements to supply a secondary prompt that will be used if the response to the original prompt is unclear. If the user’s first response is not recognizable, subsequent prompts can tell the user what sort of response is expected (“I’m sorry, I didn’t understand. Please state the name of the city you’re leaving from”). See Errors and reprompts.

Using spoken language

When designing prompts, remember that there are differences between written language and spoken language. A voice application will often express things in a friendlier, more casual, and less wordy fashion than a written letter. Keep the following points in mind:

Written language usually has an impersonal tone: it often uses the passive voice and the third person. Spoken dialogs typically use the first person, and address the caller directly. For example, instead of “The next step is to identify the amount to be transferred. Please state this amount now”, a friendlier and more natural prompt would be “Next, I need the amount. How much money do you want to transfer?”
Spoken language flows best when you use short, informal words.
Spoken language uses contractions frequently: “it’s” rather than “it is”, “let’s” rather than “let us”, and so on.
Since it addresses a person directly, spoken language often includes interjections that would be out of place in written language. “Okay”, “sorry”, “great”, “thanks”, and other such words can add to the application’s personality and make the dialog flow naturally, even though these words are not really needed to convey meaning.

The wording of your prompts will also be affected by the persona you have chosen for the main voice application. The persona is the personality that the callers perceive as they interact with the application.

Depending on the nature of your application, your corporate image, and the voices or audio resources you have available, you may want to present the application as young or old, male or female, friendly or formal.

Directing caller responses

Design prompts to encourage the simplest and most direct responses possible. For example, the prompt “Would you like to book a new reservation, check an existing reservation, or cancel a reservation?” naturally leads the caller to respond with one of those phrases, or an easily parsed variation, or simply with one word: “book”, “check”, or “cancel”.
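One way to code this prompt is a VoiceXML menu, sketched below. The form names (book, check, cancel) are placeholders. Setting accept to “approximate” asks the platform to match partial phrases such as “book” or “cancel”; how an ambiguous partial phrase such as “reservation” is handled may differ between platforms, so test this behavior before relying on it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- accept="approximate" lets a sub-phrase of a choice select it -->
  <menu id="main" accept="approximate">
    <prompt>
      Would you like to book a new reservation, check an existing
      reservation, or cancel a reservation?
    </prompt>
    <choice next="#book">book a new reservation</choice>
    <choice next="#check">check an existing reservation</choice>
    <choice next="#cancel">cancel a reservation</choice>
  </menu>

  <!-- Placeholder forms; each would hold the dialog for that task -->
  <form id="book"><block>Okay, let's book a flight.</block></form>
  <form id="check"><block>Okay, let's look up your reservation.</block></form>
  <form id="cancel"><block>Okay, let's cancel your reservation.</block></form>
</vxml>
```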

If the list of options is too long to fit in a prompt, you can include an example instead. Your prompt might be something like “What would you like to do? For example, would you like to book a new reservation?” This approach gives callers a general structure for the response without forcing them to listen to a long list of options.

Be consistent

Phrase prompts and requests consistently, so that similar tasks are introduced with similar prompts and accept a similar range of responses. Consistency encourages callers to fall into a natural pattern with their responses, which makes those responses easier to predict.

For example, in a banking application, one of the tasks may be to transfer funds between two accounts. This transfer will require similar information about each account. If you ask for the type, account number, and password in that order for the origin account, ask for them in the same order for the destination account.

Use words consistently. If you use “booking a flight” for a prompt at the start of the application, callers may be confused if you say “reserving a ticket” later. If “ticket” and “flight” are valid responses to one prompt, they should be valid for other prompts too.

Avoid ambiguity

Ensure that each prompt has a clear meaning so callers know what is expected. This is particularly important when your application is informing them of key words to be repeated or spoken.

Often, it is best to deliver key words at the end of prompts. This avoids confusion and helps users who might otherwise forget the words. Examples:

Confusing: “Say ‘help’ now if you would like more information.” This prompt can lead callers to say “help now”, “help, I’d like more information”, or even “more” if they weren’t listening closely to the first part of the sentence.
Improved: “If you would like more information now, say ‘help’.” This approach makes it much clearer that the caller is expected to say the key word alone.

Errors and reprompts

No matter how clear your prompts are, a caller may still misunderstand them. Your dialog can minimize the impact of out-of-grammar responses by including additional prompts and messages that will be used when the initial prompt fails to get an appropriate response, or when some other error occurs.

By default, if the response is out-of-grammar, the application goes back to the initial prompt and tries again, and continues doing this until it gets a suitable response. However, this may be frustrating for your callers because they get no indication of the error or how to fix it.

The VoiceXML language used to write your application file includes several elements that make it easy to respond to different out-of-grammar caller responses:

<nomatch> identifies an out-of-grammar reply.
<noinput> identifies caller silence.
<help> recognizes when the caller says “help”.
<error> is invoked when Recognizer generates an error.

You can use these elements to give callers extra information or to take a different action when Recognizer doesn’t get a suitable reply to the initial prompt:

“Sorry, I didn’t quite get that. What type of account is it?” (<nomatch>)
“Sorry, I couldn’t hear anything. Could you repeat that?” (<noinput>)
“Please state the type of bank account you have. For example, if your account is a savings account, say ‘savings’.” (<help>)
“I’m sorry; we’re having technical difficulties. Please try again later.” (<error>)
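In a VoiceXML file, these messages could be attached to the field that asks the question, as in the sketch below. The grammar file name account_types.grxml is hypothetical. The universals property is included on the assumption that your platform does not enable the “help” command by default; check your platform documentation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Assumption: "help" must be switched on explicitly on this platform -->
  <property name="universals" value="help"/>

  <form id="get_account_type">
    <field name="account_type">
      <prompt>What type of account is it?</prompt>
      <!-- account_types.grxml is a hypothetical grammar file -->
      <grammar src="account_types.grxml" type="application/srgs+xml"/>

      <!-- Caller said something outside the grammar -->
      <nomatch>
        Sorry, I didn't quite get that. What type of account is it?
      </nomatch>

      <!-- Caller said nothing -->
      <noinput>
        Sorry, I couldn't hear anything. Could you repeat that?
      </noinput>

      <!-- Caller asked for help -->
      <help>
        Please state the type of bank account you have. For example,
        if your account is a savings account, say 'savings'.
      </help>

      <!-- An error event was thrown; apologize and end the session -->
      <error>
        I'm sorry; we're having technical difficulties.
        Please try again later.
        <exit/>
      </error>
    </field>
  </form>
</vxml>
```

Because each message already repeats the question or tells the caller what to say, the handlers do not replay the original prompt; after a handler finishes, the field simply listens again.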

Each of these elements includes a "count" attribute, so you can use a different message each time the problem occurs:

“Sorry, I didn’t get that. What type of account is it?” (<nomatch count="1">)
“I’m sorry, I still didn’t understand. I need the type of account; is it ‘checking’, or ‘savings’?” (<nomatch count="2">)
“I’m very sorry, but I still don’t understand. Maybe there’s a problem on the line. Let me transfer you to a live operator.” (<nomatch count="3">)
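The escalating messages above might be coded as follows. This is a sketch under stated assumptions: the grammar file name, the operator form, and the telephone number are all placeholders, and the transfer settings your platform requires may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="get_account_type">
    <field name="account_type">
      <prompt>What type of account is it?</prompt>
      <!-- account_types.grxml is a hypothetical grammar file -->
      <grammar src="account_types.grxml" type="application/srgs+xml"/>

      <!-- First out-of-grammar reply: short apology and the same question -->
      <nomatch count="1">
        Sorry, I didn't get that. What type of account is it?
      </nomatch>

      <!-- Second reply: tell the caller exactly what can be said -->
      <nomatch count="2">
        I'm sorry, I still didn't understand. I need the type of
        account; is it 'checking', or 'savings'?
      </nomatch>

      <!-- Third reply: give up and hand the call to a person -->
      <nomatch count="3">
        I'm very sorry, but I still don't understand. Maybe there's a
        problem on the line. Let me transfer you to a live operator.
        <goto next="#operator"/>
      </nomatch>
    </field>
  </form>

  <form id="operator">
    <!-- Placeholder number; bridge="false" requests a blind transfer -->
    <transfer name="op_call" dest="tel:+15555550100" bridge="false"/>
  </form>
</vxml>
```

The count value must be quoted, as with any XML attribute. When an event occurs, the interpreter picks the handler with the highest count that does not exceed the number of times the event has occurred, so a handler with count="2" would also cover later occurrences if no count="3" handler existed.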