Predicting caller responses

Once you’ve determined exactly what your application is going to say in the dialog, the next important step is to go through the entire dialog from the point of view of callers, and try to anticipate the different ways a person might reasonably respond to the prompts you’ve chosen.

You will already have done this to some extent during prompt design. Now consider variations of those responses, and responses you haven’t yet anticipated. This serves as an elementary test of your dialog: it confirms that the dialog flows as you intended in different circumstances, and it identifies plausible responses you might not have considered, so you can include them in your grammar.

For example, possible caller responses to our flight application prompts appear below:

What city would you like to leave from?

San Francisco [the city name by itself]

I’m leaving from San Francisco [a literal response]

I’d like to leave from San Francisco [another literal response]

Uh, San Francisco [initial hesitation]

San Francisco, please [final “please”]

I’m flying from San Francisco [an additional possibility]

Departing from San Francisco [an additional possibility]

What city would you like to fly to?

New York [the city name by itself]

I’m flying to New York [a literal response]

I’d like to fly to New York [another literal response]

Uh, New York [initial hesitation]

New York, please [final “please”]

Going to New York [an additional possibility]

My destination is New York [an additional possibility]

What date would you like to leave?

May second [the date by itself]

I’d like to leave on May second [a literal response]

I’m leaving on May second [a second literal response]

Leaving May second [a third literal response]

Um, May second, please [hesitation + final “please”]

What time would you like to depart?

I’d like to depart at 2 pm [a literal response]

I’m departing at 2 pm [a second literal response]

Departing 2 pm [a third literal response]

2 pm, please [final “please”]

You’re going from <origin> to <destination> on <date> at <time>. Is that right?

Yes [“yes” by itself]

No [“no” by itself]

Yes, that’s correct [a literal response]

Yes it is [a second literal response]

Yes, that’s right [a third literal response]

No, that’s not correct [a fourth literal response]

No, it’s not [a fifth literal response]

No, that’s wrong [a sixth literal response]

Yeah [casual alternative]

In this example, remember that the cities themselves may be identified in different ways. For example, Los Angeles may be called “Los Angeles” or simply “L.A.”, while New York may be identified as “New York”, “New York, New York”, “New York City”, or “NYC”.
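One way to picture this is a lookup that maps every spoken variant to a single canonical value. The sketch below is plain Python for illustration only, not a grammar format; the table `CITY_ALIASES` and the function `canonical_city` are hypothetical names introduced here.

```python
from typing import Optional

# Hypothetical alias table: each spoken variant maps to one canonical city name.
CITY_ALIASES = {
    "los angeles": "Los Angeles",
    "l.a.": "Los Angeles",
    "new york": "New York",
    "new york, new york": "New York",
    "new york city": "New York",
    "nyc": "New York",
}


def canonical_city(utterance: str) -> Optional[str]:
    """Return the canonical city for a recognized phrase, or None if unknown."""
    # Lowercase and collapse whitespace so "  NYC " and "nyc" look the same.
    key = " ".join(utterance.lower().split())
    return CITY_ALIASES.get(key)
```

With a table like this, the rest of the application only ever sees one value per city, however the caller chose to name it.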

It is not necessary to predict every response to your prompts or to cover every possible utterance that callers might make. You must balance the extent of your grammar’s coverage against other factors, such as speed and accuracy. In general, the smaller the grammar, the better its recognition speed and accuracy. If you try to cover too much, you’ll sacrifice performance.

For an initial release, it’s best to focus on the most plausible ways that people could respond to each prompt, and build those responses into your first grammar. Remember that if the first response is out-of-grammar, you can use secondary prompts in the voice application to give callers a better idea of how to answer (see Errors and reprompts). If necessary, you can expand your grammar later based on actual results from your pilot testing of the application.
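To decide which responses are worth adding after a pilot, it can help to count the transcribed utterances that fell outside the grammar. The following is a minimal sketch, assuming you have a list of phrases the grammar covers and a list of transcribed caller utterances; the function names and the sample data are hypothetical.

```python
from collections import Counter


def normalize(utterance):
    """Lowercase, drop commas, and collapse whitespace."""
    return " ".join(utterance.lower().replace(",", " ").split())


def out_of_grammar_counts(grammar_phrases, utterances):
    """Count how often each utterance not covered by the grammar occurred."""
    covered = {normalize(p) for p in grammar_phrases}
    normalized = (normalize(u) for u in utterances)
    return Counter(u for u in normalized if u not in covered)


# Hypothetical data for the confirmation prompt.
grammar = ["yes", "no", "yes that's right", "no that's wrong", "yeah"]
pilot = ["Yes", "yeah", "yup", "Yup", "nope", "No, that's wrong"]

misses = out_of_grammar_counts(grammar, pilot)
coverage = 1 - sum(misses.values()) / len(pilot)
print(misses.most_common())  # most frequent out-of-grammar responses first
print(coverage)              # fraction of pilot utterances the grammar covered
```

Responses that appear often in the miss list are the strongest candidates for the next version of the grammar; rare ones may be better handled by a reprompt.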

Synonyms

There are usually many different ways to answer a given question. Even for a simple yes/no question, callers will likely answer with a wide range of responses: “yeah”, “yup”, “yessir”, “no way”, “correct”, “nope”, “wrong”, and others that may not immediately seem obvious.


As the example above shows, there may be many synonyms for even the simplest words. Callers may use such words interchangeably: in the phrase “Please send all calls to my home”, a caller could just as easily use “direct” or “transfer” in place of “send”, or “house” or “place” instead of “home”.
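The effect of synonyms on grammar size can be sketched by expanding a phrase into every combination of its interchangeable words. This is an illustration in plain Python under assumed names (`expand`, `SYNONYMS`), not the syntax of any particular grammar format.

```python
from itertools import product


def expand(template, synonyms):
    """Return every phrase obtained by substituting synonyms into the template.

    template is a list of words; any word that is a key in synonyms is
    replaced by each of its alternatives in turn.
    """
    choices = [synonyms.get(word, [word]) for word in template]
    return [" ".join(words) for words in product(*choices)]


# Hypothetical synonym table built from the example sentence.
SYNONYMS = {
    "send": ["send", "direct", "transfer"],
    "home": ["home", "house", "place"],
}

phrases = expand("please send all calls to my home".split(), SYNONYMS)
print(len(phrases))  # 3 verbs x 3 nouns
```

Two words with three alternatives each already yield nine phrases, which shows how quickly coverage grows as synonyms are added.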

Acronyms

If acronyms are commonly used in your application’s subject area, callers may reply with either the acronym or the full term.

For example, in a banking application that lets callers buy Canadian GICs (Guaranteed Investment Certificates), a caller may use either “GIC” or “Guaranteed Investment Certificate” to refer to them. In the flight reservation example, a caller might refer to Los Angeles as “L.A.” or to Kansas City as “K.C.” rather than use the city’s full name.

Extra words

While you may be interested in only one or two key words in a response, callers will probably use several. The instruction “transfer a thousand dollars from savings to checking” can be preceded by many different phrases: “I’d like to...”, “Please...”, “Can I...”, and other such variations. Similarly, callers may end a sentence with “please”, “if you don’t mind”, and other such polite phrases.
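The idea of optional leading and trailing phrases around a core instruction can be sketched with a regular expression. This is only an illustration in Python: the filler lists, `PATTERN`, and `strip_fillers` are assumptions made for the example, and a real grammar would express the same optional phrases in its own notation.

```python
import re

# Hypothetical filler phrases, taken from the examples in the text.
PREFIXES = r"(?:i'd like to|please|can i)"
SUFFIXES = r"(?:please|if you don't mind)"

# Optional prefix, then the core instruction, then an optional polite suffix
# and optional final punctuation.
PATTERN = re.compile(
    rf"^(?:{PREFIXES}\s+)?(?P<core>.+?)(?:,?\s+{SUFFIXES})?[.?!]?$",
    re.IGNORECASE,
)


def strip_fillers(utterance):
    """Return the utterance without leading and trailing filler phrases."""
    # Use straight apostrophes and single spaces before matching.
    text = " ".join(utterance.replace("\u2019", "'").split())
    match = PATTERN.match(text)
    return match.group("core") if match else text
```

For instance, `strip_fillers("Can I transfer a thousand dollars from savings to checking?")` returns only the core instruction, so the same key words are found regardless of how the caller framed the request.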