Recognition timeouts
The endpointer uses several VoiceXML timeout properties to determine the end of speech. Most of them measure silence, either after a prompt or after an utterance.
- The timeout property sets the maximum silence that can follow the end of a prompt before Voice Platform assumes that the caller is not going to answer and throws a “noinput” event. The default value is seven seconds.
- The completetimeout property specifies the required length of silence following caller speech before the recognizer finalizes a result (either accepting it or throwing a nomatch event) when the speech before the silence matches an active grammar. The default value is 0 seconds.
- The incompletetimeout property specifies the required length of silence that must follow caller speech before the recognizer finalizes a result when the speech prior to the silence does not match any active grammar. This property also applies to partial matches, where the preceding speech matches part of an active grammar but more is required to complete the match. The default value is 1.5 seconds.
- The maxspeechtimeout property specifies a maximum amount of time allowed for a caller response after the beginning of speech. A period of silence corresponding to the completetimeout or incompletetimeout will still result in the end of speech. The default value is 40 seconds.
These timeouts can be expressed in either seconds (s) or milliseconds (ms).
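As a minimal sketch of how these properties and units look in a VoiceXML document, the fragment below sets all four properties to the default values listed above, mixing seconds and milliseconds. The form and its content are placeholders added for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Default values from the list above, written in mixed units -->
  <property name="timeout" value="7s"/>
  <property name="completetimeout" value="0s"/>
  <!-- 1500ms is the same duration as 1.5s -->
  <property name="incompletetimeout" value="1500ms"/>
  <property name="maxspeechtimeout" value="40s"/>

  <!-- Placeholder dialog, present only to make the document complete -->
  <form id="main">
    <block>Welcome.</block>
  </form>
</vxml>
```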
Setting timeouts
You can set the recognition timeout properties in a <property> element at different scopes. You can also set the timeout as an attribute within a <prompt> element.
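The following sketch shows one way the scopes could be layered: a document-level default, a form-level override, a field-level override, and a timeout attribute on a single prompt. The grammar file name size.grxml and the specific values are assumptions chosen for illustration, not recommendations.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document scope: applies to every dialog in this document -->
  <property name="timeout" value="7s"/>

  <form id="order">
    <!-- Form scope: overrides the document value inside this form -->
    <property name="timeout" value="5s"/>

    <field name="size">
      <!-- Field scope: applies to this field only -->
      <property name="incompletetimeout" value="1s"/>
      <!-- Hypothetical grammar file -->
      <grammar src="size.grxml" type="application/srgs+xml"/>
      <!-- Prompt attribute: sets the timeout that follows this prompt -->
      <prompt timeout="4s">What size would you like?</prompt>
    </field>
  </form>
</vxml>
```

A property set at a narrower scope takes precedence over the same property set at a wider scope, so the field above waits 4 seconds after its prompt rather than 5 or 7.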
If you find that a dialog is taking too much time because the endpointer is taking too long to declare the end of speech, you can speed up the dialog by shortening the timeouts. This is especially effective if you deal with many experienced or advanced users who already know what they need to say.
However, when the timeouts are too short, the interpreter is more likely to cut off users during a pause and generate a nomatch or noinput event that takes time to recover from, making the dialog longer rather than shorter.
timeout
The timeout sets the maximum period of time that Voice Platform waits for a reply after playing a prompt. If the caller has not replied by the end of this period, Voice Platform throws a noinput event and either reprompts or proceeds to the next <noinput> instruction in the VoiceXML file.
Note that this timeout value is used both for speech and DTMF input. However, you can set some DTMF-specific timeouts as described later in this section.
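As an illustration of the noinput handling described above, the sketch below sets a 5-second timeout on one field and supplies two <noinput> handlers that escalate with the event count. The grammar file name yesno.grxml and the prompt wording are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="confirmation">
    <field name="confirm">
      <!-- Wait up to 5 seconds for a reply after the prompt ends -->
      <property name="timeout" value="5s"/>
      <!-- Hypothetical grammar file -->
      <grammar src="yesno.grxml" type="application/srgs+xml"/>
      <prompt>Do you want to continue?</prompt>

      <!-- First noinput event: apologize and play the prompt again -->
      <noinput count="1">
        <prompt>Sorry, I did not hear you.</prompt>
        <reprompt/>
      </noinput>

      <!-- Second noinput event: give up and end the session -->
      <noinput count="2">
        <prompt>I still did not hear anything. Goodbye.</prompt>
        <exit/>
      </noinput>
    </field>
  </form>
</vxml>
```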
incompletetimeout
For optimal performance, the incompletetimeout property can be adjusted for each reply. For example, for simple “yes/no” responses, it can be set to a small value (on the order of 0.5 seconds), so the response will be quicker. However, when reciting an account number, callers are likely to pause between groups of digits. In this case the property can be set to a longer value to prevent the user from being cut off while pausing.
A default value can be set in the application root document, and different values can be set for each dialog state, depending on the active grammars. For optimal performance, we recommend that you use the following guidelines:
- 0.5 seconds: For short utterances (yes/no, hot word)
- 1.5 seconds: In general (default value)
- 2.5 seconds: For digit strings and long utterances
The timer can be invoked several times during a single reply, whenever the caller pauses. It only expires if a pause reaches the defined limit before new speech is detected.
The value assigned for an incompletetimeout cannot be less than the value assigned to the swirec.swiep_EOS_backoff property.
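The guidelines above could be applied per dialog state roughly as follows. This is a sketch only: the grammar file names are hypothetical, and the values assume that 0.5 seconds is not below the configured swirec.swiep_EOS_backoff setting on your system.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- General default for dialog states that do not override it -->
  <property name="incompletetimeout" value="1.5s"/>

  <form id="account">
    <field name="has_account">
      <!-- Short utterance (yes/no): respond quickly -->
      <property name="incompletetimeout" value="0.5s"/>
      <!-- Hypothetical grammar file -->
      <grammar src="yesno.grxml" type="application/srgs+xml"/>
      <prompt>Do you have an account with us?</prompt>
    </field>

    <field name="account_number">
      <!-- Digit string: tolerate pauses between groups of digits -->
      <property name="incompletetimeout" value="2.5s"/>
      <!-- Hypothetical grammar file -->
      <grammar src="account_digits.grxml" type="application/srgs+xml"/>
      <prompt>Please say your account number.</prompt>
    </field>
  </form>
</vxml>
```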
completetimeout
The completetimeout property is best used with simple, closed grammars such as a list of names. Use it to reduce the wait for end-of-speech once a valid recognition has occurred. Shorter settings result in faster response times, but risk interrupting callers before they are done speaking.
Because the timer starts when recognition has occurred, you should tune this property’s value for each active grammar. Otherwise, if the value is too small, the timer could cut off the user before the speech is complete. For example, it could accidentally cut off the caller after recognizing only the first pizza topping if the caller pauses too long before saying the second topping.
The default value for this property is 0, and reasonable values typically range from 250 to 1000 milliseconds. For more information about timeouts, see Appendix D—Timing properties in the VoiceXML specification.
Note that the value set for the completetimeout property must be less than the value set for the incompletetimeout property.
Note: The recognition for a completetimeout is made before ECMAScript and constraint list processing. If you are using a constraint list or applying a script to recognition results, use a larger completetimeout value to avoid false rejections. For information about constraint lists, scripts, tags, and semantic interpretation, see the Speech Suite documentation.
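A minimal sketch of a completetimeout setting for a closed grammar is shown below. The grammar file name names.grxml is hypothetical, and the 500 ms value is simply taken from the middle of the typical range mentioned above. The incompletetimeout is set explicitly so that it is visibly larger than the completetimeout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="directory">
    <field name="person">
      <!-- Finalize quickly once the speech matches a name in the list -->
      <property name="completetimeout" value="500ms"/>
      <!-- Must remain larger than completetimeout -->
      <property name="incompletetimeout" value="1.5s"/>
      <!-- Hypothetical closed grammar: a list of names -->
      <grammar src="names.grxml" type="application/srgs+xml"/>
      <prompt>Who would you like to call?</prompt>
    </field>
  </form>
</vxml>
```

For a grammar that allows several items in one utterance, such as a list of pizza toppings, a larger completetimeout would give callers more room to pause between items.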
maxspeechtimeout
For most input, caller speech continues until silence long enough to trigger the completetimeout or incompletetimeout occurs. However, if the speech takes a long time, it will be stopped automatically after an amount of time equal to the maxspeechtimeout value. You may want to specify the value for this property—for example, to extend or limit the length of a recorded message.
Note that the maxspeechtimeout value is measured from the beginning of speech, rather than the end of the prompt. This means that the full period is available even if the caller pauses before beginning to speak.
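The sketch below limits a reply to 20 seconds of speech. The handler relies on the VoiceXML 2.0 specification, which states that a maxspeechtimeout event is thrown when the limit is exceeded. That Voice Platform reports the condition in exactly this way is an assumption, and the grammar file name and prompt wording are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="address">
    <field name="street_address">
      <!-- Measured from the beginning of speech, not the end of the prompt -->
      <property name="maxspeechtimeout" value="20s"/>
      <!-- Hypothetical grammar file -->
      <grammar src="address.grxml" type="application/srgs+xml"/>
      <prompt>Please say your street address.</prompt>

      <!-- Assumed behavior: event defined by the VoiceXML 2.0 specification -->
      <catch event="maxspeechtimeout">
        <prompt>Sorry, that was too long. Please give a shorter answer.</prompt>
        <reprompt/>
      </catch>
    </field>
  </form>
</vxml>
```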