Using hot word recognition

Hot word recognition describes a scenario in which the system listens continuously for a particular command word or phrase (the hot word) that signals it to take a given action. For example, hot word recognition can take place while the user is on a bridged transfer call to a third party. In this scenario, when the system recognizes the hot word, it ends the call to the third party, and the dialog between the caller and the application resumes.

Hot word recognition can also be used while a long prompt is playing, where the hot word simply stops the prompt and moves the application on to its next part. The advantage of hot word recognition over normal recognition in this scenario is that prompt playback stops only when a recognition succeeds, rather than at the moment the user starts to speak.
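As a rough illustration, standard VoiceXML 2.0 expresses this behavior with the bargeintype attribute of the <prompt> element. The sketch below assumes a platform that implements that attribute. The form names, the prompt wording, and the hot word "skip" are hypothetical, and Voice Platform may expose hot word recognition through its own settings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="terms">
    <field name="command">
      <!-- bargeintype="hotword": playback continues until the caller's
           speech matches the grammar below; other speech is ignored. -->
      <prompt bargein="true" bargeintype="hotword">
        Here are the full terms and conditions.
        Say skip at any time to move on.
        <!-- the long prompt text would continue here -->
      </prompt>

      <!-- Hypothetical one-word grammar that defines the hot word. -->
      <grammar mode="voice" version="1.0" root="skipword">
        <rule id="skipword" scope="public">skip</rule>
      </grammar>

      <!-- No hot word heard before the timeout: continue anyway. -->
      <noinput>
        <goto next="#next_step"/>
      </noinput>

      <!-- Hot word recognized: the prompt has stopped, so move on. -->
      <filled>
        <goto next="#next_step"/>
      </filled>
    </field>
  </form>

  <form id="next_step">
    <block>
      <prompt>Moving on.</prompt>
    </block>
  </form>
</vxml>
```

With bargeintype="speech", the same prompt would stop as soon as the caller began to speak, whether or not the speech matched the grammar.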

A typical speech recognition application is designed as a turn-taking dialog, where the system and the user take turns speaking to one another. In this scenario, the recognizer knows when to expect audio from the user.

With hot word recognition, by contrast, the system listens to and processes the stream of speech but does not take any action until it recognizes a hot word. In this scenario, the recognizer does not know when to expect the hot word. However, the moment the hot word is recognized, the application stops listening and takes action.

There are two principal situations in which a hot word is useful:

  • Hot word return from a <transfer> element allows the user to terminate a bridged <transfer> and return to the application. Voice Platform supports either near-end (the caller or party A uses the hot word) or far-end (the called party or party C uses the hot word) hot word return.
  • A field-level hot word allows you to specify hot word recognition instead of standard recognition for a particular <field>.
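
The near-end case in the first item can be sketched in standard VoiceXML 2.0, which lets a bridged <transfer> contain a grammar that stays active during the call. This is a minimal sketch, not Voice Platform's documented configuration: the destination number, the form and variable names, and the hot word "main menu" are all hypothetical. Far-end hot word return is not covered by this sketch, because it depends on platform-specific configuration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="call_agent">
    <!-- Bridged transfer: the platform stays on the call and keeps
         listening to the caller (party A). -->
    <transfer name="agentcall" dest="tel:+15555550100"
              bridge="true" connecttimeout="30s">
      <prompt>
        Connecting you now. Say main menu at any time to come back.
      </prompt>

      <!-- Active during the transfer; a match ends the call to the
           third party and returns control to the application. -->
      <grammar mode="voice" version="1.0" root="hotword">
        <rule id="hotword" scope="public">main menu</rule>
      </grammar>

      <filled>
        <if cond="agentcall == 'near_end_disconnect'">
          <!-- The caller ended the transfer with the hot word. -->
          <prompt>Returning to the main menu.</prompt>
        <elseif cond="agentcall == 'far_end_disconnect'"/>
          <prompt>The other party ended the call.</prompt>
        <else/>
          <prompt>The call could not be completed.</prompt>
        </if>
      </filled>
    </transfer>
  </form>
</vxml>
```

For the field-level case in the second item, standard VoiceXML 2.0 also defines a bargeintype property, so placing <property name="bargeintype" value="hotword"/> inside a <field> is one way to request hot word barge-in for that field's prompts. Whether Voice Platform uses this property or its own mechanism is an assumption to verify against the platform reference.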