Vocalizer features

Speech Server uses Nuance Vocalizer to handle speech synthesis.

Vocalizer for Enterprise is a complete engine for delivering spoken output. Available in more than 40 languages with a wide selection of male and female voices, Vocalizer can handle all of an application’s audio, from a library of prompt recordings to dynamically generated text-to-speech (TTS) synthesis. Each Vocalizer voice pack is bundled separately and available for download from Nuance Network.

Vocalizer allows applications to decide the spoken words at runtime (instead of using pre-determined words and sentences), without the intervention of live operators and without the limitations and costs of developing and maintaining prompt libraries.

By providing a single source for all audio output, Vocalizer combines computer-generated and prerecorded audio to enable automation of more application behaviors, deliver detailed information specific to an individual customer, and reduce implementation and operational costs. For example, you can use synthesized speech when developing and testing applications, and then add prerecorded prompts for deployment and tuning. This approach speeds development and reduces the cost of studio recordings.

Other Vocalizer benefits:

  • Avoids unnecessary transfers: Sometimes customers need to hear dynamic information that is difficult to prerecord, such as names, addresses, or information from a database. Vocalizer’s text-to-speech capabilities can read that information aloud, so the caller can complete the task without a transfer to a human agent.
  • Automates more calls: Research shows that callers are more likely to complete an automated call when they hear information such as names and addresses read clearly by a single voice. Callers who hear a mixture of voices and lesser-quality TTS get distracted, and are more likely to leave the automated system.
  • Simplifies application development: Applications request the desired text and Vocalizer either finds a prerecorded audio file or generates high-quality synthesized speech. When the application requests an assembly of phrases, Vocalizer blends them seamlessly together.
  • Enables complete audio control: Applications can fine-tune every aspect of generated audio (pronunciations, speaking speed, and so on), which is especially useful for frequently played prompts.
  • Provides superior speech synthesis: Vocalizer multiform synthesis (MFS) technology and voice models built with recurrent neural network technology produce superior speech synthesis, supported with XPremium-high and XPremium-high-nb voice models.
  • FIPS support: Vocalizer supports SSL connections to servers that are FIPS-compliant. This protects any data transferred to or from the Vocalizer service (for example, fetched audio files). FIPS refers to Federal Information Processing Standards, a set of USA government standards that includes encryption algorithms. See inet_ssl_enable_fips.
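
The "Simplifies application development" and "Enables complete audio control" items above describe choosing between a prerecorded file and synthesized speech, and tuning properties such as speaking speed. The following Python sketch illustrates that idea only. It assumes prompts are expressed in W3C SSML, which this section does not state, and the build_prompt function and the recordings lookup table are hypothetical names, not part of any Vocalizer or Speech Server API. Vocalizer makes this selection itself; the sketch merely mirrors the decision in application code.

```python
from xml.sax.saxutils import escape, quoteattr

# Namespace defined by the W3C SSML 1.0 recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_prompt(text, recordings, rate="medium"):
    """Return an SSML document for one prompt (hypothetical helper).

    recordings is an assumed dict that maps prompt text to the URI of a
    prerecorded audio file. rate is an SSML prosody rate label such as
    "slow", "medium", or "fast".
    """
    uri = recordings.get(text)
    if uri is not None:
        # Play the recording. In SSML, the text inside <audio> is the
        # alternate content spoken if the file cannot be played.
        body = f"<audio src={quoteattr(uri)}>{escape(text)}</audio>"
    else:
        # No recording exists: synthesize the text at the requested rate.
        body = f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="en-US">'
        f"{body}</speak>"
    )


if __name__ == "__main__":
    recordings = {"Welcome.": "prompts/welcome.wav"}  # illustrative path
    print(build_prompt("Welcome.", recordings))
    print(build_prompt("Your balance is 42 dollars.", recordings, rate="slow"))
```

Keeping the prompt text as the fallback inside the audio element means a missing or unreachable recording degrades to synthesized speech rather than silence.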