Data packs

ASRaaS works with one or more factory data packs, available in several languages and locales. Each data pack includes these neural network-based components:

  • Acoustic model (AM) translates utterances into phonetic representations of speech.

  • Language model (LM) identifies the words or phrases most likely spoken.

The base acoustic model is trained to perform well in many acoustic environments. The base language model is developed to remain current with popular vocabulary and language use. As a result, ASRaaS paired with a data pack is ready to use out of the box for many applications.

You must declare a data pack in each recognition request by setting language, and optionally topic, in RecognitionInitMessage: RecognitionParameters.
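As a rough illustration of how a request selects a data pack, the sketch below models the two messages with plain Python dataclasses. These classes are stand-ins and not the generated gRPC stubs: only the language and topic fields come from the text above, while the `parameters` field name, the `build_init_message` helper, and the sample values `en-US` and `GEN` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecognitionParameters:
    """Stand-in for the RecognitionParameters message (only two fields shown)."""
    language: str                # required: selects the data pack language/locale
    topic: Optional[str] = None  # optional: selects a topic within the data pack


@dataclass
class RecognitionInitMessage:
    """Stand-in for the message that opens a recognition request."""
    parameters: RecognitionParameters  # field name is an assumption


def build_init_message(language: str, topic: Optional[str] = None) -> RecognitionInitMessage:
    """Hypothetical helper: refuse to build a request without a language."""
    if not language:
        raise ValueError("language is required to declare a data pack")
    return RecognitionInitMessage(
        parameters=RecognitionParameters(language=language, topic=topic)
    )


# Example values are illustrative; check which languages and topics your region offers.
init = build_init_message("en-US", topic="GEN")
print(init)
```

In a real client, the equivalent objects would come from the ASRaaS protobuf stubs, and the init message would be sent as the first message of the recognition request.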

See Geographies: Languages and Voices for the languages available to ASRaaS in your region.

You may extend and customize the data pack at runtime using several types of specialization resources.

Each recognition turn leverages a weighted mix of builtins, domain LMs, and wordsets. See Resource weights.