swirec_audio_media_type

Audio formats that will be supplied to Recognizer.

Value

String of audio parameters separated by semicolons.

Default

audio/basic;rate=8000

How to set

In Management Station, set this parameter on the Nuance recognition service. (To set more than one value, separate the values with the pipe symbol, |.) If you are not using Management Station, set it in a Recognizer configuration file (User-nrsxx.xml).

Usage

Frequently used by system administrators during installation. The voice browser determines which formats are used.

Recognizer uses this parameter to load matching recognition models during initialization.

At runtime, the browser specifies the audio format when it writes audio samples to the Recognizer (the browser can only specify formats declared by this parameter). By declaring the needed audio formats in advance, this parameter avoids the initialization time and memory cost of loading unneeded formats.

The parameter value has the following format:

media-type-prefix
[;rate=sample-rate]
[;encoding=encoding]
[;orig-encoding=voip-std]
[;bitrate=voip-bitrate]
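
The format above reads as a media-type-prefix followed by optional name=value elements. As a minimal sketch, the Python below splits one value into those parts. The function name parse_media_type is hypothetical and is not part of any Recognizer API; it only illustrates the string layout.

```python
def parse_media_type(value):
    """Split one media type string into (prefix, params).

    Hypothetical helper for illustration; Recognizer does its own parsing.
    """
    parts = [p.strip() for p in value.split(";")]
    prefix = parts[0]
    if not prefix:
        raise ValueError("missing media-type-prefix")
    params = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing semicolon
        key, sep, val = part.partition("=")
        if not sep or not key.strip() or not val.strip():
            raise ValueError(f"malformed element: {part!r}")
        params[key.strip()] = val.strip()
    return prefix, params


prefix, params = parse_media_type(
    "audio/basic;rate=8000;orig-encoding=g723;bitrate=6.3"
)
print(prefix, params)
```

Values are kept as strings so that the result can be compared directly with the text of the setting.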

Format element

Description

media-type-prefix

Indicates the format of the audio data:

audio/basic

(Default) u-law encoded data

audio/x-alaw-basic

a-law encoded data

audio/L16

16-bit linear data

application/x-aurora

Aurora data (original bitstream)

application/x-feature

Aurora data (advanced bitstream for encoding=ES_202_050)

Note: Aurora data is a description of audio features defined by the European Telecommunications Standards Institute (ETSI). Recognizer supports Aurora, but because of size constraints the models are not shipped with the language packs. The Aurora models are available upon request. Contact your sales representative.

rate

Optional. The sample rate, in samples per second. The default is 8000. Other rates are allowed, depending on the media-type-prefix.

encoding

Required if the media type is application/x-feature, in which case the encoding must be ES_202_050 (for the ES 202 050 standard).

orig-encoding

Optional. This element is useful for installations with customized language packs for applications that receive voice over an IP network (VoIP).

Use this element to indicate the original encoding of the audio received by the application. Values: g723 or g729.

It indicates that u-law audio has been decoded from VoIP audio. You cannot input VoIP encoded audio directly; you must always decode to u-law.

You can use this element for all media types except application/x-feature.

bitrate

Optional when orig-encoding is defined as g723. This is the bitrate of the original VoIP data stream. Values: 5.3 or 6.3.
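
The element rules in the table above can be checked mechanically. The following is a sketch only: check_elements is a hypothetical helper, and it assumes that bitrate is meaningful only together with orig-encoding=g723, which is how the description of bitrate is read here.

```python
def check_elements(prefix, params):
    """Return a list of rule violations for one media type (empty if none).

    prefix: the media-type-prefix, e.g. "audio/basic"
    params: dict of element name -> string value, e.g. {"rate": "8000"}
    """
    problems = []
    encoding = params.get("encoding")
    orig = params.get("orig-encoding")
    bitrate = params.get("bitrate")

    if prefix == "application/x-feature":
        # encoding is required for this media type and has one allowed value
        if encoding != "ES_202_050":
            problems.append("application/x-feature requires encoding=ES_202_050")
        if orig is not None:
            problems.append("orig-encoding cannot be used with application/x-feature")
    if orig is not None and orig not in ("g723", "g729"):
        problems.append("orig-encoding must be g723 or g729")
    if bitrate is not None:
        # Assumption: bitrate is only valid alongside orig-encoding=g723
        if orig != "g723":
            problems.append("bitrate applies only when orig-encoding=g723")
        elif bitrate not in ("5.3", "6.3"):
            problems.append("bitrate must be 5.3 or 6.3")
    return problems


print(check_elements("audio/basic", {"rate": "8000", "orig-encoding": "g729", "bitrate": "6.3"}))
```

The check does not validate rate, because the allowed rates depend on the media-type-prefix and on which acoustic models are installed.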

The following tables show sample combinations of media-type-prefix, rate, encoding, orig-encoding, and bitrate. The combinations of rate, orig-encoding, and bitrate shown for audio/basic can also be used with audio/x-alaw-basic and audio/L16.

Sample media types for ulaw, 8,000 Hz.

audio/basic;rate=8000

audio/basic;rate=8000;orig-encoding=g723

audio/basic;rate=8000;orig-encoding=g723;bitrate=5.3

audio/basic;rate=8000;orig-encoding=g723;bitrate=6.3

audio/basic;rate=8000;orig-encoding=g729
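
Because the same rate, orig-encoding, and bitrate combinations apply to audio/x-alaw-basic and audio/L16, the sample values can be generated instead of typed by hand. This sketch assumes nothing beyond the string format described earlier; build_media_type and sample_media_types are hypothetical names.

```python
# (orig-encoding, bitrate) pairs from the sample table; None means "omit"
VOIP_VARIANTS = [
    (None, None),
    ("g723", None),
    ("g723", "5.3"),
    ("g723", "6.3"),
    ("g729", None),
]


def build_media_type(prefix, rate=8000, orig_encoding=None, bitrate=None):
    """Join the elements in the order used by the format description."""
    parts = [prefix, f"rate={rate}"]
    if orig_encoding:
        parts.append(f"orig-encoding={orig_encoding}")
    if bitrate:
        parts.append(f"bitrate={bitrate}")
    return ";".join(parts)


def sample_media_types(prefix):
    """Return the five 8,000 Hz sample values for the given prefix."""
    return [build_media_type(prefix, 8000, o, b) for o, b in VOIP_VARIANTS]


for value in sample_media_types("audio/x-alaw-basic"):
    print(value)
```

Calling sample_media_types("audio/basic") reproduces the five rows of the table above.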

The table below shows Aurora combinations. If you use the 11,000 Hz or 16,000 Hz rates, you must request the corresponding acoustic models as custom deliverables from Nuance.

Sample media types for Aurora

application/x-feature;rate=8000;encoding=ES_202_050

application/x-feature;rate=11000;encoding=ES_202_050

application/x-feature;rate=16000;encoding=ES_202_050

Here are additional combinations:

Sample media types

audio/L16;rate=8000 (8 kHz 16-bit linear)

audio/x-alaw-basic;rate=8000 (8 kHz a-law)

This parameter allows multiple values. If Aurora is one of the values, it must not be listed first.
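
The ordering rule can be checked before a multi-value setting is applied. This is a minimal sketch with two assumptions: the values are joined with the pipe separator described for Management Station (the separator used in a configuration file may differ), and both application/x-aurora and application/x-feature count as Aurora. The function names are hypothetical.

```python
# Assumption: both Aurora media-type-prefixes are subject to the ordering rule
AURORA_PREFIXES = ("application/x-aurora", "application/x-feature")


def split_values(setting):
    """Split a pipe-separated setting into individual media type strings."""
    return [v.strip() for v in setting.split("|") if v.strip()]


def aurora_listed_first(setting):
    """Return True if the first value in the setting is an Aurora media type."""
    values = split_values(setting)
    if not values:
        return False
    first_prefix = values[0].split(";", 1)[0].strip()
    return first_prefix in AURORA_PREFIXES


setting = "audio/basic;rate=8000|application/x-feature;rate=8000;encoding=ES_202_050"
if aurora_listed_first(setting):
    print("Move the Aurora value after a non-Aurora value.")
else:
    print("Ordering is acceptable.")
```

With the setting shown, the audio/basic value comes first, so the check reports that the ordering is acceptable.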

Related parameter