Saving audio files

Nuance Speech Server can save the incoming audio stream used for recognition, and provide a URI of the resulting file to the MRCP client. To control the feature, the waveform logging parameters identify the location of the web server, the location for writing files, and the amount of data that can be written.

Note: This Speech Server feature is unrelated to Configuring whole call recording, and to Recognizer’s logging of audio waveforms. You control each feature separately, and each produces different waveforms. However, all of these files are saved to the same base location. (For troubleshooting purposes, you can compare the files created by the different mechanisms and look for unexpected differences.)

Recording the audio stream

The MRCP header Save-Waveform instructs the Speech Server to record the incoming audio stream of the recognition and to use the Waveform-URI header to provide a URI where the client can access the resulting file. For information on the location of saved waveforms, see Call log directories and files.

The following example illustrates this scenario:

C->S:MRCP/2.0 RECOGNIZE 543257
Channel-Identifier: 1@speechrecog
Content-Type: text/uri-list
Content-Length: 22
Save-Waveform: true

builtin:grammar/digits

S->C:MRCP/2.0 543257 200 IN-PROGRESS
Channel-Identifier: 1@speechrecog

S->C:MRCP/2.0 85 START-OF-SPEECH 543257 IN-PROGRESS
Channel-Identifier: 1@speechrecog

S->C:MRCP/2.0 RECOGNITION-COMPLETE 543257 COMPLETE
Channel-Identifier: 1@speechrecog
Waveform-URI: http://localhost/1/20051128204946_677309.wav
Completion-Cause: 000 success
Content-Type: application/x-nlsml
Content-Length: 175

<?xml version='1.0'?>
<result>
  <interpretation grammar="builtin:grammar/digits" confidence="100">
    <input mode="speech">one</input>
    <instance>1</instance>
  </interpretation>
</result>
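
From the client side, the exchange above can be sketched in Python roughly as follows. This is a minimal illustration, not part of any Nuance client library: the function names build_recognize and extract_waveform_uri are hypothetical, and the start line is written in the same abbreviated form as the example (a real MRCPv2 message also carries a message-length field). The parser assumes the URI may optionally be wrapped in angle brackets and followed by parameters.

```python
def build_recognize(request_id, channel, grammar_uri, save_waveform=True):
    """Assemble a RECOGNIZE message in the abbreviated form used in the example."""
    body = grammar_uri
    headers = [
        f"Channel-Identifier: {channel}",
        "Content-Type: text/uri-list",
        # Content-Length counts the bytes of the body only.
        f"Content-Length: {len(body.encode('utf-8'))}",
    ]
    if save_waveform:
        # Ask the server to record the incoming audio for this recognition.
        headers.append("Save-Waveform: true")
    start_line = f"MRCP/2.0 RECOGNIZE {request_id}"
    # A blank line separates the headers from the body.
    return "\r\n".join([start_line, *headers, "", body])


def extract_waveform_uri(message):
    """Return the Waveform-URI value from a message, or None if it is absent."""
    header_part = message.replace("\r\n", "\n").split("\n\n", 1)[0]
    for line in header_part.split("\n")[1:]:  # skip the start line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "waveform-uri":
            value = value.strip()
            # Assumption: the URI may appear as <uri>;param=value
            if value.startswith("<") and ">" in value:
                return value[1:value.index(">")]
            return value
    return None
```

A client would send the text returned by build_recognize on the MRCP channel and call extract_waveform_uri on the RECOGNITION-COMPLETE event to learn where the recording can be fetched.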

Configuring a web server

Nuance Speech Server stores logs and audio recordings on the local disk. To record utterances (save waveforms) and retrieve them over HTTP, you must install and configure a web server on the same machine as Speech Server.

Note: If Apache is already installed when you install Nuance speech products, the existing httpd.conf file in the Apache config directory is automatically backed up to httpd.conf.bak and replaced with an httpd.conf file that supports integration with Speech Server. If you wish to retain customizations to httpd.conf, you must copy them manually to the new configuration file.

The listening port of the web server must match the port specified by the server.session.ossweb.port parameter for Speech Server in the Management Station.
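
One way to sanity-check this requirement is to compare the port in a returned waveform URI with the configured value. The sketch below is only an illustration: the helper names uri_port and port_matches are hypothetical, and the caller must supply the value of server.session.ossweb.port (no default is assumed here).

```python
from urllib.parse import urlsplit

# Ports implied by the scheme when the URI does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def uri_port(uri):
    """Return the port a URI points at, falling back to the scheme default."""
    parts = urlsplit(uri)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


def port_matches(uri, configured_port):
    """True if the URI targets the port configured for the web server."""
    return uri_port(uri) == configured_port
```

For example, port_matches("http://localhost:8080/1/utt.wav", 8080) is true, while the same URI checked against 80 is not, which would suggest the web server and the parameter are out of step.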

Supported media types

To set the media type for a stored recording so that it can be read by the application, use the Speech Server parameter. The default value is audio/x-wav.

The following media types are supported for the Save-Waveform and Input-Waveform-URI headers:

Header	Supported media types
Save-Waveform	audio/x-wav, audio/x-nist
Input-Waveform-URI	audio/basic, audio/x-alaw-basic, audio/L16, audio/x-wav, audio/x-nist
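
An application can mirror the table above in a small lookup to validate a media type before it sends a request. This is a minimal sketch; the names SUPPORTED_MEDIA_TYPES and is_supported are hypothetical, and the comparison ignores case on the assumption that media type names are case-insensitive.

```python
# Media types per header, copied from the table above.
SUPPORTED_MEDIA_TYPES = {
    "Save-Waveform": {"audio/x-wav", "audio/x-nist"},
    "Input-Waveform-URI": {
        "audio/basic",
        "audio/x-alaw-basic",
        "audio/L16",
        "audio/x-wav",
        "audio/x-nist",
    },
}


def is_supported(header, media_type):
    """Return True if media_type is listed for the given header."""
    allowed = SUPPORTED_MEDIA_TYPES.get(header, set())
    # Compare case-insensitively so "audio/l16" matches "audio/L16".
    return media_type.strip().lower() in {m.lower() for m in allowed}
```

With this lookup, is_supported("Save-Waveform", "audio/basic") returns False, because audio/basic is listed only for Input-Waveform-URI.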
