Fetching and caching files

A typical Nuance deployment includes web servers that provide files to various speech components. This topic describes aspects of fetching and caching files from those servers that are accessible to a voice application.

Web servers

The system administrator is responsible for providing any required web servers. Decisions on which web servers to use and where they are located depend on the projected usage and loads. Web servers can be located on the same computer as Nuance products or on separate computers. There can be one server for all requests, or multiple servers for different types of requests.

Note: If Apache is already installed when you install Nuance speech products, the existing httpd.conf file in the Apache config directory is automatically backed up to httpd.conf.bak and replaced with an httpd.conf file that supports integration with Speech Server. To retain customizations from the original httpd.conf, you must copy them manually into the new configuration file.

Although you generally do not need to configure Nuance speech products with the locations of the web servers, a speech application must provide URIs for the servers in its resource requests.

Fetching files

The voice application uses VoiceXML elements, properties, or attributes to identify data to be fetched. The voice browser converts those parameters into headers in MRCP messages, and forwards them to Speech Server, which in turn passes the requests to the respective speech products. Each Nuance speech product also has default behaviors for fetching and caching, which the application can override by specifying its own values.

Applications can specify data to fetch in a VoiceXML page (typical case) or a session.xml file.

An application can specify the URI of a requested resource as either an absolute or a relative path. If the application provides only a relative path, the Nuance speech product completes the URI using other information provided by the application. For example, if the same server provides the VoiceXML documents and grammar files, Speech Server might combine the value of the xml:base attribute with the specified relative path.
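The difference between absolute and relative URIs can be sketched in a small VoiceXML document. This is an illustration only: the host names, paths, and file names are hypothetical, and the exact way a relative path is completed depends on the component that performs the fetch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical base URI; all host names and paths are illustrative -->
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml"
      xml:base="http://appserver.example.com/myapp/">
  <form id="main">
    <field name="city">
      <!-- Relative URI: completed against xml:base, giving
           http://appserver.example.com/myapp/grammars/city.grxml -->
      <grammar src="grammars/city.grxml" type="application/srgs+xml"/>
      <prompt>
        <!-- Absolute URI: used exactly as written -->
        <audio src="http://media.example.com/prompts/which_city.wav">
          Which city?
        </audio>
      </prompt>
    </field>
  </form>
</vxml>
```

Here the grammar and the VoiceXML document come from the same server, so a relative path is enough, while the audio file lives on a separate server and needs an absolute URI.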

The following components can fetch files from web servers.

Component

Types of files fetched

Voice browser

  • VoiceXML application documents
  • Prerecorded audio files and grammars (in some deployments)

Recognizer

  • Grammars
  • Subgrammars
  • Dictionaries

Vocalizer

  • Input text
  • Digital audio recordings
  • User dictionaries
  • Rulesets
  • ActivePrompt databases (including digital audio recordings referenced by these databases)

Natural Language Processing service

  • Manifest or configuration file named nuance_package.json, provided by Nuance or created by you (using Nuance Experience Studio or Nuance Mix Tools), along with application-specific models (fetched by Krypton and NLE, as described below). For more information, see Triggering the Dragon Voice recognizer.

Krypton recognition engine

  • Wordset files in JSON format: custom lists of words to be added to the standard general vocabulary at runtime. For more information, see Using wordsets.
  • Pre-processed wordset packages: ZIP files containing a wordset file and, optionally, a binary file of pre-computed word pronunciations.
  • Domain language model packages: ZIP files containing all the binary and non-binary components of a custom-trained domain language model.

Natural Language Engine

  • Semantic model package
  • Wordset files in JSON format

Note: Nuance Recognizer and Dragon Voice applications require different artifacts; they do not share artifacts. To create Dragon Voice artifacts, contact Nuance to get access to the Nuance Command Line Interface, Nuance Experience Studio, or Nuance Mix Tools.

Caching fetched objects

Fetched items can be cached by the component that performs the fetch. This improves response times when those items are needed repeatedly.

VoiceXML applications can use the following properties to control when items are fetched, and when an item must be refetched if it is changed on the server. You set these properties in a <property> element.

Property

Description

audiofetchhint

Nuance accepts this property but ignores its value; the effect is always "safe".

Specifies when the text-to-speech server retrieves digital audio recordings from the server. Possible values are:

  • prefetch—Fetches all items that will be requested in a VoiceXML document when that document is opened.
  • safe—Fetches an item only when requested.

Implemented through the Fetch-Hint MRCP header on a SPEAK or LOAD-LEXICON request.

audiomaxage

The server may re-use only cached audio files whose age is no greater than the specified time in seconds.

Implemented through the Cache-Control MRCP header on a SPEAK or LOAD-LEXICON request. (Used for URI-list WAV files and <audio> tags embedded in SSML content.)

audiomaxstale

The server may re-use cached audio files that have exceeded their expiration time, but only by up to the specified number of seconds. If no value is assigned to max-stale, the server may use stale audio files of any age.

Implemented through the Cache-Control MRCP header on a SPEAK or LOAD-LEXICON request. (Used for URI-list WAV files and <audio> tags embedded in SSML content.)

fetchtimeout

Interval to wait for the content to be returned before throwing an error.badfetch event. The value is a time designation (for example, 5s). A fetch that does not specify its own fetchtimeout attribute uses a value derived from the innermost fetchtimeout property in scope.

grammarfetchhint

Nuance accepts this property but ignores its value; the effect is always "safe".

Specifies when the recognizer retrieves documents or other resources such as grammars or dictionaries from the server. Possible values are:

  • prefetch—Fetches all items that will be requested in a VoiceXML document when that document is opened.
  • safe—Fetches an item only when it is specifically requested.

Implemented through the Fetch-Hint MRCP header on a DEFINE-GRAMMAR or RECOGNIZE request.

grammarmaxage

The server may re-use only cached grammars whose age is no greater than the specified time in seconds.

Implemented through the Cache-Control MRCP header on a DEFINE-GRAMMAR or RECOGNIZE request.

grammarmaxstale

The server may re-use cached grammars that have exceeded their expiration time, but only by up to the specified number of seconds. If no value is assigned to max-stale, the server may use stale grammars of any age.

Implemented through the Cache-Control MRCP header on a DEFINE-GRAMMAR or RECOGNIZE request.

For detailed information on the Cache-Control and Fetch-Hint headers, see Recognition headers (for recognition requests) and Text-to-speech headers (for text-to-speech requests).

For example, to control the age of the cached audio files used for the prompts in a particular form, you can set these properties as follows:

<property name="audiomaxage" value="150s"/>
<property name="audiomaxstale" value="25s"/>
   <form>
     …
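A fuller sketch of how these properties might be scoped is shown below. It assumes standard VoiceXML property scoping, where a <property> inside a <form> overrides a document-level <property> of the same name. The host names, file names, and timing values are hypothetical. The values follow the unit style of the fragment above; because the descriptions define these values as a number of seconds, confirm which form your voice browser accepts.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">

  <!-- Document scope: applies to every form unless overridden -->
  <property name="fetchtimeout" value="10s"/>
  <property name="grammarmaxage" value="3600s"/>
  <property name="grammarmaxstale" value="0s"/>

  <form id="balance">
    <!-- Form scope: applies to fetches made within this form only -->
    <property name="audiomaxage" value="150s"/>
    <property name="audiomaxstale" value="25s"/>
    <!-- Overrides the document-level fetchtimeout for this form -->
    <property name="fetchtimeout" value="5s"/>

    <field name="account">
      <!-- Hypothetical grammar and prompt URIs -->
      <grammar src="http://appserver.example.com/grammars/account.grxml"
               type="application/srgs+xml"/>
      <prompt>
        <audio src="http://appserver.example.com/prompts/account.wav">
          Please say your account number.
        </audio>
      </prompt>
    </field>
  </form>
</vxml>
```

In this sketch the grammar limits apply to the whole document, while the audio limits and the shorter fetch timeout apply only inside the balance form.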

You can also change three general caching characteristics for a particular fetch by using the following attributes for the specific fetch request (in a <grammar> or <audio> element):

Attribute

Description

maxage

Use only cached data whose age is no greater than the specified time in seconds.

maxstale

Use cached data that has exceeded its expiration time, but only by up to the specified number of seconds.

minfresh

(Nuance-specific)

Use only cached data whose expiration time is no less than its current age plus the specified time in seconds. In other words, the data must remain fresh for at least the specified number of seconds.

Note that it is generally better to set these values using the cache-control server configuration parameters (see cache-control.max-age, cache-control.max-stale, and cache-control.min-fresh).
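As an illustration of per-fetch overrides, the following sketch places the attributes from the table above on individual <grammar> and <audio> elements. The URIs and values are hypothetical, and the unit style mirrors the property example earlier in this topic. The minfresh attribute is the Nuance-specific extension listed in the table, so other VoiceXML platforms may reject it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <field name="item">
      <!-- Frequently updated grammar: accept a cached copy only if it is
           at most 60 seconds old and has not expired -->
      <grammar src="http://appserver.example.com/grammars/menu.grxml"
               type="application/srgs+xml"
               maxage="60s" maxstale="0s"/>
      <prompt>
        <!-- Rarely updated recording: accept a cached copy up to one day old,
             even if it expired up to one hour ago -->
        <audio src="http://appserver.example.com/prompts/order.wav"
               maxage="86400s" maxstale="3600s">
          What would you like to order?
        </audio>
        <!-- Nuance-specific minfresh: the cached copy must remain fresh
             for at least another 30 seconds -->
        <audio src="http://appserver.example.com/prompts/specials.wav"
               minfresh="30s">
          Ask about today's specials.
        </audio>
      </prompt>
    </field>
  </form>
</vxml>
```

Attributes set this way affect only the single fetch they appear on; other fetches keep using the property values or server defaults.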

Notes:

The following VoiceXML properties control actions directly at the VoiceXML level and are not relevant outside the VoiceXML interpreter. They have no counterparts at other levels and do not affect Nuance components.

  • documentfetchhint
  • documentmaxage
  • documentmaxstale
  • fetchaudio
  • fetchaudiodelay
  • fetchaudiominimum
  • objectfetchhint
  • objectmaxage
  • objectmaxstale
  • scriptfetchhint
  • scriptmaxage
  • scriptmaxstale
  • universals
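For contrast, here is a minimal sketch of a few of these interpreter-level properties in use. It assumes the standard VoiceXML meaning of each property; the URIs and timing values are hypothetical. Nothing in this document is passed on to the recognizer or the text-to-speech server as a fetching or caching instruction.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Handled entirely by the voice browser -->
  <property name="documentfetchhint" value="safe"/>
  <!-- Filler audio played to the caller during a slow document fetch -->
  <property name="fetchaudio"
            value="http://appserver.example.com/prompts/please_wait.wav"/>
  <!-- Start the filler audio only if the fetch takes longer than 2 seconds -->
  <property name="fetchaudiodelay" value="2s"/>
  <!-- Once started, play the filler audio for at least 3 seconds -->
  <property name="fetchaudiominimum" value="3s"/>

  <form id="transfer">
    <block>
      <!-- Fetching the next document is governed by the properties above -->
      <goto next="http://appserver.example.com/myapp/next.vxml"/>
    </block>
  </form>
</vxml>
```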

Tuning caches for performance

Caches normally increase the efficiency of a system by reducing the time spent fetching files. Each Nuance product provides parameters for controlling the size and efficiency of its caches. However, an application cannot access those parameters directly; the properties and attributes described in this topic affect only which cached versions of an object are selected.