Troubleshooting system latency
This topic covers some of the causes of latency that may occur in your system. Latency is the delay observed by callers who notice a long pause between the end of their speech and the next prompt they hear. Callers are very sensitive to these latencies.
Always test for latencies and verify operations in a full production environment. During initial deployments and full production, you can detect latencies in various ways: test calls to the system, reports from customers (for example, comments to agents after users transfer from the application), and call analysis (for example, places in the application where users hang up unexpectedly, or other unexpected values in the call logs).
Follow the steps below to diagnose the causes of latency. Causes can be general or specific: a heavily loaded system is a general problem; a slow fetch of a large file is specific. Depending on what you already know, you might focus on one part of the system or need to survey all of them. For example, latency can be due to delays in these locations:
- Voice platform or application activities. See Diagnosing platform latency.
- Grammar load activities. See Diagnosing delays during grammar loading.
- Fetch delay: Getting the grammar, often across the network.
- Compile delay: Waiting for large grammars to compile.
- Recognizer processing. See Diagnosing Recognizer latency.
- If you don’t have a specific starting point, see General troubleshooting.
The sections that follow help you look for clues in call logs.
Tip: See Logging for Nuance speech products to learn about call log paths, filenames, events, and tokens.
Diagnosing platform latency
To investigate whether the observed latency is caused by a voice platform, consider these questions:
- Does the browser access a database after the recognition?
- Is the application playing a prompt that begins with silence? Callers perceive the leading silence as a delay. (A measurement sketch follows this list.)
- Are there network delays that slow interaction between the system and the application?
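If you suspect a prompt begins with silence, you can measure the leading silence directly instead of relying on listening alone. The following sketch is not part of the product: it assumes the prompt is a 16-bit PCM WAV file, and the silence threshold is an illustrative value you should tune for your recording levels.

```python
import sys
import wave
import array

def leading_silence_ms(path, threshold=500):
    """Approximate leading silence (ms) in a 16-bit PCM WAV prompt.

    threshold is an assumed amplitude (out of 32767) below which a sample
    counts as silence; tune it for your recording levels.
    """
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch handles only 16-bit PCM audio")
        rate = w.getframerate()
        channels = w.getnchannels()
        samples = array.array("h", w.readframes(w.getnframes()))

    # Walk frame by frame (one sample per channel) until speech-level audio appears.
    for i in range(0, len(samples), channels):
        if any(abs(s) > threshold for s in samples[i:i + channels]):
            return (i // channels) * 1000 / rate
    return (len(samples) // channels) * 1000 / rate  # the whole file is below the threshold

if __name__ == "__main__":
    for prompt in sys.argv[1:]:
        print(f"{prompt}: ~{leading_silence_ms(prompt):.0f} ms of leading silence")
```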
Diagnosing delays during grammar loading
If you suspect the latency occurs during grammar load, investigate the SWIgrld (grammar load) event in the call logs.
Use the URI token to identify the grammar type. A grammar name ending in .xml or .grxml is a source grammar. A name ending in .gram is a binary grammar.
For source grammars, the latency can be caused by fetching or by compiling. (A log-scanning sketch that applies the checks below appears after this list.)
- Check the values of these tokens:
- FETCHES: Number of fetches needed to load the grammar.
- GCCPU: Total CPU milliseconds used for grammar compilation.
- GCTIME: Total clock-time milliseconds used for grammar compilation.
- IFCPU: Total CPU milliseconds to fetch the grammar(s) from inet.
- IFTIME: Total clock-time milliseconds to fetch the grammar(s) from inet.
- IFBYTES: Total bytes fetched (or re-fetched) from inet or the disk cache.
- LDCPU: Total CPU milliseconds used for the API call.
- LDTIME: Total clock-time milliseconds used for the API call.
- If LDTIME (the clock time for the whole grammar load) is significantly greater than the sum of LDCPU and IFCPU (the time used by the CPU for the load), CPU use may have been interrupted.
- If IFTIME is high, it suggests network or server delays while fetching grammars.
- If FETCHES (number of fetches needed to load the grammar) is not 1, the additional fetches can be caused by rulerefs to other grammars or user dictionaries. Each fetch causes latency.
- If GCCPU and GCTIME are significant, it suggests that the grammars are being compiled because they are not in memory cache or disk cache.
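To apply these checks across a whole set of call logs rather than one event at a time, a small script can scan for SWIgrld lines. This is only a sketch: the pipe-delimited TOKEN=value layout and the numeric thresholds are assumptions, not product specifications, so adapt both to your environment before relying on the output.

```python
import re
import sys

# Assumed layout: SWIgrld lines carry pipe-delimited TOKEN=value pairs, for example
#   ... SWIgrld|URI=http://host/app/main.grxml|LDTIME=312|LDCPU=80|IFCPU=5|IFTIME=210|FETCHES=2|GCTIME=95 ...
# Adjust the parsing to your actual call log format.
TOKEN_RE = re.compile(r"(\w+)=([^|\s]*)")

def intval(tok, name, default=0):
    value = tok.get(name, "")
    return int(value) if value.isdigit() else default

def check_grammar_load(line):
    tok = dict(TOKEN_RE.findall(line))
    findings = []
    ldtime, ldcpu, ifcpu = intval(tok, "LDTIME"), intval(tok, "LDCPU"), intval(tok, "IFCPU")
    if ldtime > 2 * (ldcpu + ifcpu) and ldtime > 100:   # illustrative ratio, not a product threshold
        findings.append(f"clock time {ldtime} ms far exceeds CPU time {ldcpu + ifcpu} ms")
    if intval(tok, "IFTIME") > 500:                      # illustrative 500 ms fetch budget
        findings.append(f"slow fetch (IFTIME={tok['IFTIME']} ms)")
    if intval(tok, "FETCHES", 1) > 1:
        findings.append(f"{tok['FETCHES']} fetches (rulerefs or user dictionaries?)")
    if intval(tok, "GCTIME") > 0:
        findings.append(f"compiled at load time (GCTIME={tok['GCTIME']} ms)")
    return tok.get("URI", "?"), findings

if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                if "SWIgrld" in line:
                    uri, findings = check_grammar_load(line)
                    for msg in findings:
                        print(f"{path}: {uri}: {msg}")
```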
For binary grammars, latency can be caused by fetching. A grammar must be loaded into the memory cache before recognition can occur. If a grammar is not in memory, it is loaded from the disk cache. If it is also not in the disk cache, it is fetched, cached to disk, and then loaded into memory.
Check the values of these tokens (a cache hit-rate sketch follows this list):
- IFCPU and IFTIME. If IFCPU is low and IFTIME is high, this indicates network or server delays in fetching a grammar across the internet. Long fetches translate directly into latency: the utterance cannot be analyzed until the grammar is in memory.
- IFCPU: Total CPU milliseconds to fetch the grammar(s) from inet.
- IFTIME: Total clock-time milliseconds to fetch the grammar(s) from inet.
- MEMMISS and MEMHITS allow you to see how often grammars are found in memory:
- MEMMISS: Memory cache misses for this load. (The number of loaded grammars that were not already available in the memory cache.)
- MEMHITS: Memory cache hits for this load. (The number of loaded grammars that were already in the memory cache.)
- DISKHITS and DISKMISS allow you to see how often grammars are found in the disk cache:
- DISKMISS: Disk cache misses for this load. (The number of loaded grammars that were not already available in the disk cache.)
- DISKHITS: Disk cache hits for this load. (The number of loaded grammars that were already in the disk cache.)
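To see how often grammars are served from the memory and disk caches across many loads, you can total these tokens over a set of logs. As in the previous sketch, the pipe-delimited TOKEN=value layout is an assumption about your log format; adjust the parsing before drawing conclusions. A low memory hit rate points toward the preload and cache-lock suggestions below.

```python
import re
import sys
from collections import Counter

TOKEN_RE = re.compile(r"(\w+)=(\d+)")
CACHE_TOKENS = ("MEMHITS", "MEMMISS", "DISKHITS", "DISKMISS")

def rate(hits, misses):
    total = hits + misses
    return f"{hits}/{total} ({100.0 * hits / total:.1f}%)" if total else "no loads recorded"

totals = Counter()
for path in sys.argv[1:]:
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            if "SWIgrld" not in line:
                continue
            # Sum the cache counters over every grammar load event in the logs.
            for name, value in TOKEN_RE.findall(line):
                if name in CACHE_TOKENS:
                    totals[name] += int(value)

print("memory cache hit rate:", rate(totals["MEMHITS"], totals["MEMMISS"]))
print("disk cache hit rate:  ", rate(totals["DISKHITS"], totals["DISKMISS"]))
```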
Here are some suggestions for reducing or avoiding latency when loading grammars:
- Avoid time delays and multiple file fetches per grammar, and reduce CPU load by precompiling grammars: use the sgc compiler utility with the correct optimization level. (See Compiling grammars.)
- Reduce fetch time by pre-loading grammars: use a “preload file” for all your common and/or large grammars. See swirec_preload_file.
- Reduce fetch time by locking grammars into the disk cache: set swirec_disk_cache_lock.
- Avoid fetches by locking grammars into memory: set swirec_memory_cache_lock.
- Avoid fetches by giving grammars long expiration dates on the web server. (A header-check sketch follows this list.)
- Avoid fetches by increasing memory and disk cache sizes. Set these parameters:
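For the expiration-date suggestion above, you can confirm what caching policy the web server actually sends for a grammar before changing anything else. This sketch issues a HEAD request with Python's standard urllib and prints the usual HTTP caching headers; the grammar URLs are whatever you pass on the command line, not values from the product.

```python
import sys
import urllib.request

def caching_headers(url):
    """Report the standard HTTP caching headers the server sends for a grammar URL."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return {name: response.headers.get(name, "(not set)")
                for name in ("Cache-Control", "Expires", "ETag", "Last-Modified")}

if __name__ == "__main__":
    for url in sys.argv[1:]:          # pass your grammar URLs as arguments
        print(url)
        for name, value in caching_headers(url).items():
            print(f"  {name}: {value}")
```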
Diagnosing Recognizer latency
Use the call logs to diagnose latency during recognition:
- Investigate the SWIrcst event. Review the list of active grammars (GURIn), and ensure that all of them are needed for the recognition. Otherwise, Recognizer needlessly processes those grammars and attempts to match utterances against them.
- Investigate the SWIrcnd event. Make note of the following tokens (a sketch that scripts these checks follows this list):
- DURS: Duration of the speech signal.
- EOST: Clock time from first speech packet received until end of speech declared.
- EORT: Clock time from first speech packet received to end of recognition (when the results are ready).
- EOSS: Milliseconds into the audio stream where end-of-speech occurs.
- RCPU: Amount of CPU used for the recognition.
- If DURS is far apart from EOST and EORT, there is a CPU problem, such as non-Nuance software running on the machine. (Check the CPU usage of virus scanners and automatic software updaters.) This might also mean that the CPU is not sufficient for the number of simultaneous recognitions.
- If EOST and EORT are far apart, the problem is usually insufficient CPU or a very complex recognition task.
Rarely, this can reflect a delay in the audio sent to Recognizer.
- If RCPU is high, the problem is probably a complex grammar, noisy speech, or an utterance that is covered by more than one grammar.
- If EOSS and EOST are far apart, it can signal a CPU problem.
- Enable the swirec_save_comp_stats parameter, which writes detailed statistics of Recognizer processing to the call logs. Collect the additional data and deliver the logs to Nuance technical support.
- Investigate large gaps in timestamps between events. Tracing MRCP packets can help identify issues.
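The SWIrcnd token checks above can also be scripted over a batch of call logs. As with the grammar-load sketch, the pipe-delimited TOKEN=value layout and the gap thresholds are assumptions for illustration only; adjust both to your environment.

```python
import re
import sys

TOKEN_RE = re.compile(r"(\w+)=(-?\d+)")

def review_recognition(line, gap_ms=500, cpu_ms=1000):
    """Flag suspicious timing tokens in one SWIrcnd line (thresholds are illustrative)."""
    tok = {name: int(value) for name, value in TOKEN_RE.findall(line)}
    durs, eost, eort = tok.get("DURS", 0), tok.get("EOST", 0), tok.get("EORT", 0)
    eoss, rcpu = tok.get("EOSS", 0), tok.get("RCPU", 0)

    findings = []
    if eort - eost > gap_ms:
        findings.append(f"results arrived {eort - eost} ms after end of speech")
    if eost - durs > 2 * gap_ms:
        findings.append(f"EOST exceeds DURS by {eost - durs} ms (possible CPU problem)")
    if eost - eoss > 2 * gap_ms:
        findings.append(f"EOST exceeds EOSS by {eost - eoss} ms (possible CPU problem)")
    if rcpu > cpu_ms:
        findings.append(f"high recognition CPU (RCPU={rcpu} ms)")
    return findings

if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as f:
            for number, line in enumerate(f, 1):
                if "SWIrcnd" in line:
                    for msg in review_recognition(line):
                        print(f"{path}:{number}: {msg}")
```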
Here are some suggestions for reducing or avoiding latency during recognition:
- Remove unnecessary grammars from the context.
- Reduce the complexity of the grammars.
- Reduce the expected length of utterances (change the prompt to elicit a shorter response from the user).
- Reduce CPU usage on the machine.
- Look for network delays.
General troubleshooting
A strong indicator of latency is when multiple callers hang up unexpectedly at the same place in an application. If you are getting general reports of slow recognition but don't know which grammar or recognition context is performing slowly, try the following:
- If it’s a repeatable test case, turn on perfmon (Windows) or sar or top (Linux; sar is not included in a typical Linux installation) to track CPU use.
- To narrow in on a problem recognition context, look at the call logs for the application from the time period in which latency was reported.
Once you have selected call logs, look for a recognition that took a long time. (Look for large recognition times between SWIrcst and SWIrcnd.) If you find several recognitions for the same context for various calls that seem to take longer from SWIrcst to SWIrcnd than anticipated, zero in on that grammar by following the suggestions in Diagnosing Recognizer latency.
Likewise, look for large grammar load times, and follow the suggestions in Diagnosing delays during grammar loading.
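To find the large SWIrcst-to-SWIrcnd gaps described above across many calls, you can pair the two events and group the gaps by the active grammars. This sketch assumes each log line carries TIME=<milliseconds> and CHAN=<channel> tokens and that the two events for one recognition appear in order on the same channel; all of that is an assumption about your log layout, so adapt the parsing before trusting the numbers.

```python
import re
import sys
from collections import defaultdict

TOKEN_RE = re.compile(r"(\w+)=([^|\s]*)")

def tokens(line):
    return dict(TOKEN_RE.findall(line))

def recognition_gaps(paths):
    pending = {}                 # channel -> (start time in ms, active grammars)
    gaps = defaultdict(list)     # active grammars -> list of gaps in ms
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                tok = tokens(line)
                chan, time = tok.get("CHAN"), tok.get("TIME", "")
                if not chan or not time.isdigit():
                    continue
                if "SWIrcst" in line:
                    grammars = " ".join(v for k, v in sorted(tok.items()) if k.startswith("GURI"))
                    pending[chan] = (int(time), grammars or "(unknown grammars)")
                elif "SWIrcnd" in line and chan in pending:
                    start, grammars = pending.pop(chan)
                    gaps[grammars].append(int(time) - start)
    return gaps

if __name__ == "__main__":
    results = recognition_gaps(sys.argv[1:])
    for grammars, values in sorted(results.items(), key=lambda item: -max(item[1])):
        print(f"max {max(values)} ms, mean {sum(values) / len(values):.0f} ms "
              f"over {len(values)} recognitions: {grammars}")
```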
Tuning scenarios
The following topics cover typical tuning scenarios, and provide some suggestions of what to analyze.
VoiceXML receives a NOINPUT return.
- Check telephony hardware settings to ensure the signal coming in has a decent level.
- Check endpointer sensitivity. If it is set incorrectly, start-of-speech may not be identified.
VoiceXML receives frequent NOMATCH returns.
- This is most likely a grammar problem.
Matches are made, but confidence is often below threshold.
- Potential hardware issue or noise on the line. Turn on waveform saving and listen to confirm that the captured audio is good. Transcribe the utterances to check the grammars. (A quick level-check sketch follows this scenario.)
- Check configuration of Recognizer.
- More careful analysis may be required.
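Listening remains the best check, but a quick script can triage a large batch of saved waveforms for obviously bad levels. This sketch assumes the saved utterances are (or have been converted to) 16-bit PCM WAV files, and the "check this one" thresholds are illustrative; it is a rough screen, not a substitute for listening or transcription.

```python
import sys
import wave
import array

def audio_stats(path):
    """Peak level and clipped-sample percentage for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch handles only 16-bit PCM audio")
        samples = array.array("h", w.readframes(w.getnframes()))
    peak = max((abs(s) for s in samples), default=0)
    clipped = sum(1 for s in samples if abs(s) >= 32767)
    return peak, 100.0 * clipped / max(len(samples), 1)

if __name__ == "__main__":
    for path in sys.argv[1:]:
        peak, clip_pct = audio_stats(path)
        # Illustrative thresholds: very quiet (peak < 2000) or heavily clipped (> 1%) audio.
        flag = "  <-- check this one" if peak < 2000 or clip_pct > 1.0 else ""
        print(f"{path}: peak {peak}/32767, {clip_pct:.2f}% clipped{flag}")
```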
Problem: The content of saved audio files does not match your expectations.
Description: When listening to waveforms captured during a call, you expect the first waveforms to match the first utterances spoken. If, instead, the first waveforms appear to originate later in the call, the problem might be caused by an indexing problem when writing the files. If the system runs out of index digits, it restarts counters and begins overwriting the first files written during the session.
Solution: See these parameters:
Match is made, but confidence is below threshold.
- Check the grammars. Ensure the grammars are active. Check their coverage: they must include the speech spoken but not be overly broad. Ensure they are not overly confusable. Bring in a speech scientist.
- Listen to the utterance. Eliminate hardware as the problem. Confirm that the captured audio is good.
- Check the call logs. Check tokens related to endpointing and call duration.
- Adapt the grammar or change the confidence threshold.
Dictionary pronunciations need tuning.
- Use dicttest to see what pronunciation is being used.
The problem may be global or context-specific. If the problem is context-specific:
- Is the context complex?
- Ensure there are no external issues in the form of database lookups or application computation.
- Is the recognition just too hard? Look at RCPU in the SWIrcnd event. If it is high (relative to others), study the grammar to see why it uses so much CPU.
- Check whether the grammar is being compiled (see the SWIgrld event in the call log), or whether it is being pushed out of the cache.
- Semantic interpretation (ECMAScript or SSM): Use RCPU and involve a speech scientist as required.
If the problem is global:
- Look at the application server. Look at networking. Ensure no latencies.
- Enable swirec_save_comp_stats, collect data, and deliver the resulting call logs to Nuance technical support.
- Check CPU usage for Speech Server and individual recognition servers. If it is high, figure out why.
- Determine whether grammars are being compiled frequently (see the SWIgrld event in the call log), or whether they are being pushed out of the cache. (A counting sketch follows.)
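To check the last point over a set of logs, count how many SWIgrld events show compile time or a disk-cache miss. As in the earlier sketches, the pipe-delimited TOKEN=value layout is an assumption about your logs, so adjust the parsing to match your actual format.

```python
import re
import sys

TOKEN_RE = re.compile(r"(\w+)=(\d+)")

loads = compiles = disk_misses = 0
for path in sys.argv[1:]:
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            if "SWIgrld" not in line:
                continue
            tok = {name: int(value) for name, value in TOKEN_RE.findall(line)}
            loads += 1
            compiles += 1 if tok.get("GCTIME", 0) > 0 else 0
            disk_misses += 1 if tok.get("DISKMISS", 0) > 0 else 0

print(f"{loads} grammar loads: {compiles} involved compilation, "
      f"{disk_misses} missed the disk cache")
```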