nrc-callsummary payload

Current version: v1

The application/x-nuance-nrc-callsummary payloads provide summary information about the NRaaS interaction (full recognition turn), including request parameters, results, statistics, and internal events.

In addition to the standard fields described in data field structure, messages with the application/x-nuance-nrc-callsummary dataContentType include the following fields:

fields table
Field Description
status Contains the status code. Go to Status codes in the NRaaS documentation for details.
nrcSessionid Identifier of the current NR session.
absEndTime Audio stream end time. Go to absEndTime for details.
audioPacketStats Timing information about audio packets.
audioURN URN of the audio file for the utterance. This field can be used to download the audio file with the AFSS API. Go to AFSS API message for details.
audioDump URL of the audio recording. Not included for DTMF recognitions.
nrcallogs The call logs with all NRaaS events that occurred during the interaction.

absEndTime

The absEndTime field contains the following values:

fields table
Field Description
firstPacketTime Date and time the first audio packet was received.
lastPacketTime Date and time the last audio packet was received.
audioDurationMs The duration of the audio received minus begin and end silence periods as detected by NRaaS.

Tokens in nrcalllogs events

The call logs contain event records that detail recognizer executions, recognitions, special events (such as compilation and cache activities), and caller utterances. These records contain information such as:

  • Timestamps of each event
  • Recognition results with confidence scores
  • Timing statistics of each recognition event
  • Names of audio files containing caller utterances

Tokens used in every event

fields table
Token Description
TIME System time when the event occurred, in the following format (accurate to within 0.01 second): YYYYMMDDhhmmssmmm.
CHAN A unique session identification name provided when the session is created.
EVNT The event identifier.
UCPU The current running value of CPU time consumed from the start of the recognition or synthesis. This value is reported in milliseconds, accurate to within 0.01 second. For events where this doesn’t apply, the value is 0.
SCPU This value is always 0.
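The common tokens can be read with a small helper. The sketch below parses the TIME format given above into a datetime. The `split_record` helper is hypothetical: it assumes each event record arrives as `KEY=VALUE` pairs joined by `|`, which this section does not specify, so adjust it to the record layout you actually receive.

```python
from datetime import datetime, timedelta


def parse_time_token(value: str) -> datetime:
    """Parse a TIME token of the form YYYYMMDDhhmmssmmm."""
    if len(value) != 17 or not value.isdigit():
        raise ValueError(f"unexpected TIME value: {value!r}")
    # The first 14 digits carry date and time; the last 3 are milliseconds.
    base = datetime.strptime(value[:14], "%Y%m%d%H%M%S")
    return base + timedelta(milliseconds=int(value[14:]))


def split_record(record: str) -> dict[str, str]:
    """Split one event record into tokens.

    ASSUMPTION: tokens are 'KEY=VALUE' pairs separated by '|'.
    """
    tokens = {}
    for part in record.split("|"):
        key, sep, val = part.partition("=")
        if sep:  # skip fragments that have no '='
            tokens[key] = val
    return tokens


# Made-up record, for illustration only.
record = "TIME=20240115103045123|CHAN=abc|EVNT=SWIrcst|UCPU=0|SCPU=0"
tokens = split_record(record)
when = parse_time_token(tokens["TIME"])
print(tokens["EVNT"], when.isoformat(timespec="milliseconds"))
```

The later sketches in this section work on a token dictionary of this shape, so they do not depend on how the record was split.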

SWIfrmt

This event identifies the format of call log events written by Nuance Recognizer and occurs at the beginning of a call.

fields table
Token Description
ENCD Format of log events written by Nuance Recognizer. For example, UTF-8.

SWIclst

This event indicates the beginning of a call to the system. It is triggered at the beginning of a session.

fields table
Token Description
VALU Session ID and status.
SRC Component that issued the event.

SWIliss

The SWIliss and SWIlise events indicate recognizer license usage at the beginning and end of a call to the system.

SWIliss is triggered at the start of a session, and the tokens describe the count of licenses in use after incrementing for the new license.

fields table
Token Description
LUSED Licenses used. The current number of recognizer instances.
OMAX Overdraft maximum. The number of available license ports (not including overdraft ports).
LFEAT License features. A comma-separated list showing which features are associated with the license.

SWIgrld

This event summarizes the loading of a grammar. The event is logged whenever a grammar is loaded, activated, or compiled.

fields table
Token Description
API The called Recognizer function, one of: “SWIrecGrammarLoad”, “SWIrecGrammarActivate”, or “SWIrecGrammarCompile”.
TYPE The data type of the grammar.
URI The grammar URI (token not written if grammar is not a URI).
PROPS Any properties supplied for the grammar.
FETCHES Number of fetches needed to load the grammar.
MEMHITS Memory cache hits for this load. (The number of loaded grammars that were already in the memory cache.)
MEMMISS Memory cache misses for this load. (The number of loaded grammars that were not already available in the memory cache.)
DISKHITS Disk cache hits for this load. (The number of loaded grammars that were already in the disk cache.)
DISKMISS Disk cache misses for this load. (The number of loaded grammars that were not already available in the disk cache.)
LDCPU Total CPU milliseconds used for the API call.
LDTIME Total clock-time milliseconds used for the API call.
GCCPU Total CPU milliseconds used for grammar compilation.
GCTIME Total clock-time milliseconds used for grammar compilation.
IFCPU Total CPU milliseconds to fetch the grammar(s) from inet.
IFTIME Total clock-time milliseconds to fetch the grammar(s) from inet.
IFBYTES Total bytes fetched (or re-fetched) from inet or the disk cache.
COMPILES Number of “real” compiles from source or old (OSR 1.n) binary files. (Total count of loaded grammars that required compilation; grammars already pre-compiled with the sgc 2.0 compiler are excluded.)
RC The return code from the API call.
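The cache and timing tokens of an SWIgrld event can be condensed into a few ratios. This is a minimal sketch: the function name and the token dictionary are hypothetical, and it assumes that compilation time (GCTIME) and fetch time (IFTIME) are both contained in the total call time (LDTIME), which the table suggests but does not state.

```python
def grammar_load_summary(tokens: dict[str, str]) -> dict[str, float]:
    """Derive cache hit rates and time shares from SWIgrld tokens."""

    def num(name: str) -> float:
        # Missing or empty tokens count as zero.
        return float(tokens.get(name) or 0)

    def ratio(part: float, total: float) -> float:
        return part / total if total else 0.0

    return {
        "memory_hit_rate": ratio(num("MEMHITS"), num("MEMHITS") + num("MEMMISS")),
        "disk_hit_rate": ratio(num("DISKHITS"), num("DISKHITS") + num("DISKMISS")),
        # ASSUMPTION: GCTIME and IFTIME are portions of LDTIME.
        "compile_share": ratio(num("GCTIME"), num("LDTIME")),
        "fetch_share": ratio(num("IFTIME"), num("LDTIME")),
    }


# Made-up values, for illustration only.
sample = {
    "MEMHITS": "3", "MEMMISS": "1",
    "DISKHITS": "0", "DISKMISS": "1",
    "LDTIME": "200", "GCTIME": "50", "IFTIME": "100",
}
print(grammar_load_summary(sample))
```

A low memory hit rate combined with a high compile share may indicate that grammars are being recompiled rather than served from cache.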

SWIcach

This event is a periodic summary of grammar caching activities.

fields table
Token Description
MHIT Number of times grammars were found in the memory cache.
MMISS Number of times grammars were not found in the memory cache.
MSIZE Size of the memory cache (in kilobytes).

SWIrcst

This event is logged at the start of a recognition.

fields table
Token Description
ACST Indicates whether the acoustic state has been reset: Set to 1 when a recognizer is created or the acoustic state is reset. Set to 0 after the SWIrcst event has been logged. For the second and subsequent recognition events during a telephone call, the expected value is 0, which indicates that the acoustic state has not been reset during the call.
GURIx Grammar URI, where x is an integer enumerating active speech grammars, starting at 0.
GRNM Grammar name. (For a parameter grammar, this is the grammar ID.)
LANG Grammar language. (This field is empty for parameter grammars.)
GRMT Grammar media type. For example, “GRMT=application/srgs+xml”.
WGHT Activation weight of a grammar.
OSRVER Recognizer version number. Logged only if ACST=1.
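Because GURIx is an enumerated token (GURI0, GURI1, and so on), the active grammars can be collected by matching on the numeric suffix. The sketch below assumes the event has already been split into a token dictionary; the function name and sample URIs are hypothetical.

```python
import re

GURI_PATTERN = re.compile(r"GURI(\d+)$")


def active_grammar_uris(tokens: dict[str, str]) -> list[str]:
    """Return grammar URIs ordered by their GURIx index."""
    indexed = []
    for key, value in tokens.items():
        match = GURI_PATTERN.match(key)
        if match:
            indexed.append((int(match.group(1)), value))
    # Sort numerically so that GURI10 follows GURI9, not GURI1.
    return [uri for _, uri in sorted(indexed)]


# Made-up tokens, for illustration only.
sample = {
    "ACST": "0",
    "GURI1": "builtin:grammar/digits",
    "GURI0": "https://example.com/grammars/yesno.grxml",
}
print(active_grammar_uris(sample))
```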

SWIepst

This event is written for each recognition turn and signals that the endpointer has begun the attempt to detect the start of speech.

fields table
Token Description
VERSION NR version.

SWIepss

The SWIepss and SWIepse events indicate the endpointer license usage at the beginning and end of a call to the system. SWIepss is triggered at the start of a session, and the tokens describe the count of licenses in use after incrementing for the new license.

fields table
Token Description
LUSED Licenses used. The current number of endpointer instances.
LMAX License maximum. The maximum number of available licenses. The number of licenses actually checked out and available for use by an endpointer instance.
OMAX Overdraft maximum. The number of available license ports (not including overdraft ports).
LFEAT License features. A comma-separated list showing which features are associated with the license.

SWIrcnd

This event is logged at the end of a recognition. Its tokens report the recognition status, the n-best results, and timing statistics.

fields table
Token Description
RSTT See Return codes.
RENR See Reasons for end of recognition.
ENDR See Reasons for end of speech.
NBST Number of n-best items. Used only if RSTT is “ok” or “lowconf”.
RSLT Parsed text for n-best item.
SPOK Normalized raw text for n-best item; set to the value of the SWI_spoken key.
GRMR Grammar for n-best item.
KEYS List of key/value pairs for the top result.
CONF Confidence value for n-best item. Values can range from 0 to 999.
RAWS Raw score for n-best item.
SPIV The second pass has been invoked. When the recognizer is “unsure” about the accuracy of the nbest list, it invokes a second pass through the data to help improve the accuracy. A second pass uses more CPU and may also presage a low-confidence recognition.
SPAG The second pass has not modified the result of the first pass. When the recognizer is “unsure” about the accuracy of the nbest list, it invokes a second pass through the data to help improve the accuracy. A second pass uses more CPU and may also presage a low-confidence recognition.
MDVR Model version—version stamp of models. Format is L.M.m.s, where L is language number, M is major version, m is minor version, and s is the set number.
MPNM Indicates the acoustic models used for generating the recognition result. Contains a comma-separated list showing the language and acoustic model filenames used for first-pass recognition processing to get the top choice on the n-best list. Each list element has the format LangCode/Version/Path/Filename. (If there is no applicable value to report, a value of NA is used.) For example: MPNM=en.us/10.0.0/models/FirstPass/models.hmm,de.de/10.0.0/models/FirstPass/models.hmm
DPNM Root name of the diphone acoustic models used to recognize the top choice on the n-best list. (If there is no applicable value to report, a value of NA is used.)
MACC Filename of the statistics file (the monophone accumulator) that tuned the acoustic model used for the recognition event.
MEDIA An audio media type. For example, “MEDIA=audio/basic;rate:8000”.
EOSS End-of-speech signal: where in the input stream the endpointer wanted the recognizer to stop.
DURS Amount of speech processed by the recognizer in milliseconds. The value can sometimes exceed EOSS by small amounts.
EOSD How much speech data was passed to the endpointer before EOS was determined. This token helps determine latency due to endpointer decision-making (mostly end of speech timeout). If EOSD equals EOSS then something unusual caused the end-of-speech; for example, the maximum speech duration timer expired.
BORT Beginning of recognition time (when the recognizer first processed the signal).
EOST End-of-speech time in milliseconds. Clock time when the endpointer determined the end of caller speech; measured in real time from the arrival of the first packet; delays in the audio path are not counted.
EORT End-of-recognition time in milliseconds. Clock time when the results are ready. Measured in real time from the arrival of the first packet of the input stream.
LA Value of the swirec_load_adjusted_speedvsaccuracy parameter used for the recognition. Values include: idle, normal, busy, Xidle, Xnormal, Xbusy. “X” values indicate that the parameter specified that value. Values without “X” were determined at runtime with the parameter setting “on”.
OFFS For internal use only. Shows an offset value for acoustic models. For example, “OFFS=1.3”.
SCAL For internal use only. Shows a multiplier for acoustic scale. For example, “SCAL=5.5”.
RCPU Recognizer CPU time in milliseconds. Measures how much CPU was used for the recognition.
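Several of these tokens lend themselves to simple derived values. The sketch below shows three, under stated assumptions: the helper names are hypothetical, the tokens are taken from an already-split dictionary, and CONF is treated as a single value for the top n-best item. Since EOST and EORT are both measured from the arrival of the first packet, their difference approximates the wait between end of speech and the result being ready.

```python
def normalized_confidence(conf: str) -> float:
    """Map a CONF value (0 to 999) onto the range 0.0 to 1.0."""
    value = int(conf)
    if not 0 <= value <= 999:
        raise ValueError(f"CONF out of range: {value}")
    return value / 999


def post_speech_latency_ms(tokens: dict[str, str]) -> int:
    """Milliseconds between end of speech (EOST) and results ready (EORT)."""
    return int(tokens["EORT"]) - int(tokens["EOST"])


def parse_mpnm(value: str) -> list[dict[str, str]]:
    """Split an MPNM value into LangCode/Version/Path/Filename parts."""
    if value == "NA":
        return []
    models = []
    for element in value.split(","):
        parts = element.split("/")
        models.append({
            "lang": parts[0],
            "version": parts[1],
            "path": "/".join(parts[2:-1]),  # everything between version and file
            "filename": parts[-1],
        })
    return models


mpnm = ("en.us/10.0.0/models/FirstPass/models.hmm,"
        "de.de/10.0.0/models/FirstPass/models.hmm")
for model in parse_mpnm(mpnm):
    print(model["lang"], model["version"], model["path"], model["filename"])
```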

Return codes

return codes table
Return code Status
serr A system error occurred.
lowconf There was an n-best result (including any possible decoys), but it was below the setting of the confidencelevel parameter.
maxc The maximum CPU time was reached (swirec_max_cpu_time).
nomatch There was no recognition match, and no n-best result.
ok Recognition was successful. There is an n-best result.
stop Recognizer received a stop request.
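Since NBST is used only when RSTT is “ok” or “lowconf”, code that reads n-best items can guard on the return code first. A minimal sketch, with a hypothetical helper name and made-up token values:

```python
# Return codes for which an n-best result exists.
NBEST_STATUSES = {"ok", "lowconf"}


def nbest_count(tokens: dict[str, str]) -> int:
    """Return the number of n-best items, or 0 when RSTT reports none."""
    if tokens.get("RSTT") not in NBEST_STATUSES:
        return 0
    return int(tokens.get("NBST", "0"))


print(nbest_count({"RSTT": "lowconf", "NBST": "2"}))
print(nbest_count({"RSTT": "nomatch"}))
```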

Reasons for end of recognition

end of recognition codes table
Reason code Description
count The maximum sentences were reached. (The max is determined by internal algorithms; this is not swirec_max_sentences.)
err A system error occurred.
maxc The maximum CPU time was reached.
maxsrch Recognizer’s maximum allowed search time was reached.
maxsent The maximum number of sentences tried was reached.
ok Recognition was successful. There is an n-best result.
prun Stopped generating the n-best list. This can occur even if no n-best entries are returned. One cause is that the pruning threshold was exceeded (swirec_state_beam). But typically, it simply means that there were no more hypotheses to consider. For example, this happens when an n-best size of n is requested but the grammar has fewer than n choices. It also happens if the recognizer has found a compelling acoustic match so that all the other hypotheses are pruned in the first-pass search.
stop Recognizer received a stop request.

Reasons for end of speech

end of speech codes table
Reason code Description
ctimeout The end of speech was detected (completetimeout was triggered).
eeos External end of speech. The audio sample sent to the recognizer was labeled as the last sample.
itimeout Normal end of speech.
maxs The maximum speech time was reached (maxspeechtimeout).
nobos No beginning of speech detected.

SWIacum

This event is written whenever the Recognizer collects a statistic as part of its self-learning feature (acoustic adaptation).

fields table
Token Description
MODNM Name of the recognition model associated with the statistics.
LANG The language of the acoustic models associated with the statistics.

SWIrslt

This event logs the complete XML recognition result at the end of a successful recognition (SWIrcnd) when a voice platform requests a result from Nuance Recognizer.

fields table
Token Description
MEDIA Media type of the result.
CNTNT XML result of the recognition. The exact format of the XML string depends on the voice platform (for example, the platform might request NLSML result format).
SECURE=TRUE Confidential information has been suppressed (removed) from the call log record. The token only appears when TRUE.

SWIepms

This event signals that the external endpointer is done with trying to detect the beginning of speech.

fields table
Token Description
PD The offset in milliseconds from which the prompt started playing to the time it stopped (either due to barge-in, or because it finished playing). This value is reset to “-1” before the next prompt plays. If no barge-in occurs, this value reflects the total duration of the prompt that was played.
BOS The offset time, in milliseconds, at which the beginning of speech in the signal was detected, with some additional backoff. For the true start of speech, see the SOS value. When set to -1, this means that the endpointer timed out.
SOS The offset time, in milliseconds, at which the beginning of speech in the signal was detected. If SOS is set to -1, this means that the endpointer timed out. If SOS=PD this indicates that there was barge-in, because the prompt stopped at the start of speech.
EOS End of speech time. The default reset value is -1, meaning that the external endpointer did not find the end of speech. The -1 value is expected when the endpointer is in begin_only mode.
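The SOS and PD rules above can be turned into a small classifier. This is a sketch only: the function name and the three labels are hypothetical, and the token values in the usage lines are made up.

```python
def classify_endpointing(tokens: dict[str, str]) -> str:
    """Classify an SWIepms event from its PD and SOS tokens."""
    prompt_duration = int(tokens["PD"])
    start_of_speech = int(tokens["SOS"])
    if start_of_speech == -1:
        # The endpointer timed out without detecting speech.
        return "timeout"
    if start_of_speech == prompt_duration:
        # The prompt stopped at the start of speech.
        return "barge-in"
    return "speech"


print(classify_endpointing({"PD": "1200", "SOS": "1200"}))
print(classify_endpointing({"PD": "3000", "SOS": "-1"}))
```

The timeout check comes first so that an event with both PD and SOS at -1 is not mistaken for barge-in.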

SWIendp

This event is written for every recognition attempt (where start of speech is detected), whether the recognition was successful or not. It is not logged if there is no start of speech. The event is triggered if the voice platform stops the endpointer.

fields table
Token Description
SRC This token, if present, is set to “SWIep”.
BRGN Boolean value, set to 1 if speech was detected while the prompt was playing, 0 if not.
BTIM Integer number of elapsed milliseconds between the first sample and the detection of speech, counted based on the duration of the samples passed into the endpointer.
MODE Input mode used. spch: caller used speech. dtmf: caller used DTMF. hangup: caller disconnected (some older systems logged this as empty). timeout: no speech detected before the timeout. other: the voice browser requested a stop for an unknown reason.

SWIepse

The SWIepss and SWIepse events indicate endpointer license usage at the beginning and end of a call to the system.

SWIepse indicates the duration (in seconds) a license was held for the call; the event is triggered at the end of a session, and the tokens describe the count of licenses in use after incrementing for the new license.

fields table
Token Description
LUSED Licenses used. The current number of endpointer instances.
LMAX License maximum. The maximum number of available licenses. The number of licenses actually checked out and available for use by an endpointer instance.
OMAX Overdraft maximum. The number of available license ports (not including overdraft ports).
LFEAT License features. A comma-separated list showing which features are associated with the license.
LTIME License time. It shows the number of milliseconds that the license was held since the beginning of the call.

SWIlise

The SWIliss and SWIlise events indicate recognizer license usage at the beginning and end of a call to the system.

SWIlise indicates the duration (in seconds) a license was held for the call; the event is triggered at the end of a session, and the tokens describe the count of licenses in use after incrementing for the new license.

fields table
Token Description
LUSED Licenses used. The current number of recognizer instances.
OMAX Overdraft maximum. The number of available license ports (not including overdraft ports).
LMAX License maximum. The maximum number of available licenses. The number of licenses actually checked out and available for use by a recognizer instance.
LFEAT License features. A comma-separated list showing which features are associated with the license.
LTIME License time. It shows the number of milliseconds that the license was held since the beginning of the call.
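The license tokens can be summarized per call. The sketch below converts LTIME from milliseconds to seconds and relates LUSED to LMAX; the function name is hypothetical and the feature names in the sample are placeholders, not real license features.

```python
def license_usage(tokens: dict[str, str]) -> dict[str, object]:
    """Summarize the license tokens of an SWIlise event."""
    used = int(tokens["LUSED"])
    maximum = int(tokens["LMAX"])
    features = [f.strip() for f in tokens.get("LFEAT", "").split(",") if f.strip()]
    return {
        "held_seconds": int(tokens["LTIME"]) / 1000,  # LTIME is in milliseconds
        "utilization": used / maximum if maximum else 0.0,
        "features": features,
    }


# Made-up values, for illustration only.
sample = {"LUSED": "2", "LMAX": "8", "LTIME": "45250", "LFEAT": "feature_a, feature_b"}
print(license_usage(sample))
```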

SWIlps

This event indicates the versions of data packs used for recognition during the session.

fields table
Token Description
LANGVER Concatenation of all languages (and their data pack versions) used during the session.

NUANtnat

This event is written near the end of every call.

fields table
Token Description
TNAT The tenant name associated with the recognition.

SWIclnd

This event indicates the end of a call to the system. It is triggered at the end of a session.

fields table
Token Description
VALU Session ID and status.
SRC Nuance component that issued the event.