nrc-callsummary payload
Current version: v1
The application/x-nuance-nrc-callsummary payloads provide summary information about the NRaaS interaction (full recognition turn), including request parameters, results, statistics, and internal events.
In addition to the standard fields described in data field structure, messages with the application/x-nuance-nrc-callsummary dataContentType include the following fields:
Field | Description |
---|---|
status | Contains the status code. Go to Status codes in the NRaaS documentation for details. |
nrcSessionid | Identifier of the current NR session. |
absEndTime | Audio stream end time. Go to absEndTime for details. |
audioPacketStats | Timing information about audio packets. |
audioURN | URN of the audio file for the utterance. This field can be used to download the audio file with the AFSS API. Go to AFSS API message for details. |
audioDump | URL of the audio recording. Not included in DTMF recognitions. |
nrcallogs | The call logs with all NRaaS events that occurred during the interaction. |
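The sketch below shows what a deserialized callsummary payload might look like. All values are hypothetical placeholders, the timestamp format is assumed to be ISO 8601, and the exact shape of audioPacketStats and nrcallogs depends on your deployment; the absEndTime sub-fields are described in the next table.

```python
# Hypothetical example of a deserialized callsummary payload.
# Field names follow the table above; every value is a placeholder.
call_summary = {
    "status": 200,                                    # status code of the interaction
    "nrcSessionid": "f0a1b2c3-example-session-id",    # NR session identifier
    "absEndTime": {
        "firstPacketTime": "2024-01-01T12:00:00.000Z",  # assumed ISO 8601 timestamp
        "lastPacketTime": "2024-01-01T12:00:04.250Z",
        "audioDurationMs": 3100,                        # excludes leading/trailing silence
    },
    "audioPacketStats": {},                           # timing information about audio packets
    "audioURN": "urn:example:afss:utterance-0001",    # placeholder URN for the AFSS API
    "audioDump": "https://example.com/audio/utt-0001.wav",  # not present for DTMF
    "nrcallogs": [],                                  # NRaaS call log events (see below)
}
```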
absEndTime
The absEndTime field contains the following values:
Field | Description |
---|---|
firstPacketTime | Date and time the first audio packet was received. |
lastPacketTime | Date and time the last audio packet was received. |
audioDurationMs | Duration of the audio received, in milliseconds, minus the beginning and end silence periods as detected by NRaaS. |
Tokens in nrcallogs events
The nrcallogs field contains event records that detail recognizer executions, recognitions, special events (such as compilation and cache activities), and caller utterances. These records contain information such as:
- Timestamps of each event
- Recognition results with confidence scores
- Timing statistics of each recognition event
- Names of audio files containing caller utterances
Tokens used in every event
Token | Description |
---|---|
TIME | System time when the event occurred, in the following format (accurate to within 0.01 second): YYYYMMDDhhmmssmmm. |
CHAN | A unique session identification name provided when the session is created. |
EVNT | The event identifier. |
UCPU | The current running value of CPU time consumed from the start of the recognition or synthesis. This value is reported in milliseconds, accurate to within 0.01 second. For events where this doesn’t apply, the value is 0. |
SCPU | This value is always 0. |
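As an illustration only, the following sketch parses a single call log event, assuming it is serialized as a pipe-delimited sequence of TOKEN=value pairs (the classic Nuance Recognizer call log layout); if your deployment delivers events as structured objects, the keys are the same tokens. The TIME conversion follows the YYYYMMDDhhmmssmmm format described above.

```python
from datetime import datetime

def parse_event(line: str) -> dict:
    """Split a pipe-delimited TOKEN=value call log line into a dict.

    Assumes the classic 'TIME=...|CHAN=...|EVNT=...|...' layout; adjust the
    splitting logic if your events arrive as structured JSON instead.
    """
    tokens = {}
    for field in line.strip().split("|"):
        if "=" in field:
            key, _, value = field.partition("=")
            tokens[key] = value
    return tokens

def event_time(tokens: dict) -> datetime:
    """Convert the TIME token (YYYYMMDDhhmmssmmm) into a datetime."""
    # %f pads the trailing 3 digits up to microseconds, so '123' becomes 123 ms.
    return datetime.strptime(tokens["TIME"], "%Y%m%d%H%M%S%f")

# Hypothetical event line used purely for illustration.
sample = "TIME=20240101120000123|CHAN=chan-0001|EVNT=SWIrcst|UCPU=0|SCPU=0"
event = parse_event(sample)
print(event["EVNT"], event_time(event))
```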
SWIfrmt
This event identifies the format of call log events written by Nuance Recognizer and occurs at the beginning of a call.
Token | Description |
---|---|
ENCD | Format of log events written by Nuance Recognizer. For example, UTF-8. |
SWIclst
This event indicates the beginning of a call to the system. It is triggered at the beginning of a session.
Token | Description |
---|---|
VALU | Session ID and status. |
SRC | Component that issued the event. |
SWIliss
The SWIliss and SWIlise events indicate recognizer license usage at the beginning and end of a call to the system.
SWIliss is triggered at the start of a session, and the tokens describe the count of licenses in use after incrementing for the new license.
Token | Description |
---|---|
LUSED | Licenses used. The current number of recognizer instances. |
OMAX | Overdraft maximum. The number of available license ports (not including overdraft ports). |
LFEAT | License features. A comma-separated list showing which features are associated with the license. |
SWIgrld
This event summarizes the loading of a grammar. The event is logged whenever a grammar is loaded, activated, or compiled.
Token | Description |
---|---|
API | The called Recognizer function: “SWIrecGrammarLoad”, “SWIrecGrammarActivate”, or “SWIrecGrammarCompile”. |
TYPE | The data type of the grammar. |
URI | The grammar URI (token not written if grammar is not a URI). |
PROPS | Any properties supplied for the grammar. |
FETCHES | Number of fetches needed to load the grammar. |
MEMHITS | Memory cache hits for this load. (The number of loaded grammars that were already in the memory cache.) |
MEMMISS | Memory cache misses for this load. (The number of loaded grammars that were not already available in the memory cache.) |
DISKHITS | Disk cache hits for this load. (The number of loaded grammars that were already in the disk cache.) |
DISKMISS | Disk cache misses for this load. (The number of loaded grammars that were not already available in the disk cache.) |
LDCPU | Total CPU milliseconds used for the API call. |
LDTIME | Total clock-time milliseconds used for the API call. |
GCCPU | Total CPU milliseconds used for grammar compilation. |
GCTIME | Total clock-time milliseconds used for grammar compilation. |
IFCPU | Total CPU milliseconds to fetch the grammar(s) from inet. |
IFTIME | Total clock-time milliseconds to fetch the grammar(s) from inet. |
IFBYTES | Total bytes fetched (or re-fetched) from inet or the disk cache. |
COMPILES | Number of “real” compiles from source or old (OSR 1.n) binary files. (Total count of loaded grammars that required compilation; grammars already pre-compiled with the sgc 2.0 compiler are excluded.) |
RC | The return code from the API call. |
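Because the SWIgrld tokens expose cache counters and load timings, they are handy for spotting grammar loads that forced a compile or missed the caches. A small, hypothetical helper (building on the parse_event sketch above) might look like this:

```python
def grammar_load_report(tokens: dict) -> str:
    """Summarize a SWIgrld event: cache behaviour and load cost.

    Token names come from the SWIgrld table; the event dict is assumed to be
    produced by a parser such as parse_event above.
    """
    mem_hits = int(tokens.get("MEMHITS", 0))
    mem_miss = int(tokens.get("MEMMISS", 0))
    disk_hits = int(tokens.get("DISKHITS", 0))
    disk_miss = int(tokens.get("DISKMISS", 0))
    compiles = int(tokens.get("COMPILES", 0))
    load_ms = tokens.get("LDTIME", "?")
    return "; ".join([
        f"memory cache {mem_hits} hit(s)/{mem_miss} miss(es)",
        f"disk cache {disk_hits} hit(s)/{disk_miss} miss(es)",
        f"{compiles} compile(s) from source",
        f"load time {load_ms} ms",
    ])
```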
SWIcach
This event is a periodic summary of grammar caching activities.
Token | Description |
---|---|
MHIT | Number of times grammars were found in the memory cache. |
MMISS | Number of times grammars were not found in the memory cache. |
MSIZE | Size of the memory cache (in kilobytes). |
SWIrcst
This event is logged at the start of a recognition.
Token | Description |
---|---|
ACST | Indicates whether the acoustic state has been reset: Set to 1 when a recognizer is created or the acoustic state is reset. Set to 0 after the SWIrcst event has been logged. For the second and subsequent recognition events during a telephone call, the expected value is 0, which indicates that the acoustic state has not been reset during the call. |
GURIx | Grammar URI, where x is an integer enumerating active speech grammars, starting at 0. |
GRNM | Grammar name. (For a parameter grammar, this is the grammar ID.) |
LANG | Grammar language. (This field is empty for parameter grammars.) |
GRMT | Grammar media type. For example, “GRMT=application/srgs+xml”. |
WGHT | Activation weight of a grammar. |
OSRVER | Recognizer version number. Logged only if ACST=1. |
SWIepst
This event is written for each recognition turn and signals that the endpointer has begun the attempt to detect the start of speech.
Token | Description |
---|---|
VERSION | NR version. |
SWIepss
The SWIepss and SWIepse events indicate the endpointer license usage at the beginning and end of a call to the system. SWIepss is triggered at the start of a session, and the tokens describe the count of licenses in use after incrementing for the new license.
Token | Description |
---|---|
LUSED | Licenses used. The current number of endpointer instances. |
LMAX | License maximum. The maximum number of available licenses. The number of licenses actually checked out and available for use by an endpointer instance. |
OMAX | Overdraft maximum. The number of available license ports (not including overdraft ports). |
LFEAT | License features. A comma-separated list showing which features are associated with the license. |
SWIrcnd
This event is logged at the end of a recognition and reports the outcome, including the return status, the n-best results, and timing statistics for the recognition.
Token | Description |
---|---|
RSTT | See Return codes. |
RENR | See Reasons for end of recognition. |
ENDR | See Reasons for end of speech. |
NBST | Number of n-best items. Used only if RSTT is “ok” or “lowconf”. |
RSLT | Parsed text for n-best item. |
SPOK | Normalized raw text for n-best item; set to the value of the SWI_spoken key. |
GRMR | Grammar for n-best item. |
KEYS | List of key/value pairs for the top result. |
CONF | Confidence value for n-best item. Values can range from 0 to 999. |
RAWS | Raw score for n-best item. |
SPIV | The second pass has been invoked. When the recognizer is “unsure” about the accuracy of the nbest list, it invokes a second pass through the data to help improve the accuracy. A second pass uses more CPU and may also presage a low-confidence recognition. |
SPAG | The second pass has not modified the result of the first pass. When the recognizer is “unsure” about the accuracy of the nbest list, it invokes a second pass through the data to help improve the accuracy. A second pass uses more CPU and may also presage a low-confidence recognition. |
MDVR | Model version—version stamp of models. Format is L.M.m.s, where L is language number, M is major version, m is minor version, and s is the set number. |
MPNM | Indicates the acoustic models used for generating the recognition result. Contains a comma separated list showing the language and acoustic model filenames used for first-pass recognition processing to get the top choice on the n-best list. Each list element has the format LangCode/Version/Path/Filename. (If there is no applicable value to report, a value of NA is used.) For example: MPNM=en.us/10.0.0/models/FirstPass/models.hmm,de.de/10.0.0/models/FirstPass/models.hmm |
DPNM | Root name of the diphone acoustic models used to recognize the top choice on the n-best list. (If there is no applicable value to report, a value of NA is used.) |
MACC | Filename of the statistics file (the monophone accumulator) that tuned the acoustic model used for the recognition event. |
MEDIA | An audio media type. For example, “MEDIA=audio/basic;rate:8000”. |
EOSS | End-of-speech signal: where in the input stream the endpointer wanted the recognizer to stop. |
DURS | Amount of speech processed by the recognizer in milliseconds. The value can sometimes exceed EOSS by small amounts. |
EOSD | How much speech data was passed to the endpointer before EOS was determined. This token helps determine latency due to endpointer decision-making (mostly end of speech timeout). If EOSD equals EOSS then something unusual caused the end-of-speech; for example, the maximum speech duration timer expired. |
BORT | Beginning of recognition time (when the recognizer first processed the signal). |
EOST | End-of-speech time in milliseconds. Clock time when the endpointer determined the end of caller speech; measured in real time from the arrival of the first packet; delays in the audio path are not counted. |
EORT | End-of-recognition time in milliseconds. Clock time when the results are ready. Measured in real time from the arrival of the first packet of the input stream. |
LA | Value of the swirec_load_adjusted_speedvsaccuracy parameter used for the recognition. Values include: idle, normal, busy, Xidle, Xnormal, Xbusy. “X” values indicate that the parameter specified that value. Values without “X” were determined at runtime with the parameter setting “on”. |
OFFS | For internal use only. Shows an offset value for acoustic models. For example, “OFFS=1.3”. |
SCAL | For internal use only. Shows a multiplier for acoustic scale. For example, “SCAL=5.5”. |
RCPU | Recognizer CPU time in milliseconds. Measures how much CPU was used for the recognition. |
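The timing tokens above can be used to estimate the latency a caller perceives. As a rough, hypothetical sketch: EOST is when the endpointer decided the caller stopped speaking and EORT is when results were ready, so their difference approximates the recognition delay after end of speech.

```python
def recognition_latency_ms(tokens: dict) -> int | None:
    """Approximate post-speech recognition latency from a SWIrcnd event.

    EORT (end-of-recognition time) minus EOST (end-of-speech time) is the
    time spent producing results after the caller stopped speaking; both are
    reported in milliseconds from the arrival of the first audio packet.
    """
    try:
        return int(tokens["EORT"]) - int(tokens["EOST"])
    except (KeyError, ValueError):
        return None  # token missing or non-numeric (e.g. no speech detected)

def has_nbest_result(tokens: dict) -> bool:
    """True when RSTT indicates an n-best result was produced (ok or lowconf)."""
    return tokens.get("RSTT") in ("ok", "lowconf")
```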
Return codes
Return code | Status |
---|---|
serr | A system error occurred. |
lowconf | There was an n-best result (including any possible decoys), but it was below the setting of the confidencelevel parameter. |
maxc | The maximum CPU time was reached (swirec_max_cpu_time). |
nomatch | There was no recognition match, and no n-best result. |
ok | Recognition was successful. There is an n-best result. |
stop | Recognizer received a stop request. |
Reasons for end of recognition
Return code | Status |
---|---|
count | The maximum number of sentences was reached. (The maximum is determined by internal algorithms; this is not swirec_max_sentences.) |
err | A system error occurred. |
maxc | The maximum CPU time was reached. |
maxsrch | Recognizer’s maximum allowed search time was reached. |
maxsent | The maximum number of sentences to try was reached. |
ok | Recognition was successful. There is an n-best result. |
prun | Stopped generating the n-best list. This can occur even if no n-best entries returned. One cause is that the pruning threshold was exceeded (swirec_state_beam). But typically, it simply means that there were no more hypotheses to consider. For example, this happens if requesting an n-best size of n but the grammar has fewer than n choices. It will also happen if the recognizer has found a compelling acoustic match so that all the other hypotheses are pruned in the first pass search. |
stop | Recognizer received a stop request. |
Reasons for end of speech
Return code | Status |
---|---|
ctimeout | The end of speech was detected (completetimeout was triggered). |
eeos | External end of speech. The audio sample sent to the recognizer was labeled as the last sample. |
itimeout | Normal end of speech. |
maxs | The maximum speech time was reached (maxspeechtimeout). |
nobos | No beginning of speech detected. |
SWIacum
This event is written whenever the Recognizer collects a statistic as part of its self-learning feature (acoustic adaptation).
Token | Description |
---|---|
MODNM | Name of the recognition model associated with the statistics. |
LANG | The language of the acoustic models associated with the statistics. |
SWIrslt
This event logs the complete XML recognition result at the end of a successful recognition (SWIrcnd) when a voice platform requests a result from Nuance Recognizer.
Token | Description |
---|---|
MEDIA | Media type of the result. |
CNTNT | XML result of the recognition. The exact format of the XML string depends on the voice platform (for example, the platform might request NLSML result format). |
SECURE=TRUE | Confidential information has been suppressed (removed) from the call log record. The token only appears when TRUE. |
SWIepms
This event signals that the external endpointer has finished attempting to detect the beginning of speech.
Token | Description |
---|---|
PD | The offset, in milliseconds, from when the prompt started playing to when it stopped (either due to barge-in or because it finished playing). This value is reset to -1 before the next prompt plays. If no barge-in occurs, this value reflects the total duration of the prompt that was played. |
BOS | The offset time, in milliseconds, at which the beginning of speech in the signal was detected, with some additional backoff. For the true start of speech, see the SOS value. When set to -1, this means that the endpointer timed out. |
SOS | The offset time, in milliseconds, at which the beginning of speech in the signal was detected. If SOS is set to -1, this means that the endpointer timed out. If SOS=PD this indicates that there was barge-in, because the prompt stopped at the start of speech. |
EOS | End of speech time. The default reset value is -1, meaning that the external endpointer did not find the end of speech. The -1 value is expected when the endpointer is in begin_only mode. |
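The SWIepms offsets can be combined to tell whether the caller barged in and roughly how long they spoke. The sketch below is purely illustrative and relies only on the PD, SOS, and EOS semantics described above.

```python
def barge_in_occurred(tokens: dict) -> bool:
    """A caller barged in when the prompt stopped at the detected start of speech (SOS=PD)."""
    pd = int(tokens.get("PD", -1))
    sos = int(tokens.get("SOS", -1))
    return sos != -1 and sos == pd

def speech_duration_ms(tokens: dict) -> int | None:
    """Milliseconds between detected start and end of speech, if both were found."""
    sos = int(tokens.get("SOS", -1))
    eos = int(tokens.get("EOS", -1))
    if sos == -1 or eos == -1:
        return None  # endpointer timed out, or ran in begin_only mode
    return eos - sos
```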
SWIendp
This event is written for every recognition attempt where start of speech is detected, whether or not the recognition was successful. It is not logged if there is no start of speech. The event is also triggered when the voice platform stops the endpointer.
Token | Description |
---|---|
SRC | This token, if present, is set to “SWIep.” |
BRGN | Boolean value, set to 1 if speech was detected while the prompt was playing, 0 if not. |
BTIM | Integer number of elapsed milliseconds between the first sample and the detection of speech, counted based on the duration of the samples passed into the endpointer. |
MODE | Input mode used: spch (caller used speech), dtmf (caller used DTMF), hangup (caller disconnected; some older systems logged this as empty), timeout (no speech detected before timeout), other (the voice browser requested a stop for an unknown reason). |
SWIepse
The SWIepss and SWIepse events indicate endpointer license usage at the beginning and end of a call to the system.
SWIepse indicates the duration a license was held for the call (reported by the LTIME token, in milliseconds); the event is triggered at the end of a session, and the tokens describe the count of licenses in use after the license is released.
Token | Description |
---|---|
LUSED | Licenses used. The current number of endpointer instances. |
LMAX | License maximum. The maximum number of available licenses. The number of licenses actually checked out and available for use by an endpointer instance. |
OMAX | Overdraft maximum. The number of available license ports (not including overdraft ports). |
LFEAT | License features. A comma-separated list showing which features are associated with the license. |
LTIME | License time. It shows the number of milliseconds that the license was held since the beginning of the call. |
SWIlise
The SWIliss and SWIlise events indicate recognizer license usage at the beginning and end of a call to the system.
SWIlise indicates the duration a license was held for the call (reported by the LTIME token, in milliseconds); the event is triggered at the end of a session, and the tokens describe the count of licenses in use after the license is released.
Token | Description |
---|---|
LUSED | Licenses used. The current number of recognizer instances. |
OMAX | Overdraft maximum. The number of available license ports (not including overdraft ports). |
LMAX | License maximum. The maximum number of available licenses. The number of licenses actually checked out and available for use by a recognizer instance. |
LFEAT | License features. A comma-separated list showing which features are associated with the license. |
LTIME | License time. It shows the number of milliseconds that the license was held since the beginning of the call. |
SWIlps
This event indicates the versions of data packs used for recognition during the session.
Token | Description |
---|---|
LANGVER | Concatenation of all languages (and their data pack versions) used during the session. |
NUANtnat
This event is written near the end of every call.
Token | Description |
---|---|
TNAT | The tenant name associated with the recognition. |
SWIclnd
This event indicates the end of a call to the system. It is triggered at the end of a session.
Token | Description |
---|---|
VALU | Session ID and status. |
SRC | Nuance component that issued the event. |