Response resources for external orchestration

The QAAction message in an ExecuteResponse contains parameters, resources, and resource references useful for client-side orchestration with Mix services. This includes resources for TTSaaS, ASRaaS, NLUaaS, and NRaaS.

TTSaaS voice parameters

If the TTS modality is defined for the Q&A node, each message in the messages field of a QAAction message includes TTS voice parameters in its tts_parameters field. These are the TTS voice parameters configured in Mix.dialog: the voice name, language, and voice type. The voice type can be standard or neural.

The nlg field of each message contains the message text that can be sent to TTSaaS.

For more details on orchestrating with TTSaaS from the client app, see TTS with orchestration by client app.
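As a sketch of how a client might use these fields, the following pairs each message's NLG text with its configured voice before calling TTSaaS. It assumes the gRPC response has been converted to a plain dict (for example, with protobuf's MessageToDict); the inner key names (ttsParameters, voice, name, language, model) are illustrative assumptions, not the exact proto schema.

```python
# Sketch: collect text + voice pairs from a QAAction for TTSaaS.
# The dict shape below is an assumption for illustration.

def extract_tts_requests(qa_action: dict) -> list[dict]:
    """Pair each message's NLG text with its configured TTS voice."""
    requests = []
    for message in qa_action.get("messages", []):
        voice = message.get("ttsParameters", {}).get("voice", {})
        for segment in message.get("nlg", []):
            requests.append({
                "text": segment.get("text", ""),
                "voice_name": voice.get("name"),    # configured in Mix.dialog
                "language": voice.get("language"),
                "voice_type": voice.get("model"),   # "standard" or "neural"
            })
    return requests

# Minimal, hypothetical QAAction payload:
qa_action = {
    "messages": [{
        "nlg": [{"text": "What type of coffee would you like?"}],
        "ttsParameters": {
            "voice": {"name": "Evan", "language": "en-US", "model": "neural"},
        },
    }],
}
print(extract_tts_requests(qa_action))
```

Each entry in the returned list then has everything needed to build one synthesis request to TTSaaS.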

Orchestration resource reference object

Self-hosted environments: Using the orchestration_resource_reference field requires version 1.4.0 (or later) of the Dialog service. This corresponds to engine pack 2.3 for Speech Suite deployments and engine pack 3.11 for self-hosted Mix deployments.

The orchestration_resource_reference field of a QAAction message can include multiple references to ASR recognition resources, NLU interpretation resources, and NR speech and DTMF grammars.

This can include the following types of resources:

  • ASRaaS recognition resources
    • DLM reference, including weight
    • Inline wordset
    • Compiled wordset reference, including weight
  • NLUaaS interpretation resources
    • NLU model reference
    • Inline wordset
    • Compiled wordset reference
  • NRaaS grammar references
    • Speech grammar reference
    • DTMF grammar reference

The resources returned within orchestration_resource_reference depend on the resources that have been made available to the current session of your Dialog application. The resources are pulled in from different sources:

  • Built Mix resources deployed in the same app configuration as the dialog model, such as the associated NLU model and (main) DLM
  • Compiled resources passed in earlier in the session via an ExternalResourceReferences variable, such as additional DLMs and compiled wordsets
  • Dynamic entity data variables passed in earlier in the session, used for inline wordsets
  • Settings configured in Mix.dialog, such as the main DLM weight and grammar references

By default, orchestration_resource_reference returns references to the NLU model and/or main DLM in the same app configuration (with the same context tag) as the dialog model.
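To orchestrate with the downstream services, a client typically sorts these references into per-service groups. The sketch below does this over a dict representation of the response; the type labels and reference fields are illustrative assumptions chosen to mirror the resource types listed above, not the exact proto enum values.

```python
# Sketch: partition orchestration_resource_reference entries into
# buckets for ASRaaS, NLUaaS, and NRaaS. Type names are assumptions.

def partition_resources(refs: list[dict]) -> dict[str, list[dict]]:
    """Group resource references by the service that consumes them."""
    buckets = {"asr": [], "nlu": [], "nr": []}
    for ref in refs:
        rtype = ref.get("type", "")
        if rtype in ("DLM", "ASR_WORDSET", "COMPILED_ASR_WORDSET"):
            buckets["asr"].append(ref)      # recognition resources
        elif rtype in ("NLU_MODEL", "NLU_WORDSET", "COMPILED_NLU_WORDSET"):
            buckets["nlu"].append(ref)      # interpretation resources
        elif rtype in ("SPEECH_GRAMMAR", "DTMF_GRAMMAR"):
            buckets["nr"].append(ref)       # NR grammar references
    return buckets

# Hypothetical references, as they might appear after MessageToDict:
refs = [
    {"type": "DLM", "uri": "urn:example:coffee-app/mix.asr", "weight": "MEDIUM"},
    {"type": "NLU_MODEL", "uri": "urn:example:coffee-app/mix.nlu"},
    {"type": "DTMF_GRAMMAR", "uri": "urn:example:coffee-app/menu_dtmf"},
]
print(partition_resources(refs))
```

The ASR bucket (including any DLM weights) would feed the recognition request, the NLU bucket the interpretation request, and the NR bucket the grammar-based recognition request.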

Other orchestration resources

A QAAction message also includes other resources useful for orchestration.

The language field of a QAAction message indicates the language in which user input is expected for the turn. If you are collecting speech input and orchestrating client side with ASRaaS, you will need to provide this language to ASRaaS.

A Selectable message (selectable field) includes options for selecting an entity value from a list, for example, in an interactive application.

A RecognitionSettings message (recognition_settings field) includes settings for configuring both voice and DTMF input modalities:

  • Collection settings: settings related to collecting speech input
  • Speech settings: settings related to recognizing speech input
  • DTMF mappings: mappings between touchtone key presses and model entity values
  • DTMF settings: settings related to collecting DTMF input
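When orchestrating speech collection client side, the language field and the collection settings can be folded into the parameters of an ASRaaS recognition request. The sketch below assumes a dict-converted QAAction; the setting names (collectionSettings, timeout) and the default values are illustrative assumptions.

```python
# Sketch: derive ASR request parameters from a QAAction's language
# and recognition settings. Field names here are assumptions.

def build_asr_params(qa_action: dict) -> dict:
    """Map QAAction language and collection settings to ASR parameters."""
    settings = qa_action.get("recognitionSettings", {})
    collection = settings.get("collectionSettings", {})
    return {
        # Language the user is expected to speak this turn
        "language": qa_action.get("language", "en-US"),
        # No-input timeout for the collection, as an integer in ms
        "no_input_timeout_ms": int(collection.get("timeout", "7000")),
    }

# Hypothetical QAAction fragment:
qa = {
    "language": "fr-CA",
    "recognitionSettings": {"collectionSettings": {"timeout": "5000"}},
}
print(build_asr_params(qa))
```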

Input modalities and orchestration resources returned

Self-hosted environments: This feature requires version 1.5.0 (or later) of the Dialog service. This corresponds to engine pack 2.4 for Speech Suite deployments and engine pack 3.11 for self-hosted Mix deployments. Filtering will not happen for engine packs that do not support this version of Dialog.

The orchestration resources and settings that come back in the QAAction depend on the input modalities that are configured for the current node. Only the resources relevant to processing the input modalities for the current channel are returned.

The following table shows the resources returned for each input modality. If more than one input modality is configured for a channel, the combination of the resources for each is returned.

Resources and settings returned for input modes

  Input modality   Resources and settings returned
  Text             NLU model
  Interactivity    Selectable
  Voice            NLU model, DLMs, speech settings, collection settings, speech grammars
  DTMF             DTMF mappings, DTMF settings, DTMF grammars
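The combination rule in the table can be expressed as a simple union over per-modality sets. This is only a restatement of the table for illustration; the modality and resource names are labels from this document, not API identifiers.

```python
# Sketch: expected resources for a set of configured input modalities,
# mirroring the table above (labels are documentation terms, not API names).

MODALITY_RESOURCES = {
    "Text": {"NLU model"},
    "Interactivity": {"Selectable"},
    "Voice": {"NLU model", "DLMs", "speech settings",
              "collection settings", "speech grammars"},
    "DTMF": {"DTMF mappings", "DTMF settings", "DTMF grammars"},
}

def expected_resources(modalities: list[str]) -> set[str]:
    """Union of the resources returned for each configured modality."""
    out: set[str] = set()
    for modality in modalities:
        out |= MODALITY_RESOURCES.get(modality, set())
    return out

print(expected_resources(["Voice", "DTMF"]))
```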

Contacting the other services and continuing the dialog flow

For details about how to use these resources to send requests to the services, see the documentation for ASRaaS, NLUaaS, TTSaaS, and NRaaS.

To continue the flow of the dialog, you need to send an ExecuteRequest to DLGaaS including an intent inferred from the user input, or, alternatively, a speech recognition transcript in Nuance ASR format. For more details, see Handling user input externally.
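As a final sketch, the payload of such an ExecuteRequest might be assembled as below, returning the externally inferred intent to DLGaaS. The key names (userInput, interpretation, data, INTENT) are illustrative assumptions about the dict form of the request, not a definitive schema.

```python
# Sketch: build an ExecuteRequest-style payload carrying an intent
# inferred from external interpretation. Key names are assumptions.

def build_execute_payload(session_id: str, intent: str,
                          confidence: float) -> dict:
    """Wrap an externally inferred intent as dialog user input."""
    return {
        "sessionId": session_id,
        "payload": {
            "userInput": {
                "interpretation": {
                    "confidence": confidence,
                    "data": {"INTENT": intent},
                },
            },
        },
    }

# Hypothetical usage after client-side ASR + NLU:
payload = build_execute_payload("session-123", "ORDER_COFFEE", 0.92)
print(payload)
```

Alternatively, per the paragraph above, the client can pass the raw speech recognition transcript in Nuance ASR format instead of an inferred intent.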