Dialog service gRPC API

DLGaaS provides three protocol buffer (.proto) files to define the Dialog service for gRPC:

The dlg_interface.proto file defines the main DialogService interface methods.
The dlg_messages.proto file defines the request and response messages for the main DialogService methods.
The dlg_common_messages.proto file defines other objects used in the fields of messages.

Once you have transformed the proto files into functions and classes in your programming language using gRPC tools, you can call these functions from your client application to start a conversation with a user, collect the user’s input, obtain the action to perform, and so on.

See Client app development for a scenario using Python that provides an overview of the different methods and messages used in a sample coffee-ordering application.

For other languages, consult the gRPC and protocol buffer documentation.

Tip:

Try out this API using a Sample Python runtime client.

Proto file structure

The proto files define the methods and message types for the API.


DialogService

The Dialog service contains six methods related to starting, executing, managing, and closing a conversation flow or dialog.

Dialog service methods description
Name	Request Type	Response Type	Description
Start	StartRequest	StartResponse	Starts a conversation. Returns a StartResponse object.
Status	StatusRequest	StatusResponse	Returns the status of a session. Returns gRPC status 0 (OK) if the session was found, or 5 (NOT_FOUND) if no session was found. Returns a StatusResponse object.
Update	UpdateRequest	UpdateResponse	Updates the state of a session, for example session variables, without advancing the conversation. Returns an UpdateResponse object.
Execute	ExecuteRequest	ExecuteResponse	Used to continuously interact with the conversation based on end user input or events. Returns an ExecuteResponse object that will contain data related to the dialog interactions and that can be used by the client to interact with the end user.
ExecuteStream	StreamInput stream	StreamOutput stream	Performs recognition on streamed audio using ASRaaS and provides speech synthesis using TTSaaS.
Stop	StopRequest	StopResponse	Ends a conversation and performs cleanup. Returns a StopResponse object.
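
As a minimal sketch of the Status semantics in the table above, the following maps the two documented gRPC status codes to a session state. The function name `session_exists` is hypothetical and not part of the API; a real client would read the status code from the error raised by its generated gRPC stub.

```python
# Numeric gRPC status codes named in the Status row of the table above.
GRPC_OK = 0
GRPC_NOT_FOUND = 5


def session_exists(grpc_status_code: int) -> bool:
    """Return True if the session was found, False if it was not.

    Any other status code is not described by the table, so it is
    surfaced as an error rather than guessed at.
    """
    if grpc_status_code == GRPC_OK:
        return True
    if grpc_status_code == GRPC_NOT_FOUND:
        return False
    raise RuntimeError(f"Unexpected gRPC status code: {grpc_status_code}")
```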

StartRequest

Request object used by the Start method.

Start request fields description
Field	Type	Description
session_id	string	Optional session ID. If not provided then one will be generated.
selector	common.Selector	Selector providing the channel and language used for the conversation.
payload	common.StartRequestPayload	Payload of the Start request.
session_timeout_sec	uint32	Session idle timeout limit (in seconds), after which the session is terminated. Maximum of 259200 (72 hours). The maximum is configurable in self-hosted deployments.
user_id	string	Identifies a specific user within the application. See User ID.
client_data	map<string,string>	Map of client-supplied key-value pairs to inject into the call log. Optional. Example: `"client_data": { "param1": "value1", "param2": "value2" }`

This method includes:

StartRequest
    session_id
    selector
        channel
        language
        library
    payload
        model_ref
            uri
            type
        data
        suppress_log_user_data
    session_timeout_sec
    user_id
    client_data
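
The field layout above can be illustrated with a plain dictionary, without the generated gRPC classes. This is only a sketch: `build_start_request` is a hypothetical helper, the `library` default of "default" is an assumption, and the local check against the documented 259200-second maximum is a client-side convenience, not a description of how the service validates the value.

```python
MAX_SESSION_TIMEOUT_SEC = 259200  # 72 hours, the documented maximum


def build_start_request(model_uri, channel, language, timeout_sec,
                        user_id=None, library="default",
                        max_timeout_sec=MAX_SESSION_TIMEOUT_SEC):
    """Build a dict shaped like StartRequest.

    session_id is left out on purpose so that one is generated.
    max_timeout_sec is a parameter because the maximum is configurable
    in self-hosted deployments.
    """
    if not 0 <= timeout_sec <= max_timeout_sec:
        raise ValueError(
            f"session_timeout_sec must be between 0 and {max_timeout_sec}"
        )
    request = {
        "selector": {
            "channel": channel,
            "language": language,
            "library": library,  # assumed default value
        },
        "payload": {
            "model_ref": {"uri": model_uri, "type": "APPLICATION_MODEL"},
        },
        "session_timeout_sec": timeout_sec,
    }
    if user_id is not None:
        request["user_id"] = user_id
    return request
```

With the generated classes, the same values would be passed to the StartRequest constructor instead of a dictionary.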

StartResponse

Response object used by the Start method.

Start response fields description
Field	Type	Description
payload	common.StartResponsePayload	Payload of the Start response. Contains session ID.

This method includes:

StartResponse
    payload
        session_id

StatusRequest

Request object used by the Status method. For more information about the Status method, see Step 5. Check session status.

Status request fields description
Field	Type	Description
session_id	string	ID for the session.

This method includes:

StatusRequest
    session_id

StatusResponse

Response object used by the Status method.

Status response fields description
Field	Type	Description
session_remaining_sec	uint32	Remaining session time to live (TTL) value in seconds, after which the session is terminated. Note: The TTL may be a few seconds off based on how long the round trip of the request took.
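
Because the reported TTL can be off by roughly the round-trip time, a client might bracket the expiry time rather than trust a single value. The sketch below assumes the client records timestamps (for example from `time.monotonic()`) just before sending the StatusRequest and just after receiving the StatusResponse; `session_expiry_bounds` is a hypothetical helper. The estimate only holds as long as no later request changes the session's remaining time.

```python
def session_expiry_bounds(session_remaining_sec, sent_at, received_at):
    """Return (earliest, latest) estimated expiry times for the session.

    The server computed the TTL at some unknown moment between sent_at
    and received_at, so the true expiry lies between the two bounds.
    All values are in seconds on the same clock.
    """
    if received_at < sent_at:
        raise ValueError("received_at must not be earlier than sent_at")
    earliest = sent_at + session_remaining_sec
    latest = received_at + session_remaining_sec
    return earliest, latest
```

A cautious client would treat the earliest bound as the deadline.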

This method includes:

StatusResponse
    session_remaining_sec

UpdateRequest

Request object used by the Update method. For more information about the Update method, see Step 6. Update session data.

Update request fields description
Field	Type	Description
session_id	string	ID for the session.
payload	common.UpdateRequestPayload	Payload of the Update request.
client_data	map<string,string>	Map of client-supplied key-value pairs to inject into the call log. Optional. Example: `"client_data": { "param1": "value1", "param2": "value2" }`
user_id	string	Identifies a specific user within the application. See User ID.

This method includes:

UpdateRequest
    session_id
    payload
    client_data
    user_id

UpdateResponse

Response object used by the Update method. Currently empty.

This method includes:

UpdateResponse

ExecuteRequest

Request object used by the Execute method.

Execute request fields description
Field	Type	Description
session_id	string	ID for the session.
selector	common.Selector	Selector providing the channel and language used for the conversation.
payload	common.ExecuteRequestPayload	Payload of the Execute request.
user_id	string	Identifies a specific user within the application. See User ID.

This method includes:

ExecuteRequest
    session_id
    selector
        channel
        language
        library
    payload
        user_input
            user_text
            interpretation
                confidence
                input_mode
                utterance
                data
                    key
                    value
                slot_literals
                    key
                    value
                slot_formatted_literals
                    key
                    value
                slot_confidences
                    key
                    value
                alternative_interpretations
            selected_item
                id
                value
            nluaas_interpretation
            asraas_result
            input_mode
        dialog_event
            type
            message
            event_name
        requested_data
            id
            data
    user_id

ExecuteResponse

Response object used by the Execute method. This object carries a payload, which instructs the client app to play messages to the user (as needed) and do one of the following:

Prompt for user input
Provide requested data
Fill time and keep the user engaged while the server side is fetching data
Transfer or end the conversation

The payload also includes references to ASR, NLU, TTS, and NR resources that can be used to orchestrate externally with these other services rather than having Dialog perform orchestration.

Execute response fields description
Field	Type	Description
payload	common.ExecuteResponsePayload	Payload of the Execute response.

This method includes:

ExecuteResponse
    payload
        messages
            nlg
                text
                mask
                barge_in_disabled
            visual
                text
                mask
                barge_in_disabled
            audio
                text
                uri
                mask
                barge_in_disabled
            view
                id
                name
            language
            tts_parameters
                voice
            channel
        qa_action
            message
                nlg
                    text
                    mask
                    barge_in_disabled
                visual
                    text
                    mask
                    barge_in_disabled
                audio
                    text
                    uri
                    mask
                    barge_in_disabled
                view
                    id
                    name
                language
                tts_parameters
                    voice
                channel
            data
            view
                id
                name
            selectable
                selectable_items
                    value
                        id
                        value
                    description
                    display_text
                    display_image_uri
            recognition_settings
                dtmf_mappings
                collection_settings
                speech_settings
                dtmf_settings
                input_modes
            mask
            orchestration_resource_reference
                grammar_references
                recognition_resources
                interpretation_resources
            recognition_init_resources
                recognition_init_message
                recognition_init
                dtmf_recognition_init
            language
        da_action
            id
            message
                nlg
                    text
                    mask
                    barge_in_disabled
                visual
                    text
                    mask
                    barge_in_disabled
                audio
                    text
                    uri
                    mask
                    barge_in_disabled
                view
                    id
                    name
                language
                tts_parameters
                    voice
                channel
            view
                id
                name
            message_settings
                delay
                minimum
            data
        escalation_action
            message
                nlg
                    text
                    mask
                    barge_in_disabled
                visual
                    text
                    mask
                    barge_in_disabled
                audio
                    text
                    uri
                    mask
                    barge_in_disabled
                view
                    id
                    name
                language
                tts_parameters
                    voice
                channel
            view
                id
                name
            data
            id
            escalation_settings
                type
                destination
        end_action
            data
            id
        continue_action
            message
                nlg
                    text
                    mask
                    barge_in_disabled
                visual
                    text
                    mask
                    barge_in_disabled
                audio
                    text
                    uri
                    mask
                    barge_in_disabled
                view
                    id
                    name
                language
                tts_parameters
                    voice
                channel
            message_settings
                delay
                minimum
            backend_connection_settings
                fetch_timeout
                connect_timeout
            view
                id
                name
            data
            id
        channel

StreamInput

Performs recognition on streamed audio using ASRaaS and requests speech synthesis using TTSaaS.

Warning:

You can send an input audio stream for recognition only if input was requested by the dialog application in a question and answer node. You cannot use the StreamInput method to provide an audio input if the previous response did not include a question and answer node. See Step 4b. Interact with the user using audio for details.

The field asr_control_v1 (and control_message if applicable) must be sent as part of the first StreamInput message in order for DLGaaS to chain the audio stream with ASRaaS. Audio is streamed, in order, segment by segment, over the course of the various StreamInput messages.

Stream input fields description
Field	Type	Description
request	ExecuteRequest	Standard DLGaaS ExecuteRequest. Used to continue the dialog interactions. Used on the first StreamInput only.
asr_control_v1	AsrParamsV1	Defines audio recognition parameters to be forwarded to the ASR service to initiate audio streaming. The contents of this message correspond to those of the recognition_init_message field used in the first message of the ASR input stream. Used on the first StreamInput only.
audio	bytes	A segment of the input speech audio in the selected encoding for recognition.
tts_control_v1	TtsParamsV1	Parameters to be forwarded to the TTS service.
control_message	nuance.asr.v1.ControlMessage	Optional input message to be forwarded to the ASR service. This corresponds to the optional control_message field used in the first message of the ASR input stream. ASR uses this message to start the recognition no-input timer if it was disabled by a stall_timers recognition flag in asr_control_v1. See the ASRaaS RecognitionRequest documentation for details. Used on the first StreamInput only.

This method includes:

StreamInput
    request Standard DLGaaS ExecuteRequest
    asr_control_v1
        audio_format
            pcm | alaw | ulaw | opus | ogg_opus
        utterance_detection_mode
            SINGLE | MULTIPLE | DISABLED
        recognition_flags
            auto_punctuate
            filter_profanity
            include_tokenization
            stall_timers
            etc.
        result_type
        no_input_timeout_ms
        recognition_timeout_ms
        utterance_end_silence_ms
        speech_detection_sensitivity
        max_hypotheses
        end_stream_no_valid_hypotheses
        resources
        speech_domain
        formatting
    audio
    tts_control_v1
        audio_params
            audio_format
            volume_percentage
            speaking_rate_percentage
            and so on
        voice
            name
            model
            and so on
    control_message
        start_timers_message
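
The ordering rule described above (control fields on the first StreamInput, audio segments afterwards) can be sketched as a generator. This is an illustration using plain dictionaries rather than the generated StreamInput class; `stream_inputs` is a hypothetical name, and sending audio only from the second message onward is an assumption about one reasonable way to satisfy the rule.

```python
from typing import Iterable, Iterator, Optional


def stream_inputs(execute_request: dict,
                  asr_control_v1: dict,
                  audio_segments: Iterable[bytes],
                  control_message: Optional[dict] = None) -> Iterator[dict]:
    """Yield dicts shaped like StreamInput messages, in sending order."""
    # First message: the ExecuteRequest and the ASR parameters, plus the
    # optional control message. These are used on the first message only.
    first = {"request": execute_request, "asr_control_v1": asr_control_v1}
    if control_message is not None:
        first["control_message"] = control_message
    yield first

    # Remaining messages: one audio segment each, in the original order.
    for segment in audio_segments:
        yield {"audio": segment}
```

With generated gRPC code, a generator of this shape would typically be passed as the request iterator of the ExecuteStream call.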

StreamOutput

Streams the requested TTS output and returns ASR results.

Stream output fields description
Field	Type	Description
response	ExecuteResponse	Standard DLGaaS ExecuteResponse; used to continue the dialog interactions. Used on the first StreamOutput only.
audio	nuance.tts.v1.SynthesisResponse	TTS output. See the TTSaaS SynthesisResponse documentation for details.
asr_result	nuance.asr.v1.Result	Output message containing the transcription result, including the result type, the start and end times, metadata about the transcription, and one or more transcription hypotheses. See the ASRaaS Result documentation for details. Used on the first StreamOutput only.
asr_status	nuance.asr.v1.Status	Output message indicating the status of the transcription. See the ASRaaS Status documentation for details. Used on the first StreamOutput only.
asr_start_of_speech	nuance.asr.v1.StartOfSpeech	Output message containing the start-of-speech message. See the ASRaaS StartOfSpeech documentation for details. Used on the first StreamOutput only.

This method includes:

StreamOutput
    response Standard DLGaaS ExecuteResponse
    audio
    asr_result
    asr_status
    asr_start_of_speech

StopRequest

Request object used by the Stop method.

Stop request fields description
Field	Type	Description
session_id	string	ID for the session.
user_id	string	Identifies a specific user within the application. See User ID.

This method includes:

StopRequest
    session_id
    user_id

StopResponse

Response object used by the Stop method. Currently empty; reserved for future use.

This method includes:

StopResponse

Fields reference

The following section contains additional details about the message types of fields used in the request and response messages.

AsrParamsV1

Parameters to be forwarded to the ASR service. See Step 4b. Interact with the user using audio for details.

ASR params fields description
Field	Type	Description
audio_format	nuance.asr.v1.AudioFormat	Audio codec type and sample rate. See the ASRaaS AudioFormat documentation for details.
utterance_detection_mode	nuance.asr.v1.EnumUtteranceDetectionMode	How end of utterance is determined. Defaults to SINGLE. See the ASRaaS EnumUtteranceDetectionMode documentation for details.
recognition_flags	nuance.asr.v1.RecognitionFlags	Flags to fine tune recognition. See the ASRaaS RecognitionFlags documentation for details.
result_type	nuance.asr.v1.EnumResultType	Whether final, partial, or immutable results are returned. See the ASRaaS EnumResultType documentation for details.
no_input_timeout_ms	uint32	Maximum silence, in milliseconds, allowed while waiting for user input after recognition timers are started. Default (0) means server default, usually no timeout. See the ASRaaS Timers documentation for details.
recognition_timeout_ms	uint32	Maximum duration, in milliseconds, of recognition turn. Default (0) means server default, usually no timeout. See the ASRaaS Timers documentation for details.
utterance_end_silence_ms	uint32	Minimum silence, in milliseconds, that determines the end of an utterance. Default (0) means server default, usually 500 ms or half a second. See the ASRaaS Timers documentation for details.
speech_detection_sensitivity	float	A balance between detecting speech and noise (breathing, etc.), from 0 to 1. 0 means ignore all noise, 1 means interpret all noise as speech. Default is 0.5. See the ASRaaS Timers documentation for details.
max_hypotheses	uint32	Maximum number of n-best hypotheses to return. Default (0) means a server default, usually 10 hypotheses.
end_stream_no_valid_hypotheses	bool	Determines whether the dialog application or the client application handles the dialog flow when ASRaaS does not return a valid hypothesis. When set to false (default), the dialog flow is determined by the Mix.dialog application, according to the processing defined for the NO_INPUT and NO_MATCH events. To configure the streaming request so that the stream is closed if ASRaaS does not return a valid hypothesis, set to `true`. See Client handling of ASR no valid hypotheses for details.
resources	nuance.asr.v1.RecognitionResource	Repeated. Resources (DLMs, wordsets, builtins) to improve recognition. See the ASRaaS RecognitionResource documentation for details.
speech_domain	string	Mapping to internal weight sets for language models in the data pack. Values depend on the data pack.
formatting	nuance.asr.v1.Formatting	Specifies how the transcription results are presented, using keywords for formatting schemes and options supported by the data pack. See ASRaaS Formatting for details.

BackendConnectionSettings

Settings configured for a data access node backend connection.

Backend connection settings fields description
Field	Type	Description
fetch_timeout	string	Number of milliseconds allowed for fetching the data before timing out.
connect_timeout	string	Connect timeout in milliseconds.

ContinueAction

Continue action provides the client application with information useful for handling latency or delays involved with a data access node using a backend data connection. The continue action prompts the client application to respond to initiate the data access. See Continue actions for more detail.

Continue action fields description
Field	Type	Description
message	Message	Latency message to be played to the user while waiting for the backend data access.
view	View	View details for this action.
data	google.protobuf.Struct	Map of data exchanged in this node.
id	string	ID identifying the Continue action node in the dialog application.
message_settings	MessageSettings	Settings to be used with the messages presented to the user.
backend_connection_settings	BackendConnectionSettings	Backend settings that will be used by DLGaaS for connecting to and fetching from the backend.

DAAction

A Data Access action is associated with a Data access node using client-side data access. It provides the client application with data needed to perform the data access as well as a message to play to the user while waiting for the data access to complete. See Data access actions for more detail.

DA action fields description
Field	Type	Description
id	string	ID identifying the Data Access node in the dialog application.
message	Message	Message to be played to the user while waiting for the data access to complete.
view	View	View details for this action.
data	google.protobuf.Struct	Map of data exchanged in this node.
message_settings	MessageSettings	Settings to be used with the messages played to the user.

DialogEvent

Message used to indicate an event that occurred during the dialog interactions.

Dialog event fields description
Field	Type	Description
type	DialogEvent.EventType	Type of event being triggered.
message	string	Optional message providing additional information about the event.
event_name	string	Name of custom event. Must be set to the name of the custom event defined in Mix.dialog. See Handling events for details. Applies only when DialogEvent.EventType is set to CUSTOM.

DialogEvent.EventType

The possible event types that can occur on the client side of interactions.

Dialog event event type fields description
Name	Number	Description
SUCCESS	0	Everything went as expected.
ERROR	1	An unexpected problem occurred.
NO_INPUT	2	End user has not provided any input.
NO_MATCH	3	End user provided unrecognizable input.
HANGUP	4	The end user session has been terminated by the user. This event is used both for IVR (caller hangup) and for digital channels (for example the user disconnecting from a chat session). In Mix.dialog, this event type triggers a UserDisconnect event.
CUSTOM	5	Custom event. You must set field `event_name` in DialogEvent to the name of the custom event defined in Mix.dialog.
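
The enum values above, together with the rule that a CUSTOM event needs an `event_name`, can be sketched with a standard-library enum. The class below only mirrors the documented names and numbers for illustration; a real client would use the enum from the generated code, and `build_dialog_event` is a hypothetical helper.

```python
from enum import IntEnum


class EventType(IntEnum):
    """Mirror of DialogEvent.EventType as listed in the table above."""
    SUCCESS = 0
    ERROR = 1
    NO_INPUT = 2
    NO_MATCH = 3
    HANGUP = 4
    CUSTOM = 5


def build_dialog_event(event_type, message="", event_name=""):
    """Build a dict shaped like DialogEvent."""
    if event_type is EventType.CUSTOM and not event_name:
        raise ValueError("event_name is required when type is CUSTOM")
    event = {"type": event_type.name}
    if message:
        event["message"] = message
    # event_name applies only to custom events.
    if event_type is EventType.CUSTOM:
        event["event_name"] = event_name
    return event
```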

EndAction

End action, which indicates that the dialog has ended. See End actions for more detail.

End action fields description
Field	Type	Description
data	google.protobuf.Struct	Map of data exchanged in this node.
id	string	ID identifying the End Action node in the dialog application.

EscalationAction

Escalation action to be performed by the client application. See Transfer actions for more detail.

Escalation action fields description
Field	Type	Description
message	Message	Message to be played as part of the escalation action.
view	View	View details for this action.
data	google.protobuf.Struct	Map of data exchanged in this node.
id	string	ID identifying the External Action node in the dialog application.
escalation_settings	EscalationSettings	Settings to configure the escalation transfer.

EscalationSettings

Settings to configure a transfer of the dialog, for example, to a live agent.

Escalation settings fields description
Field	Type	Description
type	string	Type of escalation transfer. Values can include “blind” and “route-request”. Empty if not set.
destination	string	Optional, provided if a specific transfer type is set. Destination for the transfer, for example, a phone number.

ExecuteRequestPayload

Payload sent with the Execute request. If both an event and a user input are provided, the event has precedence. For example, if an error event is provided, the input will be ignored.

Execute request payload fields description
Field	Type	Description
user_input	UserInput	Input provided to the Dialog engine.
dialog_event	DialogEvent	Used to pass in events that can drive the flow. Optional; if an event is not passed, the operation is assumed to be successful.
requested_data	RequestData	Data that was previously requested by the Dialog engine.
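
The precedence rule stated for this payload (an event wins over user input when both are present) can be modeled on the client side to check what a payload will effectively convey before it is sent. This sketch models the documented rule only, not the service's implementation; `effective_input` is a hypothetical name and the payload is a plain dictionary.

```python
def effective_input(payload: dict):
    """Return (field_name, value) for the field that takes effect.

    If both dialog_event and user_input are set, the event has
    precedence and the input is ignored.
    """
    if payload.get("dialog_event"):
        return "dialog_event", payload["dialog_event"]
    if payload.get("user_input"):
        return "user_input", payload["user_input"]
    return None, None
```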

ExecuteResponsePayload

Payload returned after the Execute method is called. Specifies the action to be performed by the client application.

Execute response payload fields description
Field	Type	Description
messages	Message	Repeated. Message action to be performed by the client application.
One of:
qa_action	QAAction	Question and answer action to be performed by the client application.
da_action	DAAction	Data access action to be performed by the client application in relation to data access node using client-side data connection.
escalation_action	EscalationAction	Escalation action to be performed by the client application.
end_action	EndAction	End action to be performed by the client application.
continue_action	ContinueAction	Continue action to be performed by the client application in relation to data access node using server-side data connection.
channel	string	Active channel for the action.
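
Since the payload carries at most one of the five actions, a client usually starts by finding out which one is set. The sketch below does this for a payload represented as a plain dictionary; `active_action` is a hypothetical helper. With generated protobuf messages, the `WhichOneof` method serves the same purpose.

```python
ACTION_FIELDS = (
    "qa_action",
    "da_action",
    "escalation_action",
    "end_action",
    "continue_action",
)


def active_action(payload: dict):
    """Return (field_name, action) for the action present in the payload.

    Returns (None, None) if the payload carries no action, and raises
    if more than one action field is present.
    """
    present = [name for name in ACTION_FIELDS if name in payload]
    if len(present) > 1:
        raise ValueError(f"Expected at most one action, found: {present}")
    if not present:
        return None, None
    return present[0], payload[present[0]]
```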

Message

Specifies the message to be played to the user. See Message actions for details.

Message fields description
Field	Type	Description
nlg	Message.Nlg	Repeated. Text to be played using text to speech.
visual	Message.Visual	Repeated. Text to be displayed to the user (for example, in a chat).
audio	Message.Audio	Repeated. Prompt to be played from an audio file.
view	View	View details for this message.
language	string	Message language in xx-XX format. For example, en-US.
tts_parameters	TTSParameters	Voice parameters for TTS to be used when TTSaaS is orchestrated separately from DLGaaS.
channel	string	Active channel for the message.

Message.Audio

Message audio details.

Message audio fields description
Field	Type	Description
text	string	Text to be used as TTS backup if the audio file cannot be played.
uri	string	URI to the audio file, in the following format: `<language>/prompts/<library>/<channel>/<filename>?version=<version>` For example: `en-US/prompts/default/Omni_Channel_VA/Message_ini_01.wav?version=1.0_1602096507331` See To provide speech response using recorded audio for more details on how the filename portion is generated.
mask	bool	When set to true, indicates that the text contains sensitive data that will be masked in logs.
barge_in_disabled	bool	When set to true, indicates that barge-in is disabled.
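
To make the URI format above concrete, here is a small sketch that splits such a URI into its documented parts. `parse_audio_uri` is a hypothetical helper, and the strict check on the number of path segments is an assumption based on the format string shown in the table.

```python
from urllib.parse import parse_qs


def parse_audio_uri(uri: str) -> dict:
    """Split <language>/prompts/<library>/<channel>/<filename>?version=<version>."""
    path, _, query = uri.partition("?")
    segments = path.split("/")
    if len(segments) != 5 or segments[1] != "prompts":
        raise ValueError(f"Unexpected audio URI format: {uri}")
    language, _, library, channel, filename = segments
    # parse_qs returns a list per key; take the first version value if any.
    version = parse_qs(query).get("version", [None])[0]
    return {
        "language": language,
        "library": library,
        "channel": channel,
        "filename": filename,
        "version": version,
    }
```

Applied to the example URI from the table, this yields language `en-US`, library `default`, channel `Omni_Channel_VA`, filename `Message_ini_01.wav`, and version `1.0_1602096507331`.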

Message.TTSParameters

Message TTS parameters.

TTS params fields description
Field	Type	Description
voice	Voice	TTSaaS voice to be used.

Message.TTSParameters.Voice

Message TTS voice details.

TTS params voice fields description
Field	Type	Description
name	string	The voice’s name, for example ‘Evan’. Mandatory for SynthesisRequest.
model	string	The voice’s quality model, for example ‘standard’ or ‘enhanced’. Mandatory for SynthesisRequest.
gender	EnumGender	Voice gender. Default ANY for SynthesisRequest.
language	string	Language associated with the voice in xx-XX format, for example en-US.
voice_type	string	TTS voice type, for example ‘neural’ or ‘standard’. Identifies the TTS engine to use.

Message.TTSParameters.Voice.EnumGender

TTSaaS voice gender.

TTS params voice gender enum description
Name	Number	Description
ANY	0	Any gender voice. Default for SynthesisRequest.
MALE	1	Male voice.
FEMALE	2	Female voice.
NEUTRAL	3	Neutral gender voice.

Message.Nlg

Text for text to speech.

Message nlg fields description
Field	Type	Description
text	string	Text to be played using text to speech.
mask	bool	When set to true, indicates that the text contains sensitive data that will be masked in logs.
barge_in_disabled	bool	When set to true, indicates that barge-in is disabled.

Message.Visual

Text to be displayed to the user.

Message visual fields description
Field	Type	Description
text	string	Text to be displayed to the user (for example, in a chat).
mask	bool	When set to true, indicates that the text contains sensitive data that will be masked in logs.
barge_in_disabled	bool	When set to true, indicates that barge-in is disabled.

MessageSettings

Settings to be used with latency messages returned by DAAction or ContinueAction.

Message settings fields description
Field	Type	Description
delay	string	Time in milliseconds to wait before presenting user with message.
minimum	string	Time in milliseconds to display/play message to user.
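
Both settings arrive as strings holding a number of milliseconds, so a client needs to convert them before scheduling a latency message. The sketch below assumes the strings contain integers and treats a missing or empty value as 0; both are assumptions, and `latency_message_timing` is a hypothetical helper.

```python
def latency_message_timing(message_settings: dict):
    """Return (delay_seconds, minimum_seconds) from a MessageSettings dict.

    delay:   how long to wait before presenting the message.
    minimum: how long the message should be displayed or played.
    """
    delay_ms = int(message_settings.get("delay") or 0)
    minimum_ms = int(message_settings.get("minimum") or 0)
    return delay_ms / 1000, minimum_ms / 1000
```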

QAAction

Question and answer action to be performed by the client application. See Question and answer actions for more details.

QA action fields description
Field	Type	Description
message	Message	Message to be played as part of the question and answer action.
data	google.protobuf.Struct	Map of data exchanged in this node.
view	View	View details for this action.
selectable	Selectable	Interactive elements to be displayed by the client app, such as clickable buttons or links. See Interactive elements for details.
recognition_settings	RecognitionSettings	Configuration information to be used during recognition when handled externally.
mask	bool	When set to true, indicates that the question and answer node is marked on the node level as sensitive, or is meant to collect an entity that will hold sensitive data to be masked in logs. Also true when logs are suppressed globally.
orchestration_resource_reference	OrchestrationResourceReference	References to ASR/NLU/NR resources to support external orchestration.
recognition_init_resources	RecognitionInitResources	ASR/NR parameters to configure speech or DTMF recognition. Dialog service does not populate this field in responses.
language	string	Language expected for user input.

RecognitionSettings

Configuration information to be used during recognition when handled externally.

Recognition settings fields description
Field	Type	Description
dtmf_mappings	DtmfMapping	Repeated. DTMF mappings configured in Mix.dialog.
collection_settings	CollectionSettings	Collection settings configured in Mix.dialog.
speech_settings	SpeechSettings	Speech settings configured in Mix.dialog.
dtmf_settings	DtmfSettings	DTMF settings configured in Mix.dialog.
input_modes	string	Repeated. Input modes configured in Mix.dialog for the presently selected channel. Possible values are “text”, “interactivity”, “voice”, and “dtmf”.

RecognitionSettings.CollectionSettings

Collection settings configured in Mix.dialog.

Recognition collection settings fields description
Field	Type	Description
timeout	string	Time, in milliseconds, to wait for speech once a prompt has finished playing before throwing a NO_INPUT event.
complete_timeout	string	Duration of silence, in milliseconds, to determine the user has finished speaking. The timer starts when the recognizer has a well-formed hypothesis.
incomplete_timeout	string	Duration of silence, in milliseconds, to determine the user has finished speaking. The timer starts when the user stops speaking.
max_speech_timeout	string	Maximum duration, in milliseconds, of an utterance collected from the user.

RecognitionSettings.DtmfMapping

DTMF mappings configured in Mix.dialog. See Set DTMF mappings for details.

Recognition DTMF mapping fields description
Field	Type	Description
id	string	Name of the entity to which the DTMF mapping applies.
value	string	Entity value to map to a DTMF key.
dtmf_key	string	DTMF key associated with this entity value. Valid values are: 0-9, *, #
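
When recognition is handled externally, a client can turn the repeated DTMF mappings into a lookup from the key pressed to the entity and value it stands for. This is a sketch over plain dictionaries; `dtmf_lookup` is a hypothetical helper, and it assumes each key appears at most once in the list (a later duplicate would overwrite an earlier one).

```python
def dtmf_lookup(dtmf_mappings):
    """Map each dtmf_key to an (entity_name, entity_value) tuple."""
    lookup = {}
    for mapping in dtmf_mappings:
        lookup[mapping["dtmf_key"]] = (mapping["id"], mapping["value"])
    return lookup
```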

RecognitionSettings.DtmfSettings

DTMF settings configured in Mix.dialog.

Recognition DTMF settings fields description
Field	Type	Description
inter_digit_timeout	string	Maximum time, in milliseconds, allowed between each DTMF character entered by the user.
term_timeout	string	Maximum time, in milliseconds, to wait for an additional DTMF character before terminating the input.
term_char	string	Character that terminates a DTMF input.

RecognitionSettings.SpeechSettings

Speech settings configured in Mix.dialog.

Recognition speech settings fields description
Field	Type	Description
sensitivity	string	Level of sensitivity to speech. 1.0 means highly sensitive to quiet input, while 0.0 means least sensitive to noise.
barge_in_type	string	Barge-in type; possible values: “speech” (interrupt a prompt by using any word) and “hotword” (interrupt a prompt by using a specific hotword).
speed_vs_accuracy	string	Desired balance between speed and accuracy. 0.0 means fastest recognition, while 1.0 means best accuracy.

RequestData

Data that was requested by the dialog application.

Request data fields description
Field	Type	Description
id	string	ID used by the dialog application to identify which node requested the data.
data	google.protobuf.Struct	Map of keys to JSON objects of the data requested.

ResourceReference

Reference object of the resource to use for the request (for example, the URN or URL of the model).

Resource reference fields description
Field	Type	Description
uri	string	Reference (for example, the URL or URN for the Dialog model).
type	ResourceReference.EnumResourceType	Type of resource.

ResourceReference.EnumResourceType

Resource reference resource type enum.

Resource reference type enum fields description
Name	Number	Description
APPLICATION_MODEL	0	Dialog application model.

Selectable

Interactive elements to be displayed by the client app, such as clickable buttons or links. See Interactive elements for details.

Selectable fields description
Field	Type	Description
selectable_items	Selectable.SelectableItem	Repeated. Ordered list of interactive elements.

Selectable.SelectableItem

Selectable item details.

Selectable item fields description
Field	Type	Description
value	Selectable.SelectableItem.SelectedValue	Key-value pair of entity information (name and value) for the interactive element. A selected key-value pair is passed in an ExecuteRequest when the user interacts with the element.
description	string	Description of the interactive element.
display_text	string	Label to display for this interactive element.
display_image_uri	string	URI of image to display for this interactive element.

Selectable.SelectableItem.SelectedValue

Selectable item value.

Selectable item value fields description
Field	Type	Description
id	string	Name of the entity being collected.
value	string	Entity value corresponding to the interactive element.
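
The Selectable messages above can be sketched from the client side as follows: the client displays the items in order, then sends back the SelectedValue of the item the user chose as the selected_item of a UserInput. This is an illustrative sketch using plain dictionaries rather than the generated classes; the function names, the COFFEE_TYPE entity, and the sample items are hypothetical.

```python
def render_menu(selectable):
    """Return one numbered display line per interactive element, in order."""
    return [
        f"{index}. {item['display_text']}"
        for index, item in enumerate(selectable["selectable_items"], start=1)
    ]


def selected_item_input(selectable, choice):
    """Build the user input for the item at the 1-based position `choice`."""
    items = selectable["selectable_items"]
    if not 1 <= choice <= len(items):
        raise IndexError(f"choice must be between 1 and {len(items)}")
    value = items[choice - 1]["value"]
    return {"selected_item": {"id": value["id"], "value": value["value"]}}


# Hypothetical Selectable content, for illustration only.
selectable = {
    "selectable_items": [
        {"value": {"id": "COFFEE_TYPE", "value": "espresso"}, "display_text": "Espresso"},
        {"value": {"id": "COFFEE_TYPE", "value": "latte"}, "display_text": "Latte"},
    ]
}
menu_lines = render_menu(selectable)
user_input = selected_item_input(selectable, 2)
```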

OrchestrationResourceReference

References to orchestration resources for ASR, NLU, and Nuance Recognizer (NR).

Orchestration Resource References fields description
Field	Type	Description
grammar_references	OrchestrationResourceReference.GrammarResourceReference	Repeated. Nuance Recognizer resource references.
recognition_resources	nuance.asr.v1.RecognitionResource	Repeated. ASR resource references.
interpretation_resources	nuance.nlu.v1.InterpretationResource	Repeated. NLU resource references.

OrchestrationResourceReference.GrammarResourceReference

Reference to Nuance Recognizer grammar.

Grammar Resource Reference fields description
Field	Type	Description
uri	string	Reference (for example, the URL or URN).
type	OrchestrationResourceReference.GrammarResourceReference.EnumResourceType	Type of resource.

OrchestrationResourceReference.GrammarResourceReference.EnumResourceType

Nuance Recognizer grammar type.

Grammar Resource Reference type enum fields description
Name	Number	Description
SPEECH_GRAMMAR	0	SRGS grammar for speech (xml, gram, and so on).
DTMF_GRAMMAR	1	SRGS grammar for DTMF (xml, gram, and so on).

RecognitionInitResources

RecognitionInitResources is used to pass ASR/NR parameters to configure speech or DTMF recognition. Dialog service does not populate this message in responses.

Recognition Init Resources fields description
Field	Type	Description
recognition_init_message	nuance.asr.v1.RecognitionInitMessage	ASR parameters and resources to configure speech recognition. Dialog service does not populate this field in responses. See the RecognitionInitMessage documentation for details.
recognition_init	nuance.nrc.v1.RecognitionInit	NR parameters and resources to configure speech recognition. Dialog service does not populate this field in responses. See the RecognitionInit documentation for details.
dtmf_recognition_init	nuance.nrc.v1.DTMFRecognitionInit	NR parameters and resources to configure DTMF recognition. Dialog service does not populate this field in responses. See the DTMFRecognitionInit documentation for details.

Selector

Provides channel and language used for the conversation. See Languages, channels, and modalities for details.

Selector fields description
Field	Type	Description
channel	string	Optional. Channel that this conversation is going to use (for example, WebVA). Note: Replace any spaces or slashes in the name of the channel with the underscore character (_).
language	string	Optional. Language to use for this conversation. This sets the language session variable. The format is xx-XX, for example, “en-US”.
library	string	Optional. Library to use for this conversation. Advanced customization reserved for future use. Always use the default value for now, which is `default`.
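
The channel naming rule and the language format described above can be applied before building a Selector. The sketch below is one possible approach, using a plain dictionary in place of the generated class. The function names and the sample channel name are hypothetical, and the regular expression is an assumption derived from the documented xx-XX format.

```python
import re

# Assumption: two lowercase letters, a hyphen, two uppercase letters (xx-XX).
LANGUAGE_PATTERN = re.compile(r"^[a-z]{2}-[A-Z]{2}$")


def normalize_channel(name):
    """Replace spaces and slashes in a channel name with underscores."""
    return re.sub(r"[ /]", "_", name)


def build_selector(channel, language, library="default"):
    """Build a Selector-shaped dictionary with a normalized channel name."""
    if not LANGUAGE_PATTERN.match(language):
        raise ValueError(f"language must use the xx-XX format, got {language!r}")
    return {
        "channel": normalize_channel(channel),
        "language": language,
        "library": library,
    }


# Hypothetical channel name, for illustration only.
selector = build_selector("Omni Channel/VA", "en-US")
```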

StartRequestPayload

Payload sent with the Start request.

Start request payload fields description
Field	Type	Description
model_ref	ResourceReference	Reference object for the Dialog model.
data	google.protobuf.Struct	Session variables data sent in the request as a map of key-value pairs.
suppress_log_user_data	bool	Set to true to disable logging for ASR, NLU, TTS, and Dialog.
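
To show how these fields fit together, the following sketch assembles a StartRequestPayload-shaped dictionary, including the nested ResourceReference. It is illustrative only: with the generated stubs you would populate the corresponding message classes instead. The function name, the URN, and the session variable are placeholders, not real values.

```python
def build_start_payload(model_uri, session_data=None, suppress_log_user_data=False):
    """Build a dictionary that mirrors the StartRequestPayload fields."""
    return {
        # ResourceReference: uri plus the APPLICATION_MODEL resource type.
        "model_ref": {"uri": model_uri, "type": "APPLICATION_MODEL"},
        # Session variables, sent as a map of key-value pairs.
        "data": dict(session_data or {}),
        "suppress_log_user_data": suppress_log_user_data,
    }


# Placeholder URN and hypothetical session variable, for illustration only.
start_payload = build_start_payload(
    "urn:example:placeholder-dialog-model",
    session_data={"userName": "Alex"},
)
```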

StartResponsePayload

Payload returned after the Start method is called. If a session ID is not provided in the request, a new one is generated and should be used for subsequent calls.

Start response payload fields description
Field	Type	Description
session_id	string	Returns session ID to use for subsequent calls.

UpdateRequestPayload

Payload sent with the Update request.

Update request payload fields description
Field	Type	Description
data	google.protobuf.Struct	Map of key-value pairs of session variables to update.

TtsParamsv1

Parameters to be forwarded to the TTS service. See Step 4b. Interact with the user (using audio) for details.

TTS params fields description
Field	Type	Description
audio_params	nuance.tts.v1.AudioParameters	Output audio parameters, such as encoding and volume. See the TTSaaS AudioParameters documentation for details.
voice	nuance.tts.v1.Voice	The voice to use for audio synthesis. See the TTSaaS Voice documentation for details.

UserInput

Provides input to the Dialog engine. The client application sends either the text collected from the user, to be interpreted by Mix, or an interpretation that was performed externally.

Note: Provide only one of the following fields: user_text, interpretation, selected_item, nluaas_interpretation, asraas_result.

User input fields description
Field	Type	Description
One of:
user_text	string	Text collected from end user.
interpretation	UserInput.Interpretation	Interpretation that was done externally (for example, Nuance Recognizer for VoiceXML). This can be used for simple interpretations that include entities with string values only. Use `nluaas_interpretation` for interpretations that include complex entities.
selected_item	Selectable.SelectableItem.SelectedValue	Value of element selected by end user.
nluaas_interpretation	nuance.nlu.v1.InterpretResult	Interpretation that was done externally (for example, Nuance Recognizer for VoiceXML), provided in the NLUaaS InterpretResult format. See Interpreting text user input for an example. Note that DLGaaS currently only supports single intent interpretations.
asraas_result	nuance.asr.v1.Result	Speech recognition that was done externally, provided in the ASRaaS result format.
input_mode	string	Optional. Input mode. Used for reporting. Current values are dtmf/voice. Applies to user_text and nluaas_interpretation input only.
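
The rule that only one of these input fields may be provided can be checked on the client before sending the request. If the fields are declared as a protobuf oneof, setting a second member on a generated message clears the first instead of raising an error, so an explicit check can surface mistakes earlier. The sketch below uses plain dictionaries; the function name and sample text are hypothetical.

```python
ONE_OF_FIELDS = (
    "user_text",
    "interpretation",
    "selected_item",
    "nluaas_interpretation",
    "asraas_result",
)


def build_user_input(input_mode=None, **fields):
    """Build a UserInput-shaped dictionary with exactly one input field set."""
    unknown = set(fields) - set(ONE_OF_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    provided = [name for name in ONE_OF_FIELDS if fields.get(name) is not None]
    if len(provided) != 1:
        raise ValueError(f"provide exactly one input field, got {provided}")
    user_input = {provided[0]: fields[provided[0]]}
    if input_mode is not None:
        user_input["input_mode"] = input_mode
    return user_input


# Hypothetical user text, for illustration only.
text_input = build_user_input(user_text="I want a large latte", input_mode="voice")
```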

UserInput.Interpretation

Sends interpretation data.

User input interpretation fields description
Field	Type	Description
confidence	float	Required. Value from 0 to 1 that indicates the confidence of the interpretation.
input_mode	string	Optional. Input mode. Current values are dtmf and voice, but the input mode is not limited to these.
utterance	string	Raw collected text.
data	UserInput.Interpretation.DataEntry	Repeated. Data from the interpretation of intents and entities. For example, INTENT:BILL_PAY or AMOUNT:100.
slot_literals	UserInput.Interpretation.SlotLiteralsEntry	Repeated. Slot literals from the interpretation of the entities. The slot literal provides the exact words used by the user. For example, AMOUNT: One hundred dollars.
slot_formatted_literals	UserInput.Interpretation.SlotFormattedLiteralsEntry	Repeated. Slot formatted literals from the interpretation of the entities.
slot_confidences	UserInput.Interpretation.SlotConfidencesEntry	Repeated. Slot confidences from the interpretation of the entities.
alternative_interpretations	UserInput.Interpretation	Repeated. Alternative interpretations possible from the interaction, that is, n-best list.
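
The following sketch assembles an externally produced interpretation using the BILL_PAY and AMOUNT examples from the table above. It assumes that the repeated Entry fields behave as key-value maps (each Entry type has a key and a value) and represents them as plain dictionaries. The function name, the utterance, and the confidence values are hypothetical.

```python
def build_interpretation(utterance, confidence, intent, entities, input_mode="voice"):
    """Build an Interpretation-shaped dictionary.

    `entities` maps an entity name to a (value, literal, confidence) tuple.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be between 0 and 1, got {confidence}")
    data = {"INTENT": intent}
    slot_literals = {}
    slot_confidences = {}
    for name, (value, literal, slot_confidence) in entities.items():
        data[name] = value  # interpreted entity value
        slot_literals[name] = literal  # exact words used by the user
        slot_confidences[name] = slot_confidence
    return {
        "confidence": confidence,
        "input_mode": input_mode,
        "utterance": utterance,
        "data": data,
        "slot_literals": slot_literals,
        "slot_confidences": slot_confidences,
    }


# Hypothetical utterance and confidence values, for illustration only.
interpretation = build_interpretation(
    utterance="I want to pay one hundred dollars on my bill",
    confidence=0.9,
    intent="BILL_PAY",
    entities={"AMOUNT": ("100", "One hundred dollars", 0.8)},
)
```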

UserInput.Interpretation.DataEntry

Data entry from interpretation of user input.

User input interpretation data entry fields description
Field	Type	Description
key	string	Key of the data.
value	string	Value of the data.

UserInput.Interpretation.SlotConfidencesEntry

Slot confidence entry from interpretation of user input.

User input interpretation slot confidence fields description
Field	Type	Description
key	string	Name of the entity.
value	float	Value from 0..1 that indicates the confidence of the interpretation for this entity.

UserInput.Interpretation.SlotLiteralsEntry

Slot literal entry from interpretation of user input.

User input interpretation slot literal fields description
Field	Type	Description
key	string	Name of the entity.
value	string	Literal value of the entity.

UserInput.Interpretation.SlotFormattedLiteralsEntry

Slot formatted literal entry from interpretation of user input.

User input interpretation slot formatted literal fields description
Field	Type	Description
key	string	Name of the entity.
value	string	Formatted literal value of the entity.

View

Specifies view details for this action.

View fields description
Field	Type	Description
id	string	Class or CSS defined for the view details in the node.
name	string	Type defined for the view details in the node.

Scalar value types

The data types in the proto files are mapped to equivalent types in the generated client stub files.

Scalar data types
Proto	Notes	C++	Java	Python
double		double	double	float
float		float	float	float
int32	Uses variable-length encoding. Inefficient for encoding negative numbers. If your field is likely to have negative values, use sint32 instead.	int32	int	int
int64	Uses variable-length encoding. Inefficient for encoding negative numbers. If your field is likely to have negative values, use sint64 instead.	int64	long	int/long
uint32	Uses variable-length encoding.	uint32	int	int/long
uint64	Uses variable-length encoding.	uint64	long	int/long
sint32	Uses variable-length encoding. Signed int value. These encode negative numbers more efficiently than regular int32s.	int32	int	int
sint64	Uses variable-length encoding. Signed int value. These encode negative numbers more efficiently than regular int64s.	int64	long	int/long
fixed32	Always four bytes. More efficient than uint32 if values are often greater than 2^28.	uint32	int	int
fixed64	Always eight bytes. More efficient than uint64 if values are often greater than 2^56.	uint64	long	int/long
sfixed32	Always four bytes.	int32	int	int
sfixed64	Always eight bytes.	int64	long	int/long
bool		bool	boolean	bool
string	A string must always contain UTF-8 encoded or 7-bit ASCII text.	string	String	str/unicode
bytes	May contain any arbitrary sequence of bytes.	string	ByteString	str (Python 2) or bytes (Python 3)
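
The notes on variable-length encoding can be illustrated with a small sketch of the protocol buffer wire format. It shows why a negative int32 is costly (it is sign-extended to 64 bits before encoding) and why sint32, which uses ZigZag encoding, is more compact for negative values. This is for illustration only: the generated stubs handle encoding for you, and the function names here are hypothetical.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F  # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(byte)
            return bytes(out)


def encode_int32(value):
    """Encode an int32; negative values are sign-extended to 64 bits."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def zigzag32(value):
    """Map signed 32-bit integers to unsigned so small magnitudes stay small."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def encode_sint32(value):
    """Encode a sint32 using ZigZag followed by varint encoding."""
    return encode_varint(zigzag32(value))


int32_size = len(encode_int32(-1))
sint32_size = len(encode_sint32(-1))
```

With this sketch, -1 takes 10 bytes as an int32 but 1 byte as a sint32. Similarly, a varint needs five bytes once a value reaches 2^28, which is why fixed32 (always four bytes) is more efficient for values that are often that large.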
