Wordset service messages

The Wordset gRPC API contains methods for compiling wordsets.

NLU as a Service provides a set of protocol buffer (.proto) files to define a gRPC Wordset service. These files allow you to use large wordsets with your NLUaaS applications:

  • nuance/nlu/wordset/v1beta1/wordset.proto contains the methods and messages needed to work with large wordsets in NLU applications.
  • nuance/nlu/common/v1beta1/resource.proto and nuance/nlu/common/v1beta1/job.proto define messages for referencing external resources (NLU models and wordsets) and for job status updates.
  • nuance/rpc/status.proto, nuance/rpc/status_code.proto, and nuance/rpc/error_details.proto define messages for RPC status and error codes.

Wordset proto file structure

The file wordset.proto defines a Wordset service with several RPC methods for creating and managing wordsets.

  [Figure: Wordset proto file structure]

For the nuance.nlu.common.v1beta1 messages, see Common messages.

For the nuance.rpc.Status and related messages, see RPC status messages.

Job status vs. request status

The responses in this API include two types of status updates related to the requested task:

  • A Job status update refers to the condition of the job that is compiling the wordset. Its values are set in JobStatus and can be JOB_STATUS_PROCESSING, JOB_STATUS_COMPLETE, or JOB_STATUS_FAILED.
  • A Request status refers to the condition of the gRPC request. Its values are set in nuance.rpc.StatusCode and can be OK, INVALID_REQUEST, NOT_FOUND, ALREADY_EXISTS, BAD_REQUEST, and so on.
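The distinction between the two statuses can be sketched as a small decision function. This is an illustrative sketch only: the `Outcome` enum and `classify` function are hypothetical helpers, and the statuses are modeled as plain strings rather than the enum values that generated client stubs would expose.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical summary of a wordset task, combining both status types."""
    IN_PROGRESS = "in progress"
    SUCCEEDED = "succeeded"
    JOB_FAILED = "job failed"
    REQUEST_FAILED = "request failed"


def classify(job_status, request_code):
    """Combine the latest job status and request status code into one outcome.

    Both arguments are plain strings (or None when not yet received).
    """
    # A non-OK request status means the gRPC request itself failed,
    # regardless of what happened to the compilation job.
    if request_code is not None and request_code != "OK":
        return Outcome.REQUEST_FAILED
    # The request was fine (or is still open): look at the job itself.
    if job_status == "JOB_STATUS_FAILED":
        return Outcome.JOB_FAILED
    if job_status == "JOB_STATUS_COMPLETE":
        return Outcome.SUCCEEDED
    return Outcome.IN_PROGRESS


print(classify("JOB_STATUS_PROCESSING", None))
print(classify("JOB_STATUS_FAILED", "OK"))
print(classify(None, "ALREADY_EXISTS"))
```

Note that a request can report OK while the job reports JOB_STATUS_FAILED: the request was accepted and handled, but the compilation did not succeed.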

Wordset service

The Wordset service offers three RPC methods to compile and manage wordsets.

Wordset service methods
| Name | Request Type | Response Type | Description |
| --- | --- | --- | --- |
| CompileWordsetAndWatch | CompileWordsetRequest | WatchJobStatusResponse stream | Submit and watch for job completion (server streaming) |
| GetWordsetMetadata | GetWordsetMetadataRequest | GetWordsetMetadataResponse | Get a compiled wordset's metadata (unary) |
| DeleteWordset | DeleteWordsetRequest | DeleteWordsetResponse | Delete the compiled wordset (unary) |

CompileWordsetAndWatch

  [Figure: Process flow of CompileWordsetAndWatch method]

This RPC method submits a request to compile a wordset and returns streaming messages from the server until the job completes. It consists of a CompileWordsetRequest request and a WatchJobStatusResponse response.

Submitting the wordset starts a batch compilation job. The server then streams WatchJobStatusResponse progress notifications for as long as the job runs, and ends the stream with a final job status.

The WatchJobStatusResponse stream carries two types of notifications: job status updates and a request status.

The JobStatusUpdate notifications that arrive in the stream of WatchJobStatusResponse messages include the following:

  • Job ID (not used by NLUaaS; this field is always empty).
  • One or more job status responses with JOB_STATUS_PROCESSING. The same status may be returned multiple times. Repeated notifications also keep the process alive.
  • A final job status of JOB_STATUS_COMPLETE or JOB_STATUS_FAILED, with error messages when appropriate.

The final WatchJobStatusResponse message includes a request status. This contains a gRPC StatusCode signalling either OK for successful requests, or INVALID_REQUEST, ALREADY_EXISTS, BAD_REQUEST, and so on for unsuccessful requests. This status refers to the gRPC request itself, not the wordset compile job that was created. See Job status vs. request status.
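The flow described above can be sketched as a loop over the response stream. This is a minimal sketch, not client code for the real stubs: the responses are stand-in objects built with `SimpleNamespace`, the function `watch_compile_job` and the generator `fake_stream` are hypothetical, and the inner field names `status` and `status_code` are assumptions (see Common messages and RPC status messages for the actual definitions).

```python
from types import SimpleNamespace


def watch_compile_job(responses):
    """Consume WatchJobStatusResponse-like objects until the stream ends.

    Returns (last_job_status, request_code, processing_updates).
    """
    last_job_status = None
    request_code = None
    processing_updates = 0
    for response in responses:
        # Job status updates describe the compilation job.
        update = response.job_status_update
        if update is not None:
            last_job_status = update.status  # assumed field name
            if update.status == "JOB_STATUS_PROCESSING":
                processing_updates += 1
        # The request status describes the gRPC request; it is expected
        # only in the final streamed response.
        status = response.request_status
        if status is not None:
            request_code = status.status_code  # assumed field name
    return last_job_status, request_code, processing_updates


def fake_stream():
    """Stand-in for the server stream: two progress updates, then the final message."""
    processing = SimpleNamespace(status="JOB_STATUS_PROCESSING")
    yield SimpleNamespace(job_status_update=processing, request_status=None)
    yield SimpleNamespace(job_status_update=processing, request_status=None)
    yield SimpleNamespace(
        job_status_update=SimpleNamespace(status="JOB_STATUS_COMPLETE"),
        request_status=SimpleNamespace(status_code="OK"),
    )


result = watch_compile_job(fake_stream())
print(result)
```

With real protobuf messages, an unset sub-message is not `None`; a client would typically test for its presence with `HasField` instead.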

CompileWordsetRequest

Request to compile a wordset related to a dynamic list entity.

CompileWordsetRequest fields
| Field | Type | Description |
| --- | --- | --- |
| wordset | string | Mandatory. Inline wordset JSON resource. Note: 4 MB request size limit. |
| companion_artifact_reference | nuance.nlu.common.v1beta1.ResourceReference | Mandatory. URN reference to the NLU model in which the entity is defined. |
| target_artifact_reference | nuance.nlu.common.v1beta1.ResourceReference | Mandatory. URN reference to use with the compiled wordset. This URN is used later to reference the wordset as an interpretation resource. |
| metadata | map<string,string> | Client-supplied key,value pairs to associate with the artifact. Keys can only contain lowercase characters. |
| client_data | map<string,string> | Client-supplied key,value pairs to inject into the logs. |

This message includes:

CompileWordsetRequest
  wordset
  companion_artifact_reference (nuance.nlu.common.v1beta1.ResourceReference)
  target_artifact_reference (nuance.nlu.common.v1beta1.ResourceReference)
  metadata
  client_data
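The constraints in the field table can be checked on the client side before a request is sent. The sketch below is illustrative only: `validate_compile_request` is a hypothetical helper, the references are treated as plain URN strings, the 4 MB limit is assumed to mean 4 * 1024 * 1024 bytes and is approximated by measuring the wordset alone, and the entity name and URNs in the usage lines are made up.

```python
import json

# Assumption: "4 MB" is read as 4 * 1024 * 1024 bytes.
MAX_REQUEST_BYTES = 4 * 1024 * 1024


def validate_compile_request(wordset, companion_urn, target_urn, metadata=None):
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    if not wordset:
        problems.append("wordset is mandatory")
    else:
        # The wordset is an inline JSON resource, so it must parse as JSON.
        try:
            json.loads(wordset)
        except ValueError:
            problems.append("wordset is not valid JSON")
        # Approximation: the limit applies to the whole request, but the
        # wordset is normally by far the largest part of it.
        if len(wordset.encode("utf-8")) > MAX_REQUEST_BYTES:
            problems.append("wordset exceeds the 4 MB request size limit")
    if not companion_urn:
        problems.append("companion_artifact_reference is mandatory")
    if not target_urn:
        problems.append("target_artifact_reference is mandatory")
    # Metadata keys can only contain lowercase characters.
    for key in (metadata or {}):
        if key != key.lower():
            problems.append(f"metadata key {key!r} must be lowercase")
    return problems


# Hypothetical wordset content and URNs, for illustration only.
sample_wordset = json.dumps({"PLACES": [{"literal": "La Jolla"}]})
print(validate_compile_request(sample_wordset, "urn:example:model", "urn:example:wordset"))
print(validate_compile_request("{bad", "urn:example:model", "", {"App": "demo"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.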

WatchJobStatusResponse

A series of these messages is streamed in response to a CompileWordsetAndWatch request. The responses provide status on both the wordset compilation job and the CompileWordsetAndWatch gRPC request.

WatchJobStatusResponse fields
| Field | Type | Description |
| --- | --- | --- |
| job_status_update | nuance.nlu.common.v1beta1.JobStatusUpdate | Status of the compile wordset job. |
| request_status | nuance.rpc.Status | gRPC request status. Returned in the final streamed response. |

This message includes:

WatchJobStatusResponse
  job_status_update (nuance.nlu.common.v1beta1.JobStatusUpdate)
  request_status (nuance.rpc.Status)

GetWordsetMetadata

  [Figure: Process flow for GetWordsetMetadata]

This RPC method requests and returns information about a compiled wordset. It does not return the content of the wordset. It provides two types of metadata:

  • Custom metadata, optionally supplied by the client as metadata in CompileWordsetRequest.

  • Default metadata (reserved keys):

    x_nuance_companion_checksum_sha256: SHA-256 hash of the companion DLM, in hex format.

    x_nuance_wordset_content_checksum_sha256: SHA-256 hash of the source wordset content, in hex format.

    x_nuance_compiled_wordset_checksum_sha256: SHA-256 hash of the compiled wordset, in hex format.

    x_nuance_compiled_wordset_last_update: Date and time of the last update, in ISO 8601 format (UTC).
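A client might use the returned metadata to separate reserved keys from its own, and to check whether a local wordset file still matches the compiled one. The sketch below makes two assumptions that the text above does not confirm: that every reserved key starts with `x_nuance_` (true of the four listed), and that the content checksum is computed over the exact UTF-8 bytes of the submitted wordset. The helper names `split_metadata` and `wordset_matches` are hypothetical.

```python
import hashlib

# Assumption: all reserved (default) keys share this prefix.
RESERVED_PREFIX = "x_nuance_"


def split_metadata(metadata):
    """Split a metadata map into (default, custom) dictionaries."""
    default = {k: v for k, v in metadata.items() if k.startswith(RESERVED_PREFIX)}
    custom = {k: v for k, v in metadata.items() if not k.startswith(RESERVED_PREFIX)}
    return default, custom


def wordset_matches(metadata, wordset_text):
    """Compare a local wordset against the stored source-content checksum.

    Assumption: the service hashes the UTF-8 bytes exactly as submitted.
    """
    expected = metadata.get("x_nuance_wordset_content_checksum_sha256")
    actual = hashlib.sha256(wordset_text.encode("utf-8")).hexdigest()
    return expected is not None and expected.lower() == actual


# Simulated response metadata, for illustration only.
local_wordset = '{"PLACES": []}'
response_metadata = {
    "x_nuance_wordset_content_checksum_sha256": hashlib.sha256(
        local_wordset.encode("utf-8")
    ).hexdigest(),
    "app": "demo",
}
default_md, custom_md = split_metadata(response_metadata)
print(custom_md)
print(wordset_matches(response_metadata, local_wordset))
```

If the checksums differ, the safest interpretation is only that the local content is not byte-identical to what was compiled.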

GetWordsetMetadataRequest

Request to retrieve the metadata of a compiled wordset.

GetWordsetMetadataRequest fields
| Field | Type | Description |
| --- | --- | --- |
| artifact_reference | nuance.nlu.common.v1beta1.ResourceReference | Reference to the wordset artifact. |

This message includes:

GetWordsetMetadataRequest
  artifact_reference (nuance.nlu.common.v1beta1.ResourceReference)

GetWordsetMetadataResponse

Response to a GetWordsetMetadata request.

GetWordsetMetadataResponse fields
| Field | Type | Description |
| --- | --- | --- |
| metadata | map<string,string> | Default and client-supplied key,value pairs. |
| request_status | nuance.rpc.Status | Indicates whether the metadata was fetched successfully. |

This message includes:

GetWordsetMetadataResponse
  metadata
  request_status (nuance.rpc.Status)

DeleteWordset

  [Figure: Process flow for DeleteWordset]

This RPC method deletes a specified wordset.

DeleteWordsetRequest

Request to delete a compiled wordset.

DeleteWordsetRequest fields
| Field | Type | Description |
| --- | --- | --- |
| artifact_reference | nuance.nlu.common.v1beta1.ResourceReference | Reference to the wordset artifact. |

This message includes:

DeleteWordsetRequest
  artifact_reference (nuance.nlu.common.v1beta1.ResourceReference)

DeleteWordsetResponse

Response to a DeleteWordset request.

DeleteWordsetResponse fields
| Field | Type | Description |
| --- | --- | --- |
| request_status | nuance.rpc.Status | Indicates whether the wordset was deleted successfully. |

This message includes:

DeleteWordsetResponse
  request_status (nuance.rpc.Status)

Scalar value types

The data types in the proto files are mapped to equivalent types in the generated client stub files.

Scalar data types
| Proto | Notes | C++ | Java | Python |
| --- | --- | --- | --- | --- |
| double | | double | double | float |
| float | | float | float | float |
| int32 | Uses variable-length encoding. Inefficient for encoding negative numbers. If your field is likely to have negative values, use sint32 instead. | int32 | int | int |
| int64 | Uses variable-length encoding. Inefficient for encoding negative numbers. If your field is likely to have negative values, use sint64 instead. | int64 | long | int/long |
| uint32 | Uses variable-length encoding. | uint32 | int | int/long |
| uint64 | Uses variable-length encoding. | uint64 | long | int/long |
| sint32 | Uses variable-length encoding. Signed int value. These encode negative numbers more efficiently than regular int32s. | int32 | int | int |
| sint64 | Uses variable-length encoding. Signed int value. These encode negative numbers more efficiently than regular int64s. | int64 | long | int/long |
| fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int |
| fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long |
| sfixed32 | Always four bytes. | int32 | int | int |
| sfixed64 | Always eight bytes. | int64 | long | int/long |
| bool | | bool | boolean | bool |
| string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode |
| bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str (Python 2), bytes (Python 3) |
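The efficiency notes in the table follow from how protocol buffers encode variable-length integers. The sketch below computes encoded sizes only (it does not produce wire bytes), using the standard base-128 varint and ZigZag schemes; the function names are illustrative and not part of any generated stub.

```python
def varint_size(value):
    """Bytes needed to encode a non-negative integer as a base-128 varint."""
    size = 1
    while value >= 0x80:  # each byte carries 7 bits of payload
        value >>= 7
        size += 1
    return size


def int32_size(n):
    """Encoded size of an int32 field value.

    Negative values are sign-extended to 64 bits, so they always
    need the maximum varint length.
    """
    return varint_size(n & 0xFFFFFFFFFFFFFFFF)


def sint32_size(n):
    """Encoded size of a sint32 field value.

    ZigZag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... so small
    negative numbers stay small on the wire.
    """
    zigzag = ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF
    return varint_size(zigzag)


print(int32_size(-1), sint32_size(-1))             # negative value: int32 vs. sint32
print(varint_size(2**28 - 1), varint_size(2**28))  # uint32 around the fixed32 threshold
```

A fixed32 value always takes four bytes, so it becomes the smaller encoding once a uint32 varint needs five bytes, which happens at 2^28.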