Status codes

A single Recognizer service provides a single Recognize method supporting bi-directional streaming of requests and responses.

The client first provides a recognition request message with parameters indicating at minimum what language to use. Optionally, it can also include resources to customize the data packs used for recognition, and arbitrary client data to be injected into call recording for reference in offline tuning workflows.

service Recognizer {
  rpc Recognize (stream RecognitionRequest) returns (stream RecognitionResponse);
}

In response to the recognition request message, ASRaaS returns a status message confirming the outcome of the request. Usually the message is Continue: recognition started on audio/l16;rate=8000 stream.

{
  status: {
    code: 100
    message: 'Continue'
    details: 'recognition started on audio/l16;rate=8000 stream'
  }
  cookies: {  ... }
}

Status messages include HTTP-aligned status codes. A failure to begin recognizing is reflected in a 4xx or 5xx status as appropriate. (Cookies returned from resource fetches, if any, are returned in the first response only.)
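As a rough illustration of how a client might interpret these codes, the sketch below groups a status by its numeric range. The Status dataclass and the classify function are hypothetical names introduced here; they mirror the fields of the example message above and are not the generated protobuf types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Status:
    # Hypothetical stand-in mirroring the status fields shown above.
    code: int
    message: str = ""
    details: str = ""


def classify(status: Status) -> str:
    """Group a status by the HTTP-aligned range of its code."""
    if status.code == 100:
        return "continue"   # recognition started; audio may be sent
    if 200 <= status.code < 300:
        return "complete"   # normal completion (200/204)
    if 400 <= status.code < 600:
        return "error"      # 4xx or 5xx failure
    return "unknown"        # outside the ranges described in this section


started = Status(100, "Continue", "recognition started on audio/l16;rate=8000 stream")
print(classify(started))
```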

When a 100 Continue status is received, the client may proceed to send one or more messages bearing binary audio samples in the format indicated in the recognition request message (default: signed PCM/8000 Hz).
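One possible way to split a buffer of audio into a sequence of messages is sketched below. It assumes 16-bit linear PCM at 8000 Hz (matching audio/l16;rate=8000) and a chunk duration of 100 ms; the chunk duration and the audio_chunks name are illustrative choices, not values prescribed by ASRaaS.

```python
from typing import Iterator


def audio_chunks(pcm: bytes,
                 sample_rate_hz: int = 8000,
                 bytes_per_sample: int = 2,
                 chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive slices of raw PCM, each covering about chunk_ms of audio."""
    # Assumed chunk duration; 8000 Hz * 2 bytes * 100 ms = 1600 bytes per chunk.
    chunk_bytes = sample_rate_hz * bytes_per_sample * chunk_ms // 1000
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]


# One second of silence: 8000 samples of 2 bytes each.
one_second = bytes(16000)
chunks = list(audio_chunks(one_second))
print(len(chunks), len(chunks[0]))
```

Each yielded slice would then be wrapped in an audio-bearing request message and written to the request stream.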

The server responds with zero or more result messages reflecting the outcome of recognizing the incoming audio, until a terminating condition is reached, at which point the server sends a final status message indicating normal completion (200/204) or any errors encountered (4xx/5xx).

Termination conditions include:

  • Utterance detection mode is SINGLE and server detects end of speech.
  • Utterance detection mode is SINGLE and server observes non-speech samples corresponding to the no_input_timeout_ms value.
  • Utterance detection mode is SINGLE and server observes speech samples corresponding to the recognition_timeout_ms value.
  • Client ends its message stream to the server.
  • Client cancels the RPC.
  • Client sends no audio for a server-configured idle timeout.
  • Server encounters an error.

If the client cancels the RPC, no further messages are received from the server. If the server encounters an error, it attempts to send a final error status and then cancels the RPC.
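The response flow described above can be sketched as a loop that collects results until a final status arrives. The Response dataclass is a hypothetical stand-in for RecognitionResponse with simplified fields, and consume is an illustrative helper, so treat this as a sketch of the control flow rather than a client implementation.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Response:
    # Hypothetical, simplified response: either a status code or a result is set.
    status_code: Optional[int] = None
    result: Optional[str] = None


def consume(responses: Iterable[Response]) -> Tuple[Optional[int], List[str]]:
    """Collect results until a final (non-100) status is seen.

    Returns (final_code, results). final_code is None if the stream ended
    without a final status, for example after the client cancelled the RPC.
    """
    results: List[str] = []
    for response in responses:
        if response.status_code is not None:
            if response.status_code == 100:
                continue  # 100 Continue is not terminal
            return response.status_code, results
        if response.result is not None:
            results.append(response.result)
    return None, results


stream = [Response(status_code=100), Response(result="hello"), Response(status_code=200)]
print(consume(stream))
```

Treating every non-100 status as terminal keeps the loop short: a failure to begin recognizing (4xx or 5xx as the first message) and a normal completion (200/204) both end the loop in the same way.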

HTTP status codes

ASRaaS returns standard gRPC error codes and the following HTTP status codes.

Description of HTTP status codes
Code | Message | Indicates
100 | Continue | Recognition parameters and resources were accepted and successfully configured. Client can proceed to send audio data. Also returned in response to a start_timers_message, which starts the no-input timer manually.
200 | Success | Audio was processed, recognition was completed and returned a result with at least one hypothesis. Each hypothesis includes a confidence score, the text of the result, and (for the final result only) whether the hypothesis was accepted or rejected. 200 Success is returned for both accepted and rejected results. A rejected result means that one or more hypotheses are returned, all with rejected = True.
204 | No result | Recognition completed without producing a result. This may occur if the client closes the RPC stream before sending any audio.
400 | Bad request | A malformed or unsupported client request was rejected.
401 | Unauthorized | The request could not be authorized. Make sure you have provided your credentials and/or generated an OAuth token. See Authorize your client application.
403 | Forbidden | A request specified a topic that the client is not authorized to use.
404 | No speech | No utterance was detected in the audio stream for a number of samples corresponding to no_input_timeout_ms. This may occur if the audio does not contain anything resembling speech.
408 | Audio timeout | Excessive stall in sending audio data.
409 | Conflict | The recognizer is currently in use by another client.
410 | Not recognizing | A start_timers_message was received (to start the no-input timer manually) but no in-progress recognition exists.
413 | Too much speech | Recognition of utterance samples reached a duration corresponding to recognition_timeout_ms.
429 | Too many requests | The client application has reached the rate limit, either for the authorization service or runtime service. See Rate limits. If the rate limit is related to the authorization service, make sure you are reusing the access token properly across requests.
500 | Internal server error | A serious error occurred that prevented the request from completing normally.
502 | Resource error | One or more resources failed to load.
503 | Service unavailable | Unused, reserved for gateways.
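Because several of these codes carry ASRaaS-specific messages (404 means No speech, 413 means Too much speech), a client may prefer to log the message from this table instead of a generic HTTP reason phrase. A minimal lookup sketch follows; STATUS_MESSAGES and describe are hypothetical names, and the entries are copied from the table above.

```python
# Messages copied from the status code table above.
STATUS_MESSAGES = {
    100: "Continue",
    200: "Success",
    204: "No result",
    400: "Bad request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "No speech",
    408: "Audio timeout",
    409: "Conflict",
    410: "Not recognizing",
    413: "Too much speech",
    429: "Too many requests",
    500: "Internal server error",
    502: "Resource error",
    503: "Service unavailable",
}


def describe(code: int) -> str:
    """Return 'code message' for listed codes, or a note for unlisted ones."""
    message = STATUS_MESSAGES.get(code)
    if message is None:
        return f"{code} (not listed in the table)"
    return f"{code} {message}"


print(describe(404))
print(describe(413))
```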