tts-callsummary payload for Neural TTS
Current version: v4
Neural TTSaaS offers a streamed synthesis response containing information received from Microsoft Speech Services.
Unlike TTSaaS, it does not offer a non-streamed (unary) response and does not include the NVOC events.
Note:
The GetVoice method in the Synthesizer service does not create any event log entries, as it is a purely informational request.
In addition to the standard fields described in data field structure, messages with the application/x-nuance-tts-callsummary.v4+json
dataContentType include service-specific fields as detailed below.
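As a minimal sketch of how a log consumer might recognize these messages, the helper below matches the dataContentType value and extracts the version number. The function name `callsummary_version` and the assumption that the `data` object has already been parsed into a Python dict are illustrative and not part of the service API.

```python
import re
from typing import Optional

# Content type named in the text above; the version digits are captured.
_CALLSUMMARY_TYPE = re.compile(
    r"^application/x-nuance-tts-callsummary\.v(\d+)\+json$"
)


def callsummary_version(data: dict) -> Optional[int]:
    """Return the tts-callsummary version of a parsed `data` object.

    Returns None when the object carries a different content type
    (hypothetical helper, not part of any Nuance library).
    """
    match = _CALLSUMMARY_TYPE.match(data.get("dataContentType", ""))
    return int(match.group(1)) if match else None
```

A consumer could then process only the objects for which `callsummary_version(data) == 4`.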
Neural TTS fields
Field | Description
request | Contents of the request
message | The Synthesize request. See Synthesizer API in the Neural TTSaaS documentation for details.
totalCharacterCount | The number of characters in the text to be synthesized
characterCounts | The character counts for each voice used in the synthesis
voice | The voice performing the synthesis
count | The number of characters in the text to be synthesized for this voice
response | Contents of the response
status | Contents of the request status
code | The status code
message | The status summary message
details | Details of the status message
Sample payload
"data":{
"dataContentType":"application/x-nuance-tts-callsummary.v4+json",
"traceId":"ebdd45a32595e6f7a81d5d1ce8fa009a",
"sessionId":"acfb1263-3cba-4792-bfc9-c303a18c2d6e",
"userId":"e0808acd35f35a49ca8d36d6c07dc0df16d39d3fea099f9c23de0092fe04eaf6",
"processingTime":{
"startTime":"2022-04-21T18:04:27.009Z",
"durationMs":662.0349,
"firstAudioBuffer":"2022-04-21T18:04:27.589Z"
},
"clientData": {
"correlation_id": "9894b63a-3fd8-4a05-a8dd-5394297e48f2",
"cc_session_id": "5ea0650b-77dd-4b88-99b5-4c1874868eb4",
"msdyn_botid": "6183456a-1787-4f63-bd67-b6077f9a6b0f",
"sequence_id": "3",
"msdyn_sessionid": "f5a4a4e6-20fc-4535-8487-55c5d9605a61"
},
"request":{
"message":"{ \"audioParams\": { \"audioFormat\": { \"pcm\": { \"sampleRateHz\": 16000 } },
\"volumePercentage\": 80, \"speakingRateFactor\": 1, \"audioChunkDurationMs\": 2000 },
\"input\": { \"ssml\": { \"text\": \"\\u003cspeak version=\\\"1.0\\\"
xmlns=\\\"http://www.w3.org/2001/10/synthesis\\\" xml:lang=\\\"en-US\\\"\\u003e\\n
\\u003cvoice name=\\\"en-US-JennyNeural\\\"\\u003e\\n\\tHello it's Jenny.\\n \\u003c/voice\\u003e\\n
\\u003cvoice name=\\\"en-US-AriaNeural\\\"\\u003e\\n\\tHi it's Aria.\\n \\u003c/voice\\u003e\\n
\\u003c/speak\\u003e\" } }, \"eventParams\": { \"sendLogEvents\": true },
\"clientData\": { \"hello\": \"goodbye\" }, \"userId\": \"MyApplicationUser\" }",
"totalCharacterCount": 30,
"characterCounts":[
{
"voice":"en-US-JennyNeural",
"count":17
},
{
"voice":"en-US-AriaNeural",
"count":13
}
]
},
"response":{
"status":{
"code":200,
"message":"OK",
"details":""
}
}
}
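In the sample above, `request.message` is a JSON document encoded as a string, and the per-voice counts (17 and 13) add up to `totalCharacterCount` (30). The following sketch shows one way to decode the message and check that sum. The helper names are hypothetical, the `sample` dict is an abbreviated stand-in for the payload, and the assumption that the counts always add up is drawn from this one sample only.

```python
import json


def decode_request_message(data: dict) -> dict:
    """Parse the JSON-encoded `request.message` string into a dict."""
    return json.loads(data["request"]["message"])


def character_counts_consistent(data: dict) -> bool:
    """Check that the per-voice counts add up to totalCharacterCount."""
    request = data["request"]
    per_voice_total = sum(entry["count"] for entry in request["characterCounts"])
    return per_voice_total == request["totalCharacterCount"]


# Abbreviated stand-in for the sample payload (message body shortened).
sample = {
    "request": {
        "message": json.dumps({
            "eventParams": {"sendLogEvents": True},
            "userId": "MyApplicationUser",
        }),
        "totalCharacterCount": 30,
        "characterCounts": [
            {"voice": "en-US-JennyNeural", "count": 17},
            {"voice": "en-US-AriaNeural", "count": 13},
        ],
    }
}

message = decode_request_message(sample)
print(message["userId"])
print(character_counts_consistent(sample))
```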
The request object describes the synthesis request, as provided in the Input message of a SynthesisRequest.
The response object for a synthesis request contains only a status message.
Response payload for successful synthesis
"response":{
"status":{
"code":200,
"message":"OK",
"details":""
}
}
Response payload for invalid voice
"response":{
"status":{
"code":400,
"message":"Bad Request",
"details":"Requested invalid or unknown voice: bn-US-JennyNeural"
}
}
Response payload for a server error
"response":{
"status": {
"code": 500,
"message": "Internal Server Error",
"details": "Speech synthesis failed. Reason=Error, ErrorDetails=WebSocket upgrade failed: Authentication error (401). Please check subscription information and region name. USP state: 2. Received audio size: 0 bytes."
}
}
}
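The three response payloads above differ only in their status fields, so a consumer can tell them apart by the code. The sketch below is one possible classification, assuming the codes follow HTTP-style ranges as the 200, 400, and 500 samples suggest. The function name `summarize_status` and the outcome labels are illustrative.

```python
def summarize_status(response: dict) -> str:
    """Build a one-line summary from a parsed `response` object.

    Assumes HTTP-style code ranges; only 200, 400, and 500 appear
    in the documented samples.
    """
    status = response["status"]
    code = status["code"]
    if 200 <= code < 300:
        outcome = "success"
    elif 400 <= code < 500:
        outcome = "client error"
    elif code >= 500:
        outcome = "server error"
    else:
        outcome = "other"
    line = f"{outcome}: {code} {status['message']}"
    # `details` is an empty string on success, so append it only when present.
    details = status.get("details") or ""
    return f"{line} ({details})" if details else line
```

For the invalid-voice sample, this yields a line that starts with "client error: 400 Bad Request" followed by the details text in parentheses.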