Managing sensitive information in an application
Mix lets you manage sensitive information in an application so that the information is redacted in the logs.
You can mark data as sensitive in the Mix tools as follows:
There are two redaction options: partial and complete.
- Partial redaction: Only the information marked as sensitive is redacted in the dialog and NLU logs. Partial redaction is implemented by marking entities and variables as sensitive in the Mix tools.
- Complete redaction: All user data is redacted in the logs. Complete redaction is enabled by:
  - Setting a question and answer node as sensitive. In this case, all user input collected at this node is fully redacted from the ASRaaS, NLUaaS, and DLGaaS logs.
  - Setting the suppress_log_user_data field to true in the DLGaaS StartRequest. In this case, all user input is fully redacted from the logs across all services for the entire session.
  - Setting a field in the NLUaaS, TTSaaS, ASRaaS, NRaaS API. In this case, the logs are redacted for that specific service.
This section summarizes these options.
Partial redaction
When you mark an entity or a variable as sensitive in Mix.nlu or Mix.dialog and complete redaction is not enabled, only the entities and variables marked as sensitive are redacted in the dialog and NLU logs.
For details on the values masked with partial redaction, see:
Complete redaction
ASRaaS, NLUaaS, TTSaaS, DLGaaS, and NRaaS all provide a field that disables logging of user data, as follows:
- ASRaaS: Set the suppress_call_recording RecognitionFlags field to True to disable call logging. See ASR values possibly masked.
- NLUaaS: Set the interpretation_input_logging_mode InterpretationParameters field to SUPPRESSED so that input is masked. See NLU values possibly masked.
- TTSaaS: Set the suppress_input EventParameters field to True to omit input text and URIs from log events. See TTS values possibly masked.
- DLGaaS: Set the suppress_log_user_data field in the StartRequestPayload to True to disable logging of user data for all services when Dialog orchestrates them. Otherwise, the client application must set the field when requesting a service. See Dialog values possibly masked.
- NRaaS: Set the secure_context_level field to SUPPRESS to disable utterance waveform recording and suppress recognition results. See NR values possibly masked.
You can also mark a question and answer node as sensitive. This enables complete redaction of all user input collected at this node (user text, utterance, intent and entity values and literals). This applies to NLU intent and entity collection, as well as to all events at that node: collection, recovery, confirmation, nomatch, noinput, max events, NO_INTENT, intent switching, and so on.
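As a quick reference, the per-service fields listed above can be collected in a small lookup table. This is a minimal sketch: the field names, message names, and values come from the list above, while the dictionary layout and the redaction_setting helper are illustrative assumptions and not part of any Mix client library.

```python
# Field names and values are taken from the list above. The table layout and
# the helper name are hypothetical, for illustration only.
COMPLETE_REDACTION_FIELDS = {
    "ASRaaS": {"message": "RecognitionFlags",
               "field": "suppress_call_recording", "value": True},
    "NLUaaS": {"message": "InterpretationParameters",
               "field": "interpretation_input_logging_mode", "value": "SUPPRESSED"},
    "TTSaaS": {"message": "EventParameters",
               "field": "suppress_input", "value": True},
    "DLGaaS": {"message": "StartRequestPayload",
               "field": "suppress_log_user_data", "value": True},
    "NRaaS": {"message": "RecognitionParameters",
              "field": "secure_context_level", "value": "SUPPRESS"},
}


def redaction_setting(service: str) -> dict:
    """Return the {field: value} pair that enables complete redaction for a service."""
    entry = COMPLETE_REDACTION_FIELDS[service]
    return {entry["field"]: entry["value"]}


print(redaction_setting("NLUaaS"))
```

The returned pair would then be merged into the request message named in the "message" entry when the client builds the request for that service.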
ASR values possibly masked
When the suppress_call_recording RecognitionFlags field is set to true, the recognition response is suppressed in the event logs and the corresponding audio is removed. The hypotheses field is empty and the redactedReason field is provided:
"hypotheses": [],
"redactedReason": "suppressCallRecording"
The minimally formatted text is available in the final result returned to the application, but this information is not logged.
See the ASRaaS RecognitionFlags documentation for details on setting suppress_call_recording.
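When processing ASR event logs, a consumer may want to tell a redacted result apart from a recognition that simply produced no hypotheses. The following is a minimal sketch, assuming the logged response has already been parsed into a Python dictionary; the asr_redaction_reason helper is hypothetical.

```python
def asr_redaction_reason(response: dict):
    """Return the redactedReason of a logged ASR response, or None if it was not redacted."""
    # A redacted response has an empty hypotheses list plus a redactedReason field.
    if not response.get("hypotheses") and "redactedReason" in response:
        return response["redactedReason"]
    return None


logged = {"hypotheses": [], "redactedReason": "suppressCallRecording"}
print(asr_redaction_reason(logged))  # prints: suppressCallRecording
```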
NLU values possibly masked
The NLU values masked depend on whether partial or complete redaction is enabled.
Partial redaction
When an entity is marked as sensitive (either in Mix.nlu or Mix.dialog) and complete redaction is not enabled, values in the data field are redacted as follows:
- request.input: Only the sensitive entity value is redacted from the input.
- response.result.interpretations: The literal, formatted literal, and string values are redacted for sensitive entities.
- response.result.literal: Only the sensitive entity value is redacted from the literal.
- response.result.formattedLiteral: The formatted literal value is redacted.
Example of partial redaction, COFFEE_SIZE entity is marked as sensitive
"data": {
...
"request": {
...
"input": "I want a ****redacted**** espresso"
},
"response": {
"result": {
"interpretations": [{
"singleIntentInterpretation": {
"entities": {
"COFFEE_TYPE": {
"entities": [{
"entities": {},
"metadata": {},
"textRange": {
"startIndex": 16,
"endIndex": 24
},
"confidence": 1,
"origin": "GRAMMAR",
"literal": "espresso",
"sensitive": false,
"formattedLiteral": "espresso",
"formattedTextRange": {
"startIndex": 16,
"endIndex": 24
},
"audioRange": null,
"stringValue": "espresso",
"valueUnion": "stringValue"
}
]
},
"COFFEE_SIZE": {
"entities": [{
"entities": {},
"metadata": {},
"textRange": {
"startIndex": 9,
"endIndex": 15
},
"confidence": 1,
"origin": "GRAMMAR",
"literal": "****redacted****",
"sensitive": true,
"formattedLiteral": "****redacted****",
"formattedTextRange": {
"startIndex": 9,
"endIndex": 15
},
"audioRange": null,
"stringValue": "****redacted****",
"valueUnion": "stringValue"
}
]
}
},
"metadata": {},
"intent": "ORDER_COFFEE",
"confidence": 1,
"origin": "GRAMMAR"
},
"interpretationUnion": "singleIntentInterpretation"
}
],
"literal": "I want a ****redacted**** espresso",
"sensitive": true,
"formattedLiteral": "****redacted****"
}
}
}
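The relationship between the textRange of a sensitive entity and the redacted input in the example above can be illustrated with a short sketch. It is not how NLUaaS implements redaction. It assumes the original input was "I want a double espresso" (the real value is not in the log) and that endIndex is exclusive, which is consistent with the ranges shown for both entities.

```python
MASK = "****redacted****"


def mask_ranges(text: str, ranges: list) -> str:
    """Replace each (start, end) span of text with MASK; end is exclusive."""
    # Work from right to left so earlier indexes stay valid after each replacement.
    for start, end in sorted(ranges, reverse=True):
        text = text[:start] + MASK + text[end:]
    return text


# Hypothetical original input; the span is the COFFEE_SIZE textRange (9 to 15).
original = "I want a double espresso"
print(mask_ranges(original, [(9, 15)]))  # prints: I want a ****redacted**** espresso
```

Note that the textRange values in the log still refer to positions in the original input, not in the redacted string.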
Complete redaction
When the interpretation_input_logging_mode field in the InterpretRequest is set to SUPPRESSED, values in the data field are redacted as follows:
- request.input: Fully redacted.
- response.result.interpretations: Fully redacted.
- response.result.literal: Fully redacted.
- response.result.formattedLiteral: Fully redacted.
See the NLUaaS InterpretationParameters documentation for details on setting interpretation_input_logging_mode.
Example of complete redaction
"data": {
...
"request": {
...
"input": "****redacted****"
},
"processingTime": {
"startTime": "2022-02-22T20:17:16.996Z",
"durationMs": 74
},
"response": {
...
"result": {
"interpretations": "****redacted****",
"literal": "****redacted****",
"sensitive": true,
"formattedLiteral": "****redacted****"
}
}
}
Dialog values possibly masked
The dialog values masked depend on whether partial or complete redaction is enabled.
Partial redaction
When an entity or a variable is marked as sensitive (either in Mix.nlu for entities or Mix.dialog for entities, variables, and question and answer nodes) and complete redaction is not enabled, values in Dialog application logs are redacted as follows:
- Messages: Only the sensitive variable or entity is redacted from messages.
- Data: The values of sensitive entities and variables are redacted from the data fields.
- Utterances: The values of sensitive entities are redacted from the utterance field.
- Entity values and literals: The values and literals of sensitive entities are redacted in the qa-config, question-router, and input-received event logs.
- Reporting variables: Only non-sensitive variables configured for reporting are included in the reporting-vars event log.
Example of message event, partial redaction of user_name variable
"events": [{
"name": "message",
"value": {
"nlg": [],
"visual": [{
"text": "Hello "
}, {
"text": "****"
}, {
"text": "and welcome to the coffee app."
}
],
"audio": []
}
}
]
Example of data field in session-update event, partial redaction of user_name variable
"events": [{
"name": "session-update",
"value": {
"data": {
"quantity": 7.0,
"user_name": "****"
}
}
}
]
Example of partial redaction in input-received event, COFFEE_SIZE entity is marked as sensitive
"events": [{
"name": "input-received",
"value": {
"interpretation": {
"isSuccess": "true",
"confidence": "1.0",
"inputmode": "text",
"utterance": "I want a ****COFFEE_SIZE**** espresso",
"data": {
"COFFEE_TYPE": "espresso",
"INTENT": "ORDER_COFFEE",
"COFFEE_SIZE": "****"
},
"slot_literals": {
"COFFEE_TYPE": "espresso",
"INTENT": "****",
"COFFEE_SIZE": "****"
},
"slot_confidences": {
"COFFEE_TYPE": "1.0",
"INTENT": "1.0",
"COFFEE_SIZE": "1.0"
}
}
}
}
]
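To check which entities were masked in an input-received event like the one above, a log consumer could scan the data field for the mask string. The following is a minimal sketch, assuming the event has been parsed into a Python dictionary and that masked values appear as the literal string "****", as in the example; the masked_slots helper is hypothetical.

```python
def masked_slots(event: dict) -> list:
    """List the keys in interpretation.data whose value is the '****' mask."""
    data = event["value"]["interpretation"]["data"]
    return sorted(key for key, value in data.items() if value == "****")


# Trimmed copy of the input-received event shown above.
event = {
    "name": "input-received",
    "value": {
        "interpretation": {
            "utterance": "I want a ****COFFEE_SIZE**** espresso",
            "data": {
                "COFFEE_TYPE": "espresso",
                "INTENT": "ORDER_COFFEE",
                "COFFEE_SIZE": "****",
            },
        }
    },
}
print(masked_slots(event))  # prints: ['COFFEE_SIZE']
```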
Complete redaction
When the suppress_log_user_data field in the StartRequest is set to true, values in Dialog application logs are redacted as follows:
- Messages: The messages are fully redacted.
- Data: All entity and variable values are redacted from the data fields.
- Utterances: The utterance fields are fully redacted.
- Entity values and literals: All entity values and literals are redacted in the qa-config, question-router, and input-received event logs.
- Reporting variables: Only non-sensitive variables configured for reporting are included in the reporting-vars event log.
See the StartRequestPayload documentation for details on setting suppress_log_user_data.
Example of message event, complete redaction
"events": [{
"name": "message",
"value": {
"nlg": [],
"visual": [{
"text": "****"
}, {
"text": "****"
}, {
"text": "****"
}
],
"audio": []
}
}
]
Example of data field in session-update event, complete redaction
"events": [{
"name": "session-update",
"value": {
"data": {
"quantity": "****",
"user_name": "****"
}
}
}
]
Example of complete redaction in input-received event
"events": [{
"name": "input-received",
"value": {
"interpretation": {
"isSuccess": "true",
"confidence": "1.0",
"inputmode": "text",
"utterance": "I want a ****COFFEE_SIZE**** espresso",
"data": {
"COFFEE_TYPE": "****",
"INTENT": "****",
"COFFEE_SIZE": "****"
},
"slot_literals": {
"COFFEE_TYPE": "****",
"INTENT": "****",
"COFFEE_SIZE": "****"
},
"slot_confidences": {
"COFFEE_TYPE": "1.0",
"INTENT": "1.0",
"COFFEE_SIZE": "1.0"
}
}
}
}
]
NR values possibly masked
When the secure_context_level field in RecognitionParameters or DTMFRecognitionParameters is set to SUPPRESS, sensitive information is suppressed from the event logs and the corresponding audio is removed. Utterance waveforms are not recorded, and recognition results in the diagnostic and call logs are suppressed.
See the RecognitionParameters or DTMFRecognitionParameters documentation for details on setting secure_context_level.
TTS values possibly masked
When the suppress_input field in the SynthesisRequest is set to true, the following fields are masked:
Credit card numbers
All the Mix engines attempt to mask or redact credit card numbers in the event logs.
In all engines except ASR, numbers that are 13 to 19 digits long and pass the Luhn algorithm test are treated as potential credit card numbers and masked in the Mix event logs. The digits can be interspersed with spaces or hyphens. For example, the following values are masked:
- XXXX XXXX XXXX XXXX
- XXXXXXXXXXXXXXXX
- XXXX-XXXX-XXXX-XXXX
Warning:
Mix will mask any 13 to 19-digit number that passes the Luhn algorithm test. This means that data may be incorrectly masked if it resembles a credit card number.
As an additional precaution, you may also mark entities that collect credit card numbers as sensitive.
ASR uses a slightly different algorithm to detect and redact credit card numbers: see ASR below.
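The check described above (13 to 19 digits, optional spaces or hyphens, Luhn test) can be sketched as follows. This is an illustration of the rule as stated in this section, not the code the Mix engines run; the function names are hypothetical, and 4111111111111111 is a commonly used test number that passes the Luhn check.

```python
import re


def luhn_valid(digits: str) -> bool:
    """Return True if a string of ASCII digits passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def looks_like_card_number(candidate: str) -> bool:
    """Apply the 13-to-19-digit and Luhn rules after dropping spaces and hyphens."""
    digits = re.sub(r"[ -]", "", candidate)
    if not re.fullmatch(r"[0-9]{13,19}", digits):
        return False
    return luhn_valid(digits)


for sample in ("4111 1111 1111 1111", "4111-1111-1111-1111", "4111111111111112"):
    print(sample, looks_like_card_number(sample))
```

The first two samples pass and the third fails the checksum. Any other 13 to 19-digit value that happens to pass the checksum, such as an account or reference number, would be flagged in the same way, which is the false-positive case the warning describes.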
NLU
In NLU, the credit card number is redacted from the logs and replaced with ******.
Example of redacted log for NLU
{
"specversion": "1.0",
"service": "NLUaaS",
"datacontenttype": "application/json",
"source": "nuance.nlu.v1.Runtime/Interpret",
"type": "Interpret",
"id": "35e5559c-9e2c-4de4-b270-13ea43b192fe",
"appid": "nluqa-test",
"partitionKey": "{\"service\":\"NLUaaS\",\"id\":\"35e5559c-9e2c-4de4-b270-13ea43b192fe\"}",
"data": {
"dataContentType": "application/x-nuance-nluaas-interpretation.v1+json",
"locale": "",
"requestid": "d51ec71b-5854-9a11-a21f-59f580626e64",
"traceid": "7d9b8c3f10da5f09bebae81ced523e88",
"request": {
"resources": [],
"clientData": {},
"parameters": {
"postProcessingScriptParameters": {},
"interpretationResultType": "SINGLE_INTENT",
"interpretationInputLoggingMode": "PLAINTEXT",
"maxInterpretations": 0
},
"model": {
"type": "SEMANTIC_MODEL",
"uri": "http://10.3.93.9/nlu-model-sensitive.zip",
"requestTimeoutMs": 0
},
"userId": "",
"input": {
"text": "use my credit card number ***** and pay the bill",
"inputUnion": "text"
}
},
"processingTime": {
"startTime": "2022-11-10T06:51:11.622Z",
"durationMs": 92
},
"response": {
"metadata": {},
"status": {
"code": 200,
"message": "OK",
"details": ""
},
"result": {
"interpretations": [{
"singleIntentInterpretation": {
"entities": {
"pay": {
"entities": [{
"entities": {},
"metadata": {},
"textRange": {
"startIndex": 47,
"endIndex": 50
},
"confidence": 0.7629098892211914,
"origin": "STATISTICAL",
"literal": "pay",
"sensitive": false,
"formattedLiteral": "pay",
"formattedTextRange": {
"startIndex": 47,
"endIndex": 50
},
"audioRange": null,
"stringValue": "",
"valueUnion": "stringValue"
}
]
}
},
"metadata": {},
"intent": "BILL_PAY",
"confidence": 0.9947828650474548,
"origin": "STATISTICAL"
},
"interpretationUnion": "singleIntentInterpretation"
}
],
"literal": "use my credit card number ***** and pay the bill",
"sensitive": false,
"formattedLiteral": "use my credit card number ***** and pay the bill"
}
}
},
"timestamp": "2022-11-10T06:51:11.714Z",
"request_input_text": null,
"response_result_literal": null
}
TTS
In TTS, the credit card number is redacted from the logs and replaced with the string *** POSSIBLE CC NUMBER REDACTED ***.
Example of redacted log for TTS
{
"specversion": "1.0",
"service": "TTSaaS",
"source": "nuance.tts.v1.Synthesizer/Synthesize",
"type": "Synthesize",
"id": "7ba73e35-d4da-4392-99a8-116101dcfa63",
"partitionKey": "{\"service\":\"TTSaaS\",\"id\":\"751774f7-2fd2-4247-b5d0-ca3472e8a590\"}",
"timestamp": "2023-01-25T21:35:41.353Z",
"appid": "ttsaas-test-01",
"enable_call_logging": "false",
"datacontenttype": "application/json",
"data": {
"dataContentType": "application/x-nuance-tts-callsummary.v3+json",
"traceId": "",
"sessionId": "f7b2a6a1-2e7f-478d-8c73-481a7a9c2208",
"requestId": "89249280-6454-422f-a13b-10877d54eb36",
"processingTime": {
"startTime": "2023-01-25T21:35:39.704Z",
"firstAudioBufferTime": "2023-01-25T21:35:40.111Z",
"durationMs": 1648
},
"request": {
"resources": []
},
"clientData": {
"applicationName": "APP-XTTS-1999-case3",
"applicationVersion": "d.e.f",
"companyName": "COMPANY"
},
"response": {
"events": [{
"EVENT": "NVOCcntv",
"VOICE_VOP": "Evan_xpremium-high",
"CHARS": 30,
"TTSTIME": "2023/01/25 21:35:40.599 UTC",
"TTSAASTIME": "2023-01-25T21:35:40.600Z",
"VOIC": "Evan",
"VMDL": "xpremium-high",
"LOCALE": "en-US",
"DURS": 6641.36
}, {
"EVENT": "NVOCcntg",
"CHARS": 30,
"TTSTIME": "2023/01/25 21:35:40.600 UTC",
"TTSAASTIME": "2023-01-25T21:35:40.600Z"
}, {
"EVENT": "NVOCinpt",
"MIME": "text/plain;charset=utf-8",
"TXSZ": 34,
"TEXT": "Your card number is *** POSSIBLE CC NUMBER REDACTED ***",
"TTSTIME": "2023/01/25 21:35:41.350 UTC",
"TTSAASTIME": "2023-01-25T21:35:41.350Z"
}, {
"EVENT": "NVOCsynd",
"INPT": 34,
"DURS": 6641,
"RSTT": "ok",
"TTSTIME": "2023/01/25 21:35:41.350 UTC",
"TTSAASTIME": "2023-01-25T21:35:41.350Z"
}
],
"status": {
"code": 200,
"message": "OK",
"details": ""
}
}
}
}
NR
In NR, if a potential credit card number is detected, the results for that recognition are deleted from the call logs and the string FluentD redacted possible CCN appears instead.
Example of redacted log for NR
{
"specversion": "1.0",
"service": "NRaaS",
"source": "nuance.nrc.v1.NRC/Recognize",
"type": "Recognize",
"id": "a154e984-5e0a-4441-9401-d88afc0ef120",
"timestamp": "2023-08-28T16:18:03.754Z",
"appid": "nraas-qa",
"datacontenttype": "application/json",
"data": {
"dataContentType": "application/x-nuance-nrc-result.v1+json",
"nrcSessionid": "5c1cb860-96ce-41f6-884e-f280e9f36820",
"traceid": "",
"requestid": "e0147562-8a79-96f6-866c-3e8c659ed7e0",
"clientRequestid": "",
"locale": "en-US",
"processingTime": {
"startTime": "2023-08-28T16:17:54.676Z",
"durationMs": 9459
},
"response": {
"result": {
"formattedText": "<?xml version='1.0'?><result><interpretation grammar=\"builtin:grammar/digits\" confidence=\"63\"><input mode=\"speech\">FluentD redacted possible CCN</input><instance>FluentD redacted possible CCN</instance></interpretation><interpretation grammar=\"builtin:grammar/digits\" confidence=\"42\"><input mode=\"speech\">FluentD redacted possible CCN</input><instance>FluentD redacted possible CCN</instance></interpretation></result>",
"status": "SUCCESS"
}
}
},
"EventProcessedUtcTime": "2023-08-28T16:18:26.4637188Z",
"PartitionId": 0,
"EventEnqueuedUtcTime": "2023-08-28T16:18:26.3400000Z"
}
Dialog
In Dialog, a credit card number is redacted from the logs and replaced with the string ****POSSIBLE CC NUMBER REDACTED****.
Dialog redacts content in the following cases:
- When the user provides input (for example, in qa-config, input-required, input-received, and question-router events).
- During data exchanges with the client application or a backend server (for example, in session-update, data-required, data-received, input-required, transfer-initiated, transfer-completed, continue-initiated, continue-completed, application-ended, and message events).
Example of redacted log for Dialog
{
"specversion": "1.0",
"service": "DLGaaS",
"source": "nuance.dlg.v1.DialogService/Execute",
"type": "Execute",
"id": "fe3aedbe-6e76-430b-a68a-b0a0d54c1ea8",
"timestamp": "2023-01-31T22:27:14.868Z",
"appid": "DEMO-OMNICHANNEL-APP-DEV",
"datacontenttype": "application/json",
"data": {
"dataContentType": "application/x-nuance-dlg-interaction-summary.v1+json",
"traceid": "8ce26cfaf0223cb36a4980af0dceb0d2",
"requestid": "8e427200-b132-9085-a902-5712784a2682",
"sessionid": "1857779e-31e6-4236-b12c-292f3e151ac7",
"processingTime": {
"startTime": "2023-01-31T22:27:14.657Z",
"durationMs": 210
},
"events": [{
"event": "DLGaaS-Execute-End",
"time": "2023-01-31T22:27:14.868Z"
}
],
"request": {
"sessionId": "1857779e-31e6-4236-b12c-292f3e151ac7",
"selector": {
"language": "en-US",
"library": "default"
},
"payload": {
"userInput": {
"userText": "****POSSIBLE CC NUMBER REDACTED****"
}
}
},
"response": {
"status": {
"code": 200,
"message": "OK",
"detail": ""
},
"payload": {
"messages": [{
"visual": [{
"text": "Perfect, a double espresso coming right up! It will be charged at ****POSSIBLE CC NUMBER REDACTED****."
}
],
"view": {},
"language": "en-US",
"ttsParameters": {
"voice": {}
},
"channel": "Quick Start"
}
],
"endAction": {
"data": {},
"id": "End dialog"
},
"channel": "Quick Start"
}
}
}
}
ASR
ASR checks for credit card numbers in the recognition result by looking for 12 or more digits in each transcription hypothesis. The digits can be consecutive or non-consecutive. It then removes hypotheses that contain a potential credit card number from the event logs. For example, both these hypotheses are redacted:
- My number is 123456789012
- My number is 45004688, no sorry, that’s 4689
Warning:
Any hypothesis containing 12 or more digits will be redacted. This means that some non-credit card data may be incorrectly redacted.
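The digit-count rule can be sketched in a few lines. This is an illustration of the rule as described here, assuming it is applied to the text of each hypothesis; it is not the ASRaaS implementation, and the function name and threshold parameter are hypothetical.

```python
def has_possible_card_number(hypothesis: str, threshold: int = 12) -> bool:
    """Return True if the hypothesis contains at least `threshold` digits in total."""
    # Digits do not need to be consecutive; every ASCII digit in the text counts.
    digit_count = sum(char in "0123456789" for char in hypothesis)
    return digit_count >= threshold


for text in (
    "My number is 123456789012",
    "My number is 45004688, no sorry, that's 4689",
    "Call me at 5550100",
):
    print(has_possible_card_number(text), text)
```

The first two hypotheses each contain 12 digits and would be redacted; the third contains only 7 and would not.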
When 12 or more digits are detected in a transcription, the hypotheses are deleted for that recognition and the redactedReason field is returned, explaining why the content was redacted:
"hypotheses": [],
"redactedReason": "generic_digits"
Example of redacted log for ASR
{
"data": {
"response": {
"resultType": 0,
"absStartMs": 320,
"absEndMs": 9670,
"utteranceInfo": {
"durationMs": 0,
"clippingDurationMs": 0,
"droppedSpeechPackets": 0,
"droppedNonspeechPackets": 0,
"dsp": {
"snrEstimateDb": 17,
"level": 37,
"numChannels": 1,
"initialSilenceMs": 260,
"initialEnergy": -40,
"finalEnergy": -55,
"meanEnergy": 152
}
},
"hypotheses": [],
"dataPack": {
"language": "eng-USA",
"topic": "GEN",
"version": "4.11.1",
"id": ""
},
"redactedReason": "generic_digits"
},
"dataContentType": "application/x-nuance-asr-finalresultresponse.v2+json",
"userid": "7b88d547a41d88b788416084cbb7635338fad0c5780e505ca0acc07f7a0ebf15",
"asrSessionId": "e0f18324-5234-93f8-a47c-4168159757f1",
"requestid": "e0f18324-5234-93f8-a47c-4168159757f1"
},
"specversion": "1.0",
"service": "ASRaaS",
"source": "nuance.asr.v1.Recognizer/Recognize",
"type": "Recognize",
"id": "b8006cf9-2d66-4cb5-bd9e-d9bdfb57e471",
"timestamp": "2024-01-18T23:14:49.805Z",
"datacontenttype": "application/json",
"partitionKey": {
"service": "ASRaaS",
"id": "3ac75fb3-5753-4cda-ba7b-e6bc5e762418"
},
"appid": "myappid"
}