Protecting confidential data

Nuance speech products can protect confidential data as it moves between processes and is written to disk. Confidential data might include proprietary information or private data such as names, addresses, telephone numbers, account numbers, and passwords. The data can be DTMF touchtone signals, spoken utterances, synthesized speech requests, recognized speech, saved audio files, or parts of whole call recordings.

Note: This protection is only available for voice browsers that use MRCPv2.

During a session, applications can set security levels to protect individual transactions and pieces of information. In response, Nuance speech products encrypt, suppress, or mask data. See Security levels to protect confidential data for an overview of how the security parameters interact when they are set.

Costs of enabling protection

Using the additional security features has several costs:

  • Added system load. Data encryption and suppression add a small amount of CPU and memory load.
  • Added development time. Application developers protect data by configuring individual transactions. They decide the scope needed for security purposes (the whole session, a set of operations, or a single recognition or synthesis event).
  • Unavailable tuning data. Speech applications need data to improve over time: the recognizer automatically adapts (self-learning), and application developers analyze call logs and audio files for tuning purposes. When data is unavailable, applications improve more slowly, and might have hidden problems that remain uncorrected.
  • Installation complexity. If you plan network security measures (for example, TLS), Nuance recommends installing and testing with no security initially, and then adding security in stages (one connection at a time) and testing each addition. This staged approach makes an installation easier to troubleshoot.

Note: This discussion of confidential data does not include data controlled by applications and voice browsers, or written to their logs, and it does not cover securing inter-process communication between voice browsers and Nuance Speech Server (for example, with TLS, SSL, or IPSec).

Kinds of data that get protected

When the application sets security levels (for example, to suppress or encrypt information), the system automatically protects logged data:

  • DTMF—The system protects touchtone signals entered by users, submitted for recognition, and saved as text in log files. For example, when the user enters a PIN.
  • Spoken utterances—The system protects user speech collected by the application, submitted for recognition, and saved as audio log files. For example, when the user speaks a secret password.
  • Input text for speech synthesis—The system protects text-to-speech prompts requested by the application, produced by the speech synthesizer, and written to log files. For example, when the application generates a confirmation prompt containing confidential information. The system protects the text input to the synthesizer for audio generation as well as the prompt filenames.
  • Recognized speech—The system protects recognition results returned to the application, and written as text in call logs. For example, when the recognition results contain a credit card number.

    In some cases you might need to keep track of confidential information in the logs, but still prevent it from being fully accessible. Grammar developers can bypass security measures with the special SWI_safeKey key in their grammars. Typically, the key passes a partial recognition result when passing the whole result might be a security risk. For example, it can pass some digits of a credit card number, but not the whole number. To implement a data pass-through (also known as partial masking) in a grammar, see SWI_safeKey.

    Applications sometimes pass data in URI query strings when activating grammars. The first part of the URI fetches the grammar, and the part after the question mark (?) contains ECMAScript data. If the data is confidential, applications can use the swirec_sensitive_query_keys parameter to protect the information (see the sketch after this list).

  • Whole call audio files—The system protects recordings of the whole user session (prompts and utterances) requested by the application, and saved by Nuance Speech Suite. For example, when the application saves whole call recordings (WCR) for analysis or transaction verification purposes. You can either mute parts of the recording or encrypt an entire whole-call recording. See Hiding confidential data in whole call recordings.
  • Diagnostic logs—The system protects confidential data written in diagnostic messages. Optionally, you can protect browser messages and URI strings written to logs (see Controlling security in diagnostic logs).
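
The following sketch shows the grammar URI case described above. It assumes a voice browser that forwards VoiceXML property elements as vendor-specific parameters, a hypothetical grammar URI, and a hypothetical query key named acctnum; check the parameter reference for the exact value syntax that swirec_sensitive_query_keys accepts (a single key, a list, or a pattern).

  <!-- Sketch only: the grammar URI and the acctnum query key are hypothetical.
       Marking acctnum as sensitive protects its value when the grammar URI is
       written to logs. -->
  <property name="swirec_sensitive_query_keys" value="acctnum"/>
  <field name="confirmAccount">
    <prompt>Please confirm your account number.</prompt>
    <grammar src="https://apps.example.com/grammars/confirm.grxml?acctnum=1234567890"
             type="application/srgs+xml"/>
  </field>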

Output of protected data

When the application requests data encryption, the system protects as follows:

  • In call logs, the system sets the SECURE token (SECURE=encrypt) for all appropriate events, and replaces the confidential values with "_ENCRYPTED".
  • The system encrypts spoken utterances written to audio files, and it inserts silence into whole call recording (WCR) waveforms unless explicitly overridden by the application's use of a WCR muting parameter. See Security levels to protect confidential data and Hiding confidential data in whole call recordings.
  • The encrypted data is also included in an Enc event in the call log. The system adds the .enc extension to encrypted call log and audio file names.

When the application requests data suppression, the system protects as follows:

  • In call logs, the system sets the SECURE token (SECURE=suppress) for all appropriate events, and replaces the confidential values with "_SUPPRESSED". The suppressed values cannot be recovered in the future.
  • The system does not write spoken utterances to audio files, and it inserts silence into WCR waveforms (unless explicitly overridden by the application's use of a WCR muting parameter).

If a grammar uses masking, the recognizer writes the mask string. Frequently, the mask covers part of the value and allows some of the value to appear in the call log; for example, a speech grammar can mask all but the last four digits of a credit card number.

Configuring security levels

Application developers use the swirec.secure_context and switts.secure_context parameters to set security levels for events in their VoiceXML applications. The browser must pass these parameters as vendor-specific parameters (see the sketch after the table).

Parameter | Description | Default
swirec.secure_context | Sets Recognizer security levels for protecting confidential data for a single event. | open (no security)
switts.secure_context | Sets Vocalizer security levels for protecting confidential data for a single event. | open (no security)
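
For example, a VoiceXML application might set both parameters at document scope so that every recognition and synthesis event in the session is protected. The following is a minimal sketch: it assumes a browser that passes property elements through unchanged (some browsers require a vendor-specific prefix; check your browser documentation), and it uses the value suppress, which corresponds to the suppression behavior described in Output of protected data.

  <?xml version="1.0" encoding="UTF-8"?>
  <vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
    <!-- Document-scope properties: protect every recognition and synthesis
         event in this document. The property names are assumptions; your
         browser may require a vendor-specific prefix. -->
    <property name="swirec.secure_context" value="suppress"/>
    <property name="switts.secure_context" value="suppress"/>

    <form id="getAccount">
      <field name="account">
        <prompt>Please say your account number.</prompt>
        <grammar src="account.grxml" type="application/srgs+xml"/>
      </field>
    </form>
  </vxml>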

Using encryption security

To encrypt protected data:

  1. Generate public and private keys. For extra security, set a password on the private key. See Generating encryption keys.
  2. Configure the public key in the session.xml file. See Configuring encryption.
  3. Guard the private key in a secure location. It is required for decryption. Nuance has no access to the private key, and cannot decrypt your data unless you supply the key.
  4. Run an application that sets encryption security levels (see the sketch after these steps). This creates encrypted output data.
  5. For extra security, change keys periodically. See Changing the encryption key.
  6. To decrypt data, use the private key (and password, if used). See Decrypting data.
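
For step 4, the scope of encryption can be limited with standard VoiceXML property scoping. In this sketch (again assuming the browser forwards the property unchanged, and that encrypt is an accepted value; see Security levels to protect confidential data), only the recognition event in the pin field is encrypted:

  <form id="getPin">
    <field name="pin">
      <!-- Field-scope property: only this recognition event is encrypted;
           events outside the field keep the default value (open). -->
      <property name="swirec.secure_context" value="encrypt"/>
      <prompt>Please say your PIN.</prompt>
      <grammar src="pin.grxml" type="application/srgs+xml"/>
    </field>
  </form>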

Decrypting data

The nr_decrypt utility decrypts call logs, waveforms, and diagnostic logs. It accepts a file or directory as input, and can recursively decrypt a tree of files. The utility ignores any non-encrypted files.

The output file omits the .enc file extension: for example, decrypting the call log myLog-LOG.enc produces myLog-LOG. (If an encrypted input file does not have the .enc extension, the tool instead adds an underscore prefix to the output name: myLog-LOG becomes _myLog-LOG.)

The utility is in the bin directory, in one of these paths:

  • %SWISRSDK%\amd64\bin
  • %SWISRSDK%\x86\bin

If you are running Speech Server without Nuance Recognizer, you can access nr_decrypt in the following path:

  • $NSSSVRSDK/bin

Using masking security

Grammar developers can use SWI_safeKey to pass non-confidential recognition results to log files even if security settings are enabled (swirec.secure_context). Typically, you use the key to pass a partial recognition result when passing the whole result might be a security risk. For example, the recognizer can return only the last four digits of a credit card number, but not the whole number. In the logs, the data appears with the SAFEK token in the SWIrcnd event. See SWI_safeKey.
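
A minimal sketch of the idea, assuming an SRGS grammar whose semantic interpretation already holds the full recognized card number in a variable named ccnum (the rule structure and the variable are hypothetical; see SWI_safeKey for the exact tag syntax required by your tag-format):

  <rule id="creditcard" scope="public">
    <ruleref uri="#cardDigits"/>
    <!-- ccnum is assumed to hold the full recognized card number (hypothetical).
         The tag passes only the last four digits through to the call log; the
         full number remains protected by swirec.secure_context. -->
    <tag>SWI_safeKey = ccnum.substr(ccnum.length - 4)</tag>
  </rule>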