Detecting telephony tones

Nuance Speech Server includes a tone detector module that can identify specific in-band telephony tone signals that most gateways do not detect. These tones are classified as either universal or region specific. The following tones are supported:

  • Fax-related tones (universal)

    The following tones can appear on a subscriber line serving a fax machine.

    • FAX-CED | ANS: disables echo suppression so that fax machines can transmit data.
    • FAX-CNG: CalliNG tone (CNG) that a calling fax machine can optionally send after dialing the called fax machine's telephone number and before receiving an answer.
  • TTY (universal)

    Electronic device for text communication using a telephone line. It is used when one or more of the parties wish to communicate using typed text rather than audio, for example if one is in a noisy environment or has hearing or speech difficulties. (Also called TDD.)

  • SIT (Special-information tone) (region specific)

    Signal indicating that a call did not go through, usually followed by a recorded announcement explaining the issue. An SIT is used before all call failure announcements for the benefit of automatic equipment that cannot distinguish between a live answer and a recording, or cannot interpret what is said in a recorded announcement.

  • AMD (answering machine detection)

    Speech Server assumes the far end of a call is a live person and uses several factors to support or contradict this assumption. At a high level, these include speech, duration of speech, length of silence, beep, possible ringback, ringback, and other recognized telephony tones.

  • DTMF (Dual-Tone Multi-Frequency) (universal)

    Method for signaling to a telephone switching system the telephone number to be dialed, or for issuing commands to switching systems or related telephony equipment.

Tone detection sequence

The tone detector works as follows:

  1. The tone detector is activated when the voice browser sends a SIP INVITE request that includes the header P-Nuance-Activate-Tone-Detector. When the tone detector is activated, Speech Server sends all audio it receives into the tone detector.
  2. The tone detector remains activated as follows:
    1. For FAX/SIT/TTY tones, the tone detector silently deactivates itself after a period of time (default: 20 seconds) when it seems unlikely that tones will still be detected. You can change the timeout period using the parameter server.toneDetector.timeout.
    2. For DTMF tones, the tone detector remains active through the entire session, or until a FAX/SIT/TTY tone is detected. In the latter case, DTMF detection remains inactive until tone detection is restarted (see next step).
  3. There are some cases where it is necessary to re-activate the tone detector during a call, for example, after a transfer has occurred. To reactivate the tone detector, the voice browser must re-INVITE Speech Server with the header P-Nuance-Restart-Tone-Detector set to "true"; for example:
    INVITE sip:nss@host.com:5060 SIP/2.0
    ...
    P-Nuance-Restart-Tone-Detector: true
  4. When Speech Server detects a tone, it responds as follows:
    1. When Speech Server detects a FAX/SIT/TTY tone or an answering machine, it notifies the voice browser using a SIP INFO message (see Tone detection reporting). Speech Server detects all signals described in the XML configuration file, but the application decides how to handle each event notification.
    2. If Speech Server detects a DTMF tone in the audio stream, it forwards the detected tone for recognition.
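
The activation rules in the steps above can be summarized as a small model. The following Python sketch is illustrative only: the class ToneDetectorModel and its methods are hypothetical names, not part of Speech Server. It assumes that a restart resets the timeout and that FAX/SIT/TTY detection is limited only by the timeout, neither of which is stated explicitly here.

```python
from dataclasses import dataclass

NON_DTMF = {"FAX", "SIT", "TTY"}


@dataclass
class ToneDetectorModel:
    """Toy model of the documented activation rules (not Speech Server code)."""

    timeout_s: float = 20.0   # mirrors the documented default timeout
    started_at: float = 0.0   # time of the last (re)activation, in seconds
    tone_seen: bool = False   # True once a FAX/SIT/TTY tone was reported

    def restart(self, now: float) -> None:
        # Models a re-INVITE carrying P-Nuance-Restart-Tone-Detector: true.
        # Assumption: a restart also resets the timeout window.
        self.started_at = now
        self.tone_seen = False

    def listening_for(self, kind: str, now: float) -> bool:
        if kind == "DTMF":
            # DTMF detection lasts for the session unless a FAX/SIT/TTY
            # tone has been detected since the last (re)activation.
            return not self.tone_seen
        # FAX/SIT/TTY detection stops silently once the timeout elapses.
        return (now - self.started_at) < self.timeout_s

    def report(self, kind: str, now: float) -> bool:
        """Return True if a tone of this kind would be acted on at `now`."""
        if not self.listening_for(kind, now):
            return False
        if kind in NON_DTMF:
            self.tone_seen = True
        return True
```

In this model, a SIT tone reported at 10 seconds switches DTMF detection off until restart() is called, which corresponds to step 3.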

Configuration parameters

You can control the behavior of the tone detector by setting the following Speech Server parameters in the Management Station.

Parameter

Description

server.toneDetector.configFile

Enables tone detection and points to the configuration file that Speech Server uses to configure the tone detection library. For example:

$NSSSVRSDK/config/tonedef-en-us.xml

To turn tone detection on, set this parameter in the Management Station.

Default: $NSSSVRSDK/config/tones.all.on

server.toneDetector.originator

Software that uses the tone detection library. This value becomes the originator segment of the event name. A typical value for this parameter is "nss", and you are unlikely to need to change it. For example:

server.toneDetector.originator VXIString nss

server.toneDetector.timeout

Interval after which the tone detector stops listening for FAX/SIT/TTY tones. Default: 20 seconds. To restart the tone detector within a session, the browser must send another INVITE with the header P-Nuance-Restart-Tone-Detector set to true.

Enabling and disabling tone detection

By default, the configuration file tones.all.on enables FAX, SIT, TTY, and AMD detection.

To disable one or more types of detection, set the following headers to "false" in the SIP INVITE:

  • P-Nuance-Activate-FAX
  • P-Nuance-Activate-SIT
  • P-Nuance-Activate-TTY
  • P-Nuance-Activate-AMD
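
As a rough illustration, a voice browser could generate these header lines from a set of detection types to disable. The following Python sketch uses a hypothetical helper, disable_headers, which is not part of any Nuance API. It builds only the four per-type headers listed above and leaves the rest of the INVITE to the caller.

```python
TONE_TYPES = ("FAX", "SIT", "TTY", "AMD")


def disable_headers(disabled):
    """Return SIP header lines that switch off the given detection types."""
    unknown = set(disabled) - set(TONE_TYPES)
    if unknown:
        raise ValueError(f"unsupported detection type(s): {sorted(unknown)}")
    # Emit headers in a fixed order so the result is deterministic.
    return [f"P-Nuance-Activate-{t}: false" for t in TONE_TYPES if t in disabled]


# Example: turn off SIT and answering machine detection for one call.
print("\n".join(disable_headers({"SIT", "AMD"})))
```

Types that are not passed to the helper produce no header, so they keep whatever the configuration file enables.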

Tone detection reporting

When a tone is detected, Speech Server notifies the voice browser using a SIP INFO request. The INFO message has the following format:

INFO request from NSS to browser:

INFO sip:vbs@host.com:5060 SIP/2.0
Content-Type: text/plain
Content-Length: nnn
notif-request: <event_name>

For example:

INFO sip:vbs@host.com:5060 SIP/2.0
Content-Type: text/plain
Content-Length: 35
notif-request: device.fax.cng.nss.1
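
On the browser side, the event name has to be read from the notif-request line of such a message. The following Python sketch shows one way to do that. The function extract_event_name is a hypothetical helper, and matching the field name case-insensitively is an assumption rather than documented behavior.

```python
from typing import Optional


def extract_event_name(info_message: str) -> Optional[str]:
    """Return the value of the notif-request line, or None if it is absent."""
    for line in info_message.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "notif-request":
            return value.strip()
    return None


# The example message from above, reproduced line by line.
message = (
    "INFO sip:vbs@host.com:5060 SIP/2.0\n"
    "Content-Type: text/plain\n"
    "Content-Length: 35\n"
    "notif-request: device.fax.cng.nss.1\n"
)
print(extract_event_name(message))
```

The request line also contains colons, but its text before the first colon is "INFO sip", so it never matches the field name.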

SIP INFO messages carry event names with the following syntax:

prefix.toneDescription.originator.MID

The segments of the event name are defined as follows:

Segment

Description

prefix

Hardcoded to "device" (without the quotes).

toneDescription

Information about the tone detected.

  • For FAX and TTY, the following values are hard coded:
    • fax.ced
    • fax.cng
    • tty
  • For AMD, the following values are hard coded:
    • answering-machine
    • answering-machine.endofgreeting
  • For SIT tones, the "desc" attribute of the <seq_tone> element as specified in the configuration file is returned.

originator

Software that uses the tone detection library. (Default: "nss" without the quotes.) You can specify originator in the Management Station using the parameter server.toneDetector.originator.

MID

RTP channel identifier.

For example, the following string shows how Speech Server would report a FAX CNG event when server.toneDetector.originator is set to "nss" and the MID is 1:

device.fax.cng.nss.1
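
Because toneDescription can itself contain dots (for example, fax.cng or answering-machine.endofgreeting), an event name cannot be split into segments by position from the left alone. The following Python sketch takes the originator and MID from the right instead. The names ToneEvent and parse_event_name are hypothetical. The sketch assumes that all four segments are present, as in the INFO example (device.fax.cng.nss.1), and that neither originator nor MID contains a dot.

```python
from typing import NamedTuple


class ToneEvent(NamedTuple):
    prefix: str
    tone_description: str
    originator: str
    mid: str


def parse_event_name(event_name: str) -> ToneEvent:
    """Split prefix.toneDescription.originator.MID into its segments."""
    parts = event_name.split(".")
    if len(parts) < 4 or parts[0] != "device":
        raise ValueError(f"not a tone event name: {event_name!r}")
    # The last two parts are originator and MID; everything between the
    # prefix and those two parts belongs to toneDescription.
    return ToneEvent(parts[0], ".".join(parts[1:-2]), parts[-2], parts[-1])


event = parse_event_name("device.fax.cng.nss.1")
print(event.tone_description, event.originator, event.mid)
```

An application can then dispatch on tone_description, for example treating fax.cng and fax.ced as fax calls and answering-machine.endofgreeting as the point to start playing a message.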