Getting recognition results

Recognition results are returned in different XML formats, depending on the media type requested by the application. By default, Speech Server requests NLSML (Natural Language Semantic Markup Language).

The voice browser is responsible for mapping the results to the VoiceXML variable application.lastresult$. See the VoiceXML and MRCP standards documentation.
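
As a minimal sketch of how an application might read the mapped result, the VoiceXML 2.0 document below logs three standard properties of application.lastresult$ after a field is filled. The grammar URI, form id, field name, and prompt text are hypothetical, and the exact values exposed depend on how your voice browser performs the mapping.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="destination">
    <field name="city">
      <!-- Hypothetical grammar URI, for illustration only -->
      <grammar src="http://www.example.com/flight" type="application/srgs+xml"/>
      <prompt>Where would you like to go?</prompt>
      <filled>
        <!-- Properties read directly on lastresult$ refer to the top-ranked entry (index 0) -->
        <log>
          Utterance: <value expr="application.lastresult$.utterance"/>
          Input mode: <value expr="application.lastresult$.inputmode"/>
          Confidence: <value expr="application.lastresult$.confidence"/>
        </log>
      </filled>
    </field>
  </form>
</vxml>
```

Reading application.lastresult$.utterance is equivalent to reading application.lastresult$[0].utterance, so this form is convenient when only the best hypothesis matters.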

Confidence scores

Dragon Voice and Nuance Recognizer use different techniques to return confidence scores to your application:

  • For Nuance Recognizer recognition events: Use the VoiceXML “confidence” returned to your application. The value of confidence corresponds to the CONF token (which is also returned).
  • For Dragon Voice recognition events: To assess confidence scores, ignore “confidence” in the VoiceXML recognition results, and use the NLCONF token in the log events instead. See NLEinnd (QuickNLP interpretation end) and NLEplnd (Pipeline end).

Nuance Recognizer results

The following example illustrates a typical case of a user utterance (“I want to go to Pittsburgh”) and the corresponding recognition result, where the system presents more than one possible interpretation.

<?xml version="1.0"?>
<result xmlns="http://www.ietf.org/xml/ns/mrcpv2"
      xmlns:ex="http://www.example.com/example"
      grammar="http://www.example.com/flight">
  <interpretation confidence="0.6">
    <instance>
      <ex:airline>
        <ex:to_city>Pittsburgh</ex:to_city>
      </ex:airline>
    </instance>
    <input mode="speech">
      I want to go to Pittsburgh
    </input>
  </interpretation>
  <interpretation confidence="0.4">
    <instance>
      <ex:airline>
        <ex:to_city>Stockholm</ex:to_city>
      </ex:airline>
    </instance>
    <input>I want to go to Stockholm</input>
  </interpretation>
</result>
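
When a result carries several interpretations like this, a VoiceXML application typically sees them as successive entries of the application.lastresult$ array. The sketch below, which assumes the browser maps each interpretation to one entry, raises the standard maxnbest property and logs every entry's utterance and confidence. The grammar URI is reused from the example result; the form id, field name, prompt text, and the summary variable are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Request up to two n-best entries; the VoiceXML default is 1 -->
  <property name="maxnbest" value="2"/>
  <form id="destination">
    <field name="city">
      <!-- Grammar URI reused from the example result; hypothetical -->
      <grammar src="http://www.example.com/flight" type="application/srgs+xml"/>
      <prompt>Where would you like to go?</prompt>
      <filled>
        <var name="summary" expr="''"/>
        <script><![CDATA[
          // Collect the utterance and confidence of every n-best entry
          for (var i = 0; i < application.lastresult$.length; i++) {
            var entry = application.lastresult$[i];
            summary += entry.utterance + ' (' + entry.confidence + '); ';
          }
        ]]></script>
        <log expr="summary"/>
      </filled>
    </field>
  </form>
</vxml>
```

The loop lives in a script element rather than a foreach element so that the document stays within VoiceXML 2.0.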

To specify the MIME media type of the recognition result, set the server.mrcp2.osrspeechrecog.mrcpdefaults.VSP.server.osrspeechrecog.result.mediatype parameter.

The optional swirec_result_enable_speech_mode parameter ensures that recognition results conform to the VoiceXML 2.0 specification.

Dragon Voice results: Krypton-only

Krypton-only results have a literal meaning. They never have a semantic intent.

Dragon Voice results: open-dialog

Open-dialog results have a meaning representation that combines standard slots. Any slot can be captured at any turn, so it is up to the application to extract (or ignore) the meaning (slots) returned.

To assess confidence scores, ignore “confidence” in the VoiceXML recognition results, and use the NLCONF token in the log events instead. See NLEinnd (QuickNLP interpretation end) and NLEplnd (Pipeline end).

The following examples are taken from the pay-bill VoiceXML document shown in Example: open-dialog using Dragon Voice.