Using the light SSML parser

Vocalizer uses a default SSML parser to process SSML input. It also provides an alternative, the light SSML parser, which uses fewer host resources (such as CPU and memory) than the default parser.

Before enabling the light SSML parser:

  • Review and update your SSML input to address any exceptions listed under SSML compliance in this topic.
  • Compare your input to the SSML compliance exceptions for the default parser listed in Vocalizer SSML support.
  • Run listening tests to check for SSML input errors.

SSML compliance

The light SSML parser supports all elements and attributes in the Speech Synthesis Markup Language Specification Version 1.0 (W3C Recommendation, 7 September 2004), regardless of their rating (MUST, REQUIRED, SHALL, SHOULD, RECOMMENDED, MAY, OPTIONAL), with the following exceptions:

  • The <voice> element:
    • The name attribute is supported, but the light SSML parser only supports a single voice name. Specifying more than one voice (for example, as a whitespace-separated list) is not supported. If the specified voice is not on the host, Vocalizer uses the installed voice with the highest priority.

      For example, suppose you have these voices installed: Laila, Samantha, Audrey-ml (highest priority), and Daniel. If you specify multiple voice names, such as <voice name="Laila,Audrey-ml,Samantha,Daniel">, Vocalizer uses voice Audrey-ml, since it has the highest priority. If you specify voice names that are not installed, such as <voice name="Tom,Zoe">, Vocalizer also uses voice Audrey-ml.

    • The variant attribute is not supported. Vocalizer uses the currently active voice if its weight is the same as that of other voices.
  • The <say-as> element:
    • Unlike the default SSML parser, if the current language code starts with "en" and the value of the interpret-as attribute is "rational", "real", or "decimal", the light SSML parser does not convert a decimal comma (",") to a decimal period ("."). Ensure that rational, real, and decimal numbers are formatted correctly in your input.
    • The format attribute is valid only when the value of the interpret-as attribute is "date". For the date value, the light SSML parser does not reorder the month and day for enu languages, as the default SSML parser does. Ensure that dates are formatted correctly in your input.
    • The detail attribute is not supported, but does not affect the output if specified. Consider removing the attribute from your input to avoid confusion.
  • The <phoneme> element: The language attribute does not support the values "ipa" and "unipa". If specified, the output is not affected, but consider changing these values in your input to avoid confusion.
  • The <s> element: The ssft-dtype attribute is not supported. If specified, the output is not affected, but consider removing the attribute from your input to avoid confusion.
  • The <audio> element: The attributes fetchhint, fetchtimeout, maxage, and maxstale are not supported. If specified, the output is not affected, but consider removing these attributes from your input to avoid confusion.
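
The exceptions above lend themselves to an automated pre-check before listening tests. The following is a minimal Python sketch, not part of Vocalizer: the function name lint_light_ssml and the warning texts are hypothetical, and it covers only a subset of the exceptions (the <phoneme> rule is not checked). It assumes the input is well-formed XML and flags a decimal comma without checking the current language code.

```python
import xml.etree.ElementTree as ET

# Attribute and value names below are taken from the exception list above.
AUDIO_UNSUPPORTED = ("fetchhint", "fetchtimeout", "maxage", "maxstale")
NUMERIC_TYPES = ("rational", "real", "decimal")


def local(name):
    """Drop a '{namespace}' prefix from an element or attribute name."""
    return name.rsplit("}", 1)[-1]


def lint_light_ssml(ssml_text):
    """Return warnings for input the light SSML parser handles differently.

    Hypothetical helper for illustration; it is not a Vocalizer API.
    """
    warnings = []
    for el in ET.fromstring(ssml_text).iter():
        tag = local(el.tag)
        attrs = {local(k): v for k, v in el.attrib.items()}
        if tag == "voice":
            name = attrs.get("name", "")
            # Treat both commas and whitespace as separators between names.
            if len(name.replace(",", " ").split()) > 1:
                warnings.append(f"voice: more than one name in name={name!r}")
            if "variant" in attrs:
                warnings.append("voice: variant attribute is not supported")
        elif tag == "say-as":
            kind = attrs.get("interpret-as")
            if "detail" in attrs:
                warnings.append("say-as: detail attribute is not supported")
            if "format" in attrs and kind != "date":
                warnings.append("say-as: format is valid only with interpret-as='date'")
            if kind in NUMERIC_TYPES and "," in (el.text or ""):
                warnings.append("say-as: comma in number is not converted to a period")
        elif tag == "s" and "ssft-dtype" in attrs:
            warnings.append("s: ssft-dtype attribute is not supported")
        elif tag == "audio":
            for attr in AUDIO_UNSUPPORTED:
                if attr in attrs:
                    warnings.append(f"audio: {attr} attribute is not supported")
    return warnings


if __name__ == "__main__":
    sample = '<speak><voice name="Tom,Zoe">Hello</voice></speak>'
    for warning in lint_light_ssml(sample):
        print(warning)
```

A check like this only finds markup that the list above describes. It does not replace listening tests, which also catch problems in how dates and numbers are actually spoken.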

Enabling the light SSML parser

You enable the light SSML parser in a Vocalizer configuration file.

  1. Edit the Vocalizer configuration file.
  2. Uncomment the use_lightssml parameter and set it to true. For example: <use_lightssml>true</use_lightssml>
  3. Save the Vocalizer configuration file.
  4. Restart Vocalizer.
  5. Run listening tests to check for SSML input errors.
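
After editing, it can help to confirm that the parameter is really uncommented and set to true. The sketch below is one way to do that in Python, under stated assumptions: the configuration file is well-formed XML, the use_lightssml element has no namespace, and the <config> root element used in the usage lines is a placeholder, not the actual Vocalizer file layout. The function name light_ssml_enabled is hypothetical.

```python
import xml.etree.ElementTree as ET


def light_ssml_enabled(config_text):
    """Return True if an uncommented <use_lightssml> element is set to 'true'.

    ElementTree discards XML comments by default, so a parameter that is
    still commented out is never found by iter().
    """
    root = ET.fromstring(config_text)
    for el in root.iter("use_lightssml"):
        return (el.text or "").strip() == "true"
    return False


if __name__ == "__main__":
    # <config> is a placeholder root element for illustration only.
    enabled = "<config><use_lightssml>true</use_lightssml></config>"
    commented = "<config><!-- <use_lightssml>true</use_lightssml> --></config>"
    print(light_ssml_enabled(enabled))
    print(light_ssml_enabled(commented))
```

To check a real file, read its contents into a string and pass that string to the function. The check only confirms the setting in the file; Vocalizer still needs a restart for the change to take effect.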