Voice filters

When searching for voices with GetVoicesRequest, you may include filters to narrow the list of voices. Without filters, GetVoices returns all Microsoft voices, which is a very long list. The filters described here are the following fields of the Voice message in GetVoicesRequest: name, language, gender, foreign_languages, styles, and sample_rate_hz.

The examples show the fields as entered in flow.py, the input file for the Sample synthesis client for Neural TTSaaS. For example:

request = GetVoicesRequest()
request.voice.name = "en-US-JennyNeural"
# More filters here if wanted
list_of_requests.append(request)

name

The name field returns information about a single named voice. If the voice supports foreign languages and/or styles, this information is included in the response.

request.voice.name = "en-US-JennyNeural"
2022-12-01 11:11:38,919 (140144522364736) DEBUG [voice {
  name: "en-US-JennyNeural"
}
]
2022-12-01 11:11:39,450 (140144522364736) INFO  voices {
  name: "en-US-JennyNeural"
  model: "neural"
  language: "en-US"
  gender: FEMALE
  sample_rate_hz: 24000
  styles: "assistant"
  styles: "chat"
  styles: "customerservice"
  styles: "newscast"
  styles: "angry"
  styles: "cheerful"
  styles: "sad"
  styles: "excited"
  styles: "friendly"
  styles: "terrified"
  styles: "shouting"
  styles: "unfriendly"
  styles: "whispering"
  styles: "hopeful"
}

language

The language field returns information about all voices with the specified language code.

request.voice.language = "en-US"
2022-12-01 11:20:53,606 (139929787422528) DEBUG [voice {
  language: "en-US"
}
]
2022-12-01 11:20:53,606 (139929787422528) INFO  Sending GetVoices request
2022-12-01 11:20:53,606 (139929787422528) INFO  Adding x-nuance-tts-neural header
2022-12-01 11:20:53,766 (139929787422528) INFO  voices {
  name: "en-US-JennyNeural"
  model: "neural"
  language: "en-US"
  gender: FEMALE
  sample_rate_hz: 24000
  styles: "assistant"
  styles: "chat"
  styles: "customerservice"
  styles: "newscast"
  styles: "angry"
  styles: "cheerful"
  styles: "sad"
  styles: "excited"
  styles: "friendly"
  styles: "terrified"
  styles: "shouting"
  styles: "unfriendly"
  styles: "whispering"
  styles: "hopeful"
}
voices {
  name: "en-US-JennyMultilingualNeural"
  model: "neural"
  language: "en-US"
  gender: FEMALE
  sample_rate_hz: 24000
  foreign_languages: "de-DE"
  foreign_languages: "en-AU"
  foreign_languages: "en-CA"
  foreign_languages: "en-GB"
  foreign_languages: "es-ES"
  foreign_languages: "es-MX"
  foreign_languages: "fr-CA"
  foreign_languages: "fr-FR"
  foreign_languages: "it-IT"
  foreign_languages: "ja-JP"
  foreign_languages: "ko-KR"
  foreign_languages: "pt-BR"
  foreign_languages: "zh-CN"
}
voices {
  name: "en-US-GuyNeural"
  model: "neural"
  language: "en-US"
  gender: MALE
  sample_rate_hz: 24000
  styles: "newscast"
  styles: "angry"
  styles: "cheerful"
  styles: "sad"
  styles: "excited"
  styles: "friendly"
  styles: "terrified"
  styles: "shouting"
  styles: "unfriendly"
  styles: "whispering"
  styles: "hopeful"
}
voices {
  name: "en-US-AmberNeural"
  model: "neural"
  language: "en-US"
  gender: FEMALE
  sample_rate_hz: 24000
}
voices {
  name: "en-US-AnaNeural"
  model: "neural"
  language: "en-US"
  gender: FEMALE
  sample_rate_hz: 24000
}
. . . 

gender

The gender field returns information about all voices of the specified gender.

You may combine fields to reduce the number of voices. For example, setting language and gender together returns the intersection of the two values, in this case all male American English voices.

request.voice.language = "en-US"
request.voice.gender = EnumGender.MALE
# Or request.voice.gender = 1
2022-12-01 11:23:48,141 (140466809009984) DEBUG [voice {
  language: "en-US"
  gender: MALE
}
]
2022-12-01 11:23:48,141 (140466809009984) INFO  Sending GetVoices request
2022-12-01 11:23:48,141 (140466809009984) INFO  Adding x-nuance-tts-neural header
2022-12-01 11:23:48,431 (140466809009984) INFO  voices {
  name: "en-US-GuyNeural"
  model: "neural"
  language: "en-US"
  gender: MALE
  sample_rate_hz: 24000
  styles: "newscast"
  styles: "angry"
  styles: "cheerful"
  styles: "sad"
  styles: "excited"
  styles: "friendly"
  styles: "terrified"
  styles: "shouting"
  styles: "unfriendly"
  styles: "whispering"
  styles: "hopeful"
}
voices {
  name: "en-US-BrandonNeural"
  model: "neural"
  language: "en-US"
  gender: MALE
  sample_rate_hz: 24000
}
voices {
  name: "en-US-ChristopherNeural"
  model: "neural"
  language: "en-US"
  gender: MALE
  sample_rate_hz: 24000
}
voices {
  name: "en-US-DavisNeural"
  model: "neural"
  language: "en-US"
  gender: MALE
  sample_rate_hz: 24000
  styles: "chat"
  styles: "angry"
  styles: "cheerful"
  styles: "excited"
  styles: "friendly"
  styles: "hopeful"
  styles: "sad"
  styles: "shouting"
  styles: "terrified"
  styles: "unfriendly"
  styles: "whispering"
}
voices {
  name: "en-US-EricNeural"
  model: "neural"
  language: "en-US"
  gender: MALE
  sample_rate_hz: 24000
}
...
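
The intersection behavior described above can be modeled locally. The following is a minimal sketch, not the service's implementation: the Voice dataclass, the filter_voices helper, and the three-entry CATALOG are hypothetical stand-ins, with voice data copied from the responses in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """Hypothetical local stand-in for a voice entry in a GetVoices response."""
    name: str
    language: str
    gender: str


def filter_voices(voices, language=None, gender=None):
    """Keep only voices that match every filter that is set (logical AND)."""
    return [
        v for v in voices
        if (language is None or v.language == language)
        and (gender is None or v.gender == gender)
    ]


# Small illustrative catalog; the real service returns many more voices.
CATALOG = [
    Voice("en-US-JennyNeural", "en-US", "FEMALE"),
    Voice("en-US-GuyNeural", "en-US", "MALE"),
    Voice("es-MX-BeatrizNeural", "es-MX", "FEMALE"),
]

males_en_us = filter_voices(CATALOG, language="en-US", gender="MALE")
print([v.name for v in males_en_us])  # ['en-US-GuyNeural']
```

Leaving a filter unset (None in this sketch) places no constraint on that field, so calling filter_voices(CATALOG) with no filters returns the whole catalog.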

foreign_languages

The foreign_languages field returns information about voices with one or more foreign languages. Currently only the Jenny multilingual voice includes foreign languages.

This field requests information about voices with any of the values specified. For example, if you ask about three French locales, JennyMultilingual is returned because that voice includes at least one of these languages.

request.voice.foreign_languages.extend(["fr-CA","fr-FR","fr-CH"])
2022-12-01 11:32:14,306 (140503242782528) DEBUG [voice {
  foreign_languages: "fr-CA"
  foreign_languages: "fr-FR"
  foreign_languages: "fr-CH"
}
]
2022-12-01 11:32:14,306 (140503242782528) INFO  Sending GetVoices request
2022-12-01 11:32:14,306 (140503242782528) INFO  Adding x-nuance-tts-neural header
2022-12-01 11:32:14,460 (140503242782528) INFO  voices {
  name: "en-US-JennyMultilingualNeural"
  model: "neural"
  language: "en-US"
  gender: FEMALE
  sample_rate_hz: 24000
  foreign_languages: "de-DE"
  foreign_languages: "en-AU"
  foreign_languages: "en-CA"
  foreign_languages: "en-GB"
  foreign_languages: "es-ES"
  foreign_languages: "es-MX"
  foreign_languages: "fr-CA"
  foreign_languages: "fr-FR"
  foreign_languages: "it-IT"
  foreign_languages: "ja-JP"
  foreign_languages: "ko-KR"
  foreign_languages: "pt-BR"
  foreign_languages: "zh-CN"
}
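
The "any of the values" rule can be illustrated with a small local sketch. The matches_any helper and the FOREIGN_LANGUAGES table are hypothetical and only model the behavior described above; the language lists are copied from the responses in this section.

```python
def matches_any(requested, supported):
    """Return True when at least one requested value is in supported."""
    return not set(requested).isdisjoint(supported)


# Foreign languages per voice, copied from the responses in this section.
FOREIGN_LANGUAGES = {
    "en-US-JennyMultilingualNeural": [
        "de-DE", "en-AU", "en-CA", "en-GB", "es-ES", "es-MX", "fr-CA",
        "fr-FR", "it-IT", "ja-JP", "ko-KR", "pt-BR", "zh-CN",
    ],
    "en-US-JennyNeural": [],  # its response lists no foreign_languages
}

requested = ["fr-CA", "fr-FR", "fr-CH"]
selected = [
    name for name, langs in FOREIGN_LANGUAGES.items()
    if matches_any(requested, langs)
]
print(selected)  # ['en-US-JennyMultilingualNeural']
```

In this model fr-CH matches nothing on its own, but the voice is still selected because fr-CA and fr-FR are in its list.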

styles

The styles field returns information about voices that have one or more styles.

Like foreign_languages, this field requests information about voices with any of the values specified. You may also include other filters to narrow the list.

For example, this request asks for Mandarin Chinese (zh-CN) voices that offer any of the newscast styles. It returns three voices: two with newscast and one with newscast-casual.

request.voice.language = "zh-CN"
request.voice.styles.extend(["newscast", "newscast-casual", "newscast-formal"])
2022-12-01 11:37:04,603 (139734435911488) DEBUG [voice {
  language: "zh-CN"
  styles: "newscast"
  styles: "newscast-casual"
  styles: "newscast-formal"
}
]
2022-12-01 11:37:04,603 (139734435911488) INFO  Sending GetVoices request
2022-12-01 11:37:04,603 (139734435911488) INFO  Adding x-nuance-tts-neural header
2022-12-01 11:37:04,915 (139734435911488) INFO  voices {
  name: "zh-CN-XiaoxiaoNeural"
  model: "neural"
  language: "zh-CN"
  gender: FEMALE
  sample_rate_hz: 24000
  styles: "assistant"
  styles: "chat"
  styles: "customerservice"
  styles: "newscast"
  styles: "affectionate"
  styles: "angry"
  styles: "calm"
  styles: "cheerful"
  styles: "disgruntled"
  styles: "fearful"
  styles: "gentle"
  styles: "lyrical"
  styles: "sad"
  styles: "serious"
  styles: "poetry-reading"
}
voices {
  name: "zh-CN-YunyangNeural"
  model: "neural"
  language: "zh-CN"
  gender: MALE
  sample_rate_hz: 24000
  styles: "customerservice"
  styles: "narration-professional"
  styles: "newscast-casual"
}
voices {
  name: "zh-CN-YunxiNeural"
  model: "neural"
  language: "zh-CN"
  gender: MALE
  sample_rate_hz: 24000
  styles: "narration-relaxed"
  styles: "embarrassed"
  styles: "fearful"
  styles: "cheerful"
  styles: "disgruntled"
  styles: "serious"
  styles: "angry"
  styles: "sad"
  styles: "depressed"
  styles: "chat"
  styles: "assistant"
  styles: "newscast"
}
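
Because the response does not say which requested style each voice matched, it can help to group the returned voices by style on the client side. The sketch below is an illustration only: voices_by_style is a hypothetical helper, and RETURNED_STYLES is a plain dictionary copied from the response above rather than a parsed GetVoices response.

```python
REQUESTED_STYLES = ["newscast", "newscast-casual", "newscast-formal"]

# Styles per returned voice, copied from the response above.
RETURNED_STYLES = {
    "zh-CN-XiaoxiaoNeural": [
        "assistant", "chat", "customerservice", "newscast", "affectionate",
        "angry", "calm", "cheerful", "disgruntled", "fearful", "gentle",
        "lyrical", "sad", "serious", "poetry-reading",
    ],
    "zh-CN-YunyangNeural": [
        "customerservice", "narration-professional", "newscast-casual",
    ],
    "zh-CN-YunxiNeural": [
        "narration-relaxed", "embarrassed", "fearful", "cheerful",
        "disgruntled", "serious", "angry", "sad", "depressed", "chat",
        "assistant", "newscast",
    ],
}


def voices_by_style(requested, returned):
    """Map each requested style to the voices that offer it."""
    return {
        style: [name for name, styles in returned.items() if style in styles]
        for style in requested
    }


by_style = voices_by_style(REQUESTED_STYLES, RETURNED_STYLES)
for style, names in by_style.items():
    print(style, names)
# newscast ['zh-CN-XiaoxiaoNeural', 'zh-CN-YunxiNeural']
# newscast-casual ['zh-CN-YunyangNeural']
# newscast-formal []
```

None of the three returned voices offers newscast-formal, which shows that a requested value may match nothing as long as another requested value matches.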

sample_rate_hz

The sample_rate_hz field returns voices with a specified sampling rate. Currently all voices have the same sampling rate (24000). Setting the rate to 0 ignores the filter, effectively returning all voices or the voices defined by other filters. For example, this returns all female Mexican Spanish voices.

request.voice.language = "es-MX"
request.voice.gender = EnumGender.FEMALE
request.voice.sample_rate_hz = 0
2022-12-01 11:43:24,600 (139891440494400) DEBUG [voice {
  language: "es-MX"
  gender: FEMALE
}
]
2022-12-01 11:43:24,600 (139891440494400) INFO  Sending GetVoices request
2022-12-01 11:43:24,600 (139891440494400) INFO  Adding x-nuance-tts-neural header
2022-12-01 11:43:24,906 (139891440494400) INFO  voices {
  name: "es-MX-BeatrizNeural"
  model: "neural"
  language: "es-MX"
  gender: FEMALE
  sample_rate_hz: 24000
}
voices {
  name: "es-MX-CandelaNeural"
  model: "neural"
  language: "es-MX"
  gender: FEMALE
  sample_rate_hz: 24000
}
voices {
  name: "es-MX-CarlotaNeural"
  model: "neural"
  language: "es-MX"
  gender: FEMALE
  sample_rate_hz: 24000
}
. . .
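
The "0 means no filter" rule can be sketched as follows. This is a local model under stated assumptions, not the service's code: the Voice dataclass and filter_by_rate helper are hypothetical, and the exact-match comparison for nonzero rates is an assumption. The DEBUG line in the log above does not show sample_rate_hz, which would be consistent with protobuf omitting a numeric field left at its default value of 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """Hypothetical local stand-in for a voice entry in a GetVoices response."""
    name: str
    sample_rate_hz: int


def filter_by_rate(voices, sample_rate_hz=0):
    """Treat 0 as 'no filter'; otherwise require an exact match (assumed)."""
    if sample_rate_hz == 0:
        return list(voices)
    return [v for v in voices if v.sample_rate_hz == sample_rate_hz]


# Voices and rates copied from the response above.
VOICES = [
    Voice("es-MX-BeatrizNeural", 24000),
    Voice("es-MX-CandelaNeural", 24000),
    Voice("es-MX-CarlotaNeural", 24000),
]

print(len(filter_by_rate(VOICES, 0)))      # 3
print(len(filter_by_rate(VOICES, 24000)))  # 3
print(len(filter_by_rate(VOICES, 22050)))  # 0
```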