Tips on prons files

This section describes how to construct pronunciation files for a Mix project.

What is a prons file?

The Krypton recognition engine lets you add custom pronunciations to improve recognition of user speech in a single language and a specific domain. Recall how speech recognition works, as discussed previously: phonemes are inferred from the speech audio, and then words in the language and domain are inferred from combinations of those phonemes. A domain often includes words with unusual, unexpected, or varying pronunciations, where the relation between the phonemes and the orthography (spelling) differs from what is typical in the language. Additional valid pronunciations of a word may also need to be added to account for the speaker’s region or history.

To address these cases, you can add phonetic pronunciations to your DLM, expressed in either the IPA (International Phonetic Alphabet) or the X-SAMPA (Extended Speech Assessment Methods Phonetic Alphabet) phonetic alphabet. The words may be existing words or words to be added with a wordset.

To use custom prons in Krypton, create a _client_prons.txt file and include it with the training data when you generate the DLM. The _client_prons.txt file can be the only input file for the DLM, or it can be used in conjunction with the TRSX file.

Alternative approach via wordset

The custom pron feature is recommended for speech scientists who are comfortable using phonetic alphabets to represent pronunciations of words. Others may find it more convenient to use an ASR wordset with the “spoken” option specified. This offers an alternative way to specify unusual pronunciations.

While prons are used as part of the inputs to build DLMs, wordsets are used in conjunction with DLMs at runtime.

For more details, see Wordsets in the ASRaaS documentation. If you are using Speech Suite, consult the Speech Suite documentation for tips on wordset usage.

Creating a prons file

To create a prons file:

  • Create a plain text file named _client_prons.txt. This specific name is required, and it must include the initial underscore.
  • Encode the file with UTF-8.
  • On the first line, specify the phonetic alphabet used in the file (IPA or XSAMPA), optionally followed by the phoneme separator. The default separator is a dot (.).
  • On each subsequent line, enter a word followed by one or more pronunciations. Separate all fields with tabs. Each line must have at least one pronunciation for the word.
  • If you need to add a custom pronunciation for a multi-word phrase, we recommend using a Wordset instead.
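
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of any Mix or Krypton tooling: the function name write_prons_file and its parameters are hypothetical, and it assumes that the header line is tab-separated in the same way as the word lines.

```python
import tempfile
from pathlib import Path

PRONS_FILENAME = "_client_prons.txt"   # required name, including the underscore
ALPHABETS = ("IPA", "XSAMPA")          # alphabet tokens used on the first line


def write_prons_file(directory, alphabet, entries, separator="."):
    """Write a prons file into `directory` and return its path.

    `entries` maps each word to a list of pronunciations, where every
    pronunciation is a list of phoneme symbols. (Hypothetical helper.)
    """
    if alphabet not in ALPHABETS:
        raise ValueError(f"alphabet must be one of {ALPHABETS}")
    # Assumption: the header fields are tab-separated like the word lines.
    lines = [f"{alphabet}\t{separator}"]
    for word, prons in entries.items():
        if not prons:
            raise ValueError(f"{word!r} needs at least one pronunciation")
        if "\t" in word:
            raise ValueError(f"{word!r} must not contain a tab")
        # Join phonemes with the separator, then join fields with tabs.
        fields = [word] + [separator.join(phonemes) for phonemes in prons]
        lines.append("\t".join(fields))
    path = Path(directory) / PRONS_FILENAME
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")  # UTF-8 required
    return path


# Usage: write one en-US entry into a temporary directory and show the file.
with tempfile.TemporaryDirectory() as tmp:
    scone = {"scone": [["s", "k", "ɑ", "n"], ["s", "k", "o", "n"]]}
    print(write_prons_file(tmp, "IPA", scone).read_text(encoding="utf-8"))
```

Keeping each pronunciation as a list of phonemes until the file is written makes it harder to drop or double a separator by accident.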

Example prons files

Here is an example prons file using IPA. This example is for en-US.

IPA     .
tomato       t.ə.m.e.ɾ.o       t.ə.m.ɑ.ɾ.o    
scone       s.k.ɑ.n       s.k.o.n        
caribbean       k.ə.ɹ.ɪ.b.i.n̩       k.æ.ɹ.ɨ.b.i.n̩       k.ɜ.ɹ.ɨ.b.i.n̩  
herb       h.ɚ.b       ɚ.b            
futile       f.j.u.ɾ.l̩       f.j.u.t.ɑɪ.l   

And here is an example prons file for the same pronunciations using X-SAMPA. This example is also for en-US.

XSAMPA     .
tomato       t.@.m.e.4.o       t.@.m.A.4.o     
scone       s.k.A.n       s.k.o.n       
caribbean       k.@.r\.I.b.i.n=       k.{.r\.1.b.i.n=       k.3.r\.1.b.i.n= 
herb       h.@`.b       @`.b            
futile       f.j.u.4.l=       f.j.u.t.AI.l    
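
Before including a prons file with the training data, it can help to check it against the format rules described earlier. The following Python sketch is one way to do that. The function parse_prons is hypothetical and is not part of any Mix or Krypton tooling. It checks only the layout (header, tab-separated fields, at least one pronunciation per word), not whether each phoneme symbol is valid for the language.

```python
ALPHABETS = ("IPA", "XSAMPA")  # alphabet tokens allowed on the first line


def parse_prons(text):
    """Parse prons file text into (alphabet, separator, entries).

    `entries` maps each word to a list of pronunciations, each of which
    is a list of phoneme symbols. Raises ValueError on layout problems.
    (Hypothetical helper, shown for illustration only.)
    """
    rows = text.splitlines()
    if not rows or not rows[0].strip():
        raise ValueError("line 1: missing alphabet header")
    header = rows[0].split()
    alphabet = header[0]
    if alphabet not in ALPHABETS:
        raise ValueError(f"line 1: unknown alphabet {alphabet!r}")
    # The separator is optional; a dot is the default.
    separator = header[1] if len(header) > 1 else "."

    entries = {}
    for number, row in enumerate(rows[1:], start=2):
        if not row.strip():
            continue  # tolerate blank lines
        # Fields are tab-separated; ignore empty fields from trailing tabs.
        fields = [f.strip() for f in row.split("\t") if f.strip()]
        word, prons = fields[0], fields[1:]
        if not prons:
            raise ValueError(f"line {number}: {word!r} has no pronunciation")
        phoneme_lists = [p.split(separator) for p in prons]
        if any("" in phonemes for phonemes in phoneme_lists):
            raise ValueError(f"line {number}: empty phoneme in {word!r}")
        entries[word] = phoneme_lists
    return alphabet, separator, entries


# Usage: parse two of the X-SAMPA entries shown above.
sample = "XSAMPA\t.\nscone\ts.k.A.n\ts.k.o.n\nherb\th.@`.b\t@`.b\n"
alphabet, separator, entries = parse_prons(sample)
print(alphabet, entries["herb"])
```

A row whose fields are separated by spaces rather than tabs is read as a single field and is therefore reported as having no pronunciation, which catches a common editing mistake.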

Multi-language projects

If you have a multi-language project, you need to create a separate prons file for each language in the project.