French Canada (fr-CA)

This documentation was updated on May 8, 2023.

Creating grammars

The following subsections describe key issues for working with grammar documents in the Canadian French language.

Grammar file encoding

Nuance has full internal Unicode support. Create your grammars using ISO-8859-1 (also known as Latin-1) or UTF-8. For example, your grammar header might be:

<?xml version='1.0' encoding='UTF-8'?>
<grammar xml:lang="fr-CA" version="1.0" root="test">
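The declared encoding has to match the bytes actually written to the file. The following Python sketch is illustrative only: it assumes a grammar in the W3C SRGS XML namespace, and the rule content (the two items) is made up for the example. It writes the same grammar in both supported encodings and reads the accented items back.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal grammar; only the header attributes come from this page.
GRAMMAR_TEMPLATE = """<?xml version='1.0' encoding='{encoding}'?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="fr-CA" version="1.0" root="test">
  <rule id="test">
    <one-of>
      <item>zéro</item>
      <item>décembre</item>
    </one-of>
  </rule>
</grammar>
"""

SRGS_NS = {"g": "http://www.w3.org/2001/06/grammar"}


def encode_grammar(encoding: str) -> bytes:
    """Return the grammar as bytes, declared and encoded with the same encoding."""
    return GRAMMAR_TEMPLATE.format(encoding=encoding).encode(encoding)


def read_items(grammar_bytes: bytes) -> list:
    """Parse the grammar bytes and return the text of every <item>."""
    root = ET.fromstring(grammar_bytes)
    return [item.text for item in root.findall(".//g:item", SRGS_NS)]


for enc in ("UTF-8", "ISO-8859-1"):
    print(enc, read_items(encode_grammar(enc)))
```

Both encodings yield the same items after parsing, even though the byte sequences on disk differ.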

alphanum_lc built-in grammar

The alphanum_lc built-in grammar recognizes a connected string of up to 20 digits and lowercase alphabetic characters, such as “a8f9h23”. For example, this grammar could be used to recognize a product code or user ID. The “lc” in the name of this built-in means lowercase. The possible characters are the lowercase letters a-z à â ç é è ê ë î ï ô ö û ü ù and the digits 0-9. The application layer can adjust the case of the returned letters as needed for further processing.
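As a sketch of the application-layer processing mentioned above, the following Python helper checks a returned string against the character set and length described here, then uppercases it. The function names are hypothetical, and the minimum length of one character is an assumption.

```python
import re

# Characters listed for alphanum_lc: a-z, the accented lowercase letters, and 0-9.
ALPHANUM_LC_CHARS = "a-zàâçéèêëîïôöûüù0-9"
# Assumption: at least 1 character; the page only states the maximum of 20.
ALPHANUM_LC_PATTERN = re.compile(f"[{ALPHANUM_LC_CHARS}]{{1,20}}")


def is_alphanum_lc(value: str) -> bool:
    """Return True if value matches the alphanum_lc description."""
    return ALPHANUM_LC_PATTERN.fullmatch(value) is not None


def to_product_code(value: str) -> str:
    """Hypothetical post-processing step: uppercase the returned letters."""
    if not is_alphanum_lc(value):
        raise ValueError(f"not an alphanum_lc result: {value!r}")
    return value.upper()


print(to_product_code("a8f9h23"))
```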

Note: This grammar replaces the alphanum built-in grammar.

alphanum built-in grammar

Note: This grammar is retained for backward compatibility only and has been replaced by the alphanum_lc grammar. For new implementations, use the alphanum_lc built-in grammar.

The alphanum built-in grammar recognizes a connected string of up to 20 digits and uppercase or lowercase alphabetic characters, such as “A8f9h23”. For example, this grammar could be used to recognize a product code or order number. The possible characters are the uppercase letters A-Z À Â Ç É È Ê Ë Î Ï Ô Ö Û Ü Ù, the lowercase letters a-z à â ç é è ê ë î ï ô ö û ü ù, and the digits 0-9. Uppercase and lowercase letters sound identical when spoken (e.g., “B” and “b”), so including both is redundant when recognizing case-insensitive items such as product codes. For this reason, the alphanum built-in grammar has been replaced by the alphanum_lc grammar.

boolean built-in grammar

The boolean grammar collects an affirmative or negative response.

Properties

The y and n parameters let you associate any two touchtone buttons as synonyms for yes and no.

Parameter Description
y Desired DTMF digit to be equivalent to “oui” (default = 1)
n Desired DTMF digit to be equivalent to “non” (default = 2)

Examples

Caller says… MEANING key
oui true
non false

ccexpdate built-in grammar

The ccexpdate grammar understands the expiration date on a credit card. Expiration dates are usually a month and a year, and are often embossed on a credit card in the form “mm/yy.” The grammar recognizes variations on the date, for example, “Décembre 2007,” “douze zéro sept,” “douze barre oblique zéro sept,” “douze du zéro sept,” “douze slash zéro sept,” etc.

creditcard built-in grammar

The creditcard grammar understands a caller saying a credit card number, optionally preceding the number with the credit card name, or the word “numéro.” For example, a caller can say, “carte visa numéro quatre zéro un sept…,” “mastercard cinq zéro zéro deux…,” or “trois sept trois cinq….”

currency built-in grammar

The currency grammar collects currency amounts expressed in dollars and cents, or in piastres and sous.

MEANING Contains a string in the following form: <currency><main_unit_amount>.<subunit_amount>. A currency value of CAD is added as a prefix if the caller explicitly says “dollar canadien.” If the caller omits the main unit or subunit amount, that field is zero. The string contains a leading zero if the subunit amount is collected without the main unit.
SWI_literal Contains the exact text that was recognized.

Examples

Caller says MEANING
cinq dollars 5.00
cinq piastres 5.00
cinq sous 0.05
cinq dollar canadien et cinq sous CAD5.05
cinq dollars et cinq sous 5.05
cinq dollars cinq 5.05
cinq dollars vingt-cinq sous 5.25
cinq dollars vingt-cinq 5.25
cinq vingt-cinq 5.25
cinq et vingt-cinq 5.25
cinq virgule vingt-cinq 5.25
cinq piastres vingt-cinq 5.25
six cent vingt-cinq mille quatre cent soixante-quatre dollars 625464.00
six cent vingt-cinq mille quatre cent soixante-quatre dollars et zéro cent 625464.00
six cent vingt-cinq mille quatre cent soixante-quatre dollars zéro sou 625464.00
un dollar 1.00
un dollar et zéro cent 1.00
un dollar zéro cent 1.00
un dollar vingt-deux cents 1.22
un dollar et vingt-deux cents 1.22
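An application typically needs to split the MEANING string into its currency prefix and numeric amount. The following Python sketch is one way to do that. The names are hypothetical, and it assumes the subunit always has two digits, as in the examples above.

```python
import re
from decimal import Decimal
from typing import NamedTuple, Optional


class CurrencyMeaning(NamedTuple):
    currency: Optional[str]  # "CAD" when the caller said "dollar canadien", else None
    amount: Decimal


# Optional alphabetic prefix, then main_unit_amount "." subunit_amount.
# Assumption: the subunit is always two digits, as in the examples.
_MEANING_RE = re.compile(r"([A-Z]{3})?(\d+)\.(\d{2})")


def parse_currency_meaning(meaning: str) -> CurrencyMeaning:
    """Split a currency MEANING value into (currency, amount)."""
    match = _MEANING_RE.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected currency MEANING: {meaning!r}")
    currency, main_unit, subunit = match.groups()
    return CurrencyMeaning(currency, Decimal(f"{main_unit}.{subunit}"))


print(parse_currency_meaning("CAD5.05"))
print(parse_currency_meaning("0.05"))
```

Decimal is used instead of float so that amounts such as 0.05 are represented exactly.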

date built-in grammar

The date grammar accepts a date spoken in any of several formats.

Recognized phrases include “4 Juin,” “4 Juin 2001,” “4-6-2001,” “le quatre du six,” and “Lundi, le 4 Juin.”

The grammar also accepts “hier,” “aujourd’hui,” and “demain,” which return values of -1, 0, and +1 respectively into the MEANING key.

Examples

Caller says MEANING key
le 5 janvier 2004 20040105
le 5 janvier de l’an 2004 20040105
zéro cinq zéro un deux mille quatre 20040105
cinq un deux mille quatre 20040105
hier -1
avant-hier (Phrase not recognized)
aujourd’hui 0
demain +1
après-demain (Phrase not recognized)
le quatre ??????04
zéro quatre ??????04
mercredi (Phrase not recognized)
mercredi le 12 ??????12
mercredi 12 ??????12
le quatre juin ????0604
quatre juin ????0604
le quatre juin mille neuf cent quatre-vingt-dix-sept 19970604
quatre juin mille neuf cent quatre-vingt-dix-sept 19970604
le quatre juin quatre-vingt-dix-sept ??970604
quatre juin quatre-vingt-dix-sept ??970604
mercredi quatre juin mille neuf cent quatre-vingt-dix-sept 19970604
le mercredi quatre juin mille neuf cent quatre-vingt-dix-sept 19970604
le six ??????06
zéro six ??????06
le six avril ????0406
zéro six zéro quatre ????0406
le mercredi 12 ??????12
le dix décembre ????1210
dix douze ????1210
le dix du douze ????1210
le dix du douze quatre-vingt deux ??821210
le dix décembre quatre-vingt-dix-sept ??971210
dix douze quatre-vingt-dix-sept ??971210
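The MEANING values in the table mix complete dates, partial dates padded with “?”, and the relative values -1, 0, and +1. The following Python sketch shows one way an application might separate these cases. The function name and the returned dictionary layout are hypothetical, and choosing a century for a two-digit year is left to the application.

```python
import re
from typing import Dict, Optional

# Eight characters, yyyymmdd, where "?" marks a digit the caller did not supply.
_DATE_RE = re.compile(r"([0-9?]{4})([0-9?]{2})([0-9?]{2})")
_RELATIVE = {"-1", "0", "+1"}  # hier, aujourd'hui, demain


def _known(text: str) -> Optional[int]:
    """Return the field as an int, or None if any digit is unknown."""
    return None if "?" in text else int(text)


def parse_date_meaning(meaning: str) -> Dict[str, Optional[int]]:
    """Split a date MEANING value into its known and unknown parts."""
    if meaning in _RELATIVE:
        return {"relative_days": int(meaning)}
    match = _DATE_RE.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected date MEANING: {meaning!r}")
    year, month, day = match.groups()
    # "??97" means a two-digit year was spoken; the century stays unresolved.
    short_year = _known(year[2:]) if year.startswith("??") else None
    return {
        "year": _known(year),
        "short_year": short_year,
        "month": _known(month),
        "day": _known(day),
    }


print(parse_date_meaning("??970604"))
print(parse_date_meaning("+1"))
```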

digits built-in grammar

Valid characters are the digits 0-9.

number built-in grammar

The number grammar recognizes numbers spoken as whole values rather than as individual digits.

Examples

Numbers from -999,999,999.99 to 999,999,999.99 are recognized, but by default the minallowed parameter is set to zero, which limits recognition to non-negative values.

Caller says MEANING key
vingt-cinq 25
douze mille trois cent quarante-cinq 12345
moins quatre -4
quatorze point cinquante-six 14.56
six virgule deux sept neuf deux 6.2792

phone built-in grammar

The phone grammar collects telephone numbers (landline and cellular) using the North American dialing plan. The grammar also accepts the reserved 3-digit numbers 411, 611, and 911. Toll-free numbers can begin with “un huit cent” followed by 7 digits.

The grammar allows natural numbers as well as responses of just digits. This lets callers combine adjacent digits in the phone number when they are more easily spoken as a single number, for example, saying “25” instead of “2” “5”.

Properties

As stipulated in the VoiceXML specification, the caller may specify an extension. By default, extensions one to four digits long are supported.

Property Description
minextension Minimum numeric value allowed for an extension.
maxextension Maximum numeric value allowed for an extension. Set this to 0 to disallow extensions.

socialsecurity built-in grammar

The socialsecurity grammar understands 9-digit Canadian Social Insurance Numbers (SINs). For example, a caller can say, “quatre neuf zéro zéro neuf un six cinq neuf.” Illegal numbers, such as those beginning with three zeroes, are rejected.

time built-in grammar

The time grammar recognizes spoken time of day utterances from the caller. Recognized phrases include times given in 12-hour format (e.g., “cinq heures”) and 24-hour format (“vingt-trois heures quinze”). In addition, it will recognize “qualified” times such as “avant cinq heures.”

Return keys/values

In addition to the MEANING key, the following keys are set:

HOUR 2-character value (00 to 24) corresponding to the hour.
MINUTE 2-character value (00 to 59) corresponding to the minute.
AMPM One of “a” (a.m.), “p” (p.m.), “h” (24-hour format), or “?” (unknown).
QUALIFIER One of ‘exact’, ‘approx’ (abbreviation for approximately), ‘before’, or ‘after’, depending on whether the caller spoke a word to qualify the time. The default is ‘exact’ if no qualifier is heard. The following phrases set the qualifier:
approx = à_peu_près, environ, approximativement, autour_de
before = avant
after = à_partir_de, après
from = dans, d’ici_à

Examples

For each entry, the values returned in the MEANING and QUALIFIER keys are shown. (Not shown are the values of the HOUR, MINUTE, and AMPM keys.)

Caller says MEANING QUALIFIER
tout de suite (Phrase not recognized) --
immédiatement (Phrase not recognized) --
dans une demi-heure (Phrase not recognized) --
à midi 1200p exact
à minuit 0000a exact
avant midi 1200p before (Beware that the caller might have intended “approx.”)
après treize heures trente 1330h after
vingt heures vingt 2020h exact
huit heures vingt 0820? exact
huit heures vingt du matin 0820a exact
huit heures trente 0830? exact
huit heures et demie 0830? exact
dix-neuf heures quinze 1915h exact
sept heures et quart du soir 0715p exact
une heure 0100? exact
une heure du matin 0100a exact
vingt-quatre heures 0000h exact
minuit 0000h exact
zéro heure 0000h exact
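Because the MEANING value can end in “?”, an application may need to consider two possible clock times. The following Python sketch expands a time MEANING into its candidate 24-hour times. The function name is hypothetical, and mapping an hour of 24 to 0 is an assumption.

```python
import re
from typing import List, Tuple

# Two hour digits, two minute digits, then the AMPM marker.
_TIME_RE = re.compile(r"(\d{2})(\d{2})([aph?])")


def candidate_times(meaning: str) -> List[Tuple[int, int]]:
    """Return the possible (hour, minute) pairs on a 24-hour clock."""
    match = _TIME_RE.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected time MEANING: {meaning!r}")
    hour, minute, ampm = int(match.group(1)), int(match.group(2)), match.group(3)
    if ampm == "h":
        # Assumption: an hour of 24 is treated as midnight (0).
        return [(hour % 24, minute)]
    morning = (hour % 12, minute)
    evening = (hour % 12 + 12, minute)
    if ampm == "a":
        return [morning]
    if ampm == "p":
        return [evening]
    # "?": the caller did not say whether the time is a.m. or p.m.
    return [morning, evening]


print(candidate_times("0820?"))
print(candidate_times("0715p"))
```

When two candidates are returned, the dialog can ask a follow-up question or apply a business rule, such as opening hours, to choose between them.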

zipcode built-in grammar

The zipcode grammar recognizes valid alphanumeric postal codes (“codes postaux”) in Canada. The following table shows the format. (“A” indicates an alphabetic character, and “N” indicates a digit.)

Format Example
ANA NAN A3Z 3N7

The letters W and Z never appear as the first letter.

These letters never appear in any position: D, F, I, O, Q, U.

Return keys/values

Upon return, the MEANING key is set to the recognized postal code. The string is alphanumeric, all uppercase, and contains no spaces. For example, “A3Z 3N7” is returned as “A3Z3N7”.
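The format rules above can be expressed as a regular expression. The following Python sketch validates a returned MEANING value against those rules and reinserts the space for display. The function name is hypothetical.

```python
import re

# Letters allowed in any position: A-Z without D, F, I, O, Q, U.
_ANY_LETTER = "ABCEGHJ-NPRSTV-Z"
# The first letter additionally excludes W and Z.
_FIRST_LETTER = "ABCEGHJ-NPRSTVXY"

POSTAL_CODE_RE = re.compile(
    f"[{_FIRST_LETTER}][0-9][{_ANY_LETTER}][0-9][{_ANY_LETTER}][0-9]"
)


def format_postal_code(meaning: str) -> str:
    """Turn a MEANING value such as "A3Z3N7" into the display form "A3Z 3N7"."""
    if POSTAL_CODE_RE.fullmatch(meaning) is None:
        raise ValueError(f"not a valid postal code: {meaning!r}")
    return f"{meaning[:3]} {meaning[3:]}"


print(format_postal_code("A3Z3N7"))
```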

Vocabulary items and pronunciations

This chapter describes considerations for vocabularies and their pronunciations in Canadian French (fr-CA).

Specially tuned pronunciations

The following table shows common words that are fine-tuned by Nuance. Each of these words contains “word-specific phonemes;” that is, phonemes and associated models created especially for the words.

Words with tuned pronunciations (do not modify):

  • All letters of the alphabet, a-z
  • Oui, non
  • Cardinal numbers: 0-99, 100, and 1000
  • Ordinal numbers: 1.-31. (1er, 2ième … 31)

Canadian French pronunciations

This section provides detailed reference information to help create pronunciation dictionaries. It is intended for people who have sufficient knowledge of the French language as spoken in Canada. It provides information about transcription and pronunciation.

This section explains all the phonemes and their SAMPA symbols used in the Canadian French language. As a reference pronunciation dictionary, we use:

Le Nouveau Petit Robert (Pons); Dictionnaire de la Langue Française. Paris: Dictionnaires Le Robert (Klett) 1995. (ISBN 2-85036-500-9)

If you are not sure how a certain word is pronounced, you can refer to the IPA transcriptions given there and then convert them into the SAMPA symbols given in “The Canadian French symbol set in alphabetical order.”

The Canadian French phoneme system

The Canadian French phoneme system can be divided into three groups:

  • Consonants
  • Semi-consonants
  • Vowels

Furthermore, it is possible to define five different types of consonants:

  • Plosives
  • Fricatives
  • Laterals
  • Trills
  • Nasals

Within the vowel group, further distinctions can be made between oral and nasal vowels. The weak vowel /@/ (“e instable” / “e muet”) is a special case within the group of vowels.

Canadian French symbol set grouped by phoneme classes

Phoneme class SAMPA IPA Examples of usage
Consonants, plosives:
b b bon
p p pont /po~/
g g gant /ga~/
k k quand /ka~/
d d dans /da~/
t t temps /ta~/
Consonants, fricatives:
v v vent
f f femme /fam/
z z zone /zon/
s s sans /sa~/
Z ʒ gens /Za~/
S ʃ champ /Sa~/
Consonants, lateral:
l l long
Consonants, trill:
R ʀ rond
Consonants, nasals:
m m mont
n n nom /no~/
N ŋ camping /ka~piN/
J ɲ bagne /baJ/
Semi-consonants:
H ɥ juin
w w oui /wi/
j j bouillir /bujiR/
Vowels, oral vowels:
O ɔ comme
o o gros /gRo/
9 œ bœuf /b9f/
2 ø deux /d2/
a a patte /pat/
A ɑ pâte /pAt/
y y du /dy/
i ɪ / ɪ̆ si /si/
u u doux /du/
e e chanter /Sa~te/
E ɛ mettre /mEtR/
E: æ fête /fE:t/
Vowels, weak vowel:
@ ə justement
Vowels, nasal vowels:
9~ œ̃ brun
a~ vent /va~/
e~ ɛ̃ vin /ve~/
o~ bon /bo~/
Xenophones (English phones):
r\ ɹ right
aY tyler /taYlXr/
Xr ə tyler /taYlXr/
aW brown /bRaWn/

Canadian French consonants

The Canadian French consonant system comprises:

  • Six plosives
  • Six fricatives
  • One lateral
  • One trill
  • Four nasals
  • Three semi-consonants

Plosives

There are three voiced and three voiceless plosives in Canadian French, which can be arranged in pairs as shown here:

Voiced Voiceless
/b/ bain arbre snob
/g/ gant examen vague
/d/ dans cadre sud

Fricatives

There are six fricatives in Canadian French, three voiceless and three voiced:

Voiced Voiceless
/v/ vent avènement grave /va~/ /avEnma~/ /gRav/
/z/ zone cousin gaz /zon/ /kuze~/ /gaz/
/Z/ gens ajour litige /Za~/ /aZuR/ /litiZ/

Laterals

The phoneme /l/ is the only lateral in the Canadian French SAMPA symbol set.

/l/ long élastique avril /lo~/ /elastik/ /avRil/

Trills

The Canadian French SAMPA symbol set includes one trill, the phoneme /R/:

/R/ rond horrible corps /Ro~/ /ORibl/ /kOR/

Nasals

There are four nasals in the Canadian French SAMPA symbol system: /m/, /n/, /N/, and /J/. The phoneme /N/ occurs only in foreign words.

/m/ mont comment chrome /mo~/ /kOma~/ /kRom/
/n/ nom année saine /no~/ /ane/ /sEn/
/N/ camping /kampiN/
/J/ gnouf oignon gagne /Juf/ /OJo~/ /gaJ/

Semi-consonants

In Canadian French orthography, semi-consonants are written as combinations of two vowels, but they are neither diphthongs nor vowels. Phonetically, they can be interpreted as a sequence of a consonantal element and a vowel.

There are three semi-consonants in Canadian French, /H/, /w/ and /j/:

/H/ huit juin /Hit/ /ZHe~/
/w/ water coin /watER/ /kwe~/
/j/ yoyo milieu douille /jojo/ /milj2/ /duj/

The semi-consonant /j/ can also act as a pure consonant, as in the following words:

-y- yucca voyage paye /juka/ /vwajaZ/ /pEj/
-ill- fille vieillard /fij/ /vjEjaR/

Canadian French vowels

Orals, short and long vowels

There are twelve so-called oral vowels in Canadian French. Eight of these can be divided into two groups according to their length of articulation: short or long. The remaining four vowels can be used as either long or short.

The long and short vowel groups are shown in the following table:

Short Long
/O/ automne pomme
/9/ œuf bœuf œillet
/a/ ajout agrume battre
/E/ aisseau faire doublet

The other four oral vowels are:

/y/ utile durée accru /ytil/ /dyRe/ /akRy/
/i/ il analyse idiotie /il/ /analiz/ /idjOsi/
/u/ outrager soutenir clou /utRaZe/ /sut@niR/ /klu/
/e/ épée vétérinaire pied /epe/ /veteRinER/ /pje/

Weak vowel /@/: “e muet” / “e instable”

Canadian French also has the weak vowel /@/ which is short and occurs only in unstressed positions. It mostly occurs in affixes and at the end of syllables:

/@/ tenir /t@niR/

At the end of a word, the “e muet” is silent (as in entre /a~tR/). In the middle of a word, it is almost silent, so a variant should be added. For example:

entretenir /a~tR@tniR/
entretenir<a~tr@t@niR> /a~tR@t@niR/

Nasal vowels

In Canadian French there are four so-called nasal vowels: /9~/, /a~/, /e~/, and /o~/.

/9~/ humble tungstène brun /9~bl/ /t9~gstEn/ /bR9~/
/a~/ embêter avancer camp /a~bEte/ /ava~se/ /ka~/
/e~/ impossible absinthe vin /e~pOsibl/ /apse~t/ /ve~/
/o~/ ondulation compte bon /o~dylasjo~/ /ko~t/ /bo~/

Specific pronunciation transcription methods

Liaison

The linguistic phenomenon “liaison” occurs when the (usually) silent final consonants of certain words can be pronounced, in certain syntactic contexts, when the following word begins with a vowel. Spellings based on the etymology of the word may not reflect the real pronunciation. For example, liaison changes the pronunciations of these word pairs:

-d = /t/ grand homme /gRa~ t Om/
-s = /z/ les enfants /le z a~fa~/
-x = /z/ faux ami /fo z ami/

With most words whose spellings end in -n and whose pronunciations end in nasal vowels, the vowel is denasalized during liaison:

with denasalization bon /bo~/ bon ami /bOnami/
without denasalization mon /mo~/ mon ami /mo~nami/

Liaison with words ending in -er, -c, and -p can also lead to a change in vowel quality:

premier /pR@mje/ premier étage /pRemjeRetaZ/
franc /fRa~/ franc étrier /fRa~ketRie/
beaucoup /boku/ beaucoup appris /bokupapRi/

Finally, French identifies three contexts for liaison: obligatory, forbidden, and optional. For example, liaison is forbidden when a word begins with an aspirated “h.”

des hiboux /de ibu/

For dictionary work, there are two approaches you can take to represent liaison. You can define two pronunciations for the word, one of which includes the linking consonant, or you can create an entry for a specific context:

bon /bo~/, /bOn/
bon_ami /bOnami/

For a more complete description of liaison, refer to a French language grammar, such as M. Grevisse, Le bon usage, 12th edition by A. Goosse, Duculot, Paris.

Pronunciation of x

The grapheme x can be pronounced in several ways:

/ks/ xérus fixer axe /kseRys/ /fikse/ /aks/
/gz/ xanthine exact /gza~tin/ /Egzakt/
/z/ deuxième /d2zjEm/
/s/ soixante /swasa~t/

When x occurs in location names, there are often two different ways of pronouncing it, either as /ks/ or as /s/. For example:

Auxerre /o ks ER/ or /o s ER/

Silent consonants

Consonants at the end of a word are often silent, for example:

drap /dRa/
bout /bu/
froid /fRwa/
plomb /plo~/
sang /sa~/

The letter p between two consonants can be silent, for example:

compte /ko~t/
temps /ta~/

The letter m before n is sometimes not pronounced, for example:

automne /otOn/
condamner /ko~dane/

Transcription of the grapheme t

The letter t in combination with the letter i in the middle of a word can be pronounced as /s/, as in:

nation /nasjo~/
initial /inisjal/

Transcription of the grapheme combination gu

The letters gu can be pronounced as /g/, /gw/ or /gH/, for example:

guerre /gER/
lingual /le~gwal/
aiguille /egHij/

The phoneme /s/ as initial sound of a word

The letter s at the beginning of a word is always pronounced /s/, for example:

son /so~/

The letter c as the initial sound before the letters e, i and y is also always pronounced /s/, for example:

cent /sa~/
ciment /sima~/
cygne /siJ/

Gemination

A double m can sometimes be pronounced as a geminate /mm/, for example:

immobile /imObil/ /immObil/
immense /ima~s/ /imma~s/

Gemination is usually not considered in the Nuance dictionary, but if your application requires it, consider adding a variant. For details, see Multiple pronunciations (variants).

Pronunciation of foreign words

To transcribe foreign words, you must use the Canadian French SAMPA symbol set. If you use a different symbol set, the system cannot interpret the input.

Every language has a different phoneme inventory, so you may have problems in covering each and every sound. For the most common cases we offer some transcription examples.

English words

Try to apply a pronunciation that has been adapted for Canadian French. For example:

camping /kampiN/

The transcription ‘k{mpIN’ cannot be used because the English symbol ‘{’ is not part of the Canadian French symbol set. Other recommended transcriptions for English words include:

cheeseburger /tSizb9Rg9R/
week-end /wikEnd/

English phones

The SAMPA symbol set includes these symbols for English phonemes:

r\ right /r\aYt/
aY tyler /taYlXr/
Xr tyler /taYlXr/
aW brown /bRaWn/
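Because transcriptions of foreign words must stay within the Canadian French symbol set, it can help to check them before adding them to a dictionary. The following Python sketch splits a transcription into symbols with a greedy longest-match scan and rejects anything outside the set. The symbol list is copied from the tables on this page, the function name is hypothetical, and this is not a Nuance tool.

```python
from typing import List

# SAMPA symbols copied from the Canadian French symbol set tables on this page.
SAMPA_SYMBOLS = {
    "b", "p", "g", "k", "d", "t", "v", "f", "z", "s", "Z", "S",
    "l", "R", "m", "n", "N", "J", "H", "w", "j",
    "O", "o", "9", "2", "a", "A", "y", "i", "u", "e", "E", "E:", "@",
    "9~", "a~", "e~", "o~",
    "r\\", "aY", "Xr", "aW",
}
_MAX_LEN = max(len(symbol) for symbol in SAMPA_SYMBOLS)


def tokenize_sampa(transcription: str) -> List[str]:
    """Split a transcription into symbols, preferring the longest match."""
    tokens = []
    pos = 0
    while pos < len(transcription):
        for length in range(_MAX_LEN, 0, -1):
            candidate = transcription[pos:pos + length]
            if candidate in SAMPA_SYMBOLS:
                tokens.append(candidate)
                pos += len(candidate)
                break
        else:
            raise ValueError(
                f"{transcription[pos]!r} at position {pos} is not in the symbol set"
            )
    return tokens


print(tokenize_sampa("kampiN"))
print(tokenize_sampa("taYlXr"))
```

With this check, ‘k{mpIN’ is rejected at the ‘{’ character, while ‘kampiN’ passes.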

Multiple pronunciations (variants)

The type of pronunciation used in SAMPA and in the Canadian French dictionary conforms to the standard non-regional Canadian French pronunciation. It is possible for other varieties of Canadian French to occur in an application. If they markedly differ from the standard form, they should be transcribed as a separate variant, as in:

Anglet a~glEt
Anglet<a~glE> a~glE
Metz mEs
Metz<mEts> mEts
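The variant notation above, a headword optionally followed by a tag in angle brackets and then a transcription, can be collected into a lookup table. The following Python sketch parses entries written in the notation shown on this page. It is only an illustration with a hypothetical function name and is not a description of the actual dictionary file format.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# headword, optional <variant tag>, whitespace, transcription
_ENTRY_RE = re.compile(r"(?P<word>[^<\s]+)(?:<(?P<tag>[^>]+)>)?\s+(?P<pron>\S+)")


def load_pronunciations(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group transcriptions by headword, keeping the order of the entries."""
    pronunciations = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = _ENTRY_RE.fullmatch(line)
        if match is None:
            raise ValueError(f"cannot parse entry: {line!r}")
        # Accept transcriptions written with or without surrounding slashes.
        pronunciations[match.group("word")].append(match.group("pron").strip("/"))
    return dict(pronunciations)


entries = [
    "Anglet a~glEt",
    "Anglet<a~glE> a~glE",
    "Metz mEs",
    "Metz<mEts> mEts",
]
print(load_pronunciations(entries))
```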

The Canadian French symbol set in alphabetical order:

SAMPA IPA Examples of usage
2 ø deux
9 œ bœuf
@ ə justement
9~ œ̃ brun
a a patte
A ɑ pâte
a~ vent
aW brown
aY tyler
b b bon
d d dans
e e chanter
E ɛ mettre
E: æ fête
e~ ɛ̃ vin
f f femme
g g gant
H ɥ juin
i ɪ / ɪ̆ si
j j bouillir
J ɲ bagne
k k quand
l l long
m m mont
n n nom
N ŋ camping
o o gros
O ɔ comme
o~ bon
p p pont
R ʀ rond
r\ ɹ right
s s sans
S ʃ champ
t t temps
u u doux
v v vent
w w oui
Xr ə tyler
y y du
z z zone
Z ʒ gens