French France (fr-FR)

This documentation was updated on November 9, 2023.

Creating grammars

The following subsections describe key issues for working with grammar documents in the French language.

Grammar file encoding

Nuance has full internal Unicode support. Create your grammars using ISO-8859-1 (also known as Latin-1) or UTF-8. For example, your grammar header might be:

<?xml version="1.0" encoding="ISO-8859-1"?>
<grammar xml:lang="fr-FR" version="1.0" root="test">

alphanum_lc built-in grammar

The alphanum_lc built-in grammar recognizes a connected string of up to 20 digits and lowercase alphabetic characters.

For example, this grammar could be used to recognize a product code or user ID. The “lc” in the name of this built-in means lowercase.

Characters are the letters a-z, and à â ç é è ê ë î ï ô ö û ü ù. Digits are 0-9.


Note: This grammar replaces the alphanum built-in grammar.

alphanum built-in grammar

The alphanum built-in grammar recognizes a connected string of up to 20 digits and alphabetic characters. For example, this grammar could be used to recognize a product code or order number.

Characters are the letters a-z, and à â ç é è ê ë î ï ô ö û ü ù. Digits are 0-9.

boolean built-in grammar

The boolean grammar collects an affirmative or negative response.

Properties

The y and n parameters let you assign any two touch-tone keys as synonyms for yes and no.

Parameter Description
y Desired DTMF digit to be equivalent to “oui” (default = 1)
n Desired DTMF digit to be equivalent to “non” (default = 2)

Examples

Caller says… MEANING key
oui true
non false

ccexpdate built-in grammar

The ccexpdate grammar understands the expiration date on a credit card. Expiration dates are usually a month and a year, and are often embossed on a credit card in the form “mm/yy.” The grammar recognizes variations on the date, for example, “juin 2005,” “zéro six zéro cinq,” “juin trente slash zéro cinq,” “six barre oblique zéro cinq,” and so on.

creditcard built-in grammar

The creditcard grammar understands a caller saying a credit card number, optionally preceding the number with the credit card name. For example, a caller can say, “Visa affaires numéro quatre zéro un sept …,” or “mastercard numéro de compte cinq huit sept…,” or “deux deux cinq deux cinq…”

currency built-in grammar

The currency grammar collects currency amounts in euros and cents (or centimes).

MEANING Contains a string in the form <currency><main_unit_amount>.<subunit_amount>. If the caller explicitly says “euro” or “cent,” the currency prefix EUR is added. If the caller does not explicitly indicate the currency type, no prefix is added. If the caller omits the main unit or subunit amount, that field is zero. The string contains a leading zero if the subunit amount is collected without the main unit.
SWI_literal Contains the exact text that was recognized.

Examples

Caller says MEANING
cinq euros EUR5.00
cinq centimes EUR0.05
cinq euros et cinq cents; cinq euros et cinq centimes; cinq euros cinq EUR5.05
cinq vingt-cinq; cinq virgule vingt-cinq 5.25
six cent vingt-cinq mille quatre cent soixante-quatre euros; six cent vingt-cinq mille quatre cent soixante-quatre euros et zéro centime; six cent vingt-cinq mille quatre cent soixante-quatre euros zéro centime; six cent vingt-cinq mille quatre cent soixante-quatre euros zéro cent EUR625464.00
un euro; un euro et zéro centime; un euro zéro centime; un euro zéro cent EUR1.00
un euro vingt-deux centimes; un euro et vingt-deux centimes; un euro et vingt-deux cents EUR1.22
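
An application that consumes the MEANING string typically has to split it into a currency code and an amount. The following Python sketch assumes the format described above (an optional EUR prefix followed by main_unit_amount.subunit_amount). The function name parse_currency_meaning is hypothetical and is not part of any Nuance API.

```python
import re
from decimal import Decimal
from typing import Optional, Tuple

# Optional three-letter currency prefix, then "<main unit>.<subunit>".
_CURRENCY_MEANING = re.compile(
    r"(?P<currency>[A-Z]{3})?(?P<main>[0-9]+)\.(?P<sub>[0-9]+)"
)


def parse_currency_meaning(meaning: str) -> Tuple[Optional[str], Decimal]:
    """Split a currency MEANING string into (currency or None, amount)."""
    match = _CURRENCY_MEANING.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected currency MEANING: {meaning!r}")
    # Decimal avoids binary floating-point rounding on monetary values.
    amount = Decimal(f"{match.group('main')}.{match.group('sub')}")
    return match.group("currency"), amount
```

A result whose currency is None corresponds to an utterance such as “cinq virgule vingt-cinq,” where the caller did not name the currency.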

date built-in grammar

The date grammar accepts a date spoken in any of several formats.

Recognized phrases include “4 Juin,” “4 Juin 2006,” “4-6-2006,” “le quatre du six,” and “Lundi, le 4 Juin.”

The grammar also accepts “hier,” “aujourd’hui,” and “demain,” which return values of -1, 0, and +1 respectively into the MEANING key.

Examples

Caller says MEANING key
le 5 janvier 2000; le 5 janvier de l’an 2000; zéro cinq zéro un deux mille; cinq un deux mille 20000105
hier -1
avant-hier (Phrase not recognized)
aujourd’hui 0
demain +1
après-demain (Phrase not recognized)
le quatre; zéro quatre ??????04
mercredi (Phrase not recognized)
mercredi le 12; mercredi 12 ??????12
le quatre juin; quatre juin ????0604
le quatre juin mille neuf cent quatre-vingt-dix-sept; quatre juin mille neuf cent quatre-vingt-dix-sept 19970604
le quatre juin quatre-vingt-dix-sept; quatre juin quatre-vingt-dix-sept ??970604
mercredi quatre juin mille neuf cent quatre-vingt-dix-sept; le mercredi quatre juin mille neuf cent quatre-vingt-dix-sept 19970604
le six; zéro six ??????06
le six avril; zéro six zéro quatre ????0406
le mercredi 12 ??????12
le dix décembre; dix douze; le dix du douze ????1210
le dix du douze quatre-vingt deux ??821210
le dix décembre quatre-vingt-dix-sept; dix douze quatre-vingt-dix-sept ??971210
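
The examples above suggest that the MEANING key holds either a relative day offset (-1, 0, +1) or an eight-character yyyymmdd string in which unknown digits are replaced by “?”. The following Python sketch decodes both forms under that assumption. The names DateMeaning and parse_date_meaning are hypothetical and are not part of any Nuance API.

```python
from typing import NamedTuple, Optional


class DateMeaning(NamedTuple):
    year: Optional[int]        # four-digit year, if the caller gave one
    short_year: Optional[int]  # two-digit year when the century is unknown ("??97")
    month: Optional[int]
    day: Optional[int]
    day_offset: Optional[int]  # -1, 0, or +1 for hier, aujourd'hui, demain


def _known(text: str) -> Optional[int]:
    """Return the field as an integer, or None if it contains '?'."""
    return int(text) if text.isdigit() else None


def parse_date_meaning(meaning: str) -> DateMeaning:
    """Decode a date MEANING value inferred from the examples table."""
    if meaning in ("-1", "0", "+1"):
        return DateMeaning(None, None, None, None, int(meaning))
    if len(meaning) != 8 or any(ch not in "0123456789?" for ch in meaning):
        raise ValueError(f"unexpected date MEANING: {meaning!r}")
    year_text = meaning[:4]
    # "??97" means the caller gave only the last two digits of the year.
    short_year = _known(year_text[2:]) if year_text.startswith("??") else None
    return DateMeaning(
        _known(year_text), short_year, _known(meaning[4:6]), _known(meaning[6:8]), None
    )
```

Fields that come back as None are the ones the application may need to collect with a follow-up prompt.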

digits built-in grammar

Valid characters are the digits 0-9.

number built-in grammar

The number grammar recognizes numeric values spoken as natural numbers; the caller must not speak the integer part digit by digit.

Up to two decimal places are recognized by default; this can be extended to 9 using the maxdecimal parameter. The digits after the decimal point can be spoken individually or as natural numbers.

Examples

Numbers from -999,999,999.99 to 999,999,999.99 are recognized, but by default the minallowed parameter is set to zero, which limits recognition to non-negative values.

Caller says MEANING key
vingt-cinq 25
douze mille trois cent quarante-cinq 12345
moins quatre -4
quatorze virgule cinquante-six 14.56
six virgule deux sept neuf deux 6.2792

phone built-in grammar

The phone grammar recognizes telephone numbers (landline and cellular) using the French calling plan (0 a bb bb bb bb, where b is any digit and a is 1, 2, 3, 4, 5, 6, or 7). The grammar also accepts reserved 2-digit numbers for emergency calls: 14, 15, 17, and 18 (directory assistance, customer service, police, and fire). Toll-free numbers can begin with “un huit cent” plus 6 digits.

The grammar allows natural numbers as well as responses of just digits. This allows callers to combine adjacent digits in the phone number when they are more easily spoken as a single number. For example, a caller can say “25” instead of “2” “5.”
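
The calling plan described above can be checked with a simple pattern once the recognized number is available as a digit string. The following Python sketch assumes that the result contains digits only, with no extension and no separators; the exact result format is not specified here. The function name classify_french_number is hypothetical, and toll-free numbers are deliberately not covered.

```python
import re

# 0 a bb bb bb bb: a leading 0, one digit from 1 to 7, then eight more digits.
_NATIONAL_NUMBER = re.compile(r"0[1-7][0-9]{8}")

# Reserved 2-digit numbers listed for this grammar.
_RESERVED_NUMBERS = {"14", "15", "17", "18"}


def classify_french_number(digits: str) -> str:
    """Return "reserved", "national", or "unknown" for a digit string."""
    if digits in _RESERVED_NUMBERS:
        return "reserved"
    if _NATIONAL_NUMBER.fullmatch(digits):
        return "national"
    return "unknown"
```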

Return keys/values

Upon return, the MEANING key is assigned a variable-length character string representing the recognized phone number.

Properties

As stipulated in the VoiceXML specification, the caller may specify an extension. By default, extensions one to four digits long are supported.

Property Description
minextension Minimum numeric value allowed for an extension.
maxextension Maximum numeric value allowed for an extension. Set this to 0 to disallow extensions.

postcode built-in grammar

The postcode grammar recognizes valid French postal codes in the five-digit format.

Return keys/values

Upon return, the MEANING key is assigned the recognized postal code, which contains five digits.

socialsecurity built-in grammar

The socialsecurity grammar understands the 15-digit French numéro de sécurité sociale. For example, a caller can say: “deux soixante-treize dix vingt-et-un soixante-six deux six deux sept six.”

Illegal numbers, such as those beginning with three zeroes, are rejected. A checksum algorithm is used to validate legal numbers.

The advantage of using this grammar rather than a digits grammar (of length 15) is that identification numbers have constraints that reduce the set of possible recognition hypotheses by about 20% (and thus increase recognition accuracy).
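
The checksum used by the grammar is not specified in this section. As an illustration only, the following Python sketch applies the publicly documented control key of the numéro de sécurité sociale: the last two digits equal 97 minus the remainder of the first 13 digits divided by 97. Whether the grammar applies exactly this rule is an assumption, the function names are hypothetical, and numbers containing the Corsican department codes 2A and 2B are not handled.

```python
import re


def nir_key(first_13_digits: str) -> int:
    """Compute the two-digit control key for the first 13 digits."""
    if not re.fullmatch(r"[0-9]{13}", first_13_digits):
        raise ValueError("expected exactly 13 digits")
    return 97 - int(first_13_digits) % 97


def has_valid_nir_key(number: str) -> bool:
    """Check that a 15-digit number ends with the expected control key."""
    if not re.fullmatch(r"[0-9]{15}", number):
        return False
    return nir_key(number[:13]) == int(number[13:])
```

A check of this kind can be applied to each entry of an n-best list so that hypotheses with an inconsistent key are discarded before confirmation.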

time built-in grammar

The time grammar recognizes spoken time of day utterances from the caller. Recognized phrases include times given in 12-hour format (for example, “cinq heures”) and 24-hour format (“vingt-trois heures quinze”). In addition, it recognizes “qualified” times such as “avant cinq heures.”

Return keys/values

The time grammar returns the keys listed in the Language Pack Guide, and recognizes these values for the QUALIFIER key:

QUALIFIER Either “exact”, “approx” (abbreviation for approximately), “before”, or “after”, depending on whether the caller spoke a word to qualify the time. The default is “exact” if no qualifier is heard. The following phrases set the qualifier:
approx = à_peu_près, environ, approximativement, autour_de
before = avant
after = à_partir_de, après
from = dans, d’ici_à

Examples

For each entry, the values returned in the MEANING and QUALIFIER keys are shown. (Not shown are the values of the HOUR, MINUTE, and AMPM keys.)

Caller says MEANING QUALIFIER
tout de suite; immédiatement (Phrases not recognized) --
dans une demi-heure (Phrase not recognized) --
à midi 1200p exact
à minuit 0000a exact
avant midi 1200p before (Beware that the caller might have intended “approx.”)
après treize heures trente 1330h after
vingt heures vingt 2020h exact
huit heures vingt 0820? exact
huit heures vingt du matin 0820a exact
huit heures trente; huit heures et demie 0830? exact
dix-neuf heures quinze 1915h exact
sept heures et quart du soir 0715p exact
une heure 0100? exact
une heure du matin 0100a exact
vingt-quatre heures; minuit; zéro heures 0000h exact
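
In the examples above, the MEANING value consists of four digits (hhmm) followed by a one-character marker. The following Python sketch decodes that form. The interpretation of the markers (a for morning, p for afternoon or evening, h for a 24-hour time, ? for an ambiguous 12-hour time) is inferred from the examples and is an assumption, and the names TimeMeaning, parse_time_meaning, and hour_24 are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class TimeMeaning(NamedTuple):
    hour: int
    minute: int
    marker: str  # "a", "p", "h", or "?"


_TIME_MEANING = re.compile(r"([0-9]{2})([0-9]{2})([aph?])")


def parse_time_meaning(meaning: str) -> TimeMeaning:
    """Split a time MEANING such as "0715p" into hour, minute, and marker."""
    match = _TIME_MEANING.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected time MEANING: {meaning!r}")
    return TimeMeaning(int(match.group(1)), int(match.group(2)), match.group(3))


def hour_24(time: TimeMeaning) -> Optional[int]:
    """Return the hour on a 24-hour clock, or None if it is ambiguous."""
    if time.marker == "h":
        return time.hour
    if time.marker == "a":
        return time.hour % 12
    if time.marker == "p":
        return time.hour % 12 + 12
    return None  # "?": the caller did not say morning or evening
```

When hour_24 returns None, the application can ask the caller to disambiguate, for example by prompting for “du matin” or “du soir.”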

Vocabulary items and pronunciations

This chapter describes considerations for vocabularies and their pronunciations in French (fr-FR).

Specially tuned pronunciations

The following table shows common words that are fine-tuned by Nuance. Each of these words contains “word-specific phonemes;” that is, phonemes and associated models created especially for the words.

Words with tuned pronunciations (do not modify):

  • All letters of the alphabet, a-z
  • Oui, non
  • Cardinal numbers: 0-99, 100, and 1000

French pronunciations

This section provides detailed reference information to help create pronunciation dictionaries. It is intended for people who have sufficient knowledge of the French language as spoken in France. It provides information about transcription and pronunciation.

The reference pronunciation dictionary is:

Le Nouveau Petit Robert (Pons); Dictionnaire de la Langue Française. Paris: Dictionnaires Le Robert (Klett) 1995. (ISBN 2-85036-500-9)

If you are not sure how a word is pronounced, refer to the IPA transcriptions given there and convert them into SAMPA symbols using the alphabetical SAMPA-IPA table (see The French symbol set in alphabetical order).

The French phoneme system

The French phoneme system can be divided into three groups:

  • Consonants
  • Semi-consonants
  • Vowels

Furthermore, it is possible to define five different types of consonants:

  • Plosives
  • Fricatives
  • Laterals
  • Trills
  • Nasals

Within the vowel group, a further distinction can be made between oral and nasal vowels. The weak vowel /@/ (“e instable” or “e muet”) is a special case within the vowel group.

French symbol set grouped by phoneme classes

Phoneme class SAMPA IPA Examples of usage
Consonants
Plosives
b b bon
p p pont /po~/
g g gant /ga~/
k k quand /ka~/
d d dans /da~/
t t temps /ta~/
Fricatives
v v vent
f f femme /fam/
z z zone /zon/
s s sans /sa~/
Z ʒ gens /Za~/
S ʃ champ /Sa~/
Lateral
l l long
Trill
R ʀ rond
Nasals
m m mont
n n nom /no~/
N ŋ marketing /maRk@tiN/
J ɲ bagne /baJ/
Semi-consonants
H ɥ juin
w w oui /wi/
j j bouillir /bujiR/
Vowels
Oral vowels
O ɔ comme
o o gros /gRo/
9 œ bœuf /b9f/
2 ø feu /f2/
a a patte /pat/
y y du /dy/
i ɪ / ɪ̆ si /si/
u u doux /du/
e e chanter /Sa~te/
E ɛ anglaise /a~glEz@/
Weak vowel
@ ə justement
Nasal vowels
a~ ɑ̃ vent
e~ ɛ̃/ œ̃ vin brun /ve~/ /bRe~/
o~ ɔ̃ bon /bo~/

French consonants

The French consonant system comprises:

  • six plosives
  • six fricatives
  • one lateral
  • one trill
  • four nasals
  • three semi-consonants

Plosives

There are three voiced and three voiceless plosives in French, which can be arranged in pairs as shown here:

Voiced Voiceless
/b/ bain arbre snob /p/ pont
/g/ gant examen vague /k/ quand
/d/ dans cadre sud /t/ temps

Fricatives

There are six fricatives in French, three voiceless and three voiced:

Voiced Voiceless
/v/ vent avènement grave /f/ femme
/z/ zone cousin gaz /s/ sans
/Z/ gens ajour litige /S/ champ

Laterals

The phoneme /l/ is the only lateral in the French SAMPA symbol set.

/l/ long élastique avril /lo~/ /elastik/ /avRil/

Trills

The French SAMPA symbol set contains one trill, the phoneme /R/:

/R/ rond horrible corps /Ro~/ /ORibl/ /kOR/

Nasals

There are four nasals in the French SAMPA symbol system, /m/, /n/, /N/ and /J/.

/m/ mont comment chrome /mo~/ /kOma~/ /kRom/
/n/ nom année saine /no~/ /ane/ /sEn/
/N/ marketing /maRk@tiN/
/J/ gnouf oignon gagne /Juf/ /OJo~/ /gaJ/

Semi-consonants

In French orthography, semi-consonants are written as combinations of two vowel letters. However, they are neither diphthongs nor vowels. Phonetically, they can be interpreted as a sequence of a consonantal element and a vowel.

There are three semi-consonants in French, /H/, /w/, and /j/:

/H/ huile juin /Hil/ /ZHe~/
/w/ waters coin /watER/ /kwe~/
/j/ milieu douille /milj2/ /duj/

The phoneme /j/ can also be a pure consonant, as in the following words:

-y- yucca voyage paye /juka/ /vwajaZ/ /pEj/
-ill- fille vieillard /fij/ /vjEjaR/

French vowels

Orals, short and long vowels

There are ten so-called oral vowels in French. Four of these can be divided into two groups according to their length of articulation: short or long. The remaining six vowels can be used as either long or short.

The long and short vowel groups are shown in the following table:

Short Long
/O/ automne pomme /o/ gros
/9/ œuf bœuf /2/ feu

The other six oral vowels are:

/a/ âme battre mât /am/ /batR/ /ma/
/y/ utile durée accru /ytil/ /dyRe/ /akRy/
/i/ il analyse idiotie /il/ /analiz/ /idjOsi/
/u/ outrager soutenir clou /utRaZe/ /sutniR/ /klu/
/e/ épée vétérinaire pied /epe/ /veteRinER/ /pje/
/E/ aisance faire doublet /Eza~s/ /fER/ /dublE/

Weak vowel /@/: “e muet” / “e instable”

French also has the weak vowel /@/ which is short and occurs only in unstressed positions. It mostly occurs in affixes and at the end of syllables:

/@/ tenir ce /t@niR/ /s@/

At the end of a word, the “e muet” is silent (as in entre /a~tR/). In the middle of a word it is almost silent, so a variant should be added. For example:

entretenir a~tR@tniR
entretenir <a~tR@t@niR> a~tR@t@niR

Nasal vowels

In French there are three so-called nasal vowels: /a~/, /e~/, and /o~/.

/a~/ embêter avancer camp /a~bEte/ /ava~se/ /ka~/
/e~/ impossible absinthe vin humble tungstène brun /e~pOsibl/ /apse~t/ /ve~/ /e~bl/ /te~gstEn/ /bRe~/
/o~/ ondulation compte bon /o~dylasjo~/ /ko~t/ /bo~/

Specific pronunciation transcription methods

Liaison

The linguistic phenomenon “liaison” occurs when the (usually) silent final consonants of certain words can be pronounced, in certain syntactic contexts, when the following word begins with a vowel. Spellings based on the etymology of the word may not reflect the real pronunciation. For example, liaison changes the pronunciations of these word pairs:

-d = /t/ grand homme /gRa~tOm/
-s = /z/ les enfants /leza~fa~/
-x = /z/ faux ami /fozami/

With most words whose spellings end in -n and whose pronunciations end in nasal vowels, the vowel is denasalized during liaison:

with denasalization bon /bo~/ bon ami /bOnami/
without denasalization mon /mo~/ mon ami /mo~nami/

Liaison with words ending in -er, -c, and -p can also lead to a change in vowel quality:

premier /pR@mje/ premier étage /pRemjeRetaZ/
franc /fRa~/ franc étrier /fRa~ketRie/
beaucoup /boku/ beaucoup appris /bokupapRi/

Finally, French identifies three contexts for liaison: obligatory, forbidden, and optional. For example, liaison is forbidden when a word begins with an aspirated “h.”

des hiboux /de ibu/

For dictionary work, there are two ways to represent liaison. You can define two pronunciations for the word, one of which includes the linking consonant, or you can create an entry for a specific context:

bon /bo~/, /bOn/
bon_ami /bOnami/
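
The two approaches can be combined in a lookup that prefers a context-specific entry and otherwise falls back to the per-word variants. The following Python sketch illustrates the idea with a plain dictionary; it does not reflect the actual file format of a recognizer dictionary, the names LEXICON and pronunciations are hypothetical, and the entry for ami is added only for the illustration.

```python
from typing import Dict, List

# Word (or word sequence joined by "_") mapped to its SAMPA pronunciations.
LEXICON: Dict[str, List[str]] = {
    "bon": ["bo~", "bOn"],   # second variant includes the linking consonant
    "ami": ["ami"],          # illustrative entry, not taken from this guide
    "bon_ami": ["bOnami"],   # entry for a specific liaison context
}


def pronunciations(words: List[str]) -> List[str]:
    """Return candidate pronunciations for a sequence of words."""
    context_key = "_".join(words)
    if context_key in LEXICON:
        # A context-specific entry overrides the per-word variants.
        return list(LEXICON[context_key])
    # Otherwise combine every variant of every word.
    results = [""]
    for word in words:
        results = [prefix + variant for prefix in results for variant in LEXICON[word]]
    return results
```

The context-specific entry yields a single pronunciation, whereas the fallback also produces the form without liaison. This is the trade-off between the two approaches: variants are more permissive, context entries are more precise.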

For a more complete description of liaison, refer to a French grammar reference, such as M. Grevisse, Le bon usage, 12th edition by A. Goosse, Duculot, Paris.

Pronunciation of x

The grapheme x can be pronounced in several ways:

/ks/ xénon fixer axe /kseno~/ /fikse/ /aks/
/gz/ xanthine exact /gza~tin/ /Egzakt/
/z/ deuxième /d2zjEm/
/s/ soixante /swasa~t/

When x occurs in location names, there are often two different ways of pronouncing it, either as /ks/ or as /s/. For example:

Auxerre /oksER/
Auxerre /osER/

Silent consonants

Consonants at the end of a word are often silent, for example:

drap /dRa/
bout /bu/
froid /fRwa/
plomb /plo~/
sang /sa~/

The letter p between two consonants is sometimes silent, for example:

compte /ko~t/
temps /ta~/

The letter m before n is sometimes not pronounced, for example:

automne /otOn/
condamner /ko~dane/

Transcription of the grapheme t

The letter t in combination with the letter i in the middle of a word can be pronounced as /s/, as in:

nation /nasjo~/
initial /inisjal/

Transcription of the grapheme combination gu

The letters gu can be pronounced as /g/, /gw/, or /gH/, for example:

guerre /gER/
lingual /le~gwal/
aiguille /egHij/

The phoneme /s/ as initial sound of a word

The letter s at the beginning of a word is always pronounced /s/, for example:

son /so~/

The letter c as the initial sound before the letters e, i, and y is also always pronounced /s/, for example:

centre /sa~tR/
ciment /sima~/
cygne /siJ/

Gemination

A double m can sometimes be pronounced as a geminate /mm/. For example:

immobile /immObil/ or /imObil/
immense /imma~s/ or /ima~s/

Gemination is usually not represented in the recognizer dictionary, but if your application requires it, consider adding a variant. For details, see Multiple pronunciations (variants).

Pronunciation of foreign words

To transcribe foreign words, you must use the French SAMPA symbol set. If you use a different symbol set, the system cannot interpret the transcription.

Every language has a different phoneme inventory, so you may have problems in covering each and every sound. For the most common cases we offer some transcription examples.

English words

Try to apply a pronunciation that has been adapted for French. For example:

sporting /spORtiN/

The transcription ‘spQRtiN’ cannot be used because the English symbol ‘Q’ is not part of the French SAMPA symbol set. Other recommended transcriptions for English words include:

cheeseburger /tSizb9Rg9R/
week-end /wikEnd/
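
A transcription can be checked against the symbol set before it is added to a dictionary. The following Python sketch lists the symbols from the French symbol set table and reports any symbol that is not part of it. The names FRENCH_SAMPA and invalid_symbols are hypothetical; this is an illustrative check, not a tool supplied with the language pack.

```python
import re
from typing import List

# The 35 symbols of the French SAMPA set, including the nasal vowels.
FRENCH_SAMPA = {
    "2", "9", "@", "a", "a~", "b", "d", "E", "e", "e~", "f", "g", "H", "i",
    "J", "j", "k", "l", "m", "N", "n", "O", "o", "o~", "p", "R", "S", "s",
    "t", "u", "v", "w", "y", "Z", "z",
}

# Nasal vowels are two characters long; everything else is one character.
_TOKEN = re.compile(r"[aeo]~|.", re.DOTALL)


def invalid_symbols(transcription: str) -> List[str]:
    """Return the symbols in a transcription that are not French SAMPA."""
    return [tok for tok in _TOKEN.findall(transcription) if tok not in FRENCH_SAMPA]
```

For the example above, the check flags Q in spQRtiN and reports nothing for spORtiN.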

Multiple pronunciations (variants)

The type of pronunciation used in SAMPA and in the French dictionary conforms to the standard non-regional French pronunciation. It is possible for other varieties of French to occur in an application. If they markedly differ from the standard form, they should be transcribed as a separate variant, as in:

Anglet a~glEt
Anglet a~glE
Metz mEs
Metz mEts

The French symbol set in alphabetical order:

SAMPA IPA Examples of usage
2 ø feu
9 œ bœuf
@ ə justement
a a patte
a~ ɑ̃ vent
b b bon
d d dans
E ɛ anglaise
e e chanter
e~ ɛ̃/ œ̃ vin brun
f f femme
g g gant
H ɥ juin
i ɪ / ɪ̆ si
J ɲ bagne
j j bouillir
k k quand
l l long
m m mont
N ŋ marketing
n n nom
O ɔ comme
o o gros
o~ ɔ̃ bon
p p pont
R ʀ rond
S ʃ champ
s s sans
t t temps
u u doux
v v vent
w w oui
y y du
Z ʒ gens
z z zone
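
To compare a SAMPA transcription with the IPA transcription printed in a dictionary, the table can be applied symbol by symbol. The following Python sketch does this with a longest-match scan so that the two-character nasal vowels are handled before single characters. Where the table lists two IPA alternatives (for i and e~), the sketch takes the first one. The IPA values used for O, S, Z, a~, and o~ follow the standard SAMPA convention, which is an assumption about the intended values. The names SAMPA_TO_IPA and to_ipa are hypothetical.

```python
SAMPA_TO_IPA = {
    "2": "ø", "9": "œ", "@": "ə", "a": "a", "b": "b", "d": "d",
    "E": "ɛ", "e": "e", "f": "f", "g": "g", "H": "ɥ", "i": "ɪ",
    "J": "ɲ", "j": "j", "k": "k", "l": "l", "m": "m", "N": "ŋ",
    "n": "n", "O": "\u0254", "o": "o", "p": "p", "R": "ʀ",
    "S": "\u0283", "s": "s", "t": "t", "u": "u", "v": "v",
    "w": "w", "y": "y", "Z": "\u0292", "z": "z",
    # Nasal vowels: base vowel followed by a combining tilde (U+0303).
    "a~": "\u0251\u0303", "e~": "\u025b\u0303", "o~": "\u0254\u0303",
}


def to_ipa(sampa: str) -> str:
    """Convert a French SAMPA transcription to IPA, symbol by symbol."""
    output = []
    position = 0
    while position < len(sampa):
        # Try the two-character nasal vowels first, then single symbols.
        for width in (2, 1):
            chunk = sampa[position:position + width]
            if chunk in SAMPA_TO_IPA:
                output.append(SAMPA_TO_IPA[chunk])
                position += len(chunk)
                break
        else:
            raise ValueError(f"{sampa[position]!r} is not a French SAMPA symbol")
    return "".join(output)
```

Inverting the mapping gives the opposite direction (IPA to SAMPA), which is the conversion needed when starting from a printed dictionary; the alternatives dropped here would then have to be added as extra keys.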