Italian Italy (it-IT)

This documentation was updated on November 9, 2023.

Creating grammars

The following subsections describe key issues for working with grammar documents in the Italian language.

Character encoding

Nuance has full internal Unicode support, so you can create your grammars using, for example, UTF-8 or Latin-1 (also known as ISO-8859-1) character encoding. Your grammar header might be:

<?xml version='1.0' encoding='UTF-8'?>

<grammar xml:lang="it-IT" version="1.0" root="test">

If you do not have access to a keyboard for your target language, you can use the Windows character map. (Choose the “System” font and the “Latin-1” subset.)

Start → Programs → Accessories → System Tools → Character Map

The following codes produce some common Italian characters and are useful if you do not have access to an Italian keyboard. Hold down the Alt key while typing the digits on the numeric keypad; the character appears on your screen when you release the Alt key:

Alt/0224 = à Alt/0192 = À
Alt/0232 = è Alt/0200 = È
Alt/0236 = ì Alt/0204 = Ì
Alt/0242 = ò Alt/0210 = Ò
Alt/0249 = ù Alt/0217 = Ù
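
The codes in this table are the decimal Latin-1 (ISO-8859-1) code points of the characters. The following Python sketch checks that mapping and shows how one character is stored in each of the two encodings mentioned above. The names ALT_CODES and describe are illustrative only and are not part of any Nuance tool.

```python
# Alt codes from the table above, keyed by their decimal value.
ALT_CODES = {
    224: "à", 192: "À",
    232: "è", 200: "È",
    236: "ì", 204: "Ì",
    242: "ò", 210: "Ò",
    249: "ù", 217: "Ù",
}


def describe(code):
    """Return the character for an Alt code and its Latin-1 and UTF-8 bytes."""
    char = chr(code)
    return char, char.encode("latin-1"), char.encode("utf-8")


if __name__ == "__main__":
    for code in sorted(ALT_CODES):
        char, latin1, utf8 = describe(code)
        print(f"Alt/0{code} = {char}  latin-1={latin1!r}  utf-8={utf8!r}")
```

Each character occupies one byte in Latin-1 and two bytes in UTF-8, so the encoding declared in the grammar header must match the encoding in which the file is actually saved.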

alphanum_lc built-in grammar

The alphanum_lc built-in grammar recognizes a connected string of up to 20 digits and lowercase alphabetic characters.

For example, this grammar could be used to recognize a product code or user id. The “lc” in the name of this built-in means lowercase.

Characters are the letters a-z, and à, è, ì, ò, ù.

Digits are 0-9.

Note: This grammar replaces the alphanum built-in grammar.

alphanum built-in grammar

The alphanum built-in grammar recognizes a connected string of up to 20 digits and alphabetic characters. For example, this grammar could be used to recognize a product code or order number.

Characters are the letters a-z, and à, è, ì, ò, ù.

Digits are 0-9.

boolean built-in grammar

The boolean grammar collects an affirmative or negative response.

Properties

The y and n parameters let you associate any two touchtone buttons as synonyms for yes and no.

Parameter Description
y Desired DTMF digit to be equivalent to “sì” (default = 1)
n Desired DTMF digit to be equivalent to “no” (default = 2)

Examples

Caller says… MEANING key
sì true
no false

ccexpdate built-in grammar

The ccexpdate grammar understands the expiration date on a credit card. Expiration dates are usually a month and a year, and are often embossed on a credit card in the form “mm/yy.” The grammar recognizes variations on the date, for example, “Dicembre 2007,” “dodici zero sette,” “dodici barra zero sette,” and so on.

creditcard built-in grammar

The creditcard grammar understands a caller saying a credit card number, optionally preceding the number with the credit card name, or the word “conto.” For example, a caller can say, “visa conto quattro zero uno sette…,” “mastercard cinque zero zero due…,” or “tre sette tre cinque….”

currency built-in grammar

The currency grammar collects currency amounts using euro and cent.

Key Description
MEANING Contains a string in the form: <currency><main_unit_amount>.<subunit_amount>. If the caller explicitly says the denomination of the currency (“euros”), then a currency value of EUR is added as a prefix. If the caller omits the main unit or subunit amount, then that field is zero. The string contains a leading zero if the subunit amount is collected without the main unit.
AMBIGUOUS Set to 1 if the caller says an ambiguous phrase such as “fifteen twelve,” which could be either 15.12 or 1512.00. Otherwise, set to 0.
SWI_literal Contains the exact text that was recognized.

Examples

Caller says MEANING
cinque euros EUR5.00
cinque centesimi EUR0.05
cinque cinquantacinque 5.55
due cento tre mila cento quindici euros un centesimo EUR203115.01

date built-in grammar

The date grammar accepts a date spoken in any of several formats.

Recognized phrases include “nove dicembre,” “nove dicembre due mila sette,” “quattro sei due mila sette,” and “sabato venticinque luglio.”

The grammar also accepts the relative expressions “ieri l’altro” or “l’altro ieri,” “ieri,” “oggi” (also “oggigiorno,” “odierno,” or “adesso”), “domani,” and “dopodomani” or “domani l’altro,” which return the values -2, -1, 0, +1, and +2 respectively in the MEANING key.

Examples

Caller says MEANING key
il sette agosto mille novecento novantanove 19990807
otto settembre due mila sette 20070908
ieri l’altro, l’altro ieri -2
ieri -1
oggi 0
domani +1
dopodomani +2
il tre ??????03
venerdì (Phrase not recognized)
venerdì nove marzo ????0309
4, 6 ????0604
10, 12 ????1210
10, 12, 97 ??971210
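
The table shows two kinds of MEANING value: a signed day offset for relative expressions, and an eight-character YYYYMMDD string in which unknown positions are filled with question marks. The following Python sketch separates the two cases. The function name parse_date_meaning and the returned dictionary layout are illustrative assumptions, not part of the grammar's interface.

```python
import re

_RELATIVE = re.compile(r"^[+-]?[0-9]$")
_ABSOLUTE = re.compile(r"^([0-9?]{4})([0-9?]{2})([0-9?]{2})$")


def _field(text):
    """Return an int for a fully known field, None if it is all '?', else the raw text."""
    if text.isdigit():
        return int(text)
    if set(text) == {"?"}:
        return None
    return text  # partially known, for example '??97'


def parse_date_meaning(meaning):
    """Interpret a date MEANING value as a relative offset or as year, month, day."""
    if _RELATIVE.match(meaning):
        return {"relative_days": int(meaning)}
    match = _ABSOLUTE.match(meaning)
    if match is None:
        raise ValueError(f"unexpected date MEANING: {meaning!r}")
    year, month, day = (_field(part) for part in match.groups())
    return {"year": year, "month": month, "day": day}


if __name__ == "__main__":
    for sample in ("19990807", "-2", "+1", "??????03", "????0309", "??971210"):
        print(sample, parse_date_meaning(sample))
```

A partially known year such as ??97 is returned unchanged, because deciding which century the caller meant is left to the application.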

digits built-in grammar

Valid characters are the digits 0-9.

number built-in grammar

The number built-in grammar accepts quantities such as “settantadue,” “cento quaranta,” “cinquecento sessantuno virgola cinque,” “meno cinque,” and “meno quattro virgola tre.”

Examples

Numbers from -999,999,999.99 to 999,999,999.99 are recognized. For example:

Caller says MEANING key
venticinque 25
mille due cento venticinque 1225
meno due -2
quattordici virgola cinquantasei 14.56

phone built-in grammar

The phone built-in grammar accepts 10-digit phone numbers. An optional “0” can be placed before the 10-digit numbers.

Return keys/values

The key is assigned to a string of digits representing the recognized phone number. The return string may optionally contain the character x to indicate a phone number with an extension. For example, a result could be “3345678910x1234”.

The grammar does not allow phrases such as “tre due quattro cinquantacinque settantadue,” only individual digits.
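
Because the extension is appended to the phone number after the character x, the returned string can be split with ordinary string handling. The following Python sketch illustrates this. The function name split_phone_result is hypothetical.

```python
def split_phone_result(result):
    """Split a phone result such as '3345678910x1234' into (number, extension or None)."""
    number, separator, extension = result.partition("x")
    if not number.isdigit():
        raise ValueError(f"unexpected phone number: {result!r}")
    if separator and not extension.isdigit():
        raise ValueError(f"unexpected extension: {result!r}")
    return number, (extension if separator else None)


if __name__ == "__main__":
    print(split_phone_result("3345678910x1234"))
    print(split_phone_result("3345678910"))
```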

Properties

Additionally, as stipulated in the VoiceXML specification, the caller may specify an extension, for example, “tre tre quattro cinque sei sette otto nove uno zero teleselezione uno due tre quattro.” By default, extensions of one to four digits are supported.

Property Description
minextension Minimum numeric value allowed for an extension (default is 1).
maxextension Maximum numeric value allowed for an extension. Set this to 0 to disallow extensions. (Default is 9999.)

DTMF interpretation

DTMF keys are interpreted according to the VoiceXML specification. DTMF asterisk “*” indicates “x” for extensions.

time built-in grammar

The time built-in grammar accepts spoken time-of-day utterances from the caller. Recognized phrases include times given in 12-hour format (for example, “alle sette”) and 24-hour format (“tredici e quarantacinque”). In addition, it recognizes “qualified” times such as “prima delle ore sedici” and “intorno alle sette e mezza.”

Examples

For each entry, the values returned in the MEANING and QUALIFIER keys are shown. (Not shown are the values of the HOUR, MINUTE and AMPM keys.)

Caller says MEANING QUALIFIER
immediatamente (Phrase not recognized) --
mezzogiorno 1200p exact
mezzanotte 0000? exact
prima mezzogiorno 1200p before
dopo tredici trenta 1330h after
venti venti 2020h exact
quattro di mattina 0400a exact
otto trenta 0830? exact
sette quindici della notte 0715p exact
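
In the examples above, each MEANING value consists of four digits (hours and minutes) followed by one character: a, p, h, or ?. The following Python sketch splits such a value. It assumes the suffix meanings defined for the time type in the VoiceXML specification (a for AM, p for PM, h for 24-hour clock, ? for ambiguous). The function name parse_time_meaning is hypothetical.

```python
import re

_TIME = re.compile(r"^([0-9]{2})([0-9]{2})([aph?])$")

# Suffix meanings as assumed from the VoiceXML built-in time type.
_SUFFIX = {"a": "am", "p": "pm", "h": "24-hour", "?": "ambiguous"}


def parse_time_meaning(meaning):
    """Split a time MEANING value into (hour, minute, clock designation)."""
    match = _TIME.match(meaning)
    if match is None:
        raise ValueError(f"unexpected time MEANING: {meaning!r}")
    hour, minute, suffix = match.groups()
    return int(hour), int(minute), _SUFFIX[suffix]


if __name__ == "__main__":
    for sample in ("1200p", "0000?", "1330h", "0400a", "0830?"):
        print(sample, parse_time_meaning(sample))
```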

zipcode built-in grammar

The zipcode grammar recognizes valid postal codes in Italy (“Codice di Avviamento Postale,” CAP) in five-digit format.

Return keys/values

Upon return, the key MEANING is assigned to the recognized postal code, and contains five digits.

Vocabulary items and pronunciations

This chapter describes considerations for vocabularies and their pronunciations in Italian (it-IT).

Specially tuned pronunciations

The following list shows common words that are fine-tuned by Nuance:

  • All letters of the alphabet, a-z
  • sì, no
  • Cardinal numbers: 0-99, 100, and 1000

Italian pronunciations

This section provides detailed reference information to help create pronunciation dictionaries. It is intended for people who have sufficient knowledge of the Italian language as spoken in Italy. It provides information about transcription and pronunciation.

The reference pronunciation dictionary is:

PONS Wörterbuch für Schule und Studium. Italienisch-Deutsch 1. Stuttgart et al.: Klett. 2.ed., April 2005. (ISBN 3-12-517490-2)

If you are not sure how a word is pronounced, you can refer to the IPA transcriptions given there and convert them into SAMPA symbols using the alphabetical SAMPA-IPA table (The Italian symbol set in alphabetical order).
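
The conversion from IPA to SAMPA is largely a symbol-by-symbol substitution. The following Python sketch illustrates it for the symbols that differ between the two notations in this document. It is not a complete converter. The names REPLACEMENTS, DROPPED, and ipa_to_sampa are hypothetical, and the sketch assumes that the dictionary marks stress and length with separate characters that can simply be removed, since this SAMPA version represents neither.

```python
# IPA sequences and their Italian SAMPA equivalents, taken from the symbol tables.
REPLACEMENTS = [
    ("i\u032f", "j"),  # non-syllabic i, first part of an ascending diphthong
    ("u\u032f", "w"),  # non-syllabic u, first part of an ascending diphthong
    ("\u02a7", "tS"),  # ʧ
    ("\u02a4", "dZ"),  # ʤ
    ("\u02a6", "ts"),  # ʦ
    ("\u02a5", "dz"),  # ʣ
    ("\u0283", "S"),   # ʃ
    ("\u0272", "J"),   # ɲ
    ("\u028e", "L"),   # ʎ
    ("\u0261", "g"),   # ɡ
    ("\u025b", "E"),   # ɛ
    ("\u0254", "O"),   # ɔ
]

# Stress marks, length marks, syllable dots, tie bar, and voicing diacritic.
DROPPED = "\u02c8\u02cc\u02d0:.\u0361\u032c"


def ipa_to_sampa(ipa):
    """Convert a dictionary IPA transcription into the Italian SAMPA notation."""
    text = ipa.strip("/[]")
    for source, target in REPLACEMENTS:
        text = text.replace(source, target)
    return "".join(char for char in text if char not in DROPPED)


if __name__ == "__main__":
    for sample in ("\u02c8\u0283\u025bndo", "\u02c8\u0272\u0254kko", "\u02c8kwadro"):
        print(sample, "->", ipa_to_sampa(sample))
```

Check the result against the rules in this section, because dictionaries differ in how they mark geminates and diphthongs.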

The Italian phoneme system

The Italian phoneme system can be divided into two groups:

  • Consonants
  • Vowels

Furthermore, it is possible to distinguish six different types of Italian consonants:

  • Plosives
  • Fricatives
  • Affricates
  • Nasals
  • Laterals
  • Trills

Within the vowel group, a further distinction can be made between monophthongs and diphthongs.

Below you will find the phonemes of the Italian SAMPA symbol set. They are grouped by the phoneme classes to which they belong (according to the manner of their articulation).

Italian symbol set grouped by phoneme classes

Phoneme class SAMPA IPA Examples of usage
Consonants
Plosives
b b banco
bb bb gobba /gObba/
p p pane /pane/
pp pp coppa /kOppa/
d d danno /danno/
dd dd cadde /kadde/
t t tana /tana/
tt tt zitto /tsitto/
g ɡ gamba /gamba/
gg ɡɡ leggo /lEggo/
k k cane /kane/
kk kk tocca /tOkka/
Fricatives
v v vano
vv vv bevvi /bevvi/
f f fame /fame/
ff ff beffa /bEffa/
z z sbaglio /zbaLo/
s s sano /sano/
ss ss cassa /kassa/
S ʃ scendo /Sendo/
Affricates
dz dz̬ zona
ddz ddz̬ mezzo /mEddzo/
ts ts̬ stanza /stantsa/
tts tts̬ bozza /bOttsa/
dZ ʤ̬ gita /dZita/
ddZ dʤ̬ oggi /OddZi/
tS ʧ̬ cena /tSena/
ttS tʧ̬ braccio /brattSo/
Nasals
m m molla
mm mm grammo /grammo/
n n notte /nOtte/
nn nn panna /panna/
J ɲ gnocco /JOkko/
Laterals
l l luce
ll ll colla /kOlla/
L ʎ foglia /foLa/
Trills
r r rete
rr rr ferro /fErro/
Vowels
Single vowels
a a rata
e e / e: rete /rete/
E ɛ / ɛ: pesca /pEska/
i i mite /mite/
o o / o: dove /dove/
O ɔ / ɔ: moto /mOto/
u u muto /muto/
Ascending diphthongs
ja i̯a piatto
je i̯e: chiesa /kjeza/
jE i̯ɛ richiesta /rikjEsta/
jo i̯o: fiore /fjore/
jO i̯ɔ fiocco /fjOkko/
ju i̯u aiutare /ajutare/
wa u̯a quadro /kwadro/
we u̯e questo /kwesto/
wE u̯ɛ guerra /gwErra/
wo u̯o rincuorare /rinkworare/
wO u̯ɔ uomo /wOmo/
Descending diphthongs
ai a:i mai
au au aumento /aumento/
ei e:i potei /potei/
Ei ɛ:i lei /lEi/
eu eu europeo /europEo/
oi o:i voi /voi/
Oi ɔ:i poi /pOi/

Italian consonants

The standard Italian consonant system is considered to have:

  • Six plosives
  • Five fricatives
  • Four affricates
  • Three nasals
  • Two laterals
  • One trill

A peculiarity of Italian pronunciation is that double consonants (for example, <bb> and <gg>) are pronounced as an intensified phoneme called a geminate. Every consonant except /z/, /J/, /S/, and /L/ has a corresponding geminate (see Double letters). The geminates are shown in the following tables under the respective single consonants. The semivowels /j/ and /w/ are included under diphthongs because in Italian they occur only as the first part of an ascending diphthong.

Plosives

There are three voiced and three voiceless plosives in Italian, which can be arranged in pairs:

Voiced Voiceless
/b/ bene imbottire
/bb/ abbonato
/d/ dove sdrucciola andare
/dd/ additivo
/g/ gara ghetto ago
/gg/ agghindare

Fricatives

There are five fricatives in the Italian SAMPA symbol set, two voiced and three voiceless. The voiced ones can be paired with their voiceless counterparts:

Voiced Voiceless
/v/ valere convolare svogliato
/vv/ avvalersi
/z/ caso snodare

<i> after /S/ is not to be transcribed unless it forms a syllable of its own. For example: sciopero = /SOpero/ (and not /SjOpero/).

Affricates

In Italian there are four affricates, /ts/, /tS/, /dz/, and /dZ/. Affricates are always represented in SAMPA by two single phonemes.

Voiced Voiceless
/dz/ zona
/ddz/ azzurro indennizzo
/dZ/ giorno cangiante
/ddZ/ maggio

The letter <z> and its double are mostly pronounced /dz/ and /ddz/ respectively if standing between two vowels or at the beginning of a word (see Pronunciation of <e>, <o>, and <z>).

<i> after /dZ/, /ddZ/ and /tS/, /ttS/ is not to be transcribed unless it forms a syllable of its own. For example: gioco = /dZOko/ (and not /dZjOko/).

Nasals

There are three nasals in Italian, /m/, /n/, and /J/.

/m/ mano ambizione /mano/ /ambittsjone/
/mm/ ammonire /ammonire/
/n/ naso pane /naso/ /pane/
/nn/ annoiato /annojato/
/J/ gnomo ingegnere /Jomo/ /indZeJEre/

Laterals

There are two laterals in Italian, /l/ and /L/.

/l/ lampada salmo /lampada/ /salmo/
/ll/ pallone /pallone/
/L/ gli aglio /Li/ /aLo/

Trills

There is one trill in Italian which is pronounced with the tongue tip: /r/.

/r/ radio cera /radjo/ /tSera/
/rr/ afferrare /afferrare/

Italian vowels

In Italian, vowels fall into two groups:

  • Monophthongs
  • Diphthongs (subdivided into ascending and descending diphthongs)

Monophthongs

In Italian, seven monophthongs are usually distinguished: /a/, /E/, /e/, /i/, /O/, /o/, and /u/. (See Pronunciation of <i> and Pronunciation of <e>, <o>, and <z>.)

/a/ palo arma /palo/ /arma/
/E/ etica costituente cioè /Etika/ /kostituEnte/ /tSoE/
/e/ elemento bere te /elemento/ /bere/ /te/
/i/ fine intimo parvi /fine/ /intimo/ /parvi/
/O/ ottimo ispanofono però /Ottimo/ /ispanOfono/ /perO/
/o/ ondoso ombra uno /ondoso/ /ombra/ /uno/
/u/ fumo unto giù /fumo/ /unto/ /dZu/

Diphthongs

The Italian diphthongs can be divided into:

  • descending diphthongs
  • ascending diphthongs

Descending diphthongs

Descending diphthongs are a combination of two monophthongs with the stress on the first monophthong. Since descending diphthongs do not usually carry important information for the meaning of a word, a reduced set of them is used for the SAMPA version. Speech recognition tests have shown very good results for this practice. The main advantage is that it reduces the number of phonemes to be considered and thus also the number of possible error sources.

There are seven descending diphthongs in Italian, /ai/, /Ei/, /ei/, /Oi/, /oi/, /au/, and /eu/.

/ai/ aitante laico mai /aitante/ /laiko/ /mai/
/Ei/ lei /lEi/
/ei/ potei /potei/
/Oi/ poi /pOi/
/oi/ voi /voi/
/au/ aumento /aumento/
/eu/ europeo /europEo/

Ascending diphthongs

Ascending diphthongs are formed by combining a semivowel with a subsequent monophthong with the stress on the monophthong.

There are eleven ascending diphthongs in Italian, /ja/, /jE/, /je/, /jO/, /jo/, /ju/, /wa/, /wE/, /we/, /wO/, and /wo/.

Phoneme Orthographic form Example
/ja/ <ia> amiamo macchia
/jE/ <ie> ieri richiesta
/je/ <ie> ieratico piegare varie
/jO/ <io> iodio fiotto picchiò
/jo/ <io> fionda occhio
/ju/ <iu> aiutare più
/wa/ <ua> quadro acqua
/wE/ <ue> guerra
/we/ <ue> questo
/wO/ <uo> uomo cuoco dileguò
/wo/ <uo> rincuorare scialacquo

Specific pronunciation transcription methods

Initial <h>

Initial <h> is always ignored in the transcription as it is not pronounced in Italian, e.g. hotel = /otEl/, hanno = /anno/. In the middle of a word it only stands in the following contexts: <che>, <chi>, <ghe>, <ghi> so that the consonants are pronounced as plosives (/ke/, /ki/, /ge/, /gi/).

Pronunciation of <i>

<i> between <c> or <g> and another vowel is not pronounced (<cia> = /tSa/ etc.), except when it is stressed, for example:

farmacia = /farmatSia/

Double letters

The double phonemes /bb/, /dd/, /ff/, /gg/, /kk/, /ll/, /mm/, /nn/, /pp/, /rr/, /ss/, /tt/, and /vv/ coincide with the corresponding double letters. Concerning the transcription of double affricates please note that only the plosive element must be redoubled, for example:

<zz> = /tts/ or /ddz/
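
The doubling rule can be summarized in a few lines of Python. The following sketch derives the geminate of a single consonant symbol: plain consonants are doubled, affricates double only their plosive element, and the four consonants without a geminate are rejected. The function name geminate and the set names are hypothetical.

```python
# Consonants whose geminate is written by doubling the symbol.
SINGLE_CONSONANTS = set("bdfgklmnprstv")

# Affricates: only the initial plosive element is doubled.
AFFRICATES = {"ts", "dz", "tS", "dZ"}

# Consonants that have no corresponding geminate in this symbol set.
NO_GEMINATE = {"z", "J", "S", "L"}


def geminate(symbol):
    """Return the SAMPA geminate of a single Italian consonant symbol."""
    if symbol in NO_GEMINATE:
        raise ValueError(f"/{symbol}/ has no geminate")
    if symbol in AFFRICATES:
        return symbol[0] + symbol
    if symbol in SINGLE_CONSONANTS:
        return symbol * 2
    raise ValueError(f"/{symbol}/ is not a single consonant symbol")


if __name__ == "__main__":
    for sample in ("b", "ts", "dz", "tS", "dZ"):
        print(sample, "->", geminate(sample))
```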

Accentuation

The accentuation of a word is a distinctive aspect in the Italian language (e.g. àncora vs. ancóra). However, pairs of words differing only in accentuation are rare and thus accentuation as such is not represented in this version of SAMPA phonetic transcription.

If a word exhibits an orthographic accent, the transcription is usually different, for example:

faro /faro/ vs. farò /farO/

Pronunciation of <e>, <o>, and <z>

The pronunciation of <e> and <o> as open or closed vowels or <z> as /dz/ or /ts/ has in some cases an importance for the meaning of a word (e.g. /botte/ vs. /bOtte/ and /raddza/ versus /rattsa/). Nevertheless it generally varies according to the regional accent of the speaker and has no influence on the meaning, for example:

/gonna/ = /gOnna/

/raddzo/ = /rattso/

A good dictionary can help you to find the standard pronunciation.

The suffix <-zione>

When <z> introduces the suffix <-zione> it is always to be transcribed as a voiceless geminate, for example:

vibrazione = /vibrattsjone/

Pronunciation of foreign words

Some phonemes are not regular Italian phonemes and occur only in foreign words. They are used in the system dictionary and are internally mapped to other phonemes in the speech recognizer; no phoneme models are trained for them. This means that it is not necessary to use these phonemes in custom transcriptions.

When there is a need to transcribe foreign words the general rule is to transcribe those words with the same SAMPA symbol set as the rest. In the case of an Italian transcription you have to transcribe every word of the dictionary with the Italian SAMPA symbols.

If you use a different symbol set your system will be incapable of understanding the input.

Every language has a different phoneme inventory, so you may have problems in covering each and every sound.

The following examples show how to find a pronunciation adapted to Italian.

French nasals

Nasalized vowels are common in French. However, there are no appropriate vowels in Italian to reflect this. The easiest solution is to use the equivalent non-nasalized vowel and then add the nasal /n/.

For example:

bon-bon /bOnbOn/

This is convenient, since it normally reflects the pronunciation used by Italian speakers who do not know French.

The original transcription ‘bo~bo~’ cannot be realized because the French symbol ‘o~’ is not part of the Italian SAMPA symbol set.

English words

Even with English words you have to try to apply a pronunciation that has been adapted to Italian, for example:

camping /kamping/

The English phonemes ‘{’, ‘I’, and ‘N’ in the original transcription ‘k{mpIN’ are not elements of the Italian SAMPA symbol set.
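
A custom transcription can be checked against the Italian symbol set before it is added to a dictionary. The following Python sketch splits a transcription into symbols, always preferring the longest matching symbol, and reports the first character that is not part of the set. The symbol list is copied from the alphabetical table in this document. The function name tokenize is hypothetical, and this is not a Nuance tool.

```python
import re

# The 63 symbols of the Italian SAMPA set, in the order of the alphabetical table.
ITALIAN_SAMPA = """
a ai au b bb d dd ddz ddZ dz dZ e E ei Ei eu f ff g gg i J
ja je jE jo jO ju k kk l L ll m mm n nn o O oi Oi p pp r rr
s S ss t ts tS tt tts ttS u v vv wa we wE wo wO z
""".split()

# Try longer symbols first so that, for example, 'tts' wins over 'tt' + 's'.
_SYMBOL = re.compile(
    "|".join(re.escape(s) for s in sorted(ITALIAN_SAMPA, key=len, reverse=True))
)


def tokenize(transcription):
    """Split a transcription into Italian SAMPA symbols or raise ValueError."""
    tokens = []
    position = 0
    while position < len(transcription):
        match = _SYMBOL.match(transcription, position)
        if match is None:
            raise ValueError(
                f"{transcription[position]!r} at position {position} "
                "is not an Italian SAMPA symbol"
            )
        tokens.append(match.group())
        position = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("kamping"))
    print(tokenize("vibrattsjone"))
    try:
        tokenize("k{mpIN")
    except ValueError as error:
        print(error)
```

The longest-match strategy is a simplification. It shows which symbols a transcription is built from, but it cannot tell whether a sequence such as /ai/ is intended as a diphthong or as two separate vowels.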

The Italian symbol set in alphabetical order:

SAMPA IPA Examples of use
a a rata
ai a:i mai
au au aumento
b b banco
bb bb gobba
d d danno
dd dd cadde
ddz ddz̬ mezzo
ddZ dʤ̬ oggi
dz dz̬ zona
dZ ʤ̬ gita
e e / e: rete
E ɛ / ɛ: pesca
ei e:i potei
Ei ɛ:i lei
eu eu europeo
f f fame
ff ff beffa
g ɡ gamba
gg ɡɡ leggo
i i mite
J ɲ gnocco
ja i̯a piatto
je i̯e: chiesa
jE i̯ɛ richiesta
jo i̯o: fiore
jO i̯ɔ fiocco
ju i̯u aiutare
k k cane
kk kk tocca
l l luce
L ʎ foglia
ll ll colla
m m molla
mm mm grammo
n n notte
nn nn panna
o o / o: dove
O ɔ / ɔ: moto
oi o:i voi
Oi ɔ:i poi
p p pane
pp pp coppa
r r rete
rr rr ferro
s s sano
S ʃ scendo
ss ss cassa
t t tana
ts ts̬ stanza
tS ʧ̬ cena
tt tt zitto
tts tts̬ bozza
ttS tʧ̬ braccio
u u muto
v v vano
vv vv bevvi
wa u̯a quadro
we u̯e questo
wE u̯ɛ guerra
wo u̯o rincuorare
wO u̯ɔ uomo
z z sbaglio