Italian Italy (it-IT)
This documentation was updated on November 9, 2023.
Creating grammars
The following subsections describe key issues for working with grammar documents in the Italian language.
Character encoding
Nuance has full internal Unicode support, so you can create your grammars using UTF-8 or Latin-1 (also known as ISO-8859-1) character encoding. For example, your grammar header might be:
<?xml version='1.0' encoding='UTF-8'?>
<grammar xml:lang="it-IT" version="1.0" root="test">
If you do not have access to a keyboard for your target language, you can use the Windows character map. (Choose the “System” font and the “Latin-1” subset.)
Start → Programs → Accessories → System Tools → Character Map
Below are the codes for some common Italian characters, which are useful if you do not have access to an Italian keyboard. To type a character, hold down the Alt key while entering the digits on the numeric keypad; the character appears on your screen when you release the Alt key:
Lowercase | Uppercase |
---|---|
Alt/0224 = à | Alt/0192 = À |
Alt/0232 = è | Alt/0200 = È |
Alt/0236 = ì | Alt/0204 = Ì |
Alt/0242 = ò | Alt/0210 = Ò |
Alt/0249 = ù | Alt/0217 = Ù |
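Each Alt code in the table is the decimal Latin-1 (and Unicode) code point of the character it produces, which can be checked programmatically. The following is a minimal Python sketch; the `ALT_CODES` table and the variable names are illustrative and are not part of any Nuance tool:

```python
# Alt codes from the table above, mapped to the characters they produce.
ALT_CODES = {
    "0224": "à", "0192": "À",
    "0232": "è", "0200": "È",
    "0236": "ì", "0204": "Ì",
    "0242": "ò", "0210": "Ò",
    "0249": "ù", "0217": "Ù",
}

# Each Alt code is the decimal code point of its character,
# so this list stays empty when the table is consistent.
mismatches = [code for code, char in ALT_CODES.items() if chr(int(code)) != char]

# The same character is one byte in Latin-1 but two bytes in UTF-8, so the
# encoding declared in the grammar header must match how the file is saved.
latin1_bytes = "à".encode("latin-1")
utf8_bytes = "à".encode("utf-8")
```

If a grammar file declares UTF-8 in its header but is saved as Latin-1 (or the reverse), accented characters such as these are the first to be misread.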
alphanum_lc built-in grammar
The alphanum_lc built-in grammar recognizes a connected string of up to 20 digits and lowercase alphabetic characters.
For example, this grammar could be used to recognize a product code or user id. The “lc” in the name of this built-in means lowercase.
Characters are the letters a-z, and à, è, ì, ò, ù.
Digits are 0-9.
Note: This grammar replaces the alphanum built-in grammar.
alphanum built-in grammar
The alphanum built-in grammar recognizes a connected string of up to 20 digits and alphabetic characters. For example, this grammar could be used to recognize a product code or order number.
Characters are the letters a-z, and à, è, ì, ò, ù.
Digits are 0-9.
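When preparing test data or checking results in application code, the documented character set and length limit can be expressed as a regular expression. This is a sketch only: `ALPHANUM_RE` and `is_valid_alphanum` are hypothetical helper names, and the minimum length of one character is an assumption, since only the maximum of 20 is stated above.

```python
import re

# Up to 20 characters: digits 0-9, letters a-z, and the accented vowels
# listed above. The pattern illustrates the description; it is not the
# grammar itself. The minimum length of 1 is an assumption.
ALPHANUM_RE = re.compile(r"[0-9a-zàèìòù]{1,20}")


def is_valid_alphanum(value: str) -> bool:
    """Return True if value fits the documented character set and length."""
    return ALPHANUM_RE.fullmatch(value) is not None
```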
boolean built-in grammar
The boolean grammar collects an affirmative or negative response.
Properties
The y and n parameters let you associate any two touchtone buttons as synonyms for yes and no.
Parameter | Description |
---|---|
y | Desired DTMF digit to be equivalent to “sì” (default = 1) |
n | Desired DTMF digit to be equivalent to “no” (default = 2) |
Examples
Caller says… | MEANING key |
---|---|
sì | true |
no | false |
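The effect of the y and n parameters on touchtone input can be pictured with a small application-side sketch. The function `dtmf_to_boolean` is hypothetical and only mirrors the defaults listed in the Properties table; it is not how the recognizer is implemented.

```python
def dtmf_to_boolean(digit: str, y: str = "1", n: str = "2"):
    """Map a DTMF digit to True/False using the y and n parameters.

    Returns None for any digit that is mapped to neither answer.
    """
    if digit == y:
        return True
    if digit == n:
        return False
    return None
```

With the defaults, `dtmf_to_boolean("1")` corresponds to “sì” and `dtmf_to_boolean("2")` to “no”; passing `y="7", n="9"` reassigns the two buttons.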
ccexpdate built-in grammar
The ccexpdate grammar understands the expiration date on a credit card. Expiration dates are usually a month and a year, and are often embossed on a credit card in the form “mm/yy.” The grammar recognizes variations on the date, for example, “Dicembre 2007,” “dodici zero sette,” “dodici barra zero sette,” and so on.
creditcard built-in grammar
The creditcard grammar understands a caller saying a credit card number, optionally preceding the number with the credit card name, or the word “conto.” For example, a caller can say, “visa conto quattro zero uno sette…,” “mastercard cinque zero zero due…,” or “tre sette tre cinque….”
currency built-in grammar
The currency grammar collects currency amounts using euro and cent.
Key | Description |
---|---|
MEANING | Contains a string in the form: currency + main_unit_amount + “.” + subunit_amount. If the caller explicitly says the denomination of the currency (“euros”), then a currency value of EUR is added as a prefix. If the caller omits the main unit or subunit amount, then that field is zero. The string contains a leading zero if the subunit amount is collected without the main unit. The key AMBIGUOUS is set to 1 if the caller says an ambiguous phrase such as “fifteen twelve,” which could be either 15.12 or 1512.00. Otherwise, AMBIGUOUS is set to 0. |
SWI_literal | Contains the exact text that was recognized. |
Examples
Caller says | MEANING |
---|---|
cinque euros | EUR5.00 |
cinque centesimi | EUR0.05 |
cinque cinquantacinque | 5.55 |
due cento tre mila cento quindici euros un centesimo | EUR203115.01 |
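The MEANING strings in the examples above can be split into a currency prefix and an amount in application code. The sketch below assumes the format seen in those examples (an optional three-letter prefix followed by an amount with exactly two subunit digits); `parse_currency_meaning` is a hypothetical helper name.

```python
import re
from decimal import Decimal

# Optional three-letter currency prefix, then main units, ".", and two
# subunit digits. The two-digit subunit is an assumption based on the
# examples in the table above.
CURRENCY_RE = re.compile(r"(?P<currency>[A-Z]{3})?(?P<amount>\d+\.\d{2})")


def parse_currency_meaning(meaning: str):
    """Split a currency MEANING string into (currency or None, Decimal amount)."""
    match = CURRENCY_RE.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected currency MEANING: {meaning!r}")
    return match.group("currency"), Decimal(match.group("amount"))
```

Using `Decimal` rather than `float` keeps amounts such as 0.05 exact, which matters when the value is later used for payments.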
date built-in grammar
The date grammar accepts a date spoken in any of several formats.
Recognized phrases include “nove dicembre,” “nove dicembre due mila sette,” “quattro sei due mila sette,” and “sabato venticinque luglio.”
The grammar also accepts “ieri l’altro” or “l’altro ieri,” “ieri,” “oggi” (also “oggigiorno,” “odierno,” or “adesso”), “domani,” and “dopodomani” or “domani l’altro,” which return values of -2, -1, 0, +1, and +2, respectively, in the MEANING key.
Examples
Caller says | MEANING key |
---|---|
il sette agosto mille novecento novantanove | 19990807 |
otto settembre due mila sette | 20070908 |
ieri l’altro, l’altro ieri | -2 |
ieri | -1 |
oggi | 0 |
domani | +1 |
dopodomani | +2 |
il tre | ??????03 |
venerdì | (Phrase not recognized) |
venerdì nove marzo | ????0309 |
4, 6 | ????0604 |
10, 12 | ????1210 |
10, 12, 97 | ??971210 |
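The two kinds of MEANING value shown above (a relative day offset, or an eight-character date in which unknown positions are marked with “?”) can be told apart in application code. The following sketch is based only on the examples in the table; `parse_date_meaning` is a hypothetical helper, and partially known fields such as “??97” are returned unchanged rather than guessed.

```python
import re


def _field(text: str):
    """Convert one date field: int if fully known, None if fully unknown."""
    if text.isdigit():
        return int(text)
    if set(text) == {"?"}:
        return None
    return text  # partially known, e.g. "??97": keep the raw text


def parse_date_meaning(meaning: str):
    """Return ("relative", days) or ("absolute", (year, month, day))."""
    if re.fullmatch(r"[+-]?\d", meaning):
        return "relative", int(meaning)
    match = re.fullmatch(r"([\d?]{4})([\d?]{2})([\d?]{2})", meaning)
    if match is None:
        raise ValueError(f"unexpected date MEANING: {meaning!r}")
    year, month, day = (_field(part) for part in match.groups())
    return "absolute", (year, month, day)
```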
digits built-in grammar
Valid characters are the digits 0-9.
number built-in grammar
The number built-in grammar accepts quantities such as “settantadue,” “cento quaranta,” “cinquecento sessantuno virgola cinque,” “meno cinque,” and “meno quattro virgola tre.”
Examples
Numbers from -999,999,999.99 to 999,999,999.99 are recognized. For example:
Caller says | MEANING key |
---|---|
venticinque | 25 |
mille due cento venticinque | 1225 |
meno due | -2 |
quattordici virgola cinquantasei | 14.56 |
phone built-in grammar
The phone built-in grammar accepts 10-digit phone numbers. An optional “0” can be placed before the 10-digit numbers.
Return keys/values
The key is assigned to a string of digits representing the recognized phone number. The return string may optionally contain the character x to indicate a phone number with an extension. For example, a result could be “3345678910x1234”.
The grammar accepts only individual digits; it does not allow phrases such as “tre due quattro cinquantacinque settantadue.”
Properties
Additionally, as stipulated in the VoiceXML specification, the caller may specify an extension, for example, “tre tre quattro cinque sei sette otto nove uno zero teleselezione uno due tre quattro.” By default, extensions of one to four digits are supported.
Property | Description |
---|---|
minextension | Minimum numeric value allowed for an extension (default is 1). |
maxextension | Maximum numeric value allowed for an extension. Set this to 0 to disallow extensions. (Default is 9999.) |
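The return string and the extension properties can be handled in application code along the following lines. This is a sketch under the descriptions above: `split_phone_result` and `extension_allowed` are hypothetical helpers, and the range check only mirrors the documented minextension and maxextension semantics.

```python
def split_phone_result(result: str):
    """Split a result such as "3345678910x1234" into (number, extension or None)."""
    number, _, extension = result.partition("x")
    return number, (extension or None)


def extension_allowed(extension: str, minextension: int = 1,
                      maxextension: int = 9999) -> bool:
    """Check an extension against the minextension/maxextension values."""
    if maxextension == 0:  # 0 disallows extensions entirely
        return False
    return minextension <= int(extension) <= maxextension
```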
DTMF interpretation
DTMF keys are interpreted according to the VoiceXML specification. DTMF asterisk “*” indicates “x” for extensions.
time built-in grammar
The time built-in grammar accepts spoken time-of-day utterances from the caller. Recognized phrases include times given in 12-hour format (for example, “alle sette”) and 24-hour format (“tredici e quarantacinque”). In addition, it recognizes “qualified” times such as “prima delle ore sedici” and “intorno alle sette e mezza.”
Examples
For each entry, the values returned in the MEANING and QUALIFIER keys are shown. (Not shown are the values of the HOUR, MINUTE and AMPM keys.)
Caller says | MEANING | QUALIFIER |
---|---|---|
immediatamente | (Phrase not recognized) | -- |
mezzogiorno | 1200p | exact |
mezzanotte | 0000? | exact |
prima mezzogiorno | 1200p | before |
dopo tredici trenta | 1330h | after |
venti venti | 2020h | exact |
quattro di mattina | 0400a | exact |
otto trenta | 0830? | exact |
sette quindici della notte | 0715p | exact |
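The MEANING values above follow the VoiceXML built-in time format: four digits (hhmm) followed by a one-character suffix, where a is AM, p is PM, h is 24-hour, and ? is ambiguous. A minimal sketch for decoding them in application code follows; `parse_time_meaning` and the `SUFFIXES` labels are illustrative names.

```python
import re

# Suffix characters as defined for the VoiceXML built-in time type.
SUFFIXES = {"a": "AM", "p": "PM", "h": "24-hour", "?": "ambiguous"}


def parse_time_meaning(meaning: str):
    """Split a time MEANING string such as "1330h" into (hour, minute, kind)."""
    match = re.fullmatch(r"(\d{2})(\d{2})([aph?])", meaning)
    if match is None:
        raise ValueError(f"unexpected time MEANING: {meaning!r}")
    hour, minute, suffix = match.groups()
    return int(hour), int(minute), SUFFIXES[suffix]
```

An application would typically re-prompt the caller when the result is marked ambiguous, for example after “otto trenta.”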
zipcode built-in grammar
The zipcode grammar recognizes valid postal codes in Italy (“Codice di Avviamento Postale,” CAP) in five-digit format.
Return keys/values
Upon return, the key MEANING is assigned to the recognized postal code, and contains five digits.
Vocabulary items and pronunciations
This chapter describes considerations for vocabularies and their pronunciations in Italian (it-IT).
Specially tuned pronunciations
The following list shows common words that are fine-tuned by Nuance:
- All letters of the alphabet, a-z
- sì, no
- Cardinal numbers: 0-99, 100, and 1000
Italian pronunciations
This section provides detailed reference information to help create pronunciation dictionaries. It is intended for people who have sufficient knowledge of the Italian language as spoken in Italy. It provides information about transcription and pronunciation.
The reference pronunciation dictionary is:
PONS Wörterbuch für Schule und Studium. Italienisch-Deutsch 1. Stuttgart et al.: Klett. 2.ed., April 2005. (ISBN 3-12-517490-2)
If you are not sure how a certain word is pronounced, you can refer to the IPA transcriptions given there and then convert them into SAMPA symbols using the alphabetic SAMPA-IPA table (The Italian symbol set in alphabetical order).
The Italian phoneme system
The Italian phoneme system can be divided into two groups:
- Consonants
- Vowels
Furthermore, it is possible to distinguish six different types of Italian consonants:
- Plosives
- Fricatives
- Affricates
- Nasals
- Laterals
- Trills
Within the vowel group, a further distinction can be made between monophthongs and diphthongs.
Below you will find the phonemes of the Italian SAMPA symbol set. They are grouped by the phoneme classes to which they belong (according to the manner of their articulation).
Italian symbol set grouped by phoneme classes
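Phoneme class | SAMPA | IPA | Example of usage |
---|---|---|---|
Consonants: plosives | b | b | banco |
Consonants: plosives | bb | bb | gobba /gObba/ |
Consonants: plosives | p | p | pane /pane/ |
Consonants: plosives | pp | pp | coppa /kOppa/ |
Consonants: plosives | d | d | danno /danno/ |
Consonants: plosives | dd | dd | cadde /kadde/ |
Consonants: plosives | t | t | tana /tana/ |
Consonants: plosives | tt | tt | zitto /tsitto/ |
Consonants: plosives | g | ɡ | gamba /gamba/ |
Consonants: plosives | gg | ɡɡ | leggo /lEggo/ |
Consonants: plosives | k | k | cane /kane/ |
Consonants: plosives | kk | kk | tocca /tOkka/ |
Consonants: fricatives | v | v | vano |
Consonants: fricatives | vv | vv | bevvi /bevvi/ |
Consonants: fricatives | f | f | fame /fame/ |
Consonants: fricatives | ff | ff | beffa /bEffa/ |
Consonants: fricatives | z | z | sbaglio /zbaLo/ |
Consonants: fricatives | s | s | sano /sano/ |
Consonants: fricatives | ss | ss | cassa /kassa/ |
Consonants: fricatives | S | ʃ | scendo /Sendo/ |
Consonants: affricates | dz | dz̬ | zona |
Consonants: affricates | ddz | ddz̬ | mezzo /mEddzo/ |
Consonants: affricates | ts | ts̬ | stanza /stantsa/ |
Consonants: affricates | tts | tts̬ | bozza /bOttsa/ |
Consonants: affricates | dZ | ʤ̬ | gita /dZita/ |
Consonants: affricates | ddZ | dʤ̬ | oggi /OddZi/ |
Consonants: affricates | tS | ʧ̬ | cena /tSena/ |
Consonants: affricates | ttS | tʧ̬ | braccio /brattSo/ |
Consonants: nasals | m | m | molla |
Consonants: nasals | mm | mm | grammo /grammo/ |
Consonants: nasals | n | n | notte /nOtte/ |
Consonants: nasals | nn | nn | panna /panna/ |
Consonants: nasals | J | ɲ | gnocco /JOkko/ |
Consonants: laterals | l | l | luce |
Consonants: laterals | ll | ll | colla /kOlla/ |
Consonants: laterals | L | ʎ | foglia /foLa/ |
Consonants: trills | r | r | rete /rete/ |
Consonants: trills | rr | rr | ferro /fErro/ |
Vowels: single vowels | a | a | rata |
Vowels: single vowels | e | e / e: | rete /rete/ |
Vowels: single vowels | E | ɛ / ɛ: | pesca /pEska/ |
Vowels: single vowels | i | i | mite /mite/ |
Vowels: single vowels | o | o / o: | dove /dove/ |
Vowels: single vowels | O | ɔ / ɔ: | moto /mOto/ |
Vowels: single vowels | u | u | muto /muto/ |
Vowels: ascending diphthongs | ja | i̯a | piatto |
Vowels: ascending diphthongs | je | i̯e: | chiesa /kjeza/ |
Vowels: ascending diphthongs | jE | i̯ɛ | richiesta /rikjEsta/ |
Vowels: ascending diphthongs | jo | i̯o: | fiore /fjore/ |
Vowels: ascending diphthongs | jO | i̯ɔ | fiocco /fjOkko/ |
Vowels: ascending diphthongs | ju | i̯u | aiutare /ajutare/ |
Vowels: ascending diphthongs | wa | u̯a | quadro /kwadro/ |
Vowels: ascending diphthongs | we | u̯e | questo /kwesto/ |
Vowels: ascending diphthongs | wE | u̯ɛ | guerra /gwErra/ |
Vowels: ascending diphthongs | wo | u̯o | rincuorare /rinkworare/ |
Vowels: ascending diphthongs | wO | u̯ɔ | uomo /wOmo/ |
Vowels: descending diphthongs | ai | a:i | mai /mai/ |
Vowels: descending diphthongs | au | au | aumento /aumento/ |
Vowels: descending diphthongs | ei | e:i | potei /potei/ |
Vowels: descending diphthongs | Ei | ɛ:i | lei /lEi/ |
Vowels: descending diphthongs | eu | eu | europeo /europEo/ |
Vowels: descending diphthongs | oi | o:i | voi /voi/ |
Vowels: descending diphthongs | Oi | ɔ:i | poi /pOi/ |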
Italian consonants
The standard Italian consonant system is considered to have:
- Six plosives
- Five fricatives
- Four affricates
- Three nasals
- Two laterals
- One trill
A peculiarity of Italian pronunciation is that double consonants (for example, <bb> and <gg>) are pronounced as an intensified phoneme called a geminate. Every consonant except /z/, /J/, /S/, and /L/ has a corresponding geminate (see Double letters). Geminates are shown in the following tables under the respective single consonants. The semivowels /j/ and /w/ are included under diphthongs because in Italian they occur only as the first part of an ascending diphthong.
Plosives
There are three voiced and three voiceless plosives in Italian, which can be arranged in pairs:
Voiced | Examples | Voiceless |
---|---|---|
/b/ | bene imbottire | /p/ |
/bb/ | abbonato | /pp/ |
/d/ | dove sdrucciola andare | /t/ |
/dd/ | additivo | /tt/ |
/g/ | gara ghetto ago | /k/ |
/gg/ | agghindare | /kk/ |
Fricatives
There are five fricatives in the Italian SAMPA symbol set, two voiced and three voiceless. The voiced ones can be paired with their voiceless counterparts:
Voiced | Examples | Voiceless |
---|---|---|
/v/ | valere convolare svogliato | /f/ |
/vv/ | avvalersi | /ff/ |
/z/ | caso snodare | /s/ |
<i> after /S/ is not to be transcribed unless it forms a syllable of its own. For example: sciopero = /SOpero/ (and not /SjOpero/).
Affricates
In Italian there are four affricates, /ts/, /tS/, /dz/, and /dZ/. Affricates are always represented in SAMPA by two single phonemes.
Voiced | Examples | Voiceless |
---|---|---|
/dz/ | zona | /ts/ |
/ddz/ | azzurro indennizzo | /tts/ |
/dZ/ | giorno cangiante | /tS/ |
/ddZ/ | maggio | /ttS/ |
The letter <z> and its double are mostly pronounced /dz/ and /ddz/ respectively if standing between two vowels or at the beginning of a word (see Pronunciation of <e>, <o>, and <z>).
<i> after /dZ/, /ddZ/ and /tS/, /ttS/ is not to be transcribed unless it forms a syllable of its own. For example: gioco = /dZOko/ (and not /dZjOko/).
Nasals
There are three nasals in Italian, /m/, /n/, and /J/.
Phoneme | Examples | Transcriptions |
---|---|---|
/m/ | mano ambizione | /mano/ /ambittsjone/ |
/mm/ | ammonire | /ammonire/ |
/n/ | naso pane | /naso/ /pane/ |
/nn/ | annoiato | /annojato/ |
/J/ | gnomo ingegnere | /Jomo/ /indZeJEre/ |
Laterals
There are two laterals in Italian, /l/ and /L/.
Phoneme | Examples | Transcriptions |
---|---|---|
/l/ | lampada salmo | /lampada/ /salmo/ |
/ll/ | pallone | /pallone/ |
/L/ | gli aglio | /Li/ /aLo/ |
Trills
There is one trill in Italian which is pronounced with the tongue tip: /r/.
Phoneme | Examples | Transcriptions |
---|---|---|
/r/ | radio cera | /radjo/ /tSera/ |
/rr/ | afferrare | /afferrare/ |
Italian vowels
In Italian, vowels fall into two groups:
- Monophthongs
- Diphthongs (subdivided into ascending and descending diphthongs)
Monophthongs
In Italian, seven monophthongs are usually distinguished: /a/, /E/, /e/, /i/, /O/, /o/, and /u/. (See Pronunciation of <i> and Pronunciation of <e>, <o>, and <z>.)
Phoneme | Examples | Transcriptions |
---|---|---|
/a/ | palo arma | /palo/ /arma/ |
/E/ | etica costituente cioè | /Etika/ /kostituEnte/ /tSoE/ |
/e/ | elemento bere te | /elemento/ /bere/ /te/ |
/i/ | fine intimo parvi | /fine/ /intimo/ /parvi/ |
/O/ | ottimo ispanofono però | /Ottimo/ /ispanOfono/ /perO/ |
/o/ | ondoso ombra uno | /ondoso/ /ombra/ /uno/ |
/u/ | fumo unto giù | /fumo/ /unto/ /dZu/ |
Diphthongs
The Italian diphthongs can be divided into:
- descending diphthongs
- ascending diphthongs
Descending diphthongs
Descending diphthongs are a combination of two monophthongs with the stress on the first monophthong. Since descending diphthongs do not usually carry important information for the meaning of a word, a reduced set of them is used for the SAMPA version. Speech recognition tests have shown very good results for this practice. The main advantage is that it reduces the number of phonemes to be considered and thus the possible sources of error.
There are seven descending diphthongs in Italian, /ai/, /Ei/, /ei/, /Oi/, /oi/, /au/, and /eu/.
Phoneme | Examples | Transcriptions |
---|---|---|
/ai/ | aitante laico mai | /aitante/ /laiko/ /mai/ |
/Ei/ | lei | /lEi/ |
/ei/ | potei | /potei/ |
/Oi/ | poi | /pOi/ |
/oi/ | voi | /voi/ |
/au/ | aumento | /aumento/ |
/eu/ | europeo | /europEo/ |
Ascending diphthongs
Ascending diphthongs are formed by combining a semivowel with a subsequent monophthong with the stress on the monophthong.
There are eleven ascending diphthongs in Italian, /ja/, /jE/, /je/, /jO/, /jo/, /ju/, /wa/, /wE/, /we/, /wO/, and /wo/.
Phoneme | Orthographic form | Example |
---|---|---|
/ja/ | <ia> | amiamo macchia |
/jE/ | <ie> | ieri richiesta |
/je/ | <ie> | ieratico piegare varie |
/jO/ | <io> | iodio fiotto picchiò |
/jo/ | <io> | fionda occhio |
/ju/ | <iu> | aiutare più |
/wa/ | <ua> | quadro acqua |
/wE/ | <ue> | guerra |
/we/ | <ue> | questo |
/wO/ | <uo> | uomo cuoco dileguò |
/wo/ | <uo> | rincuorare scialacquo |
Specific pronunciation transcription methods
Initial <h>
Initial <h> is always ignored in the transcription as it is not pronounced in Italian, e.g. hotel = /otEl/, hanno = /anno/. In the middle of a word it only stands in the following contexts: <che>, <chi>, <ghe>, <ghi> so that the consonants are pronounced as plosives (/ke/, /ki/, /ge/, /gi/).
Pronunciation of <i>
<i> between <c> or <g> and another vowel is not pronounced (<cia> = /tSa/ etc.), except when it is stressed, for example:
farmacia = /farmatSia/
Double letters
The double phonemes /bb/, /dd/, /ff/, /gg/, /kk/, /ll/, /mm/, /nn/, /pp/, /rr/, /ss/, /tt/, and /vv/ coincide with the corresponding double letters. When transcribing double affricates, note that only the plosive element is doubled, for example:
<zz> = /tts/ or /ddz/
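The doubling rule can be sketched in a few lines of Python. For both simple consonants and affricates, the geminate is formed by repeating the first character of the SAMPA symbol, which for an affricate is its plosive element. The `geminate` function and the two symbol sets are illustrative names built from the consonant inventory in this document.

```python
# Consonants that have no geminate, as stated in the consonant section.
NO_GEMINATE = {"z", "J", "S", "L"}

# Single consonants and affricates that do have a geminate.
GEMINABLE = {
    "b", "p", "d", "t", "g", "k",   # plosives
    "v", "f", "s",                  # fricatives
    "m", "n", "l", "r",             # nasals, lateral, trill
    "ts", "dz", "tS", "dZ",         # affricates
}


def geminate(symbol: str) -> str:
    """Return the SAMPA geminate of a consonant symbol.

    Only the first character is doubled, so for an affricate such as
    "ts" only the plosive element is repeated ("tts", not "tsts").
    """
    if symbol in NO_GEMINATE:
        raise ValueError(f"/{symbol}/ has no geminate")
    if symbol not in GEMINABLE:
        raise ValueError(f"/{symbol}/ is not an Italian SAMPA consonant")
    return symbol[0] + symbol
```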
Accentuation
The accentuation of a word is a distinctive aspect in the Italian language (e.g. àncora vs. ancóra). However, pairs of words differing only in accentuation are rare and thus accentuation as such is not represented in this version of SAMPA phonetic transcription.
If a word exhibits an orthographic accent, the transcription is usually different, for example:
faro /faro/ vs. farò /farO/
Pronunciation of <e>, <o>, and <z>
The pronunciation of <e> and <o> as open or closed vowels or <z> as /dz/ or /ts/ has in some cases an importance for the meaning of a word (e.g. /botte/ vs. /bOtte/ and /raddza/ versus /rattsa/). Nevertheless it generally varies according to the regional accent of the speaker and has no influence on the meaning, for example:
/gonna/ = /gOnna/
/raddzo/ = /rattso/
A good dictionary can help you to find the standard pronunciation.
The suffix <-zione>
When <z> introduces the suffix <-zione> it is always to be transcribed as a voiceless geminate, for example:
vibrazione = /vibrattsjone/
Pronunciation of foreign words
There is a set of phonemes that are not regular Italian phonemes and occur only in foreign words. These are used in the system dictionary and are internally “mapped” to other phonemes in the speech recognizer. No phoneme models are trained for them, so it is not necessary to use these phonemes in custom transcriptions.
When there is a need to transcribe foreign words the general rule is to transcribe those words with the same SAMPA symbol set as the rest. In the case of an Italian transcription you have to transcribe every word of the dictionary with the Italian SAMPA symbols.
If you use a different symbol set your system will be incapable of understanding the input.
Every language has a different phoneme inventory, so you may have problems in covering each and every sound.
Below are some examples to help you find a pronunciation adapted to Italian.
French nasals
Nasalized vowels are common in French. However, there are no appropriate vowels in Italian to reflect this. The easiest solution is to use the equivalent non-nasalized vowel and then add the nasal /n/.
For example:
bon-bon /bOnbOn/
This is convenient, since it normally reflects the pronunciation used by Italian speakers who do not know French.
The original transcription ‘bo~bo~’ cannot be realized because the French symbol ‘o~’ is not part of the Italian SAMPA symbol set.
English words
Even with English words you have to try to apply a pronunciation that has been adapted to Italian, for example:
camping /kamping/
The English phonemes ‘{’, ‘I’, and ‘N’ in the original transcription ‘k{mpIN’ are not elements of the Italian SAMPA symbol set.
The Italian symbol set in alphabetical order:
SAMPA | IPA | Examples of use |
---|---|---|
a | a | rata |
ai | a:i | mai |
au | au | aumento |
b | b | banco |
bb | bb | gobba |
d | d | danno |
dd | dd | cadde |
ddz | ddz̬ | mezzo |
ddZ | dʤ̬ | oggi |
dz | dz̬ | zona |
dZ | ʤ̬ | gita |
e | e / e: | rete |
E | ɛ / ɛ: | pesca |
ei | e:i | potei |
Ei | ɛ:i | lei |
eu | eu | europeo |
f | f | fame |
ff | ff | beffa |
g | ɡ | gamba |
gg | ɡɡ | leggo |
i | i | mite |
J | ɲ | gnocco |
ja | i̯a | piatto |
je | i̯e: | chiesa |
jE | i̯ɛ | richiesta |
jo | i̯o: | fiore |
jO | i̯ɔ | fiocco |
ju | i̯u | aiutare |
k | k | cane |
kk | kk | tocca |
l | l | luce |
L | ʎ | foglia |
ll | ll | colla |
m | m | molla |
mm | mm | grammo |
n | n | notte |
nn | nn | panna |
o | o / o: | dove |
O | ɔ / ɔ: | moto |
oi | o:i | voi |
Oi | ɔ:i | poi |
p | p | pane |
pp | pp | coppa |
r | r | rete |
rr | rr | ferro |
s | s | sano |
S | ʃ | scendo |
ss | ss | cassa |
t | t | tana |
ts | ts̬ | stanza |
tS | ʧ̬ | cena |
tt | tt | zitto |
tts | tts̬ | bozza |
ttS | tʧ̬ | braccio |
u | u | muto |
v | v | vano |
vv | vv | bevvi |
wa | u̯a | quadro |
we | u̯e | questo |
wE | u̯ɛ | guerra |
wo | u̯o | rincuorare |
wO | u̯ɔ | uomo |
z | z | sbaglio |
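Because every custom transcription must use only the symbols in this table, it can be useful to check transcriptions before adding them to a dictionary. The following is a minimal sketch, assuming a greedy longest-match split is sufficient for validation; `ITALIAN_SAMPA` and `tokenize_sampa` are hypothetical names and not part of any Nuance tool.

```python
# The 63 symbols of the Italian SAMPA set, taken from the table above.
ITALIAN_SAMPA = set("""
a ai au b bb d dd ddz ddZ dz dZ e E ei Ei eu f ff g gg i J
ja je jE jo jO ju k kk l L ll m mm n nn o O oi Oi p pp r rr
s S ss t ts tS tt tts ttS u v vv wa we wE wo wO z
""".split())

MAX_LEN = max(len(symbol) for symbol in ITALIAN_SAMPA)


def tokenize_sampa(transcription: str):
    """Split a transcription into Italian SAMPA symbols.

    Uses greedy longest match. Returns the list of symbols, or None if
    some part of the input is not in the Italian symbol set.
    """
    tokens = []
    pos = 0
    while pos < len(transcription):
        for length in range(MAX_LEN, 0, -1):
            candidate = transcription[pos:pos + length]
            if candidate in ITALIAN_SAMPA:
                tokens.append(candidate)
                pos += len(candidate)
                break
        else:
            return None  # no symbol matches at this position
    return tokens
```

For example, the adapted transcription `kamping` splits cleanly into Italian symbols, while the English original `k{mpIN` is rejected. The split is meant only as a validity check: where two readings are possible (for instance “ai” as one diphthong or as two vowels), the greedy match picks the longer symbol.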