Portuguese Brazil (pt-BR)

This documentation was updated on January 26, 2024.

Creating grammars

The following subsections describe key issues for working with grammar documents in the Brazilian Portuguese language.

Character encoding

Nuance has full internal Unicode support, so you can create your grammars using UTF-8 or Latin-1 (also known as ISO-8859-1) character encoding. For example, your grammar header might be:

<?xml version='1.0' encoding='UTF-8'?>

<grammar xml:lang="pt-BR" version="1.0" root="test">

If you do not have access to a keyboard for your target language, you can use the Windows character map. (Choose the “System” font and the “Latin-1” subset.)

Start > Program Files > Accessories > System Tools > Character Map

Entering Portuguese grammars for ParseTool

The ParseTool program lets you type sentences into a grammar and returns the grammar’s results. The tool is stored in the bin directory of the installation baseline. See your product documentation for information.

When entering Portuguese text for the built-in grammars, note the following:

  • Letters a-z are allowed, including ã, â, á, à, ç, ê, é, í, ó, õ, and ú.
  • Arabic numerals are not allowed. For example, the number grammar does not parse “25” but it does parse “vinte_e_cinco”.
  • Numbers of three or more digits must be entered with a space between the number words. For example, 12,345 must be entered as:

doze mil trezentos quarenta_e_cinco
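The input rules above can be checked before text is passed to ParseTool. The following is a minimal sketch, not part of the product: the function name is hypothetical, the letter set is the one listed above, and treating the underscore and the space as the only permitted separators is an assumption based on examples such as “vinte_e_cinco”.

```python
import re

# Letters listed in the text above. The underscore and the space are assumed
# separators, inferred from examples such as "vinte_e_cinco".
ALLOWED_INPUT = re.compile(r"[a-zãâáàçêéíóõú_ ]+")

def is_valid_parsetool_input(text: str) -> bool:
    """Return True if text contains only the permitted letters and separators."""
    return bool(ALLOWED_INPUT.fullmatch(text))
```

With this check, “doze mil trezentos quarenta_e_cinco” passes, while “25” is rejected because it contains Arabic numerals.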

alphanum_lc built-in grammar

The alphanum_lc built-in grammar recognizes a connected string of up to 20 digits and lower case alphabetic characters. For example, this grammar could be used to recognize a product code or order number.

Characters are the letters a-z, including accented letters.

Digits are 0-9.

NOTE: The alphanum_lc built-in grammar replaces the alphanum built-in grammar.

alphanum built-in grammar

The alphanum built-in grammar recognizes a connected string of up to 20 digits and alphabetic characters. For example, this grammar could be used to recognize a product code or order number.

Characters are the letters a-z, including accented letters.

Digits are 0-9.

boolean built-in grammar

The boolean grammar collects an affirmative or negative response.

Properties

The y and n parameters let you associate any two touchtone buttons as synonyms for yes and no.

Parameter Description
y Desired DTMF digit to be equivalent to “sim” (default = 1)
n Desired DTMF digit to be equivalent to “não” (default = 2)

Examples

Caller says… MEANING key
sim true
não false
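The effect of the y and n parameters can be illustrated with a small sketch. The function below is hypothetical and only models the mapping described above, with the documented defaults of 1 and 2. Returning None for any other key is an assumption.

```python
def interpret_dtmf(key: str, y: str = "1", n: str = "2"):
    """Map a DTMF key press to the boolean MEANING value, or None if unmapped."""
    if key == y:
        return "true"   # same result as the caller saying "sim"
    if key == n:
        return "false"  # same result as the caller saying "não"
    return None
```

For example, with y set to 7 and n set to 9, pressing 7 yields the same MEANING value as saying “sim”.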

ccexpdate built-in grammar

The ccexpdate grammar understands the expiration date on a credit card. Expiration dates are usually a month and a year, and are often embossed on a credit card in the form “mm/yy.”

The grammar recognizes variations on the date, for example, “dezembro 2005,” “zero quatro zero zero,” “onze barra zero três,” etc. The slash symbol (/) can be spoken as “barra,” “de,” or “do.”

Some credit cards are stamped with a day of the month as well as the month and year; the ccexpdate grammar recognizes these dates as well. However, the only day of the month it recognizes is the last day of a given month, e.g., “30 novembro 2005,” “zero dois dois nove zero seis,” etc. The grammar does not check for leap years: both 28 February and 29 February are recognized, regardless of the given year.
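The day-of-month rule can be sketched as a lookup table. This is an illustration of the behavior described above, not the grammar's implementation, and the names are hypothetical.

```python
# Last day of each month. February maps to both 28 and 29 because the
# grammar, as described above, does not check for leap years.
LAST_DAYS = {
    1: {31}, 2: {28, 29}, 3: {31}, 4: {30}, 5: {31}, 6: {30},
    7: {31}, 8: {31}, 9: {30}, 10: {31}, 11: {30}, 12: {31},
}

def day_is_accepted(day: int, month: int) -> bool:
    """Return True if the day is a last day of the given month."""
    return day in LAST_DAYS.get(month, set())
```

Under this model “30 novembro” is accepted, while “15 novembro” is not.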

creditcard built-in grammar

The creditcard grammar understands a caller saying a credit card number, optionally preceding the number with the credit card name, or the words “conta número” or “conta.” For example, a caller can say, “visa conta número quatro zero um sete…,” “cartão mastercard cinco zero zero dois…,” or “três sete três cinco….”

currency built-in grammar

The currency grammar collects currency amounts using real and centavo, or dólar and centavo.

MEANING Contains a string in the following form: currency main_unit_amount.subunit_amount, written without spaces (for example, BRR5.05). If the caller explicitly says “real,” BRR is used as the currency prefix. If the caller says “dólar,” USD is used. In all other cases, no prefix is added.
SWI_literal Contains the exact text that was recognized.

Examples

Caller says MEANING
5 dólares USD5.00
5 reais BRR5.00
5 centavos 0.05
5 reais e 5 centavos BRR5.05
5 reais 25 centavos BRR5.25
5 reais 25 BRR5.25
seiscentos vinte_e_cinco mil quatrocentos sessenta_e_quatro reais BRR625464.00
um dólar zero centavos USD1.00
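An application that receives the MEANING string usually needs to separate the optional currency prefix from the amount. The sketch below assumes the format shown in the examples (an optional BRR or USD prefix followed by an amount with exactly two subunit digits). The function name is hypothetical.

```python
import re
from decimal import Decimal

# Optional prefix, main unit amount, and a two-digit subunit amount.
# The two-digit subunit is an assumption drawn from the examples above.
MEANING_RE = re.compile(r"(BRR|USD)?([0-9]+)\.([0-9]{2})")

def parse_currency_meaning(meaning: str):
    """Return (prefix, amount); prefix is None when no currency was spoken."""
    match = MEANING_RE.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected MEANING value: {meaning!r}")
    prefix, main, sub = match.groups()
    return prefix, Decimal(f"{main}.{sub}")
```

Decimal is used instead of float so that amounts such as 5.05 are kept exact.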

date built-in grammar

The date grammar accepts a date spoken in any of several formats.

Recognized phrases include “4 junho,” “4 junho 2006,” “4, 6, 2006,” “o dia quatro,” and “segunda feira, o quatro de junho.”

The grammar also accepts “ontem,” “hoje,” “amanhã,” and “depois de amanhã,” which return values of -1, 0, +1, and +2 respectively in the MEANING key.

Examples

Caller says MEANING key
5 janeiro, 2000 20000105
ontem -1
hoje 0
amanhã +1
depois de amanhã +2
o dia quatro ??????04
quarta feira (Phrase not recognized)
4 de junho ????0604
4 de junho de 1997 19970604
4 de junho de 97 ??970604
quarta feira 4 de junho de 1997 19970604
dez doze Not allowed
dez doze noventa_e_sete Not allowed
noventa e sete Not allowed
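The MEANING values in the table take two shapes: a relative day offset, or an eight-character date in which unknown positions are marked with question marks. The following sketch separates the two cases. The function names and the returned tuple layout are assumptions made for illustration.

```python
import re

DATE_RE = re.compile(r"([0-9?]{4})([0-9?]{2})([0-9?]{2})")
RELATIVE = {"-1", "0", "+1", "+2"}  # ontem, hoje, amanhã, depois de amanhã

def _field(text: str):
    """Convert one date field: int if fully known, None if fully unknown."""
    if text.isdigit():
        return int(text)
    if set(text) == {"?"}:
        return None
    return text  # partially known, for example "??97"

def parse_date_meaning(meaning: str):
    """Return ('relative', days) or ('absolute', (year, month, day))."""
    if meaning in RELATIVE:
        return ("relative", int(meaning))
    match = DATE_RE.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected MEANING value: {meaning!r}")
    return ("absolute", tuple(_field(g) for g in match.groups()))
```

A partially known year such as “??97” is returned unchanged so that the application can decide how to resolve the century.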

digits built-in grammar

Valid characters are the digits 0-9.

number built-in grammar

The number grammar recognizes whole numbers (the caller must say the number as a whole, not digit by digit). Decimal places are not allowed.

Examples

Numbers from -999,999,999.99 to 999,999,999.99 are recognized. For example:

Caller says MEANING key
vinte_e_cinco 25
doze mil trezentos quarenta_e_cinco 12345
menos quatro -4

phone built-in grammar

The phone grammar recognizes telephone numbers (landline and cellular). A telephone number may contain 7 or 8 digits plus an optional 2-digit area code.

Return keys/values

Upon return, the MEANING key is assigned to a 7-, 8-, 9-, or 10-digit character result representing the recognized phone number.
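Because the area code is optional, the length of the MEANING value indicates whether one was spoken. The sketch below assumes that a 9- or 10-digit result always starts with the 2-digit area code, which follows from the composition described above. The function name is hypothetical.

```python
def split_phone_meaning(meaning: str):
    """Split a phone MEANING value into (area_code, local_number)."""
    if not (meaning.isascii() and meaning.isdigit() and 7 <= len(meaning) <= 10):
        raise ValueError(f"unexpected MEANING value: {meaning!r}")
    if len(meaning) >= 9:
        # Assumption: 9 or 10 digits means a 2-digit area code was spoken.
        return meaning[:2], meaning[2:]
    return None, meaning
```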

postcode built-in grammar

The postcode grammar recognizes valid Brazilian postal codes in eight-digit format:

XXXXX - XXX

The caller must speak the hyphen “traço” between the two sets of digits.

Return keys/values

Upon return, the key MEANING is assigned to the recognized code, and contains 5 digits, a hyphen, and 3 more digits.
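The returned format can be verified with a simple pattern. This is a minimal sketch with a hypothetical function name. It checks only the shape of the value, not whether the code exists.

```python
import re

# Five digits, a hyphen, and three more digits, as described above.
POSTCODE_RE = re.compile(r"[0-9]{5}-[0-9]{3}")

def is_postcode_meaning(value: str) -> bool:
    """Return True if value has the form XXXXX-XXX."""
    return bool(POSTCODE_RE.fullmatch(value))
```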

time built-in grammar

The time grammar accepts spoken time utterances from the caller. Recognized phrases include times given in 12-hour format (for example, “5 horas”) and 24-hour format (“vinte_e_três e quinze”). In addition, it will recognize “qualified” times such as “antes das cinco horas” and “ao redor das cinco” (or “por volta das cinco”).

Examples

For each entry, the values returned in the MEANING and QUALIFIER keys are shown. (Not shown are the values of the HOUR, MINUTE, and AMPM keys.)

Caller says MEANING QUALIFIER
ao meio-dia 1200p exact
à meia-noite 0000? exact
antes do meio-dia 1200p before
depois das treze e trinta 1330h after
vinte e vinte 2020h exact
oito e vinte da manhã 0820a exact
oito e meia 0830? exact
sete e quinze da noite 0715p exact
às vinte e quatro horas 2400 Not allowed unless maxexpected = 2459.
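The MEANING values in the table consist of four digits (hours and minutes) followed by a one-character suffix. Judging from the examples, the suffix is a for morning, p for afternoon or evening, h for 24-hour format, and ? when the time is ambiguous. The sketch below splits the value accordingly. The function name is hypothetical, and treating the suffix as optional is an assumption based on the “2400” entry.

```python
import re

# hhmm followed by an optional suffix: a, p, h or ?
TIME_RE = re.compile(r"([0-9]{2})([0-9]{2})([aph?])?")

def parse_time_meaning(meaning: str):
    """Return (hour, minute, suffix); suffix is None when absent."""
    match = TIME_RE.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected MEANING value: {meaning!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)
```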

Vocabulary items and pronunciations

This chapter describes considerations for vocabularies and their pronunciations in Brazilian Portuguese (pt-BR).

Specially tuned pronunciations

The following list shows common words that are fine-tuned by Nuance. Each of these words contains “word-specific phonemes;” that is, phonemes and associated models created especially for the words.

Words with tuned pronunciations (do not modify):

  • sim, não
  • Cardinal numbers: 0-99, 100, and 1000

Brazilian Portuguese pronunciations

This section provides detailed reference information to help create pronunciation dictionaries. It is intended for people who have sufficient knowledge of the Portuguese language as spoken in Brazil. It provides information about transcription and pronunciation.

This section explains all the phonemes and their SAMPA symbols used in Brazilian Portuguese. Our reference sources for pronunciations are:

Wolfgang Pökl / Franz Rainer, Einführung in die romanische Sprachwissenschaft, Tübingen, 1990

Langenscheidts Taschenwörterbuch Portugiesisch

Volker Noll, Das brasilianische Portugiesisch, Heidelberg: Winter, 1999

If you are not sure how a certain word is pronounced, you can refer to the IPA transcriptions given there and convert them into the SAMPA symbols given in The Brazilian Portuguese symbol set in alphabetical order.

With this version of the Brazilian Portuguese language pack the new orthographic rules for the Portuguese language (valid from January 1, 2009) have been implemented.

For more information on the orthographic reform (document in Portuguese) see here.

A free conversion tool (in Portuguese) can be obtained on this website.

The Portuguese phoneme system

The Brazilian Portuguese phoneme system can be divided into two groups:

  • Consonants
  • Vowels

Furthermore, it is possible to distinguish six different types of Portuguese consonants:

  • Plosives
  • Fricatives
  • Affricates
  • Nasals
  • Laterals
  • Trills

Brazilian Portuguese symbol set grouped by phoneme classes

The following table is an overview of the phonemes of the Brazilian Portuguese SAMPA and IPA symbol set, grouped by the phoneme classes to which they belong (according to the manner of their articulation):

Phoneme class SAMPA IPA Examples of usage
Consonants Plosives b b
p p para /par@/
g g gato /gatu/
k k casa /kaz@/
d d dedo /dedu/
t t terno /tERnu/
Fricatives v v vaca
f f filho /fiLu/
s s samba /sa~b@/
z z casa /kaz@/
S ʃ caixa /kajS@/
Z ʒ gelo /Zelu/
Affricates tS ʧ noite
dZ ʤ cidade /sidadZI/
Nasal n n navio
m m meu /mew/
J ɲ manhã /ma~Ja~/
Laterals l l lata
L ʎ milha /miL@/
Trills r r caro
R ʀ carro /kaRu/
Vowels Single vowels a a
@ ə mala /mal@/
e e meu /mew/
E ɛ velho /vELu/
i i ilha /iL@/
I ɪ norte /nORtSI/
o o coisa /kojz@/
u u tudo /tudu/
O ɔ nova /nOv@/
Nasal vowels a~ a~ irmã
e~ e~ pente /pe~tSI/
i~ i~ marfim /maRfi~/
o~ o~ bom /bo~/
u~ u~ mundo /mu~du/
Semi-vowels w w meu
j j cai /kaj/
Oral diphthongs aj aj pai
aw aw pau /paw/
ej ej leito /lejtu/
Ej ɛj papéis /papEjs/
ew ew seu /sew/
Ew ɛw céu /sEw/
iw iw adiu /adZiw/
oj oj coisa /kojz@/
Oj ɔj sóis /sOjS/
ow ow pouco /powku/
Ow ɔw sol /sOw/
uj uj fortuito /foRtujtu/
uw uw adulto /aduwtu/
Nasal diphthongs a~j a~j mãe
a~w a~w acusação /akuzasa~w/
e~j e~j bem /be~j/
i~j i~j palíndrome /p@li~jdromI/
o~j o~j põe /po~j/
o~w o~w marrom /maRo~w/
u~j u~j muito /mu~jtu/
u~w u~w comum /komu~w/

Brazilian Portuguese consonants

The standard Brazilian Portuguese consonant system is considered to have:

  • Six plosives
  • Six fricatives
  • Two affricates
  • Three nasals
  • Two laterals
  • Two trills

The sample words given below demonstrate the different contexts in which the sounds can appear. A short explanation is also given.

Plosives

There are three voiced and three voiceless plosives in Brazilian Portuguese, which can be arranged in pairs:

Voiced Voiceless
/b/ bar /p/ para
/d/ dado /t/ terno
/g/ gato /k/ casa

Fricatives

There are six fricatives in the Brazilian Portuguese language, three voiced and three voiceless:

Voiced Voiceless
/v/ vaca /f/ filho
/z/ casa /s/ samba
/Z/ beijo /S/ caixa

Affricates

Unlike European Portuguese, Brazilian Portuguese contains affricates, and they are used quite commonly. There are two of them: /tS/ and /dZ/.

Voiced Voiceless
/dZ/ cidade /tS/ noite

Nasals

The three Brazilian Portuguese nasals are /m/, /n/ and /J/.

/m/ amiga /amig@/
/n/ nada /nad@/
/J/ sozinho /sOziJu/

/J/ is normally represented by the combination <nh> in Brazilian Portuguese orthography.

Laterals

There are two laterals in Brazilian Portuguese: /l/ and /L/.

/l/ lado /ladu/
/L/ velho /vELu/

Trills

There are officially two trills in Brazilian Portuguese. One of them, /R/, is mostly pronounced as an aspirated ‘h’ (which is not a separate phoneme but is covered by the phoneme /R/).

/r/ caro /karu/
/R/ carro /kaRu/

Brazilian Portuguese vowels

Single, nasal and semi-vowels

In Brazilian Portuguese, vowels fall into three groups:

  • Single vowels
  • Nasal vowels
  • Semi-vowels

Single vowels

There are nine single vowels, which are /@/, /a/, /e/, /E/, /i/, /I/, /o/, /O/ and /u/.

When the vowel is unstressed, transcribe the two unstressed vowels /@/ and /I/ instead of /a/ and /i/.

Note: As of this version of the language pack, the stressed and unstressed variants of <u> (‘u’ and ‘U’) are no longer distinguished. The single phoneme /u/ is used.

/@/ mala /mal@/
/a/ /pa/
/e/ seda /seda/
/E/ céu /sEw/
/i/ ali /ali/
/I/ norte /nORtSI/
/o/ por /poR/
/O/ mora /mOra/
/u/ mudo /mudu/

Nasal vowels

The nasal vowels are /a~/, /e~/, /i~/, /o~/, and /u~/

/a~/ irmã /iRma~/
/e~/ pente /pe~tSI/
/i~/ marfim /maRfi~/
/o~/ longe /lo~ZI/
/u~/ mundo /mu~du/

Semi-vowels

Semi-vowels are /j/ and /w/.

/j/ cai /kaj/
/w/ mau /maw/

Falling diphthongs

In our Brazilian Portuguese phoneme set, there are twenty-one falling diphthongs:

/a~j/ mãe /ma~j/
/a~w/ acusação /akuzasa~w/
/aj/ pai /paj/
/aw/ pau /paw/
/e~j/ bem /be~j/
/ej/ leito /lejtu/
/Ej/ papéis /papEjs/
/ew/ seu /sew/
/Ew/ céu /sEw/
/i~j/ sim /si~j/
/iw/ adiu /adZiw/
/oj/ coisa /kojz@/
/o~j/ põe /po~j/
/Oj/ sóis /sOjS/
/ow/ pouco /powku/
/o~w/ marrom /maRo~w/
/Ow/ sol /sOw/
/u~j/ muito /mu~jtu/
/uj/ fortuito /foRtujtu/
/u~w/ comum /komu~w/
/uw/ sul /suw/

Rising diphthongs

Rising diphthongs occur less often than falling diphthongs. Many of them may also be analyzed as hiatuses, thus assigning separate pronunciations to the two adjacent vowels. The difference between a rising diphthong and a hiatus is not phonemic; the former are usually found in colloquial speech, and the latter in careful pronunciation.

Specific pronunciation transcription methods

Initial <h>

The initial <h> should always be ignored in transcription as it is not pronounced in Brazilian Portuguese. For example:

hotel /otEw/
hora /Or@/

Two identical vowels adjacent to each other

When the vowel <e> is followed by the same vowel, the first <e> is transcribed as /e/ or /E/ (in analogy to the prefix <pre>).

preestabelecer /preeStabeleseR/ or /prEeStabeleseR/

Differences between fricatives and plosives

/g/ versus /Z/

Brazilian Portuguese realizes <g> as /g/ when it is followed by the vowels /u/, /a/, and /o/:

gosto /goStu/
guloso /gulozu/
gato /gatu/

It is realized as /Z/ when it is followed by /e/ or /i/:

girar /ZiraR/
gente /Ze~tSI/

When <g> is followed by <u> and then <e> or <i>, the <u> is not pronounced and <g> is realized as /g/. For example:

guiar /giaR/
guerra /gER@/

/k/ versus /s/

Brazilian Portuguese <c> may also be realized as either a fricative or as a plosive. The realization of <c> as the /k/ plosive occurs when <c> precedes <a>, <o>, or <u>, as in the following examples:

casa /kaz@/
coisa /kojz@/
cupido /kupidu/

However, <c> is realized as the fricative /s/ when <c> is followed by <e> or <i>:

cego /sEgu/
cigarro /sigaRu/

To realize a /k/ in front of /e/ and /i/, the combination <qu> is used. For example:

querer /kereR/
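The spelling rules for <g>, <c>, and <qu> described above can be summarized in a small lookup routine. This is a simplified sketch that handles only the word-initial letter. The function name is hypothetical, and including accented front vowels in the test set is an assumption that the text does not state.

```python
# Letters that trigger the fricative reading. The accented forms are an
# assumption; the text above names only <e> and <i>.
FRONT_VOWELS = set("eiéêí")

def realize_initial(word: str):
    """Return (phoneme, letters_consumed) for a word starting with g, c or qu."""
    first, rest = word[:1], word[1:]
    if first == "g":
        if rest[:1] == "u" and rest[1:2] in FRONT_VOWELS:
            return "g", 2  # <gu> before e/i: the <u> is not pronounced
        return ("Z", 1) if rest[:1] in FRONT_VOWELS else ("g", 1)
    if first == "c":
        return ("s", 1) if rest[:1] in FRONT_VOWELS else ("k", 1)
    if first == "q" and rest[:1] == "u" and rest[1:2] in FRONT_VOWELS:
        return "k", 2      # <qu> before e/i is realized as /k/
    raise ValueError(f"no rule for the start of {word!r}")
```

The second element of the result tells the caller how many letters the phoneme covers, so that the silent <u> in “guerra” or “querer” is skipped.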

Transcription of the fricatives /s/ and /z/

The voiceless fricative /s/ occurs in initial position before vowels. Before consonants and in final position, <s> is transcribed as /S/, as in the following examples:

sol /sOw/
mês /meS/
estar /eStaR/
pasto /paStu/

The realization of /S/ in end position can occur after any vowel. For example:

casas /kazaS/

The voiced fricative /z/ occurs before vowels. For example:

casa /kaz@/
zinco /zi~ku/

Transcription of the nasal /J/

The combination of <n> and <h> always produces the sound /J/. For example:

vinho /viJu/
cozinha /kuziJ@/

Transcription of the lateral /L/

The combination of <l> and <h> always produces the sound /L/. For example:

olho /oLu/
alho /aLu/

The epenthetic vowel

In Brazilian Portuguese, it is common to include a vowel, mostly an <i>, where there is actually none in the orthography:

Within a consonant cluster (two consonants) advogado /adZIvogadu/
Before final <s> mas /majS/
Before final <z> paz /pajZ/

Palatalization of /t/ and /d/

A very prominent characteristic of Brazilian Portuguese is the general tendency to palatalize the consonants /t/ and /d/ when they are realized before /i/ or /I/. Note that final <e> can also be realized as /I/. Examples:

tio /tSiu/
cidade /sidadZI/
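When transcriptions are generated or checked automatically, the palatalization rule can be applied to a list of SAMPA phonemes. The following sketch illustrates the rule as stated above, using a hypothetical function name. It does not cover nasal vowels, which the text does not mention.

```python
PALATALIZED = {"t": "tS", "d": "dZ"}

def palatalize(phonemes):
    """Replace /t/ and /d/ with /tS/ and /dZ/ when followed by /i/ or /I/."""
    result = []
    for index, phoneme in enumerate(phonemes):
        following = phonemes[index + 1] if index + 1 < len(phonemes) else None
        if phoneme in PALATALIZED and following in ("i", "I"):
            result.append(PALATALIZED[phoneme])
        else:
            result.append(phoneme)
    return result
```

In “cidade” only the second /d/ changes, because the first one is followed by /a/.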

Pronunciation of foreign words

When transcribing foreign words, use only phonemes from the Brazilian Portuguese phoneme inventory. To transcribe English, French, German, or other foreign words, do the following:

  • Check whether there is a common “Brazilian” pronunciation for this word. For example, Davidson /dejvidZiso~/ for English, bâton /bato~/ for French, etc.

In the case of Davidson, common Brazilian usage introduces a vowel where none exists (an epenthetic vowel) and palatalizes <d> and <t>. (See Palatalization of /t/ and /d/.)

  • Use the closest Brazilian phoneme to cover the actual foreign phoneme (as shown in the examples below).

Here are some standard substitutions in Brazilian transcriptions:

Language Example Pronunciation
English camping /ke~piJ/
French nasal bâton /bato~/

Multiple pronunciations (variants)

Since it is possible to have more than one pronunciation for a word by using pronunciation variants, it may be difficult to determine how many pronunciation variants should be created.

The general rule is: variants should only be created if the pronunciation differs in more than one phoneme. Minor systematic variations, such as pronouncing <s> at the end of a word as /z/ instead of the standard /S/, are usually reflected in the training material for the phonemes and therefore need not be covered by pronunciation variants. If such a word nevertheless causes recognition errors, creating pronunciation variants may help to solve the problem.
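One way to apply the general rule is to count the phoneme differences between two candidate pronunciations. The sketch below uses the Levenshtein (edit) distance over phoneme sequences as that count. This is an interpretation chosen for illustration, not a method prescribed by the language pack, and the function names are hypothetical.

```python
def phoneme_distance(first, second):
    """Levenshtein distance between two sequences of phonemes."""
    previous = list(range(len(second) + 1))
    for i, a in enumerate(first, 1):
        current = [i]
        for j, b in enumerate(second, 1):
            current.append(min(
                previous[j] + 1,             # delete a phoneme
                current[j - 1] + 1,          # insert a phoneme
                previous[j - 1] + (a != b),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]

def needs_variant(first, second) -> bool:
    """Return True if the pronunciations differ in more than one phoneme."""
    return phoneme_distance(first, second) > 1
```

For example, /kazaS/ and /kazaz/ differ in one phoneme only, so no separate variant would be created under this reading of the rule.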

The Brazilian Portuguese symbol set in alphabetical order:

SAMPA IPA Examples of usage
@ ə mala
a a mala
a~ a~ irmã
a~j a~j mãe
a~w a~w acusação
aj aj pai
aw aw pau
b b bar
d d dedo
dZ ʤ cidade
e e meu
E ɛ velho
e~ e~ pente
e~j e~j bem
ej ej leito
Ej ɛj papéis
ew ew seu
Ew ɛw céu
f f filho
g g gato
i i ilha
I ɪ norte
i~ i~ marfim
i~j i~j cafezinho
iw iw adiu
j j cai
J ɲ manhã
k k casa
l l lata
L ʎ milha
m m meu
n n navio
o o coisa
O ɔ nova
o~ o~ bom
o~j o~j põe
o~w o~w marrom
oj oj coisa
Oj ɔj sóis
ow ow pouco
Ow ɔw sol
p p para
r r caro
R ʀ carro
s s samba
S ʃ caixa
t t terno
tS ʧ noite
u u tudo
u~ u~ mundo
u~j u~j muito
u~w u~w comum
uj uj fortuito
uw uw adulto
v v vaca
w w meu
z z casa
Z ʒ gelo
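The symbol set above can be used to check that a transcription contains only phonemes from the Brazilian Portuguese inventory, which is required, for example, when transcribing foreign words. The following sketch splits a SAMPA string into symbols by greedy longest match. The function name is hypothetical, and the greedy strategy is an assumption: it always prefers a diphthong symbol over two separate vowels, so a hiatus has to be checked by hand.

```python
# Symbols copied from the alphabetical table above.
SYMBOLS = set("""
@ a a~ a~j a~w aj aw b d dZ e E e~ e~j ej Ej ew Ew f g i I i~ i~j iw j J
k l L m n o O o~ o~j o~w oj Oj ow Ow p r R s S t tS u u~ u~j u~w uj uw
v w z Z
""".split())

def tokenize(transcription: str):
    """Split a SAMPA string into symbols, longest match first."""
    tokens, position = [], 0
    while position < len(transcription):
        for size in (3, 2, 1):
            chunk = transcription[position:position + size]
            if chunk in SYMBOLS:
                tokens.append(chunk)
                position += len(chunk)
                break
        else:
            # No symbol of any length matched at this position.
            raise ValueError(
                f"unknown symbol at position {position}: "
                f"{transcription[position]!r}"
            )
    return tokens
```

A transcription that uses a symbol outside the inventory, such as an /h/, raises an error instead of being silently accepted.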