Spanish Valencian (va-ES)
This documentation was updated on November 9, 2023.
Creating grammars
The following subsections describe key issues for working with grammar documents in the Valencian language.
Character encoding
Nuance Recognizer has full internal Unicode support, so you can create your grammars using UTF-8 or Latin-1 (also known as ISO-8859-1) character encoding. For example, your grammar header might be:
<?xml version='1.0' encoding='UTF-8'?>
<grammar xml:lang="va-ES" version="1.0" root="test">
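Whichever encoding the header declares, the bytes of the grammar file should be written in that same encoding. The following Python sketch illustrates this; the helper names `grammar_header` and `encode_grammar` are hypothetical and are not part of any Nuance tool.

```python
# Minimal sketch: keep the declared encoding and the actual file bytes in sync.
# grammar_header() and encode_grammar() are illustrative helpers, not a Nuance API.

def grammar_header(encoding="UTF-8"):
    """Return the two header lines with the chosen encoding declared."""
    return (
        f"<?xml version='1.0' encoding='{encoding}'?>\n"
        '<grammar xml:lang="va-ES" version="1.0" root="test">\n'
    )

def encode_grammar(text, encoding="UTF-8"):
    """Encode grammar text with the same encoding that the header declares."""
    return text.encode(encoding)

SAMPLE = "sí à ç ò"  # a few Valencian characters from this document

utf8_bytes = encode_grammar(grammar_header("UTF-8") + SAMPLE, "UTF-8")
latin1_bytes = encode_grammar(grammar_header("ISO-8859-1") + SAMPLE, "ISO-8859-1")

# Each accented character takes one byte in Latin-1 and two bytes in UTF-8.
print(len(SAMPLE.encode("ISO-8859-1")), len(SAMPLE.encode("UTF-8")))
```

Both byte strings decode back to the same text as long as they are read with the encoding named in their header.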
Below are codes for writing some common Valencian characters. These are useful if you do not have access to a Valencian keyboard, and are typed by pressing the Alt key while entering digits on your keyboard (after typing the last digit, the desired character appears on your screen when you release the Alt key):
Alt/0224 = à | Alt/0237 = í |
---|---|
Alt/0225 = á | Alt/0239 = ï |
Alt/0231 = ç | Alt/0242 = ò |
Alt/0232 = è | Alt/0243 = ó |
Alt/0233 = é | Alt/0249 = ù |
Alt/0250 = ú |
If you do not have access to a keyboard for your target language, you can use the Windows character map. (Choose the “System” font and the “Latin-1” subset.)
Start→Programs→Accessories→System Tools→Character Map
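The Alt codes listed above correspond to the Latin-1 code points of the characters, which is why the same table can be reproduced programmatically. The sketch below is an illustration only; `alt_code_char` is a hypothetical helper name.

```python
# Minimal sketch: the Alt codes in the table above equal the Latin-1 code points.
ALT_CODES = [224, 225, 231, 232, 233, 237, 239, 242, 243, 249, 250]

def alt_code_char(code):
    """Return the character that a given Alt/0nnn code produces (Latin-1 range)."""
    return bytes([code]).decode("latin-1")

for code in ALT_CODES:
    print(f"Alt/0{code} = {alt_code_char(code)}")
```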
alphanum_lc built-in grammar
The alphanum_lc grammar recognizes a connected string of up to 20 digits and lower case alphabetic characters.
For example, this grammar could be used to recognize a product code or order number.
Characters are the letters a-z, and à, á, ç, è, é, í, ï, ò, ó, ù, and ú.
Digits are 0-9.
Non-alphanumeric characters such as hyphens (-), dots (.), and underscores (_) are not recognized; if spoken they reduce recognition accuracy.
Returned keys/values
MEANING | Contains a string of ISO-8859-1 digits and lowercase letters, with no embedded spaces. |
---|---|
SWI_literal | Contains the exact text that was recognized. |
Note: the alphanum_lc built-in grammar replaces the alphanum built-in grammar.
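An application may want to sanity-check a returned MEANING value against the description above before using it. The following Python sketch assumes that "up to 20" means 1 to 20 characters and that accented letters arrive as single precomposed characters; `looks_like_alphanum_lc` is a hypothetical helper, not part of the recognizer.

```python
import re

# Allowed characters as described above: a-z, 0-9, and the listed accented letters.
# Assumption: accented letters are precomposed (for example "ç" as one code point).
ALLOWED = "a-z0-9àáçèéíïòóùú"
ALPHANUM_LC = re.compile(f"[{ALLOWED}]{{1,20}}")

def looks_like_alphanum_lc(value):
    """Return True if value has 1-20 allowed characters and nothing else."""
    return ALPHANUM_LC.fullmatch(value) is not None

print(looks_like_alphanum_lc("ab12ç"))  # lowercase letters and digits only
print(looks_like_alphanum_lc("a-b"))    # hyphen is not an allowed character
```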
alphanum built-in grammar
The alphanum grammar recognizes a connected string of up to 20 digits and alphabetic characters. For example, this grammar could be used to recognize a product code or order number.
Characters are the letters a-z, and à, á, ç, è, é, í, ï, ò, ó, ù, and ú.
Digits are 0-9.
Non-alphanumeric characters such as hyphens (-), dots (.), and underscores (_) are not recognized; if spoken they reduce recognition accuracy.
Returned keys/values
MEANING | Contains a string of ISO-8859-1 digits and lowercase letters, with no embedded spaces. |
---|---|
SWI_literal | Contains the exact text that was recognized. |
boolean built-in grammar
The boolean grammar collects an affirmative or negative response.
Properties
The y and n parameters let you associate any two touchtone buttons as synonyms for yes and no.
Parameter | Description |
---|---|
y | Desired DTMF digit to be equivalent to “sí” (default = 1) |
n | Desired DTMF digit to be equivalent to “no” (default = 2) |
Examples
Caller says | MEANING key |
---|---|
sí | true |
no | false |
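The mapping from caller input to the MEANING key, including the y and n DTMF parameters, can be pictured with a short Python sketch. It only mirrors the tables above; the function `boolean_meaning` is hypothetical and recognition itself is performed by the built-in grammar.

```python
# Minimal sketch of the boolean mapping described above (not the recognizer itself).
def boolean_meaning(caller_input, y="1", n="2"):
    """Map a spoken 'sí'/'no' or a DTMF digit to the MEANING value, else None."""
    token = caller_input.strip().lower()
    if token in ("sí", y):
        return "true"
    if token in ("no", n):
        return "false"
    return None  # input is neither an affirmative nor a negative response

print(boolean_meaning("sí"))               # spoken affirmative
print(boolean_meaning("2"))                # default DTMF digit for "no"
print(boolean_meaning("9", y="9", n="0"))  # custom DTMF assignment
```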
digits built-in grammar
Valid characters are the digits 0-9.
Vocabulary items and pronunciations
This chapter describes considerations for vocabularies and their pronunciations in Valencian (va-ES). Your product documentation covers details about how to work with pronunciations and dictionaries.
Specially tuned pronunciations
The following table shows common words that are fine-tuned by Nuance.
Many of these words contain so-called “word-specific phonemes” in their pronunciation transcription.
These are phonemes that have been trained exclusively on data containing the respective words.
Words with tuned pronunciations (do not modify):
All letters of the alphabet, a-z, à, á, ç, è, é, í, ï, ò, ó, ù, and ú.
Boolean: sí and no
Digits: 0-9
Valencian pronunciations
This section provides detailed reference information to help create pronunciation dictionaries. It is intended for people who have sufficient knowledge of the Valencian language as spoken in Spain. It provides information about transcription and pronunciation.
If you are not sure how a certain word is pronounced, you can look up its IPA transcription in a reference source and then convert it into SAMPA symbols using the table in The Valencian symbol set in alphabetical order.
The Valencian phoneme system
The Valencian phoneme system can be divided into two groups:
- Vowels
- Consonants
It is possible to distinguish seven different types of Valencian consonants:
- Plosives
- Fricatives
- Affricates
- Nasals
- Laterals
- Trills
- Xenophones
Valencian symbol set grouped by phoneme classes
The following table shows all phonemes used in Valencian transcriptions. They are listed according to their phoneme classes with their SAMPA and IPA representations.
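Phoneme class | SAMPA | IPA | Example | Transcription |
---|---|---|---|---|
Consonants: Plosives | b | b | bodega | |
Consonants: Plosives | p | p | pellícula | /p@likul@/ |
Consonants: Plosives | d | d | diploma | /diplom@/ |
Consonants: Plosives | t | t | teatre | /teatr@/ |
Consonants: Plosives | g | g | garantia | /g@r@nti@/ |
Consonants: Plosives | k | k | informàtica | /imfurrmatik@/ |
Consonants: Fricatives | f | f | micròfon | |
Consonants: Fricatives | s | s | observar | /ups@rba/ |
Consonants: Fricatives | S | ʃ | peix | /peS/ |
Consonants: Fricatives | Z | ʒ | pellroja | /peLrrOZ@/ |
Consonants: Fricatives | j | j | aires | /ajr@s/ |
Consonants: Fricatives | z | z | atletisme | /@ll@tizm@/ |
Consonants: Affricates | dZ | ʤ | garatge | /g@radZ@/ |
Consonants: Nasals | m | m | humana | |
Consonants: Nasals | N | ŋ | llengua | /LeNgw@/ |
Consonants: Nasals | J | ɲ | acompanyar | /@kump@Ja/ |
Consonants: Nasals | n | n | ajuntament | /@Zunt@men/ |
Consonants: Laterals | l | l | blanca | |
Consonants: Laterals | L | ʎ | caballero | /kabaLero/ |
Consonants: Trills | r | r | cabaret | /k@b@rEt/ |
Consonants: Trills | rr | rr | caràcter | /k@rakt@rr/ |
Vowels: Single vowels | a | a | casado | |
Vowels: Single vowels | e | e | doble | /doble/ |
Vowels: Single vowels | E | ɛ | dèbil | /dEbil/ |
Vowels: Single vowels | i | i | edicions | /@disions/ |
Vowels: Single vowels | o | o | editor | /@dito/ |
Vowels: Single vowels | O | ɔ | escola | /@skOl@/ |
Vowels: Single vowels | u | u | espero | /@speru/ |
Vowels: Semivowels | j | j | fruita | |
Vowels: Semivowels | w | w | guardar | /gw@rda/ |
Vowels: Semivowels | @ | ə | gaita | /gajt@/ |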
Valencian consonants
The standard Valencian consonant system is considered to have:
- six plosives
- six fricatives
- one affricate
- four nasals
- two laterals
- two trills
- two xenophones
The sample words given below demonstrate the different contexts in which the sounds can appear. A short explanation is also given.
Plosives
There are three voiced and three voiceless plosives in Valencian, which can be arranged in pairs as shown here:
Voiced | Examples | Transcriptions | Voiceless | Examples | Transcriptions |
---|---|---|---|---|---|
/b/ | bellesa setembre | /b@LEz@/ /s@tembr@/ | /p/ | palma aplicar verb | /palm@/ /@plika/ /bErrp/ |
/d/ | declarar setze | /d@kl@ra/ /sEdz@/ | /t/ | teatre canta cabaret | /teatr@/ /kant@/ /k@b@rEt/ |
/g/ | gaita agost | /gajt@/ /@gost/ | /k/ | kilo descansar despòtic | /kilu/ /d@sk@nsa/ /d@spOtik/ |
Fricatives
There are six fricatives in the Valencian SAMPA symbol set, three voiced and three voiceless:
Voiced | Examples | Transcriptions | Voiceless | Examples | Transcriptions |
---|---|---|---|---|---|
/z/ | zero reserva | /zEru/ /rr@zerb@/ | /f/ | fabricar cafè fotògraf | /f@brika/ /k@fE/ /futOgr@f/ |
/Z/ | genial urgent | /Z@nial/ /urrZen/ | /s/ | saber caríssima sentiments | /s@bE/ /k@risim@/ /s@ntimens/ |
/j/ | ioga paisatge bonsai | /jOg@/ /p@jzadZ@/ /bunsaj/ | /S/ | caixa cruz | /kaS@/ /kruS/ |
Affricates
In Valencian there is one affricate, /dZ/, which is trained as a combination of the phonemes /d/ and /Z/.
/dZ/ | fotomuntatge | /futumuntadZ@/ |
---|---|---|
Nasals
There are four nasals in Valencian, /m/, /n/, /N/, and /J/.
/m/ | mes habitualment quelcom | /mes/ /@bitualmen/ /k@lkOm/ |
---|---|---|
/n/ | nata organitzar pagament | /nat@/ /urg@nidza/ /p@g@men/ |
/N/ | significa cinc | /siNnifik@/ /siN/ |
/J/ | acompanyar any | /@kump@Ja/ /aJ/ |
Laterals
There are two laterals in Valencian, /l/ and /L/.
/l/ | lògica diabòlic mal | /lOZik@/ /di@bolik/ /mal/ |
---|---|---|
/L/ | lluna mallorca castell | /Lun@/ /m@LOrrk@/ /k@steL/ |
Trills
There are two trills in Valencian, /r/ and /rr/.
/r/ | compra | /kompr@/ |
---|---|---|
/rr/ | racista sorprendre valor | /rr@sist@/ /surrpEndr@/ /b@lorr/ |
Valencian vowels
Single vowels (monophthongs)
The Valencian language has seven distinguishable monophthongs:
/a/ | cada caminar | /kad@/ /k@mina/ |
---|---|---|
/e/ | ella gomera porter | /eL@/ /gumer@/ /purrte/ |
/E/ | època aquella què | /Epuk@/ /@kEL@/ /kE/ |
/i/ | hipnòtic idea imaginar introduir | /ibnOtik/ /ide@/ /im@Zina/ /intrudui/ |
/o/ | onze operadora operació | /onz@/ /up@r@dor@/ /up@r@sio/ |
/O/ | òptica amazònica | /Optik@/ /@m@zOnik@/ |
/u/ | homòfon important menú | /umOfun/ /impurrtan/ /m@nu/ |
Semi-vowels
There are three semi-vowels in Valencian:
/j/ | ioga vizcaya massai | /jOg@/ /biSkaja/ /m@saj/ |
---|---|---|
/w/ | washington ambigua europeu | /waSinton/ /@mbigw@/ /@wrupEw/ |
/@/ | evitar exagerat fantasia | /@bita/ /@gz@Z@rat/ /f@nt@zi@/ |
Specific pronunciation transcription methods
The grapheme <h>
There is no phonetic realization of the grapheme <h>. For example:
hotel | /utEl/ |
---|---|
Transcription of the fricatives /s/ and /z/
The voiceless fricative /s/ occurs before vowels, voiceless consonants and at the end of a word. For example:
sala | /sal@/ |
---|---|
absorbit | /@psurbit/ |
articles | /@rrtikl@s/ |
The voiced fricative /z/ occurs before voiced consonants or between two vowels. For example:
anglicismes | /@Nglisizm@s/ |
---|---|
bellesa | /b@LEz@/ |
Transcription of the trills /r/ and /rr/
The trill /r/ appears in the middle of a word between two vowels and between a vowel and a consonant other than <n>, <l>, or <s>. For example:
bolero | /buleru/ |
---|---|
celebrar | /s@l@bra/ |
The trill /rr/ appears at the beginning and at the end of a word. In the middle of a word, its position is between two vowels or between a vowel and a consonant other than <n>, <l>, or <s>. For example:
realitat | /rre@litat/ |
---|---|
familiar | /f@miliarr/ |
catorze | /k@torrz@/ |
Transcription of the nasals /J/ and /N/
The grapheme <ny> is always represented by the SAMPA symbol /J/.
companyia | /kump@Ji@/ |
---|---|
desengany | /d@z@NgaJ/ |
The nasal /N/ appears before the phonemes /g/ and /k/ and in word-final position. For example:
lingüista | /liNgwist@/ |
---|---|
banc | /baN/ |
Transcription of the affricate /dZ/
/dZ/ usually appears between two vowels and is represented orthographically as <tg> or <tj>.
paisatge | /p@jzadZ@/ |
---|---|
lletjos | /LedZus/ |
Pronunciation of foreign words
When you need to transcribe foreign words, the general rule is to use the same SAMPA symbol set as for the rest of the dictionary. For a Valencian dictionary, every word must be transcribed with the Valencian SAMPA symbols.
If you use a different symbol set, your system will be unable to interpret the input.
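One way to catch symbols from another symbol set before they reach a dictionary is to tokenize each transcription against the Valencian symbol set. The Python sketch below is an illustration under stated assumptions: the symbol set is copied from the tables in this document, two-character symbols (dZ, rr) are matched greedily, and the function names are hypothetical rather than part of any Nuance tool.

```python
# Valencian SAMPA symbols as listed in this document's tables.
SAMPA_SYMBOLS = {
    "@", "a", "b", "d", "dZ", "e", "E", "f", "g", "i", "j", "J", "k", "l", "L",
    "m", "n", "N", "o", "O", "p", "r", "rr", "s", "S", "t", "u", "w", "z", "Z",
}

def tokenize_sampa(transcription):
    """Split a transcription such as '/p@jzadZ@/' into Valencian SAMPA symbols.

    Raises ValueError on the first character that is not in the symbol set.
    """
    text = transcription.strip("/")
    tokens = []
    i = 0
    while i < len(text):
        for size in (2, 1):  # try two-character symbols (dZ, rr) first
            chunk = text[i:i + size]
            if chunk in SAMPA_SYMBOLS:
                tokens.append(chunk)
                i += len(chunk)
                break
        else:
            raise ValueError(f"unknown symbol {text[i]!r} at position {i}")
    return tokens

def is_valid_sampa(transcription):
    """Return True if every symbol belongs to the Valencian symbol set."""
    try:
        tokenize_sampa(transcription)
    except ValueError:
        return False
    return True

print(tokenize_sampa("/p@jzadZ@/"))
print(is_valid_sampa("/Tis/"))  # 'T' is not in the Valencian set
```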
Every language has a different phoneme inventory, so you may have difficulty covering every sound. For the most common cases, we offer transcription examples.
The French nasals, for example, are adapted to the Valencian vowel system:
blanc | /blaN/ |
---|---|
centre | /sentr@/ |
In general, foreign words are integrated into Valencian phonetics; sometimes the orthography is adapted as well.
garatge | /g@radZ@/ |
---|---|
communications | /komunikEjSons/ |
hannover | /@nOb@rr/ |
lletjos | /LedZus/ |
Multiple pronunciations (variants)
The type of pronunciation used in SAMPA and in the Valencian Background dictionary conforms to the standard non-regional Valencian pronunciation. Since it is possible to have more than one pronunciation for a word by using pronunciation variants, it may be difficult to determine how many pronunciation variants should be created.
The general rule is: create variants only if the pronunciation differs in more than one phoneme.
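The rule above can be approximated by counting how many phonemes separate two candidate transcriptions. The Python sketch below assumes that "differs in more than one phoneme" can be measured as an edit distance over SAMPA symbols (substitutions, insertions, and deletions each count as one); that interpretation, and the function names, are assumptions rather than a documented Nuance procedure.

```python
# Valencian SAMPA symbols as listed in this document's tables.
SAMPA_SYMBOLS = {
    "@", "a", "b", "d", "dZ", "e", "E", "f", "g", "i", "j", "J", "k", "l", "L",
    "m", "n", "N", "o", "O", "p", "r", "rr", "s", "S", "t", "u", "w", "z", "Z",
}

def tokenize(transcription):
    """Split a transcription into symbols, matching dZ and rr before single characters."""
    text = transcription.strip("/")
    tokens, i = [], 0
    while i < len(text):
        for size in (2, 1):
            chunk = text[i:i + size]
            if chunk in SAMPA_SYMBOLS:
                tokens.append(chunk)
                i += len(chunk)
                break
        else:
            raise ValueError(f"unknown symbol {text[i]!r} at position {i}")
    return tokens

def phoneme_distance(a, b):
    """Levenshtein distance between two phoneme lists."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (pa != pb)))   # substitution or match
        prev = cur
    return prev[-1]

def needs_variant(first, second):
    """Apply the rule: a variant is warranted only if more than one phoneme differs."""
    return phoneme_distance(tokenize(first), tokenize(second)) > 1

# The second transcription in each pair is a made-up candidate for illustration.
print(needs_variant("/k@b@rEt/", "/k@b@rE/"))
print(needs_variant("/k@b@rEt/", "/kabaret/"))
```

Counting in symbols rather than characters keeps /dZ/ and /rr/ from being counted as two differences each.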
The Valencian symbol set in alphabetical order
The following table shows the Valencian symbol set in alphabetical order:
SAMPA | IPA | Examples of use |
---|---|---|
@ | ə | gaita |
a | a | casado |
b | b | bodega |
d | d | diploma |
e | e | doble |
E | ɛ | dèbil |
f | f | micròfon |
g | g | garantia |
i | i | edicions |
j | j | aires |
J | ɲ | acompanyar |
k | k | informàtica |
l | l | blanca |
L | ʎ | caballero |
m | m | humana |
n | n | ajuntament |
N | ŋ | llengua |
o | o | editor |
O | ɔ | escola |
p | p | pellícula |
r | r | cabaret |
rr | rr | caràcter |
s | s | observar |
S | ʃ | peix |
t | t | teatre |
u | u | espero |
w | w | guardar |
z | z | atletisme |
Z | ʒ | pellroja |
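The table above can also serve as a lookup when converting an IPA transcription into SAMPA. The following Python sketch copies the IPA and SAMPA columns into a dictionary and converts one character at a time; the entry for ʤ is taken from the table grouped by phoneme classes, and `ipa_to_sampa` is a hypothetical helper name. IPA symbols that are not in the table are rejected rather than guessed.

```python
# IPA -> SAMPA pairs copied from the symbol tables in this document.
IPA_TO_SAMPA = {
    "ə": "@", "a": "a", "b": "b", "d": "d", "e": "e", "ɛ": "E", "f": "f",
    "g": "g", "i": "i", "j": "j", "ɲ": "J", "k": "k", "l": "l", "ʎ": "L",
    "m": "m", "n": "n", "ŋ": "N", "o": "o", "ɔ": "O", "p": "p", "r": "r",
    "s": "s", "ʃ": "S", "t": "t", "u": "u", "w": "w", "z": "z", "ʒ": "Z",
    "ʤ": "dZ",  # affricate, from the table grouped by phoneme classes
}

def ipa_to_sampa(ipa):
    """Convert an IPA string to Valencian SAMPA, one character at a time.

    The trill written 'rr' needs no special case: it maps to 'r' + 'r'.
    """
    result = []
    for ch in ipa.strip("/[]"):
        if ch not in IPA_TO_SAMPA:
            raise ValueError(f"no Valencian SAMPA symbol for IPA {ch!r}")
        result.append(IPA_TO_SAMPA[ch])
    return "".join(result)

print(ipa_to_sampa("peʃ"))      # peix
print(ipa_to_sampa("ʎeŋgwə"))   # llengua
```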