Spanish Valencian (va-ES)

This documentation was updated on November 9, 2023.

Creating grammars

The following subsections describe key issues for working with grammar documents in the Valencian language.

Character encoding

Nuance Recognizer has full internal Unicode support, so you can create your grammars using UTF-8 or Latin-1 (also known as ISO-8859-1) character encoding. For example, your grammar header might be:

<?xml version='1.0' encoding='UTF-8'?>
<grammar xml:lang="va-ES" version="1.0" root="test">
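The following is a minimal sketch in Python (which is not part of the Recognizer tooling) showing that the Valencian accented characters used in this document can be represented in both encodings. The names VALENCIAN_CHARS and encode_sample are hypothetical. Whichever encoding you choose, the encoding declared in the XML header should match the bytes actually written to the grammar file.

```python
# Valencian accented characters named in this document.
VALENCIAN_CHARS = "àáçèéíïòóùú"


def encode_sample(text: str) -> dict:
    """Encode text in both encodings mentioned above and return the bytes."""
    return {
        "UTF-8": text.encode("utf-8"),
        "ISO-8859-1": text.encode("latin-1"),
    }


encoded = encode_sample(VALENCIAN_CHARS)
# Each of these characters takes two bytes in UTF-8 and one byte in Latin-1.
for name, data in encoded.items():
    print(name, len(data))
```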

Below are codes for writing some common Valencian characters. These are useful if you do not have access to a Valencian keyboard, and are typed by pressing the Alt key while entering digits on your keyboard (after typing the last digit, the desired character appears on your screen when you release the Alt key):

Alt/0224 = à    Alt/0237 = í
Alt/0225 = á    Alt/0239 = ï
Alt/0231 = ç    Alt/0242 = ò
Alt/0232 = è    Alt/0243 = ó
Alt/0233 = é    Alt/0249 = ù
                Alt/0250 = ú
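The four-digit Alt codes above are the decimal byte values of the characters in the Latin-1 range. As an illustrative sketch only (alt_code_table is a hypothetical helper), the table can be regenerated in Python like this:

```python
# Decimal codes from the table above.
ALT_CODES = [224, 225, 231, 232, 233, 237, 239, 242, 243, 249, 250]


def alt_code_table(codes):
    """Return (label, character) pairs such as ('Alt/0224', 'à')."""
    return [
        (f"Alt/{code:04d}", bytes([code]).decode("latin-1"))
        for code in codes
    ]


for label, char in alt_code_table(ALT_CODES):
    print(label, "=", char)
```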

If you do not have access to a keyboard for your target language, you can use the Windows character map. (Choose the “System” font and the “Latin-1” subset.)

Start→Programs→Accessories→System Tools→Character Map

alphanum_lc built-in grammar

The alphanum_lc grammar recognizes a connected string of up to 20 digits and lower case alphabetic characters.

For example, this grammar could be used to recognize a product code or order number.

Characters are the letters a-z, and à, á, ç, è, é, í, ï, ò, ó, ù, and ú.

Digits are 0-9.

Non-alphanumeric characters such as hyphens (-), dots (.), and underscores (_) are not recognized; if spoken they reduce recognition accuracy.
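If your application needs to check a candidate value (for example, a product code from a database) against these constraints before using it with the grammar, a sketch such as the following may help. It assumes a minimum length of one character, which the description above does not state, and is_valid_alphanum_lc is a hypothetical helper rather than part of the Recognizer API.

```python
import re

# Letters a-z, the accented characters listed above, and digits 0-9.
# The upper bound of 20 comes from the grammar description; the lower
# bound of 1 is an assumption.
ALPHANUM_LC = re.compile(r"[a-z0-9àáçèéíïòóùú]{1,20}")


def is_valid_alphanum_lc(value: str) -> bool:
    """Return True if value fits the character set and length described above."""
    return ALPHANUM_LC.fullmatch(value) is not None


print(is_valid_alphanum_lc("ab12ç9"))
print(is_valid_alphanum_lc("ab-12"))
```

Values containing hyphens, dots, underscores, uppercase letters, or more than 20 characters are rejected by this check.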

Returned keys/values

MEANING Contains a string of ISO-8859-1 digits and lowercase letters, with no embedded spaces.
SWI_literal Contains the exact text that was recognized.

Note: the alphanum_lc built-in grammar replaces the alphanum built-in grammar.

alphanum built-in grammar

The alphanum grammar recognizes a connected string of up to 20 digits and alphabetic characters. For example, this grammar could be used to recognize a product code or order number.

Characters are the letters a-z, and à, á, ç, è, é, í, ï, ò, ó, ù, and ú.

Digits are 0-9.

Non-alphanumeric characters such as hyphens (-), dots (.), and underscores (_) are not recognized; if spoken they reduce recognition accuracy.

Returned keys/values

MEANING Contains a string of ISO-8859-1 digits and lowercase letters, with no embedded spaces.
SWI_literal Contains the exact text that was recognized.

boolean built-in grammar

The boolean grammar collects an affirmative or negative response.

Properties

The y and n parameters let you associate any two touchtone buttons as synonyms for yes and no.

Parameter Description
y Desired DTMF digit to be equivalent to “sí” (default = 1)
n Desired DTMF digit to be equivalent to “no” (default = 2)

Examples

Caller says    MEANING key
sí             true
no             false
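The effect of the y and n parameters can be pictured with a small Python sketch. It is only a model of the mapping described above, not Recognizer code: interpret_boolean_dtmf is a hypothetical function, and returning None for any other key is an assumption.

```python
def interpret_boolean_dtmf(digit: str, y: str = "1", n: str = "2"):
    """Map a DTMF digit to the MEANING value, or None if it matches neither key.

    The defaults mirror the parameter table above (y = 1, n = 2).
    """
    if digit == y:
        return "true"
    if digit == n:
        return "false"
    return None


print(interpret_boolean_dtmf("1"))
print(interpret_boolean_dtmf("9", y="7", n="9"))
```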

digits built-in grammar

Valid characters are the digits 0-9.

Vocabulary items and pronunciations

This chapter describes considerations for vocabularies and their pronunciations in Valencian (va-ES). Your product documentation covers details about how to work with pronunciations and dictionaries.

Specially tuned pronunciations

The following table shows common words that are fine-tuned by Nuance.

Many of these words contain so-called “word-specific phonemes” in their pronunciation transcription.

These are phonemes that have been trained exclusively on data containing the respective words.

Words with tuned pronunciations (do not modify):

All letters of the alphabet, a-z, à, á, ç, è, é, í, ï, ò, ó, ù, and ú.
Boolean: sí and no
Digits: 0-9

Valencian pronunciations

This section provides detailed reference information to help create pronunciation dictionaries. It is intended for people who have sufficient knowledge of the Valencian language as spoken in Spain. It provides information about transcription and pronunciation.

If you are not sure how a certain word is pronounced, you can look up its IPA transcription and then convert it into SAMPA symbols using the table in The Valencian symbol set in alphabetical order.

The Valencian phoneme system

The Valencian phoneme system can be divided into two groups:

  • Vowels
  • Consonants

It is possible to distinguish seven different types of Valencian consonants:

  • Plosives
  • Fricatives
  • Affricates
  • Nasals
  • Laterals
  • Trills
  • Xenophones

Valencian symbol set grouped by phoneme classes

The following table shows all phonemes used in Valencian transcriptions. They are listed according to their phoneme classes with their SAMPA and IPA representations.

Phoneme class        SAMPA  IPA  Examples of use

Consonants
  Plosives           b      b
                     p      p    pellícula /p@likul@/
                     d      d    diploma /diplom@/
                     t      t    teatre /teatr@/
                     g      g    garantia /g@r@nti@/
                     k      k    informàtica /imfurrmatik@/
  Fricatives         f      f    micròfon
                     s      s    observar /ups@rba/
                     S      ʃ    peix /peS/
                     Z      ʒ    pellroja /peLrrOZ@/
                     j      j    aires /ajr@s/
                     z      z    atletisme /@ll@tizm@/
  Affricates         dZ     ʤ    garatge
  Nasals             m      m    humana
                     N      ŋ    llengua /LeNgw@/
                     J      ɲ    acompanyar /@kump@Ja/
                     n      n    ajuntament /@Zunt@men/
  Laterals           l      l    blanca
                     L      ʎ    caballero /kabaLero/
  Trills             r      r    cabaret
                     rr     rr   caràcter /k@rakt@rr/

Vowels
  Single vowels      a      a
                     e      e    doble /doble/
                     E      ɛ    dèbil /dEbil/
                     i      i    edicions /@disions/
                     o      o    editor /@dito/
                     O      ɔ    escola /@skOl@/
                     u      u    espero /@speru/
  Semivowels         j      j    fruita
                     w      w    guardar /gw@rda/
                     @      ə    gaita /gajt@/

Valencian consonants

The standard Valencian consonant system is considered to have:

  • six plosives
  • six fricatives
  • one affricate
  • four nasals
  • two laterals
  • two trills
  • two xenophones

The sample words given below demonstrate the different contexts in which the sounds can appear. A short explanation is also given.

Plosives

There are three voiced and three voiceless plosives in Valencian, which can be arranged in pairs as shown here:

Voiced  Examples                                Voiceless  Examples
/b/     bellesa setembre /b@LEz@/ /s@tembr@/    /p/        palma aplicar verb /palm@/ /@plika/ /bErrp/
/d/     declarar setze /d@kl@ra/ /sEdz@/        /t/        teatre canta cabaret /teatr@/ /kant@/ /k@b@rEt/
/g/     gaita agost /gajt@/ /@gost/             /k/        kilo descansar despòtic /kilu/ /d@sk@nsa/ /d@spOtik/

Fricatives

There are six fricatives in the Valencian SAMPA symbol set, three voiced and three voiceless:

Voiced  Examples                                            Voiceless  Examples
                                                            /S/        caixa cruz /kaS@/ /kruS/
/z/     zero reserva /zEru/ /rr@zerb@/                      /f/        fabricar cafè fotògraf /f@brika/ /k@fE/ /futOgr@f/
/Z/     genial urgent /Z@nial/ /urrZen/                     /s/        saber caríssima sentiments /s@bE/ /k@risim@/ /s@ntimens/
/j/     ioga paisatge bonsai /jOg@/ /p@jzadZ@/ /bunsaj/

Affricates

In Valencian there is one affricate, /dZ/, which is trained as a combination of the phonemes /d/ and /Z/:

/dZ/ fotomuntatge /futumuntadZ@/

Nasals

There are four nasals in Valencian, /m/, /n/, /N/, and /J/.

/m/ mes habitualment quelcom /mes/ /@bitualmen/ /k@lkOm/
/n/ nata organitzar pagament /nat@/ /urg@nidza/ /p@g@men/
/N/ significa cinc /siNnifik@/ /siN/
/J/ acompanyar any /@kump@Ja/ /aJ/

Laterals

There are two laterals in Valencian, /l/ and /L/.

/l/ lògica diabòlic mal /lOZik@/ /di@bolik/ /mal/
/L/ lluna mallorca castell /Lun@/ /m@LOrrk@/ /k@steL/

Trills

There are two trills in Valencian, /r/ and /rr/.

/r/ compra /kompr@/
/rr/ racista sorprendre valor /rr@sist@/ /surrpEndr@/ /b@lorr/

Valencian vowels

Single vowels (monophthongs)

The Valencian language has seven distinguishable monophthongs:

/a/ cada caminar /kad@/ /k@mina/
/e/ ella gomera porter /eL@/ /gumer@/ /purrte/
/E/ època aquella què /Epuk@/ /@kEL@/ /kE/
/i/ hipnòtic idea imaginar introduir /ibnOtik/ /ide@/ /im@Zina/ /intrudui/
/o/ onze operadora operació /onz@/ /up@r@dor@/ /up@r@sio/
/O/ òptica amazònica /Optik@/ /@m@zOnik@/
/u/ homòfon important menú /umOfun/ /impurrtan/ /m@nu/

Semi-vowels

There are three semi-vowels in Valencian:

/j/ ioga vizcaya massai /jOg@/ /biSkaja/ /m@saj/
/w/ washington ambigua europeu /waSinton/ /@mbigw@/ /@wrupEw/
/@/ evitar exagerat fantasia /@bita/ /@gz@Z@rat/ /f@nt@zi@/

Specific pronunciation transcription methods

The grapheme <h>

There is no phonetic realization of the grapheme <h>. For example:

hotel /utEl/

Transcription of the fricatives /s/ and /z/

The voiceless fricative /s/ occurs before vowels, voiceless consonants and at the end of a word. For example:

sala /sal@/
absorbit /@psurbit/
articles /@rrtikl@s/

The voiced fricative /z/ occurs before voiced consonants or between two vowels. For example:

anglicismes /@Nglisizm@s/
bellesa /b@LEz@/
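The two rules above can be sketched as a small decision function over the neighbouring SAMPA symbols. This is an illustration under stated assumptions, not the method used by the Recognizer: s_or_z is a hypothetical helper, /@/ is treated as a vowel, and nasals, laterals, and trills are treated as voiced consonants. The sketch applies to the single grapheme <s>; in the examples in this document, spellings such as <ss> and <c> keep /s/ between vowels (caríssima /k@risim@/, edicions /@disions/).

```python
# Vowel symbols; counting /@/ as a vowel here is an assumption.
VOWELS = {"a", "e", "E", "i", "o", "O", "u", "@"}

# Voiced consonant symbols; including nasals, laterals and trills
# is an assumption based on general phonetics.
VOICED_CONSONANTS = {
    "b", "d", "g", "z", "Z", "dZ",
    "m", "n", "N", "J", "l", "L", "r", "rr",
}


def s_or_z(prev, nxt):
    """Choose "s" or "z" for the grapheme <s>.

    prev and nxt are the SAMPA symbols before and after the sibilant,
    or None at a word boundary.
    """
    if nxt in VOICED_CONSONANTS:
        return "z"  # before a voiced consonant
    if prev in VOWELS and nxt in VOWELS:
        return "z"  # between two vowels
    return "s"      # before a vowel, a voiceless consonant, or word end


print(s_or_z(None, "a"))  # sala
print(s_or_z("E", "@"))   # bellesa
```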

Transcription of the trills /r/ and /rr/

The trill /r/ appears in the middle of a word between two vowels and between a vowel and a consonant other than <n>, <l>, or <s>. For example:

bolero /buleru/
celebrar /s@l@bra/

The trill /rr/ appears at the beginning and at the end of a word. In the middle of a word, its position is between two vowels or between a vowel and a consonant other than <n>, <l>, or <s>. For example:

realitat /rre@litat/
familiar /f@miliarr/
catorze /k@torrz@/

Transcription of the nasals /J/ and /N/

The grapheme <ny> is always represented by the SAMPA symbol /J/.

companyia /kump@Ji@/
desengany /d@z@NgaJ/

The nasal /N/ appears before the phonemes /g/ and /k/ and in word-final position. For example:

lingüista /liNgwist@/
banc /baN/

Transcription of the affricate /dZ/

/dZ/ usually appears between two vowels and is represented orthographically as <tg> or <tj>.

paisatge /p@jzadZ@/
lletjos /LedZus/

Pronunciation of foreign words

When you need to transcribe foreign words, the general rule is to use the same SAMPA symbol set as for the rest of the dictionary. For a Valencian dictionary, this means every word must be transcribed with the Valencian SAMPA symbols.

If you use a different symbol set, your system will not be able to interpret the input.

Every language has a different phoneme inventory, so you may have problems covering each and every sound. For the most common cases, we offer transcription examples.

For example, the French nasals are adapted to the Valencian vowel system:

blanc /blaN/
centre /sentr@/

In general, foreign words are integrated into Valencian phonetics; sometimes the orthography is changed as well.

garatge /g@radZ@/
communications /komunikEjSons/
hannover /@nOb@rr/
lletjos /LedZus/

Multiple pronunciations (variants)

The type of pronunciation used in SAMPA and in the Valencian Background dictionary conforms to the standard non-regional Valencian pronunciation. Since it is possible to have more than one pronunciation for a word by using pronunciation variants, it may be difficult to determine how many pronunciation variants should be created.

The general rule is: create variants only if the pronunciation differs in more than one phoneme.
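One way to apply this rule mechanically is to count the phoneme differences between two candidate transcriptions. The sketch below interprets "differs in more than one phoneme" as a Levenshtein (edit) distance greater than one, computed over SAMPA symbols rather than characters; that interpretation, and the helper names tokenize, phoneme_distance, and needs_variant, are assumptions made for illustration. The alternative transcriptions in the usage lines are hypothetical.

```python
# Multi-character SAMPA symbols from the Valencian symbol set; matched first.
MULTI_CHAR_SYMBOLS = ("rr", "dZ")


def tokenize(sampa: str) -> list:
    """Split a SAMPA transcription (without slashes) into phoneme symbols."""
    tokens, i = [], 0
    while i < len(sampa):
        for symbol in MULTI_CHAR_SYMBOLS:
            if sampa.startswith(symbol, i):
                tokens.append(symbol)
                i += len(symbol)
                break
        else:
            tokens.append(sampa[i])
            i += 1
    return tokens


def phoneme_distance(a: str, b: str) -> int:
    """Levenshtein distance between two transcriptions, counted in phonemes."""
    x, y = tokenize(a), tokenize(b)
    previous = list(range(len(y) + 1))
    for i, px in enumerate(x, start=1):
        current = [i]
        for j, py in enumerate(y, start=1):
            cost = 0 if px == py else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def needs_variant(a: str, b: str) -> bool:
    """Return True only if more than one phoneme differs."""
    return phoneme_distance(a, b) > 1


print(needs_variant("f@miliarr", "f@milia"))  # one phoneme differs
print(needs_variant("utEl", "otel"))          # two phonemes differ
```

Because "rr" and "dZ" are matched greedily, this tokenizer cannot tell a /d/ followed by /Z/ apart from the affricate /dZ/.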

The Valencian symbol set in alphabetical order

The following table shows the Valencian symbol set in alphabetical order:

SAMPA IPA Examples of use
@ ə gaita
a a casado
b b bodega
d d diploma
e e doble
E ɛ dèbil
f f micròfon
g g garantia
i i edicions
j j aires
J ɲ acompanyar
k k informàtica
l l blanca
L ʎ caballero
m m humana
n n ajuntament
N ŋ llengua
o o editor
O ɔ escola
p p pellícula
r r cabaret
rr rr caràcter
s s observar
S ʃ peix
t t teatre
u u espero
w w guardar
z z atletisme
Z ʒ pellroja
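If you start from an IPA transcription, the table above can be applied symbol by symbol. The following Python sketch is one possible way to do this, not a Nuance tool; ipa_to_sampa is a hypothetical helper. It assumes the IPA input uses only the symbols in this document (no stress or length marks) and takes the affricate ligature ʤ from the grouped symbol table, since the alphabetical table has no entry for /dZ/.

```python
# (SAMPA, IPA) pairs copied from the table above. The trill "rr" needs no
# entry of its own because each "r" maps to "r".
SYMBOLS = [
    ("@", "ə"), ("a", "a"), ("b", "b"), ("d", "d"), ("e", "e"), ("E", "ɛ"),
    ("f", "f"), ("g", "g"), ("i", "i"), ("j", "j"), ("J", "ɲ"), ("k", "k"),
    ("l", "l"), ("L", "ʎ"), ("m", "m"), ("n", "n"), ("N", "ŋ"), ("o", "o"),
    ("O", "ɔ"), ("p", "p"), ("r", "r"), ("s", "s"), ("S", "ʃ"), ("t", "t"),
    ("u", "u"), ("w", "w"), ("z", "z"), ("Z", "ʒ"),
]
IPA_TO_SAMPA = {ipa: sampa for sampa, ipa in SYMBOLS}
# Affricate ligature, taken from the grouped symbol table.
IPA_TO_SAMPA["ʤ"] = "dZ"


def ipa_to_sampa(ipa: str) -> str:
    """Convert an IPA string to Valencian SAMPA, one symbol at a time."""
    result = []
    for position, char in enumerate(ipa):
        if char not in IPA_TO_SAMPA:
            raise ValueError(f"no SAMPA symbol for {char!r} at position {position}")
        result.append(IPA_TO_SAMPA[char])
    return "".join(result)


print(ipa_to_sampa("ʎeŋgwə"))
```

Raising an error for unknown symbols, rather than passing them through, makes it harder for a symbol from another language's set to slip into the dictionary unnoticed.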