Dutch Netherlands (nl-NL)
This documentation was updated on November 22, 2023.
Creating grammars
The following subsections describe key issues for working with grammar documents in the Dutch language. For detailed information on creating grammars, see your product documentation.
Character encoding
Nuance Recognizer has full internal Unicode support. You can create your grammars using UTF-8 or Latin-1 (also known as ISO-8859-1) character encoding. For example, your grammar header might be:
<?xml version='1.0' encoding='UTF-8'?>
<grammar xml:lang="nl-NL" version="1.0" root="test">
All grammars (and any embedded ECMAScript code) must respect characters reserved by the XML standard. For example, the ampersand “&” functions as an escape character: “&gt;” represents the “greater than” symbol (>), “&lt;” represents “less than” (<), and “&amp;” represents the ampersand (&).
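As a minimal sketch of the escaping rule above, the following Python helper escapes reserved characters before literal text is placed in a grammar. The helper name `grammar_item` is hypothetical and not part of any Nuance tool; `escape` is the standard-library function from `xml.sax.saxutils`.

```python
from xml.sax.saxutils import escape


def grammar_item(text: str) -> str:
    """Wrap literal text in an <item> element (hypothetical helper).

    escape() replaces the XML-reserved characters &, < and >
    with &amp;, &lt; and &gt; respectively.
    """
    return f"<item>{escape(text)}</item>"


print(grammar_item("Jansen & Zonen"))
```

Generating grammar text through a helper like this avoids hand-escaping each entry.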
If your keyboard does not match your target language, you can add the appropriate keyboard on Windows: open the “Control Panel”, click “Regional and Language”, and select “Keyboards and languages”.
Below are codes for writing some common Dutch characters. These are useful if you do not have access to a Dutch keyboard. Type them by holding down the Alt key while entering the digits on your keyboard; the desired character appears on your screen when you release the Alt key after the last digit:
Key code | Character |
---|---|
Alt/0203 | Ë |
Alt/0207 | Ï |
Alt/0233 | é |
Alt/0235 | ë |
Alt/0239 | ï |
Valid characters in grammars
To determine which characters can be used with this language pack, read the sections “Valid characters in grammars” and “Checking pronunciations with dicttest” in the Grammar Developer guide (accessible through the Product Documentation program shortcut).
alphanum_lc built-in grammar
The alphanum_lc built-in grammar recognizes a connected string of up to 20 digits and lowercase alphabetic characters, such as “a8f9h23”. For example, this grammar could be used to recognize a product code or user id. The “lc” in the name of this built-in means lowercase. The possible characters are the lowercase letters a-z and the digits 0-9. The letter “y” can be pronounced either as “Griekse IJ”, “IJ” or “i-grec”. The digit “7” can be pronounced as either /ze:v@/ or /z2:v@/. The application layer can adjust the case of the returned letters as needed for further processing.
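The following Python sketch shows how an application layer might check a returned string against the constraints described above (1 to 20 characters, lowercase a-z and digits 0-9) and adjust its case. The function name `normalize_code` is hypothetical, and the way the result reaches the application is outside the scope of this sketch.

```python
import re

# Lowercase letters a-z and digits 0-9, at most 20 characters.
ALPHANUM_LC = re.compile(r"[a-z0-9]{1,20}")


def normalize_code(result: str, uppercase: bool = False) -> str:
    """Validate an alphanum_lc-style result and optionally uppercase it."""
    if not ALPHANUM_LC.fullmatch(result):
        raise ValueError(f"not a valid alphanum_lc string: {result!r}")
    return result.upper() if uppercase else result


print(normalize_code("a8f9h23", uppercase=True))
```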
Note: This grammar replaces the alphanum built-in grammar.
alphanum built-in grammar
(NOTE: for backward-compatibility only. Otherwise, use alphanum_lc builtin)
This grammar has been replaced by the alphanum_lc grammar but is retained for backward compatibility. For new implementations, use the alphanum_lc built-in grammar.
The alphanum built-in grammar recognizes a connected string of up to 20 digits and uppercase or lowercase alphabetic characters, such as “A8f9h23”. For example, this grammar could be used to recognize a product code or order number. The possible characters are the uppercase letters A-Z, lowercase letters a-z, and digits 0-9. Uppercase and lowercase letters are homonyms (e.g., “B” and “b”), so the inclusion of both is redundant for the purposes of speech recognition of case insensitive items such as product codes. Thus, the alphanum built-in grammar has been replaced by the alphanum_lc grammar.
boolean built-in grammar
The boolean grammar collects an affirmative or negative response.
Properties
The y and n parameters let you associate any two touchtone buttons as synonyms for yes and no.
Parameter | Description |
---|---|
y | Desired DTMF digit to be equivalent to “ja” (default = 1) |
n | Desired DTMF digit to be equivalent to “nee” (default = 2) |
Examples
Caller says… | MEANING key |
---|---|
ja | true |
nee | false |
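The meaning of the y and n parameters can be illustrated with a small Python sketch. It only models the mapping described above (defaults 1 and 2); the function `dtmf_to_boolean` is hypothetical and does not show how the parameters are passed to the recognizer.

```python
from typing import Optional


def dtmf_to_boolean(key: str, y: str = "1", n: str = "2") -> Optional[bool]:
    """Map a pressed DTMF key to true/false given the y and n parameters.

    Returns None when the key is neither the "ja" nor the "nee" synonym.
    """
    if y == n:
        raise ValueError("y and n must be different keys")
    if key == y:
        return True
    if key == n:
        return False
    return None


print(dtmf_to_boolean("1"))                # default mapping
print(dtmf_to_boolean("9", y="7", n="9"))  # custom mapping
```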
ccexpdate built-in grammar
The ccexpdate grammar understands the expiration date on a credit card. Expiration dates are usually a month and a year, and are often embossed on a credit card in the form “mm/yy.” The grammar recognizes variations on the date, for example, “December 2007,” “twaalf nul zeven,” “twaalf tweeduizend zeven,” “twaalf schuine streep nul zeven,” etc.
currency built-in grammar
The currency built-in grammar collects currency amounts in euros and cents, such as “tien euros,” “tien euros en vijftien cent,” and “tien vijftien.”
Return keys/values
Key | Description |
---|---|
MEANING | Contains a string in the following form: EUR main_unit_amount.subunit_amount. If the caller omits the main unit or subunit amount, then that field is zero. The string contains a leading zero if the subunit amount is collected without the main unit. |
SWI_literal | Contains the exact text that was recognized. |
Examples
Caller says | MEANING |
---|---|
vijf euro | EUR5.00 |
vijf cent | EUR0.05 |
vijf euro en vijf cent | EUR5.05 |
vijf euro vijfentwintig | EUR5.25 |
vijf euro vijfentwintig cent | EUR5.25 |
twaalf honderd euro | EUR1200.00 |
duizend twee honderd euro | EUR1200.00 |
éénenvijftig duizend euro | EUR51000.00 |
zes honderd vijfentwintig duizend vier honderd tweeënnegentig euro | EUR625492.00 |
één euro geen centen | EUR1.00 |
één drieënveertig | EUR1.43 |
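A MEANING value in this form can be converted to an exact decimal amount. The sketch below is one possible approach in Python. It assumes the subunit amount always has two digits, as in every example above, and the function name `parse_currency` is hypothetical.

```python
import re
from decimal import Decimal

# "EUR" followed by the main unit amount, a dot, and a two-digit subunit amount.
_CURRENCY = re.compile(r"EUR(\d+)\.(\d{2})")


def parse_currency(meaning: str) -> Decimal:
    """Convert a MEANING string such as 'EUR5.05' to a Decimal amount in euros."""
    match = _CURRENCY.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected currency MEANING: {meaning!r}")
    return Decimal(f"{match.group(1)}.{match.group(2)}")


print(parse_currency("EUR625492.00"))
```

Using `Decimal` rather than `float` keeps cent amounts exact.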
date built-in grammar
The date built-in grammar accepts spoken date utterances from the caller.
Recognized phrases include “4 juni,” “4 juni 2007,” “4, 6, 2006,” “de vierde,” and “maandag, de vierde juni.”
The grammar also accepts “eergisteren,” “gisteren,” “vandaag,” “morgen,” and “overmorgen,” which return values of -2, -1, 0, +1, and +2 respectively in the MEANING key.
Examples
Caller says | MEANING key |
---|---|
vijf januari twee duizend zes | 20060105 |
de vijfde januari twee duizend zes | 20060105 |
op vijf januari twee duizend zes | 20060105 |
eergisteren | -2 |
gisteren | -1 |
vandaag | 0 |
morgen | +1 |
overmorgen | +2 |
op de vierde | ??????04 |
de vierde | ??????04 |
woensdag | (Phrase not recognized) |
woensdag de twaalfde | ??????12 |
de vierde juni | ????0604 |
vier juni | ????0604 |
de vierde juni negentien honderd elf | 19110604 |
de vierde juni negentien elf | 19110604 |
vier juni negentien honderd elf | 19110604 |
vier juni negentien elf | 19110604 |
de zesde | ??????06 |
de zesde van de vierde | ????0406 |
tien twaalf | ????1210 |
zevenentwintigste eerste zevenennegentig | ??970127 |
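The examples above suggest two shapes for the MEANING value: a signed day offset for relative dates, and an eight-character year, month, day string in which unknown digits are shown as “?”. The Python sketch below parses both. The function name `parse_date_meaning` and the returned dictionary layout are hypothetical, and the format is inferred from the examples only.

```python
import re
from typing import Dict, Optional, Union

Field = Union[int, str, None]

_ABSOLUTE = re.compile(r"([\d?]{4})([\d?]{2})([\d?]{2})")
_RELATIVE = re.compile(r"[+-]?\d")


def _field(text: str) -> Field:
    """Return an int if fully known, None if fully unknown, else the raw text."""
    if text.isdigit():
        return int(text)
    if set(text) == {"?"}:
        return None
    return text  # partially known, e.g. "??97"


def parse_date_meaning(meaning: str) -> Dict[str, Field]:
    """Parse a date MEANING value into its known components."""
    absolute = _ABSOLUTE.fullmatch(meaning)
    if absolute:
        year, month, day = (_field(part) for part in absolute.groups())
        return {"year": year, "month": month, "day": day}
    if _RELATIVE.fullmatch(meaning):
        return {"offset_days": int(meaning)}
    raise ValueError(f"unexpected date MEANING: {meaning!r}")


print(parse_date_meaning("????0604"))
print(parse_date_meaning("+2"))
```

The application can then fill unknown fields from context, for example by assuming the current year.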
digits built-in grammar
The digits grammar recognizes a continuously spoken string of up to 20 digits (that is, the caller is not required to pause after each digit). The digits must be written in orthographic form (“één, twee, drie, …”).
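As an illustration of orthographic form, the following Python sketch spells out a digit string in Dutch words. Only “één, twee, drie” come from this page; the remaining spellings are standard Dutch and are an assumption about what the grammar expects. The function name `spell_digits` is hypothetical.

```python
# Dutch spellings for the digits 0-9 (assumption beyond "één, twee, drie").
DUTCH_DIGITS = {
    "0": "nul", "1": "één", "2": "twee", "3": "drie", "4": "vier",
    "5": "vijf", "6": "zes", "7": "zeven", "8": "acht", "9": "negen",
}


def spell_digits(digits: str) -> str:
    """Return the orthographic form of a string of 1 to 20 ASCII digits."""
    if not 1 <= len(digits) <= 20:
        raise ValueError("expected between 1 and 20 digits")
    if any(ch not in DUTCH_DIGITS for ch in digits):
        raise ValueError(f"not a digit string: {digits!r}")
    return " ".join(DUTCH_DIGITS[ch] for ch in digits)


print(spell_digits("123"))
```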
number built-in grammar
The number built-in grammar accepts quantities such as “ten,” “one hundred and forty,” “five hundred sixty one point five,” “negative five,” and “minus four point three.”
Examples
Numbers from -999,999,999.99 to 999,999,999.99 are recognized, but by default the minallowed parameter is set to zero, which limits recognition to non-negative values.
Caller says | MEANING key |
---|---|
vijfentwintig | 25 |
twaalf duizend drie honderd drieënveertig | 12343 |
min vier | -4 |
veertien komma tweeënzeventig | 14.72 |
éénendertig punt dertien | 31.13 |
time built-in grammar
Recognized phrases include times given in 12-hour format (for example, “vijf uur”) and 24-hour format (“drieëntwintig vijftien”). In addition, it recognizes “qualified” times such as “voor vijf uur” and “om ongeveer drie uur.”
Examples
For each entry, the values returned in the MEANING and QUALIFIER keys are shown. (Not shown are the values of the HOUR, MINUTE, and AMPM keys.)
Caller says | MEANING | QUALIFIER |
---|---|---|
middag | 1200p | exact |
om middernacht | 0000? | exact |
voor middernacht | 1200a | before |
na dertien uur dertig | 1330h | after |
twintig twintig | 2020h | exact |
acht uur twintig ’s_morgens | 0820a | exact |
tien voor half negen ’s_morgens | 0820a | exact |
half acht | 0730? | exact |
kwart over negen in de voormiddag | 0915a | exact |
kwart na negen | 0915? | exact |
kwart voor tien | 0945? | exact |
vijf over tien | 1005? | exact |
vijf na tien | 1005? | exact |
zes uur en tien minuten | 0610? | exact |
twee uur en één minuut ’s_middags | 0201p | exact |
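The MEANING values above appear to consist of four digits (hhmm) and a one-character suffix. The Python sketch below expands such a value into candidate 24-hour times. The suffix interpretation is an assumption inferred from the examples, not a documented definition: “a” is read as a.m., “p” as p.m., “h” as 24-hour time, and “?” as ambiguous. The function name `candidate_times` is hypothetical.

```python
import re
from typing import List, Tuple

_TIME = re.compile(r"(\d{2})(\d{2})([aph?])")


def candidate_times(meaning: str) -> List[Tuple[int, int]]:
    """Return the possible (hour, minute) pairs on a 24-hour clock."""
    match = _TIME.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected time MEANING: {meaning!r}")
    hour, minute, flag = int(match.group(1)), int(match.group(2)), match.group(3)
    if hour > 23 or minute > 59:
        raise ValueError(f"time out of range: {meaning!r}")
    if flag == "h":        # assumed: already a 24-hour time
        hours = [hour]
    elif flag == "a":      # assumed: a.m., so 12 maps to 0
        hours = [hour % 12]
    elif flag == "p":      # assumed: p.m., so 12 stays 12
        hours = [hour % 12 + 12]
    else:                  # assumed: "?" means a.m. or p.m. is unknown
        hours = [hour % 12, hour % 12 + 12]
    return [(h, minute) for h in hours]


print(candidate_times("0730?"))
```

For ambiguous values the application would typically choose between the candidates by context or by asking the caller.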
Vocabulary items and pronunciations
This chapter describes considerations for vocabularies and their pronunciations in Dutch (nl-NL).
Specially tuned pronunciations
The following table shows common words that are fine-tuned by Nuance.
Many of these words contain so-called “word-specific phonemes” in their pronunciation transcription.
These are phonemes that have been trained exclusively on data containing the respective words.
Words with tuned pronunciations (do not modify):
- All letters of the alphabet, a-z.
- ja, nee
- Cardinal numbers: 0-99, 100, and 1000
Dutch pronunciations
This section provides detailed reference information to help create pronunciation dictionaries. It is intended for people who have sufficient knowledge of the Dutch language. It provides information about transcription and pronunciation.
The reference pronunciation dictionary is:
Paardekooper, P.C.: ABN-uitspraakgids. Den Haag: Sdu Uitgevers, Antwerpen: Standaard Uitgeverij 1998 (ISBN 90 75 56678 6)
If you are not sure how a certain word is pronounced, you can refer to the IPA transcriptions given there and then convert them into the SAMPA symbols given in The Dutch symbol set in alphabetical order.
The Dutch phoneme system
The Dutch phoneme system can be divided into two groups:
- Consonants
- Vowels
Furthermore, it is possible to define seven different types of consonants:
- Plosives
- Fricatives
- Affricates
- Glides
- Nasals
- Laterals
- Trills
Within the vowel group, a distinction can be made between checked and free. Furthermore, diphthongs represent an additional characteristic among the group of vowels.
The Netherlands Dutch symbol set grouped by phoneme classes
The following table shows all phonemes used in Dutch transcriptions, grouped by phoneme class (that is, by manner of articulation), with their SAMPA and IPA representations.
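Phoneme class | Subclass | SAMPA | IPA | Example | Transcription |
---|---|---|---|---|---|
Consonants | Plosives | b | b | bak | /bAk/ |
Consonants | Plosives | g | g | goal | /go:l/ |
Consonants | Plosives | p | p | pak | /pAk/ |
Consonants | Plosives | k | k | kap | /kAp/ |
Consonants | Plosives | d | d | dak | /dAk/ |
Consonants | Plosives | t | t | tak | /tAk/ |
Consonants | Fricatives | f | f | fel | /fEl/ |
Consonants | Fricatives | v | v | vel | /vEl/ |
Consonants | Fricatives | s | s | soep | /sup/ |
Consonants | Fricatives | z | z | zijn | /zEJn/ |
Consonants | Fricatives | x | x | toch | /tOx/ |
Consonants | Fricatives | G | ɣ | regen | /re:G@/ |
Consonants | Fricatives | S | ʃ | show | /So:w/ |
Consonants | Fricatives | Z | ʒ | bagage | /bAGa:Z@/ |
Consonants | Fricatives | h | h | hand | /hAnt/ |
Consonants | Glides | w | w | wit | /wIt/ |
Consonants | Glides | j | j | ja | /ja:/ |
Consonants | Nasals | m | m | met | /mEt/ |
Consonants | Nasals | n | n | net | /nEt/ |
Consonants | Nasals | N | ŋ | bang | /bAN/ |
Consonants | Lateral | l | l | land | /lAnt/ |
Consonants | Trill | r | r | rand | /rAnt/ |
Vowels | Checked vowels | I | ɪ | pit | /pIt/ |
Vowels | Checked vowels | E | ɛ | pet | /pEt/ |
Vowels | Checked vowels | E: | ɛ: | elitair | /e:litE:r/ |
Vowels | Checked vowels | A | ɑ | pan | /pAn/ |
Vowels | Checked vowels | O | ɔ | pot | /pOt/ |
Vowels | Checked vowels | Q: | ɔ: | caisson | /kEsQ:/ |
Vowels | Checked vowels | Y | ʏ | put | /pYt/ |
Vowels | Free vowels | i | i | vier | /vir/ |
Vowels | Free vowels | y | y | vuur | /vyr/ |
Vowels | Free vowels | u | u | voer | /vur/ |
Vowels | Free vowels | a: | a: | naam | /na:m/ |
Vowels | Free vowels | e: | e: | veer | /ve:r/ |
Vowels | Free vowels | 2: | ø: | deur | /d2:r/ |
Vowels | Free vowels | o: | o: | voor | /vo:r/ |
Vowels | Diphthongs | EJ | ɛi | fijn | /fEJn/ |
Vowels | Diphthongs | 9J | œy | huis | /h9Js/ |
Vowels | Diphthongs | AW | ɑu | goud | /GAWt/ |
Vowels | Schwa | @ | ə | geleid | /G@lEJt/ |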
The Netherlands Dutch consonants
The standard Dutch consonant system is generally considered to have:
- Five plosives
- Eight fricatives
- Two affricates
- Three nasals
- Two glides
- One lateral
- One trill
Additionally this language pack contains the following xenophones (phonemes from other languages that are commonly used in the Dutch language):
- One plosive /g/
- One fricative /Z/
The sample words given below show the different contexts in which the sounds can appear. A short explanation is also given.
Plosives
There are two voiced and three voiceless plosives in Dutch as well as the voiced xenophone /g/, which can be arranged in pairs as shown here:
Voiced | Examples | Voiceless counterpart |
---|---|---|
/b/ | berg kabel | /p/ |
/g/ | goal | /k/ |
/d/ | dag kader | /t/ |
Note that /b/ and /d/ will not appear at the end of any word in your transcription. Dutch (as well as German) has “devoicing”, which means that voiced consonants become voiceless at the end of a word: tijd /tEJt/, Bob /bOp/.
Fricatives
There are eight fricatives in Dutch, five voiceless and three voiced. The voiced xenophone /Z/ can be added to the voiced fricatives. The four voiced fricatives, including /Z/, can each be paired with a voiceless counterpart:
Voiced | Examples | Voiceless counterpart |
---|---|---|
/v/ | vijf leven | /f/ |
/z/ | zee lezen | /s/ |
/G/ | goud geluid | /x/ |
/Z/ | journaal jukebox | /S/ |
The phonetic characters /v/ and /z/ do not appear at the end of a word in the transcription because of devoicing. (See Devoicing.)
Affricates
In Dutch there are two affricates, /ts/ and /tS/. The xenophone affricate /dZ/ is also used.
Affricates are always represented by two single phonemes.
Phoneme | Examples | Transcriptions |
---|---|---|
/dZ/ | manager budget | /mEn@dZ@r/ /bYdZEt/ |
/ts/ | plantsoen plaats | /plAntsun/ /pla:ts/ |
/tS/ | Tsjech ketchup bondscoach | /tSEx/ /kEtSYp/ /bOntsko:tS/ |
Glides
There are two glides in Dutch: /w/ and /j/.
Phoneme | Examples | Transcriptions |
---|---|---|
/w/ | weer trouwe schuw | /we:r/ /trAWw@/ /sxyw/ |
/j/ | jaar gooien saai | /ja:r/ /Go:j@/ /sa:j/ |
Nasals
There are three nasals in Dutch, /m/, /n/, and /N/. The phoneme /N/ is normally represented by the combination <ng> or <nk> in Dutch orthography and cannot appear at the beginning of a word.
Phoneme | Examples | Transcriptions |
---|---|---|
/m/ | meer zwemmen tam | /me:r/ /zwEm@/ /tAm/ |
/n/ | noot binnen laan | /no:t/ /bIn@/ /la:n/ |
/N/ | zingen zinken zong zonk | /zIN@/ /zINk@/ /zON/ /zONk/ |
Laterals
In the Dutch SAMPA symbol set there is one lateral, the phoneme /l/.
Phoneme | Example | Transcription |
---|---|---|
/l/ | lang | /lAN/ |
Trills
All pronunciation variants of the letter <r> must be transcribed as /r/.
Phoneme | Example | Transcription |
---|---|---|
/r/ | ruit | /r9Jt/ |
Dutch vowels
Dutch single vowels fall into two classes: checked and free vowels.
Checked vowels
Checked vowels mostly occur in stressed syllables with a following consonant. In instances where there is no final consonant, the checked vowels become free vowels. For example: dak /dAk/, daken /da:k@/
Note that two of the checked vowels are xenophones: /E:/, /Q:/
Phoneme | Examples | Transcriptions |
---|---|---|
/I/ | inkt pit | /INkt/ /pIt/ |
/E/ | erwt berk | /Ert/ /bErk/ |
/E:/ | elitair airport | /e:litE:r/ /E:rpQ:rt/ |
/A/ | achter land | /Axt@r/ /lAnt/ |
/O/ | ochtend morgen | /Oxt@nt/ /mOrG@/ |
/Q:/ | caisson hawkinslaan | /kEsQ:/ /hQ:kInsla:n/ |
/Y/ | ultimatum cultuur | /Yltima:tYm/ /kYltyr/ |
Schwa
The schwa-sound, as in maken, gooien, ziekte, etc., is transcribed /@/: /ma:k@/, /Go:j@/, /zikt@/, etc.
Note that the <i> in -ig and the <ij> in -lijk are transcribed as /@/, too: handig /hAnd@x/, makkelijk /mAk@l@k/.
Additionally, <en> is transcribed as /@/ when it appears at the end of a word: lopen /lo:p@/.
Free vowels
There are seven free vowels in Dutch:
Phoneme | Examples | Transcriptions |
---|---|---|
/i/ | iets fiets ruzie | /its/ /fits/ /ryzi/ |
/e:/ | ezel kwezel zee | /e:z@l/ /kwe:z@l/ /ze:/ |
/a:/ | aan baan zodra | /a:n/ /ba:n/ /zo:dra:/ |
/o:/ | oor koor zo | /o:r/ /ko:r/ /zo:/ |
/y/ | uur cultuur nu | /yr/ /kYltyr/ /ny/ |
/u/ | boer koe | /bur/ /ku/ |
/2:/ | eucalyptus kleur beu | /2:ka:lIptYs/ /kl2:r/ /b2:/ |
There are three diphthongs in Dutch:
Phoneme | Examples | Transcriptions |
---|---|---|
/EJ/ | eigen mijn zij | /EJG@/ /mEJn/ /zEJ/ |
/9J/ | uit ruig bui | /9Jt/ /r9Jx/ /b9J/ |
/AW/ | oud blauw | /AWt/ /blAWw/ |
There is no clear distinction between monophthongs (free vowels) and diphthongs. In some cases, it depends on the speaker if a phoneme (for example /e:/) is realized as a monophthong or as a diphthong.
Specific pronunciation transcription methods
Double consonants
There are no double consonants (geminates) in spoken Dutch. However, in written Dutch, these seem to abound where a double consonant signals the shortness of the preceding vowel. Compare the following examples:
Word | Transcription |
---|---|
poten | /po:t@/ |
potten | /pOt@/ |
Devoicing
There are no voiced consonants in the final position of the spoken word in Dutch, although they are written in the orthography:
Consonant | Example (final position) | Transcription (voiceless) |
---|---|---|
d | bad | /bAt/ |
b | rib | /rIp/ |
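The devoicing rule can be sketched in Python as a replacement of a word-final voiced consonant by its voiceless counterpart. The pairs /b/ to /p/ and /d/ to /t/ come from the table above, and /v/ and /z/ are named in the Fricatives section. The pair /G/ to /x/ is an assumption inferred from examples such as ruig /r9Jx/. The function name `devoice_final` is hypothetical, and transcriptions are passed without the surrounding slashes.

```python
# Voiced consonant -> voiceless counterpart in word-final position.
# The G -> x pair is inferred from examples, not stated as a rule.
FINAL_DEVOICING = {"b": "p", "d": "t", "v": "f", "z": "s", "G": "x"}


def devoice_final(transcription: str) -> str:
    """Devoice the last symbol of a SAMPA transcription if it is voiced."""
    if transcription and transcription[-1] in FINAL_DEVOICING:
        return transcription[:-1] + FINAL_DEVOICING[transcription[-1]]
    return transcription


print(devoice_final("tEJd"))  # tijd
print(devoice_final("bOb"))   # Bob
```

Looking only at the last character is safe here because no two-character symbol in the Dutch set ends in one of these consonants.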
Assimilation
In some consonant sequences the first consonant is assimilated to the one that follows. For example, a voiceless /f/ becomes a voiced /v/ when a voiced /b/ follows.
Assimilation occurs in the following cases:
Spelling | Pronunciation | Example | Transcription |
---|---|---|---|
fb | /vb/ | leefbaar | /le:vba:r/ |
fd | /vd/ | liefde | /livd@/ |
nb | /mb/ | tuinbank | /t9JmbANk/ |
sb | /zb/ | uitwisbaar | /9JtwIzba:r/ |
sd | /zd/ | huisdier | /h9Jzdir/ |
When the <n> is not pronounced, assimilation does not take place (see Multiple pronunciations–variants for more information). For example:
kippenboer /kIp@bur/
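The five assimilation cases listed above can be applied mechanically to a draft transcription. The Python sketch below does this with plain string replacement, which works because none of the two-character SAMPA symbols contain f, n, s, b, or d. The function name `assimilate` is hypothetical, and transcriptions are passed without slashes.

```python
# First consonant of each pair changes before the second one.
ASSIMILATION = {
    "fb": "vb",
    "fd": "vd",
    "nb": "mb",
    "sb": "zb",
    "sd": "zd",
}


def assimilate(transcription: str) -> str:
    """Apply the listed consonant assimilations to a SAMPA transcription."""
    for source, target in ASSIMILATION.items():
        transcription = transcription.replace(source, target)
    return transcription


print(assimilate("le:fba:r"))  # leefbaar
print(assimilate("kIp@bur"))   # kippenboer: no /n/, so unchanged
```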
Vowel sequences with a diphthong character
There are five vowel sequences that are sometimes described as diphthongs.
Spelling | Pronunciation | Example | Transcription |
---|---|---|---|
-aai- | /a:j/ | aaien | /a:j@/ |
-ooi- | /o:j/ | mooi | /mo:j/ |
-oei- | /uj/ | roeiboot | /rujbo:t/ |
-ieu- | /iw/ | nieuw | /niw/ |
-eeu- | /e:w/ | eeuw | /e:w/ |
Insertion of additional consonants
To obtain a fluent pronunciation, an additional consonant must be inserted to avoid a hiatus.
Between /u/, /y/, /o:/, or /AW/ and another vowel, a /w/ is inserted. In all other cases (/i/, /e:/, and /EJ/) the additional consonant is a /j/.
Examples:
Word | Transcription |
---|---|
houwer | /hAWw@r/ |
Europeaan | /2:ro:pe:ja:n/ |
Even if the sound /AW/ is at the end of the word, the /w/ is added in the transcription. For example:
Word | Transcription |
---|---|
jouw | /jAWw/ |
blauw | /blAWw/ |
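The hiatus rule can be sketched in Python as a function that joins two transcription fragments and inserts /w/ or /j/ when the first ends in one of the listed vowels and the second begins with a vowel. The function name `join_with_glide` is hypothetical, the set of vowel-initial characters is derived from the Dutch symbol set, and the sketch does not cover the separate rule for word-final /AW/.

```python
W_TRIGGERS = ("u", "y", "o:", "AW")   # followed by an inserted /w/
J_TRIGGERS = ("i", "e:", "EJ")        # followed by an inserted /j/

# First characters of all vowel symbols in the Dutch SAMPA set.
VOWEL_STARTS = set("@29AaEeiIOoQuyY")


def join_with_glide(left: str, right: str) -> str:
    """Join two SAMPA fragments, inserting a glide to avoid a hiatus."""
    if right and right[0] in VOWEL_STARTS:
        if left.endswith(W_TRIGGERS):
            return left + "w" + right
        if left.endswith(J_TRIGGERS):
            return left + "j" + right
    return left + right


print(join_with_glide("hAW", "@r"))       # houwer
print(join_with_glide("2:ro:pe:", "a:n")) # Europeaan
```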
Pronunciation of foreign words
There are phonemes that only occur in foreign words. These are internally “mapped” or linked to the closest Dutch phonemes in the speech recognizer. Therefore, it is not necessary to use the foreign phonemes in custom transcriptions.
To transcribe foreign words or names, you must use the Dutch SAMPA symbols. If you use a different symbol set, the system cannot interpret the input.
Foreign word | Sample transcriptions |
---|---|
Bergström Hölsgens | /bErkstr2:m/ /hYlsg@ns/ |
Hülsenbeck Hürriyet überhaupt | /hylz@nbEk/ /hyrijEt/ /yb@rhAWpt/ |
Jungschläger aufklärung Schäublin | /juNSle:g@r/ /AWfkle:rYN/ /SOjplin/ |
The following xenophones have been added to the Dutch phoneme set as they occur in frequently used loan words:
Xenophone | Example | Transcriptions |
---|---|---|
/g/ | gadget | /gEdZ@t/ |
/Z/ | manager college | /mEn@dZ@r/ /kOle:Z@/ |
/E:/ | milliampère advanced | /miliAmpE:r/ /E:tfa:nst/ |
/Q:/ | all-riskverzekering chanson | /Q:lrIskf@rze:k@rIN/ /SQ:sQ:/ |
Multiple pronunciations–variants
In Dutch there are two ways to pronounce words with the suffix -en. For example:
Word | Transcriptions |
---|---|
komen | /ko:m@/ /ko:m@n/ |
poten | /po:t@/ /po:t@n/ |
We decided to transcribe consistently without /n/ since there is no difference in meaning, and one consistent transcription is easier to handle than two. This means that words with the suffix -en always have to be transcribed as in these examples:
Word | Transcription |
---|---|
komen | /ko:m@/ |
poten | /po:t@/ |
This handling is also applied if “-en” occurs at the end of a word within a compound. For example, when “schroeven” (screws) combines with “draaier” (driver), the resulting compound is:
schroevendraaier /sxruv@dra:j@r/
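A dictionary-building script could enforce this convention for word-final -en with a check like the following Python sketch. The function name `normalize_en_suffix` is hypothetical. It only handles a final /@n/ on words spelled with -en and does not detect “-en” inside compounds such as schroevendraaier.

```python
def normalize_en_suffix(word: str, transcription: str) -> str:
    """Drop the final /n/ when a word in -en is transcribed with /@n/."""
    if word.endswith("en") and transcription.endswith("@n"):
        return transcription[:-1]
    return transcription


print(normalize_en_suffix("komen", "ko:m@n"))
```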
The Dutch symbol set in alphabetical order
The following table shows the Dutch symbol set in alphabetical order:
SAMPA | IPA | Examples of usage |
---|---|---|
@ | ə | geleid |
2: | ø: | deur |
9J | œy | huis |
A | ɑ | pan |
a: | a: | naam |
AW | ɑu | goud |
b | b | bak |
d | d | dak |
E | ɛ | pet |
e: | e: | veer |
E: | ɛ: | elitair |
EJ | ɛi | fijn |
f | f | fel |
G | ɣ | regen |
g | g | goal |
h | h | hand |
i | i | vier |
I | ɪ | pit |
j | j | ja |
k | k | kap |
l | l | land |
m | m | met |
n | n | net |
N | ŋ | bang |
O | ɔ | pot |
o: | o: | voor |
p | p | pak |
Q: | ɔ: | caisson |
r | r | rand |
s | s | soep |
S | ʃ | show |
t | t | tak |
u | u | voer |
v | v | vel |
w | w | wit |
x | x | toch |
y | y | vuur |
Y | ʏ | put |
Z | ʒ | bagage |
z | z | zijn |
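Because some symbols in this set are two characters long (for example 9J, AW, and e:), a custom transcription is easiest to check by splitting it into symbols. The Python sketch below tokenizes a transcription by trying a two-character symbol first and rejects anything outside the table above. The function name `tokenize` is hypothetical and is not part of any Nuance tool.

```python
from typing import List

# All SAMPA symbols from the alphabetical table.
SYMBOLS = {
    "@", "2:", "9J", "A", "a:", "AW", "b", "d", "E", "e:", "E:", "EJ",
    "f", "G", "g", "h", "i", "I", "j", "k", "l", "m", "n", "N", "O",
    "o:", "p", "Q:", "r", "s", "S", "t", "u", "v", "w", "x", "y", "Y",
    "Z", "z",
}


def tokenize(transcription: str) -> List[str]:
    """Split a SAMPA transcription into symbols, longest match first."""
    text = transcription.strip("/")
    symbols = []
    position = 0
    while position < len(text):
        pair = text[position:position + 2]
        if len(pair) == 2 and pair in SYMBOLS:
            symbols.append(pair)
            position += 2
        elif text[position] in SYMBOLS:
            symbols.append(text[position])
            position += 1
        else:
            raise ValueError(
                f"unknown symbol at position {position} in {transcription!r}"
            )
    return symbols


print(tokenize("/h9Js/"))
```

A failed tokenization points to a symbol from another symbol set, a missing length mark, or a typo.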