Dutch Netherlands (nl-NL)

This documentation was updated on November 22, 2023.

Creating grammars

The following subsections describe key issues for working with grammar documents in the Dutch language. For detailed information on creating grammars, see your product documentation.

Character encoding

Nuance Recognizer has full internal Unicode support, so you can create your grammars using UTF-8 or Latin-1 (also known as ISO-8859-1) character encoding. For example, your grammar header might be:

<?xml version='1.0' encoding='UTF-8'?>

<grammar xml:lang="nl-NL" version="1.0" root="test">

All grammars (and any embedded ECMAScript code) must respect characters reserved by the XML standard. The ampersand “&” functions as an escape character: for example, “&gt;” represents the “greater than” symbol (>), “&lt;” represents “less than” (<), and “&amp;” represents the ampersand (&).
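As a minimal sketch of the escaping rule above, the following Python snippet uses the standard library function xml.sax.saxutils.escape to replace the reserved characters before text is placed in a grammar. The helper name escape_for_grammar and the sample strings are illustrative assumptions and are not part of the product.

```python
from xml.sax.saxutils import escape


def escape_for_grammar(text: str) -> str:
    """Replace &, < and > with their XML entity references."""
    # escape() converts "&" before "<" and ">", so the entities it
    # produces are not escaped a second time.
    return escape(text)


if __name__ == "__main__":
    print(escape_for_grammar("Jansen & Zonen"))
    print(escape_for_grammar("a < b > c"))
```

Text that already contains entity references should not be passed through this helper again, because its ampersands would be escaped a second time.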

If your keyboard does not match your target language on Windows, add the appropriate keyboard: open the “Control Panel”, click “Regional and Language”, and select “Keyboards and languages”.

Below are codes for typing some common Dutch characters. These are useful if you do not have access to a Dutch keyboard. Hold down the Alt key while entering the digits on your keyboard; the desired character appears on your screen when you release the Alt key after typing the last digit:

Alt/0203 = Ë
Alt/0207 = Ï
Alt/0233 = é
Alt/0235 = ë
Alt/0239 = ï
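
The numbers in the Alt codes above are the byte values of these characters in the Windows-1252 code page, which for these characters are also their Unicode code points. The Python sketch below checks that correspondence, assuming a Western European Windows configuration. The names ALT_CODES and char_for_alt_code are illustrative.

```python
# Alt codes from the list above, keyed by the number typed after the leading zero.
ALT_CODES = {203: "Ë", 207: "Ï", 233: "é", 235: "ë", 239: "ï"}


def char_for_alt_code(code: int) -> str:
    """Decode a single Windows-1252 byte value to its character."""
    return bytes([code]).decode("cp1252")


if __name__ == "__main__":
    for code in sorted(ALT_CODES):
        print(f"Alt/0{code} = {char_for_alt_code(code)}")
```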

Valid characters in grammars

To determine which characters can be used with this language pack, see the sections “Valid characters in grammars” and “Checking pronunciations with dicttest” in the Grammar Developer guide (accessible through the Product Documentation program shortcut).

alphanum_lc built-in grammar

The alphanum_lc built-in grammar recognizes a connected string of up to 20 digits and lowercase alphabetic characters, such as “a8f9h23”. For example, this grammar could be used to recognize a product code or user id. The “lc” in the name of this built-in means lowercase. The possible characters are the lowercase letters a-z and the digits 0-9. The letter “y” can be pronounced either as “Griekse IJ”, “IJ” or “i-grec”. The digit “7” can be pronounced as either /ze:v@/ or /z2:v@/. The application layer can adjust the case of the returned letters as needed for further processing.
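
The following Python sketch shows how an application layer might check a returned alphanum_lc string and adjust its case. It assumes the result arrives as a plain string of at least one character. The function names are hypothetical and do not belong to any Nuance API.

```python
import re

# Up to 20 characters drawn from a-z and 0-9, as described for alphanum_lc.
# The minimum length of 1 is an assumption made for this sketch.
ALPHANUM_LC = re.compile(r"[a-z0-9]{1,20}")


def is_alphanum_lc(value: str) -> bool:
    """Return True if value has the shape of an alphanum_lc result."""
    return ALPHANUM_LC.fullmatch(value) is not None


def to_upper_code(value: str) -> str:
    """Example of adjusting the case in the application layer."""
    if not is_alphanum_lc(value):
        raise ValueError(f"not an alphanum_lc result: {value!r}")
    return value.upper()
```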

Note: This grammar replaces the alphanum built-in grammar.

alphanum built-in grammar

Note: This grammar is retained for backward compatibility only. It has been replaced by the alphanum_lc built-in grammar, which should be used for all new implementations.

The alphanum built-in grammar recognizes a connected string of up to 20 digits and uppercase or lowercase alphabetic characters, such as “A8f9h23”. For example, this grammar could be used to recognize a product code or order number. The possible characters are the uppercase letters A-Z, lowercase letters a-z, and digits 0-9. Uppercase and lowercase letters are homonyms (e.g., “B” and “b”), so the inclusion of both is redundant for the purposes of speech recognition of case insensitive items such as product codes. Thus, the alphanum built-in grammar has been replaced by the alphanum_lc grammar.

boolean built-in grammar

The boolean grammar collects an affirmative or negative response.

Properties

The y and n parameters let you associate any two touchtone buttons as synonyms for yes and no.

Parameter Description
y Desired DTMF digit to be equivalent to “ja” (default = 1)
n Desired DTMF digit to be equivalent to “nee” (default = 2)

Examples

Caller says… MEANING key
ja true
nee false
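
The effect of the y and n parameters can be illustrated with a short Python sketch. It only models the mapping described above (the recognizer performs this mapping itself), and the function name interpret_boolean is hypothetical.

```python
def interpret_boolean(user_input: str, y: str = "1", n: str = "2") -> str:
    """Return the MEANING value for a spoken or touchtone answer.

    y and n are the DTMF digits treated as synonyms for "ja" and "nee".
    """
    answer = user_input.strip().lower()
    if answer in ("ja", y):
        return "true"
    if answer in ("nee", n):
        return "false"
    raise ValueError(f"unrecognized answer: {user_input!r}")
```

For example, calling interpret_boolean("9", y="9") treats the 9 key as “ja”, while the 1 key is then no longer accepted.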

ccexpdate built-in grammar

The ccexpdate grammar understands the expiration date on a credit card. Expiration dates are usually a month and a year, and are often embossed on a credit card in the form “mm/yy.” The grammar recognizes variations on the date, for example, “December 2007,” “twaalf nul zeven,” “twaalf tweeduizend zeven,” “twaalf schuine streep nul zeven,” etc.

currency built-in grammar

The currency built-in grammar collects amounts in euros and cents, such as “tien euros,” “tien euros en vijftien cent,” and “tien vijftien.”

Return keys/values

MEANING contains a string in the following form: EUR main_unit_amount.subunit_amount. If the caller omits the main unit or subunit amount, then that field is zero. The string contains a leading zero if the subunit amount is collected without the main unit.
SWI_literal contains the exact text that was recognized.

Examples

Caller says MEANING
vijf euro EUR5.00
vijf cent EUR0.05
vijf euro en vijf cent EUR5.05
vijf euro vijfentwintig / vijf euro vijfentwintig cent EUR5.25
twaalf honderd euro / duizend twee honderd euro EUR1200.00
éénenvijftig duizend euro EUR51000.00
zes honderd vijfentwintig duizend vier honderd tweeënnegentig euro EUR625492.00
één euro geen centen EUR1.00
één drieënveertig EUR1.43
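
An application can turn the MEANING string into a numeric amount. The Python sketch below assumes the form shown in the examples (“EUR”, a main unit amount, a dot, and a two-digit subunit amount) and tolerates optional whitespace after “EUR”. The function name is hypothetical.

```python
import re
from decimal import Decimal

# "EUR", optional whitespace, main unit digits, a dot, two subunit digits.
CURRENCY_MEANING = re.compile(r"EUR\s*(\d+)\.(\d{2})")


def parse_currency_meaning(meaning: str) -> Decimal:
    """Convert a currency MEANING string such as "EUR5.05" to a Decimal."""
    match = CURRENCY_MEANING.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected currency MEANING: {meaning!r}")
    euros, cents = match.groups()
    return Decimal(f"{euros}.{cents}")
```

Decimal is used instead of float so that amounts such as 0.05 are represented exactly.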

date built-in grammar

The date built-in grammar accepts spoken date utterances from the caller.

Recognized phrases include “4 juni,” “4 juni 2007,” “4, 6, 2006,” “de vierde,” and “maandag, de vierde juni.”

The grammar also accepts “eergisteren,” “gisteren,” “vandaag,” “morgen,” and “overmorgen,” which return values of -2, -1, 0, +1, and +2 respectively into the MEANING key.

Examples

Caller says MEANING key
vijf januari twee duizend zes / de vijfde januari twee duizend zes / op vijf januari twee duizend zes 20060105
eergisteren -2
gisteren -1
vandaag 0
morgen +1
overmorgen +2
op de vierde / de vierde ??????04
woensdag (Phrase not recognized)
woensdag de twaalfde ??????12
de vierde juni / vier juni ????0604
de vierde juni negentien honderd elf / de vierde juni negentien elf / vier juni negentien honderd elf / vier juni negentien elf 19110604
de zesde ??????06
de zesde van de vierde ????0406
tien twaalf ????1210
zevenentwintigste eerste zevenennegentig ??970127
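
The MEANING values in the table take two shapes: a signed day offset for relative dates, and an eight-character YYYYMMDD string in which “?” marks digits the caller did not supply. The Python sketch below separates the two, under the assumption that these are the only shapes returned. The function names and the dictionary keys are hypothetical.

```python
import re

RELATIVE = re.compile(r"[+-]?\d")
ABSOLUTE = re.compile(r"([0-9?]{4})([0-9?]{2})([0-9?]{2})")


def _field(text: str):
    """Return an int if all digits are known, None if none are, else the raw text."""
    if text.isdigit():
        return int(text)
    if set(text) == {"?"}:
        return None
    return text  # partially known, for example "??97"


def parse_date_meaning(meaning: str) -> dict:
    """Split a date MEANING value into an offset or year/month/day fields."""
    if RELATIVE.fullmatch(meaning):
        return {"offset_days": int(meaning)}
    match = ABSOLUTE.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected date MEANING: {meaning!r}")
    year, month, day = (_field(part) for part in match.groups())
    return {"year": year, "month": month, "day": day}
```

A partially known year such as “??97” is passed through unchanged, so the application can decide which century to assume.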

digits built-in grammar

The digits grammar recognizes a continuously spoken string of up to 20 digits (that is, the caller is not required to pause after each digit). The digits must be written in orthographic form (“één, twee, drie, …”).

number built-in grammar

The number built-in grammar accepts quantities such as “ten,” “one hundred and forty,” “five hundred sixty one point five,” “negative five,” and “minus four point three.”

Examples

Numbers from -999,999,999.99 to 999,999,999.99 are recognized, but by default the minallowed parameter is set to zero, which limits recognition to positive values.

Caller says MEANING key
vijfentwintig 25
twaalf duizend drie honderd drieënveertig 12343
min vier -4
veertien komma tweeënzeventig 14.72
éénendertig punt dertien 31.13
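
The range limits described above can be mirrored in application code. The Python sketch below assumes that minallowed acts as an inclusive lower bound and that MEANING arrives as a decimal string. The function name accept_number is hypothetical.

```python
from decimal import Decimal

# Largest magnitude the grammar recognizes, per the range stated above.
MAX_MAGNITUDE = Decimal("999999999.99")


def accept_number(meaning: str, minallowed: Decimal = Decimal(0)) -> Decimal:
    """Convert a number MEANING string and apply the range limits."""
    value = Decimal(meaning)
    if abs(value) > MAX_MAGNITUDE:
        raise ValueError(f"outside the recognized range: {meaning}")
    if value < minallowed:
        raise ValueError(f"below minallowed ({minallowed}): {meaning}")
    return value
```

With the default lower bound of zero, a value such as -4 is rejected unless a lower minallowed is passed.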

time built-in grammar

Recognized phrases include times given in 12-hour format (for example, “vijf uur”) and 24-hour format (“drieëntwintig vijftien”). In addition, it recognizes “qualified” times such as “voor vijf uur” and “om ongeveer drie uur.”

Examples

For each entry, the values returned in the MEANING and QUALIFIER keys are shown. (Not shown are the values of the HOUR, MINUTE, and AMPM keys.)

Caller says MEANING QUALIFIER
middag 1200p exact
om middernacht 0000? exact
voor middernacht 1200a before
na dertien uur dertig 1330h after
twintig twintig 2020h exact
acht uur twintig ’s_morgens 0820a exact
tien voor half negen ’s_morgens 0820a exact
half acht 0730? exact
kwart over negen in de voormiddag 0915a exact
kwart na negen 0915? exact
kwart voor tien 0945? exact
vijf over tien 1005? exact
vijf na tien 1005? exact
zes uur en tien minuten 0610? exact
twee uur en één minuut ’s_middags 0201p exact
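
The MEANING values in the table consist of four digits (hhmm) followed by a one-letter suffix. The Python sketch below converts such a value to a 24-hour clock time. The reading of the suffixes (a = before noon, p = after noon, h = 24-hour time, ? = not specified) is inferred from the examples and should be treated as an assumption, as should the function name.

```python
import re

TIME_MEANING = re.compile(r"(\d{2})(\d{2})([aph?])")


def parse_time_meaning(meaning: str):
    """Return (hour, minute) on a 24-hour clock, or None if a.m./p.m. is unknown."""
    match = TIME_MEANING.fullmatch(meaning)
    if match is None:
        raise ValueError(f"unexpected time MEANING: {meaning!r}")
    hour, minute, suffix = int(match[1]), int(match[2]), match[3]
    if suffix == "?":
        return None  # the caller did not say which half of the day
    if suffix == "a":
        hour %= 12  # 12 a.m. becomes hour 0
    elif suffix == "p":
        hour = hour % 12 + 12  # 12 p.m. stays hour 12
    return hour, minute
```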

Vocabulary items and pronunciations

This chapter describes considerations for vocabularies and their pronunciations in Dutch (nl-NL).

Specially tuned pronunciations

The following table shows common words that are fine-tuned by Nuance.

Many of these words contain so called “word-specific phonemes” in their pronunciation transcription.

These are phonemes that have been trained exclusively on data containing the respective words.

Words with tuned pronunciations (do not modify):

  • All letters of the alphabet, a-z.
  • ja, nee
  • Cardinal numbers: 0-99, 100, and 1000

Dutch pronunciations

This section provides detailed reference information to help create pronunciation dictionaries. It is intended for people who have sufficient knowledge of the Dutch language. It provides information about transcription and pronunciation.

As the reference pronunciation dictionary, we use:

Paardekooper, P.C.: ABN-uitspraakgids. Den Haag: Sdu Uitgevers, Antwerpen: Standaard Uitgeverij 1998 (ISBN 90 75 56678 6)

If you are not sure how a certain word is pronounced, you can refer to the IPA transcriptions given there and then convert them into the SAMPA symbols given in The Dutch symbol set in alphabetical order.

The Dutch phoneme system

The Dutch phoneme system can be divided into two groups:

  • Consonants
  • Vowels

Furthermore, it is possible to define seven different types of consonants:

  • Plosives
  • Fricatives
  • Affricates
  • Glides
  • Nasals
  • Laterals
  • Trills

Within the vowel group, a distinction can be made between checked and free. Furthermore, diphthongs represent an additional characteristic among the group of vowels.

The Netherlands Dutch symbol set grouped by phoneme classes

The following table shows all phonemes used in Dutch transcriptions, grouped by phoneme class (that is, by manner of articulation), with their IPA and SAMPA representations.

SAMPA IPA Examples of usage

Consonants

Plosives
b b bak /bAk/
g g goal /go:l/
p p pak /pAk/
k k kap /kAp/
d d dak /dAk/
t t tak /tAk/

Fricatives
f f fel /fEl/
v v vel /vEl/
s s soep /sup/
z z zijn /zEJn/
x x toch /tOx/
G ɣ regen /re:G@/
S ʃ show /So:w/
Z ʒ bagage /bAGa:Z@/
h h hand /hAnt/

Glides
w w wit /wIt/
j j ja /ja:/

Nasals
m m met /mEt/
n n net /nEt/
N ŋ bang /bAN/

Lateral
l l land /lAnt/

Trill
r r rand /rAnt/

Vowels

Checked vowels
I ɪ pit /pIt/
E ɛ pet /pEt/
E: ɛ: elitair /e:litE:r/
A ɑ pan /pAn/
O ɔ pot /pOt/
Q: ɔ: caisson /kEsQ:/
Y ʏ put /pYt/

Free vowels
i i vier /vir/
y y vuur /vyr/
u u voer /vur/
a: a: naam /na:m/
e: e: veer /ve:r/
2: ø: deur /d2:r/
o: o: voor /vo:r/

Diphthongs
EJ ɛi fijn /fEJn/
9J œy huis /h9Js/
AW ɑu goud /GAWt/

Schwa
@ ə geleid /G@lEJt/

The Netherlands Dutch consonants

The standard Dutch consonant system is generally considered to have:

  • Five plosives
  • Eight fricatives
  • Two affricates
  • Three nasals
  • Two glides
  • One lateral
  • One trill

Additionally, this language pack contains the following xenophones (phonemes from other languages that are commonly used in the Dutch language):

  • One plosive /g/
  • One fricative /Z/

The sample words given below show the different contexts in which the sounds can appear. A short explanation is also given.

Plosives

There are two voiced and three voiceless plosives in Dutch as well as the voiced xenophone /g/, which can be arranged in pairs as shown here:

Voiced Voiceless
/b/ berg kabel
/g/ goal
/d/ dag kader

Note that /b/ and /d/ will not appear at the end of any word in your transcription. Dutch (as well as German) has “devoicing”, which means that voiced consonants become voiceless at the end of a word: tijd /tEJt/, Bob /bOp/.

Fricatives

There are eight fricatives in Dutch, five voiceless and three voiced. The voiced xenophone /Z/ can be added to the voiced fricatives, giving four. Each of these four voiced fricatives can be paired with a voiceless counterpart; among the voiceless fricatives, only /h/ has no voiced partner:

Voiced Voiceless
/v/ vijf leven
/z/ zee lezen
/G/ goud geluid
/Z/ journaal jukebox

The phonemes /v/ and /z/ do not appear at the end of a word in transcriptions because of devoicing (see Devoicing).

Affricates

In Dutch there are two affricates, /ts/ and /tS/. The xenophone affricate /dZ/ is also used.

Affricates are always represented by two single phonemes.

/dZ/ manager budget /mEn@dZ@r/ /bYdZEt/
/ts/ plantsoen plaats /plAntsun/ /pla:ts/
/tS/ Tsjech ketchup bondscoach /tSEx/ /kEtSYp/ /bOntsko:tS/

Glides

There are two glides in Dutch: /w/ and /j/.

/w/ weer trouwe schuw /we:r/ /trAWw@/ /sxyw/
/j/ jaar gooien saai /ja:r/ /Go:j@/ /sa:j/

Nasals

There are three nasals in Dutch: /m/, /n/, and /N/. The phoneme /N/ is normally represented by the combination <ng> or <nk> in Dutch orthography and cannot appear at the beginning of a word.

/m/ meer zwemmen tam /me:r/ /zwEm@/ /tAm/
/n/ noot binnen laan /no:t/ /bIn@/ /la:n/
/N/ zingen zinken zong zonk /zIN@/ /zINk@/ /zON/ /zONk/

Laterals

In the Dutch SAMPA symbol set there is one lateral, the phoneme /l/.

/l/ lang /lAN/

Trills

All pronunciation variants of the letter <r> must be transcribed as /r/.

/r/ ruit /r9Jt/

Dutch vowels

Dutch single vowels fall into two classes: checked and free vowels.

Checked vowels

Checked vowels mostly occur in stressed syllables with a following consonant. In instances where there is no final consonant, the checked vowels become free vowels. For example: dak /dAk/, daken /da:k@/

Note that two of the checked vowels are xenophones: /E:/, /Q:/

/I/ inkt pit /INkt/ /pIt/
/E/ erwt berk /Ert/ /bErk/
/E:/ elitair airport /e:litE:r/ /E:rpQ:rt/
/A/ achter land /Axt@r/ /lAnt/
/O/ ochtend morgen /Oxt@nt/ /mOrG@/
/Q:/ caisson hawkinslaan /kEsQ:/ /hQ:kInsla:n/
/Y/ ultimatum cultuur /Yltima:tYm/ /kYltyr/

Schwa

The schwa sound, as in maken, gooien, ziekte, etc., is transcribed /@/: /ma:k@/, /Go:j@/, /zikt@/, etc.

Note that the <i> in -ig and the <ij> in -lijk are transcribed as /@/, too: handig /hAnd@x/, makkelijk /mAk@l@k/.

Additionally, <en> is transcribed as /@/ when it appears at the end of a word: lopen /lo:p@/.

Free vowels

There are seven free vowels in Dutch:

/i/ iets fiets ruzie /its/ /fits/ /ryzi/
/e:/ ezel kwezel zee /e:z@l/ /kwe:z@l/ /ze:/
/a:/ aan baan zodra /a:n/ /ba:n/ /zo:dra:/
/o:/ oor koor zo /o:r/ /ko:r/ /zo:/
/y/ uur cultuur nu /yr/ /kYltyr/ /ny/
/u/ boer koe /bur/ /ku/
/2:/ eucalyptus kleur beu /2:ka:lIptYs/ /kl2:r/ /b2:/

There are three diphthongs in Dutch:

/EJ/ eigen mijn zij /EJG@/ /mEJn/ /zEJ/
/9J/ uit ruig bui /9Jt/ /r9Jx/ /b9J/
/AW/ oud blauw /AWt/ /blAWw/

There is no clear distinction between monophthongs (free vowels) and diphthongs. In some cases, it depends on the speaker if a phoneme (for example /e:/) is realized as a monophthong or as a diphthong.

Specific pronunciation transcription methods

Double consonants

There are no double consonants (geminates) in spoken Dutch. However, in written Dutch, these seem to abound where a double consonant signals the shortness of the preceding vowel. Compare the following examples:

poten /po:t@/
potten /pOt@/

Devoicing

There are no voiced consonants in the final position of a spoken word in Dutch, although they are written in the orthography:

Consonant Final position (voiceless) Within a word (voiced)
d bad /bAt/
b rib /rIp/
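
As a minimal sketch, final devoicing can be applied to a draft SAMPA transcription (written without the surrounding slashes) as follows. The pairs b/p and d/t come from the table above; the pairs v/f, z/s, and G/x are assumed from the fricative pairings and from examples such as ruig /r9Jx/. The function name is hypothetical.

```python
# Voiced consonant -> voiceless counterpart in word-final position.
FINAL_DEVOICING = {"b": "p", "d": "t", "v": "f", "z": "s", "G": "x"}


def devoice_final(transcription: str) -> str:
    """Replace a word-final voiced consonant with its voiceless counterpart."""
    if transcription and transcription[-1] in FINAL_DEVOICING:
        return transcription[:-1] + FINAL_DEVOICING[transcription[-1]]
    return transcription
```

For example, the draft transcription tEJd for tijd becomes tEJt, while a transcription that already ends in a voiceless consonant is returned unchanged.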

Assimilation

In some consonant sequences, the first consonant assimilates to the second. For example, a voiceless /f/ becomes a voiced /v/ when a voiced /b/ follows.

Assimilation occurs in the following cases:

fb /vb/ leefbaar /le:vba:r/
fd /vd/ liefde /livd@/
nb /mb/ tuinbank /t9JmbANk/
sb /zb/ uitwisbaar /9JtwIzba:r/
sd /zd/ huisdier /h9Jzdir/
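
The five cases above can be applied to a draft SAMPA transcription with simple string replacement. The Python sketch below assumes the transcription is written without slashes; the names are illustrative.

```python
# (sequence before assimilation, sequence after assimilation)
ASSIMILATION_RULES = [
    ("fb", "vb"),
    ("fd", "vd"),
    ("nb", "mb"),
    ("sb", "zb"),
    ("sd", "zd"),
]


def assimilate(transcription: str) -> str:
    """Apply the listed assimilation cases to a SAMPA transcription."""
    for source, target in ASSIMILATION_RULES:
        transcription = transcription.replace(source, target)
    return transcription
```

Because SAMPA symbols are case-sensitive, the lowercase patterns do not touch /S/ or /N/.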

When the <n> is not pronounced, assimilation does not take place (see Multiple pronunciations–variants for more information). For example:

kippenboer /kIp@bur/

Vowel sequences with a diphthong character

There are five vowel sequences that are sometimes described as diphthongs.

-aai- /a:j/ aaien /a:j@/
-ooi- /o:j/ mooi /mo:j/
-oei- /uj/ roeiboot /rujbo:t/
-ieu- /iw/ nieuw /niw/
-eeu- /e:w/ eeuw /e:w/

Insertion of additional consonants

To obtain a fluent pronunciation, an additional consonant must be inserted to avoid a hiatus.

Between /u/, /y/, /o:/, or /AW/ and another vowel, a /w/ is inserted. In all other cases (/i/, /e:/, and /EJ/), the additional consonant is a /j/.

Examples:

houwer /hAWw@r/
Europeaan /2:ro:pe:ja:n/

Even if the sound /AW/ is at the end of the word, in the transcription the /w/ is added. For example:

jouw /jAWw/
blauw /blAWw/
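
The insertion rules can be sketched in Python on a transcription that has already been split into SAMPA symbols. Only the vowels named above trigger an insertion in this sketch, and the function and constant names are hypothetical.

```python
VOWELS = {"I", "E", "E:", "A", "O", "Q:", "Y", "i", "y", "u",
          "a:", "e:", "2:", "o:", "EJ", "9J", "AW", "@"}
W_TRIGGERS = {"u", "y", "o:", "AW"}  # followed by a vowel -> insert /w/
J_TRIGGERS = {"i", "e:", "EJ"}       # followed by a vowel -> insert /j/


def insert_glides(phonemes: list) -> list:
    """Insert /w/ or /j/ between vowels, and /w/ after a word-final /AW/."""
    result = []
    for index, phoneme in enumerate(phonemes):
        result.append(phoneme)
        following = phonemes[index + 1] if index + 1 < len(phonemes) else None
        if phoneme == "AW" and following is None:
            result.append("w")  # word-final /AW/ is written with /w/
        elif following in VOWELS:
            if phoneme in W_TRIGGERS:
                result.append("w")
            elif phoneme in J_TRIGGERS:
                result.append("j")
    return result
```

A transcription that already contains the glide is left unchanged, because the symbol after the vowel is then a consonant.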

Pronunciation of foreign words

There are phonemes that only occur in foreign words. These are internally “mapped” or linked to the closest Dutch phonemes in the speech recognizer. Therefore, it is not necessary to use the foreign phonemes in custom transcriptions.

To transcribe foreign words or names, you must use the Dutch SAMPA symbols. If you use a different symbol set, the system cannot interpret the input.

Foreign word Sample transcriptions
Bergström Hölsgens /bErkstr2:m/ /hYlsg@ns/
Hülsenbeck Hürriyet überhaupt /hylz@nbEk/ /hyrijEt/ /yb@rhAWpt/
Jungschläger aufklärung Schäublin /juNSle:g@r/ /AWfkle:rYN/ /SOjplin/

The following xenophones have been added to the Dutch phoneme set as they occur in frequently used loan words:

Xenophone Example Transcriptions
/g/ gadget /gEdZ@t/
/Z/ manager college /mEn@dZ@r/ /kOle:Z@/
/E:/ milliampère advanced /miliAmpE:r/ /E:tfa:nst/
/Q:/ all-riskverzekering chanson /Q:lrIskf@rze:k@rIN/ /SQ:sQ:/

Multiple pronunciations–variants

In Dutch, words with the suffix -en can be pronounced in two ways. For example:

komen /ko:m@/ /ko:m@n/
poten /po:t@/ /po:t@n/

We decided to transcribe consistently without /n/ since there is no difference in meaning, and one consistent transcription is easier to handle than two. This means that words with the suffix -en always have to be transcribed as in these examples:

komen /ko:m@/
poten /po:t@/

This handling is also applied if “-en” occurs at the end of a word within a compound. For example, when “schroeven” (screws) combines with “draaier” (driver), the resulting compound is:

schroevendraaier /sxruv@dra:j@r/
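
For a word-final -en, the convention above amounts to dropping a trailing /n/ after a schwa. The Python sketch below assumes that a final /@n/ always comes from the -en suffix, which is not true of every word, and it does not detect -en inside compounds. The function name is hypothetical.

```python
def drop_final_n(transcription: str) -> str:
    """Rewrite a word-final /@n/ as /@/ in a SAMPA transcription without slashes."""
    if transcription.endswith("@n"):
        return transcription[:-1]
    return transcription
```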

The Dutch symbol set in alphabetical order

The following table shows the Dutch symbol set in alphabetical order:

SAMPA IPA Examples of usage
@ ə geleid
2: ø: deur
9J œy huis
A ɑ pan
a: a: naam
AW ɑu goud
b b bak
d d dak
E ɛ pet
e: e: veer
E: ɛ: elitair
EJ ɛi fijn
f f fel
G ɣ regen
g g goal
h h hand
i i vier
I ɪ pit
j j ja
k k kap
l l land
m m met
n n net
N ŋ bang
O ɔ pot
o: o: voor
p p pak
Q: ɔ: caisson
r r rand
s s soep
S ʃ show
t t tak
u u voer
v v vel
w w wit
x x toch
y y vuur
Y ʏ put
Z ʒ bagage
z z zijn
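
The symbol set above can be used to check that a custom transcription contains only Dutch SAMPA symbols. The Python sketch below splits a transcription into symbols by trying two-character symbols before single characters, and raises an error on anything outside the table. The function name is hypothetical.

```python
SYMBOLS = [
    "@", "2:", "9J", "A", "a:", "AW", "b", "d", "E", "e:", "E:", "EJ",
    "f", "G", "g", "h", "i", "I", "j", "k", "l", "m", "n", "N", "O",
    "o:", "p", "Q:", "r", "s", "S", "t", "u", "v", "w", "x", "y", "Y",
    "Z", "z",
]

# Try two-character symbols (for example "a:") before single characters.
_BY_LENGTH = sorted(SYMBOLS, key=len, reverse=True)


def tokenize(transcription: str) -> list:
    """Split a SAMPA transcription into symbols from the Dutch symbol set."""
    text = transcription.strip("/")
    phonemes = []
    position = 0
    while position < len(text):
        for symbol in _BY_LENGTH:
            if text.startswith(symbol, position):
                phonemes.append(symbol)
                position += len(symbol)
                break
        else:
            raise ValueError(
                f"no Dutch SAMPA symbol at position {position} in {text!r}"
            )
    return phonemes
```

A transcription such as /kat/ is rejected because a plain “a” is not in the symbol set; the Dutch set has /A/ and /a:/ instead.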