Russian Russia (ru-RU)

This documentation was updated on November 22, 2023.

Creating grammars

The following subsections describe key issues for working with grammar documents in the Russian language.

Character encoding

Nuance Recognizer has full internal Unicode support. You can create your grammars using UTF-8 character encoding. For example, your grammar header might be:

<?xml version='1.0' encoding='UTF-8'?>

<grammar xml:lang="ru-RU" version="1.0" root="test">

All grammars (and any embedded ECMAScript code) must respect the characters reserved by the XML standard. In particular, the ampersand “&” functions as an escape character: “&gt;” represents the “greater than” symbol (>), “&lt;” represents “less than” (<), and “&amp;” represents the ampersand itself (&).
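When grammar text is generated by a program, the reserved characters can be escaped before the text is written into the grammar file. The following is a minimal Python sketch and not part of Nuance Recognizer; it relies on the standard library function xml.sax.saxutils.escape, and the helper name escape_for_grammar is hypothetical:

```python
from xml.sax.saxutils import escape


def escape_for_grammar(text: str) -> str:
    """Replace &, < and > with their XML entity references."""
    # escape() converts the ampersand first, so the entities it
    # produces are not escaped a second time.
    return escape(text)


print(escape_for_grammar("x > 1 & y < 2"))
```

Text that is already escaped should not be passed through such a helper again, because the ampersands of existing entities would be escaped once more.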

If your keyboard does not match your target language, on Windows you can add the appropriate keyboard layout: open the “Control Panel”, click “Regional and Language”, and select “Keyboards and languages”.

alphanum_lc built-in grammar

The alphanum_lc built-in grammar recognizes a connected string of up to 20 digits and lowercase alphabetic characters, such as “78шцeрф”.
For example, this grammar could be used to recognize a product code or user id. The “lc” in the name of this built-in means lowercase.

Characters are the lowercase letters а б в г д е ё ж з и й к л м н о п р с т у ф х ц ч ш щ ъ ы ь э ю я

Digits are 0-9.

Note: This grammar replaces the alphanum built-in grammar.

Example

The file containing the entries (alpha.entries.txt) might look like this:

нопрс фх1ч 7дëю7 д08yа йвгрфд l76г0 7983 78шцeрф в5506 цнц4жд в4км
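Entries can be checked against the constraints described above (at most 20 characters, digits and lowercase Russian letters only) before they are used. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, an empty entry is treated as invalid, and the check says nothing about how the recognizer itself validates entries. It builds the alphabet from Unicode code points, which also exposes Latin look-alike characters (for example a Latin “e” typed in place of the Cyrillic “е”):

```python
# Cyrillic а..я occupy U+0430..U+044F; ё (U+0451) lies outside that range.
RUSSIAN_LOWER = {chr(code) for code in range(0x0430, 0x0450)} | {"\u0451"}
DIGITS = set("0123456789")
MAX_LENGTH = 20  # "up to 20 digits and lowercase alphabetic characters"


def is_valid_alphanum_lc(entry: str) -> bool:
    """Check the length and the alphabet of one candidate entry."""
    if not 1 <= len(entry) <= MAX_LENGTH:
        return False
    return all(ch in RUSSIAN_LOWER or ch in DIGITS for ch in entry)


def foreign_characters(entry: str) -> list:
    """List the characters outside the allowed set, such as Latin look-alikes."""
    return [ch for ch in entry if ch not in RUSSIAN_LOWER and ch not in DIGITS]
```

Running foreign_characters over each line of an entries file is a quick way to find characters that only look Cyrillic.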

alphanum built-in grammar

The alphanum built-in grammar recognizes a connected string of up to 20 digits and alphabetic upper and lower case characters. For example, this grammar could be used to recognize a product code or order number.

Characters are the following uppercase and lowercase letters: а б в г д е ё ж з и й к л м н о п р с т у ф х ц ч ш щ ъ ы ь э ю я А Б В Г Д Е Ё Ж З И Й К Л М Н О П Р С Т У Ф Х Ц Ч Ш Щ Ъ Ы Ь Э Ю Я

Digits are 0-9.

Note: The recognition quality of the alphanum built-in grammar is lower than that of its lowercase counterpart, the built-in grammar “alphanum_lc”.

Example

The file containing the entries (alpha.entries.txt) might look like this:

МопрС фх1ч 7дëю7 д08yа йвгрфд l76г0 7983 78шцeрф в5506 цнц4жд в4км

boolean built-in grammar

The boolean grammar collects an affirmative or negative response.

Properties

The y and n parameters let you associate any two touchtone buttons as synonyms for yes and no.

Parameter   Description
y           Desired DTMF digit to be equivalent to “да” (default = 1)
n           Desired DTMF digit to be equivalent to “нет” (default = 2)

Examples

Caller says   MEANING key
да            true
нет           false
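The effect of the y and n parameters can be pictured with a small Python sketch. It only simulates the mapping described above for illustration; the function name interpret_boolean is hypothetical, and the sketch does not show how the parameters are passed to the recognizer:

```python
def interpret_boolean(response: str, y: str = "1", n: str = "2"):
    """Map a spoken word or a DTMF digit to True or False.

    y and n are the DTMF digits treated as synonyms for "да" and "нет".
    Returns None when the response matches neither answer.
    """
    answer = response.strip().lower()
    if answer in ("да", y):
        return True
    if answer in ("нет", n):
        return False
    return None


print(interpret_boolean("да"))                # spoken affirmative
print(interpret_boolean("9", y="7", n="9"))   # remapped DTMF digits
```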

date built-in grammar

The date grammar accepts a date spoken in any of several formats.
- day
- weekday + day
- day + month
- weekday + day + month
- day + month + year
- weekday + day + month + year

The grammar also accepts “позавчера”, “вчера”, “сегодня”, “завтра” and “послезавтра” which return values of -2, -1, 0, +1 and +2 respectively into the MEANING key.

Examples

Caller says                               MEANING key
пятого августа две тысячи третьего        20030805
пятое августа две тысячи третьего года    20030805
четырнадцатое марта                       ????0314
четырнадцатого марта                      ????0314
четырнадцатого третьего                   ????0314
двадцать первое июня первого              ??010621
двадцать первого июня первого года        ??010621
позавчера                                 -2
вчера                                     -1
сегодня                                   0
завтра                                    +1
послезавтра                               +2
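An application that receives the MEANING key usually has to turn it into a concrete calendar date. The following Python sketch shows one possible approach. The function name resolve_date_meaning is hypothetical, and filling a missing year or century from the reference date is an assumption about application policy, not behavior of the grammar:

```python
from datetime import date, timedelta


def resolve_date_meaning(meaning: str, today: date) -> date:
    """Turn a MEANING value such as '20030805', '????0314' or '-2' into a date."""
    # Relative values (-2 .. +2) are shorter than the 8-character date form.
    if len(meaning) != 8:
        return today + timedelta(days=int(meaning))

    year_part = meaning[:4]
    month = int(meaning[4:6])
    day = int(meaning[6:8])

    if year_part == "????":
        year = today.year  # assumption: no year spoken means the current year
    elif year_part.startswith("??"):
        # assumption: a two-digit year belongs to the current century
        year = (today.year // 100) * 100 + int(year_part[2:])
    else:
        year = int(year_part)
    return date(year, month, day)


reference = date(2023, 11, 22)
print(resolve_date_meaning("????0314", reference))
print(resolve_date_meaning("-2", reference))
```

Other policies, such as preferring the next future occurrence of a date without a year, can be substituted in the two branches marked as assumptions.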

digits built-in grammar

Valid characters are the digits 0-9.

Vocabulary items and pronunciations

This chapter describes considerations for vocabularies and their pronunciations in Russian (ru-RU).

Specially tuned pronunciations

The following list shows common words whose pronunciations have been fine-tuned by Nuance. Each of these words contains “word-specific phonemes”, that is, phonemes and associated models created especially for that word.

Words with tuned pronunciations (do not modify):

  • All letters of the alphabet, а-я
  • да, нет
  • Cardinal numbers: 0-1000

Russian pronunciations

This section provides detailed reference information to help create pronunciation dictionaries. It is intended for people who have sufficient knowledge of the Russian language. It provides information about transcription and pronunciation.

If you are not sure how a certain word is pronounced, you can refer to the IPA transcriptions given here and then convert them into the SAMPA symbols given in The Russian symbol set in alphabetical order.

The Russian phoneme system

The Russian phoneme system can be divided into two groups:

  • Consonants
  • Vowels

Furthermore, it is possible to define seven different types of consonants:

  • Plosives
  • Fricatives
  • Affricates
  • Trills
  • Nasals
  • Laterals
  • Approximant

The following table shows all phonemes used in Russian transcriptions. They are listed according to their phoneme classes with their SAMPA and IPA representations.

Russian symbol set grouped by phoneme classes

Phoneme class             SAMPA   IPA    Examples of usage
Consonants: Plosives      b       b      брат
                          b'      bʲ     кабинет /kab'in'et/
                          d       d      дом /dom/
                          d'      dʲ     кандидат /kan'd'idat/
                          g       g      магазин /magaz'in/
                          g'      gʲ     книги /kn'ig'i/
                          k       k      кто /kto/
                          k'      kʲ     какие /kak'iji/
                          p       p      лампа /lampa/
                          p'      pʲ     капитан /kap'itan/
                          t       t      так /tak/
                          t'      tʲ     думать /dumat'/
Consonants: Fricatives    S       ʃ      хорошо
                          S':     ʃʲː    борщ /bor'S':/
                          Z       ʒ      жара /Zara/
                          f       f      фабрика /fabr'ika/
                          f'      fʲ     фильм /f'il'm/
                          s       s      свобода /svaboda/
                          s'      sʲ     семя /s'em'a/
                          v       v      восток /vastok/
                          v'      vʲ     вена /v'ena/
                          x       x      холод /xolat/
                          x'      xʲ     хитрость /x'itras't'/
                          z       z      запах /zapax/
                          z'      zʲ     зима /z'ima/
Consonants: Affricates    tS'     tʃʲ    значит
                          TS      ts     центр /TSentr/
Consonants: Trills        r       r      русский
                          r'      rʲ     рядом /r'adam/
Consonants: Nasals        m       m      мало
                          m'      mʲ     минута /m'inuta/
                          n       n      на /na/
                          n'      nʲ     нет /n'et/
Consonants: Laterals      l       l      луна
                          l'      lʲ     лифт /l'ift/
Consonants: Approximant   j       j      язык
Vowels                    @       ɨ      вы
                          a       a      там /tam/
                          e       e      ехать /jexat'/
                          i       i      картина /kart'ina/
                          o       o      работа /rabota/
                          u       u      друг /druk/

Russian consonants

The standard Russian consonant system is considered to have:

  • Twelve plosives
  • Thirteen fricatives
  • Two affricates
  • Two trills
  • Four nasals
  • Two laterals
  • One approximant

Within the consonant groups, two further distinctions can be made:

  • “Soft” and “hard” consonants
  • Voiced and voiceless consonants

“Soft” and “hard” consonants

The first distinction is between “soft” and “hard” consonants. The so-called soft consonants are palatalized, meaning that during the formation of the consonant, the front of the tongue is raised toward a position similar to that of an /i/. This phenomenon is called palatalization. There are some consonants that are either only “soft” or only “hard” and others that may have both variants depending on their position within the word.

Consonants are palatalized before the “soft” sign <ь> or a “soft” vowel, that is, <я>, <и>, <е>, <ё>, <ю>.

Consonants are not palatalized before the “hard” sign <ъ> or a “hard” vowel, that is, <а>, <о>, <у>, <ы>, <э>.

Note: This distinction refers to the written form of the vowels. Phonetically, vowels are not distinguished as soft or hard; the distinction only affects the preceding consonant. See also Russian vowels.

Consonants that can only be “hard”: <ж>, <ц>, <ш>.

Consonants that can only be “soft”: <ч>, <й>, <щ>.
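The spelling rules above can be sketched in Python as follows. This is a simplified illustration, not a complete transcription tool: the function name palatalized_flags is hypothetical, and the sketch looks only at the letter that follows each consonant, so it does not model effects of neighboring consonants (for example the soft /n'/ before /d'/ in кандидат):

```python
SOFT_TRIGGERS = set("ьяиеёю")   # soft sign and "soft" vowel letters
ALWAYS_HARD = set("жцш")
ALWAYS_SOFT = set("чйщ")
CONSONANTS = set("бвгдзклмнпрстфх") | ALWAYS_HARD | ALWAYS_SOFT


def palatalized_flags(word: str) -> list:
    """Return (consonant, is_palatalized) pairs for a word in Cyrillic spelling."""
    word = word.lower()
    flags = []
    for index, letter in enumerate(word):
        if letter not in CONSONANTS:
            continue
        if letter in ALWAYS_SOFT:
            soft = True
        elif letter in ALWAYS_HARD:
            soft = False
        else:
            # Palatalized only when directly followed by a soft trigger.
            soft = index + 1 < len(word) and word[index + 1] in SOFT_TRIGGERS
        flags.append((letter, soft))
    return flags


print(palatalized_flags("лифт"))
```

For лифт the sketch marks only <л> as palatalized, which matches the transcription /l'ift/ used in this section.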

Voiced and voiceless consonants

Consonants can also be divided into voiced and voiceless consonants. This distinction is especially important for plosives, fricatives, and affricates, because voiced consonants of these classes become voiceless at the end of a word (“terminal devoicing”). A voiced consonant followed by a voiceless one loses voice, whereas a voiceless consonant followed by a voiced one acquires voice (assimilation). See also Specific pronunciation transcription methods.

Some consonants fall into pairs of voiced and voiceless, others are without a corresponding pair. Trills, nasals, laterals, and approximants are always voiced.

Plosives

There are six non-palatalized and six palatalized plosives in Russian, which can be arranged in pairs as shown here (each voiced plosive is listed directly above its voiceless counterpart):

Non-palatalized   Examples                                  Palatalized “soft”   Examples
b                 была /b@la/, любую /l'ubuju/              b'
p                 под /pot/, папа /papa/, суп /sup/         p'
d                 дом /dom/, куда /kuda/                    d'
t                 так /tak/, карта /karta/, мат /mat/       t'
g                 газета /gaz'eta/, много /mnoga/           g'
k                 как /kak/, водка /votka/, банк /bank/     k'

/k’/ does not occur in word-final position.

Fricatives

There are seven non-palatalized and six palatalized fricatives in Russian, some of which can be arranged in pairs as shown here. Again, each voiced fricative is listed directly above its voiceless counterpart. x and x' are voiceless and have no voiced counterpart. S and S': are the voiceless counterparts of Z; these three are unpaired in terms of palatalization.

Non-palatalized   Examples                                                  Palatalized “soft”   Examples
v                 вы /v@/, правда /pravda/                                  v'
f                 факт /fakt/, выставка /v@stafka/, фотограф /fatograf/     f'
z                 завод /zavot/, гарнизон /garn'izon/                       z'
s                 спасибо /spas'iba/, мост /most/, газ /gas/                s'
Z                 журнал /Zurnal/, тоже /toZ@/
S                 школа /Skola/, машина /maS@na/, наш /naS/
                                                                            S':                  щека, банщик, борщ
x                 характер /xarakt'ir/, плохо /ploxa/, ах /ax/              x'

/x’/ does not occur in word-final position.

Affricates

In Russian there are two affricates, one palatalized and one non-palatalized (they do not form a pair and they are both voiceless):

Non-palatalized   Examples                                            Palatalized “soft”   Examples
TS                центр /TSentr/, концерт /kanTSert/, отец /at'eTS/
                                                                      tS'                  час, очень, врач

Trills

There are two trills in Russian, one palatalized and one non-palatalized.

Non-palatalized   Examples                                         Palatalized “soft”   Examples
r                 радио /rad'io/, город /gorat/, театр /t'iatr/    r'

Nasals

There are four nasals in Russian, two palatalized and two non-palatalized.

Non-palatalized   Examples                                             Palatalized “soft”   Examples
m                 мать /mat'/, драма /drama/, систем /s'is't'em/       m'
n                 наш /naS/, магазины /magaz'in@/, план /plan/         n'

Laterals

There are two laterals in Russian, one palatalized and one non-palatalized:

Non-palatalized   Examples                                         Palatalized “soft”   Examples
l                 ламп /lamp/, мало /mala/, сигнал /s'ignal/       l'

Approximant

There is one approximant in Russian. It occurs either at the beginning of a word or within a word following a vowel:

Palatalized “soft”   Examples
j                    я, английский, чай

Russian vowels

Vowels

The Russian vowel system is relatively simple. There are six vowels in Russian:

Short   Examples
@       экономика, сын, мы
a       автобус, там, комната
e       эта, тема, где
i       историк, литература, католики
o       он, дом, дупло
u       улица, юрист, внизу

There is no phonemic distinction between short and long vowels, but they can be stressed or unstressed. For intonation in Russian, see Specific pronunciation transcription methods.

In writing, these six vowel sounds are represented by ten graphemes, which can be divided into soft and hard vowels. This distinction is important with regard to the palatalization of consonants. (See Russian consonants.)

Soft vowels are: <я>, <и>, <е>, <ё>, <ю>. Consonants preceding a soft vowel are palatalized.

Note that <е> is pronounced /je/ at the beginning of a word, after a vowel, or after the soft sign <ь>, for example:

есть /jes't'/
уехать /ujexat'/
портьера /part'jera/

Similarly, <и> is pronounced /ji/ after a vowel within a word or after the soft sign <ь>, but not at the beginning of a word:

стоит /stajit/
птичьим /pt'itS'jim/
but: их /ix/

<я>, <ё>, and <ю> are pronounced /ja/, /jo/, and /ju/ at the beginning of a word, after a vowel, or after the soft sign <ь>; in all other positions they follow a “soft,” palatalized consonant and are pronounced /a/, /o/, and /u/:

ящик /jaS':ik/
неприятная /n'ipr'ijatnaja/
итальянский /ital'jansk'ij/
but: изрядно /izr'adna/
ёлка /jolka/
моё /majo/
семьёй /s'im'joj/
but: идёт /id'ot/
юрист /jur'ist/
начинают /natS'inajut/
вьюга /v'juga/
but: любит /l'ub'it/

Hard vowels are: <а>, <о>, <у>, <ы>, <э>. Consonants preceding hard vowels are not palatalized.

Diphthongs

There are no genuine diphthongs in Russian; each vowel is pronounced separately. For example:

наука /nauka/
мои /mai/
идеи /id'ei/

Exceptions may occur with foreign words or names, for example:

каучук /kautS'uk/
клаус /klaus/

Specific pronunciation transcription methods

Alternate transcription of the Russian g

The <г> in the endings <-ого> and <-его> is transcribed as /v/, for example:

большого /bal'Sova/
хорошего /xaroS@va/

Terminal devoicing

There are no voiced consonants spoken in the word-final position in Russian. Thus, when a voiced consonant appears at the end of a word in writing, it must be transcribed using the corresponding voiceless one, for example:

Consonant   Final position (voiceless)   Within a word (voiced)
д           год /got/
з           глаз /glas/
в           вызов /v@zaf/
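Terminal devoicing can be applied mechanically to a transcription that is held as a list of SAMPA symbols. The following Python sketch is an illustration only; the function name devoice_final is hypothetical, and the voiced/voiceless pairs are taken from the consonant tables in this section. It changes only the last phoneme, so clusters at the end of a word additionally need the assimilation rule:

```python
# Voiced obstruents and their voiceless counterparts (SAMPA symbols).
VOICED_TO_VOICELESS = {
    "b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "Z": "S",
    "b'": "p'", "d'": "t'", "g'": "k'", "v'": "f'", "z'": "s'",
}


def devoice_final(phonemes: list) -> list:
    """Return a copy in which a word-final voiced obstruent is devoiced."""
    result = list(phonemes)
    if result:
        # Phonemes without a voiceless counterpart (for example nasals) stay as they are.
        result[-1] = VOICED_TO_VOICELESS.get(result[-1], result[-1])
    return result


print("".join(devoice_final(["g", "o", "d"])))   # год
```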

Double consonants

In Russian, consonants can be long or short. In writing, long consonants are indicated by double consonants or in some cases also by a consonant pair (a voiceless consonant followed by the corresponding voiced one) due to assimilation (see below). In phonetic transcription, long consonants are indicated by doubling the phoneme involved:

анна /anna/
касса /kassa/
отдых /odd@x/

Note that not all written double consonants are pronounced long, for instance when they occur before another consonant, as in:

русский /rusk'ij/

Assimilation

In Russian, consonants are assimilated regressively, so that whenever two consonants follow each other the second one determines the pronunciation. That means that a voiced consonant loses voice before a voiceless one, whereas a voiceless one acquires voice before a voiced one.

A voiced consonant followed by a voiceless one results in both voiceless. For example:

водка /votka/
завтра /zaftra/

This applies also to prepositions as they are regarded as one unit with the following word (phonetically), as in:

в сад /fsat/

A voiceless consonant followed by a voiced one results in both voiced. For example:

вокзал /vagzal/
отбор /adbor/

Exceptions

There is no assimilation before /v/ and /v'/; that is, a voiceless consonant followed by one of these remains voiceless:

ответ /atv'et/

Trills, nasals, laterals, and approximants are always voiced, but they do not trigger assimilation: both voiced and voiceless consonants may occur before them:

в магазин /vmagaz'in/
интересно /in't'ir'esna/
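The assimilation rule and its exceptions can be sketched in Python as shown below. The sketch works on a list of SAMPA symbols and processes it from right to left, because the second consonant of a pair determines the first. The function name assimilate is hypothetical, and leaving voiceless phonemes without a voiced counterpart in the symbol set (x, x', TS, tS', S':) unchanged is an assumption of this sketch:

```python
VOICED_TO_VOICELESS = {
    "b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "Z": "S",
    "b'": "p'", "d'": "t'", "g'": "k'", "v'": "f'", "z'": "s'",
}
VOICELESS_TO_VOICED = {v: k for k, v in VOICED_TO_VOICELESS.items()}

# All voiceless obstruents cause devoicing of a preceding voiced obstruent.
VOICELESS = set(VOICELESS_TO_VOICED) | {"x", "x'", "TS", "tS'", "S':"}
# /v/ and /v'/ are voiced but do not cause voicing of a preceding consonant.
VOICING_TRIGGERS = set(VOICED_TO_VOICELESS) - {"v", "v'"}


def assimilate(phonemes: list) -> list:
    """Apply regressive voicing assimilation to a list of SAMPA symbols."""
    result = list(phonemes)
    # Right to left, so that a changed consonant can affect the one before it.
    for i in range(len(result) - 2, -1, -1):
        current, following = result[i], result[i + 1]
        if following in VOICELESS and current in VOICED_TO_VOICELESS:
            result[i] = VOICED_TO_VOICELESS[current]
        elif following in VOICING_TRIGGERS and current in VOICELESS_TO_VOICED:
            result[i] = VOICELESS_TO_VOICED[current]
    return result


print("".join(assimilate(["v", "o", "d", "k", "a"])))   # водка
```

Trills, nasals, laterals, and the approximant appear in neither set, so consonants before them are left unchanged, as in /vmagaz'in/.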

Intonation/vowel reduction

In Russian, one syllable in each word is stressed (there are no general rules as to which one). Accordingly, there are stressed and unstressed vowels (note that <ё> is always stressed). Unstressed vowels are pronounced very briefly and articulated less clearly than stressed ones (reduction). Their pronunciation also depends on their position within a word. The most important changes for vowel reduction are given in the following examples.

The vowel graphemes that are involved in this reduction are <а>, <о>, <е>, and <я>.

Unstressed <а> and <о> are transcribed in the same way, namely as /a/. In the following examples, the stressed syllable is given in parentheses after the transcription:

антон /anton/ (ton)
табак /tabak/ (bak)
огонь /agon'/ (gon')
оператор /ap'irator/ (ra)

Unstressed <я> is transcribed as /a/ or /ja/ at the end of a word, and as /i/ or /ji/ at the beginning of a word or after a “soft” consonant (for the special pronunciation of <я> see also Vowels):

время /vr'em'a/ (vr'em')
гимназия /g'imnaz'ija/ (z'i)
язык /jiz@k/ (z@k)
девяносто /d'iv'inosta/ (nos)

Unstressed <е> is transcribed as /@/ after <ж>, <ш>, and <ц>, and as /i/ or /ji/ in all other positions (for the special pronunciation of <е> see also Vowels):

жена /Z@na/ (na)
цена /TS@na/ (na)
ваше /vaS@/ (va)
ещё /jiS':o/ (S':o)
замысел /zam@s'il/ (za)

Note that the stressed syllable may change when a word is declined, and the pronunciation may change with it, as in:

стол /stol/
стола /stala/ (la)

Transcription of foreign words

When foreign words need to be transcribed, the general rule is to transcribe them with the same SAMPA symbol set as the rest of the dictionary. For a Russian dictionary, this means that every word must be transcribed with the Russian SAMPA symbols.

If you use a different symbol set, the system will not be able to interpret the input.

Every language has a different phoneme inventory, so some foreign sounds may be difficult to cover. Transcription examples for the most common cases are given below.

Words with H

Because Russian has no letter H, foreign words containing this letter are written and pronounced with either ‘х’ or ‘г’, for example:

хобби /xob'b'i/
гигиена /g'ig'ijena/
ганновер /ganov'ir/
ханой /xanoj/

Note that a “silent H” is omitted, as in:

рур /rur/ (Ruhr)

Foreign vowels

Foreign words or names in Russian that originally contained an umlaut or other (also phonetic) features that do not exist in Russian are usually “transformed” in accord with Russian writing and pronunciation, in most cases following the original pronunciation as far as possible. For example:

кёльн /k'ol'n/ (Köln)
мюнхен /m'unx'in/ (München)
туалет /tual'et/ (toilet)
джаз /dZaz/ (jazz)
шоу /Sou/ (Shaw)
ресторан /r'istaran/ (restaurant)

German umlauts are usually “transformed” according to the following rules (in writing; for the pronunciation of the Russian vowels see the relevant sections):

ä → э at the beginning of a word
ä → е in all other cases
ö → э at the beginning of a word
ö → ё in all other cases
ü → и at the beginning of a word
ü → ю in all other cases

English “th” is “transformed” as in the example below:

th → /s/ “the” /se/

Multiple pronunciations (variants)

The type of pronunciation used in SAMPA and the Russian Background dictionary conforms to the standard non-regional Russian pronunciation. Since it is possible to have more than one pronunciation for a word by using pronunciation variants, it may be difficult to determine how many pronunciation variants should be created.

The general rule is: variants should only be created if the pronunciation differs in more than one phoneme. For example, the phoneme /g/ is often pronounced as ‘h’ in southern Russian dialects. This difference, however, is reflected in the choice of the speech data for acoustic model training, hence, it is not necessary to model it explicitly in the dictionary.

The Russian symbol set in alphabetical order

The following table shows the Russian symbol set in alphabetical order:

SAMPA   IPA    Examples of usage
@       ɨ      вы
a       a      там
b       b      брат
b'      bʲ     кабинет
d       d      дом
d'      dʲ     кандидат
e       e      ехать
f       f      фабрика
f'      fʲ     фильм
g       g      магазин
g'      gʲ     книги
i       i      картина
j       j      язык
k       k      кто
k'      kʲ     какие
l       l      луна
l'      lʲ     лифт
m       m      мало
m'      mʲ     минута
n       n      на
n'      nʲ     нет
o       o      работа
p       p      лампа
p'      pʲ     капитан
r       r      русский
r'      rʲ     рядом
s       s      свобода
S       ʃ      хорошо
s'      sʲ     семя
S':     ʃʲː    борщ
t       t      так
t'      tʲ     думать
TS      ts     центр
tS'     tʃʲ    значит
u       u      друг
v       v      восток
v'      vʲ     вена
x       x      холод
x'      xʲ     хитрость
z       z      запах
Z       ʒ      жара
z'      zʲ     зима
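Because every dictionary entry must use only the symbols listed above, it can be useful to split a transcription into symbols and reject anything outside the set. The following Python sketch is one way to do this and is not a Nuance tool. The function name tokenize is hypothetical, the longest symbol is matched first (so "tS'" is preferred over "t" followed by "S"), and typographic apostrophes are normalized to the plain apostrophe as an assumption about the input:

```python
SYMBOLS = [
    "@", "a", "b", "b'", "d", "d'", "e", "f", "f'", "g", "g'", "i", "j",
    "k", "k'", "l", "l'", "m", "m'", "n", "n'", "o", "p", "p'", "r", "r'",
    "s", "S", "s'", "S':", "t", "t'", "TS", "tS'", "u", "v", "v'", "x", "x'",
    "z", "Z", "z'",
]
# Try longer symbols first so that multi-character symbols are not split.
_BY_LENGTH = sorted(SYMBOLS, key=len, reverse=True)


def tokenize(transcription: str) -> list:
    """Split a transcription such as /TSentr/ into SAMPA symbols."""
    text = transcription.strip("/").replace("\u2019", "'")
    phonemes = []
    position = 0
    while position < len(text):
        for symbol in _BY_LENGTH:
            if text.startswith(symbol, position):
                phonemes.append(symbol)
                position += len(symbol)
                break
        else:
            # No symbol matched: the character is not in the Russian set.
            raise ValueError(f"unknown symbol at position {position}: {text[position]!r}")
    return phonemes


print(tokenize("/TSentr/"))
```

A ValueError from this function points to a character that does not belong to the Russian symbol set, for example a symbol borrowed from another language's set.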