Hindi - Linguistic Structure and Scripts
Understand Hindi phonology, its scripts and romanisation, and the lexical and script differences between Hindi and Urdu.
Summary
Phonology of Hindi
Vowel System
Hindi has a rich vowel system consisting of ten oral vowels and three nasalised vowels. Vowel length is phonemic: the duration of a vowel can change the meaning of a word, just as a change in vowel quality would. For example, short /a/ and long /aː/ are separate phonemes and can distinguish otherwise identical words.
The oral vowels contrast along dimensions of height (how high the tongue is raised) and backness (whether the tongue is positioned toward the front or back of the mouth). The nasalised vowels have the same qualities but with nasal airflow, adding another layer of contrast to the system.
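The length contrast can be made concrete with a small Python sketch. The word pairs are illustrative examples chosen for this sketch (they are not taken from the summary above), written in an IAST-style romanisation where a macron marks a long vowel. The helper names are hypothetical.

```python
import unicodedata

# Illustrative pairs (assumed examples): only vowel length differs.
MINIMAL_PAIRS = [
    ("kam", "kām"),  # "less" vs. "work"
    ("din", "dīn"),  # "day" vs. "poor, humble"
]

MACRON = "\u0304"  # combining macron, which marks length in this romanisation


def strip_length(word: str) -> str:
    """Return the word with every long-vowel macron removed."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(MACRON, ""))


def differs_only_in_length(a: str, b: str) -> bool:
    """True if a and b are different words that match once length is ignored."""
    a_nfc = unicodedata.normalize("NFC", a)
    b_nfc = unicodedata.normalize("NFC", b)
    return a_nfc != b_nfc and strip_length(a_nfc) == strip_length(b_nfc)


for short, long_ in MINIMAL_PAIRS:
    print(short, long_, differs_only_in_length(short, long_))
```

The check treats length as a single feature layered on top of vowel quality, which is a simplification: in actual pronunciation the short and long vowels usually differ slightly in quality as well.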
Consonant Inventory and Key Distinctions
Hindi has 33 consonants, a larger inventory than the roughly 24 consonants of English. The consonant system includes several important series:
The Retroflex Series. One of the most distinctive features of Hindi phonology is the presence of retroflex consonants, sounds made with the tongue curled back toward the hard palate. These include retroflex stops like $[ʈ]$ and $[ɖ]$ (the retroflex equivalents of the dental $[t]$ and $[d]$), as well as retroflex nasals and fricatives. This retroflex series is a characteristic feature of Indian languages and sets Hindi apart from many European languages.
Dental versus Retroflex Contrast. Hindi maintains a critical distinction between dental consonants (where the tongue touches the teeth) and retroflex consonants (where the tongue curves back). This contrast is one of the most important phonemic distinctions in Hindi and is worth learning to hear clearly, as it fundamentally changes meaning. For instance, the dental $[t]$ and retroflex $[ʈ]$ are entirely different sounds in Hindi.
Aspirated Stops. Hindi also features aspirated stops—consonants produced with a strong puff of air (like the English "p" in "pin" versus the unaspirated "p" in "spin"). In Hindi, aspiration is phonemic, meaning $[pʰ]$ and $[p]$ are distinct sounds that can differentiate word meanings.
Other Notable Consonants. Hindi also uses the voiced alveolar fricative $[z]$, mainly in loanwords from Persian, Arabic, and English. It is not part of the inherited consonant system, and many speakers replace it with $[dʒ]$.
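The place and aspiration contrasts described above can be laid out as a grid. The sketch below is a minimal illustration in Python, assuming the traditional Devanagari ordering of the five plosive rows. The palatal row is phonetically a set of affricates, and the labels and the `STOPS` name are choices made for this example.

```python
# Rows follow the traditional Devanagari order, from back to front of the mouth.
PLACES = ["velar", "palatal", "retroflex", "dental", "labial"]
# Columns: each row holds four consonants differing in voicing and aspiration.
MANNERS = ["voiceless", "voiceless aspirated", "voiced", "voiced aspirated"]

ROWS = ["कखगघ", "चछजझ", "टठडढ", "तथदध", "पफबभ"]

# Map (place, manner) -> Devanagari letter, e.g. ("dental", "voiceless") -> "त".
STOPS = {
    (place, manner): letter
    for place, row in zip(PLACES, ROWS)
    for manner, letter in zip(MANNERS, row)
}


def contrast(place_a: str, place_b: str, manner: str) -> tuple[str, str]:
    """Return the two letters that differ only in place of articulation."""
    return STOPS[(place_a, manner)], STOPS[(place_b, manner)]


print(contrast("dental", "retroflex", "voiceless"))      # dental vs. retroflex t
print(STOPS[("labial", "voiceless")], STOPS[("labial", "voiceless aspirated")])
```

Reading across a row changes voicing and aspiration while holding place constant, and reading down a column changes place while holding the other features constant, which is why both dimensions are phonemic.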
Comparison: Hindi and Urdu Phonology
While Hindi and Urdu are two standardized registers of the same language, Hindustani, they have developed distinct phonological patterns.
The Retroflex Challenge in Urdu
A key phonological difference is how speakers handle retroflex sounds. Urdu speakers often replace retroflex consonants with dental equivalents. This means that where a Hindi speaker maintains the contrast between dental $[t]$ and retroflex $[\ʈ]$, an Urdu speaker might use dental $[t]$ for both. This is not a deficiency—it simply reflects how Urdu has evolved, with fewer phonemic distinctions in this particular area.
Loan Phonemes from Persian and Arabic
A major phonological difference between Hindi and Urdu relates to loanword phonology. Urdu, having a longer history of contact with Persian and Arabic, has incorporated several foreign phonemes directly into its system:
[x] (voiceless velar fricative, like German "ch")
[z] (voiced alveolar fricative)
[ɣ] (voiced velar fricative)
[q] (voiceless uvular stop)
These phonemes are absent in standard Hindi. When Hindi speakers encounter words containing these sounds (often from Persian or Arabic sources), they typically substitute them with native Hindi sounds: $[x] \rightarrow [kʰ]$, $[z] \rightarrow [dʒ]$, $[ɣ] \rightarrow [g]$, and $[q] \rightarrow [k]$. This substitution strategy reveals how Hindi maintains its core phonological system while still accommodating foreign vocabulary.
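The substitution pattern above amounts to a lookup table. A minimal Python sketch, assuming a word is already segmented into a list of IPA symbols (the function name `nativise` is hypothetical):

```python
# Loan phoneme -> typical native Hindi substitute, as listed in the text above.
SUBSTITUTIONS = {
    "x": "kʰ",  # voiceless velar fricative -> aspirated velar stop
    "z": "dʒ",  # voiced alveolar fricative -> voiced palatal affricate
    "ɣ": "g",   # voiced velar fricative    -> voiced velar stop
    "q": "k",   # voiceless uvular stop     -> voiceless velar stop
}


def nativise(phonemes: list[str]) -> list[str]:
    """Replace each loan phoneme with its substitute; leave other segments alone."""
    return [SUBSTITUTIONS.get(p, p) for p in phonemes]


# qila ("fort"), segmented by hand for this example
print(nativise(["q", "i", "l", "a"]))
```

Working on pre-segmented lists avoids accidentally rewriting a symbol that is part of a larger unit such as an aspirated stop.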
Allophonic Variations and Approximants
<extrainfo>
Allophonic Variation in /ʋ/. In Hindi, the labiodental approximant $[ʋ]$ can be realised as a glide $[w]$ in on-glide positions, that is, between an onset consonant and a following vowel. By contrast, Urdu retains a more consistent $[v]$ sound in most contexts. This difference is subtle but reflects differing phonological processes in the two registers.
</extrainfo>
Nasal and Sibilant Mergers in Urdu
Hindi maintains two distinct sibilant phonemes:
श (ś) - palato-alveolar sibilant $[ʃ]$
ष (ṣ) - retroflex sibilant $[ʂ]$
Hindi also has two distinctive nasals (ङ and ञ) that correspond to velar and palatal positions respectively.
Urdu often merges these distinctions, using a single sibilant $[\ʃ]$ and nasal $[n]$. This merger simplifies the phonological inventory of Urdu compared to Hindi, making these sounds less contrastive.
Scripts and Their Relationship to Phonology
The Devanagari Script
Hindi is written in Devanagari (also called Nagari), the same script used for Sanskrit and several other Indian languages. Understanding how Devanagari works is crucial for understanding Hindi orthography:
Left-to-right writing. Devanagari is written from left to right, like English and most modern scripts.
Inherent vowel /a/. The most important feature of Devanagari is that each consonant character has an inherent vowel /a/ built into it. This means the character for consonant k actually represents $[ka]$, not just $[k]$. To write a consonant without this following vowel, or with a different vowel, the script uses diacritical marks placed around the consonant character.
The halant sign. To completely suppress the inherent vowel /a/, Hindi uses a special mark called a halant (or virama), which silences the vowel. This is essential for writing consonant clusters or final consonants without a vowel.
This system, in which every consonant character carries a default vowel, makes Devanagari an abugida. It is fundamentally different from the Latin alphabet, where consonants and vowels are independent letters, and it suits the phonological structure of Hindi, where words typically alternate between consonants and vowels.
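The interaction of inherent vowel, vowel signs, and halant can be sketched as a toy transliterator. This is a minimal illustration in Python, covering only a handful of letters chosen for the example. It reads the script literally and does not model pronunciation, so the word-final inherent vowel that spoken Hindi usually drops is still written out.

```python
# A few consonant letters; each one implies a following /a/ unless told otherwise.
CONSONANTS = {"क": "k", "म": "m", "ल": "l", "त": "t", "र": "r", "न": "n"}

# Dependent vowel signs (matras) that replace the inherent /a/.
MATRAS = {
    "\u093e": "ā",  # sign AA
    "\u093f": "i",  # sign I
    "\u0940": "ī",  # sign II
    "\u0941": "u",  # sign U
    "\u0942": "ū",  # sign UU
    "\u0947": "e",  # sign E
    "\u094b": "o",  # sign O
}

HALANT = "\u094d"  # virama: suppresses the inherent vowel


def romanise(text: str) -> str:
    out = []
    pending_a = False  # True while the last consonant still owes its inherent /a/
    for ch in text:
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")  # previous consonant keeps its inherent vowel
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in MATRAS and pending_a:
            out.append(MATRAS[ch])  # matra replaces the inherent vowel
            pending_a = False
        elif ch == HALANT and pending_a:
            pending_a = False  # halant removes the vowel entirely
        else:
            if pending_a:
                out.append("a")
                pending_a = False
            out.append(ch)  # pass through anything this sketch does not know
    if pending_a:
        out.append("a")
    return "".join(out)


print(romanise("क"), romanise("क\u094d"), romanise("क\u093f"), romanise("क\u094dर"))
```

The single `pending_a` flag is the whole mechanism: a consonant sets it, and a matra, a halant, or the next character resolves it.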
Romanisation Systems
<extrainfo>
Several systems exist for romanising (writing in Latin letters) Devanagari text:
IAST (International Alphabet of Sanskrit Transliteration) uses diacritical marks to represent Devanagari distinctions, making it precise but demanding in typography.
ITRANS uses ASCII characters to represent sounds, making it easier to type on standard keyboards.
ISO 15919 provides another standardized approach to transliteration.
Each system maps Devanagari characters to Latin equivalents, allowing representation of sounds like retroflex consonants and vowel length in the Latin script.
</extrainfo>
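To show how the diacritic-based and ASCII-based approaches relate, the Python sketch below converts a few IAST letters to one common ITRANS spelling. The table is deliberately partial, and ITRANS permits alternative spellings for some sounds, so the choices here are assumptions made for illustration rather than a complete or authoritative mapping.

```python
import unicodedata

# Partial IAST -> ITRANS table (one accepted ITRANS spelling per letter).
_RAW_TABLE = {
    "ā": "aa", "ī": "ii", "ū": "uu",  # long vowels: macron -> doubled letter
    "ṭ": "T", "ḍ": "D", "ṇ": "N",     # retroflexes: underdot -> capital letter
    "ś": "sh", "ṣ": "Sh",             # the two sibilants
}

# Normalise keys so lookups work however the source file encodes the diacritics.
IAST_TO_ITRANS = {unicodedata.normalize("NFC", k): v for k, v in _RAW_TABLE.items()}


def iast_to_itrans(text: str) -> str:
    """Convert IAST text to ASCII-only ITRANS for the letters in the table."""
    text = unicodedata.normalize("NFC", text)
    return "".join(IAST_TO_ITRANS.get(ch, ch) for ch in text)


print(iast_to_itrans("dūrbhāṣ"))
print(iast_to_itrans("kitāb"))
```

The trade-off is visible in the output: the ASCII form is easy to type, but it relies on letter case and doubling to carry the distinctions that IAST marks with diacritics.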
Script Differences Between Hindi and Urdu
While Hindi uses Devanagari, Urdu is written in the Perso-Arabic script, traditionally in the Nastaliq calligraphic style. This is one of the most visible differences between the two registers, though their phonological systems are largely overlapping. The choice of script reflects historical and cultural associations: Devanagari links Hindi to Sanskrit and Hindu tradition, while the Perso-Arabic script links Urdu to Persian, Arabic, and Islamic tradition.
This difference is primarily orthographic (about writing) rather than phonological (about sounds). The underlying language—the spoken system—is largely the same, which is why speakers of both registers can generally communicate with ease.
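Because the difference is orthographic, the same word can be told apart purely by the characters used to write it. The Python sketch below guesses the script of a string from Unicode character names. The function name `script_of` and the labels it returns are choices made for this example, and the test word kitāb ("book") is written with escape sequences so the right-to-left text does not disturb the layout.

```python
import unicodedata


def script_of(text: str) -> str:
    """Guess whether text is written in Devanagari or Perso-Arabic script."""
    counts = {"Devanagari": 0, "Perso-Arabic": 0}
    for ch in text:
        name = unicodedata.name(ch, "")  # empty string for unnamed characters
        if name.startswith("DEVANAGARI"):
            counts["Devanagari"] += 1
        elif name.startswith("ARABIC"):
            counts["Perso-Arabic"] += 1
    if not any(counts.values()):
        return "other"
    return max(counts, key=counts.get)


KITAB_DEVANAGARI = "\u0915\u093f\u0924\u093e\u092c"  # kitāb in Devanagari
KITAB_PERSO_ARABIC = "\u06a9\u062a\u0627\u0628"      # kitāb in Perso-Arabic

print(script_of(KITAB_DEVANAGARI), script_of(KITAB_PERSO_ARABIC))
```

Both strings spell the same spoken word, which is the point of the distinction between orthography and phonology made above.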
Vocabulary: Sources and Etymological Categories
One of the most important ways Hindi and Urdu differ is in their vocabulary, even though they share the same core grammar and phonology. Hindi vocabulary draws from multiple sources, and linguists have categorized Hindi words into five principal etymological categories based on their origin:
The Five Etymology Categories
1. Tatsam Words (तत्सम). These are "exactly Sanskrit" words—direct borrowings from Sanskrit that retain their original spelling and pronunciation. Tatsam words are typically more formal and are often preferred in modern "Pure Hindi" contexts. An example is nāma ("name"), which is identical to its Sanskrit source. These words form the backbone of technical and formal Hindi vocabulary.
2. Ardhatatsam Words (अर्धतत्सम). The name means "half-Sanskrit," and these words are semi-Sanskrit borrowings that have undergone phonological changes as they were transmitted through Hindi's history. They're partially Sanskritized but show signs of adaptation to Hindi phonology. An example is sūraj ("sun"), which comes from Sanskrit sūrya but has been simplified and adapted. These represent an intermediate stage between pure Sanskrit and fully evolved Hindi words.
3. Tadbhav Words (तद्भव). These are native Hindi words that evolved naturally from Sanskrit through intermediate Prakrit languages. The word itself means "born from that," reflecting the idea that these are the natural descendants of Sanskrit roots. An example is kām ("work"), which evolved from Sanskrit karma through regular sound changes over centuries. These words feel native to Hindi speakers because they've undergone the most phonological change and are deeply integrated into everyday speech.
4. Deshaj Words (देशज). These are indigenous creations that originated within India but are not derived from Sanskrit, Persian, Arabic, or other external sources. They include onomatopoetic words and other native creations. These words represent vocabulary that developed independently within Hindi-speaking communities and don't have clear etymological sources outside the language.
5. Videshī Words (विदेशी). The name means "foreign," and these are loanwords adopted from languages outside India. This category includes borrowings from Persian, Arabic, English, Portuguese, and other languages. An example is qila ("fort"), borrowed from Persian. In modern Hindi, especially with globalization, English loanwords have become increasingly common.
Understanding these categories matters because they help explain vocabulary differences between Hindi and Urdu, and they show how Hindi draws on multiple linguistic layers of history.
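The five categories can be treated as a simple classification scheme. The Python sketch below records only the examples given in the list above. No Deshaj example appears there, so none is invented here. The names `Origin`, `LEXICON`, and `origin_of` are hypothetical.

```python
import unicodedata
from enum import Enum
from typing import Optional


class Origin(Enum):
    TATSAM = "tatsam"            # unchanged Sanskrit borrowing
    ARDHATATSAM = "ardhatatsam"  # partly adapted Sanskrit borrowing
    TADBHAV = "tadbhav"          # inherited via Prakrit, with regular sound change
    DESHAJ = "deshaj"            # indigenous, non-Sanskrit creation
    VIDESHI = "videshī"          # foreign loanword


def _key(word: str) -> str:
    """Normalise so that differently encoded diacritics compare equal."""
    return unicodedata.normalize("NFC", word)


# Examples taken from the category descriptions above.
LEXICON = {
    _key("nāma"): Origin.TATSAM,
    _key("sūraj"): Origin.ARDHATATSAM,
    _key("kām"): Origin.TADBHAV,
    _key("qila"): Origin.VIDESHI,
}


def origin_of(word: str) -> Optional[Origin]:
    """Return the recorded category, or None if the word is not in the lexicon."""
    return LEXICON.get(_key(word))


print(origin_of("kām"), origin_of("qila"))
```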
Loanword Sources: Persian, Arabic, English, and Portuguese
Persian and Arabic Borrowings
Persian influence. Many Persian loanwords entered Hindi during the Delhi Sultanate period (beginning in the 13th century), when Persian was the language of administration and culture. These borrowings include religious terms like Islām and administrative vocabulary related to governance and court life.
Arabic origin. Many words borrowed from Persian are themselves originally Arabic, having entered Persian and then Hindi through the Persian channel. Examples include muśkil ("difficult") and kitāb ("book"), both originally Arabic but transmitted through Persian. This layering of borrowing—from Arabic to Persian to Hindi—is common in South Asian linguistic history.
English and Portuguese Contributions
English loanwords. Modern Hindi continues to borrow from English, particularly for technological and scientific terms. Sometimes Hindi speakers adopt the English word directly (like ṭeliphōn for "telephone"), and sometimes they create calques—structures where they translate the meaning but use Hindi roots (like dūrbhāṣ meaning "far-speech" for "telephone"). The choice depends on context and register; more formal or "Pure Hindi" contexts tend to prefer calques, while everyday speech may use direct borrowings.
Portuguese heritage. Portuguese loanwords entered Hindi during the period of Portuguese colonial presence in India, though these are less numerous than Persian, Arabic, or English borrowings.
Pure Hindi (Śuddh Hindi) and Neologisms
An important movement in modern Hindi has been the promotion of Śuddh Hindi ("Pure Hindi"), which aims to replace foreign loanwords with neologisms (newly created words) formed from Sanskrit roots. For example, rather than adopting the English word "telephone" directly, advocates of Pure Hindi prefer dūrbhāṣ, literally meaning "far-speech," which uses native Sanskrit-derived elements. This represents a conscious effort to develop Hindi's vocabulary capacity while maintaining its connection to Sanskrit heritage.
This movement reflects broader questions about language purity and authenticity in post-colonial India and shows how vocabulary choices can be politically and culturally significant.
Hindi and Urdu: Linguistic Relationship
Mutual Intelligibility and Shared Foundation
An essential fact about Hindi and Urdu is that they are two standardized registers of the same language, called Hindustani. This means:
They are mutually intelligible in spoken form. A Hindi speaker and an Urdu speaker can generally communicate and understand each other in conversation, even though their formal written forms differ significantly.
They share core grammar and phonology. The basic structure of how sentences are formed, how verbs conjugate, and how the phonological system works are fundamentally the same.
They share a common core vocabulary, especially in everyday speech dealing with basic concepts, family, daily activities, and natural phenomena.
The differences between Hindi and Urdu are real and important, but they're more like differences between formal and informal registers or different literary traditions of a single language than like differences between separate languages.
Key Differences: Scripts and Formal Vocabulary
Script. The most obvious difference is that Hindi uses Devanagari while Urdu uses the Perso-Arabic script. This visual difference is immediately apparent and significant for literacy and cultural identity.
Formal vocabulary choices. Beyond script, the registers differ in which etymological categories they draw from:
Hindi incorporates a higher proportion of direct Sanskrit-derived tatsam words, particularly in formal, technical, and literary contexts. This reflects Hindi's connection to Sanskrit tradition and modern nation-building efforts to "Sanskritize" vocabulary.
Urdu contains more Arabic and Persian loanwords and uses fewer Sanskrit terms, reflecting its historical development under Persian-speaking rulers and its cultural ties to the Islamic world and Persian literature.
However, these are tendencies in formal speech and writing, not absolute rules. In everyday conversation, both Hindi and Urdu speakers use a similar mix of words.
Flashcards
How many oral and nasalised vowels are present in the Hindi vowel inventory?
Ten oral and three nasalised vowels.
Which dental and retroflex sounds does Hindi contrast that Urdu speakers often merge?
Dental $[t]$ and $[d]$ versus retroflex $[ʈ]$ and $[ɖ]$.
Which Persian and Arabic loan phonemes are often substituted by Hindi speakers, and what are their typical substitutes?
$[x]$ is substituted with $[kʰ]$
$[z]$ is substituted with $[dʒ]$
$[ɣ]$ is substituted with $[g]$
$[q]$ is substituted with $[k]$
How is the labiodental approximant $/ʋ/$ typically realized in Hindi on-glide positions?
As a glide $[w]$.
Which two specific sibilants does Hindi use that Urdu often merges into $[ʃ]$?
श $(ś)$ and ष $(ṣ)$.
What is the inherent vowel represented with consonants in Devanagari?
$/a/$.
Which sign is used in Devanagari to mute the inherent vowel of a consonant?
A halant sign.
What are the three common romanisation schemes used for Devanagari characters?
IAST
ITRANS
ISO 15919
In what primary ways do Hindi and Urdu differ despite sharing core vocabulary and grammar?
Lexical borrowings and script.
In transliteration, which Latin letters are commonly used for the sounds क़, ख़, and फ़?
$q$, $kh$, and $f$.
What is the name of the language of which Hindi and Urdu are both standardized registers?
Hindustani.
Which script is used for Urdu as opposed to the Devanagari used for Hindi?
Perso-Arabic script.
What are Tatsam words in the context of Hindi vocabulary?
Words spelled identically to their Sanskrit sources (e.g., nāma).
What are Tadbhav words in Hindi?
Native derivatives evolved from Sanskrit through intermediate Prakrit forms (e.g., kām).
What is the origin of Deshaj words in Hindi?
Indigenous creations or onomatopoetic terms not derived from Sanskrit or foreign languages.
What are Videshī words in Hindi?
Foreign loanwords from languages like Persian, Arabic, English, and Portuguese.
During which historical period did early Persian borrowings enter Hindi?
The Delhi Sultanate.
Quiz
Question 1: Which consonant contrast does Hindi maintain that many Urdu speakers often replace with dental equivalents?
- Dental [t] and [d] versus retroflex [ʈ] and [ɖ] (correct)
- Voiced aspirated versus voiceless aspirated stops
- Bilabial versus labiodental fricatives
- Palatal versus velar nasals
Question 2: What inherent vowel is associated with Devanagari consonant symbols, and how can it be suppressed?
- The vowel /a/; it can be muted with a halant sign (correct)
- The vowel /i/; it can be muted with a virama
- The vowel /u/; it can be muted with a chandrabindu
- The vowel /e/; it can be muted with a nukta
Question 3: Which etymological category comprises Hindi words that are spelled exactly like their Sanskrit originals?
- Tatsam (correct)
- Ardhatatsam
- Tadbhav
- Videshī
Question 4: When Hindi speakers encounter the Urdu phoneme /q/, which Hindi sound is typically used as a substitute?
- [k] (correct)
- [kʰ]
- [g]
- [dʒ]
Question 5: Which of the following is a recognized romanisation system for transcribing Devanagari Hindi into Latin script?
- IAST (correct)
- IPA
- ASCII
- Unicode
Question 6: Which of the following is an example of a pure Hindi neologism formed from Sanskrit roots?
- dūrbhāṣ (“telephone”) (correct)
- kitāb (“book”)
- muśkil (“difficult”)
- ṭeliphōn (borrowed from English)
Question 7: What two factors primarily distinguish Hindi from Urdu despite their shared core vocabulary and grammar?
- Different scripts and distinct lexical borrowings (correct)
- Different phoneme inventories and verb conjugations
- Separate dialect origins and unrelated word order
- Variations in tone and stress patterns
Question 8: In common transliteration of Hindi-Urdu, which Latin letter represents क़?
- q (correct)
- k
- kh
- x
Question 9: Which language contains a higher proportion of Arabic and Persian loanwords?
- Urdu (correct)
- Hindi
- Sanskrit
- English
Key Concepts
Phonology
Hindi phonology
Urdu phonology
Mutual intelligibility of Hindi and Urdu
Writing Systems
Devanagari
Romanisation of Hindi
Vocabulary
Hindi–Urdu vocabulary
Tatsam words
Tadbhav words
Persian loanwords in Hindi
Hindustani language
Definitions
Hindi phonology
The system of sounds in Hindi, including its ten oral vowels, three nasalised vowels, and a 33‑consonant inventory with aspirated and retroflex series.
Urdu phonology
The sound system of Urdu, characterized by additional loan phonemes from Persian and Arabic and the replacement of certain retroflex sounds with dental equivalents.
Devanagari
The left‑to‑right abugida used for writing Hindi, featuring an inherent vowel /a/ that can be muted with a halant sign.
Romanisation of Hindi
The set of transliteration schemes (e.g., IAST, ITRANS, ISO 15919) that map Devanagari characters to Latin script, using either diacritics (IAST, ISO 15919) or plain ASCII letters (ITRANS).
Hindi–Urdu vocabulary
The shared core lexicon of Hindi and Urdu, differentiated by Sanskrit‑derived tatsam words in Hindi and Persian/Arabic loanwords in Urdu.
Tatsam words
Sanskrit‑derived terms in Hindi that retain their original spelling and meaning, such as nāma (“name”).
Tadbhav words
Native Hindi words that have evolved from Sanskrit through intermediate Prakrit forms, like kām (“work”).
Persian loanwords in Hindi
Lexical items borrowed from Persian during the Delhi Sultanate, many of which entered Hindi with religious or administrative meanings.
Mutual intelligibility of Hindi and Urdu
The phenomenon whereby spoken Hindi and Urdu are largely understandable to each other despite script and lexical differences.
Hindustani language
The lingua franca of northern India and Pakistan, encompassing the standardized registers of Hindi and Urdu.