What is this blog?

"Words and sounds carry histories with them. Not only their own histories, but those of people who have uttered those words."- Me aka Yash.
I pay attention to people speaking. Their choice of words, their choice of pronunciation. And whenever I do hear something which I do not use, I feel obliged to attribute this different choice of words or sounds to history.

This blog is a linguistic record of my world, the sounds I hear and the letters I read, from all the languages I come across.

PS: I am a high school student, and not a linguist, so take what I have to say with a grain of salt.

Sunday, 28 July 2013

Is Devanagari really all that scientific?

Forewarning: This is going to be a slightly complicated post, dealing with specific phenomena, and not generalizations, so don't read it when you're tired.

I have finally accepted that my blog doesn't have an international audience, so I am going to assume that all my readers are Indians and talk about a language that quite a few Indians know: Hindi. I am guessing that if you have ever studied Hindi in school, or even just heard anyone talk about it, you will have come across the idea that Devanagari, the script in which Hindi is written, is a perfectly 'scientific' script. Very often, this idea is juxtaposed with the idea that English has a completely 'unscientific' script. Today, I am going to look at how much truth these claims hold and examine a related phenomenon which occurs in Hindi.

By 'scientific', what these people mean is a phonemic script, i.e. a script in which every sound is represented by a single letter and every letter represents only one sound. Now, we know that is definitely not English. So yes, English is written with what is called an 'unscientific' script. But what about Hindi?

While Hindi is more or less phonetic, it is not completely so. For example, we know that any consonant with no vowel mark is pronounced with an 'a' sound. This 'a' sound is called the schwa and is written ə in phonetic notation; I have used 'a' to represent it in this post. Now, we know that क is pronounced 'ka' and म is pronounced 'ma'. So is कम pronounced 'ka-ma'? No, it isn't. At least not in Hindi. It's 'kam'. The schwa after m disappears. You'd think that this only happens at the end of words, but that isn't so either. Consider दशरथ. It isn't pronounced 'da-sha-ra-tha'. It isn't even 'da-sha-rath'. It's 'dash-rath'. And it isn't that the schwa disappears everywhere: it remains between d and sh, and it remains between r and th. This disappearance of the schwa is called schwa syncope (or schwa deletion) and is one of the very few irregularities of the Devanagari script. Interestingly, even the word Devanagari (देवनागरी), which is pronounced 'dev-naagri', shows this phenomenon.

Now the question which arises is: where does the schwa disappear and where does it not? According to a few Internet sources, there isn't any rule which completely describes this process; there are 2 rules which describe it partially. I feel that by adding just one more rule, all instances of schwa deletion can be explained. I have presented all 3 rules below. I haven't been able to find any exceptions to these rules. If you do, let me know:

1. Schwa is deleted at the end of words unless the word is made up of only one letter, e.g. न 'no'. (This is obviously because it's not possible to pronounce the consonant without the vowel, which is schwa in this case.)

2. Schwa is deleted between two consonants if the first consonant is preceded by a vowel and the second is followed by a vowel, i.e. the structure should be VC(schwa)CV, where C=consonant, V=vowel. This rule is processed from left to right, and it applies after the first rule has been applied. Let us see what this means:
  - Take दशरथ again. Written out in full it is 'dasharatha'. The 'a' at the end is eliminated first, because the first rule has to be applied first, giving 'dasharath'. Now look at the first schwa from the left, since this rule applies from left to right: the 'a' between 'd' and 'sh'. Does this 'a' occur in the format VC(a)CV? It is surrounded by two consonants ('d' and 'sh'), and the 'sh' is even followed by a vowel, but 'd' is not preceded by a vowel. Thus the structure is C(a)CV, not VC(a)CV, and this schwa stays. But the next 'a' does fit the rule. It is preceded by 'sh' (C), which is itself preceded by 'a' (V), and it is followed by 'r' (C) and then 'a' (V). Thus, this schwa is removed, and the word is now 'dashrath'. The next 'a' does not satisfy the rule: it is preceded by two consonants ('sh' and 'r'), because we deleted the 'a' in between them. Moreover, it is followed only by the single consonant 'th', and no vowel follows 'th', since we deleted that at the very beginning by applying the first rule. Hence the pronunciation of the word is 'dashrath'.
  - Similarly, take सूरत, 'surata'. First step, cut out the last 'a': 'surat'. The only other schwa does not fit the format VC(a)CV, because no vowel follows the 't', and therefore it cannot be deleted. So the word is pronounced 'surat'.
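
Rules 1 and 2 can be sketched in a few lines of Python. This is only an illustration of the rules as stated above, not a real transliterator: the function name `delete_schwas`, the hand-romanized sound lists and the small vowel inventory are assumptions made for this example, and every inherent 'a' has to be written out by hand.

```python
# Sketch of rules 1 and 2. A word is a list of romanized sounds with every
# inherent schwa written out as "a" (assumed notation, as in this post).
SCHWA = "a"
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ai", "o", "au"}


def is_vowel(sound):
    return sound in VOWELS


def delete_schwas(sounds):
    out = list(sounds)

    # Rule 1: drop a word-final schwa, unless the word is a single letter
    # (one consonant plus its schwa, like ["n", "a"]).
    if len(out) > 2 and out[-1] == SCHWA:
        out.pop()

    # Rule 2: scan left to right and drop a schwa in the frame V C (a) C V.
    i = 0
    while i < len(out):
        in_frame = (
            out[i] == SCHWA
            and i >= 2
            and i + 2 < len(out)
            and is_vowel(out[i - 2])
            and not is_vowel(out[i - 1])
            and not is_vowel(out[i + 1])
            and is_vowel(out[i + 2])
        )
        if in_frame:
            del out[i]  # later checks see the word with this schwa gone
        else:
            i += 1
    return out


dasharatha = ["d", "a", "sh", "a", "r", "a", "th", "a"]
print("".join(delete_schwas(dasharatha)))  # dashrath
```

Fed the other words from this post, the same function gives 'kam' for कम and 'surat' for सूरत, and leaves न as 'na'.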

I hope I've been able to explain this complicated rule to you. This was the hardest part of understanding the process. Rule number 3 is:

3. When suffixes are added to a word, they are considered separate words for the purpose of schwa deletion. Let me explain. Take the word कर (kar), for example, which means 'do'. Then take the suffix ता (taa), which denotes regular action, something like the English simple present tense. So करता (है) means '(he/she/it) does'. Now if the schwa deletion rules are applied to करता, कर and ता will be considered separate words. Let us understand the implications of this rule.

- Take the word सरक, which means 'shift' or 'slip'. By applying the above two rules, the pronunciation of this word is 'sarak'. Now, suppose we add a suffix to it and make it सरकता. If we consider सरकता as one single word, then by the schwa rules we will have to delete the schwa between 'r' and 'k' and retain every other schwa. So the pronunciation of the word would be 'sarkataa'. This is obviously not the case. The word is pronounced 'saraktaa'. If, however, we consider सरक and ता to be different units and apply the rules separately to each, we get sarak+taa=saraktaa, which is the correct pronunciation.
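
The effect of rule 3 can be sketched in Python as well. The helper `apply_rules` is a compact restatement of rules 1 and 2, and `pronounce` is a hypothetical name for this example. The split into stem and suffix is done by hand here; the code makes no attempt to find the boundary itself.

```python
# Sketch of rule 3: apply rules 1 and 2 to stem and suffix separately.
# Sounds are romanized by hand, with every inherent schwa written as "a".
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ai", "o", "au"}


def apply_rules(sounds):
    out = list(sounds)
    # Rule 1: final schwa goes, except in a one-letter word like ["n", "a"].
    if len(out) > 2 and out[-1] == "a":
        out.pop()
    # Rule 2: left to right, drop a schwa in the frame V C (a) C V.
    i = 2
    while i + 2 < len(out):
        if (out[i] == "a"
                and out[i - 2] in VOWELS and out[i - 1] not in VOWELS
                and out[i + 1] not in VOWELS and out[i + 2] in VOWELS):
            del out[i]
        else:
            i += 1
    return out


def pronounce(*units):
    # Each unit (stem, suffix, ...) is treated as a separate word.
    return "".join("".join(apply_rules(unit)) for unit in units)


stem = ["s", "a", "r", "a", "k", "a"]   # सरक
suffix = ["t", "aa"]                    # ता

print(pronounce(stem + suffix))   # whole word at once: sarkataa
print(pronounce(stem, suffix))    # stem and suffix apart: saraktaa
```

For कर + ता both calls happen to give 'kartaa', so the difference only shows up with longer stems like सरक.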

Thus, this one irregularity of the Devanagari script can be explained by a series of rules. I have not verified whether these rules actually hold true for all cases, so don't just take my word for it. There are a few other irregularities in writing Hindi in Devanagari, but I believe this is the most important and prevalent one. Hopefully, this post wasn't too tedious. Next time, I will choose a topic which doesn't require so much technical explanation.



  1. well analyzed. Looking forward to more.

  2. Always wondered about the pronunciation of the word महल (and similar words with ह in the middle). Why isn't it pronounced ma-ha-la? In spoken Hindi, it almost sounds like we put an ए matra with both म and ह .

    1. Hello Anshul.
      Yes, that is true. It is actually closer to the sound that "e" represents in English "bet". In fact, this happens wherever there is "aha" in a word. For example, "bahan" is actually pronounced "behen."
      In fact, "h" triggers all kinds of other sound changes in Hindi. The word "bahut", for example, is pronounced "bohot" (sort of). I've tried to work out on my own all the environments in which "h" causes changes, but I haven't really been able to come up with any general rules (except that "aha" becomes "ehe").
      Thanks for your comment!

  3. Hello Yash,

    I’m glad you have discussed this “scientific” tag for Devanagari, similar to the myth that Sanskrit is the mother of all languages. I hope you don’t mind if I point out that the 2nd para of your post is not entirely accurate. The letters of the English alphabet actually DO represent a single sound most of the time, while it is the Devanagari alphabet that represents (by default) TWO sounds (CV), at least w.r.t. the consonants.

    The reason why we presume that the English alphabet does not represent a single sound is because letters such as A and C and G have various physical representations. Consider the English letters as having allophonic features, like phonemes have allophones. Just as the English phoneme /p/ has the allophones [p] and [pʰ], the letter C in English can represent the sounds [s] or [k]. In fact, there is a rule for when to use [s] and [k]: [k] when the C is followed by the vowels a, o and u, and [s] when the C is followed by the vowels e, i and y. This rule holds true with words of Latin origin (with some exceptions, of course). The same rule goes for G. In the English alphabet, one can even group letters together to represent a single sound, e.g. C+H = [tʃ] and T+H = [θ] or [ð]. Because English has borrowed from so many languages throughout its history (and continues to borrow copiously), many of these rules have to be tossed out the window. Unfortunately, scripts rarely keep pace with language change.

    As for Devanagari, it is actually an alpha-syllabic script, also called an Abugida script. The alphabet part is similar to the Greek and Latin scripts where each symbol or glyph denotes a particular individual sound, not a cluster of sounds. However, alphabetic scripts generally have separate symbols for consonants and vowels, like English. Devanagari is consonant based, i.e. each symbol represents a C+V, where the vowels are either inherent in the symbol (with the schwa), or are marked by diacritics like the matras (except when vowels appear singly). This is why it is considered an abugida script. The syllabic part of Devanagari is, of course, that each letter of the alphabet is by default a complete syllable since a vowel is either stand-alone, inherent or marked (V or CV or C+diacritic). If we wish to elide the vowel, we have to either use a diacritic (halant) like क् or join two consonants together, like क्श.

    W.r.t. schwa elision, the phenomenon seems quite complex. When one elides a vowel, one is changing the syllabic structure of the morpheme; this can affect semantics, so morphemic integrity would be a consideration. Also, I think the general prosodic and rhythmic integrity of a particular language cannot be compromised, so morpho-phonological rules, syllabic and sentence-level stress, are also aspects to be considered.

    Best regards