Friday, October 28, 2005

The All India Alphabet

My sister and I spent last Sunday evening altering a medieval costume with several yards of fabric in the skirt, with the intention of creating Cinderella's ballgown. We had introduced a hoop and crinoline, and had flounces pinned up halfway round the hemline, when we decided to adjust the length. I removed the pins, let the flounces drop and considered starting over again with a new design.

I kid you not - as I later settled in for a few minutes of reading to relax after the sewing, I picked up Colloquial Hindustani, which I had just acquired from a used book store. The book fell open to this page:

"In this simple romanic orthography, Hindustani, though appearing in clothes of new design, is still dressed in a national costume which fits it well, whereas in the usual European transliterations and transcriptions bristling with dots, dashes, and other diacritical marks, which do not really belong to the letters, it looks like a man who has lost his own clothes and has to make shift with an ill-fitting borrowed suit, pinned up here, let down there. To remove the pins and drop the fussy alterations, leaving Hindustani in the bare roman alphabet, is a great temptation to the European. And we know that the dots and dashes do, in fact, tend to wear off. The feeling that diacritics are extraneous to the roman alphabet is very strong indeed among people who do not require them. It is a sound instinct.

Unfortunately the bare unaltered roman alphabet is inadequate for the representation of Indian languages. ...

The alternative is the addition of a minimum number of extra letters of roman type, already well established, enabling us to frame a consistent Indian alphabet. The Indian roman alphabet then takes its proper place in grammars and dictionaries. A grammatical roman spelling is established, in which all Indians can practise literacy without shame, and which opens the door to easier learning of Indian languages by foreigners of all the continents. It is the method followed in this book." p. xi

I offer this as a bit of history and as one possible opinion on diacritics. It is a colourful piece of writing and a perspective on why a certain orthography may or may not gain favour.

Today another roman orthography is often used for transliteration in India, ITRANS. It also lacks diacritics by virtue of being accessible in an ASCII encoding for use on the QWERTY keyboard. Just the unadorned roman alphabet. Even though many sounds require several letters to type, it may still be faster than using the shift key.

I do know that diacritics were used for many languages where the main consideration was to be able to use a typewriter, rather than have to get extra letters for printing. Orthographies are tied to the preferred technology of the day more often than not.

The additional characters of the All India Alphabet in the image above are all available in Unicode in the IPA extensions. As I said, I am posting this quote for historic interest only. I have no particular opinion on roman orthographies in India.

Harley, A. H. Colloquial Hindustani. Introduction by J.R. Firth. 1943. Kegan Paul, Trench, Trubner & Co. London.


Anonymous said...

Your remark that "orthographies are tied to the preferred technology of the day more often than not" really hits the nail on the head. A tangential example of this is that China, and much more recently Taiwan, has switched the format for government documents from top-to-bottom to left-to-right. I'm also reminded of the frequent assertion that computers have "saved" Chinese characters, with Chinese-character typewriters having been too cumbersome and complicated for popular usage. (In reality, computers may be contributing to the decline of Chinese characters because people have become so used to relying on their help for the writing of characters that they are increasingly unable to recall how to write characters without aid. This is occurring at all levels of society.)

The question of how many diacritics is "too many" is an interesting one. For Mandarin, people can avoid all diacritics by using Gwoyeu Romatzyh, which features tonal spellings rather than tone marks. (For an example, see this selection from the story of Humpty Dumpty.) But this is a complicated system that I believe greatly increases the chance of spelling errors.

On the other hand, some have objected to Hanyu Pinyin, saying that using all those tone marks would make the orthography look more like chicken scratches than a regular writing system (whatever that is). That Chinese characters are vastly more visually complicated than even the most extended of alphabets doesn't seem to enter into some people's equation, though.

I tend to think that the best system for native speakers would be the least complicated, thus omitting most diacritics. As such, Mandarin would need fewer marks than, say, French or Czech routinely uses.

I'd like to learn more about the use of pointed/unpointed text with the Hebrew and Arabic scripts. I wonder if you or any of your readers know of studies on the efficiency (for reading, but also for writing, if available) of using or omitting diacritics with these scripts. Or perhaps the partial omission of diacritics.

10:15 PM  
Anonymous said...

Mark said,

"In reality, computers may be contributing to the decline of Chinese characters because people have become so used to relying on their help for the writing of characters that they are increasingly unable to recall how to write characters without aid."

Hi Mark,

This is very interesting and explains why handwriting recognition is so popular as an input method in Vancouver. Parents and teachers want children to learn to write each character correctly.

On your other point about diacritics - I don't really like them because I hated having to write accents for classical Greek - it seemed downright punitive at the time. However, I do accept accents as a normal part of French and German orthography.

I myself think that artificially imposing them is a lost cause. But if they are there already traditionally and people identify with them - okay. I think there are many good arguments for a little underdifferentiation at the phonemic level if it doesn't create overwhelming ambiguity at the lexical level.


5:41 PM  
michael farris said...

I lurve diacritics. In old Greek, yeah they don't make much sense anymore, but I like the simple word stress marking in modern Greek. I also love haceks and cedillas and tildes and macrons.

As for Indian languages, modern standard Hindi has its share of diacritics: f and z (ph and j with dots) are common, but so are dotted k, kh and gh (for Arabic/Persian vocabulary), dotted retroflex d and dh for the flap variants of those, and a 'new' vowel for an English 'o' (aa with a breve) (as in doctor). I've seen another diacritic for ae (aa with a tilde) but I don't think it's standard.

10:46 AM  
michael farris said...

For your amusement (or not), a short simple text in my own Hindi (hindee) romanization. It's a hybrid that uses film title conventions, but consistently, and with very few diacritics, mainly the subscript dot for retroflexes. Final n represents nasalization after a diphthong or long vowel and /n/ after a short vowel. Note that the most diacriticized word is a borrowing from English. I don't know if it'll survive the transformation to HTML but here goes.

Raam ke parivaar men maataa aur pitaa hain, Raam ke pitaa-jee ke maataa-pitaa yaanee daadaa-jee aur daadee-jee aur Raam kee bahan – Manju. Raam ke pitaa-jee ek aspataal men ḍŏkṭar hain. Raam kee maataa-jee ek skool men adhyaapikaa hain. Yah skool laḍkiyon kee hai. Raam kee maataa-jee ke skool men laḍke naheen hain. Raam aur Manju skool ke bacce naheen hain. Ve vishvavidyaalay ke chaatr hain.
Yah Raam kaa ghar hai. Ghar par daadaa-jee aur daadee-jee hain. Raam ke pitaa-jee aur maataa-jee kaam par hain. Raam aur Manju vishvavidyaalay men hain.
Ghar men caar kamre hain, ek rasoighar aur ek gusalkhaanaa. Kamron aur rasoighar men khiḍkiyaan hain, gusalkhaane men khiḍkee naheen hai.

9:24 AM  
Anonymous said...

Hi Michael,

Once again good for Firefox. I do see the subscript dots. This reminds me of the two different roman orthographies for Cree. One uses double vowels and the other accents. Of course, in the one with double vowels the words are 25% longer but have no diacritics. I am wavering on this issue. There must be a happy medium. Certainly the subscript dots seem attractive - not too frequent. Thanks for this example.

11:10 PM  
michael farris said...

Okay, here are some more examples: a more advanced paragraph with a bunch of diacritics (355 characters with spaces).

Bhāratvarṣ mẽ log holī kā tyohār baḍī ghumghām se manāte hãi. Holī phagun mahīne mẽ hotī hai. Phāgun kī purṇimā kī rāt me holikā kā dahan hotā hai. Ham kah sakte hãi ki holikā jalāne se ās-pās kā vātāvaraṇ śuddha ho jātā hai. Agle din log ek-dusre par rang ḍalte hai aur khuśī manāte hãi. Satrī-puruṣ, bacce-booḍhe, sabhī log prem ke rang mẽ ḍub jāte hãi.

The same paragraph in the same system that I showed you earlier (396 characters):

Bhaaratvarṣ men log holee kaa tyohaar baḍee ghumghaam se manaate hain. Holee phagun maheene men hotee hai. Phaagun kee purṇimaa kee raat me holikaa kaa dahan hotaa hai. Ham kah sakte hain ki holikaa jalaane se aas-paas kaa vaataavaraṇ shuddha ho jaataa hai. Agle din log ek-dusre par rang ḍalte hai aur khushee manaate hain. Satree-puruṣ, bacce-booḍhe, sabhee log prem ke rang men ḍub jaate hain.

Finally, a slightly tweaked version of that, taking advantage of Hindi phonotactics to some degree (the rules for decoding are more complicated, but the overall appearance is, I think, more aesthetically pleasing) (379 characters):

Bhaaratvarṣ men log holi ka tyohaar baḍi ghumghaam se manaate hain. Holi phagun maheene men hoti hai. Phaagun ki purṇima ki raat me holika ka dahan hota hai. Ham kah sakte hain kì holika jalaane se aas-paas ka vaataavaraṇ shuddha ho jaata hai. Agle din log ek-dusre par rang ḍalte hai aur khushi manaate hain. Satri-puruṣ, bacce-booḍhe, sabhi log prem ke rang men ḍub jaate hain.
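
For readers who want to see how the first two versions above relate, here is a minimal Python sketch that rewrites the macron-and-tilde spelling in the doubled-letter spelling. The substitution table is only inferred from comparing the two paragraphs (it is an assumption, not the commenter's own rule list), the function name to_doubled_letters is made up for illustration, and capital marked letters and any marks that do not occur in the sample are not covered.

```python
import unicodedata

# Substitutions inferred by comparing the two sample paragraphs.
# This table is an assumption, not the commenter's own rule list.
# "ãi" comes first so that it is handled as a unit (hãi -> hain).
SUBSTITUTIONS = [
    ("ãi", "ain"),
    ("ẽ", "en"),
    ("ā", "aa"),
    ("ī", "ee"),
    ("ś", "sh"),
]


def to_doubled_letters(text: str) -> str:
    """Rewrite the macron-style spelling in the doubled-letter style."""
    # Normalise to NFC so precomposed letters and letter + combining mark
    # sequences are treated the same way.
    result = unicodedata.normalize("NFC", text)
    for marked, plain in SUBSTITUTIONS:
        result = result.replace(unicodedata.normalize("NFC", marked), plain)
    return result


print(to_doubled_letters("Holī phagun mahīne mẽ hotī hai."))
```

Letters with a subscript dot (such as ḍ, ṇ and ṣ) pass through unchanged, since both versions of the paragraph keep them.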

8:31 AM  
Anonymous said...

This is a gay site it doesn't even tell you how to say the alphabet

12:44 AM  
Anonymous said...

where's the alphabet at??

9:25 AM  
Anonymous said...

this sucks where the !#$& is the alphabet?

12:01 PM  
