Thursday, June 30, 2005

Tamil PuLLi revisited

N. Ganesan has responded to a comment I made in the Unicode list on Tamil consonants with pulli.

Suzanne McCarthy wrote:

It is also interesting to note that Isaac Taylor 1883 in The Alphabet represented Tamil as an alphabet. The consonant plus pulli was shown as the basic unit unlike all other Brahmi derived scripts.

N. Ganesan wrote:

Thanks for the reference. Very interesting info. Perhaps, C. Beschi (a Jesuit missionary, 17th century) mentions this in his books also (have to check). Indeed, in Tamil grammars (ancient Tolkaappiyam or 12th century Nannuul) consonants with puLLi are said to be the basic units. This data is seen in Tamil dictionaries also in contrast to Hindi dictionaries.

Typically, in any non-Tamil script of India will not depict virama in sentences, but any Tamil text will be full of Pulli. That's the reason Tamil has no conjuncts.
ISCII (foll. it, Unicode) has put the Hindi consonant model upon Tamil.

S. McCarthy wrote:

Diderot's encyclopedia, 1750, portrays Tamil as a syllabary, once again, unique representation among all Brahmi scripts.

N. Ganesan wrote:

Will be interesting to see which early European books mention consonants with puLLi as basic units of Tamil. Taylor, of course. Beschi??

Tamil Grantha code page can have ka, kha, ga, gha, nga, ... as consonants because it's used to write Sanskrit. But Tamil language defines k, ng, c, ny, ... as consonants, so ideally Tamil code page should have had them as consonants. http://www.unicode.org/charts/PDF/U0B80.pdf Consonants: TAMIL LETTER K, TAMIL LETTER NG, TAMIL LETTER C, TAMIL LETTER NY, and so on.
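
As a rough illustration of the point about the code chart, here is a minimal Python sketch using the standard unicodedata module. It assumes the usual Unicode encoding of Tamil, where each consonant letter is named with its inherent vowel (TAMIL LETTER KA and so on) and the bare consonant has to be spelled as that letter followed by the pulli, which Unicode names TAMIL SIGN VIRAMA. The helper name with_pulli is my own, for illustration only.

```python
import unicodedata

PULLI = "\u0BCD"  # Unicode name: TAMIL SIGN VIRAMA

# Unicode letters for ka, nga, ca, nya; each carries the inherent vowel /a/.
LETTERS = ["\u0B95", "\u0B99", "\u0B9A", "\u0B9E"]


def with_pulli(letter: str) -> str:
    """Return the pure consonant: the base letter followed by the pulli."""
    return letter + PULLI


for letter in LETTERS:
    pure = with_pulli(letter)
    # The "basic unit" of Tamil grammar takes two code points in Unicode.
    print(unicodedata.name(letter), "->", pure, f"({len(pure)} code points)")
```

So what the Tamil grammars treat as the basic unit (k, ng, c, ny) is the longer sequence in the encoding, while the syllable with /a/ is the single code point.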

To illustrate: R. S. McGregor, The Oxford Hindi-English dictionary, OUP, 1995 shows ka, kha, ga, gha, ... as consonants on p. xvii.

OTOH, Tamil Lexicon, University of Madras, Vol. I (Reprint: 1982) p. lxviii has Transliteration table First vowels: a, aa, i, ii, ... Then consonants: k, ng, c, ny, T, N, t, n, p, m, ... (Ie., consonants have puLLi, so no inherent vowel).

A reference grammar of cl. Tamil poetry V. S. Rajam, American Philosophical Society, 1992 also defines Tamil consonants with PuLLi. Take R. Gruenendahl, South Indian Scripts in Sanskrit mss. and prints, 2001:Wiesbaden.

Grantha Tamil, Malayalam, Telugu, Kannada and nandinagari have consonants defined as abugidas with /a/. But Tamil alone (p. 44) has Consonants defined with Pulli.

N. Ganesan

Wednesday, June 29, 2005

Robert Bringhurst

Robert Bringhurst has recently published The Solid Form of Language, 2004, in which he presents a new model for classifying writing systems.

Beginning with the original relationship between a language and its written script, Bringhurst takes us on a history of reading and writing that begins with the interpretation of animal tracks and fast-forwards up to the typographical abundance of more recent times. The first four sections of the essay describe the earliest creation of scripts, their movement across the globe and the typographic developments within and across languages.

In the fifth and final section of the essay, Bringhurst introduces his system of classifying scripts. Placing four established categories of written language - semographic, syllabic, alphabetic and prosodic - on a wheel adjacent to one another, he uses the location, size and shape of points on the wheel to show the degree to which individual world languages incorporate these aspects of recorded meaning.

This wheel may owe something to the medicine wheel of the aboriginal peoples of North America. The four cardinal directions of the medicine wheel are defined here.

In addition to having his own taxonomic wheel for writing systems, Bringhurst writes about the Cree Syllabary and its use among the Inuktitut, Cree and Ojibway.

For instance, in talking about the artificial syllabary created for the Algonquian languages of the Cree and Ojibwa by James Evans in the 19th century, which was later adapted to Inuktitut (the unrelated language of the Inuit in the Canadian Arctic), he adds, 'Other enthusiasts based new scripts on Evans's principles, adapting the idea of rotating characters to suit the needs of Carrier and Chipewyan -- Athapaskan languages phonologically very different from Cree or Ojibwa.'

The point of Bringhurst's bringing up the Canadian syllabary is to contrast it with another artificial writing system, the Hangul script of Korea. Both systems were new, both were artificial; but in Korea, when Hangul was introduced in the 15th century, there was already a long tradition of writing using the Chinese script -- including a sophisticated calligraphic practice -- whereas the Cree and Ojibwa had not been in the habit of writing down their languages at all. Nor had the Inuit.

'In the Eastern Canadian Arctic,' writes Bringhurst, 'the Inuktitut versions of this script are now a major tool for administrative work and are used for literature as well. Yet after one and a half centuries of use, Canadian syllabics still have not developed a fluent cursive form nor a calligraphic tradition.' After pointing out that most people using the script today "write" it using a keyboard rather than a pen, he concludes, 'It remains to be seen how the script may now develop through the medium of digital design.' Then, responding to political developments in the Canadian North since the original version of this essay, he adds: 'It also remains to be seen what effect the creation of Nunavut will have upon this script. Inuktitut literature is old, and so is the tradition of Inuit independence, but an Inuktitut-speaking bureaucracy has never before existed.'

While Bringhurst comments correctly that no cursive form developed, I have seen syllabics being written fluently. I wonder if it is the very minimal nature of the shapes that has discouraged the development of a cursive form. I would be interested in asking Bringhurst what his thinking was behind this comment.

Bringhurst is himself responsible for designing a Cree font, which he used in writing a play, Ursa Major, in Latin, Greek, English and Cree.

Ursa Major was typeset in Giovanni Mardersteig’s Dante type with New Hellenic Greek and Robert Bringhurst’s Cree Syllabics. The play was written in English, Latin, Greek and Cree, and is presented on the page in two colours to capture the polyphonic aspect of the work as it has been performed.

Monday, June 27, 2005

The Cree Syllabarium

The Cree syllabary was invented by James Evans and has been used ever since by the Cree, Inuktitut, Oji-Cree, Ojibway and many other First Nations of Canada. Evans' writing system was invented some time between 1837 and 1841. In 1837 Evans took his Ojibway Speller and Interpreter in Indian and English to New York City to be printed.

This is a sample of text of the earlier Roman orthography taken from the speller at the Victoria University library in Toronto.

Uu-nend dus e-geoô u-ne-mo-sug ne-duu-nes-guu-de-
ze-oug, gu-ea oee-duu-goun-ga-oug, gu-ea uu-nend
dus e-geoô gr-je ne-bouu-guu-oug guu-oeen oee-duu-
goun-ga-zee-oug.

Words are divided into syllables by hyphens. I was able to observe Cree speakers, 15 years ago, still using a Roman orthography in an informal way that also broke words into syllables using hyphens. It was not unlike the orthography for Potawatomi seen here and discussed in my post on Potawatomi.

Evans was using a time-honoured and popular method of writing when he used hyphens to mark off each syllable, and arranged the syllables in tables to teach literacy in his speller of 1837. However, in 1841 he presented a system of syllables written with completely novel glyphs or character shapes.

In 1841 this copy of the Cree Hymnbook was printed from Evans' own press in Rossville, Manitoba. Sometime between 1837 and 1841 Evans was inspired to represent the syllables with a distinctive set of symbols.

It happens, however, that the Cree Syllabary belongs to a family of scripts based on the shape of the symbols. The Ramseyer-Northern Bible Society Museum Collection at the University of Minnesota Duluth has a display of bibles in "Constructed Alphabets." Three out of six of these scripts bear a distinct resemblance to each other: Moon, Pollard and Chippewyan (Ojibway).

The Pollard script was invented in 1905 and it is known that Pollard emulated Evans in using similar shapes and in representing syllables, although this was done on an entirely different principle.

The Moon Code, as it is known, was invented in England in 1843 by William Moon, who was himself blind. The Moon code was a full alphabetic orthography in which each symbol stood for a letter of the Roman alphabet. However, it is taught by organizing the symbols into an arrangement of similar shapes. In their organization, two illustrations, A Simplified Alphabet and Moon Code in Groups, bear an uncanny similarity to the original Cree syllabary.

Careful scrutiny reveals that Evans also retained the same order of the shapes, in the p, t, c and m series as the Moon system.

However, the problem remains that Evans invented his system between 1837 and 1841, and Moon invented his between 1841 and 1845. Evans, in Canada, preceded Moon; and Moon, in England, was blind. Neither is likely to have copied the other directly.

One answer is to look at the larger family of scripts being developed for the blind in England around that time. Moon's script was extremely popular and remains to this day an alternative to Braille. The other systems are less well-known.

Tiro Typeworks displays two forms of writing for the blind, by Frere and Lucas, that were published around 1837. We do know that Moon met with Frere in 1841 to discuss writing system design for the blind. This article indicates that the notion of orienting a symbol in different directions, either in a north, south, east, west configuration or in a north-west, north-east, south-east, south-west configuration, was already a design feature of these systems.

We may never know exactly which system inspired James Evans but we can at least be sure that he was tapping into an already recognized and accepted set of design principles, both in his representation of phonology as syllables, and in his selection and organization of symbol shapes.

Of these related systems, Canadian Aboriginal Syllabics (Cree Syllabary), Pollard (A-Hmao) and Moon are all still in use.

Writing System Typology

Edited List of Resources on Writing Systems Classification.

Bringhurst, Robert
Poser, William
Rogers, Henry
Sproat, Richard

Original Post Below

Warning: Writing system terminology on this site may be dangerous to your health. It is being used experimentally, experientially, momentarily, temporarily and sometimes even playfully.

"The main problem with blogs is that, as far as Google is concerned, they masquerade as useful information when all they contain is idle chatter," wrote Roddy.


I won't deny that this site is mostly idle chatter but I thought I would provide a little useful information in passing.


For a formal classification of writing systems visit the lecture hall of Richard Sproat. He provides a thorough and well-illustrated page on writing system typology, which he has subtitled "Crude Taxonomies". I have chosen to post Richard Sproat's page since a) it is the most detailed I can find, b) he has a very robust and current internet presence, c) I enjoyed his article on Indic scripts, and d) I asked him if I could. (I guess I didn't need to ask but I didn't want to walk in on his course uninvited.)

Andreas Goppold also offers a typology of writing systems. Andrew Wilson makes his notes available and so on. Gary Feng, whose blog, Shadow, I follow with interest, has created a list of books on writing systems. A full review of World's Writing Systems by Peter T. Daniels can be read here. There are also the books listed in the sidebar from Omniglot.

If anyone has a great internet resource on writing systems to add please comment.

Saturday, June 25, 2005

The Potawatomi Syllabary

On May 29th Omniglot added a new page on the Potawatomi (Bodewadmi) writing system.

At the bottom of the page the alphabet and its phonetic equivalents are displayed. This represents the Pedagogical writing system for Potawatomi and is the orthography now used for teaching those who speak English how to read Potawatomi. No other writing system is presented. Just the alphabet. There must be more to this than meets the eye. Indeed, there is.

Last year in a language forum I was challenged to present a syllabary that had evolved from an alphabet and I said "The Potawatomi Syllabary, of course." I was referring to the writing system used for Potawatomi in the last century.


"From the 1830's to the 1860's two Jesuits, Fr. Christian Hoecken and Fr. Maurice Gailland, were missionaries to the St. Mary's band of Potawatomi (the combined Prairie and Citizen's Bands). These missionaries developed a writing system that was taught as a syllabary. As you can see from Fig. 2 below, there are different symbols (letters) for consonants and vowels, but they are written in groups of two letters, where each group represents a single syllable. This system came to be known as the "ba-be-bi-bo-bu" syllabary."

This is from www.potawatomilang.org and their article on writing systems is too interesting to miss.
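
To make the idea of an alphabet "taught as a syllabary" concrete, here is a small Python sketch that lays ordinary Latin letters out as a consonant-by-vowel grid. The five vowels come from the name "ba-be-bi-bo-bu"; the consonant list is a stand-in of my own and is not the actual Potawatomi inventory.

```python
VOWELS = ["a", "e", "i", "o", "u"]

# Illustrative consonants only; NOT the real Potawatomi inventory.
CONSONANTS = ["b", "d", "g", "k", "m", "n"]


def syllabary_rows(consonants, vowels):
    """Build one row per consonant, pairing it with each vowel in turn."""
    return [[c + v for v in vowels] for c in consonants]


for row in syllabary_rows(CONSONANTS, VOWELS):
    # Each two-letter group stands for one syllable.
    print("-".join(row))
```

Nothing new is added to the alphabet here. The letters are the same and only the arrangement changes.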

Is it the alphabet or is it a syllabary, no wait, a syllabarium? It is organized as a syllabary. So is it a writing system of its own or just a configuration of the Latin alphabet?

One could be excused for thinking that the organization in a syllabary is an artefact of the last century. However, read on. Here is the Potawatomi alphabet organized in syllables. And here are some thoughts on Potawatomi where it is also organized in syllables. There is an ongoing loyalty to the organization of the alphabet into a syllabary.

Along with this presentation of the syllabary is an interesting reflection on the notion that there should be one right way to write a language.

"To say that any of these alphabets are more correct than any other would probably cause someone’s ego to swell, but we shall attempt it by giving credence to the one we have just written down. Let us say it is closer to the truth of the Potawatomi language in use today. I do not believe the old Potawatomi language can be written any other way than it was put down by so many of the old folks who attempted to convey certain thoughts to us, such as old medicine recipes, songs, stories, and prayers. Too, certain missionaries, priests, nuns, educators, and even members of the foreign military powers who conquered us wrote many facts down about our various languages.

My grandfather once told me there was no one correct way to write this language. It was in the mind of the one who wrote it down, but we have these folks with us today who think they know the truth of everything and so we go. " Donald A. Perrot

The early 1800's saw the emergence of many syllabic writing systems: Cherokee and Cree have their own symbols and have endured intact to this day. The Cree syllabary, now used by many First Nations in Canada and called Canadian Aboriginal Syllabics, can be viewed here where it is used for Inuktitut.

Friday, June 24, 2005

Yuen Ren Chao

Tonight I am rereading parts of Language and Symbolic Systems, 1968, by Yuen Ren Chao. There was a lively discussion about YRC and Gwoyeu Romatzyh as well as his Alice translation on Language Hat last fall and in an article on Pinyin Info. So no need to recap that.

Chao led the work on Romanization and advocated the use of the alphabet as a parallel system for Chinese.

Here are some of his thoughts on reading Chinese.

"Of the three sizes of units of writing: morphemic, syllabic, and alphabetic, the first involves an enormous number of symbols to learn, the second a lesser number and the third only a handful, which can be learned in a few hours. But it is one thing to teach or learn a system and another thing to use it.

As we have noted, reading is not by letters or by words but by much larger units. From this point of view a morphemic or a word-sign system of writing can be taken in faster than a system based on smaller units. One does to be sure take in English by words and sentences in one glance too, but since there is less individuality in the shapes of the letters, the words do not stand out so prominently as in a text of Chinese characters. In looking for something in a page of English you have to look for it, but in doing the same in a page of characters the thing looked for, if it is on the page, will stare you in the face.

In the language of communication theory each symbol in a character text, being one out of several thousand, carries more "information" than one in a small class of items. The simplest kind of system of writing consists of two words: 0 and 1 and all text consists of nothing but a succession of zeros and ones. Such a language will suit a computer but not the brain of a speaking and reading person. " p. 111-112

Chao advocates the use of the alphabet as a parallel system.

"On a comparable scale of invested interest, the very difficult system of Chinese writing, which will rate very low on most of the requirements – except that of elegance (in a sense) and except that of operational efficiency in terms of information per chunk – has not only served well the Chinese speaking people, but also several of the countries of Eastern Asia speaking various non-Chinese languages. It not only extends widely over space, but also over more than two millennia in time without substantial structural change. It was therefore not without some intellectual and emotional hesitancy that for a number of years I have advocated the use of the Latin alphabet for writing the Chinese language, which will probably be the future form in which the language will be written. However, I felt safe to advocate an alphabetic form of writing Chinese and have actually contributed toward designing and promoting a version of it, for I think that there is little danger of the characters being abolished too soon and that the characters will remain in use for decades, if not indefinitely, as a parallel form of writing." p. 226

Chao defends the operational efficiency of Chinese writing on the basis of information per chunk, as well as the practical need for an alphabetic system. He also theorizes here about an ideal writing system.

"If vested interest could be discounted in favour of end efficiency, my guess for an ideal system of visual and auditory symbols for general purposes of speech and thought will involve neither the extreme paucity in elementary units nor the extreme luxury of thousands of them, but probably about 200 monosyllabic symbols, such that a string of “seven plus or minus two” can be easily grasped in one span of attention. A previous guess (p. 112) on a slightly different basis, came out as 170." p. 226
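
Chao's remarks about "information" per symbol can be put into rough numbers. The sketch below is my own simplification, not Chao's calculation: it assumes every symbol in a system is equally likely, so that a symbol drawn from N possibilities carries log2(N) bits. The figures 2 and 200 come from the passages quoted above; 26 and 3000 are stand-ins I have chosen for "a handful" of letters and "several thousand" characters.

```python
import math


def bits_per_symbol(inventory_size: int) -> float:
    """Information per symbol, in bits, if all symbols are equally likely."""
    return math.log2(inventory_size)


# 2 and 200 are from Chao's text; 26 and 3000 are assumed stand-ins.
SYSTEMS = {
    "binary (0 and 1)": 2,
    "alphabet": 26,
    "Chao's ideal syllabary": 200,
    "characters": 3000,
}

SPAN = 7  # the middle of Chao's "seven plus or minus two"

for name, size in SYSTEMS.items():
    per_symbol = bits_per_symbol(size)
    print(f"{name}: {per_symbol:.2f} bits per symbol, "
          f"{SPAN * per_symbol:.1f} bits per span of {SPAN}")
```

Real text is far from equally likely symbol by symbol, so these are upper bounds at best, but they show why a string of seven syllabic symbols carries much more than seven letters and is still a small set to learn.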

Although this may have no specific relevance to writing Chinese, for Chao the ideal and most efficient system of visual and auditory symbols is undoubtedly a syllabary. This may stand out as unique in the entire period from 1883, when Isaac Taylor wrote The Alphabet, until recently. For Taylor and those who followed and wrote in the English language, it could be argued that the alphabet represented the ideal and most efficient system of writing.

Wednesday, June 22, 2005

The Word Play of Xu Bing

Xu Bing is an artist now living in Brooklyn, New York, who has let his imagination run riot with the interface between the Chinese and English writing systems. He has even managed to create an alphabet of Chinese characters and Chinese calligraphy from English literary works. On Xu Bing Interactives his art comes to life and takes flight, and immigrants tell funny stories based on misunderstanding words. His art explores many facets of word play with lightness and laughter, inviting the audience into his world of Chinese and English literacy. What a delight!

While I am not able to link directly to each piece in his gallery, they can be viewed here under 'My projects' and at Xu Bing Interactives.

View living word at Xu Bing Interactives

On the floor of the gallery is written the dictionary definition for "niao", the Chinese word for bird. The "niao" characters break free from the confines of the literal definition and take flight through the installation space. As they rise into the air, the characters transform from a standardized Chinese text into the form of the ancient Chinese pictograph based upon a bird's actual appearance.

View A, B, C, ... at www.xubing.com.

The theme of this work is the awkwardness encountered in linguistic exchange between different cultures. It is comprised of thirty-eight ceramic cubes that represent a sort of transliteration from the twenty-six letters of the Roman alphabet to Chinese characters. The characters that have been chosen are such that, when pronounced, render sounds equivalent to the English letter they represent. The Chinese characters are carved on the upper face of the each ceramic block in the form of a printer's stamp and the Roman letter is printed on the side of the block. For example, the English letter 'A' is rendered by the Chinese 'ai', which means sadness. 'B' is rendered 'bi', which means land on the other side, on the other shore. Some letters need two or three Chinese characters to 'transliterate'. For example, 'W' is rendered 'da', 'bu', 'liu' which means big, cloth and six. This activity may begin with a becoming logic, but ultimately it leaves its subject, transliterated language, virtually meaningless and almost ridiculous.

View New English Calligraphy at www.xubing.com.

These pieces use a traditional form of Chinese Calligraphy Art to display a western piece of writing. The New English Calligraphy alphabet is used to create a piece that is Chinese in appearance, yet understandable to the western viewer. In essence, these texts portray the language or written English in a Chinese form that has never before graced the pages of English text. Poems by Ezra Pound, William Carlos Williams, Robert Frost and others have been rendered along with logos, quotations from Chairman Mao Tsetung and the titles of exhibitions, such as this one for the "Third Asia Pacific Triennial of Contemporary Art".

For the ultimate script tease view Monkeys at Xu Bing Interactives and wash it all down with Laughter.

Tuesday, June 21, 2005

Along the Digraphic Path

When I was making a bit of a mess with the Chinese IME and a Chinese dictionary the other day, Jimmy Ho made a few helpful corrections and then added an example of terminology that is very similar in Greek and Chinese.

"As far as Greek-Chinese analogies go, I am partial to dao 道 / méthodos."

Methodos is from 'μετα' after, along, with and 'οδος' the way. Method is the English word derived from it. Οδος was an expression in Greek for the way, the path, the Christian life, etc.

Then on a totally different tack, I decided to read about digraphia in China and found two articles of interest. The first article, by Feng Zhiwei, is called The Digraphia Problem in the Information Age, in the Newsletter of the East Asia Forum on Terminology; the second is The Future of the Chinese Writing System, with a section subtitled Along the Digraphic Path, and was written by Apollo M. Wu.

Here are a few paragraphs from The Digraphia Problem which outlines in detail many historic and legal issues surrounding the "two-script system."

Liu Daosheng's report did not mention at all the universal pinyin approach raised by Mao Zedong, rather, it focused on enlarging Hanyu Pinyin's scope of application. This suggests that our government has abandoned the policy of "pinyin approach" of Mao Zedong and that Hanyu Pinyin will not be regarded as a writing system, but as an auxiliary tool to Hanzi, the Chinese character. Hanzi is the orthodox and legal writing script for Chinese, while Pinyin does not have such a legal status. Therefore, since the 1986 National Conference on Language Works, Pinyin and Hanzi no longer have equal status. ...

Although the current government policy on Pinyin is outlined as above, the Government has indicated that the issue is still open to discussion. Therefore, some of our country's scholars continue to publicly advocate digraphia. For example, Prof. Zhou Youguang advocates the implementation of the "two scripts system" (a dual tracks approach in language development). The Government does not discourage these scholars from expressing their points of view or carrying out freely scientific researches. ...

In practice, most viewers may opt for reading the computer output in conventional Chinese rather than such Pinyin codes, but Pinyin codes rather than Hanzi codes will be used for efficient computer processing and data communications.

The following is from The Future of the Chinese Writing System.

Along the Digraphic Path

Korean language has largely phased out of the Chinese / Hangul digraphic mode. It entered the digraphic stage some 500 years ago when Hangul was created with imperial blessing to assist the use of the Chinese character system. Today in Korea, Chinese characters are used less than English and the Hangul text has even incorporated word separation. The Japanese has also largely evolved their writing system along a multi-graphic format using Chinese characters, kana and Roman alphabets. ...

As we progress, the comparative disadvantages in the Chinese writing system will increasingly be translated into unequal productivity and creativity, leading to a vital competitive disadvantage for the Chinese in an increasingly knowledge-based economy. The increased use of Hanyu Pinyin both in learning and information processing should progress along a digraphic path, similar to those followed by other Asian languages.

Both these articles promote a Chinese digraphia albeit with the contrasting images of path or problem.

Sunday, June 19, 2005

The Alexandrian Library

In Chapter 14 of The Alphabet Effect, 'The Printing Press: Enhancing the Alphabet', Logan goes into more detail on what he means by the alphabetization of knowledge.

"Alphabetization in Semitic, Greek, or Roman culture did not become important until the period of the Alexandrian library in the third century B.C. when the techniques of organization used by the librarians spread to other sectors of the information industry such as authors and public administrators. One of the earliest uses of alphabetic order is found in a late third century B.C. inscription from the island of Cos in which the participants in the cult of Apollo and Heracles are listed alphabetically (Paton & Hicks 1891, p. 368)." p. 34

The notion that the alphabet transformed indexing, classification and search access is well argued.

When Logan discusses the interaction between literacy, print and science he is on solid ground since this is his own background. When he tries to make a case for certain effects belonging only to the alphabet and not to other scripts, or vice versa, the discussion breaks down from time to time. He occasionally confuses 'dialects' and 'languages' as well as 'languages' and 'scripts'.

Although Logan interminably refers to alphabetic literacy, it turns out in the end that he really means phonetic literacy since his new edition has added "phonetic syllabaries" to alphabetic literacy. (chapter 1 p.6) However, his predecessor, Havelock, whom Logan quotes, drew the line between syllabaries and alphabets.

Some of Logan's sweeping statements are farfetched and his references to specific scripts are not grounded in a knowledge of reading theory. However, I found some relevant material, and The Alphabet Effect has prompted me to investigate further the history of alphabetization, which we know today by the name of 'sort' or collation.

Saturday, June 18, 2005

Vietnamese

I was working with a Vietnamese social worker recently when he asked if we could look up autism in Vietnamese on the internet. Since I had heard that there was some difficulty in keyboarding Vietnamese, we went to the Vdict dictionary for a reference and then copied and pasted into google. Sure enough we got some hits.

However, he was watching all this intently and said "Oh, no, you don't need the accents - I never use them, just type in the word from the English keyboard." I demurred.

He said, "Look at this. I type 'bai bien' for beach, google and there they are, beaches." Hmmm.

So later, by myself, I ran a little test. I keyboarded 'bai bien' in 4 different ways and then tested them out in google to see what I would get. In spite of the fact that these words do not display in this blog, they did display properly in google and I did get hits.

bãi biển - copied from the Vdict dictionary - 403 hits
bai bien - no accents, right off the English keyboard - 103 hits
bãi biên - a mixed encoding, one level of diacritic but not the other - 218 hits
baÞi biêÒn - from the Microsoft Vietnamese keyboard - 1 hit

The hits for bai bien, no accents, were approximately 75 % Vietnamese beaches, some French horses and a few other things. Still lots of beaches. If I had designated the language would I have done better? No further comment at this time.
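
For anyone who wants to poke at why the four spellings behave so differently, here is a minimal Python sketch using the standard unicodedata module. It shows that the same word can be stored with precomposed letters or with base letters plus combining marks, and that dropping the marks gives the plain keyboard form. The last step rests on a guess of mine: that the garbled fourth form is Windows-1258 (Vietnamese) text being displayed as Windows-1252. The helper name strip_marks is also my own.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "bai bien" with full diacritics, written with escapes so it survives any editor.
precomposed = "b\u00e3i bi\u1ec3n"
decomposed = unicodedata.normalize("NFD", precomposed)

# Same word, different number of code points.
print(len(precomposed), len(decomposed))

# The unaccented form typed on an English keyboard.
print(strip_marks(precomposed))

# ASSUMPTION: the fourth form is Windows-1258 bytes read as Windows-1252.
garbled = "ba\u00dei bi\u00ea\u00d2n"
repaired = garbled.encode("cp1252").decode("cp1258")
print(unicodedata.normalize("NFC", repaired) == precomposed)
```

If that guess is right, the Microsoft keyboard was producing a perfectly good spelling with combining marks, and the single hit reflects how few pages stored the word that way rather than a mistake in the typing.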

Visual Bias

Here is a curious passage in Robert Logan's The Alphabet Effect, 1987, p. 121 or here, chapter 9, p. 40. It might be worth reading the context.


"The alphabet by separating the sound, meaning, and appearance of a word separated the eye from the rest of the senses, especially the ear. Preliterate man is multisensual whereas alphabetic man is highly visual. "Between Homer and Plato, the method of storage began to alter, as the information became alphabetized, and correspondingly the eye supplanted the ear as the chief organ for this purpose" (Havelock 1963).

The Greeks created visual space, the geometric space treated in Euclid's elements. With alphabetic literacy visual metaphors for knowledge crept into usage in the Greek language. We use similar metaphors in English as the following examples illustrate. Our word idea derives from the Greek word eidos, "the appearance of a thing". Theory derives from the Greek word theorein, "to view" (the word theater has the same root). The term speculate derives from the Latin specere, "to look".

Logan quotes Havelock, who associates the alphabetization of information with the eye supplanting the ear. I am not quite sure what he means by the alphabetization of information, since encyclopedias were not alphabetized until the 1700's and the alphabetization of glossaries seemed to relate directly to the sound of the word, not its appearance. See this post.

However, he may simply mean 'as information was written in an alphabet' the eye supplanted the ear. Logan expands on this by saying, "With alphabetic literacy visual metaphors for knowledge crept into usage in the Greek language."

I am enquiring today whether alphabetic societies are unique in having visual metaphors for knowledge. If Chinese has visual metaphors for knowledge then what? Maybe this is a universal pattern related not to alphabetic literacy but to any kind of literacy. Maybe it is not related to literacy at all. However, I thought I could make a minor attempt to look at both Greek and Chinese and someone might build on this attempt and give me the real goods.

First, theorein - 'to view' is from θεωρέω - 'look at, view, behold, observe'; and theory θεωρία is 'contemplation and reflection' also 'sight or spectacle', Liddell & Scott, 1871.

Then idea from είδος - the 'appearance of a thing' also 'form, shape or figure' is derived from είδω 'know.' or 'see.' It is also related derivationally to οράω - 'see'.

That's what I find for Greek - now how about Chinese?

见 识 or 見識 jian shi - knowledge from 见 or 見 jian - see

表 象 biao xiang - idea from 象 xiang - shape, form, appearance

I suppose that Logan could argue that the morpheme for the spoken word also occurs in the Chinese word for knowledge. However, λογος - logos or -logy is a familiar morpheme for knowledge in English and it comes from λεγω - to communicate by word of mouth.

So far, the visual bias seems at least as true for Chinese as it is for Greek and Greek-derived languages. If "the eye supplanted the ear", then this was as true for Chinese as it was for Greek; it was a response to literacy, not just alphabetic literacy.

This is just one of the many times Logan confuses his contrast of alphabetic and non-alphabetic with literate and preliterate.

Friday, June 17, 2005

The Vagaries of Sort

Andrew West left a comment on my Siege of Belgrade post to this effect.

The lack of "J" is more likely to be a relic from the time
when "I" and "J" were not considered to be separate
letters. My 1785 edition of Thomas Dyche's wonderful "A
Guide to the English Tongue" (first published 1709) gives
three rather pedantic examples of alphabet poems, the
one quoted below omits "I" and "X", .... and the other two
omit both "J" and "U". Dyche himself vigourously asserted
the distinctiveness of "I", "J", "U" and "V" as independent
letters, but old habits die hard, and so even in the early
19th-century Watts may not have considered "J" to be a
proper letter in its own right.


I went to a Toronto website for further information on things alphabetical and found a book by Jean Florence Shaw called Contributions to a Study of the Printed Dictionary in France Before 1539. In section 3.1.2.1. Arrangement of lemmata, every possible arrangement for entries in a glossary is presented, from the location of the word in a literary text to thematic and alphabetic order.

There is an overwhelming variety in the nature of alphabetical organization from the third century BC on, beginning with a glossary of hard words in Homer. Alphabetical arrangement was often only by the first letter of the word but some went to the second and third letter. However, two books are mentioned from the 2nd and 9th centuries which follow nearly perfect alphabetical order.

Here are some of the different principles of alphabetical arrangement which have been applied unevenly over the millennia.

1. Grouping together all words beginning with the same letter
2. Grouping together to the second letter
3. This was not always the second letter occurring in the word,
the second letter might be the vowel of the first syllable,
regardless of what other letter might intervene.
4. Words whose initial syllable was pronounced the same were grouped together
5. For "U" glossaries distinguished words beginning with a vowel
from those beginning with a consonant
6. "H" was often not taken into consideration either initially or medially
7. "Ph" was mixed with "F", and "C" with "K"
8. Double consonants were grouped with single consonants
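For anyone who wants to play with these principles, here is a minimal sketch in Python of a sort key that approximates rules 1, 2, 6, 7 and 8 above. The function name glossary_key, the depth parameter and the sample Latin words are my own assumptions for illustration; nothing here is taken from Shaw, and the other rules are left out.

```python
import re

def glossary_key(word, depth=2):
    """Hypothetical sort key approximating rules 1, 2, 6, 7 and 8."""
    w = word.lower()
    w = w.replace("ph", "f")   # rule 7: "Ph" mixed with "F"
    w = w.replace("c", "k")    # rule 7: "C" mixed with "K"
    w = w.replace("h", "")     # rule 6: "H" ignored initially and medially
    # rule 8: a doubled consonant is grouped with the single consonant
    w = re.sub(r"([^aeiou])\1", r"\1", w)
    return w[:depth]           # rules 1 and 2: compare only the first letters

# Sample words chosen for illustration only.
words = ["philosophia", "fabula", "honor", "onus",
         "caritas", "kalendae", "abbas", "abacus"]

# sorted() is stable, so words sharing a key keep their original order,
# much as a glossary sorted only to the second letter would.
ordered = sorted(words, key=glossary_key)
print(ordered)
```

With depth=2 the words "honor" and "onus" share a key and land together, as do "caritas" and "kalendae". Raising depth gives a stricter order.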

Shaw concludes that "The determining factor in ordering entries was the syllable, not the letter." 3.1.2.1.3. I believe she is saying that the order depended more on the pronunciation of the first syllable of the word than on the spelling of the word.

Since one of my interests is assisting dyslexics to search in Google more easily, I wonder whether one could use a phonetic spellcheck to input into the search box and recreate the effect of a phonetic search order. How about typing "fonetik" into the box? Or "Kanada"? And getting content that matches that pronunciation? Type fonetically, choose from a series of possible spellings (which you might recognize but cannot produce), check the attached dictionary function, select, confirm, input - sounds like Pinyin to me.
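To make the idea concrete, here is a rough sketch in Python of what such a phonetic lookup might do. The sound key is a crude invention of mine and not a real phonetic algorithm, and phonetic_key, suggest and the tiny LEXICON are all hypothetical names. A real spellchecker would need far better rules and a real dictionary.

```python
def phonetic_key(word):
    """Crude, hypothetical sound key; not a real phonetic algorithm."""
    w = word.lower()
    # Collapse a few spellings that usually sound alike in English.
    for old, new in (("ph", "f"), ("ck", "k"), ("c", "k"), ("q", "k")):
        w = w.replace(old, new)
    # Keep the first letter, then drop vowels so spelling variants collapse.
    head, tail = w[:1], w[1:]
    tail = "".join(ch for ch in tail if ch not in "aeiouy")
    return head + tail

# Toy word list, assumed for illustration.
LEXICON = ["phonetic", "Canada", "fanatic", "candid"]

def suggest(typed):
    """Return every lexicon word whose sound key matches the typed word."""
    key = phonetic_key(typed)
    return [w for w in LEXICON if phonetic_key(w) == key]

print(suggest("fonetik"))
print(suggest("Kanada"))
```

Note that "fonetik" brings back more than one candidate, so the user still has to pick from a list, which is exactly the Pinyin-style step of choosing and confirming.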

Back to history. Encyclopedic knowledge was not submitted to the indignity of alphabetization until the 18th century. The Lexicon Technicum; or an Universal English Dictionary of Arts and Sciences Explaining not only the Terms of Art, but the Arts Themselves, first published in 1704 and described in Encarta, was considered the first encyclopedia, in English at least, to be alphabetically arranged.

Diderot & d'Alembert's Encyclopédie, 1751 - 1777, was arranged thematically.

Wednesday, June 15, 2005

Roman Literacy Practices

I was rereading Robert Logan's book The Alphabet Effect on the alphabetic writing system because I wanted to check out some of the differences between the 1987 edition and the second edition. While I am still collecting my thoughts on the book as a whole I came across a couple of pages that need to be appreciated by a wide audience. This is from chapter 10, page 49 in the second edition and is posted on Logan's website.

'There is also evidence from a number of literary sources that
Roman schoolboys were taught the art of reading and writing.

"Boys learn in accordance with a written model
(praescriptum); their fingers are held, and they are
guided by the hand of another through the forms
(simulacra) of the letters, then they are told to
copy what is put in front of them and improve
their handwriting by comparison with it"
(Seneca, Epistulae Morales, 94.51)


A more elaborate passage of Quintilian suggests an improvement
on the technique described by Seneca as well as the importance
of penmanship (ibid., p. 25):

"As soon as the child has begun to know the shapes
of the various letters, it will be no bad thing
to have them cut as accurately as possible upon a
board, so that the pen may be guided along the
grooves. Thus mistakes such as occur with wax tablets
will be rendered impossible; for the pen will be confined
between the edges of the letters and will be prevented
from going astray. Further by increasing the frequency
and speed with which they follow these fixed outlines we
shall give steadiness to the fingers, and there will be no
need to guide the child's hand with our own. The art of
writing well and quickly is not unimportant for our
purpose, though it is generally disregarded by persons
of quality. Writing is of the utmost importance in the
study which we have under consideration and by its
means alone can true and deeply rooted proficiency be
obtained. But a sluggish pen delays our thoughts, while
an unformed and illiterate hand cannot be deciphered,
a circumstance which necessitates another wearisome
task, namely the dictation of what we have written to
a copyist".


Another technique apparently employed by the Romans,
according to Quintilian, was the fashioning of ivory letters
to be used as toys by young children to acquaint them with
the alphabet (ibid., p. 25). '

You will understand my fascination with these practices if I explain one of my recent quests. I have been looking for applications which animate the letters and characters of various writing systems. Here are a few. If anyone can add to this list, please do. I have found a couple for the Latin alphabet but, so far, they have not been free.

Chinese
Hindi
Arabic
Tamil

This post has been edited to add that Logan quotes from M. Hadas, Ancilla to Classical Reading, New York, 1954, p.68.

Tuesday, June 14, 2005

Cree Bibliography

Beaton, K.J. Ed. Birch Bark Talking: A Resume of the Life and Work of the Rev. James Evans. (Toronto, The Board of Home Missions, 1940)

Bennett, J.A.H., and J.W. Berry. "The Future of Cree Syllabic Literacy in Northern Canada." The Future of Literacy in a Changing World. Ed. Daniel A. Wagner. Rev. ed. Cresskill, NJ: Hampden P, 1999. 271-290.

Berry, J.W. and Bennett, J.A. (1991). Cree Syllabic Literacy: Cultural Context and Psychological Consequences. Tilburg University Monographs in Cross-Cultural Psychology. Tilburg: Tilburg University Press.

Burnaby, Barbara (ed.). Promoting Native Writing Systems in Canada. Toronto: Ontario Institute for Studies in Education, 1985. A fundamental collection of studies on Aboriginal literacy, including case studies in Cree, Inuit, Montagnais and Dene communities.

Mason, Roger Burford. Travels in the Shining Island (Toronto, Natural Heritage Books, 1996)

McLean, John. James Evans: Inventor of the Syllabic System of the Cree Language (Toronto, Methodist Mission Rooms, 1890)

Murdoch, J. (1981) Syllabics: A Successful Educational Innovation. Unpublished masters dissertation, University of Manitoba

Nichols, J. (1996) The Cree syllabary. In P.T. Daniels and W. Bright (eds) The World’s Writing Systems (pp. 599–611). Oxford: Oxford University Press

Peel, Bruce. The Rossville Mission Press (Montreal, Osiris, 1974)

Young, Egerton R. The Apostle of the North: Rev. James Evans (Toronto, Wm. Briggs, 1900)

Shipley, Nan. The James Evans Story (Toronto, The Ryerson Press, 1966)

The Siege of Belgrade

After naming this blog, many different examples of abecedaria keep popping into my head. Here is one bittersweet memory which attests to the power of the alphabet as an index and anchor for the wandering mind.

When I was 12 years old I helped to care for my grandfather who was showing the first symptoms of Alzheimer's. He loved to sing old songs of dubious origin and, as is so common for our elders, recite poetry. Since even at that age I loved linguistic curiosities I wrote down one particular poem, the only one that he could still recite in its entirety.

It was called The Siege of Belgrade. He recited it to me line by line and I faithfully recorded it on a page of lined paper, which I have long since lost.

The Siege of Belgrade

An Austrian army, awfully arrayed,
Boldly by battery besieged Belgrade.
Cossack commanders cannonading come,
Dealing destruction's devastating doom.
Every endeavor engineers essay,
For fame, for fortune fighting - furious fray!
Generals 'gainst generals grapple - gracious God!
How honors Heaven heroic hardihood!
Infuriate, indiscriminate in ill,
Kindred kill kinsmen, kinsmen kindred kill.
Labor low levels longest, loftiest lines;
Men march 'mid mounds, 'mid moles, 'mid murderous mines;
Now noxious, noisy numbers nothing, naught
Of outward obstacles, opposing ought;
Poor patriots, partly purchased, partly pressed,
Quite quaking, quickly "Quarter! Quarter!" quest.
Reason returns, religious right redounds,
Suwarrow stops such sanguinary sounds.
Truce to thee, Turkey! Triumph to thy train,
Unwise, unjust, unmerciful Ukraine!
Vanish vain victory! vanish, victory vain!
Why wish we warfare? Wherefore welcome were
Xerxes, Ximenes, Xanthus, Xavier?
Yield, yield, ye youths! ye yeomen, yield your yell!
Zeus', Zarpater's, Zoroaster's zeal,
Attracting all, arms against acts appeal!

by Alaric Alexander Watts (1797 - 1864)

When I made a mental comparison with this copy of the poem I found only two small differences. However, the line for J is omitted. Possibly it was permissible at the time to say the word God in a poem but perhaps not the name Jesus. If anyone knows the lost line I would be interested to hear it.

Could it be that the alphabet, with any tightly attached information, however trivial, is often one of the last memories of the Alzheimer's victim? I am glad to say that it did provide a happy afternoon of communion for my grandfather and me.

I remember also this week my Greek teacher, Elizabeth Wilson, who has recently been diagnosed with Alzheimer's.

Monday, June 13, 2005

Tamil Tolkappiyam

Since I have written about the encoding of the Tamil writing system in Unicode and Michael Kaplan has written a detailed response, I should clarify my position (if I have one ... I am an onlooker and occasional client). I do not wish to participate in any appeal or attempt to re-encode Tamil. I have used Tamil on the internet and have passed on Tamil Unicode software and the link to the Tamil Interface pack for Windows to a few friends. It's great!

I particularly appreciated this from Michael's post.


Look at it another way -- it is much harder to
implement Thai and Lao with their 'visual' encoding
scheme, especially when it comes to operations like
collation. A logical ordering would have been much
easier for everyone to write implementations.

Coulda, woulda, shoulda -- honestly the fewer
innovations in this space, the easier it is to see
implementations appear!

And I know whereof I speak here. I have seen the
impact on a native speaker of a language to see that
language supported on Windows. If you tell that user
that they must wait for a year or five or even ten
years, then the impact is precisely the opposite.
Sometimes it can be devastating.


I am interested in how a script appears as a glyph rather than how it is encoded. When I originally worked with a 12 year old so that he could learn to keyboard in Tamil Unicode and install the Tamil support for his family, he refused to touch the keyboard with any reordering software. However, there are now many suitable options to keyboard Tamil. I still prefer the syllabic editor since I am more of a visual speller in any language. And my young Tamil friend happily uses it - no problem. There are several other popular options.

However, the discussion presented by N. Ganesan on the Unicode list (May 7, 2005) was intriguing.

But in Tamil script "consonant with inherent 'a' plus puLLi"
is a primary unit. Tamil defines pure consonants so
explicitly in 2000 year old grammars, with puLLi and the
consonant with inherent 'a' (as far as Tamil is concerned)
is just one of an abugida series, so gets identical weight as
others in the series. We need to document that evidence on
puLLi as an orthographic device earliest attestation in
India from Tamil material in the Unicode documents also. ...
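For fellow onlookers, here is a small sketch in Python of how the two forms under discussion are stored in Unicode as it stands. The consonant with inherent 'a' is a single code point, while the pure consonant is that same code point followed by the puLLi, which Unicode names TAMIL SIGN VIRAMA. The variable names are my own; the code points and character names come from the Tamil block chart.

```python
import unicodedata

ka = "\u0b95"        # TAMIL LETTER KA: the consonant with inherent 'a'
k = "\u0b95\u0bcd"   # KA followed by TAMIL SIGN VIRAMA (the puLLi): pure consonant

for label, text in (("ka", ka), ("k", k)):
    # List the code points that make up each written unit.
    points = [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]
    print(label, len(text), points)
```

So the unit that Tamil grammar treats as primary takes two code points, and the unit it treats as one member of a series takes one. That is the Hindi consonant model applied to Tamil, nothing more.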

In Tolkappiyam தொல்காப்பியம், a 2000 year old grammar, one can read these details about the Tamil script.


we find enumerated both the aspects of form and matter,
not only the poetic form but also the phonological and
morphological form.
(1) The alphabetical sounds or phonemes (Eluttu);
(2) their duration (Mattirai);
(3) their knitting together into syllables (Acai) ;
(4) the various permutations and combinations of these
syllables as feet (cir) ;
(5) the varied integrations of these feet into lines (ati);

The purpose of my musings in abecedaria is to consider how a script or writing system has been historically represented and organized in its concrete glyph form by its users, not how it should be encoded. That is, what are the primary visual shapes and structures of a script.

I find it of interest historically that of the 24 Indic scripts represented only Tamil is shown as an alphabet in Isaac Taylor and only Tamil is shown as a syllabary in Diderot. I was reading about this when I noticed Mr. Ganesan's post. It just seemed too interesting not to pass on. This is a place to collect who said what and when, not how the encoding should work. Sorry if I implied otherwise.

Authors

This is an index of previous posts where I have commented on or quoted from an author who has written about writing systems, or whose writing has impacted on writing system theory.

Bringhurst, Robert
Chao, Yuen Ren.
Darwin, Charles
DeFrancis, John
Diderot, Denis
Faber, Alice
Gaur, Albertine
Poser, William
Sacks, David

Sproat, Richard
Taylor, Isaac
Taylor, Isaac
Tolstoy, Leo

Sunday, June 12, 2005

Battledores and Hornbooks

I have been thinking about how the alphabet has been presented visually over the centuries to gain insight into the way that different cultures have organized their writing system.

Battledores and hornbooks are two of the teaching aids of early literacy that have been used over the centuries. They present the alphabet in linear fashion, then a table of syllables and a religious text.

Hornbooks date from the 1400's to the late 1700's and were made of a small wooden paddle, on which was glued a sheet of paper, covered with a layer of cattle horn which had been soaked and prepared to become flattened and pliant. An alphabet, a short selection of syllables and often the Lord's Prayer were commonly printed on hornbooks.

Battledores, popular in the 1800's, were made of stiff folded cardboard with a greater surface area, so they could also contain pictures. Many of these also portrayed the alphabet, an array of syllables and the Lord's Prayer. However, some came in the form of an illustrated alphabet book.

Syllabaries have shown up as an aid to literacy from the time of the Formello Alphabet on an Etruscan vase to the Battledores of the early 1800's. But what is the role of the syllable in the teaching of reading today?



Tamil

Unicode Marches On

It is always good to keep up with Unicode so I found this post on Ken Arnold's blog of interest.

He refers to the newly encoded Shavian alphabet which is a system for writing English invented in memory of George Bernard Shaw. I have just bought an old copy of The Miraculous Birth of Language by R. A. Wilson of the University of Saskatchewan, (yes I have to put in this little plug for Canada) with a preface by G. B. Shaw, where he proposes the invention of a new script for English.

Since Shaw wrote this preface in 1941, it is understandable that he included this comment which I rather like,

I found myself considering seriously, especially when
the German airmen dropped a bomb near enough to
shake my house, whether I had better not end my
days in Vancouver, if not Saskatoon.


More about Shaw and the Shavian alphabet later.

Morphosyllabic

I would rather be talking about something else and soon I will. However, I want to get this off my chest. Personally I use the term morphosyllabic when I talk or think about the Chinese writing system in an academic context. Otherwise Chinese characters is as good as it gets.

I will be working on the best definition of morphosyllabic for a while. Here is a start. Chinese characters each represent a single syllable, and in the vast majority of cases a single morpheme (from the Zompist).

Here is what John DeFrancis has to say.

The Chinese system must be classified as a syllabic
system of writing. More specifically, it belongs to the
subcategory that I have labeled meaning-plus-sound
syllabic systems or morphosyllabic systems.
(Visible Speech, 1989, p.115 - 116)


DeFrancis first used the term in the second last paragraph of his chapter in The Chinese Language: Fact and Fantasy, 1984.

Some linguists find the term awkward and substitute the term logosyllabary. (World's Writing Systems, P.T. Daniels, 1996, p.4)

Either of these is a million times better than the noncommittal term sinogram, favoured by some. That term googles best as a medical procedure which I refuse to dignify with a link!

As for the term ideogram, well, I for one have always liked to read John DeFrancis's books on Chinese, and chapters of his various books have been made available at Pinyin Info. I would recommend The Ideographic Myth as a pleasant Sunday afternoon read.

Morphosyllabic is also the term preferred by those who write about reading theory and compare effects of dyslexia across different scripts, so it has my vote.

I was using the term morphosyllabic, (in a dialogue with myself), in the late 80's when I was looking at other syllabic scripts that used additional symbols to differentiate what would otherwise be homographs. This term fits me like an old slipper, and awkward as it may feel to others, I think I can guarantee that a sinogram would be worse.

Saturday, June 11, 2005

Chinese

Posts which refer to the Chinese Writing System. This post will be edited on the go.

Morphosyllabic
The Chinese Syllabary
Hanzi Smatter
Composing the Syllable
Cangjie Input
Q9 Input
Googling in Chinese
Chinese and Tamil

The Chinese Syllabary

I had already referred to the Chinese writing system as a syllabary and then I went looking for company. Luckily I didn't have to go too far for corroboration.

The best explanation for thinking about the Chinese writing system as a syllabary is embedded in the FAQ page of Zhongwen.com. In response to the question "How do you know the pronunciation of a character" Zhongwen.com provides this answer.

Chinese characters do not have an alphabet but they
incorporate a rough syllabary. Whereas alphabets
use symbols to represent each phoneme, syllabaries
use symbols for each syllable. ... When a language has
few syllables a syllabary has clear advantages over
an alphabet. Children have less difficulty dividing
words into syllables than into phonemes. (Many
adult Chinese unfamiliar with alphabet-based writing
systems have a very hard time writing even their own
names in Pinyin since it requires phoneme-by-phoneme
dissection of the syllables.) And syllabaries allow for
quick reading because the symbols can be made quite
distinct from each other rather than just being letters
in different combinations arranged in a line.


More on why some people call the Chinese writing system ideographic another day.

Hanzi Smatter

I was casting around for something delightful, entertaining and informative to leave you with for the weekend.

Hanzi Smatter is just the thing! Any page, any post, any time. Enjoy.

If information on the Chinese writing system is your top priority go to Pinyin Info via Hanzi Smatter. Cheers!

Complex Scripts

I have added a link to Omniglot in the sidebar this morning. It is the best resource on writing systems on the internet.

However, I have already run into a glitch. This is actually one of the reasons I began this blog. A difference in terminology.

Omniglot uses the term complex scripts for Chinese and other scripts which "may represent both sound and meaning". Simon Ager of Omniglot has elsewhere labeled these scripts logographic. Make up your mind, Simon!

There is another more common use for the term complex scripts and I intend to use the term only in this sense. This is the way the term is used for the installation of international script support and a discussion of keyboarding these scripts. I hope to stick to this definition in future.

Thank you to the Tibetan and Himalayan Digital Library.

I received the suggestion that I needed to create a glossary of writing systems terminology for this site. If a glossary is ever created, it will be the end-product, a result of years of blogging and discussion. Right now there is little consensus on how to label script types.

Having said that, I will do my best to provide definitions that relate to the use of a term in this blog. It simply may not be the only definition available.

I'm all business this morning. No fun and games. Where is Taw when you need it?

Composing the Syllable

Rather than try to visualize some of the 100 plus input methods for the Chinese writing system, it is easier to sort them into categories. This is what I read in Answers.com. There are 3 ways to input Chinese, by encoding, by pronunciation and by the structure of the character.

Now I have to ask outright what it means to input by encoding. Is there someone out there that thinks of characters as encodings? Who are these people and what do they do, look up a chart of encodings? Do they suggest that we mere mortals should consider this an input method? Maybe someone will enlighten me and tell me more about the mysteries of code.

Okay, there are two main types of Chinese keyboard input method for the rest of us, by pronunciation or by structure.

Simply put, each Chinese character represents a syllable. Since there are many more Chinese characters or syllables than there are keys on the keyboard, (even considering the shift and alt keys) these syllables must be composed out of smaller components. There are two ways to do this. First by the pronunciation of the syllable and second by the structure or shape of the syllable.

To input by pronunciation, one can use Pinyin, Cantonese Pinyin or Zhuyin, also known as Bopomofo. There are also adaptations of these methods. In these methods each sound, whether consonant or vowel, is input separately, using either letters of the Latin alphabet or Bopomofo characters. Since more than one character will match each syllable by pronunciation, a series of characters will be displayed and one must be chosen and confirmed.
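The choose-and-confirm step can be sketched in a few lines of Python. This is only a toy under my own assumptions: the CANDIDATES table holds a handful of characters per syllable, and the names candidates and compose are hypothetical. A real input method draws on a very large dictionary and ranks its suggestions.

```python
# Toy candidate table, assumed for illustration: each Pinyin syllable
# (tones ignored) maps to a few characters that share the pronunciation.
CANDIDATES = {
    "jian": ["见", "间", "建"],
    "shi": ["是", "识", "十"],
}

def candidates(syllable):
    """Return the characters offered for a typed syllable (empty if unknown)."""
    return CANDIDATES.get(syllable, [])

def compose(choices):
    """Build text from (syllable, index) pairs, one confirmed choice each."""
    return "".join(candidates(syllable)[index] for syllable, index in choices)

print(candidates("jian"))                 # the list the user picks from
print(compose([("jian", 0), ("shi", 1)])) # first choice, then second choice
```

Typing "jian" and taking the first candidate, then "shi" and taking the second, composes the two-character word for knowledge.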

To input by structure, the visual features of the character are considered. The strokes or larger components which make up the structure of the character must be analysed in some meaningful way. Older methods were based on radicals and other components of the character like stroke order, number, and direction.

Newer JavaScript structure-based input methods like Q9 have the advantage of displaying components of the character on the virtual keyboard, and the user chooses the desired component without having to depend on memory for the correct keystroke. These methods are intuitive and can be used without learning anything about the structure beyond the ability to discriminate whether a stroke is horizontal, vertical or diagonal. (This is an oversimplification but I do want to go to sleep tonight.)

This kind of input, structure-based, is also called glyph-based input since the glyph is the shape or structure of the visual image of the character.

Thursday, June 09, 2005

Taw

That last entry was a little too much like work. A mix of Too Much Information with Too Little Information. However, it is not as if I haven't been asked about Cangjie input before. Nevertheless, I need to serve myself dessert.

David Sacks' abecedarium follows a long line of predecessors. In 1905, Hubert Skinner wrote The Story of the Letters and Figures. This book is less than scintillating and will not likely be reprinted soon but his chapter on T is small enough and unusual enough to make it worth preserving.

The last of the twenty-two letters of the old Phoenician
alphabet was called Tav (tahv). It was in the form of a
cross, such as is sometimes made in our day by persons
who are unable to write.

The word meant, originally, simply a mark, or sign for
identification, and was used as such in ancient times.

The man who was unable to write was at least able to
"make his mark" in lieu of a signature. Marks were also
placed upon camels and other beasts, to indicate
ownership: and whatever their form, they were
described generally by the name of this letter.

Tav was formed thus.
(the small letter 't' in comic
sans would be closest)

Its sound was a softened T, differing slightly from
the sound of Teth-the basket.

From this letter the Greeks derived their Tau, τ
which became the T of the Latin alphabet and of
our own. The only alteration the letter has
undergone has been the raising of the cross bar.
Probably this was effected gradually and almost
imperceptibly.

Schoolboys who mark a "taw" for their games do
not imagine how old a word they are using. For
how many centuries have boys handed down
the ancient name of this letter, or "mark" with a
slight change in the sound of its name! Phoenician
boys, Greek boys, Roman boys, Saxon boys, and
American boys, have shown that boyhood is the
same, the world over.

I have heard of a marble being called a taw but I have never heard of a mark made in a game being called a taw. I hope some day someone will post a comment and expand on this.

Cangjie Input

Someone once suggested that there might be up to 100 different input methods for Chinese but that sounds like an understatement. And there is no likelihood of a standard method evolving.

I have written about Q9 input because I have been convinced, first by the persuasion of several passionate and compassionate educators; second, because I have read the rave reviews; and third, because I just saw a 12 year old figure it out in a few minutes even with the distraction of me saying "No, no, click that one!"

Above all else Q9 is not yet defined in answers.com and it deserves to be.

Cangjie is the preserve of the dedicated and the skilled. Look at Dylan Sung's Cangjie pages with his illustration of the four philosophical sets, and enjoy the mystique. Visit the pages of the Cang-Jie Input. It is the domain of the initiated, not the dilettante. Here is a little article that sums it up for me. One word - hardship.

Cangjie was invented in 1979 and allowed Chinese to be input from the QWERTY keyboard so it must have been revolutionary at the time. Once learned, it is supposed to be fast and accurate, and extremely efficient. Pinyin was always ambiguous, you had to stop, pick and choose. However, Pinyin has evolved.

The only statistic that I have seen on input methods in China indicates that 97% use Pinyin input. For Hong Kong 78% use keyboarding of some sort and 21% use handwriting input. Since these are fragmented results at best I asked around last summer.

In Beijing, professors and professionals used Pinyin input. Secretaries used Cangjie or Wubi because it was much faster and more accurate and that was their specific technical training.

In one case I observed a teenager using Pinyin in MSN and asked if I could watch for a bit since this was a particular interest of mine. No problem. I then asked his mother what method she used to keyboard. I got the distinct impression that she did not keyboard.

I have been told that not everyone knows how to use Pinyin well enough for keyboard input and Cangjie takes a considerable amount of time to learn.

In Hong Kong I heard from educators that they were using Q9 for everyone from grade 2 to university level and I have even seen the abstract for this session at a conference at the HKIEd.
The study of learning the "Q9 Chinese input method" by mildly mentally handicapped children. That sounds good to me.

Wednesday, June 08, 2005

Q9 Chinese Input

I love using Pinyin Input because it seems like magic. However, it may appear that I am contradicting myself since clearly Pinyin is phonetic input, based on the alphabet, and I have suggested that I am an advocate of glyph-based or visual input.

I only want to argue against an overemphasis on phonetic input for many scripts. People need both, or ideally, the ability to choose between phonetic and visual or glyph-based input.

One of the big advantages of glyph-based input is its ability to cross dialect or language boundaries. It is not dependent on pronunciation.

I have recently started learning to keyboard with Q9 input which was developed for the Nokia phone. I personally sat down last year with the head of the technology department at the Hong Kong Institute of Education, Kar Tin Lee, and had her explain to me the advantages of Q9 input.

Statistics show that up to 20% of Hong Kong keyboard input is by handwriting input. However, I showed Q9 to a Canadian 12-year-old, and asked him how it compared to handwriting input. "No comparison," he said, "this is much faster."

There is one tiny problem with Q9 - you actually have to know what the Chinese characters look like. I wasn't able to use it myself until I found this online Chinese dictionary which provides characters large enough for someone of my age to discern the individual strokes.

"Oh, no" replied my young friend, "I have the same problem. Sometimes on a test I cannot read the Chinese characters if the font is too small." (He was just saying that to make me feel better.)

Try Q9 here and read more about it here.

Googling in Chinese

I am hopelessly addicted to googling in different languages. Take Chinese for example. If I am just doing it for my own entertainment then I generally google images. This is how it works. I go to a Multilingual Picture Dictionary at languageguide.org and look up the picture and spelling of the item I wish to google. I like to listen to the word also - might as well make it a multisensory experience.

Then I go to google and, using the Pinyin Chinese IME I type the Pinyin spelling of the word into the dialogue box and choose the right syllables out of a list of homophones. (Always checking back to the online dictionary since I don't know Chinese) One of the great fascinations of Pinyin input is how it has evolved so quickly over the last few years. It's smooth and satisfying.

I often wonder why we can't have something like that for English. Why doesn't somebody bundle a homophone checker into an English operating system? I am also still hunting for a phonetic spellchecker that can handle the fact that "what" is best input as "wut". I might believe in the alphabet again when English keyboarding works half as well as Chinese. Until then ...

Language Visible

I was heartily enjoying David Sacks' euphonic introduction to the letter V in his book Language Visible, 2003, otherwise known as Letter Perfect. I particularly thought of how this book would stand as a reflection of its decade with the smooth segue from Churchill's Victory sign into Bush's W.

I had been reading up on V because I am still absorbing the import of Canada's Victory Nickel. I also feel obliged to pay homage to Sacks' book as an example of a contemporary abecedarium.

However, my euphoria evaporated with dispatch when I left off grazing through the garden of letters and gave my undivided attention to the introduction. Ouch.

What is the point in a website like Pinyin Info if an author can still get away with a statement like "A Chinese symbol is primarily not phonetic; it does not operate by conveying sound"? (p. 3) Doubtless someone else has pointed out this little detail - yes, the Chinese do represent speech with their writing system. So I moved on.

Because the memorization step is simple enough
for five- and six-year-olds, the whole process with
an alphabet can be completed before students reach
working age. The learning need not interfere with
the living. This crucial fact has made the alphabet
historically the vehicle of mass literacy. p. 5

Does he assume that there is no more to literacy than memorizing the letters of the alphabet? Albertine Gaur, among others, dealt with this issue in the 80's.

However, there was another person writing in the 80's on this topic. M.A. Powell, a cuneiform scholar, wrote an article in a journal called Visible Language.

The inescapable conclusion is that the introduction
of the alphabet, by itself, has had little effect upon
the reduction of functional illiteracy, and thus its
importance in the history of human development
has been overestimated, whereas that of cuneiform
has been underestimated.

I will take Visible Language over Language Visible any day when it comes to writing systems and literacy.

Tuesday, June 07, 2005

Caroline Islands Script

Occasionally I find something on the Unicode mail list that just comes out of the blue. Here is a new syllabic script from the Caroline Islands. You can read more about it at www.carolineislandscript.com Don't miss the FAQ page.

Since there are characters in this script that have the same glyph shape as letters of the Latin alphabet but, as abstract characters, represent syllables within the Caroline Island writing system, this has stimulated an interesting interchange on the relationship between abstract characters and scripts.

Albertine Gaur

A few years ago I wrote elsewhere about the assumption that the alphabet was the writing system most suited to literacy. I countered this by saying that "The most obvious argument against this interpretation is the example of Japan." So I was interested to find the same sentiment in Albertine Gaur, 1984, The History of Writing.

After the end of the Second World War a mission consisting
of twenty-seven educationists recommended to General
McArthur a drastic overhaul of the Japanese education
system. They called especially for the abolition of the
'Chinese-derived ideograms', since otherwise Japan could
never hope to achieve parity with the West. Today Japan
has not only achieved this parity, but seems
uncomfortably close to overtaking the west, and this
despite the fact that the Japanese still use their
'Chinese-derived ideograms', and that it takes Japanese
schoolchildren two years longer than their Western
counterparts to learn how to read and write. As we
move towards the 21st century, the 19th century concept
of the alphabet as a Platonic idea towards
which all writing (and information storage) must by
necessity progress becomes less and less tenable. p. 34


I find this use of the phrase alphabet as a Platonic idea fascinating. I was thinking the other day that current keyboarding practise seems to follow the notion that there is an underlying abstract alphabetic idea and that this provides the basis of keyboard input. Will this ever become less tenable?

Before leaving Albertine Gaur, I want to note her basic classification of writing systems, which varies from others. She identifies the nature of a script as one of five types: alphabetic, consonantal, syllabic, ideographic, and mixed. No script is listed as purely ideographic, but Chinese is listed as mixed. What is interesting to me is that she lists all of the following scripts as syllabic: Cherokee, Cree, Japanese, Tamil, Devanagari, etc. She does not differentiate between the syllabaries and the alphasyllabaries. "Indic scripts are from their recorded beginnings clearly syllabic." p. 108

Many would disagree with her but I do not find her in any way uninformed on the segmental composition of the Indic syllable. It seems a matter of emphasis. Do you call a script syllabic first and then mention the composition of the syllable, or do you call a script segmental and add that, by the way, it is organized into syllabic units?

Monday, June 06, 2005

Why Abecedaria?

I could have called this The Writing System Blog but it seemed a little too presumptuous. What about the Glyph-based Input Blog - a little too much like a bee in the bonnet.

I want to write about writing systems as concrete realities with a physical organization, something that can be seen, felt, and perceived in the most tangible way. Ever heard of kindergarten teachers cutting letters out of sandpaper for those students who need multisensory input - where is sandpaper when you need it?

I guess abecedaria is about characters in a writing system being primarily glyphs and secondarily abstract codepoints. WYSIWYG input - makes sense to me. This is how I edit the website I am responsible for. Sure I view code once in a while to tidy things up and I don't believe code is too hard for me, but it is easier to start with WYSIWYG.

Oh, and by the way, I also intend to read through four shelves of books on writing systems in the university library nearest to me over the next 5 years.

Chinese and Tamil

I am going to try and explain how come all I know about Tamil keyboarding I learned from Chinese.

This one really twists people's minds - so I love it. My reasoning is that since the Chinese writing system is a syllabary (a logosyllabary) and the Tamil writing system is a syllabary (an alphasyllabary) they are similar on one axis - still with me?

In each system you really want to input a syllable. You can either compose the syllable phonetically, from the sounds, or graphically, from the components of shape and structure. In each case you cannot input a syllable directly from the keyboard, 6,000 + or 247, doesn't really matter.

In the Chinese Pinyin IME you input the phonetic syllable using the alphabet, and a series of Chinese characters which are a phonetic match appears; then you just choose the correct one. So, I figured, bring that down one level and imagine an application for Tamil where you can type in the consonant and a series of syllables appears which begin with that consonant, and then you can just choose the right one.
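The picker idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Latin-key mapping and the function name `candidates` are hypothetical and are not taken from any existing editor; the code points are those of the Unicode Tamil block.

```python
# Hypothetical sketch of a consonant-first syllable picker for Tamil.
# Typing one Latin key yields the row of syllables that begin with that consonant.

PULLI = "\u0bcd"  # TAMIL SIGN VIRAMA (the pulli)

# Dependent vowel signs; the empty string stands for the inherent "a".
VOWEL_SIGNS = ["", "\u0bbe", "\u0bbf", "\u0bc0", "\u0bc1", "\u0bc2",
               "\u0bc6", "\u0bc7", "\u0bc8", "\u0bca", "\u0bcb", "\u0bcc"]

# Assumed key mapping, for illustration only (a real picker would cover all 18).
LATIN_TO_TAMIL = {"k": "\u0b95", "t": "\u0ba4", "p": "\u0baa", "m": "\u0bae"}

def candidates(key):
    """Return the pure consonant followed by its twelve consonant-vowel syllables."""
    base = LATIN_TO_TAMIL[key]
    return [base + PULLI] + [base + sign for sign in VOWEL_SIGNS]

# Show the row a user would choose from after typing "k".
print(" ".join(candidates("k")))
```

The user never needs to know which code points sit behind a syllable; the choice is made from the shapes on screen, just as with the homophone list in a Pinyin IME.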

I personally have no technological skill so I groused about this problem for some time in a writing systems forum and surfed the net into the wee hours. I was just about to give up when I found the Nepali/Devanagari Editor. (I have not yet tried to keyboard Devanagari on this keyboard, too busy elsewhere.)

Then Richard Wordingham, in England and working with Thai, said, "No problem, I'll add Tamil to this and post it on Saturday morning." (We both have daytime jobs doing other things).

Richard added a few details and now this syllabic editor can be used either as a picker or as a keyboard tutor. It is not a full-scale word-processor but my dream is that one day it will be.

Back to Chinese and Tamil. If you truly believe that Tamil is exclusively an abugida you might dismiss all this outright - your loss.

Sunday, June 05, 2005

Keyboard Input for Complex Scripts

Last week Michael Kaplan made this comment in his blog.

"Being able to have words even look like they belong
together in languages like Thai and Hindi and Tamil
really requires either knowing the language or
memorizing keystrokes."

And Michael should know - he developed an amazing multilingual website.

This was in response to Raymond Chen who said,

First of all, keyboard input is a more complicated
matter than those who imprinted on the English
keyboard realize. Languages with accent marks
have dead keys, Far East languages have a variety
of Input Method Editors, and I have no idea how
complex script languages handle input. There's
more to typing a character than just pressing a
key.

Yes, indeed, keyboard input is a more complicated matter in languages other than English (and maybe a few others). But the question is: should it be? And does it have to be? Do we just accept this aspect of the Digital Divide without doing anything about it?

I have watched children who do not read, write or speak English google their favourite game or image on the English keyboard. Can't we make that happen for other languages? Call me crazy but I think we can.

I have tried out a few javascript online applications and I think there are a couple here and there that really are usable, not just cute little toys, as I was once told. I have seen children use them to google in Chinese and Tamil, and even teach me in a matter of minutes how to do the same.

Here is what we use for Tamil and on this page is a button to open the trial input application for Q9 Chinese input. The claim is that it takes 5 minutes to learn how to use it and I personally observed a child do it. Okay, so it doesn't feel like keyboarding, it feels like mousing around - but it works.

These keyboards work on the principle that one should be able to input a visual glyph for other scripts just as one does in the English alphabet. Forget about the abstract encoded phonetic codepoint. The program presents to the user the visual glyphs and this is used for keyboard input.

The English Keyboard

Last year I made the comment to a group of computer types that you didn’t have to know how to read to google in English. I explained that some 5 year old children sit down at a computer with a scrap of paper with a word printed on it.

One of them rejoined “And why would they want to google if they can’t read?” Incredulous, I replied, “To play Neopets, of course.”

It happened again this winter. Two 5 year olds were discussing what games they play on the computer and one said to the other “No problem – you just type N-E-O-P-E-T-S.” I happen to know for a fact that these two children couldn’t read or write in the conventional sense.

Last year an ESL student who couldn’t speak, read or write English completed a PowerPoint on Canadian Animals. He had used words scribbled on paper and the google: images search.

I recently saw a 12 year old who, through hearing impairment and other circumstances, cannot read or write, sit down at the computer and google her favourite celebrity and play a video from the site.

So you don’t actually have to know how to speak, read or write English to google from the English keyboard. Want to try that in Chinese?

Saturday, June 04, 2005

The Tamil Syllabary chez Diderot

I actually mean to say the Tamil syllabarium, that is, the syllable chart used for learning to read and write Tamil. The classification of the script is independent from discussing the table of all possible syllables, of CV forms, in Tamil (excluding the Grantha forms).
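As a rough illustration of what such a chart contains, the Python sketch below builds the consonant-by-vowel grid from Unicode code points. The function name `syllabarium` and the layout are my own assumptions, not a reproduction of any printed chart; the Grantha letters are left out, as in the text.

```python
# Sketch: build the Tamil consonant-by-vowel chart as an 18 x 12 grid.

# The 18 consonants in traditional order (Grantha letters excluded).
CONSONANTS = [chr(c) for c in (
    0x0B95, 0x0B99, 0x0B9A, 0x0B9E, 0x0B9F, 0x0BA3, 0x0BA4, 0x0BA8, 0x0BAA,
    0x0BAE, 0x0BAF, 0x0BB0, 0x0BB2, 0x0BB5, 0x0BB4, 0x0BB3, 0x0BB1, 0x0BA9)]

# The 12 independent vowels.
VOWELS = [chr(c) for c in (
    0x0B85, 0x0B86, 0x0B87, 0x0B88, 0x0B89, 0x0B8A,
    0x0B8E, 0x0B8F, 0x0B90, 0x0B92, 0x0B93, 0x0B94)]

# Dependent vowel signs; the empty string is the inherent "a".
VOWEL_SIGNS = [""] + [chr(c) for c in (
    0x0BBE, 0x0BBF, 0x0BC0, 0x0BC1, 0x0BC2,
    0x0BC6, 0x0BC7, 0x0BC8, 0x0BCA, 0x0BCB, 0x0BCC)]

def syllabarium():
    """Return one row per consonant, one column per vowel."""
    return [[consonant + sign for sign in VOWEL_SIGNS] for consonant in CONSONANTS]

grid = syllabarium()
for row in grid:
    print(" ".join(row))

# 12 vowels + 18 pure consonants + 216 combinations + the aytham = 247
total = len(VOWELS) + len(CONSONANTS) + sum(len(row) for row in grid) + 1
print(total)
```

The total is the traditional count of 247 Tamil letters: the vowels, the pure consonants, their combinations, and the aytham.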

I first saw a Tamil syllabarium when one was brought to me by a student who had written it out in pencil on two pieces of binder paper, carefully taped together. Why is it so important to represent the Tamil script in a syllabarium? Simply because the vowels often consist of modifications of the consonant form, rather than separate letters. This is not entirely predictable and there are enough cases where you are better off memorizing the syllables first and understanding how it all gets put together later, especially if you are a child.

When plates of Indic and other Asian scripts were created for the Encyclopédie of Diderot and d'Alembert in the mid 18th century, only Tamil was represented by a syllabarium. The other scripts simply had a list of independent and dependent vowels and the consonants with inherent a.

Once again Tamil breaks the pattern. In Isaac Taylor, Tamil was represented by an abecedarium, an alphabet, and in Diderot by a syllabarium, a syllable chart. The other Indic scripts are represented in both Taylor and Diderot by forms that are now described as abugida forms.

Unicode, however, is oblivious to Diderot and Taylor. Tamil is encoded much like the other Indic scripts, as an abugida, where the primary form of the consonant is considered to be the form which includes the inherent a vowel. It should not be surprising that there have been requests to reencode Tamil, this time as either an alphabet or a syllabary but not an abugida.

Tamil Pulli in Taylor's "Alphabet"

There was quite a lot of talk last month on the Unicode mail list about Tamil. The argument was made that Tamil consonants have been defined as pure "mey" or consonants without a vowel sound for at least 2000 years. However, in Unicode they have been encoded as consonants plus the medial a vowel.

The pure consonant is represented by a consonant plus pulli (dot over the letter). As it stands now the consonant with the inherent a vowel has been encoded and the dot is added as a new codepoint. However, certain Tamil advocates are arguing that the pure consonant, or consonant pulli, should have been encoded and then the pulli would be removed by keying in the medial a vowel.
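The current encoding can be inspected with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `describe` is my own, and the letter k is used only as an example.

```python
import unicodedata

KA = "\u0b95"     # encoded as the consonant with inherent a
PULLI = "\u0bcd"  # the pulli, encoded as a separate sign

def describe(text):
    """List the code point and Unicode character name of each character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

pure_k = KA + PULLI  # the pure consonant takes two code points

print(describe(KA))      # one code point for "ka"
print(describe(pure_k))  # two code points for "k"
```

Under the encoding the advocates ask for, the counts would be the other way round: one code point for the pure consonant, and a second one to add the a vowel.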

I have been reading Isaac Taylor's The Alphabet and came across his tables of Indic scripts. Out of 24 Indic scripts, only Tamil has the pure consonants, or consonant plus pulli, represented in the list. All other scripts are represented by the consonant with the inherent a vowel.

Was Taylor, 1883, somehow aware that for Tamil the pure consonants are the primary forms, whereas in other Indic scripts the primary forms are the consonant with inherent a? How was that knowledge lost to the west?

Roman Numerals

Roman Numerals did serve a useful purpose in that they allowed a new development in the Roman alphabet, two new letters U and J.

In the tenth century V differentiated into U and V, and in the 15th century I differentiated into I and J. V and J, the consonantal forms were used at the beginning of words and U and I, the vocalic sounds were used in medial position. (Isaac Taylor, The Alphabet ii p. 72)

The addition of new letters into the Latin alphabet was possible since, even though the letters of the alphabet had a fixed order, they did not have a fixed value. The addition of new letters to the alphabet did not alter mathematical notation. Roman numerals are not an alphabet derived system, regardless of what they look like.

In contrast, the Greek and Hebrew alphabets have not changed in 2000 years. The Greek alphabet did lose one letter in the mists of antiquity, the digamma, which occurred in 6th position and corresponded to the vav in Hebrew. That has since been replaced with the stigma, a letter used only for the number 6, and not included in the alphabet. However, it serves to ensure that the Greek letters retain their original numerical value.

The use of the alphabet as a numbering system provided stability or rigidity, depending on how you look at it, to the Greek and Hebrew alphabets. The Roman numeral system on the other hand differentiated the numerical system from the writing system and allowed the Roman alphabet greater flexibility. They all were base 10 - so no difference there.

Friday, June 03, 2005

A nickel for your thoughts

I put a finger in my change purse today to coax out a quarter and found an unfamiliar coin. Similar in tint and sheen to a quarter, it was smooth and round, the size of a penny and labeled cents. But how many cents? It was smaller than a quarter, the wrong colour for a penny and too thick for a dime.

The Canadian Victory nickel was issued on May 4, 2005, the 60th anniversary of victory in Europe. V for victory, yes, of course - but V for five?

The original 1943 Victory nickel was made of a different material and had 12 flat sides. The fact that the number 5 was missing was less consequential - who needed it - the distinctive shape identified the nickel. Now I scrutinize a coin whose value is identified only by the V which clasps the torch. The torch I recognize - the torch; be yours to hold it high.

The origins of Roman numeral notation are vague. No one actually knows how V came to stand for the quantity 5. I was taught that V imitates the shape of one hand - five fingers, and X imitates two hands - ten fingers. Others speculate about notches on a tally stick.

Greek and Hebrew numbers were based on a more rational premise than Roman numerals. The first ten letters of the alphabet stood for the numbers 1 to 10, the eleventh for 20, the twelfth for 30 and so on. This system fixed the order of the alphabet in antiquity.
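The rule can be written down in a few lines of Python. This is a sketch only: the function names and the romanized letter names are my own labels, and the step to hundreds after 90 is the usual continuation of "and so on". Hebrew is used because its 22 letters are all still in the alphabet.

```python
# Sketch: the value of a letter follows from its position in the alphabet.

HEBREW_ORDER = ["aleph", "bet", "gimel", "dalet", "he", "vav", "zayin", "het",
                "tet", "yod", "kaf", "lamed", "mem", "nun", "samekh", "ayin",
                "pe", "tsadi", "qof", "resh", "shin", "tav"]

def letter_value(position):
    """Return the numerical value of the letter at a 1-based position."""
    if position <= 10:
        return position               # letters 1 to 10 count 1 to 10
    if position <= 18:
        return (position - 9) * 10    # letters 11 to 18 count 20 to 90
    return (position - 18) * 100      # letters 19 onward count 100, 200, ...

def value_of(name):
    """Look up a letter by its (romanized) name and return its value."""
    return letter_value(HEBREW_ORDER.index(name) + 1)

for name in ("vav", "yod", "kaf", "lamed", "qof", "tav"):
    print(name, value_of(name))
```

Because the value depends only on position, dropping or adding a letter would shift every value after it, which is one way to see why the order became fixed.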

Another Greek system was acrophonic, Δ (delta) for 10 because 10 in Greek, δεκα, started with a Δ. These systems are logical, if you forget a number you can figure it out - but V for five - you either know it or you don't.

The Roman numerals are not even part of the school curriculum for numeracy. Why didn't the Mint consider a 12-sided edition of this nickel, more faithful to the original, not less?

Leo Tolstoy

Writing about the syllabaria reminded me of a quote from Tolstoy, which dates from the 1870's when he put down the writing of one of his novels to teach reading to children. It seems that Tolstoy recognized that children were people too.

"One pupil has a good memory, and it is easier for him to memorize the syllables themselves than to comprehend the vowellessness of the consonants; another reflects calmly and will comprehend a most rational sound method; another has a fine instinct and grasps the law of word combinations by reading whole words at a time." This quote appeared in Tolstoy on Education, 1967.

Some day I will find a copy of Taylor, volume 1, and read his tripartite typology of writing systems. However, these three components, sounds (phonemes), syllables and words, were already the recognized components of writing.

Abecedaria and Syllabaria

Since I only have volume 2 of Isaac Taylor's The Alphabet, it seemed natural to just let it fall open and start reading. Chapter 7, section 7 is called The Abecedaria. This simply refers to writing the letters of an alphabet in sequence for practise in reading or spelling. In 1882 a famous abecedarium was found on a blackware vase in Formello near Veii by Prince Chigi. This has been called the Formello alphabet.

The oldest complete abecedaria are graffiti on the walls of Pompeii but many ancient alphabets have been found engraved on vases, bowls, ink-bottles and the drinking-cups of children. These are sometimes accompanied by a syllabarium, the writing out of several series of syllables.

Modern abecedaria include alphabet books, illustrated alphabets, photographic alphabets, animated alphabets, acrostics and poems. An abecedarium is the linear representation of the alphabet in a fixed order. A syllabarium is the two-dimensional representation of the writing system in a matrix.

The term syllabarium was used to label the visual representation of Cree, Inuit and many other writing systems of the Aboriginal Peoples of North America. The term means the representation of the characters in a syllable chart; it does not serve to classify the script as a syllabary. Unfortunately, today the syllable chart is usually called a syllabary and the term syllabarium has been lost.

Chinese writing has usually been represented by a chart of the 214 radicals or keys. However, it was also represented by books of rhyme tables. These were, in effect, syllabaria.