Friday, July 22, 2005

Books in my Bag

I will be out of town and don't expect to have access to a computer for a few weeks. I have only just scratched the surface of most of my topics, so this does not reflect writer's block. I'll be back. I know that there are topics which have been suggested that I have not mentioned. Apologies. I had no idea it would be so hard to choose just one thing to blog about every day.

I missed the "books on my desk" meme so here are the "books in my bag".

Deafening. Frances Itani. 2003.

War Trash. Ha Jin. 2004.

A Hundred and One Days: A Baghdad Journal. Asne Seierstad. 2003. (author of The Bookseller of Kabul.)

Le Paria du Danube. Jean Thuillier. 1983.

My Name is Red. Orhan Pamuk. 2002.

Optimus Cont.

I have fallen behind - this is to be expected when you are camping in the past.

Mike Kaplan, in Sorting It All Out, has posted twice on the Optimus Keyboard.

The so-called Ultimate Keyboard
Is the Optimus keyboard just a myth?

Well. I have to admit - my first thought was - it's a joke, then, it's not a joke, it's great, it's not so great, it's possible, it's not possible, never mind, read Mike's posts on this.

And particularly note his reference to the Tablet PC - okay I haven't tried one of those. Hmm, it sounds like it's worth looking at.

Real Character

While messing about on the internet looking for vestiges of a universal writing system, one that I was assured existed, from reading Johanna Drucker's most serious work The Alphabetic Labyrinth, I found this.

Okay, so this is a recently fabricated text. However, it is written in a writing system that was created in the 17th century, when a universal writing system was still considered viable.

This message was posted on the internet with a challenge to anyone who could decode it. Here is the story of Todd Garrison who undertook this labour of classical proportion and, without the help of Johanna Drucker, followed the trail to the end. What a story!

For those who want to skip the story and get on with the weekend, I can assure you it is only that, a story, a vehicle of entertainment, even arcane and of no technological significance whatsoever.

The writing system was called Real Character and reading about it has given me some idea of what some Europeans were thinking about language in the 17th century. If you are interested in scripts this also might illustrate how an ideographic script would operate. Well, maybe that is going too far. However, the invention of this script was inspired by the belief in a method of writing that would communicate ideas directly to the mind.

A Universal Writing System

My head has really been off in the clouds. While writing about DeFrancis and Chinese, I have really been thinking about the desire in 17th century Europe to create a universal writing system. I hope this quote from Diodorus about Egyptian will give an idea of what a universal system would be. "Not syllables to render an underlying sense" but "drawing objects whose metaphorical meaning is impressed on the memory." (see below) That would mean bypassing the spoken word.

That is a description of what westerners thought Egyptian and Chinese writing was. It turned out that Egyptian and Chinese writing do, in fact, represent consonants or syllables to give an underlying sense. However, the notion that they might represent thought directly had a profound effect on Western thought - one that is not easily relinquished.

(This is my last DeFrancis quote for the month since I will soon be offline for a few weeks.)

"Up until two hundred years ago the prevailing view about Egyptian was that it simply was not a phonetic system of any kind. This was the opinion of a Greek historian, Diodorus Siculus, who visited Egypt in the first century B.C., a time when the traditional script was still in use. He was the first to describe the writing. Obviously impressed chiefly by its appearance, and not understanding how the script really worked, he said that the Egyptians called their peculiar symbols "hieroglyphs" and that "their script does not work by putting syllables together to render an underlying sense, but by drawing objects whose metaphorical meaning is impressed on the memory" (Pope 1975:17) For two thousand years decipherment of the script was held back by the tenaciously held belief that the signs were symbolic and not phonetic.

The signs certainly do give the impression of being mere pictographics. Many, strikingly iconic, depict recognizable objects - a hand, an eye, an owl, a snake, a giraffe, and many others. Some symbols are more stylized, but, they too, often suggest things or actions. The belief that these symbols conveyed thought without regard to sound was reinforced when Westerners came in contact with the Chinese system of writing. Here was another system that was believed to be symbolic and nonphonetic, a property credited with giving to Chinese characters a timelessness and universality unmatched by scripts that were acknowledged to be tied to particular forms of speech. The discovery of another system with such marvelous communicative power helped spark an interest, shared by Leibniz and other leading thinkers, in developing a universal writing system." DeFrancis. Visible Speech. 1989. 151.

Thursday, July 21, 2005

The Ideographic Myth

I have Visible Speech, DeFrancis, 1989, the hard copy book, from the library and I have to return it soon so ... I need to record a few more DeFrancis quotes for myself.

Here is a page from Visible Speech, 1989, which does not appear on the Pinyin Info website but which adds to the argument which DeFrancis makes in The Ideographic Myth, the sample chapter from The Chinese Language: Fact and Fantasy, 1984, posted on Pinyin Info. I am providing it as both an introduction and a supplement to this chapter. It is not a summary of the chapter, but a taste ...

"Writers who refer to Chinese characters as "ideographs" can be divided into several groups. One consists of those who are essentially innocent of any knowledge of Chinese and really believe that the characters represent ideas and not sounds. Another group consists chiefly of specialists in Chinese who recognize in varying degrees that Chinese characters represent sounds but consider this to be immaterial on the grounds that they can directly convey meaning to the eye. ... Another representative is the French scholar Georges Margoulies, author of La Langue et l'ecriture (1957), a work that was written for a popular audience.

Still another group includes many people who use the term ideographic because it is the most popular designation for the characters, just as sweetbread is used as the common designation for an item of food that is neither sweet nor bread. To my intense chagrin, I used to belong to the third group, on the rather unthinking grounds that I should go along with whatever was common usage. It was only on reading Margoulies that I was awakened to the error of my ways.

Margoulies presents an extended essay extolling the superiority of Chinese "ideographs" as symbols which convey thought directly to the mind without having to rely on the phonetic information they contain, and do so so well that they could function as a universal system of writing. I was much annoyed by the book. ...

To counter the nonsense purveyed by writers like Margoulies, and to expiate my own sin in this area, I have dealt at length with the issues involved in various works. One chapter of The Chinese Language: Fact and Fantasy deals specifically with the ideographic myth. The present work extends to writing in general the refutation of the widely held notion of ideographic writing. The concept of logographic writing is also rejected both here and there." p. 222

I consider this to be a good introduction to The Ideographic Myth. But it also brings my thinking back to the desire in Renaissance Europe for a universal system of writing. Europeans wanted Chinese to be ideographic as proof that there could be a universal system of writing that would bypass particular forms of spoken language and communicate thought directly to the mind.

Understanding the nature of the ideographic myth helps me to understand what people were looking for from an ideal writing system in Renaissance Europe.

Wednesday, July 20, 2005

Saki Mafundikwa

It is time for some eye candy. Not that I don't take this book seriously, I do - but this is a tip off that the interest is visual. No wonder, the author is a graphic designer. Saki Mafundikwa is director of the Zimbabwe Institute of Vigital Arts, in Harare, Zimbabwe. He has written Afrikan Alphabets: the story of writing in Afrika.

Here is a summary of a talk he gave to introduce his book. What attracted me to this book originally was this detailed and illustrated synopsis of his book (don't miss this and be sure to *scroll down) which Mafundikwa posted on the Ziva website in 2000.

"Due to the proliferation of the personal computer there is an explosion of typography design, young Afrikan designers can reach into their rich heritage and come up with a whole new typographic language. Designers from other cultures can also dig into this brand new bag that's been brought to the table for inspiration. It might seem like I keep talking about designers but the truth of the matter is that this book will benefit the general population since most people's conceptions of Afrika are formed and shaped by Hollywood (Tarzan et al) and the news media's fascination with reporting on Afrika only when there is negative news to report. " Saki Mafundikwa.

Mafundikwa not only writes about the scripts of Afrika, but the Ziva website presents contemporary font designs of its students.

What intrigues me here is that I now realize there is a wide literature on writing systems from the point of view of their graphic design. The ethos of a university linguistics department insulated me from this world of thinking about scripts as graphic elements. Maybe it is just me ...

*I have to add that working with children, well, you know ... instead of scroll down I have heard children say "squirrel down". Isn't that a treat?


I found this comment from Konrad Tuchscherer, Aug. 2, 2003, in Qalam, a forum about writing systems.

"The site devoted to 'Afrikan' alphabets, apparently put together by a graphic designer, is unreliable (and not just for Kikakui). But this isn't strange, since there is a lot of material available on the www relating to African scripts, much of it wrong (even absurd).

I see the Vai syllabary listed, with date of origin given incorrectly as 1883. The Vai script dates to 1832 or 1833. For the most recent information on the Vai script, see Konrad Tuchscherer and P.E.H. Hair, "Cherokee in West Africa: Examining the Origins of the Vai Script", _History in Africa_, Vol. 29 (2002), pp. 427-486. On page 440 is the earliest extant manuscript in the Vai script. The Mende Kikakui script dates to ca. 1917. I call it "Kikakui" because that is the only name Kikakui literates give it!

The Mende part is added so that people know the people/language I am referring to. It was devised by Mohammed Turay (born ca. 1850), an Islamic scholar, at a town called Maka (Barri Chiefdom, southern Sierra Leone). One of Turay's Koranic students was a young man named Kisimi Kamara. Kamara was the grandson of Turay's sister. Kamara also married Turay's daughter, Mariama. Turay devised a form of writing called 'Mende Abajada' (meaning 'Mende alphabet'), which was inspired in part by the Arabic abjad and in part by the Vai syllabary.

Turay's 'Mende Abajada' was adjusted a bit (order of characters) by Kamara, and probably corresponds to the first 42 characters of the script, which is an abugida. Kamara developed the script further (with help from his brothers), adding more than 150 other syllabic characters. Kamara then popularized the script and gained quite a following as result -- which he used to help establish himself as one of the most important chiefs in southern Sierra Leone during his time (in several places I have read about 'Kisimi Kamara' the 'simple village tailor', which is absurd!)

Kikakui is still used today, but perhaps by less than 500 people. There is also an associated number writing system, which is entirely original (and, like the characters of the script, written from right to left).

I did research on Kikakui in Sierra Leone, on and off, from 1990-94 (first in the south, and after the 'war' started, in the east). The most complete study of Kikakui is: Konrad Tuchscherer, The Kikakui (Mende) Syllabary and Number Writing System: Descriptive, Historical and Ethnographic Accounts of a West African Tradition of Writing (Ph.D., University of London, 1996). Since that might be difficult to lay hands on, there is some good information on the history of the script in: Konrad Tuchscherer, "African Script and Scripture: The History of the Kikakui (Mende) Writing System for Bible Translations", _African Languages and Cultures_ 8, 2 (1995), pp. 169-188."

Tuesday, July 19, 2005

The Origins of Writing

Here is one of the really significant new ideas DeFrancis had. (At least he doesn't reference anyone else for it so I will quote him as the source for now.)

He saw that in all languages where full writing developed, "the syllable was usually the unit of meaning." Full writing was the coming together of syllable and morpheme. It is one of the extraordinarily clear ideas which, after you see it, seems obvious, but someone had to be first to observe and record this fact. I believe it was DeFrancis and shall give him the credit until proven otherwise.

"It is probably not accidental that the three seemingly unrelated inventions of writing - Sumerian, Chinese, and Mayan - which we now take up in detail, were all based on the syllabic principle. There can be little doubt that it is easier to conceptualize a syllable than to analyse utterances into their smaller phonemic units. This is especially likely to be the case if the syllabic structure of a particular language, when compared to that of other languages, possesses special features that make it easier to concentrate attention on the syllable.

Such indeed was the case for Sumerian, Chinese, and Mayan. In all these languages, more so than in English and many other forms of speech, the syllable was usually the unit of meaning. It was often even an independent word. To be sure, this semantic feature should not be exaggerated, as is frequently done by those who misrepresent the languages as 'monosyllabic' in the sense of consisting exclusively of words of one syllable. It is only in relative terms that their syllables are more heavily endowed with meaning and that their words consist of one syllable." DeFrancis. Visible Speech. 1989.

DeFrancis claimed that all full writing systems are phonographic, either syllabic or alphabetic, and thereby overthrew Isaac Taylor's previous paradigm of logographic, syllabic, alphabetic. That Chinese, Mayan and Sumerian happen to also have many morphemes of one syllable does not make their writing any less phonographic but only adds the dimension of morpheme to the syllabic units.

John DeFrancis

I wrote about Isaac Taylor in my first post on this blog. I intended to get back to him but haven't so far. He wrote The Alphabet in 1883 and it has been considered the first scientific work on writing systems written in English.

In the 20th century I believe that there was also one book which brought about a paradigm shift in writing system theory. Visible Speech: The Diverse Oneness of Writing Systems. by John DeFrancis, 1989.

In a discussion group last year on several occasions when cornered on a definition or idea, I said, "I think that comes from DeFrancis, doesn't it?" Now I am rereading it to reassure myself. Yes, he did say all that and much more. The only problem is that someone asked Dewho?? So, if you want to read one book - this would be my choice. But then who wants to read one book?

I have added a couple of book lists to the sidebar and I want to particularly mention Gary Feng's book list. Gary has a great blog on cognitive psychology and writing systems but it seems to be off for the summer so I'll try to link up to it later.


Several books of DeFrancis are listed on the Pinyin Readings section of the Pinyin Info website and a chapter of each has been made available to read. This is a great resource.

(I found this site a couple of years ago when I first started using google and typed in "morphosyllabic" - there it was in one of DeFrancis' books! It's funny now to think about the first word that I typed into google!)

UnicodeInput Utility

One of the blogs I read belongs to Mike Kaplan who created the Trigeminal website, featured in the sidebar. (I do know that my sidebar is a little puny, but I have had the habit of bookmarking sites in my favourites for ever - and can't seem to change the habit. That is also why I don't have an RSS feed.)

Mike writes enough posts every day in Sorting It All Out that he can afford to be both entertaining and technically detailed - quite an achievement! One of the many pluses of reading Mike's blog is that each post is brought to you by a Unicode character.

"This post brought to you by "ש" (U+05e9, a.k.a. HEBREW LETTER SHIN)"

Remember those long ago days ...

Yesterday he posted on a Unicodeinput Utility. I haven't tried it yet but it looks like it will answer a question I asked last month about how one could input by codepoint. I have always just input letters or characters, of some kind or other, but I have wondered what it meant to input by 'codepoint'. This needs to be in my list of resources. Thanks, Mike.
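
For anyone else who has wondered what inputting by codepoint means, here is a minimal sketch in Python. It is not Mike's utility, and the helper names char_from_codepoint and describe are made up for illustration. The idea is just that every character has a number, and you can ask for the character by that number instead of pressing a key for it.

```python
import unicodedata


def char_from_codepoint(spec: str) -> str:
    """Turn a code point written as 'U+05E9', '0x05e9' or '05e9' into a character."""
    cleaned = spec.strip().upper()
    for prefix in ("U+", "0X"):
        if cleaned.startswith(prefix):
            cleaned = cleaned[len(prefix):]
    # The remaining text is a hexadecimal number; chr() maps it to the character.
    return chr(int(cleaned, 16))


def describe(ch: str) -> str:
    """Go the other way: from a character back to its code point and Unicode name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


shin = char_from_codepoint("U+05e9")
print(shin, describe(shin))
```

Typing the code point U+05e9 gives the same shin that signs off Mike's post above, and describe reports its official name.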

A Chinese Typewriter

I have been thinking about how we might one day live in a post QWERTY world. However, I love going back into the past and finding out how some of the same problems were approached, and sometimes solved, in a different technological era.

This morning (well, about noon where I am) Jimmy Ho sent me this image of a Chinese typewriter. I wasn't exactly sure of how to describe it but I have found this 1980 patent for a similar Chinese typewriter.

"A Chinese typewriter comprising a keyboard for the input of numerical and command signals, a control circuit for control of the system, a rotating drum carrying a film strip on which are optically stored a plurality of Chinese characters, a CRT display for verifying the desired character, and a printer."

While I am not sure if this is identical, the patent description, quite long, does make for a fascinating read. There are many social and theoretical asides in the description that give interesting insights into the interaction of the Chinese writing system and technology.

"It is therefore an object of the present invention to provide a typewriter for Chinese characters (or other ideographic characters) that is mechanically simple, that can be made relatively inexpensively, and that is relatively small, compact and light weight. It is a further object of the invention to provide such a typewriter that can be operated after a relatively short training period, and without knowledge of the Chinese language or of the origin and meaning of the ideographs. It is also an object to provide a Chinese typewriter that can be operated at speeds comparable to those at which alphabetic typewriters are operated."

I also found this site about a visit to the Harbin Welding Institute and more from the Science and Society Picture Library.

This kind of then and now comparison is getting to be quite addictive. Thanks, Jimmy!


Jimmy added this comment. I think I was so excited to get a photo from far away and post it all in a matter of minutes that I didn't realize I should wait for Jimmy's commentary to go with it. Live and learn.

"I took that picture around 1990 in a small printing workshop in Beijing. What strikes me about the machine is that it looks like a hybrid of a regular alphabetic typewriter and those letter cases (sorry, I lack the appropriate terminology) used by the typographers. To put it shortly, while there is the paper cylinder (or "drum"?) and a familiar general shape, the absence of a keyboard is what makes it so unique, definitely not a common "daziji" (that reminds me of my amusement when I remarked the old Cyrillic typewriters that are still in place in some Chinese administrations).

At the time, I was told how many movable characters there are, but I didn’t write it down; an estimation based on a close-up of the "case" should be possible, though. Another thing I should have asked is the system commanding the characters arrangement (obviously, I was still a beginner in Chinese).

Once you know the principles that guide the disposition of the characters (frequency order is only one out of many possibilities), you only have to get used to the manipulation of the articulated arm that "takes" the character and types it on the paper (you can see on the picture that a character is already out of the case, ready to be projected on the drum), and you can then compose a page in a reasonable time.

This is how it was described to me, but, unfortunately, I haven’t seen it work, because I happened to visit the room off schedule.

I am intrigued by the patent you link to, and I am not sure if I understood it correctly, but it surely doesn’t match the machine in the picture. With its 'rotating optical storage device [that] has stored visual representations of characters' and 'keyboard providing keys for generating numerical signals representative of numerical values corresponding to an ideographic character,' it describes instead a much more sophisticated kind of typewriter."

Monday, July 18, 2005

A Post QWERTY World

Here are today's comments from the Unicode list on the Optimus keyboard. They have a good point. We all want a post-qwerty world but what is the best way to get there?

Ken Whistler:

"I remain an utter skeptic about the usefulness of LED keyboards. They will serve a niche market, of course, but for most typists, they would be counterproductive. The whole concept of moving your hands *off* the keys to *read* the keys, and then back on the keys to key your input -- particularly for complicated IME's that change the state of input as you go -- strikes me as at best un-ergonomic and at worst frustrating and inefficient. The better alternative, all along, as far as I am concerned, has been to use the *visual* environment -- namely the monitor you are already looking at -- to provide a virtual keyboard that matches whatever the input state is. That avoids the hardware issues of LED keys (including the fact that keyboards get dirty), and lets you focus your visual attention on the screen and your tactile attention to the keyboard. Furthermore, a visual keyboard can also respond to pointing device (mouse, whatever) input as well, so that you can "hunt and peck" on it with the mouse, if you like, for unfamiliar keys, or to deal with the shift states of IME's."

And Raymond Mercier:

"We are more bound by the tactile experience of the keys than we imagine, even those of us who are not real touch typists. I tried a new keyboard recently (mis-designed by Belkin) that had the unfortunate characteristic that the character appeared on the screen only after the finger was lifted from the key, instead of appearing on the downstroke of the key. It was quite impossible to type at normal speed. That experience rather suggests to me that normal typing would be very frustrated by having to pay so much visual attention to the keys."

Personally I have been using virtual keyboards for multilingual keyboarding and really like them. However, they are often not the keyboards bundled with my software. I have written about some of these keyboards before.

I did try downloading a few keyboards but even then I could never get the functionality out of them that a javascript application offers. However, these are limited to prototypes or samples, limited window space etc. They suit my cognitive processes but are not adequate for real world word-processing.

When it comes to keyboards, I am still looking for a solution.

That reminds me - I tried checking email on my cellphone the other day. Now, that felt like magic. I found that it took a minute to train my reflexes to get the buttons pushed, 4x for S, 2x for U and 4x for Z, then I was off to the races. I could definitely get used to that.
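
For the curious, the button counting can be sketched in a few lines of Python. I am assuming the usual phone keypad layout (abc on 2, def on 3, and so on up to wxyz on 9); the names KEYPAD, PRESSES and presses_for are my own, not anything from the phone.

```python
# Assumed layout: the common 12-key phone keypad, letters only.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# For each letter, record which key carries it and how many presses reach it.
PRESSES = {
    letter: (key, position + 1)
    for key, letters in KEYPAD.items()
    for position, letter in enumerate(letters)
}


def presses_for(word: str) -> list[tuple[str, int]]:
    """Return (key, number of presses) for each letter of the word."""
    return [PRESSES[ch] for ch in word.lower()]


for letter, (key, count) in zip("SUZ", presses_for("SUZ")):
    print(f"{letter}: press {key} {count}x")
```

This reproduces the counts above: S is the fourth letter on its key, U the second and Z the fourth.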

I need to add that these are two of my favourite keyboards: the Syllabic Keyboard Editor for Tamil and Q9 input for Chinese. You really have to use them, click around and try them out, even if you can't read what you have written.

Having said that, some people are more interested in a different layout for the alphabet - that's important too. We all want a post-qwerty world.

Question: How should I spell post-qwerty, with or without a hyphen, with or without caps? Too bad - it will never google anyhow.


Sunday, July 17, 2005

Optimus Update

This is an update from the Art Lebedev Studio.

Frequently Answered Answers about the Optimus keyboard

"It's in initial stage of production
We hope it will be released in 2006
It will cost less than a good mobile phone
It will be real
It will be OS-independent (at least it can work in some default state with any OS)
It will support any language or layout
Moscow is the capital of Russia
Each key could be programmed to produce any sequence
It will be an open-source keyboard, SDK will be available
Some day it will be split ('ergonomic')
It will most likely use OLED technology (e-paper is sooo slow)
Our studio is located two blocks from the Kremlin
It will feature a key-saver
Keys will use animation when needed
It has numeric keypad because we love it
There's no snow in Moscow during Summer
It will be available worldwide (why not?)
OEM will be possible (why not?)
Contact us for hi-res images, or interview inquires
We want to thank everyone for the support.
Stay tuned for our next projects."

Here are some reactions to the Optimus Keyboard from the Unicode list.

"I was specifically thinking about Japanese input. And I am curious how that would be implemented. Of course, it is not so much the syllabic part of the input that is a problem (there are Japanese kb layouts that assign one syllable to one key, even if most Japanese users type with combinations of qwerty letters.) It is rather the fact that the qwerty input part is converted 2 times: first time when the second letter is associated to the first to produce a Japanese kana, second when the kana (on its own or associated with other kanas) is converted to a kanji."

"The LED keyboard in effect levels the field because the keyboard can visually represent any layout you select. The changing visual key sequences idea ... is another interesting dimension."

"I am surprised that there is even some uncertainty as to how such a keyboard might be used for non-latin scripts. Consider, for instance, its use with Simplified (mainland) and Traditional (Taiwan) Chinese and Japanese: In all three cases, there exist four or five commonly-used input methods designed to be used with a standard QWERTY keyboard, that depend on remapping keys. The only alternatives are expensive, huge (largely unstandardised) keyboards originally designed for typesetters."

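The double conversion described in the first reaction above can be sketched roughly like this in Python. This is only an illustration under heavy assumptions: the two lookup tables are hypothetical and tiny, where a real input method uses large dictionaries and lets the typist choose among several kanji candidates.

```python
# Hypothetical, deliberately tiny tables; a real IME has thousands of entries.
ROMAJI_TO_KANA = {"ka": "か", "n": "ん", "ji": "じ", "ni": "に", "ho": "ほ", "go": "ご"}
KANA_TO_KANJI = {"かんじ": "漢字", "にほんご": "日本語"}


def romaji_to_kana(text: str) -> str:
    """First conversion: groups of qwerty letters become kana (longest match first)."""
    kana = []
    i = 0
    while i < len(text):
        for size in (3, 2, 1):
            chunk = text[i:i + size]
            if chunk in ROMAJI_TO_KANA:
                kana.append(ROMAJI_TO_KANA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no kana for input starting at {text[i:]!r}")
    return "".join(kana)


def kana_to_kanji(kana: str) -> str:
    """Second conversion: a kana string becomes kanji if the table knows it."""
    return KANA_TO_KANJI.get(kana, kana)


def convert(romaji: str) -> str:
    """Both conversions in sequence, as the typist experiences them."""
    return kana_to_kanji(romaji_to_kana(romaji))


for word in ("kanji", "nihongo"):
    print(word, "->", romaji_to_kana(word), "->", convert(word))
```

What the keys would need to display changes between the two steps, which is why the writer wonders how a keyboard with a display on every key would handle it.
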
Saturday, July 16, 2005


I was recently contacted by someone from the spelling society. Steve Bett has a website on the alphabet, a fascinating page of related links, and is the editor of the journal of the Simplified Spelling Society.

To get an idea of how a simplified spelling might look, visit this page for a paragraph from The Little Red Hen in a dozen forms of simplified spelling. I don't think any of these can be labeled a different script. However, they stimulate thought and discussion. Then I found Steve's pages on Unifon and other alternate transcriptions.

Unifon is indeed a different script. (I do need to mention that unless you have the Unifon font installed it doesn't display as it should.) In actual fact the Unifon font is considered a variation of the alphabet. Since there is no lower case in Unifon, the extra 18 letters of the Unifon system are keyed in by using the lower case English alphabet.

I then flipped through my latest acquisition from the bookstore, The Writing Systems of the World by Florian Coulmas, 1989, and found this image labeled "a transitory alphabet for English."

Unifon was invented in 1959 by a Chicago economist John Malone. You can read more about its history here.

My first reaction to this 40 letter alphabet was that it also has several letters derived from More's Utopian alphabet. It comes from the same tradition of Utopian internationalism and is one of the scripts featured by the World Language Process.

I wonder whether these scripts, Utopia, Elizabethan stenography, Moon code for the blind, Cree and Unifon can be considered part of the same family of scripts by virtue of their visual construction or glyph. Writing systems are usually labeled by the manner in which the symbols represent the phonology of the language. However, these scripts not only have symbols which look the same but they are all connected by the themes of accessibility and universality.

I leave you with a word of caution. Here is Coulmas on the reform of English orthography.

"For instance, if the principles of morphemic invariance, etymology, homograph avoidance or deviant proper name spelling, all of which play an important part in the present English spelling, were discarded for the sake of a rigorously phonemic orthography, the result would be too strange and involve too many changes in the established spelling habits to be accepted by the literate part of the speech community." Coulmas. Writing Systems of the World. 1989. p. 256.


Friday, July 15, 2005

The Rosetta Stone

Fri, 15 Jul 2005 19:41:39 +0100
To: "Unicode Discussion"
From: "Michael Everson"
Subject: Happy Rosetta Stone Day

"According to the Wikipedia, French Soldiers in the Egyptian port city of Rashid uncovered the Rosetta Stone (indeed the Stone of Rashid) on 1799-07-15."

The Rosetta Stone is a large granodiorite stone found in Rashid, Egypt in 1799, which contains an inscription in two languages and three scripts. The languages are Egyptian and Greek. Coptic, the latest stage of the Egyptian language, is now used only in the Coptic church.

The scripts are hieroglyphic, demotic and Greek. (Enlarge this image to view the Rosetta Stone in detail. It is 2480 x 3290, so keep clicking till you get the detail.)

Since Coptic was only a lesser known language, but not an unknown language, the Rosetta Stone helped but was not critical in the understanding of a language. However, it enabled the deciphering of Egyptian hieroglyphics as a script. The most amazing thing of all is that hieroglyphics turned out to be a largely phonetic script and not just a set of ideographs. Go figure!

"Hieroglyphs consist of three kinds of characters: phonetic characters, including single-consonant characters, like an alphabet, but also many representing two or three consonants, logographs, representing a word, and determinatives, which indicate the semantic category of a spelled-out word without indicating its precise meaning." Wikipedia

The Wikipedia article explains that:

"Rosetta Stone is also used as a metaphor to refer to anything that is a critical key to a process of decryption, translation, or a difficult problem, ... "

In a narrow sense Rosetta Stone can be used as a metaphor for a key to the deciphering of any set of written symbols.

However, the most popular uses of Rosetta Stone are for the foreign language software and the Rosetta Project: Building an archive of all documented human languages. This confusion of script and language is very common but particularly unfortunate for me since there are several scripts that I can decode and keyboard without knowing the language. (I try to take this intermingling of language and script not as a personal affront but as a "cross-eyed bear.")

In general Rosetta Stone is used as a metaphor for the solution to a problem, or for the key to unlocking any set of unknown data.

Five great links for finding out more about the Rosetta Stone:

British Museum
Minnesota State University
University of Oregon
Cleveland Museum of Art

The Optimus Keyboard

There are several alternative keyboards which have been designed over the years. I haven't gotten around to reviewing them yet. Apologies to those who requested them. They are on my list.

However, this is a new design which was posted on the Unicode list today. The Optimus Keyboard.

"Every key of the Optimus keyboard is a stand-alone display showing exactly what it is controlling at this very moment."

"Optimus is good for any layouts—Cyrillic, Ancient Greek, Georgian, Arabian—and so on to infinity: notes, numerals, special symbols, HTML codes, mathematical functions."

Someone out there is doing some alternate thinking.

An Abecedaria Abecedarium

Alessan said...

"When you can get drunk on writing, what do other things matter?

Alcohol behind closed doors, everywhere -
finally going home.
I just kicked last-orders - man!
Noble obsession, picking queries
regarding symbols through undertones,
values, weights, xenography,
zymurgy, &c."

Abecedaria blog condones divine elixir .....

Ampersand is a fitting last symbol. I am going to let readers research "zymurgy" as I did.

Wednesday, July 13, 2005

Shorthand Writing Systems

Shorthand was known and used in classical Greece and Rome. That is a very interesting story and perfectly good for a rainy day.

In medieval times shorthand fell out of use as a system in its own right. In 1588 there is a reappearance of shorthand systems. Timothy Bright was the author of the first shorthand "Characterie; An Arte of Shorte, Swifte, and Secrete Writing by Character 1588." Unfortunately I don't know at this point what this system looked like. That's the way the cookie crumbles.

However, I do know what John Willis' shorthand system looked like. The Art of Stenographie, 1602, by John Willis, introduced the phonetic shorthand style that was to be used from 1602 until 1837, when Pitman's shorthand replaced it in common use. Although there were 48 different publications of shorthand systems during that time, they all used a similar graphic form.

These systems are all illustrated in The Alphabetic Labyrinth by Johanna Drucker, 1995. John Willis' Stenographie is also well illustrated in The World's Writing Systems by Peter T. Daniels, 1996. There is no internet image available for these systems. Some day I will make one - on that future rainy day.

However, in Willis' system 16 symbols out of 22 are identical to the Cree syllabary and William Moon's blind code. The link is there, very clear and obvious. The system is based on the rotation in four orientations of basic symbols similar to the central 10 symbols in More's Utopian alphabet. The importance of More's system is that it is the first I have seen which demonstrates rotation as a basic feature for creating a symbol.

In John Nichols' chapter on Cree in The World's Writing Systems, John states that Cree was based on shorthand. The confusion has come from the fact that neither he nor anyone else that I have read actually said that Cree was based on the shorthand systems that existed before the Pitman system was published in 1837. It is possible that this has been known by many writers on Cree but it has not been explicitly illustrated. And I cannot do that today either.

In any case, there is a system of visual glyphs, which appear to be derived from the Utopian alphabet in 1529, that were used as shorthand from 1602 to 1837. Then in 1837 they were replaced as the dominant shorthand system and reappeared as the Cree syllabary in 1841 and the Moon code in 1843. The type of writing system varied but not the actual shapes of the symbols. Shorthand is phonetic, Moon code is an alphabet and Cree is a syllabary.

I consider this a lame duck post since I cannot illustrate what needs illustrating.

Tuesday, July 12, 2005

The Spelling Book

1690 New England Primer

Konrad Tuchscherer mentions one other possible influence on the creation of the Vai syllabary - the English speller.

"There remains, however, a possible mission influence on the creation of a syllabary which turns, not on individuals, but on one piece of school equipment, the spelling book. ... As even today, the spelling books presented certain words broken up into syllables. We know that Sequoyah, before he invented his syllabary, saw a spelling book. It is ... just possible that ... this gave them the idea of analysing and representing language in syllables." p. 480

Here is where a picture is worth a thousand words. What did a speller look like in the 1830's? This illustration is from a 1690 New England primer. There is evidence that reading was taught in this way up until the 1840's if not later.

"In his lectures and reports beginning in 1841, Mann attacked the alphabetic and syllabic methods of teaching reading as meaningless repetition of "skeleton-shaped ghosts." He pointed out, for example, that l- e- g, does not spell "leg" but "elegy"." From Teaching Reading - A History by R. M. Wilson.

Certainly other missionaries were still using a syllabic method to teach reading in 1859. In the Sandwich Islands the missionaries were having students copy "the alphabet, syllabic, and reading lessons of the spelling-book, and the scripture extracts" Polynesian Researches chap.1.

Could the mundane English primer be the inspiration for the Cherokee and Vai syllabaries?

Cherokee and Vai

I have been reading Cherokee and West Africa: Examining the Origins of the Vai Script by Konrad Tuchscherer, History in Africa 29, 2002. As the title suggests this article examines the many references to a connection between the Cherokee and Vai scripts and makes additional evidence available. As Tuchscherer explains, this story has been mentioned before but he provides further evidence.

While it is accepted that the Vai syllabary was invented by Momolu Duwalu Bukele and other elders, without the known direct involvement of any missionary or outsider, the similarity in the structure of the Vai and Cherokee syllabaries has suggested a link between the two. There are variations of this account but all agree that the Vai syllabary had an indigenous Vai origin around 1830-34.

Tuchscherer recounts the details of a certain Cherokee Indian, Austin Curtis, who emigrated from the United States to Liberia in 1823, two years after the invention of the Cherokee syllabary in 1821. He settled in Vai country in 1829, three years before the accepted date for the first appearance of the Vai syllabary in 1832.

This discussion concerns not the forms of the individual letters, which may be attributed to ancient ideographs, but the possible transference of the notion of a syllabary. However, Tuchscherer also puts forward the consideration that missionaries in the 1830's were experimenting with syllabaries, based on the recognized success of the Cherokee syllabary.

He has reproduced this passage from an article called "Cherokee Alphabet" by Samuel Worcester, Cherokee Phoenix (21 February, 1828).

"The circumstances of the alphabet being syllabic, and the number of syllables so small, is the greatest reason why the task of learning to read the Cherokee language is so vastly easier than that of learning to read English. When an English scholar recollects the tedious months occupied in his spelling-book, he regards it as a matter of astonishment, and nearly incredible, that an active Cherokee boy may learn to read his own language in a day, and that not more than two or three days is ordinarily requisite. ...

When an English child has learned the names of his letters, he has but just begun learning to read.- The main thing is to learn the combinations of sounds; unless, indeed, it be a still more difficult task, to divest himself of the idea that he must pronounce the name of each successive letter in order to read. If, for an illustration, ba, were to be pronounced be-a, he would soon learn. But after once learning to pronounce the letter be, then to detach from the consonant sound that of the vowel e, and attach to it that of a in one instance, i in another, and so on, and in the same manner to learn a thousand other, and some of them extremely complicated combinations, is a task indeed.

But the Cherokee boy has not a single combination to learn, except that of s with a succeeding consonant; and the name of each character is the syllable which it represents. To read is only to repeat successively the names of the several letters. When, therefore, he has learned two characters, he can read a word composed of those two; when he has learned three, he can read any word written with those three, and when he has learned his alphabet, he can read his language. I say he can read, not perfectly, but he can spell out the meaning, and by practice, may become perfect."

Certainly it is recognized that the success of the Cherokee syllabary was an influence on James Evans in his choice of a syllabary as a type of writing system for Cree, even if not for the forms of his system. Is it worth considering whether the Cherokee syllabary was also an influence on the invention of the Vai syllabary? Others have argued that a syllabary is always the first candidate for an invented script - no direct influence is needed.

The Cherokee Syllabary

Someone asked for a post on Cherokee - about day 3 of this blog if I remember correctly. Whew. I don't know where to start.

The Cherokee syllabary was invented by Sequoyah (George Guess) who presented this writing system to his tribe in 1821.

The syllabary has many forms. First, there is the original handwritten form. In this picture one can see the typewritten forms in smaller print to the side. Soon after, 1829?, the typewritten forms, chosen by Sequoyah, were used for a newspaper, handbills and hymnbooks. Books and Bibles in Cherokee can also be viewed at this site.

The syllabary is organized in this chart by the order that the English letter corresponding to the sound has in the alphabet. It can be sorted by row or by column, although by row is the more common order. (This fascinates me. The Vai syllabary has this option as well.)

However, there is another order altogether. This page shows the "characters as arranged by the inventor" above the "characters systematically arranged with the sounds."

These are some of the finer points. The Cherokee Syllabary is best known for the role it has played in American history as a vehicle of literacy for the Cherokee Nation. With a printing press and newspaper, the syllabary was used for letters, education, communication, Indigenous knowledge and Christian religious publications.

In world history this syllabary has a pivotal role as the first of the modern indigenous syllabaries that later emerged in Canada, Africa and Asia. Sequoyah's invention was a unique contribution to the history of the world's writing systems.

Here are the Unicode codepoints for Cherokee.
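For readers who would rather poke at the codepoints than read a chart, here is a minimal sketch in Python. It assumes the original Cherokee range, U+13A0 to U+13F4, which is what Unicode defined at the time of writing; later versions of the standard added further characters that this sketch ignores. The function name is my own invention for illustration.

```python
import unicodedata

# Original Cherokee range (85 syllables). Later Unicode versions
# added more characters, which this sketch deliberately leaves out.
CHEROKEE_START = 0x13A0
CHEROKEE_END = 0x13F4


def cherokee_syllables():
    """Return (codepoint, character, Unicode name) for each syllable."""
    return [
        (cp, chr(cp), unicodedata.name(chr(cp)))
        for cp in range(CHEROKEE_START, CHEROKEE_END + 1)
    ]


if __name__ == "__main__":
    # Print the first few entries; a Cherokee font is needed to see the glyphs.
    for cp, char, name in cherokee_syllables()[:6]:
        print(f"U+{cp:04X} {char} {name}")
```

The Unicode names give a Latin transliteration for each syllable, so the list can be read even without a Cherokee font installed.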

Online Translation Service

Jean-Philippe SARDA sent me an email about his new free online translation service called Cucumis.

"Cucumis roughly translates as "Watermelon" from Latin, a spherical fruit like the earth, full of vitality and happiness. With about 3000 spoken languages over the world, we hope this website will help us to get to know each other."

"To use the Cucumis services, you must be a registered member, you gain points when you translate a text and you need points to submit a text to be translated. Cucumis will be useful to you only when you speak at least one foreign language."

I imagine he is talking about something more substantial than what would fit in the comment section of blogger! Interesting.

Right now I am working on translating a chapter or two each from several old books on writing systems written in French. And some day I want to read Le Lingue Utopiche which is written in Italian. So, yes, Jean-Philippe, I think the internet is making us all more aware of our need to communicate across languages regardless of where we live.

Only thing is - I never thought of French as being a "foreign" language. I don't think my skills will be in much demand but I am happy to pass this on.

Coincidence #1 - I was eating watermelon when I got this email.
Coincidence #2 - I needed a translation today.
Coincidence #3 - This is the second item with a Latin title I received today.
Coincidence #4 - I am researching the origins of the Utopian alphabet - which was used for writing Latin.


Monday, July 11, 2005

Non Legitur

Marco Cimarosti sent me a copy of his new book Non Legitur: Giro del mondo in trentatré scritture. This book demonstrates how 33 scripts from around the world work. It includes 9 alphabets, 15 Indic scripts, Chinese, Japanese, Korean, Ethiopic, Tifinagh, Maldivian, Yi, Cree, Cherokee and numerals. It demonstrates how these scripts are composed in Unicode and then explains how they can be read.

For Indic scripts Marco gives details for how each syllable is composed. For Chinese he provides a table of characters listed by radical with a pronunciation guide for Mandarin, Cantonese, Korean, Japanese (2), and the meaning.

At the end of each script section he lists 12 international words in that script, words like "telephone" "taxi" "coffee". The reader must test out their skills and see if they can decode these words. The answer key is in the back. (I hope to bring you some of these lists of words in the fall.)

The author hopes that his book will help readers muddle through the tapestry of undecipherable signs as they travel around the world. I know the average person would ask "Why try? I would never understand what the sign says anyway, since it is in a foreign language." Not always so, as I recently found out.

I was waiting for a friend on a street corner and decided to brush up on my Panjabi, which is quickly becoming Vancouver's third language. I was looking at the sign in Panjabi over a fashion store window and was asking myself what the word for 'fashion' might be in Panjabi. After a few mental contortions I realized that the name of the store in Panjabi script was indeed "Fashan stor."

I believe Marco is saying "When abroad, give that foreign script a try. You never know, the sign might say something useful like 'bank' 'taxi' 'maps' etc." Here is how Marco described his book to me last winter. "Just a small guide for people who are curious about the funny letters they have seen during their holidays in the sun."

This book is written in Italian but, given the visual layout of the book and its subject matter, that is only a minor drawback.

I would, however, beg my readers to assist me in translating the first paragraph of the introduction where Marco provides the rationale for the Latin title of his book.

'Quando un copista medievale, in un testo in latino, trovava una citazione in greco, anziché copiarla scriveva "Graecum est, non legitur", cioè "È greco, non si può leggere". Con questa annotazione ammetteva non solo di non capire il greco ma addirittura di non conoscere le lettere dell'alfabeto di questa lingua. Nell'epoca di Internet e dei voli charter sono cambiate tante cose: troviamo normale avere vicini di casa marocchini o giapponesi, andare in vacanza in Grecia o in Thailandia, trovare nel manuale d'uso del frullatore la traduzione cinese o russa delle istruzioni. Eppure, se ci chiedessero di leggere o di copiare una scrittura diversa da quella latina, non sapremmo far altro che esclamare smarriti: "Non legitur!" '

I wish also to mention that while I provided the Tamil wordlist for this book, Marco provided clarity and common sense in discussions about the Cree writing system. Much of my recent investigation into antecedents for Cree are in response to comments he made last year.

Addendum: Translation provided by Simon.

"When a medieval scribe found a citation in Greek in a Latin text, instead of copying it they would write "Graecum est, non legitur", that is, "it's Greek, it can't be read". With this annotation, they admitted not only to not understanding Greek but also to not even knowing the letters of that language's alphabet. In the age of the Internet and of charter flights many things have changed: we find it normal to have Moroccan or Japanese neighbours, to holiday in Greece or Thailand, to find the instructions for the food processor translated into Chinese or Russian. Even so, if we were asked to read or copy a script other than the Roman, we could do nothing more than exclaim in bewilderment: "Non legitur!""

Sunday, July 10, 2005

Meditations on Utopia

It is too much in my nature to rush past new information, jumping to conclusions, head over heels into a novel theory. Or to reduce to bits and pieces some older construction. Today I shall treat as a day of rest and contemplation, to observe and meditate on the forms before me.

I see the predominance of the circle and the rectangle, with a triangle, for the trinity(?) in a nearly central position. I see to the left of the triangle an anomalous shape - it is the only one which eludes analysis.

The more symmetrical shapes resist rotation into different orientations. I was about to say that "f "cannot be rotated, but I stop myself and rephrase it. The rotation of the symbol for "f" cannot be perceived. (I take a detour and consider whether I would want to use an encoding for the Utopian alphabet, but I correct myself and google a Utopian font instead. It exists but I will not use it today.) After all, this is the Latin alphabet in a different visual form: it is not actually an alien writing system.

A, b, c, d, e, f, parallel r, s, t, v, x, y. The alphabet is composed of 6 modified circles, 4 curly and rotated semi-circles, an anomalous figure, a triangle, 4 rotated right-angles, and 6 modified rectangles. An idea suggests itself: if the triangle represents God, could the anomalous reunited semi-circles represent humanity? Maybe, but that is outside of my task today - that is speculation and not observation. I have forsworn speculation for today. It occurs to me that I might one day choose to read the entire text of Utopia to understand this better.

Now back to what I know as fact. Thomas More created a religious and God-centred Utopia. Aside from this single central feature, his work has been compared to that of Plato and Marx.

In spite of the emergence of vernacular written languages in the 1500's, More wrote his book in Latin.* He wished to create a universal work that spoke to humanity as a whole and not to a particular nationality.

More lived on the brink of emerging European nationalism. He clung to an earlier way, an idealistic vision of a united humanity, not divided into races and religions. An odd book Utopia, fettered by the harshness of that time but dreaming of universality and tolerance.

From the printer's note we know that the Indian tongue was "nothing so strange among us" and it makes me wonder if he were referring to Devanagari or Tamil (scroll down and look at the long e). The creator of this alphabet may owe some portion of his invention to Indian alphabets. Perhaps I see some Greek as well. I leave this thought for later.

And reread this passage from The Alphabetic Labyrinth by Johanna Drucker, 1995.

"Attitudes towards languages evolved in the Renaissance from the conviction that language could be analysed for its perfection, as evidence of divine inspiration, to a realization that languages were embedded in human history, were inconsistent, imperfect, and subject to change. The boundaries of cultural experience expanded with increased exploration and trade, increasing exposure to an array of foreign languages which appeared exotic to Europeans whose own language schemes were put into perspective by contrast.

The universal language schemes which emerged in this period, particularly in the 17th century, were motivated by one or more of the following desires: to recover the lost perfection assumed to be embodied in the original language spoken by Adam; to find a system of polyglot translation capable of breaking down the barriers between peoples which were embodied in linguistic differences; and to construct a system in which linguistic categories would designate logical categories in a more perfect relation of language and knowledge. It was written language, rather than spoken language, which lent itself to these proposals, in part because the concrete quality of its visual form lent itself to more ready manipulation within the descriptive systems."

This meditation poses several further questions. What does More's Utopia reveal about 16th century attitudes and beliefs concerning languages and writing systems? Is the graphic representation of the Utopian alphabet a candidate as an early antecedent for the graphic shapes of the Cree Syllabarium? Which Eastern writing systems were known in Europe in the first half of the 16th century?

*If you have seen the movie A Man for All Seasons you may remember that More's daughter, Meg, held her own with Henry VIII in a conversation in Latin. H.8. was not too shabby at Latin himself, I imagine, with his clerical education.

Addendum: I have incorrectly used the term rotations for what should be labeled 'orientations', or vertical and horizontal flips.
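To see the difference between rotations and flips for myself, here is a minimal sketch in Python. The two glyphs are my own crude 3 x 3 drawings, not More's actual letterforms, and the function names are invented for illustration. The sketch treats a glyph as a small grid and counts how many distinct shapes the vertical and horizontal flips produce.

```python
def rotate_cw(glyph):
    """Rotate a rectangular glyph 90 degrees clockwise."""
    width = len(glyph[0])
    return tuple(
        "".join(row[col] for row in reversed(glyph)) for col in range(width)
    )


def flip_horizontal(glyph):
    """Mirror the glyph left to right."""
    return tuple(row[::-1] for row in glyph)


def flip_vertical(glyph):
    """Mirror the glyph top to bottom."""
    return tuple(reversed(glyph))


def orientations(glyph):
    """Distinct shapes reachable by vertical and horizontal flips."""
    return {
        glyph,
        flip_horizontal(glyph),
        flip_vertical(glyph),
        flip_vertical(flip_horizontal(glyph)),
    }


# Hypothetical stand-ins for a right-angle letter and a symmetrical letter.
RIGHT_ANGLE = ("#..", "#..", "###")
SQUARE = ("###", "#.#", "###")

if __name__ == "__main__":
    print(len(orientations(RIGHT_ANGLE)))  # four distinct orientations
    print(len(orientations(SQUARE)))       # symmetrical: only one
```

For a right angle drawn like this, a quarter turn clockwise lands on the same shape as a vertical flip, which is probably why I confused the two. The fully symmetrical shape has only one orientation, so any turning of it "cannot be perceived".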

Saturday, July 09, 2005

The Utopian Alphabet

The Utopian Alphabet of Thomas More

The Utopian alphabet was invented by Thomas More for his book, Utopia, 1516. However, when the book was printed the fonts for his alphabet were not available as this note From the Printer to the Reader explains:

"THE Utopian alphabet, good reader, which in the above written epistle is promised, hereunto I have not now adjoined, because I have not as yet the true characters or forms of the Utopian letters. And no marvel, seeing it is a tongue to us much stranger than the Indian, the Persian, the Syrian, the Arabic, the Egyptian, the Macedonian, the Sclavonian, the Cyprian, the Scythian, etc. Which tongues, though they be nothing so strange among us as the Utopian is, yet their characters we have not. But I trust, God willing, at the next impression hereof, to perform that which now I cannot: that is to say, to exhibit perfectly unto thee the Utopian alphabet. In the meantime accept my goodwill. And so farewell."

We know what these letters look like because they were recorded in Champs Fleury, 1529, by Geofroy Tory, who is famous for relating the letters of the alphabet to the proportions of human anatomy.

This example was given a Latin translation by More:

"My ruler Utopos [Greek for "no place"] made me into an island from a not-island. Unique among lands, and without philosophy, I signify for mortals the philosophical city. I freely share my gifts, and accept without complaint what is better."

I found this alphabet while flipping through The Alphabet Abecedarium by Richard Firmage, 1993. Naturally the Utopian alphabet was in chapter U. About this alphabet Firmage says:

"The new alphabets proposed by linguistic or artistic reformers have become mere footnotes or curious period pieces in the general history of writing. Some alphabets were never intended for general acceptance or even any actual use, however, being merely literary appendages or embellishments. These include what Geofroy Tory called Utopian or Voluntary letters - named after the alphabet devised by Thomas More in his Utopia. These letterforms are a literary conceit or exercise in ingenuity meant to give a flavour of authenticity to fictional accounts of the civilization of imaginary societies." p. 226

These letters show an early attempt to construct a writing system using a set of shapes in four orientations. Does it also reflect a desire to seek a set of ideal letters not based on the arbitrary inherited Phoenician symbols?

PS Humble apologies, dear reader, I have just previous to this accidentally posted a message from the Conlang list that I meant to save as a draft. Being ignorant of blogging etiquette, I did not remove the post but have left it as is.

Moraic Writing Systems

Dirk Elzinga, Con Lang List 5 Oct 2000

"Okay, here's the poop from a phonologist working within the Generative tradition (that would be me). A mora is a unit of syllable weight. Long vowels have two moras, short vowels have one. This difference shows up in stress systems: in many languages there is a principle whereby heavy (i.e., bimoraic) syllables attract stress; this is known as the Weight-to-Stress Principle (WtS).

Languages which have been cited as examples of the operation of WtS are Latin, Hindi, Yupik, and (various varieties of) Arabic, among others. In many languages where WtS is operative, syllables which are closed by a consonant also count as heavy; in these cases, one speaks of a consonant being moraic "by position" (i.e., as the coda of a syllable). Again, Latin has been cited as an example of such a language, as has Arabic.

There are also languages which show effects of WtS, but which do not count closed syllables as heavy; Shoshoni is such a language. That's why linguists distinguish consonants which are moraic "by position" from those which are not.

(In an interesting twist, Yupik shows both kinds of patterns. If a consonant closes one of the first two syllables of a word, that syllable counts as heavy and attracts stress. Closed syllables following the second syllable are out of luck and are always light, unless they contain a long (bimoraic) vowel.)

Pitch accent systems, such as found in Japanese and (presumably) Ancient Greek, may follow different rules. Using the term 'mora' to describe such systems may therefore be confusing to linguists such as myself who know the term as applied to stress systems, regardless of the weight of tradition supporting such usage. It's just our poor luck that we didn't get a good Classical

For me, the debate over whether Japanese is "moraic" or "syllabic" is a non-issue; Japanese is clearly divided into syllables, and is just as clearly sensitive to the weight of those syllables. The issue for me in Japanese is a representational one: is the onset of a Japanese syllable adjoined to the first mora, or is it adjoined to the syllable directly? Thus for the final syllable of the word _hatten_ the debate is over the following representations (best viewed in a monowidth font):

        s            s        syllable
        |\          /|\
        m m        / m m      mora
       /| |       /  | |
      t e n      t   e n      segment

I agree with linguists who argue for the former, the representation in which the syllable onset is adjoined to the first mora. Evidence for this representation comes from a language game in which the final mora of a word uttered by the first player must serve as the initial mora uttered by the second. Players exchange words following this pattern for as long as possible. If a player uses a word ending in the moraic nasal she loses, since the moraic nasal may not begin any syllable in Japanese.

As Nik has already noted, the pitch accent system is sensitive to syllable boundaries; hence syllables must also be recognized as prosodic units in Japanese.

Comments/corrections welcome."

Dirk Elzinga
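To check that I have followed the counting, here is a minimal sketch in Python of the weight rules Dirk describes. The representation is my own assumption, not his: a long vowel is written by doubling the letter, and a flag says whether the language counts a coda consonant as moraic "by position". The function names are invented for illustration.

```python
def count_moras(nucleus, coda="", coda_is_moraic=False):
    """Count moras in a syllable.

    nucleus: vowel string, one letter per mora ("a" short, "aa" long).
    coda: closing consonant(s), or "" for an open syllable.
    coda_is_moraic: True for languages where a coda adds weight
    (Latin and Arabic in the post), False otherwise (Shoshoni).
    """
    moras = len(nucleus)
    if coda and coda_is_moraic:
        moras += 1
    return moras


def is_heavy(nucleus, coda="", coda_is_moraic=False):
    """A syllable is heavy when it has two or more moras."""
    return count_moras(nucleus, coda, coda_is_moraic) >= 2


if __name__ == "__main__":
    # The final syllable "ten" of hatten, with a moraic nasal.
    print(count_moras("e", "n", coda_is_moraic=True))
    # The same closed syllable in a language that ignores codas.
    print(is_heavy("e", "n", coda_is_moraic=False))
```

Under the Weight-to-Stress Principle the heavy syllables are the ones that attract stress. The sketch leaves out Yupik's position-dependent rule, which would need the syllable's place in the word as well.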

Friday, July 08, 2005

Unicode Philosophy

There is a philosophical discussion on the Unicode List. This is a nice change from the "fi ligature", "UCS-2/4 & BOM" and "JIS X 0208". Not that those can't be very important too.

However, I wish to give you a taste of the current conversation which is on a different plane altogether.

This is from Gregg Reynolds, Fri Jul 08 2005 - 18:58:27 CDT
Re: Demystifying the Politburo

"Seriously (I'll try), the question of participation of native speakers is (IMHO) an important and thorny one.

On the one hand, nothing says native speakers are the best informants. And as a matter of policy I see no reason why a *standards* body (especially an industry standard body) should have a requirement for native speaker participation; after all, the (industry-defined) goal is to get a standard, not to make everybody happy. No doubt such participation is desirable, but it's quite a different thing to say it's required. Standards have to work in the marketplace in order to become standards.

On the other hand, it's pretty obvious (to me at least) that participation of native speakers in standardization of cultural artifacts like written language is a Good Thing. (List: I know, I know, Unicode does not encode written language, it encodes characters/scripts/whatever. But the perception will always and inevitably be that it is an encoding or modeling of written language.)

I can't help drawing an analogy (if that's the right word) to the ideas often discussed by Edward Said, among others. He wrote extensively about how the West (that fearsome boogeyman) controls the narrative of/about the East. It doesn't really matter if I as a Westerner get it right; the East (South, Middle East, slightly East and a little South but ... etc.) should speak for itself. (Or something like that; it's been a while). Now, one may agree or disagree with his language (I'm not so crazy about it myself), but there is no denying that his views are supported by a large population in both East and West. Defining an encoding that models (in some way) non-Western languages without significant - and visible - participation of native speakers seems analogous to "us" telling their history instead of letting "them" tell their history.

On the third hand, it's clear (but maybe only to those who follow the Unicode list) that people like Mr. Everson work very closely with native speakers, so you can't really argue that the linguistic communities were/are not represented. We are clearly not the 19th century.

On the fourth hand, it's also clear (to me at least) that Unicode works great for some linguistic communities and not so great for others. (You knew it was coming, and here it is: Unicode is very bad indeed for the RTL community in general and Arabic in particular. ;-) This gets back to the design principles (and the interests that drive them) of Unicode, which work better for some languages than others.

And then there are the pragmatic issues which you have outlined concisely in another message.

Obviously I haven't quite wrapped my mind around these issues yet so I beg the indulgence of you and other Listerines. I (rashly?) assume that pretty much everybody on this list is interested in "getting it right" for everybody, and therefore might be a little interested in such considerations. It's not a case of blaming, but of understanding. I think.

Personally, I think Unicode is (well, may be) of enormous historical significance, yet it flies almost entirely under the cultural radar, at least in the US. I daresay most places in the world that will eventually be heavily influenced by Unicode are more or less oblivious to it.


Thanks, very interesting. I see many of the scripts being worked on list one "Everson" as the contact. Who is this mysterious and ubiquitous "Everson", anyway? Is it one person? Sounds an awful lot like the fictional Cecil Adams to me: (


To view this post in context link to the Unicode Mail List Archives here or in my sidebar. You can either join by following the instructions, in which case you can participate, or you can read the archives which are password protected. The password protection is only to avoid spam and the list is public: the password is posted.

One of the current issues is about how well publicised the work of Unicode is. I want to do my bit.

Vai Unicode Proposal

A conversation has begun on Language Hat about the Proposal to add the Vai script to the BMP of the UCS. I feel I am lagging behind: this was posted a few days ago. But I have been lost in the Middle Ages for a brief interim. More about that later. Here are a few comments on the Vai Proposal. First, Tim May made a comment about one minority script being used to describe another.

What I don't understand about the Vai Proposal is this line in the second paragraph of page 2.

"(Strictly speaking, the writing system is based on the mora, as a syllable may be written with up to two characters.)"

I have commented on this before and I am simply sitting on the sidelines for a bit to see how this use of the term mora plays out. I had understood that a mora was "a suprasegmental unit of length, smaller than or coincidental with a syllable, that is studied as a part of the stress pattern of the language." (Google)


"The structure of the Vai glyphs as traditionally given in syllabary charts shows a vertical glyph relation between many characters; accordingly, a visual sort which preserves this relationship makes sense for assisting readers in finding characters in lists. In this “column-based” sort, for each rhyme, the full run of consonants from Ø to ny- is given for the [e] vowel, then the next column of consonants from Ø to ny- is given for the [i] vowel, and so on to the [E] vowel. The order of the vowel series in Vai seems to be based on the names of the corresponding transliteration vowel in English; compare [e i a o u O E] with [eɪ i: aɪ oʊ ju] + [O E]. This order is sung in alphabet chants in Liberia, however, which is why we have chosen it in the absence of a “standard” Vai ordering. Other ordering systems exist. A “row-based” order based on the Latin transliteration of Vai characters is found; others are “column-based” as though with the vowel orders sorted as in Latin [a E e i O o u] or [a e E i o O u]; a “linguists’ order” [i a u E e o O] is sometimes found, as in Dalby 1967." page 4
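To see what a "column-based" order does compared with a "row-based" one, here is a rough sketch in Python. The vowel order [e i a o u O E] is the one quoted from the proposal. The consonant list is my own shortened, hypothetical run (the real one is longer and I have only kept Ø first and ny- last), and the transliterated syllables are made up for illustration.

```python
# Vowel order as given in the proposal for the "column-based" sort.
VOWEL_ORDER = ["e", "i", "a", "o", "u", "O", "E"]

# Hypothetical, shortened consonant run. "" stands for the bare vowel (Ø).
# The real Vai run is longer; only "Ø first, ny- last" comes from the proposal.
CONSONANT_ORDER = ["", "p", "b", "t", "d", "k", "g", "ny"]


def split_syllable(syllable):
    """Split a transliterated syllable into (onset, vowel)."""
    return syllable[:-1], syllable[-1]


def column_key(syllable):
    """Sort by vowel (the column) first, then by consonant within it."""
    onset, vowel = split_syllable(syllable)
    return VOWEL_ORDER.index(vowel), CONSONANT_ORDER.index(onset)


def row_key(syllable):
    """Sort by consonant (the row) first, then by vowel within it."""
    onset, vowel = split_syllable(syllable)
    return CONSONANT_ORDER.index(onset), VOWEL_ORDER.index(vowel)


syllables = ["ka", "pe", "i", "nye", "ti", "e", "pa", "ki"]
column_sorted = sorted(syllables, key=column_key)
row_sorted = sorted(syllables, key=row_key)

if __name__ == "__main__":
    print("column-based:", column_sorted)
    print("row-based:   ", row_sorted)
```

In the column-based result all the [e] syllables come first, whatever their consonant; in the row-based result the syllables sharing a consonant stay together.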

This is an unusual ordering for any script, by rime rather than onset. Has anyone seen this before? I have not. Can you imagine flipping through a Vai dictionary?

Next, there is the question of the extended character set of Massaquoi (page 2). Surely this will make it harder to search in Vai (if anybody ever would). One would have to guess at the spelling decisions that were made about each word in the first place, as it is doubtful that everyone already uses the extended set.

Altogether this is a very complex proposal with many interesting issues. I need to get back to reading the articles on Vai that were recently sent to me.

Wednesday, July 06, 2005


The revised proposal for encoding the Lepcha script has been posted to the Unicode Mail List, Monday July 4, 2005. The fonts created by Jason Glavy have already been in use along with the Lepcha Language Kit based on the tentative codepoints in Unicode.

The proposal with the proposed codepoints and collation sequence is about 4 pages long and the figures or illustrations make up the last 15 pages. The following illustrations are included.
  • punctuation
  • the nukta
  • history of the script
  • comparison with other Indo-Tibetan scripts
  • examples of handwritten text from a variety of sources
  • more than one consonant array
  • lists of the vowels
  • a syllabary chart or syllabarium
  • table of Lepcha nominal glyphs in Unicode
  • the Unicode codepoints

Now, I confess to a problem. For many this may look like a fairly dry and boring list. But to some of us who love writing systems the list might just as well be from the map provided in the top of a box of chocolates. The problem is not how to pay attention to such details, but rather which one to study first.

Hmm, shall I try the cream ganache or the espresso, the almond praline or hazelnut, the nougatine or caramel? Each person will have their favourite flavour and may choose to dig right in and flood their mouth with the taste, or savour it at the end.

My obsession right now is the handwriting, the shapes, the direction, the spacing, etc. The syllabary first and the handwriting last with everything else in between. Unlike chocolates I can go back and study it all again.

I hope that this will not be seen as a trivialization of a serious undertaking. With all due respect - read on. Proposal for encoding the Lepcha script

Tuesday, July 05, 2005

A Consonant Array

We think of an alphabet as a linear phenomenon which stretches out in one dimension unless lack of space requires that it double back on itself and start a new line. The vowels are deposited among the consonants in an unorganized pattern to be picked out later and set apart. Each vowel has two regular sounds and an uneven number of other sounds which it can make in combination with other letters. But this is not reflected in the alphabet itself.

The letters of the alphabet are arbitrary and once a pattern is found it is shown to be inconsistent. B resembles P and G resembles C, but no one can convince me that D resembles T. Where did K come from? Or do I mean, where did C come from?

The letters remain in a fixed sequence by convention: there is no association to a numeric sequence for the Roman alphabet.

The alphabet has been variously described:

"Hopelessly inadequate alphabet devised centuries before the English language existed to record another and very different language. Even this alphabet is reduced to absurdity by a foolish orthography based on the notion that the business of spelling is to represent the origin and history of a word instead of its sound and meaning." Shaw in his preface to The Miraculous Birth of Language.

"Meaningless shapes arbitrarily linked to meaningless sounds." ?

"The sillinesses of the English alphabet are quite beyond enumeration. That alphabet consists of nothing whatever except sillinesses. I venture to repeat that whereas the English orthography needs reforming and simplifying, the English alphabet needs it two or three million times more." Mark Twain.

On the other hand, Indic writing systems are presented in a two dimensional array which is organized phonetically.

"The precedence of grammar over the script is a hallmark of Indian writing system. Long before there was any script, they had developed the concept of letters (aksara), consonants (vyanjana) and vowels (svara). Phonetic analysis of 'mantras' and its recitation from generation to generation was transferred orally in the absence of any writing system. Therefore, we see a systematic arrangement of letters in the script. The whole set of basic letters are arranged in a phonetic order. Vowels come first in the arrangement (aksaramala, the string of letters) followed by the consonants grouped together on the basis of their articulation (velars, pre-palatals, retroflexes, dentals and labials, in that order.) All the consonant letters have an inherent vowel 'a' with them (unless or otherwise specified by other explicit vowel signs attached)." Prakash and Joshi Orthography and Reading in Kannada.

In addition to this organization of consonants and vowels into tables, the syllables are also presented in a syllable matrix. While the article by Prakash and Joshi provides a table of all the syllables in the Kannada aksaramala, I cannot find a comparable table for Kannada on the internet. The table for Kannada on page 100 of their article has 16 columns and 34 rows. The authors indicate that children learning to read in Kannada are not taught the classification of sounds and letters but mechanically learn the syllables from one end of the syllable chart to the other. (1995)

So an Indic writing system has two arrays; one for the classification of sounds and the other for the aksara, or syllables. These two tables combined demonstrate the organization of an Indic writing system. The Roman alphabet has a single arbitrary one dimensional organization.
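To make the second array a little more concrete, here is a small sketch in Python that builds a corner of such a syllable matrix from Unicode character names. It covers only the velar row of consonants and five vowel signs, which is my own selection and nowhere near the 16 by 34 table in the article; the function names are also mine.

```python
import unicodedata

# The velar row of the consonant array and a handful of vowel signs.
# This is a small illustrative subset, not the full Kannada chart.
CONSONANTS = ["KA", "KHA", "GA", "GHA", "NGA"]
VOWEL_SIGNS = ["AA", "I", "II", "U", "UU"]


def letter(name):
    """Return the Kannada consonant letter with the given Unicode name part."""
    return unicodedata.lookup(f"KANNADA LETTER {name}")


def vowel_sign(name):
    """Return the dependent vowel sign with the given Unicode name part."""
    return unicodedata.lookup(f"KANNADA VOWEL SIGN {name}")


def syllable_matrix(consonants, signs):
    """One row per consonant: the bare letter (inherent 'a'), then letter + sign."""
    matrix = []
    for c in consonants:
        base = letter(c)
        matrix.append([base] + [base + vowel_sign(s) for s in signs])
    return matrix


matrix = syllable_matrix(CONSONANTS, VOWEL_SIGNS)

if __name__ == "__main__":
    for row in matrix:
        print(" ".join(row))
```

Each row keeps one consonant fixed and runs across the vowels, which is the shape of chart the children are said to learn from one end to the other.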

I am organizing this material to remind myself of how the concept of a writing system may vary from one culture to another.

Monday, July 04, 2005

Cree Keyboards

Cree Typewriter. Courtesy of the Provincial Museum of Alberta, Edmonton, Alberta, Canada (Andru McCracken photographer)

Another unique feature of the Cree syllabarium is that it fits on the QWERTY keyboard very neatly. The original inventory of symbols was 36 syllabics, 9 finals, h, w and the overdot; with only one case. A Cree typewriter had the standard number of keys: likewise all the symbols fit on a standard computer keyboard. The placement of syllabics on this Cree typewriter follows the syllabics chart (Tiro Typeworks). The vowels start on the far left, then the p-series, the t-series, and so on.

There are two basic ways to keyboard Cree. There is a typewriter style programme with one keystroke for one syllable and there is a transliteration keyboard where each consonant and vowel is typed in and the programme transforms it into syllabics.

The first method is called glyph-based keyboarding, since the character is chosen by its visual shape and keyed in; the other method is called phonetic keyboarding since the alphabetic letters for the sound of the character are keyed in sequence, two keystrokes for each syllabic.

When I was working in Northern Ontario in the early 90's I saw Cree and Oji-Cree speakers using the syllabic writing system for many purposes and in different media - handwriting, typing and typesetting in newsletters, newspapers, etc. For these people there was a continuity from handwriting to keyboarding and they would naturally use one keystroke for one syllabic as they would on a typewriter.

However, now I often read that phonetic keyboarding, using the alphabet to key in the syllabics, is very popular for those who did not have the opportunity to become literate in Cree but are, of course, literate in English.

Both kinds of keyboards for Canadian Aboriginal Syllabics are available at Chris Harvey's website.

Description of a Glyph-based Keyboard

"One-key, one-character. With one keystroke, one syllabic character appears, be it a full syllabic: like ᒧ,ᔦ, and ᒐ, or a final: like ᐤ, ᐨ, or ᐦ. Combining symbols, e.g. the mid-dot ᐧ, are typed separately."

Keyboard Map (All syllabics are represented on a key of their own.)

Description of Phonetic Keyboards

"Syllabic keyboards are quite different, as the nature of the writing is not alphabetic. Previously on some computer and typewriter fonts, each unique syllabic was given its own key, so that '∩' (ti) might be the 'e' key, and '∟' (ma) would be the 'n' key. Typists would either have to memorise this new key mapping, or they would resort to cutting and pasting little papers onto their keyboard. The keyboards take a quite different approach. The keyboard contains only vowels and finals, e.g. ∆ (i) and ′ (t). To create a syllabic character like '∩' (ti), the typist would key in ′ + ∆ ('t' + 'i'). This frees up people from learning a new set of key mappings, and allows touch typists already proficient in English, French, etc. to type quickly and error free."

Keyboard Map (Remember that this keyboard map will look half-empty as there are only 12 consonants, represented by the finals, and 4 vowels.)

Scroll down this webpage for detailed explanation of Cree keyboards.
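For the curious, the idea behind phonetic keyboarding (two keystrokes in, one syllabic out) can be sketched in a few lines of Python. This is only my own illustration, not how any of the keyboards mentioned above are actually built. It assumes a subset of eight consonants and the four basic vowels, and it ignores finals, the overdot for long vowels and the w-dot altogether.

```python
import unicodedata

VOWELS = "eioa"          # the four basic vowel orientations
CONSONANTS = "ptkcmnsy"  # an assumed subset of consonant series


def syllabic(latin):
    """Look up e.g. 'pi' through its Unicode name CANADIAN SYLLABICS PI."""
    return unicodedata.lookup("CANADIAN SYLLABICS " + latin.upper())


# Bare vowels plus every consonant + vowel pair in the subset.
TABLE = {v: syllabic(v) for v in VOWELS}
TABLE.update({c + v: syllabic(c + v) for c in CONSONANTS for v in VOWELS})


def to_syllabics(text):
    """Turn typed Latin letters into syllabics, two keystrokes at a time."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in TABLE:
            out.append(TABLE[pair])     # consonant + vowel -> one syllabic
            i += 2
        elif text[i] in TABLE:
            out.append(TABLE[text[i]])  # a vowel on its own
            i += 1
        else:
            out.append(text[i])         # anything else passes through unchanged
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(to_syllabics("mita"))
```

A glyph-based keyboard would need no such table lookup on pairs: each key would simply be bound to one syllabic.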

Sunday, July 03, 2005

Greek Contextual Shaping

Greek has a variant form for the final s. However, it turns out that β and ζ also have an initial/non-initial distinction. This can be seen (with a magnifying glass, I might add) in the names Hezekias and Zorobabel; and in Babylon.
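The final s is the simplest case of contextual shaping, and the rule is easy to state in code. Here is a minimal sketch in Python, assuming that "final" just means the last letter of a word; the function name is mine, and it says nothing about the β and ζ variants, which belong to the font rather than to the character codes.

```python
import re

SIGMA = "\u03c3"        # GREEK SMALL LETTER SIGMA
FINAL_SIGMA = "\u03c2"  # GREEK SMALL LETTER FINAL SIGMA


def shape_final_sigma(text):
    """Replace a word-final sigma (after at least one other letter) with the final form."""
    return re.sub(r"(?<=\w)" + SIGMA + r"\b", FINAL_SIGMA, text)
```

A sigma at the start or in the middle of a word is left alone, and so is a sigma standing entirely by itself.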

You can view the full text of this version of The Greek New Testament, Stephanus 1550 Received Text with 7678 textual variant notes, containing all the readings of four printed editions at the Bibles Repository hosted by . Thanks to John Hudson for mentioning this resource on the Unicode list (Sun Jul 03 2005 - 20:46:18 CDT).

I understand from John Hudson's post that this is an unusual renaissance font and does not represent a standard for modern or classical Greek. The Didot font with contextual shaping can be seen here.

Bibliography Writing Systems

Prakash, P. and Joshi, R. Malatesha. Orthography and Reading In Kannada: A Dravidian Language. In Insup Taylor and David Olson. ed. Scripts and Literacy: Reading and Learning to Read Alphabets, Syllabaries and Characters. 1995. Kluwer. Dordrecht.

Saturday, July 02, 2005

Eh? To Zed: A Canadian Abecedarium

Google's little maple leaf beckons me to add one more post for the weekend. Eh? to Zed: A Canadian Abecedarium. This totally resonates. What zed word would any respectable Canadian learn first? Zamboni, of course!

A Lyrical Abecedarium

I need to find some soothing and lyrical abecedaria for the long weekend and cool weather - I am so glad it is not raining. Thanks to Kuri of and her July post of 2 years ago here is just the thing - it needs no introduction.


Angels bring confusion.
Don’t ever forget god’s hand
Is juggling knives like man’s nature.
Occultists properly question reality.
Saints travel unbroken vigils
without x-ing yesterday’s zodiac.

Read them all at her post Abecedarium.

I also enjoyed her recent post Book Tag as it is time to find some summer reading that is not about writing systems. Yes, I do have other interests. Thanks, Kuri. I hope you keep blogging as you travel.

PS. If anyone feels compelled to write one of these lyrical abecedaria I would love to post it.

The Moraic Principle

The moraic principle is also presented in William Poser's writing systems typology. The following is the earliest reference to the moraic principle that I have found.

What do “phonemic” writing systems represent?: Arabic Huruuf, Japanese Kana, and the Moraic Principle
Author: Ratcliffe R.R.
Source: Written Language & Literacy, March 2001, vol. 4, no. 1, pp. 1-14(14)
Publisher: John Benjamins Publishing Company

"The traditional classification of phonemic writing systems into three types — syllabaries, consonantal scripts, and alphabets — is based on a phonological theory which recognizes only the syllable and the segment as potential units of representation. It is argued here that an accurate typology of phonemic writing systems requires recognition of two further dimensions of phonological structure: phonological time, and the sonority hierarchy. The analysis focuses on two “typical” non-alphabetic systems — Japanese kana and the Arabic script, the former traditionally classed as a syllabary, the latter as a consonantal script. It is argued that the two scripts in fact share a common organizational principle, namely the iconic representation of phonological time."

Sproat and Faber

Richard Sproat's article on Indic scripts relates directly to reading theory. However, I was not able to find a direct quote which I felt explained his position on reading and script type. He sent me this clarification in an email earlier this spring. Thanks, Richard.

"My classification of scripts is based on purely formal properties at an abstract level. From that point of view, Indic scripts, as well as Amharic or Hangul are all segmental, just as much as or almost as much as English or Spanish. In every case you combine symbols for consonants and vowels together. This sets them apart from things like Japanese kana or Sumerian or Linear B or Vai.

The only difference is that in scripts like Indian scripts the combination is more complex than a simple left-to-right (or right-to-left) concatenation of symbols. And of course in Indian scripts there's the issue of the inherent vowel, which makes them slightly less than fully segmental. But at least the directionality of the combination is, in my view, a purely surface phenomenon. Hence my classification.

But having said that, there is no question that things like directionality, reduction or modification of symbols (e.g. the diacritic vowels vs. their full form), fusion of symbols (as in Tamil CV), transparency of the phonology (e.g. Tamil expression of voiced vs voiceless stops), are all going to have an effect on phonemic awareness. Also, as I suggested above, the way the scripts are taught. We know that some of these things do have an effect: amount of reduction certainly is relevant as shown by work of Padakannaya on Kannada and Devanagari (for Hindi).

But I don't think this relates directly to the classification of scripts. Faber tried to do something like this in her 1992 article, but she ended up with the, in my view, mistaken conclusion that this was an all-or-nothing thing: either you had a purely linear segmental script and total phonemic awareness; or you had one of these other phonographic systems and no phonemic awareness. This is just wrong, as finer grained studies have shown. One of the things Padakannaya and I are hoping to do is continue these finer grained studies and tease apart what factors are relevant for determining phonemic awareness."

This is the title of Alice Faber's 1992 article, which I have yet to find online. However, I feel that she has significantly influenced reading theory and writing system classification. I look forward to reading more of her work when I can.

A. Faber, "Phonemic Segmentation as Epiphenomenon: Evidence from the History of Alphabetic Writing," The Linguistics of Literacy (Typological Studies in Language, 21), S. D. Lima, M. Noonan & P. Downing, eds. Amsterdam: John Benjamins, 111-134, 1992. (Also: Haskins Laboratories Status Report on Speech Research SR-101/102: 28-40, 1990.)

Rather than focusing on any actual disagreement between the ideas of Sproat and Faber, I tend to value the contributions both are making to the field of reading theory.

Onset and Rime

Bridget's presentation on Akkadian mentioned onset and rhyme (rime) as elements of phonology which may be represented by a writing system. Here are the relevant lines.

"Onset / rhyme: S = 2G {p, t, k, sp, st, sk…} {a, a:, i, aw, um…}
Fan Qie, Bopomofo, Hmong (Pollard, Pahaw)"

This is particularly timely since I was thinking about how best to explain the very significant difference between the Evans and Pollard syllabic systems.

The Evans syllabary (so called) represents consonants by the basic shape of the character and vowel by rotation. Four vowels are represented and the difference between long and short vowels is further indicated by an optional overdot. Cree has seven vowels. There are also finals, that is, characters to represent final consonants.

The Pollard syllabary represents consonants by initials and rimes by finals. There are 20 characters in the rime or vowel category. In this writing system the finals represent the rime. They are of a lesser status, being smaller in size than the initials or consonants: the characters are arranged into syllable level units.

Both the Evans and Pollard systems are organized into units which represent syllables but in a manner that demonstrates analysis of the segment.

Now what is Fan Qie? (Bopomofo will wait for another day but it is probably fairly well-known as the phonetic writing of Chinese, also called Zhuyin.)

Chinese made use of rhyme tables or rhymebooks which are the organization of Chinese characters by onset and rime into tables or charts. Dylan Sung has devoted part of his website to a discussion of rhyme tables. A particularly famous book of rimes was called Qieyun, dated 601 AD in the Sui dynasty.

Fanqie is an earlier development, circa 200 AD, in which the pronunciation of a Chinese character was explained by two other characters, one with the same onset and a second character with the same rime.

In this way the pronunciation of 'table' could be explained in English as 'top' + 'cable', or 'stay' as 'stop' + 'day.'

Fanqie is dated as early as the Han period in Notes on Middle Chinese.

"Chinese is a monosyllable-structured language. Its characters or words are composed of single syllable sounds. Fanqie, as we understand it today, is a method of indicating pronunciation by dividing the single-syllable word into two parts: the initial (shengmu 聲母) and the final (yunmu 韻母). Using this method, the pronunciation of an unknown word can be represented using two known sounds, one for the initial and one for the final of a given syllable.

For example, the fanqie for 東 is 德紅. The initial of 德 (d/e) is d, and the final of 紅 (h/ong) is ong. d + ong → dong. This shows that the initials of 東 (d/ong) and 德 (d/e) are both d; which is referred to as shuangsheng 雙聲. The finals of 東 (d/ong) and 紅 (h/ong) are also the same: ong; which is called dieyun 叠韻. Shuangsheng and dieyun are set rules in fanqie. Also, the initial determines if the word is voiceless (qing 清) or voiced (zhuo 濁); and the final determines the tone of the word. This method of indicating pronunciation is roughly what we understand today as fanqie. We are not sure if it was used in the same way during ancient times, but it is generally assumed that there were differences."
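The mechanics of fanqie are simple enough to try out in a few lines of Python. This sketch works on romanized spellings only, treats everything before the first vowel letter as the onset, and leaves tone and voicing out entirely; the function names are mine.

```python
VOWELS = set("aeiou")


def split_onset_rime(syllable):
    """Split a romanized syllable at its first vowel letter into (onset, rime)."""
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            return syllable[:i], syllable[i:]
    return syllable, ""  # no vowel letter found: treat it all as onset


def fanqie(first, second):
    """Combine the onset of the first syllable with the rime of the second."""
    onset, _ = split_onset_rime(first)
    _, rime = split_onset_rime(second)
    return onset + rime


if __name__ == "__main__":
    print(fanqie("de", "hong"))    # the 德 + 紅 example from the quotation
    print(fanqie("top", "cable"))  # the English analogy
```

With the examples already given, "de" + "hong" yields "dong", and the English pairs 'top' + 'cable' and 'stop' + 'day' yield "table" and "stay".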

Notes on Middle Chinese is an intriguing introduction to the topic of early fanqie and the awareness of segments smaller than the syllable in Chinese. Interestingly the author indicates that this awareness was subsequent to the introduction of Sanskrit into China.

Invention of Fanqie

"Another important factor that fanqie should arrive in China at this particular time was the introduction of Buddhism into China shortly before the Christian era, near the end of the Western Han Dynasty 西漢. Through cultural interaction and translation of Buddhist texts, the Chinese became familiar with the splitting of Sanskrit or Pali syllables into initial and final parts and applied the system to glossing Chinese syllables."