Saturday, December 24, 2005

Roman Shorthand: Tironian Notes

I have accepted that I must simply work at improving my reading knowledge of German. This shouldn't be impossible since I once studied German and spent one summer with a family near Tübingen. However, no polished German translations are about to turn up here under my authorship.

This is the first 20 words of Psalm 12:6-7 * in Tironian notes. The best resource that I have found so far on Tironian notes is Boge's Griechische Tachygraphie and this site with images of a manuscript by Karl Eberhard Henke. This will keep me busy for a while.

Tironian Notes are attributed to Tiro, who worked for Cicero. The National Court Reporters Association has a great article on The History of Shorthand by Anita Kreitzman. Here is the section on Roman shorthand.

"Shorthand in ancient Rome seems to have appeared as early as 200 B.C. with the poet Quintus Ennius, who devised a system of 1,100 signs. But it was not until Plutarch in 63 B.C. that definite and indisputable evidence of the use of shorthand is recorded. He writes of the debate on the Catilinian conspiracy that was recorded in shorthand in the Roman Senate as the famous orator Cicero expounded his views.

It is interesting that Cicero was indirectly responsible for the method of shorthand devised by Tiro. Tiro was a slave of Rome and had been granted his freedom by Marcus Tullius Cicero. Upon becoming a freedman he adopted the first two names of his master and thereafter was known as Marcus Tullius Tiro. Highly educated, "he then became Cicero's secretary and confidant," and as such had the opportunity and fortunately the intelligence and skill to invent a system of shorthand that was to be used in the Roman Senate and as a basis for future shorthand systems. Initially, his system involved abbreviations of the more popular words with the remainder of the text filled in from memory using context clues. Not a very accurate method, but Tiro continued to improve on his system by devising further abbreviations for common sentences and phrases used by the orators of the day. He is also credited with inventing the ampersand, which is still in use today.

In the Curia, as many as 40 shorthand writers were stationed in the different areas. They recorded what they could and their transcripts were then compared and compiled in order to record the complete orations of such greats as Cicero and Julius Caesar. Today, in our own Congress, a similar system is used except that the reporters work in relays.

Famous writers such as Horace, Livy, Ovid, Martial, Pliny, Tacitus and Suetonius make mention of shorthand in ancient Rome. Julius Caesar, himself, was proficient in shorthand. And to be proficient in shorthand was not an easy task.

The ancient Roman scribe did not have paper, pen, pencil or ink. How, then, did they record the events? The medium was a tablet with raised edges covered with a wax layer. As many as 20 such tablets could be fastened together to form a book. A stylus, similar to a pencil, was used for the actual writing. The point was ivory or steel, the other end flat in order to easily smooth the wax when the notes were no longer needed and a new tablet required. Ironically, it was with such instruments that Caesar was stabbed to death. Had Caesar the foresight to see his fate, perhaps he would not have pursued his interest in shorthand.

Others who demonstrated an avid interest in shorthand writing included Titus Vespasian Caesar, who was so skilled at shorthand that he participated in "contests for wagers and personally taught the art to his stepson," and Augustus Octavianus, an expert shorthand writer who "appointed three classes of stenographers for the imperial government." He considered the skill so important that he taught it to his grandchildren. And even Seneca, the great orator and philosopher, who became so fascinated with shorthand that he improved Tiro's system by adding several thousand abbreviations of his own."

Somehow I could not abbreviate this article and extract the interesting parts - it is all too fascinating. I am off to study Karl Eberhard Henke.

Notes:

Image is from "Du Charactère Sténographique de Toute Écriture." Yves Duhoux. Studia Minora Facultatis Philosophicae Universitatis Brunensis N 6-7, 2001-2002. Unfortunately Duhoux does not give the location for the Latin manuscript but it was also mentioned in M. Proux. 1910. Manuel de paléographie latine et française. Album. Paris.

Karl Eberhard Henke: Tironische Noten. MGH-Bibliothek Hs. B 16. Digitale Ed. [Manuskript ca. 1954] / Konzeption u. Bildbearbeitung: Arno Mentzel-Reuters

Boge, Herbert. Griechische Tachygraphie und Tironische Noten. 1973. Akademie Verlag. Berlin.

Friday, December 23, 2005

William Moon Blind Alphabet

This is the Moon writing system from the early 1840s in England. It is still in use by a limited number of older people in England.

Below is the Cree syllabary. The characters for the p, t, ch, m series are in the same order as the Moon alphabet, when it is grouped by shape. The Cree p,t,k,ch finals also appear as a group in the Moon alphabet.

The fact that these two systems are so similar cannot be a coincidence. These systems appeared within two years of each other, 1841 in Canada and 1843 in England. I suggest that they had a common ancestor in the shorthand descended from John Willis's shorthand.

The Moon Code, as it is known, was invented in England between 1843 and 1847 by William Moon who was himself blind. The Moon code was a full alphabetic orthography in which each symbol stood for a letter of the Roman alphabet. However, it is taught by organizing the symbols into an arrangement of similar shapes. It is still used today.

A direct predecessor of the Moon alphabet was the Lucas system. "The script invented in 1832 by Thomas Lucas at Bristol, England, consisting of embossed characters in the sort of symbols used by stenographers, was used in both China and India." The Lucas system can be seen here.

The evolution of Cree from the Frere and Lucas systems has already been written about at Tiro Typeworks.

I am only adding the pieces about the Moon alphabet which shows the order of the symbols, and the John Willis shorthand. More here.


Notes: A Simplified Alphabet. The Ramseyer-Northern Bible Society Museum Collection at the University of Minnesota Duluth.

Thursday, December 22, 2005

The Silver Gospel

Last night the men were discussing history as usual and the Goths came up in conversation. I mentioned casually that there was a Bible translation into Gothic in the fourth century. They were ruminating on military campaigns. However, one guest paused in thought and said, "Gothic, fourth century - I didn't think it was written that early."

So here it is. This is the Lord's Prayer in the Silver Gospel and there are almost endless internet resources on it. It is officially called the Codex Argenteus and is a copy of the Gothic Bible which was translated by the Gothic bishop Wulfila, who designed the Gothic alphabet.

More about the Codex Argenteus and its significance in studying early Gothic here.

"The manuscript, the Codex argenteus, is probably written in Ravenna during the Ostrogothic empire, and probably for the Ostrogothic king, Theodoric the Great, in the beginning of the sixth century. It is written on very thin purple-coloured vellum of high quality with gold and silver ink. The silver text is dominating, and therefor the manuscript is called the »silver book«, or » codex argenteus «. It was made to be an admirable book, which may be difficult to see today, when hastily looking at its roughly handled remnants in Carolina Rediviva in Uppsala. Probably it originally had a splendid binding with pearls and precious stones. The text of the Silver Bible is one of the oldest and most comprehensive documents in the Gothic language known today. Beside the Silver Bible, there are very few text lines in Gothic handed down to posterity."

Now for the good stuff. There is a Project Wulfila with resources on the Gothic language and each word of the Lord's Prayer above can be read in Gothic and compared to the English and the Greek. The Lord's Prayer is in Matt. 6:9 starting in the middle of the verse. Notice that the word order of the Gothic follows the word order of the Greek, since it is a very literal translation. Each word of the Gothic is clickable so you can crosscheck. I believe this is the earliest record of a language ancestral to English. (direct ancestor - not PIE)

Notes: The image above is from Alfabetos de Ayer y de Hoy.

Thanks for this comment from Curtis.

Gothic is classified as East Germanic, while English is West Germanic; Gothic is at best a cousin to English, not a direct ancestor. See e.g. http://softrat.home.mindspring.com/germanic.html, http://www.ethnologue.com/show_family.asp?subid=90067.

Wednesday, December 21, 2005

John Willis Shorthand

I confess. I can't remember where I read that X was used for Christ in the 16th century. I'll find it soon. However, as I went through my notes on shorthand I realized that I now have this image. It is the shorthand system developed by John Willis in The Art of Stenographie, 1602. Here X is 'ch'. The question is whether X alone would represent Christ.

The earliest shorthand for English was that of Timothie Bright, 1588. Apart from the basic symbols which are presented in Johanna Drucker's The Alphabetic Labyrinth, I have not seen Bright's system. However, it is possible that X was used for Christ in one or both of these systems.

There is a Bible in the John Willis shorthand system here at University College in London. There are a few other shorthand items there also. One day maybe I will be able to have a look for myself. Any Londoners out there anxious to look at a shorthand Bible from the early 17th century?

Notes: This image of John Willis shorthand is found in World's Writing Systems by Peter T Daniels and William Bright

Monday, December 19, 2005

Merry Xmas




This is a verse from the Bible in James Bay Cree, published in 2001 by the Canadian Bible Society. It says "Then Simon Peter answered, you are the Christ, the Son of the Living God. (of God who lives, the son.)" Matthew 16:16

The ninth word from the beginning is X for Christ. The Chi sign X is used for the name of Christ in this New Testament published in 2001. In Unicode it is U+166D : CANADIAN SYLLABICS CHI SIGN.
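For anyone who wants to check that the Cree chi sign really is its own character, and not simply the Greek chi or the Latin X in another font, here is a minimal sketch in Python using the standard unicodedata module. The dictionary labels and the describe helper are my own names, chosen only for illustration.

```python
import unicodedata

# Three look-alike characters that Unicode encodes separately.
lookalikes = {
    "Cree chi sign": "\u166D",
    "Greek capital chi": "\u03A7",
    "Latin capital X": "\u0058",
}

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} : {unicodedata.name(ch)}"

for label, ch in lookalikes.items():
    print(f"{label:18} {describe(ch)}")
```

The three may look nearly identical on the page, but a search for one will not find the others, which matters when looking for X as Christ in electronic texts.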

There is another way to write "Christ" in Cree. Here is a verse of Silent Night in Western Cree. At the beginning of the fourth line Christ’s name is written phonetically. However, ‘r’ is not a Cree sound and the syllabic used for ‘r’ shows that this is a non-Cree word. The double consonants are also foreign to Cree, so the name of Christ is identifiable as a foreign word in Cree when spelled out phonetically.

The use of the Greek letter chi for Christ has a long history. The first shorthand for Christ seems to have been ΧΡΣ P46. This site explains that the Nomina Sacra were used not as abbreviations but to set apart holy words in text.

Two kinds of shorthand were used from the third century up until the 16th century in Greek manuscripts. First, the nomina sacra, where a closed set of frequently occurring significant names were abbreviated to create a logographic entity. Second, there were ligatures which shortened or combined two or three letters, especially grammatical endings, later even including the accent in the ligature.

Χριστος has been represented by Χρς, or Χς, and by ΧΡ in art and other representation. I have not found the ΧΡ in manuscripts and would not expect it since the manuscript form always includes the grammatical ending.

A quick glance at some facsimiles of Greek manuscripts* shows that the words ιησους, χριστος, θεος, ανθρωπος, πατερ, ματερ, πνευμα and some other words were represented by their initial and final one or two letters which represent the grammatical ending. This could be ς,υ,ν,οι, ι &c.

For this reason, I am assuming that the transition from Χς to Χ happened with the beginning of the use of the vernacular languages in Europe, when the ending was no longer relevant. There would be no reason to retain the last letter and X alone came to represent Christ. There is also no reason to see a sign of disrespect in the transition from Χς to Χ. And so Xmas first appeared in English texts in the 16th century.

Χ retained the meaning of Christ for those who knew Greek but possibly also in some form of British shorthand at least up until the last century. It occurs in the Cree writing system devised by James Evans in 1841 and now called Canadian Aboriginal Syllabics, pictured at the beginning of this post. It is recognized that Evans drew on his knowledge of early British shorthand for the Cree syllabary. However, he must also have studied Greek so either way he would be familiar with the chi X symbol.

* Barbour, Ruth. Greek Literary Hands. 1981. Clarendon Press. Oxford.

I have previously posted on the use of the Greek chi symbol here and Greek Literary Hands here.

Update:

Further information on the Chi sign X and its first use in English is at the following links.

http://www.ewtn.com/library/ANSWERS/ISGODAGI.HTM
http://christmas.123holiday.net/
http://www.christmascarnivals.com/trivia/
http://www.freerepublic.com/focus/f-news/1538036/posts

This is a general hodgepodge of information but one site claims that Wycliffe used the sign X for Christ. It should be possible to check that out.

Tuesday, December 13, 2005

Delphi Tablet: Der Erfinder

It is rather slow going on the Delphi tablet. The book is in German and while I have had a generous offer of help on the German, I simply have no idea which part of the book I want to know about most. So I am slogging away, dipping into a little here and there.

The epigram describing the Delphi writing system, or set of idiosyncratic symbols, whatever one wants to call them, is dated "In the time of the Delphic Archon Charixenos." (277/276 BCE) The inventor's name begins with M. According to Boge, who quotes Bousquet, this must be the philosopher Menedemos of Eretria, an accomplished politician, teacher, architect, artist and sculptor, who was priest in Delphi at that time.

There seems to be a year or two of variance on these dates so there is some doubt, but that is the best I can do for now.

It is interesting to note that Greek has letters for double consonants like 'ps', 'dz', 'ks', and English still has 'ks'. There isn't much more about the double consonants of Delphi but there is a lot more to learn about classical tachygraphy.

To view previous posts on the Delphi tablet, use the 'search this blog' button and enter "Delphi Tablet".

Thanks for the additional comments, Gary.

Sunday, December 11, 2005

Judeo-Portuguese

Thanks to Don Osborn for mentioning this in qalam.

Old Portuguese in Hebrew Script: convention, contact, and convivência

"This dissertation explores the process undertaken by medieval writers to produce Portuguese-language texts using the letters of the Hebrew alphabet. Through detailed philological analyses of five Judeo-Portuguese texts, I examine the strategies by which Hebrew script is adapted to represent medieval Portuguese in the context of other Roman-letter and Hebrew-language writing. I focus on the writing system in order to challenge the conception of such texts as marked or marginal, a view that misleadingly equates language and script.

I argue that the adaptation of Hebrew script for medieval Portuguese is neither derivative of Roman-letter writing nor entirely dependent upon the conventions of written Hebrew. Nor is it an adaptation performed anew by each writer and influenced primarily by spoken language. The perspective I adopt thereby rejects the premise that the patterns manifested in this unconventional orthography are ad hoc creations by its writers, that it requires extra effort from its readers, or that it is less 'native' than the dominant, more conventionalized, Roman-based adaptation that normally bears the title 'written Portuguese.' "

More about Texts in Hebrew Script here.

"Medieval Judeo-Portuguese texts can be found in libraries all around the world. The oldest known document is a treatise on the art of manuscript illumination dating from 1262, written in Portuguese with Hebrew characters – O livro de como se fazem as cores. It is a document of prime importance for the history of Hebrew manuscript illumination, as the instructions contained in the text were used for the illumination of an elaborate Bible manuscript in Corunna, Galicia, in 1476 (Blondheim 1929-1930).

The oldest known liturgical text is a Spanish Mahzor in Hebrew script, published in Portugal around 1485, which includes ritual instructions in Portuguese Aljamiado (Metzger 1977). "

I have found images of Ladino or Sephardic manuscripts on the internet but none so far that are Judeo-Portuguese. Maybe some other time.

Wednesday, December 07, 2005

Delphi Tablet III

I posted in October on the Delphi tablet here and here. This image is piece #6323. However, I only have time for the first four lines at the moment.

Ἧ πολὺ κ[αλ]ίστωι σε θεαί, Μ[..., γέρησαν
Δώρωι Π[ιερ]ίδες παρθένοι ε[ὺπλόκαμοι]
Αίπερ σοι [τό]δε μούνωι ὲπιχθ[ονίων ἀνθρώπον,
Ὤπασα[ν] ἐξευρεῖν πείρατα πά[ντα τέχνης.] *

With this translation into German,

Wirklich mit einem sehr schönen Geschenk haben dich, M...
die schönhaarigen pierischen Jungfrauen geehrt
die dich als einzigen Menschen auf Erden damit
begabt haben, jegliche Grenzen der Kunst zu erfinden

Truly with a very great gift
have the beautiful haired Pieridean maidens
honoured you, who is the only human on earth
to whom it has been given to invent the very finisher of all arts

In plain English,

The Muses have truly honoured
you with a great gift,
for you to be the sole inventor
on earth of the ultimate art.

I have disagreed with the German translator somewhat on this phrase, πείρατα πάντα τέχνης. Or maybe I am unfamiliar with the German term 'Grenze der Kunst.' Πείρατα can refer to boundaries or borders, but it is also used for the goldsmith's tools, the finishers of art.

Here is Liddell and Scott.

I - an end as in the ends of the earth
II - the end or issue of a thing: the furthest point, the utmost verge: the chief or most important object.
III - that which finishes, a goldsmith's tools are called πείρατα τέχνης, the finishers of art.

I am inclined to think now of this system on the Delphi tablet as a poetic device, or a way to represent phonology, or even an early example of a constructed script. After all, the name of the inventor was mentioned, although it has not been preserved in this fragment.

Any further comments are welcome, whether to improve my translation or otherwise.

* The bracketed letters have been cited by Boge from J. Bousquet, 1956.

Bibliography

Boge, Herbert. Griechische Tachygraphie und Tironische Noten. 1973. Akademie Verlag. Berlin.

Tuesday, December 06, 2005

Syriac Again

Okay, I goofed. I posted today a draft from November 30, 2005 and there it is.

This gives me a chance to post Tim May's comment with his Syriac text plus vowels, where it can be more easily read.

By comparing the Syriac and Latin versions, and referring to the Omniglot page, I've managed to render the first sentence of Malcuno Zcuro in Unicode. There are probably some errors - I don't really know anything about Syriac spelling, and there are a lot of diacritics that basically look like a dot. Also the editor I was using didn't render the text quite perfectly in some cases, leaving me uncertain as to the correct order. But it should be mostly correct.

ܥܶܡܪܺܝ ܫܶܬ݂ ܐܷܫܢܶܐ ܚܙܶܐ ܗ݇ܘܰܝܠܺܝ ܢܰܩܠܰܐ ܒܶܟܬ݂ܳܘܳܐ ܕܥܰܠ ܗ݇ܘ݂ ܥܳܒܳܐ ܒܬ݂ܽܘܠܳܐ ܕܟܶܬܘܰܐ ܐܷܫܡܶܗ »ܫܰܪ̈ܒܶܐ ܕܰܐܬ݂ܶܢ ܒܪܺܝܫܶܗ-ܕܚܰܕ݇« ܨܽܘܪܬܳܐ ܗܕ݂ܺܝܪܳܬܐ.

Cəmri šeṯ əšne ḥzewayli naqla bkṯowo dcal u cobo-bṯulo dkətwa əšme »Šarbe daṯən briše-dḥa« ṣurto hḏirto.

If you want to see it in a form more closely resembling the original, the Beth Mardutho fonts include several Serto variants. You can see samples on the Syriac page of David McCreedy's Gallery of Unicode Fonts. (Incidentally, Estrangelo Edessa, in Windows, is actually one of the fonts from this package.)

Sunday, December 04, 2005

Hebrew Online Keyboard with Vowels

I was back visiting the Lesser Of Two Weevils to pick up a few tips on how she can input and display Hebrew vowels so easily. I found this online keyboard with vowels. How cool is that! I don't want to lose it since I have been looking for one for ever so here it is.

And she has a post which may elucidate the Golem legend. Here is an excerpt from her post on The Power of the Word. (This also gives me a chance to input and display Hebrew.) The topic of her post is the word דִבְּר diber, 'to speak', from dabar 'word'. The point is that this is a keyboard that takes no time to learn, just click around. Okay, so it is a picker. It works.

Here is Talmida on 'the word'.

In finishing my translation in 2 Kings 14 I ran into an expression that I found very satisfying.

Verse 27 begins,

וְלֹא-דִבֶּר יְהוָה לִמְחוֹת אֶת-שֵׁם יִשְׂרָאֵל
/v'lo-diber adonai limhhot et-shem yisrael.
Word for word, that works out to this: and not - he spoke the Lord to erase name of Israel. This spoke (no pun intended) to me in a very powerful way.

One of the things that I feel drawing me (calling me?) to study Hebrew is that the Hebrew words themselves are important -- not just their meanings, but the words. I'm not quite sure how--and I don't think I'm ready to study Kabbalah just yet--but I sense that there is something just beyond my grasp and that the way to reach it is to master Biblical Hebrew, and when that's done, I will see my way clear to the next step God wants me to take. I've blogged about this a bit before.

This passage resonated so powerfully with me I want to shout it out! The verb diber means, to speak. The noun form, davar, means word, thing, affair. If you look in a modern Hebrew New Testament, the Gospel of John tells us that "in the beginning was davar". There's a reason for that.

The most common English translations have a similar spin on this verse of 2 Kings:

And the LORD said not that he would blot out the name of Israel (KJV)
But the LORD had not said that he would blot out the name of Israel (NRSV)
And the Lord did not say that he would blot out the name of Israel (D-R)

But check out the Judaica Press version:
And the Lord did not speak to eradicate the name of Israel (JPCT)


Thanks, Talmida.

I have not forgotten Syriac, or the Delphi tablet, but unfortunately those posts are a little more work since they require images. Sorry.

Scalable Vector Graphics

When I asked about missing characters the other day Simos sent this comment about Webfonts, SVG and the new Firefox 1.5.

You can follow the tutorial at w3.org, http://www.w3.org/International/O-MissCharGlyph
For missing fonts in the system, you may specify a Webfont (downloaded dynamically) or even use SVG fonts.

The new version of Mozilla Firefox 1.5 was released a few days ago and it probably is the first browser with SVG support. Have a look at the sample page with SVG fonts, at http://www.carto.net/papers/svg/samples/text.shtml

I will take time to absorb some of this but I am trying to familiarize myself with some of these ideas. There have been many things that I thought I would never try but I have ended up familiar with; so I'm thinking about this.

I ended up reading on this page.

SVG: Scalable Vector Graphics, a new, completely open standard recommended and developed by the World Wide Web Consortium (W3C), the development of which is seconded by many notable software groups and scientific communities. SVG offers all the advantages of Flash, the de-facto standard of the day (refer to above), plus the following features: embedded fonts, extensible markup language (XML), stylesheets (CSS), interactivity and animation. With the help of the DOM, full HTML compatibility is obtained. For a more detailed description, please go to the main section of this article.

Embedded fonts and extensible markup language. Yes, I think this relates. The best thing about this page is that it really spells things out. Each acronym actually comes with the full name written after it. How cool is that. Now I finally know what pdf means!

It also spells out the difference between 'de jure' standards and 'de facto' standards. I think I figured that out but now I have a nice Latin way to express it.

And on to this page.

Dangers of Right to Left

I've had a busy time at work lately so I have been doing some recreational surfing on the net and not so much hard work taking screenshots of this and that. I also got to feeling a little lonely for some female company. :-) It had to happen!

I found a great blog with the most lovely Hebrew vowels. I haven't spent time trying to display these yet but I've seen it done a few places.

At The Lesser of Two Weevils I read this post and thought that maybe it was a good thing that I hop around from one writing system to another after all.

Talmida writes:

Turns out there IS a downside to studying God's language. You could flunk an eye exam.

Without any conscious thought whatsoever, I read the eyecharts from right to left today. The optometrist was quite concerned (what on earth is she seeing?) until we figured out what I was doing.

It's odd -- if there are legible words, my brain apparently says "read", and I start at the left. But since it was just random letters, I'm guessing that my brain concluded "decipher!" and started at the right as I do in Hebrew. Even when I was made aware of what I was doing, I had to force myself to read from the left. My eye wanted to begin at the right to turn the letters into words.

What an interesting phenomenon.

I've also added Fontblog and Blogamundo to my sidebar. I've been reading these blogs on and off for a month or two and just haven't edited the sidebar. There is some great stuff there.

Thursday, December 01, 2005

Website Etiquette

Here is something I have been wondering for some time. Should one try to make a post display well in more than one browser? I have been using Internet Explorer most of the time. I installed Firefox a couple of months ago and have used it whenever I visited a site with too many empty boxes. Fairly frequently actually.

I have had the philosophy so far that I should try to make my own posts display well in IE. This means that I always checked which font displayed all the characters that I wanted to use and then defined the font. This only applies for polytonic Greek and Extended Latin as far as I know. All the complex scripts like Tamil and Syriac seem to display without a problem.

However, when I went to post the transcriptions for Syriac I could not find a font, already bundled in Windows, that had both U+02BF : MODIFIER LETTER LEFT HALF RING and U+1E6D : LATIN SMALL LETTER T WITH DOT BELOW. U+0323 : COMBINING DOT BELOW does occur in Lucida Sans Unicode but it is significantly out of position.
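The relationship between the precomposed t with dot below and the combining dot can be checked with a short Python sketch, again using only the standard unicodedata module. The variable names are mine. This only shows which code points are stored in the text; whether a given font draws the dot in the right place is a separate matter that code like this cannot test.

```python
import unicodedata

half_ring = "\u02BF"          # the left half ring used for 'ayn
t_dot_precomposed = "\u1E6D"  # one code point: t with dot below
t_dot_combining = "t\u0323"   # two code points: plain t + combining dot below

print(unicodedata.name(half_ring))
print(unicodedata.name(t_dot_precomposed))

# NFC normalization composes the two-code-point sequence into the single
# precomposed character; NFD splits it apart again.
composed = unicodedata.normalize("NFC", t_dot_combining)
decomposed = unicodedata.normalize("NFD", t_dot_precomposed)
print(composed == t_dot_precomposed, decomposed == t_dot_combining)
```

So a transcription typed with the combining dot can be converted to the precomposed letter, and back, without changing what the text means.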

Therefore, I am unable to properly display the transcription for Syriac in my post unless I recommend that the post be viewed in Firefox, or that the viewer download a special font. Of course, this is what others have been doing all along. I somehow thought that it wouldn't be necessary for this blog.

The question now is whether one should post these characters at all knowing that others might be in a position to view only empty boxes. I will choose not to for now since this is not a specialist blog on Syriac.

I haven't really tried to display a transcription for Tamil either. When it comes to working with transcriptions the computer does not compare to good old pencil and paper.

Wednesday, November 30, 2005

Syriac Vowels

Last time I posted on Syriac I was asking myself about Syriac vowels. The vowels in the Eastern and Western versions of Syriac are quite different. I had actually assumed that they would be reflected in different fonts. I was surprised when I found out that they are encoded separately. I have no idea if this is a good thing or a bad thing.

It seems to me that it would create two separate encodings for the same word and more difficulties for searching. Someone please tell me this is not so. I also suppose that there was some good reason that this was done. I'll be keeping an eye open for some discussion of this if it ever comes up.
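To make the worry concrete, here is a small Python sketch of what I mean. It takes the consonants of malkā, 'king', and adds vowel marks twice, once with the "above" marks and once with the "dotted" marks from the Syriac block. The placement of the marks is only my illustration and not a checked spelling, and the skeleton helper is a name I made up. One possible way around the search problem, sketched here as an assumption rather than as what any search engine actually does, is to strip the combining marks before comparing.

```python
import unicodedata

# Consonants only: mim, lamadh, kaph, alaph.
MALKA = "\u0721\u0720\u071F\u0710"

# Illustrative vowel placement only (not a verified spelling).
# U+0730 and U+0733 are "above" marks; U+0732 and U+0735 are "dotted" marks.
western = "\u0721\u0730\u0720\u071F\u0733\u0710"
eastern = "\u0721\u0732\u0720\u071F\u0735\u0710"

def skeleton(word):
    """Drop nonspacing combining marks, leaving the base letters."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")

print(western == eastern)                      # the two encodings differ
print(skeleton(western) == skeleton(eastern))  # the consonants still match
```

The cost of this approach is that words which differ only in their vowels would also be treated as the same word.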

The same effect occurs in Cree. Here are the Eastern Finals in the top row and the Western Finals below. They are also encoded separately.





Thanks to Omniglot for these images. I also see that Omniglot has a Cree text on this page, which represents Cree as I have seen it written. There are no points other than the mid-dot and Western Finals. It is a fast fluent way to write, close to shorthand, as each spoken syllable is represented by a simple stroke on paper and the final vowels are a brief tick. That was how it was originally used.

Well, I digress. This is enough for tonight since neither of these scripts is searchable on the internet yet. I wait to see what happens. Google is an established English way of life now, but for some scripts it is still a very long way off.

Tuesday, November 29, 2005

Syriac Keyboards

I thought that I would try out the Syriac keyboards tonight. There are two. The first one is not romanized and does not relate to any other keyboard I know.

I tried out the second one. It is called a 'phonetic' keyboard which seems to mean, in this case, a romanized keyboard. It matches the QWERTY keyboard as much as possible.

However, take a good look at these keyboards - these images are close to life size. Now I have to say that I have tried onscreen keyboards from lots of different developers and they are all the same in this respect - they are completely unreadable by anyone over 40 and by many children.

Frankly, it is somewhat reassuring for me to know that I have this onscreen keyboard in the accessibility options, as long as I don't actually intend to use it.

Next step. I opened Wordpad and set it for Estrangelo Edessa font size 26. Then I keyed in the letters across the QWERTY keyboard with this result. Beautiful. It was a keeper.

However, I had one more step to complete. I switched to BabelPad and keyed in the same sequence, then I clicked on u ̈ and produced the second image.

In this display the letters are in their 'logical' left to right order. Using right to left is no big deal for me since I have studied Hebrew ... once upon a time ... but if I work in logical order then the cursor goes with me and not against me. That makes it worth considering. The major advantage is that I now have the independent forms not the connected ones.
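As far as I understand it, the text itself is always stored in logical order, first letter typed first, and it is the display that decides to run right to left. A minimal Python sketch shows this for a four-letter Syriac word; the bidi_classes helper is my own name, and the bidirectional property comes from the standard unicodedata module.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

# mim, lamadh, kaph, alaph, stored in the order they are typed.
word = "\u0721\u0720\u071F\u0710"

for index, ch in enumerate(word):
    print(index, f"U+{ord(ch):04X}", unicodedata.name(ch),
          unicodedata.bidirectional(ch))

print(bidi_classes("ab"))      # Latin letters
print(bidi_classes("\u05D0"))  # Hebrew alef
```

Index 0 is the first letter of the word even though a right-to-left display puts it at the far right. The class reported for each letter is what tells the display which direction to run.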

Now, if only I knew some Syriac to type. I found the image from yesterday's post and typed in the wordlist. (Minus the two words which have letters that are too small for me to decipher.)

ܛܘܪܐ - turā mountain
ܡܕܝܬܐ - mdittā city
ܡܠܟܐ - malkā king
ܡܠܟܬܐ - malktā queen
ܥܡܐ - ʿammā people
ܟܬܒ - ktab to write
ܢܦܠ - npal to fall
ܥܪܩ - ʿraq to flee
ܫܡܥ - šmaʿ to hear

Syriac is absolutely beautiful and keying it in was a dream. Learning more Syriac actually seems possible. I don't have any unusual abilities in the area of visual memory so there are only a few scripts that I am truly comfortable with. I hope that Syriac will become one of those. I was pleasantly surprised by all the books on Syriac available from Amazon. Neat.

I did notice, however, that there were extra symbols, superscripts or diacritics in the text of the Syriac (Jacobite) script version of the Little Prince. I have no idea what they are. Vowels I would guess, but I don't really know.

The only difficulty I had with this post was that there is no SMALL LETTER T WITH DOT BELOW in the Lucida Sans Unicode Font, which is where I found the left half ring. A problem for another day.

Sunday, November 27, 2005

Inside Malcuno Zcuro

Wolfgang has kindly sent a view of the inside of Malcuno Zcuro, both in the Syriac script and in the Latin script. The book also has a wordlist at the bottom of each page which makes it even more attractive for language learners. Click on these images to enlarge.

I am reposting Wolfgang's email since he provides this interesting information.

Saint-Exupery's "Le Petit Prince" was translated by the "Circle of Aramaic Students" at Heidelberg University, Germany.

I contacted the professor who initiated the Aramaic translation. He assured me that "zcuro" is the correct translation for "little" as far as the Tur Abdin dialect is concerned. He assumes that the persons who came up with "zeuro" must have consulted a dictionary of the Old Aramaic language.

BTW a copy of "Malkuno Zcuro" (ISBN 3-937467-15-7) can be obtained from the following book company:
http://www.verlag-tintenfass.de/
info@verlag-tintenfass.de

And yes I did consult a dictionary of Old Aramaic. However, I have since looked at a few books that are available at Amazon.com on Syriac. These include a dictionary, grammar and various other books. In one I found an example of Syriac vocabulary transcribed with the left half ring for the 'ayn as had been suggested earlier by Simon. However, in the Little Prince orthography the 'ayn is written with a 'c'.

Books available at Amazon.com on Syriac are A Compendious Syriac Dictionary and an Introduction to Syriac: An Elementary Grammar With Readings from Syriac Literature with this editorial review.

Syriac is the Aramaic dialect of Edessa in Mesopotamia. Today it is the classical tongue of the Nestorians and Chaldeans of Iran and Iraq and the liturgical language of the Jacobites of Eastern Anatolia and the Maronites of Greater Syria.

Syriac is also the language of the Church of St. Thomas on the Malabar Coast of India. Syriac belongs to the Levantine group of the central branch of the West Semitic languages. Syriac literature flourished from the third century on and boasts of writers like Ephraem Syrus, Aphraates, Jacob of Sarug, John of Ephesus, Jacob of Edessa, and Barhebraeus.

After the Arab conquests, Syriac became the language of a tolerated but disenfranchised and diminishing community and began a long, slow decline both as a spoken tongue and as a literary medium in favor of Arabic. Syriac played an important role as the intermediary through which Greek learning passed to the Islamic world. Syriac translations also preserve much Middle Iranian wisdom literature that has been lost in the original.

Tim May has pointed out that Meltho OpenType Syriac fonts are available from Beth Mardutho.
Syriac is notable for being one of the scripts on the Xian Stele in China, as well as on the tombstones in Quanzhou. (I have not found an image for this yet.)

(Oddly the Estrangelo Syriac script also appears on the bookplate for the Gleason Moss Collection of H.A. Gleason, Jr., who was my first and well-loved linguistics professor. His father was the botanist H.A. Gleason.)

And finally a nice link here to look at a few related scripts and their transcriptions together in a table. And there is the right half ring and the left half ring. Now I get it.

Actually I intended to end here but really I have to identify the variant of Syriac script which appears in Malkuno Zcuro. It looks like Jacobite or Serto script from comparison with the Omniglot page. At Amazon.com I have found a Syriac Bible in the Jacobite script.

Here is a clip from the Syriac Bible: Jacobite Script, Ancient and for comparison a chunk of non-continuous text from Malkuno Zcuro.


Would it be fair to say that Syriac has several diascripts? Hmm.


Saturday, November 26, 2005

BabelStone Blog

Andrew West's recent post about What's New in Unicode 5.0 provided links to some interesting reading. First, he answered my question about Phoenician. You can read his answer here. I didn't bring this up to reopen a debate which I have no part in. Rather, I was away for the month of August and missed the end of that story.

However, I found a document called N2990 particularly useful. This document not only records votes but also records comments. Among the comments, I noticed this line.

Encoding Phoenician is redundant, and needlessly proliferates Canaanite diascripts.

I googled diascripts and came up with this document which supplied a definition. "Diascript is to script as dialect is to language." Good, one more thing to think about.

Next, in the same document on page 9, I found an interesting item.

For character names and named UCS sequence identifiers, two names shall be considered unique and distinct if they are different even when SPACE and medial HYPHEN-MINUS characters are ignored and even when the words "LETTER", "CHARACTER", and "DIGIT" are ignored in comparison of the names.

EXAMPLE 1
The following hypothetical character names would not be unique and distinct:
MANICHAEAN CHARACTER A
MANICHAEAN LETTER A


That answers another question I had for Andrew about character names. Now I know that the part of the name that designates it a 'character' or a 'letter' is not to be considered significant.

However, this is tricky because if the name of the character differs by the word 'letter' or 'symbol' they are indeed separate characters.

U+03F0 : GREEK KAPPA SYMBOL

U+03BA : GREEK SMALL LETTER KAPPA
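The rule quoted above can be tried out in a few lines. This is a simplified sketch, assuming Python 3; the function names are my own, and it treats every hyphen as ignorable rather than only medial ones, so it is looser than the wording of the standard. It shows why CHARACTER versus LETTER produces a clash while SYMBOL versus LETTER does not.

```python
import unicodedata

# Words the quoted rule says to ignore when comparing names.
IGNORED_WORDS = {"LETTER", "CHARACTER", "DIGIT"}

def comparison_key(name):
    """Collapse a name: drop ignored words, spaces and (all) hyphens."""
    words = name.upper().replace("-", " ").split()
    return "".join(w for w in words if w not in IGNORED_WORDS)

def clash(name_a, name_b):
    """True if the two names would NOT count as unique and distinct."""
    return comparison_key(name_a) == comparison_key(name_b)

print(clash("MANICHAEAN CHARACTER A", "MANICHAEAN LETTER A"))
print(clash(unicodedata.name("\u03f0"), unicodedata.name("\u03ba")))
```

The second comparison stays distinct because SYMBOL and SMALL are not on the ignore list.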

While Andrew has tallied up the number of characters in Unicode in How many Unicode characters are there? I have entertained myself with another of my trivial tasks.

These little trivia games I play sometimes are simply to familiarize myself with a script or a technical detail and entertain myself at the same time. Many have no point at all. Neither does this. It is a tally of the names of characters used in Unicode and gave me a happy half-hour of playing with BabelMap.

Character Names by Block for a few representative blocks.

Arabic Letter
Latin Letter
Bengali Letter
Bopomofo Letter
Braille Pattern Dots
Cherokee Letter
CJK Unified Ideograph
Cypriot Syllable
Deseret Letter
Devanagari Letter
Ethiopic Syllable
Hangul Choseong
Hangul Syllable
Hiragana Letter
Katakana Letter
Linear B Ideogram
Canadian Syllabics
Linear B Syllable

This is just to condition myself so that in the middle of discovering Katakana at some future date I don't do a double take when I discover that they are letters and not syllables. Ethiopic, Cypriot and Hangul have syllables but Cherokee and Katakana do not. The name for Canadian Syllabics seems to feature the name of the block. Surely the character itself is a 'syllabic', while the system is 'syllabics'. I have to think about this too.
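The same sort of tally can be run without BabelMap. This is a rough sketch, assuming Python 3 and its standard unicodedata module; the results depend on the Unicode version bundled with your Python, and the helper names are mine. It takes the first two words of each character name in a range and counts them.

```python
import unicodedata
from collections import Counter

def name_prefix(ch, words=2):
    """First `words` words of a character's Unicode name, or None if unnamed."""
    name = unicodedata.name(ch, None)
    if name is None:
        return None
    return " ".join(name.split()[:words])

def tally(start, end):
    """Count name prefixes for the code points start..end inclusive."""
    counts = Counter()
    for cp in range(start, end + 1):
        prefix = name_prefix(chr(cp))
        if prefix:
            counts[prefix] += 1
    return counts

# Katakana block, U+30A0..U+30FF
for prefix, n in tally(0x30A0, 0x30FF).most_common(3):
    print(prefix, n)
```

Swapping in the range for another block (Cherokee, Ethiopic, Canadian Syllabics) shows at a glance whether its characters are named as letters, syllables or something else.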

However, there they are and I am taking a step towards becoming familiar with these names. It helps if you want to search for a character by name to know the name. I also explored many of the features of BabelPad described in this post.

I look forward to hearing more about Phags-pa some day.

Friday, November 25, 2005

Phoenician Alphabet

First, I have been reading, but not commenting on, the Tel Zayit Abecedary controversy. Somehow, conducting a functional literacy assessment for 1000 BC seemed a little daunting. However, I have now checked out all the links provided by Language Log of Nov. 14 and Nov. 21, 2005. (I can't seem to figure out how to link to these posts directly.)

I also note that Phoenician, among other writing systems, has been accepted for encoding in Unicode version 5. Proposed New Characters: Pipeline Table. So it doesn't seem out of the way to practise typing in Phoenician to get myself accustomed to a new keyboard.

Fortunately, Nizar Habash has posted a little demo here. Actually he is using some kind of frames on this site so follow Research> Human Computer Interface> Phoenician Nuun Demo (Phoenician-English Input Method.)

You can see that I have faithfully reproduced the sequence of letters from the Tel Zayit Abecedary. This keyboard is pretty intuitive and uses only two letters in the shift position: teth and sade. (I can't seem to get SMALL LETTER S WITH DOT BELOW to display for me in blogger. Maybe another day.)

Update:

I must have forgotten a snippet of code yesterday because when I went in today and defined the font as Microsoft Sans Serif the desired character was just fine, thank you very much. This is what I wanted: ṣādē.

Semitish

I received this email a few days ago.

I wanted to ask you about something that I believe I once saw somewhere online but I can't find now. It pertains to a Hebrew and Arabic alphabet reform that someone was proposing, an odd combination of the two alphabets. Does that ring a bell? If so, I'd appreciate it if you could tell me who is behind this so I can look it up. Thanks!

This is something that I had never heard of and it doesn't google very well. So I posted this message in Qalam, the writing systems forum. In about half an hour I received a reply.

While I am delighted to receive such emails - flattered really - readers can themselves go straight to Qalam and bypass me altogether. There you will find 268 script enthusiasts. Right now it looks a little quiet. But here is good too - lots of commenters to augment my musings, thank goodness.

The answer might possibly be the Alphabet of Semitish by Nizar Habash. He has an interesting site to explore. He is an "Associate Research Scientist at the Center for Computational Learning systems in Columbia University." Habash has also invented the Delason Constructed Language and writing system. Of particular interest here is his Palisra Gallery with this introduction.

What is Palisra?

Palisra is an artistic exploration of the nature of a world where Palestinian and Israeli nationalisms never existed. They are replaced by a merged nationalism, that of the people of the Holy Land.


This is an ongoing project that includes creating all elements of an alternative merged nationalism: flag, money notes, stamps, religious art, and language (an Arabic-Hebrew esperanto we are calling Semitish).


Is this a vision of things to come or an elaborate escape of a bloody reality?

That's up to you to decide.


In my next post I hope to feature Nizar's Phoenician input utility!

Chinese Input Method Popularity

Here is an interesting post on Chinese input methods from Lee Sau Dan in Sci.Lang.

Jer writes:

Hi - Has anyone read statistics about input methods used in China? I assume the Pinyin systems would be the most popular, followed by Wubi. (Wubizixing).

Lee Sau Dan writes:

In the sphere of traditional characters, Cangjie is quite popular, because it's ubiquitous and it's what professional typists are trained to use. Zhuyin (based on the bopomofo phonetic transcription system) may come next, but there are many people using various other methods. e.g. People in Hong Kong like to use Jian3yi4, which is a sort of broken Cangjie. Many use Cantonese-based input methods. In Taiwan, some Minnan-based methods are popular, too.

Jer writes:

Anyone care to predict the future? Will it stay as it is now where most people prefer to type the pinyin pronunciation then choose the correct character, but more serious people put in the time to learn Wubi?

Lee Sau Dan writes:

Even those who don't bother to learn Wubi are using something other than plain Pinyin, because the latter is too slow to be used intensively. e.g. there is Jian3pin4, which substitutes some digraphs in Pinyin (e.g. "zh", "sh") with single keystrokes. And phrase-based input methods are gaining ground because of the increased inputting speed.

Jer writes:

I can't really picture a system faster than Wubi taking over.

Lee Sau Dan writes:

Go beyond the "type character by character" mindset and you'll be able to imagine faster methods. Is it too hard to imagine typing "i18n" and have the input method turn it into "internationalization" for you automagically? I don't think so.

Lee Sau Dan 李守敦

These last couple of lines hint at what is ahead in input methods.
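The "i18n" idea is easy to try out. Here is a toy sketch, assuming Python 3; the function names and the tiny word list are invented for illustration and have nothing to do with any real input method. An abbreviation is the first letter, the count of letters in between, and the last letter; expanding it means searching a lexicon for words that abbreviate the same way.

```python
def numeronym(word):
    """Abbreviate as first letter + number of middle letters + last letter."""
    if len(word) < 4:
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"

def expand(abbrev, lexicon):
    """Return every word in the lexicon whose numeronym equals abbrev."""
    return [w for w in lexicon if numeronym(w) == abbrev]

# Hypothetical lexicon; a real input method would use a large dictionary.
LEXICON = ["internationalization", "localization", "input", "method"]

print(expand("i18n", LEXICON))
print(expand("l10n", LEXICON))
```

With a large dictionary an abbreviation would often match several words, so the user would still pick from a candidate list, much as with Pinyin.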

Thursday, November 24, 2005

Addenda and Errata II

I hope no one thinks that this is an encyclopedia; or that I shouldn't be posting if I make the occasional error. Especially when I copy something verbatim from somewhere else without checking the tiny details.

This one I found quite interesting so I'll blog about how I have checked this out.

First, Simon commented here,

I believe the transcription of title should be "Malkuno Zeuro", not "Malkuno Zcuro". It's hard to tell whether it's a "c" or an "e" in the script on the second image, and also the "kaph" and "e" are quite similar in the first image, but ܙܥܘܪܐ makes more sense as "little".

I have to say that it still looks like a 'c' to me but ... I then checked out Simon's blog. Right, he posts in Hebrew so maybe there is something to this.

I then got out Holladay's Hebrew and Aramaic Lexicon. I am not sophisticated enough to find an online dictionary for Aramaic yet so this will have to do. It is just barely back on the shelf from checking out 'Emeth Hesed'. (Yes, it is an 'aleph' that was removed not an 'e' to turn 'emeth' into 'meth'. Another detail that I copied from someone else's story. Actually I knew it was an aleph but the story was being told in English so I went with it. Sloppy, sloppy!)

Anyway... in the Lexicon I found צעירו masculine singular for 'little' or 'small'. So 'zeuro' it is.

Next, to see how the confusion came about I checked the two possibilities that Simon mentioned in Syriac. They do indeed look somewhat similar.
ܙܟܘܪܐ zcuro (a non-existent word)
ܙܥܘܪܐ zeuro meaning little [Addenda: 'zcuro' would be the correct transliteration of this word since 'ayn is often transliterated with a 'c']

Okay, 'zcuro' is an error, [Addenda: zcuro is correct] and now I can see how the error came about. Checking in BabelMap I easily found that the first is 'zain, kaph, waw, rish, alaph' and the other is 'zain, e, waw, rish, alaph.'
[Addenda: The 'e' is better labeled 'ayn' and is pronounced as a pharyngeal fricative, transliterated by 'c']
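The same check can be scripted. This is a small sketch, assuming Python 3 and its standard unicodedata module (BabelMap is what I actually used). It spells out the two candidate words by their Unicode letter names, which is also where the odd name SYRIAC LETTER E for the 'ayn shows up.

```python
import unicodedata

# The two spellings compared above, written as escapes to avoid display problems.
WITH_KAPH = "\u0719\u071f\u0718\u072a\u0710"
WITH_AYN = "\u0719\u0725\u0718\u072a\u0710"

def letter_names(word):
    """Unicode names of the letters, with the 'SYRIAC LETTER ' prefix removed."""
    return [unicodedata.name(ch).replace("SYRIAC LETTER ", "") for ch in word]

print(letter_names(WITH_KAPH))
print(letter_names(WITH_AYN))
```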

Really, no need to make that mistake, but I think the fact that it looked like a 'c' in English threw me off. [Addenda: Yes, it is a 'c'.]

No excuses though. One of the reasons I am blogging is in order to have this kind of give and take, and learn more. I found this little bit of research quite fun, and confirmation that one does not have to just let something go just because it is in another script and an image. Thanks, Simon. I assume that bloggers don't have to be infallible, do they?

I also have updates to these posts.
The Italic Ampersand
Vietnamese Revisited
'Qness' or the tradition of 'Q'
Greg Vilk

Now, where can one buy this book? Hmm. This is the info from Wolfgang Kuhl.

"Malkuno Zcuro" Antoine de Saint-Exupéry's "Le Petit Prince" (The Little Prince) in modern Aramaic language (Tur Abdin dialect) spoken in South East Turkey was printed in Germany and will be available in November 2005. The text is printed in Aramaic script (Syriac) with Latin transcription. The book also contains vocabularies in German, French, English, Turkish as well as in Swedish. BTW "Malkuno" means "prince".

You can find Wolfgang's original notice about this book on this webpage with his email address. Maybe the book is now available.

Endnote #1:

This comment from Lameen Souag has clarified that it is, in fact, zcuro. Thank you, Lameen.


This is etymologically correct; Proto-Semitic (and Arabic) s.aghiir > s.ghiir > zghiir by voicing assimilation > z`iir by regular sound shift. (Dunno why it's got -uu-.) However, it's not orthographically correct: that's a c, not an e, because Semitists often use a c to represent the pharyngeal `ayn.

Lameen also has a fascinating post today about Oldest African Dictionaries.

Endnote #2:

This is from Wolfgang Kuhl, who sent me the information in the first place. My apologies for doubting the original orthography, Wolfgang.

Saint-Exupery's "Le Petit Prince" was translated by the "Circle of Aramaic Students" at Heidelberg University, Germany.

I contacted the professor who initiated the Aramaic translation. He assured me that "zcuro" is the correct translation for "little" as far as the Tur Abdin dialect is concerned. He assumes that the persons who came up with "zeuro" must have consulted a dictionary of the Old Aramaic language.

BTW a copy of "Malkuno Zcuro" (ISBN 3-937467-15-7) can be obtained from the following book company:

http://www.verlag-tintenfass.de/

info@verlag-tintenfass.de

Endnote #3:

Simon continues,

Though as a Unicode purist, I would myself prefer to write it as ʿyn, using U+02BF MODIFIER LETTER LEFT HALF RING

First, why is the 'ayn labeled Syriac letter e in Unicode?

[Paragraph removed to the comment section.]

Definitely Firefox is becoming increasingly necessary because these extra characters are not displaying well in IE especially in the comment section.

Wednesday, November 23, 2005

Where is Your Son?

சிற்றில் நற்றூண் பற்றி நின் மகன்
யாணடூளனோ ஏன வினவுதி ஏன் மகன்
யாண்டு உளன் ஆயினும் ஆறியேன் ஒரும்
புஸி சேரநது பொகிய கல் ஆலை போல
இன்ற வயிறோ இதுவே
தோன்றுவன் மாதோ போர்கள்ளத் தானே

'You stand against the pillar
of my hut and ask:
Where is your son?
I don't really know.
My womb was once
a lair
for that tiger
You can see him now
only in battlefields.'

Kavarpentu, Purananuru 86 (transl. A.K. Ramanujan 1985:184)

This is a poem cited by Sanford Steever in his article on Tamil in World's Writing Systems edited by Peter T. Daniels and William Bright. This book has articles on 80 writing systems. My favourite characteristic of this book is the short selection provided in each writing system with a transliteration, transcription and translation.

(See the full version at the bottom of the page. I have omitted the transcription and left the transliteration unmarked by accents. I haven't learned to keyboard underdots and macrons yet. Sorry.)

I always find these selections reveal something about culture, human nature or both. I chose this poem to keyboard since I was in the mood to type a little Tamil.

I started with the Inscript keyboard and soon found that I needed to use the shift key for every second letter. I had the on-screen keyboard from Start> Programs> Accessories> Accessibility> On-screen keyboard open. However, it only displays either the base state *or* the shift state, not both at once. So hunt and peck didn't work. I then found that there were syllables in the text that I could not readily identify. This is not surprising given that World's Writing Systems uses a variant form of Tamil font.


This is the Tamil keyboard in Windows. I have put the two together myself just to have a way to view them both at once.


I finally ended up using the Tamil phonetic (romanized) keyboard here with syllable display and that went well. Pretty easy once you get used to it. Actually there are two vowels where the shift key is needed. I had forgotten that.

Tamil is where it all began for me. I was working on a multilingual computing project a couple of years ago when I tried getting young people, who were somewhat familiar with typing Tamil in a previous encoding, to use the Inscript keyboard for Unicode Tamil. No way.

It took me over a year to get things sorted out for Tamil - I dropped the project and the rest is history. But if it weren't for this keyboard I would not have felt the need to connect with others and find out more about Unicode and related issues. Most other languages that we needed i.e. Chinese, Russian, Greek, Hebrew, Japanese, Korean and other Latin keyboards were no problem. Vietnamese ... well yes and no. Other languages just didn't seem available at the time.

Text of poem with transliteration and literal translation.

சிற்றில் நற்றூண் பற்றி நின் மகன்
cirril narrun parri nin makan
small house pillar leaning your son

யாணடூளனோ ஏன வினவுதி ஏன் மகன்
yantulano ena vinavuti en makan
where.is.he that you.ask my son

யாண்டு உளன் ஆயினும் ஆறியேன் ஒரும்
yantu ulan ayinum ariyen orum
where he.is that I.don't.know once

புஸி சேரநது பொகிய கல் ஆலை போல
puli cerntu pokiya kal alai pola
tiger joining going stone lair like

இன்ற வயிறோ இதுவே
inra vayiro ituve
begot womb this

தோன்றுவன் மாதோ போர்கள்ளத் தானே
tonruvan mato porkallat tane
appear indeed battlefield only

You stand against the pillar
of my hut and ask:
Where is your son?
I don't really know.
My womb was once
a lair
for that tiger
You can see him now
only in battlefields.

Kavarpentu, Purananuru 86 (transl. A.K. Ramanujan 1985:184)

From Poems of Love and War, selected and translated by A.K. Ramanujan, 1985. Columbia University Press.

Pater Noster

This is a long overdue post. The Christus Rex website displays the Lord's Prayer in 1322 different dialects and languages. Some of these are images of the Lord's Prayer in tiles from the Convent of Pater Noster. Here is the Lord's Prayer in Armenian.

"The Convent of the Pater Noster was built over the site where Jesus taught His disciples the Lord's Prayer. The walls are decorated with 140 ceramic tiles, each one inscribed with the Lord's Prayer in a different language."

If you can add to this internet collection, contact the Christus Rex website (email is on the website.) The website is well-known and has received many internet awards.

A collection of Hail Mary prayers on this website has been contributed by the Marian Library Collection in Dayton, Ohio.

Thanks to Wolfgang Kuhl who contributes to the Christus Rex website and told me about it last year. He also sent me information about the Little Prince in Syriac here.

Tuesday, November 22, 2005

Senari

Christopher Green has written me the following,

I study an African language called Senari for which a native speaker and myself are devising a standardized orthography in hopes of being able to develop computer programs to promote literacy in the language.

Christopher is a graduate student in sociolinguistics at Florida State University, and his blog is on "a wide range of linguistic topics, many of which are about language maintenance and policy."

This is from his post Language of the Week - "N"

The language is Nafara, a dialect of the Gur-language Senari spoken by a cultural group in the northern part of Côte d'Ivoire. I've had the privilege of studying Nafara alongside a native speaker of the language...who incidentally also speaks English, Dyula, French, and Yoruba! This may sound like an amazing and unusual talent, but a great deal of people living in multiethnic west Africa often know 4 or more languages fluently.

So why do I love Nafara so much? Well, back when I first decided that I wanted to be a linguist, I was introduced to Sidiky Diarrasouba, the native Nafara speaker I mentioned just above. He is an educator turned linguist, who decided to come to the United States to investigate a way to develop the necessary materials to revitalize his native language and to promote literacy within his culture.

I have been assisting Sidiky in analyzing the discourse structure of Nafara fables in order to determine a functional grammar and the rules of syntax of his language. We have also been attempting to find a practical orthography so that his language can begin to be written.

I thought that I would look up the little that is already available about this language for starters. Above is the Hail Mary in a previous orthography dated 1931. Next, according to this link, "Detailed dialect survey work is currently being carried out by the SIL in the area." The Rosetta Stone Project also records some kind of orthography for Senoufo (Senari) here.

However, the Ethnologue reports these rather bleak literacy rates so it doesn't sound as if any orthography has much currency at the moment. "Literacy rate in first language: 1% to 5%. Literacy rate in second language: 5% to 15%." and further references here. This is a bit of a reality check for some of us.

For a few dry details: the traditional issues in orthography creation or revision are whether the orthography is similar or dissimilar to the official language orthography; whether it will be phonemic or morphophonemic; at what level it will be standardized, i.e. village, region or district; and whether it will underdifferentiate or not. These are some of the linguistic considerations and there are dozens of books on this topic, so enough of that.

I spend most of my time now checking to see if an orthography 1. displays well on the internet, 2. is easy to search and 3. most of all how easy it is to keyboard.

Some people of interest when working on African orthographies are Don Osborn at Bisharat.net who has written about Senufo here. Also Chris Harvey and Moyogo. Good luck, Chris!

Saturday, November 19, 2005

The All India Keyboard

Recently I wrote about the All India Alphabet. This alphabet has been replaced by an all India transliteration scheme called ITRANS.

There is also an all India keyboard called the Inscript keyboard. This keyboard works well for Devanagari, with its 34 consonants and 12 vowels. The vowels are encoded as both initials and diacritics so that makes 58 letters altogether and a few more symbols. No upper and lower case so all is well.

Tamil, on the other hand, has only 18 consonants and 12 vowels. These vowels have two forms, as in Devanagari. Because these forms are context dependent there is an argument that the two forms could both be input with the same keystroke. That would make 30 letters altogether. In that case, the basic Tamil writing system could be represented on the keyboard in the unshifted state.

Using the Inscript keyboard for Tamil means using a keyboard with 4 blank spaces in the unshifted state, while 3 more keys in the unshifted state have Grantha letters on them. These are letters for writing Sanskrit and are not part of the basic Tamil alphabet. Likewise 7 of the basic Tamil consonants are in the shift state.

You really should be able to type Tamil without using the shift key at all. It may be hard to see but here in the Tamil99 keyboard all the basic letters are in the unshifted state.

In actual fact most Tamils probably use a transliteration IME since that means the shift key is never needed. Who can imagine anything better than that? However, direct input keyboards and typewriter keyboards (IME's) are necessary to provide input for those unfamiliar with the English alphabet or a transliteration scheme.

So why bother mentioning this oddity, the Tamil inscript keyboard? First, because when I started learning to type in Tamil, I was told that this Inscript keyboard was the 'ordinary Tamil keyboard'. And second, because the Inscript keyboard for Tamil is the only Tamil keyboard packaged in Windows.

So there I was 2 years ago trying to learn this strange keyboard and getting more frustrated by the moment. People thought that I was a whiner for complaining about it at all. Now I know better and use an IME of some kind. I actually know how to use this keyboard but when I want to work with someone who is Tamil I generally give it the go-by.

More recently plenty of Tamil transliteration programs and other keyboards have become available as free downloads. My favourite is the online syllabic editor, of course, which was adapted by Richard Wordingham from a Hindi online keyboard, for me to use with Tamil children.

However, the Inscript keyboard remains as the only Tamil keyboard in Windows. If anyone knows what it is doing there, drop me a line.

Greg Vilk

Greg has sent me a copy of his new novel Golem so I have indulged myself for a few days in attempting to decipher the central puzzle of this novel. I have not succeeded in unraveling the mystery but I have spent some enjoyable hours trying.

This novel is set in Thule Bay in northern Greenland. This could only be Qaanaaq, a settlement whose name is a palindrome. Several clues point to the use of the palindrome in deciphering the two 'keywords' of the story, the words written on the scroll placed in the golem's mouth.

In trying to decide if these words were in Hebrew, Latin or English, I first researched the history of the palindrome. Palindromes are an ancient tradition, dating back to 275 BC. I found famous Greek and Latin palindromes but less use of the palindrome in Hebrew. Along with palindromes there are also reversible words. This offers much more scope for decipherment.

The first keyword is the 'word of creation' which brings the golem to life; and the second keyword, a reverse of the first, will destroy him. I found that the effect of the script, with its many reversed letters, (a realistic feature in my books, since I am familiar with many real scripts with reversed letters) distracted me from perceiving the sequence of the letters in reverse. Therefore I reconstructed the keywords by number.

I wrote down the 'word of creation' as 12134521 and its reverse as 21543211. To visualize this better I organized the letters like this 121-345-21 and 21-543-211.

Now, assuming first the simplest interpretation, that the words are understandable in English, I worked on combinations of letters that would fit this pattern. The double final letters could only be ll, ss, or ee. The other possibilities, zz, and ff, seem too improbable. However, maybe I am barking up the wrong tree.
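For anyone who wants to hunt for candidate words mechanically, here is a minimal sketch of the numbering trick, assuming Python 3. The function names and the short word list are made up for illustration; a real search would need a proper English word list. Each distinct letter is numbered by first appearance, and a word fits a pattern when both reduce to the same numbering. Because the digits are simply run together, this only works for words with at most nine distinct letters.

```python
def letter_pattern(word):
    """Number each distinct symbol by first appearance, e.g. 'level' -> '12321'."""
    seen = {}
    digits = []
    for ch in word.lower():
        if ch not in seen:
            seen[ch] = len(seen) + 1
        digits.append(str(seen[ch]))
    return "".join(digits)

def fits(word, pattern):
    """True if the word repeats its letters in the same places as the pattern."""
    return letter_pattern(word) == letter_pattern(pattern)

def candidates(pattern, words):
    """Filter a word list down to the words that fit the pattern."""
    return [w for w in words if fits(w, pattern)]

# Hypothetical word list, only to show the call.
print(candidates("12321", ["level", "radar", "hello", "golem"]))
```

Normalizing the pattern as well as the word matters for the second keyword, since 21543211 starts its numbering from the letters of the first.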

Next, I switched to researching the legend of the golem in history. I found out that one of the original 'words of creation' was 'emeth' (truth) written on the golem's forehead. With the erasure of the 'e' altering 'emeth' to read 'meth' (death), the golem was destroyed. I assume a similar method must work with Vilk's two keywords.

This was just the beginning of the investigations I pursued in working on this puzzle. Overall, the historic elements in this novel referring to the creation of the golem stand up as highly accurate to the original golem legend, which is a pleasant surprise these days. Good work, Greg.

While I have not succeeded in deciphering the ancient script, there are many more tantalizing clues embedded in the text. There are allusions to the first chapter of Genesis, the first chapter of John's gospel, the Lord's Prayer and other famous quotes. I have not ruled out the possibility that the names of the characters also provide clues. You have to read the novel and decide for yourself.

There is one little detail I do have to mention in the interests of 'herstoricity'. The female character should give up her pantyhose, since this item of attire was not invented until 1959, some 17 years after the setting for this novel.

There is an interesting discussion about 'speech' and the letters of the Hebrew alphabet here and here.

Update: In response to a comment on Language Hat I need to add that 'emeth' is אמת and without the aleph מת is 'dead'. This is actually the triliteral root מות. I think there is an expression דבר אמת 'word of truth'. However, in this novel certain conversation points in the direction of a 'word of creation.' Hmm. Help welcome.

On other points, I can not guarantee that I am pointing anyone in the right direction on deciphering Greg's script.

Thursday, November 17, 2005

'Qness' or the tradition of 'Q'

I had a very positive reaction to the Telex input method mentioned by Michael Farris and quoted in my Unikey post. (f, s, r, x, j become the tone keys) After all, the index fingers on the 'f' and 'g' keys, are made for multitasking.
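For readers who have not seen Telex, here is a much simplified sketch of the tone keys, assuming Python 3 and its standard unicodedata module. The function name is mine, and it simply puts the tone on the first vowel of the syllable; real Telex implementations have proper tone-placement rules and also handle the extra Vietnamese vowels, which this ignores.

```python
import unicodedata

# Telex tone keys mapped to Unicode combining marks.
TONE_KEYS = {
    "s": "\u0301",  # sac: acute
    "f": "\u0300",  # huyen: grave
    "r": "\u0309",  # hoi: hook above
    "x": "\u0303",  # nga: tilde
    "j": "\u0323",  # nang: dot below
}

def apply_telex_tone(syllable):
    """If the syllable ends in a tone key, move the tone onto its first vowel."""
    if len(syllable) < 2 or syllable[-1] not in TONE_KEYS:
        return syllable
    body, mark = syllable[:-1], TONE_KEYS[syllable[-1]]
    for i, ch in enumerate(body):
        if ch.lower() in "aeiouy":
            return unicodedata.normalize("NFC", body[: i + 1] + mark + body[i + 1 :])
    return syllable  # no vowel found: leave the keystrokes alone

for typed in ["maf", "mas", "mar", "max", "maj"]:
    print(typed, apply_telex_tone(typed))
```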

However, Mark saw it differently. His reaction was "Ackj! Ohx myg eyesf!" and I thought "What does this have to do with his eyes?" His sensitive fingers maybe - but surely not his eyes.

This alerted me to the fact that not everyone perceives the relationship between the key and the letter stenciled on it in the same way. For me there is an arbitrary relationship at best between the letter portrayed on the key and the key itself.

A key may have a certain English letter stenciled on it but no one key has any one letter as its essential quality. The quality of the upper left lettered key is not 'Qness'; it simply happens to have 'Q' stenciled on it. It has no 'Qness' unless I am typing in the Latin alphabet on a QWERTY keyboard. Then I assign it temporary 'Qness'.

So I was surprised to read another post today on the Better Bibles Blog in which I discovered that indeed there are others who believe in essential 'Qness' or in "Wisdom in the Q Tradition".

From Announcing a perfectly accurate Bible Translation I heard for the first time about a new Bible translation theory in the tradition of 'Q'. Here is an oft-quoted verse in this new translation.

"hOUTWS GAR HGAPHSEN hO QEOS TON KOSMON, hWSTE TON hUION TON MONOGENH EDWKEN, hINA PAS hO PISTEUWN EIS AUTON MH APOLHTAI ALL' ECHi ZWHN AIWNION."

While Mike Sangrey, the author of this post, intends to publish a dictionary of neologisms to support this new translation, I believe that Mark S. would be able to shortcut that process significantly by teaching readers how to understand the essential quality of each key. They need to realize that the letter stenciled on the key is, in fact, the literal *signification* of that key, and any divergence from this literal truth is a perversion of the intent of the original author of the keyboard.

I, however, am not such a literalist, and tend to be more flexible in my assignment of essential qualities. I am a Thomas concerning the 'Qness' of Q and am open to considering the possibility that 'Q' may actually represent Θ in this context.

Note: Mike Sangrey offers a complimentary sushi knife for those who order this translation today.

Update #1:

I guess I should explain this. Q is not actually the input key for Greek theta when using a Greek Unicode keyboard. However, in the Symbol font, a Greek look-alike font for Latin, theta replaces q.

Here is the qwerty keyboard set for the Symbol font. I hope it works.

qwertyuiop
θωερτψυιοπ

asdfghjkl
ασδφγηϕκλ

zxcvbnm
ζξχϖβνμ

And there is the mysterious little digamma, (#6) I believe, fourth from the end in the 'v' position. Correct me if I am wrong.

Update #2

This is the same text as the quote above but with Symbol as the defined font. It is the Latin character set with a Greek look-alike font. It had me fooled the first time I saw it. Somehow I learned to use Greek Unicode first and then I saw this. But for many people it is the other way around.

This is John 3:16. For God so loved the world...

OUTWS GAR HGAPHSEN O QEOS TON KOSMON, WSTE TON UION TON MONOGENH EDWKEN, INA PAS O PISTEUWN EIS AUTON MH APOLHTAI ALL ECH ZWHN AIWNION.
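For anyone who wants to see what the Symbol font does to those Latin letters, here is a minimal sketch in Python. It assumes the usual Symbol font layout as I understand it (q for theta, w for omega, c for chi, and so on); the function name `symbol_to_greek` is my own invention, and the table covers only the 26 letters, not punctuation or digits.

```python
# Latin letters and the Greek glyphs the Symbol font displays for them.
# Assumption: this follows the common Symbol font layout; only A-Z and a-z are covered.
LATIN_LOWER = "abcdefghijklmnopqrstuvwxyz"
GREEK_LOWER = "αβχδεφγηιϕκλμνοπθρστυϖωξψζ"
LATIN_UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
GREEK_UPPER = "ΑΒΧΔΕΦΓΗΙϑΚΛΜΝΟΠΘΡΣΤΥςΩΞΨΖ"

# One translation table for both cases.
SYMBOL_TABLE = str.maketrans(LATIN_LOWER + LATIN_UPPER, GREEK_LOWER + GREEK_UPPER)


def symbol_to_greek(text):
    """Return the Greek letters a reader would see if `text` were set in Symbol."""
    return text.translate(SYMBOL_TABLE)


if __name__ == "__main__":
    print(symbol_to_greek("qwertyuiop"))
    print(symbol_to_greek("OUTWS GAR HGAPHSEN O QEOS TON KOSMON"))
```

The point is that the stored text is still Latin; only the picture changes. A search engine or the find feature sees 'QEOS', never theta.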

Tuesday, November 15, 2005

Spelling in Chinese

After posting on Zhuang I went back and carefully reread the 9 methods of composing characters in Zhuang and non-standard Cantonese in this article.

A Comparison of the Graphical Conventions in the Written Representation of Zhuang and Cantonese by Prof. Robert S. Bauer

I left off with this last sentence,

For various reasons neither the old Zhuang script nor the written form of Cantonese has undergone the formal process of standardization; the lack of standardization has created the phenomenon of allography in both writing systems.

I don't want to go into all 9 conventions here but this is the last one cited.

9) Graphs whose pronunciations are "spelled" by their two component characters; that is, two (typically standard Chinese) characters are combined to form the target character, and the Zhuang or Cantonese reading of one of the characters represents the initial consonant of the target character, while the rime of the second character corresponds to the rime of the target character (this method resembles the 反切 principle that was employed in the ancient Chinese rime books).

I get the impression that rather than using two distinct characters as in fanqie, two components are combined in one character. This is described by the author as "spelling" out the pronunciation in a character.

I returned to Dylan Sung's website on the history of the Chinese language and script for a description of fanqie. (View his sitemap here.)

Splicing sounds

In order to fix the sounds of a character, we needed a method in which to do it. Very early on in the late Han period (25-220), splicing two characters for the initial and rhyme was the method to pin down the sounds. This is known as the FanQie (反切) method. Prior to the Sui (581 - 618) and early Tang (618 - 907) dynasties, the character "fan" 反 was used to symbolise this splicing. After the establishment of the Tang Dynasty, the character "qie" 切 was used.

Here is an example of how Fan and Qie splicing work.


[This character has the] old pronunciation "tung", and both methods use two extra characters, the first of which is the initial, and the second an exact rhyme to our example. The splicing works exactly the same way in both examples.
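The splicing itself is simple enough to sketch in a few lines of Python. This is only an illustration: each syllable is assumed to be already split into an initial and a rime, the function name `fanqie` is mine, and the romanized values are simplified placeholders loosely modelled on the traditional gloss 德紅切 for 東, not a real reconstruction.

```python
def fanqie(upper, lower):
    """Splice two syllables: initial of the first, rime of the second.

    Each argument is an (initial, rime) pair. Splitting a syllable into
    these two parts is assumed to have been done beforehand.
    """
    initial, _unused_rime = upper
    _unused_initial, rime = lower
    return initial + rime


if __name__ == "__main__":
    # Placeholder readings: first speller "t-ok", second speller "h-ung".
    print(fanqie(("t", "ok"), ("h", "ung")))
```

The reader needs to know both speller characters already, which is why the system is called analogical: it only works inside one reader's own pronunciation.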

For a further discussion of fanqie I went here.

The fanqie spelling is a word-based analogical spelling system in which words are spelled in terms of other [familiar] words. Fanqie was never intended to, nor is it capable of, making distinctions beyond those of the words of any given speaker or reader. Neither the rhymes nor the fanqie spellings of the words of any given dialect or literary tradition can be arbitrarily extended (or "refined") so as to include the rhymes or words of another dialect which may have distinguished them differently or which did not distinguish them at all, as the Qieyun compilers indicate.

Or read the book.

I have recently made the delightful but necessarily time-consuming discovery that if a book is listed at Pinyin info it is likely available at the university library near me. I have a stack of these books on my desk, and some of them I have actually read.

Two thoughts from reading all this. First, different kinds of phonography were used to generate new characters or 字 zi. Second, allography is a great term for a phenomenon which fascinates us all - non-standard writing. (Well, most of us.) In the midst of the all-encompassing standardization that is happening as graphs and systems enter Unicode, many of us will be mourning 'allography' or trying to find ways to keep it alive in spite of itself.

Sunday, November 13, 2005

Google

Here are some responses to the Vietnamese search problem that focus on the search engine and not the keyboard. I think this is an issue that anyone who is searching the internet needs to be aware of.

First, Andrew C. commented,

The key issue is that Google, like many web services, does not bother to normalize Unicode strings. Google seems to take it byte by byte. The result is that the Microsoft layout compared to a precomposed (NFC) string or even an NFD string produces different results. The W3C have released a draft version of part of their character model that tackles normalization. http://www.w3.org/TR/2004/WD-charmod-20040225/
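To make the NFC and NFD point concrete, here is a small Python sketch using the standard `unicodedata` module. It says nothing about what Google actually does internally; the helper name `same_text` is hypothetical and only shows what "normalize before comparing" would mean.

```python
import unicodedata

word = "Việt"

# The same word in precomposed (NFC) and fully decomposed (NFD) form.
nfc = unicodedata.normalize("NFC", word)
nfd = unicodedata.normalize("NFD", word)


def same_text(a, b):
    """Compare two strings after bringing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(len(nfc), len(nfd))            # number of code points in each form
    print(nfc.encode("utf-8"))           # the byte sequences differ
    print(nfd.encode("utf-8"))
    print(nfc == nfd, same_text(nfc, nfd))
```

A byte-by-byte comparison treats the two forms as different words, while a comparison after normalization treats them as the same.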

Then Simon responded,

Actually Google makes the effort to normalise the search strings. For example, for Greek, Google knows about cases (does case mappings):
http://www.google.com/search?q=ιστολόγιο
http://www.google.com/search?q=ΙΣΤΟΛΌΓΙΟ
http://www.google.com/search?q=ΙΣΤΟΛΟΓΙΟ
http://www.google.com/search?q=ιστολογιο
and also can work irrespective of accents! This might come from the case mapping rules for Greek; when you capitalise words, the accents are often removed. For more, see http://www.unicode.org/reports/tr21/tr21-5.html

Then Andrew C. continued,

As Simon has indicated, Google has put a lot of work into some languages to optimise searching in those languages. But if you use a language they haven't optimised for, you tend to have problems. As far as I can tell, Google seems to operate on byte sequences rather than character sequences. One trap people fall into is the assumption that because Google has an interface translated into a language, then Google is a suitable search tool for that language.

Recently, I've been researching Khmer search engines. The Google interface has been translated into Khmer, but it doesn't seem to be possible to actually search successfully in Khmer Unicode, even though there are Khmer Unicode sites that have been indexed by Google.

I also know that I don't need accents to google in French. And this week I have been busily working away on my own little project on Andreas Müller (1630-1694). 'Muller', 'Müller' and 'Mueller' all give me the same search results. After a little testing it seems that the precomposed accents - acute, grave, circumflex and umlaut - are normalized. However, maybe not the combining diacritics or even precomposed letters with two diacritics. Hmm. I can't really say.

However, here is another little problem - when I get to the page I want and use the edit:find feature, I have to be exact and use every little accent. I have to search the page using Muller, Müller and Mueller as separate searches. No normalization there! I wondered why all those pages gave me no results.
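Here is a rough Python sketch of the kind of accent folding that a find feature would need in order to treat Müller and Muller alike. It is an assumption about one possible approach, not a description of any browser or of Google: the names `fold` and `find_folded` are made up, and the method simply lowercases, decomposes and drops combining marks.

```python
import unicodedata


def fold(text):
    """Casefold, decompose to NFD, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_folded(page, query):
    """Accent-insensitive and case-insensitive substring search."""
    return fold(query) in fold(page)


if __name__ == "__main__":
    page = "Andreas Müller (1630-1694)"
    for query in ("Muller", "Müller", "Mueller"):
        print(query, find_folded(page, query))
    print(fold("ΙΣΤΟΛΌΓΙΟ") == fold("ιστολογιο"))
```

Note that 'Mueller' still does not match with this approach. Treating 'ue' as 'ü' is a German spelling convention, not something that falls out of Unicode decomposition, so it would need a separate language-specific rule.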

Well, Müller is not going anywhere so I can catch up with him now.

Additional Comments:

On another topic altogether, I don't have time to quote and comment on the many great posts that I read. I assume that if they are in my sidebar people will find them eventually.

However, here are a few things worth mentioning. First, Andrew West has made his first post Tibetan Extensions 1 : Astrological Pebble Symbols on his new BabelStone blog. Then there is Lameen Souag's post on A comparative linguist of the 10th century and finally the ongoing discussion of the Tel Zayit Alphabet on Language Log.

Update #1: See Mike's post for a more refined search engine experiment.

Update #2: See further comment here.

Unikey

I haven't posted much about keyboards lately so this seems like a good time. This is about the Unikey Vietnamese keyboard which has "all 3 popular input methods: TELEX, VNI and VIQR." (Screenshots)

Michael Farris has made this comment about it.

Not exactly your comment, [that's okay. Michael, this is a blog, remember] but for Vietnamese, I use a non-Microsoft keyboard called Unikey. It has several options, I use Unicode precomposed characters and telex input, a Vietnamese system that takes a little getting used to.

Here's a list of some words, with the input on the left, output in the middle and English gloss on the right.

vieejt Việt Vietnamese
nguwowfi người person
tooi tôi I
owr ở at
sawsp sắp imm. future marker
ddax đã past marker

The tone keys (f, s, r, x, j) can be typed either after the vowel or after all the segmental letters of the word have been typed. The latter method is probably better as it assigns the tone marker better in ambiguous cases (but I'm used to writing tone as I go along). It's much faster than when I inputted 100 or so pages of dictionary entries using keyboard shortcuts of my own devising in a floating accent system that I hate with a passion now (can you say awkward and time consuming and frustrating?)

Thanks, Michael, for explaining this. It sounded a little odd at first but entirely suitable kinesthetically. There is a big difference between just finding all the accents in the first place, and then finding an input method that can be easily typed. I still find French awkward. Especially since I have switched keyboards a few times over the years.

Here is another comment on the Telex input method.

That is also the case in Vietnamese "telex style" input. A very popular input method as it allows very fast typing. The vowels with a circumflex, as well as the D stroke, are written by redoubling the letter. Then, unused letters of the Latin alphabet (j, x, ...) are used to indicate the different accents. But those letters can be typed almost anywhere on the syllable (Vietnamese is written with syllables separated by spaces). For example "Vietnam" in Vietnamese is written with the "e" having a circumflex accent and a dot below the letter. With the telex input method: "Vieejt Nam" but also must be accepted "Vieetj Nam" (yes, the accent is always on the last vowel of a syllable with several vowels).

If you think about how the lettered keys will look as you type, this will throw you off. But think of what will display on the screen instead, as the accents are added either after the letter or after the syllable which they modify, up to you. More intuitive than dead keys and no long finger stretches to the top row.

However, the top row is way better than at the side on the quotation mark key. Some of us have very disobedient pinkies - they never do as they are told - better for drinking tea, really.

Another recommended Vietnamese keyboard is VPSKeys.

For Mark, look at this comment about using telex input for Pinyin. Have you ever seen that?

I'm off for a cup of tea. The power of suggestion!

Further from Michael Farris:

Unikey telex input is also forgiving in that you don't have to delete wrong accents. If I mistype owr as owf I just add the r after f (owfr) and it corrects the tone. And tone placement is a little tricky in words with, for example, the sequence -oa- as the tone mark goes on either the o or a depending on the final. Typing the tone right after the vowel is less accurate than typing the tone as the final element (which always places it correctly).

Also, of the five "tone" letters, three are used in Vietnamese: r, x and s are all initial consonants (so their use after vowels is unambiguous).
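Putting the pieces of these comments together, here is a minimal Python sketch of Telex-style input for a single syllable. It is not Unikey's actual algorithm. The function name `telex_syllable` is hypothetical, and the tone placement rule is a rough heuristic of my own: the tone goes on the last vowel that already carries a circumflex, breve or horn, otherwise on the second-to-last vowel of an open syllable, otherwise on the last vowel. It will get some words wrong (syllables beginning with qu or gi, for example).

```python
import unicodedata

# Letter pairs that change the shape of a letter.
SHAPES = {"aa": "â", "ee": "ê", "oo": "ô", "aw": "ă", "ow": "ơ", "uw": "ư", "dd": "đ"}
SHAPES = {k: unicodedata.normalize("NFC", v) for k, v in SHAPES.items()}
# Tone keys mapped to combining marks: grave, acute, hook above, tilde, dot below.
TONES = {"f": "\u0300", "s": "\u0301", "r": "\u0309", "x": "\u0303", "j": "\u0323"}
MARKED = unicodedata.normalize("NFC", "âêôăơư")
VOWELS = "aeiouy" + MARKED


def telex_syllable(keys):
    """Convert the keystrokes for one syllable into Vietnamese text."""
    letters, tone, seen_vowel = [], "", False
    lower = keys.lower()
    i = 0
    while i < len(lower):
        pair, ch = lower[i:i + 2], lower[i]
        if pair in SHAPES:
            letters.append(SHAPES[pair])
            i += 2
        elif ch in TONES and seen_vowel:
            # After a vowel, f s r x j are tone keys; the last one typed wins.
            tone = TONES[ch]
            i += 1
        else:
            letters.append(ch)
            i += 1
        seen_vowel = seen_vowel or letters[-1] in VOWELS

    if tone:
        vowel_pos = [k for k, c in enumerate(letters) if c in VOWELS]
        marked_pos = [k for k in vowel_pos if letters[k] in MARKED]
        if marked_pos:
            target = marked_pos[-1]
        elif len(vowel_pos) > 1 and vowel_pos[-1] == len(letters) - 1:
            target = vowel_pos[-2]
        else:
            target = vowel_pos[-1]
        letters[target] += tone

    word = unicodedata.normalize("NFC", "".join(letters))
    return word.capitalize() if keys[:1].isupper() else word


if __name__ == "__main__":
    for keys in ("Vieejt", "Vieetj", "nguwowfi", "tooi", "owr", "owfr", "sawsp", "ddax"):
        print(keys, telex_syllable(keys))
```

Because s, r and x only count as tone keys once a vowel has been typed, 'sawsp' keeps its initial s and still gets its tone from the second s. And because the last tone key wins, 'owfr' comes out the same as 'owr', which is the forgiving behaviour Michael describes.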