Saturday, June 11, 2005

Composing the Syllable

Rather than try to visualize some of the 100-plus input methods for the Chinese writing system, it is easier to sort them into categories. This is what I have read: there are three ways to input Chinese, by encoding, by pronunciation and by the structure of the character.

Now I have to ask outright: what does it mean to input by encoding? Is there someone out there who thinks of characters as encodings? Who are these people and what do they do, look up a chart of encodings? Do they suggest that we mere mortals should consider this an input method? Maybe someone will enlighten me and tell me more about the mysteries of code.

Okay, for the rest of us there are two main types of Chinese keyboard input method: by pronunciation or by structure.

Simply put, each Chinese character represents a syllable. Since there are many more Chinese characters or syllables than there are keys on the keyboard (even considering the Shift and Alt keys), these syllables must be composed out of smaller components. There are two ways to do this: first by the pronunciation of the syllable, and second by the structure or shape of the syllable.

To input by pronunciation, one can use Pinyin, Cantonese Pinyin or Zhuyin, also known as Bopomofo. There are also adaptations of these methods. In these methods each sound, consonant or vowel, is input separately, using either letters of the Latin alphabet or Bopomofo characters. Since more than one character will match each syllable by pronunciation, a series of candidate characters is displayed and one must be chosen and confirmed.
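To make the choose-and-confirm step concrete, here is a minimal sketch in Python. It is only an illustration, not how any real input method is built: the CANDIDATES table is a tiny hand-picked sample, tones are ignored, and the function names lookup and choose are my own inventions.

```python
# Toy candidate table: a hypothetical, hand-picked sample, not a real dictionary.
# Tones are ignored, so one typed syllable maps to several characters.
CANDIDATES = {
    "ma": ["妈", "麻", "马", "骂", "吗"],
    "shi": ["是", "十", "时", "事"],
}


def lookup(syllable):
    """Return the characters that match a typed syllable (empty list if none)."""
    return CANDIDATES.get(syllable.lower(), [])


def choose(syllable, position):
    """Simulate the user confirming the candidate at a 1-based position."""
    candidates = lookup(syllable)
    if not 1 <= position <= len(candidates):
        raise IndexError("no candidate at that position")
    return candidates[position - 1]


if __name__ == "__main__":
    print(lookup("ma"))     # the list the user would be shown
    print(choose("ma", 3))  # the user picks the third candidate
```

A real method would rank the candidates, usually by frequency, so that the most likely character comes first in the list.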

To input by structure, the visual features of the character are considered. The strokes or larger components which make up the structure of the character must be analysed in some meaningful way. Older methods were based on radicals and other components of the character like stroke order, number, and direction.
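As a rough illustration of input by stroke order and direction, the sketch below narrows a list of characters as strokes are entered. Everything about the coding is an assumption made for the example: the one-letter stroke labels (h, s, p, n), the five-character STROKES table and the function name candidates are all hypothetical and do not reproduce any particular input method.

```python
# Hypothetical one-letter labels for stroke directions, chosen for this sketch:
#   h = horizontal, s = vertical, p = diagonal falling left, n = diagonal falling right
# Each character is coded by its strokes in standard writing order.
STROKES = {
    "十": "hs",
    "人": "pn",
    "八": "pn",
    "大": "hpn",
    "木": "hspn",
}


def candidates(typed):
    """Return the characters whose stroke sequence starts with the strokes typed so far."""
    return [char for char, code in STROKES.items() if code.startswith(typed)]


if __name__ == "__main__":
    print(candidates("h"))   # every character that begins with a horizontal stroke
    print(candidates("hs"))  # the list narrows as more strokes are entered
    print(candidates("pn"))  # two characters share this sequence, so the user must choose
```

As with pronunciation, the stroke sequence alone does not always single out one character, so a final selection step is still needed.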

Newer JavaScript structure-based input methods like Q9 have the advantage of displaying components of the character on the virtual keyboard, and the user chooses the desired component without having to depend on memory for the correct keystroke. These methods are intuitive and can be used without learning anything about the structure beyond the ability to discriminate whether a stroke is horizontal, vertical or diagonal. (This is an oversimplification but I do want to go to sleep tonight.)

This kind of input, structure-based, is also called glyph-based input since the glyph is the shape or structure of the visual image of the character.


