
HanGlyph: a Chinese Character Description Language

HanGlyph is a Chinese character description language. Its definition is based on historical and contemporary studies of the structure of Chinese characters. The most crucial characteristic of the language is that it is abstract: it captures only the topological relations among the strokes that form a character.

[Figure: variations of the character 中 (zhong)]

The essential information needed to distinguish a Chinese character is the arrangement of its strokes. The precise location of each stroke can vary considerably, up to a certain threshold, and the character can still be recognized correctly. For instance, the figure shows several variations of the character 中; all of them are recognizable even though the locations of the strokes differ considerably.

On the other hand, some characters are very similar. For example, the two characters 土 and 士 comprise exactly the same strokes in exactly the same arrangement. The only difference between them is the relative length of the two horizontal strokes. How much longer one horizontal stroke is than the other is unimportant for distinguishing between them. To recognize the character 土, the threshold is only that the upper horizontal stroke must be shorter than the lower one. Therefore, the HanGlyph language does not describe the precise geometric information of the characters.
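The point about 土 and 士 can be illustrated with a minimal Python sketch. It is not part of HanGlyph; the function name `classify_tu_shi` and its parameters are hypothetical and exist only to show that the decision depends on the relation between the two lengths, not on their actual values.

```python
def classify_tu_shi(upper_len: float, lower_len: float) -> str:
    """Tell 土 from 士 using only the relative length of the two
    horizontal strokes (hypothetical helper, for illustration only)."""
    if upper_len < lower_len:
        return "土"  # upper horizontal stroke shorter than the lower one
    if upper_len > lower_len:
        return "士"  # upper horizontal stroke longer than the lower one
    return "ambiguous"  # equal lengths: the relation does not decide


# The same relation gives the same answer at any scale.
print(classify_tu_shi(3.0, 5.0))
print(classify_tu_shi(30.0, 31.0))
```

Both calls satisfy the same relation (upper shorter than lower), so both are classified as 土 regardless of how large the difference is.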

[Figure: composition of the character 明 (ming)]

HanGlyph describes Chinese characters in a constructive manner. This means that a character is composed by a sequence of operations applied to strokes. The result of each operation is a new glyph, which can in turn be combined with another stroke or glyph, repeatedly, to form more complicated glyphs. The figure illustrates the composition of the character 明.

The strokes

Strokes are the primitive building blocks of characters. The set of strokes in HanGlyph consists of 41 shapes. They are based on the modern writing style of Kai (楷書). Their selection is the result of careful study of many Chinese linguistic and graphological works.

When writing HanGlyph expressions, it is important to view strokes as abstract entities. For instance, a horizontal stroke (橫) is merely a trajectory whose two ends are at a similar height. The thickness and any possible decoration are irrelevant; they are considered only when the character is rendered.

Each primitive stroke is assigned a Latin letter as its code so that users can easily write HanGlyph expressions on a standard QWERTY keyboard. All strokes are listed in the stroke table (in PDF format), and their properties are described in the Strokes section of the HanGlyph Language Reference Manual.

The operators and relations

To form a Chinese character, one combines primitive strokes using operations. Five operators are defined, as listed in the operators table. Each operator combines two operands to form a stroke cluster. This operation continues recursively until the desired character is formed. For example, to describe the character 土, one may first combine two horizontal strokes using the top-bottom operator, then use the cross operator to add a vertical stroke. The HanGlyph expression for this character (written in ASCII characters) is h h=s+.

HanGlyph expressions are written in postfix notation, i.e., you write the strokes first, then the operator that combines them.
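To make the postfix reading concrete, here is a minimal Python sketch of a stack-based evaluator for the expression h h=s+. It is an illustration only, not the HanGlyph implementation. It assumes that each stroke and operator is a single ASCII character, that whitespace merely separates tokens, and it knows only the two strokes (h, s) and two operators (=, +) that appear in the example; the remaining strokes and operators are defined in the stroke table and the operators table. The names `Stroke`, `Cluster`, `parse_postfix`, and `describe` are hypothetical.

```python
from dataclasses import dataclass
from typing import Union

# Assumption: only the symbols used in the example expression are known here.
STROKES = {"h": "horizontal", "s": "vertical"}
OPERATORS = {"=": "top-bottom", "+": "cross"}


@dataclass(frozen=True)
class Stroke:
    code: str


@dataclass(frozen=True)
class Cluster:
    op: str
    first: "Glyph"
    second: "Glyph"


Glyph = Union[Stroke, Cluster]


def parse_postfix(expr: str) -> Glyph:
    """Build a glyph tree from a postfix expression using a stack."""
    stack = []
    for tok in expr:
        if tok.isspace():
            continue  # assumed: whitespace only separates tokens
        if tok in STROKES:
            stack.append(Stroke(tok))
        elif tok in OPERATORS:
            if len(stack) < 2:
                raise ValueError(f"operator {tok!r} needs two operands")
            second = stack.pop()
            first = stack.pop()
            stack.append(Cluster(tok, first, second))
        else:
            raise ValueError(f"unknown symbol {tok!r}")
    if len(stack) != 1:
        raise ValueError("expression does not reduce to a single glyph")
    return stack[0]


def describe(glyph: Glyph) -> str:
    """Render the tree in a readable prefix form."""
    if isinstance(glyph, Stroke):
        return STROKES[glyph.code]
    return f"{OPERATORS[glyph.op]}({describe(glyph.first)}, {describe(glyph.second)})"


print(describe(parse_postfix("h h=s+")))
# prints: cross(top-bottom(horizontal, horizontal), vertical)
```

The stack mirrors the constructive description: the two horizontal strokes are combined first, and the resulting cluster then becomes an operand of the cross operator together with the vertical stroke.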

A summary of the language can be found here.