Unicode character encoding

The CommonPoint application system uses the Unicode character set standard to represent all text data internally. Unicode, a fixed-width 16-bit character encoding system, contains codes for every character needed by the major writing systems in use throughout the world today. Unicode provides full character coverage for the major scripts, listed in the table below, as well as punctuation, symbols, and control characters. The character set for each script is independent; that is, even if a character appears in multiple scripts, it has a separate code within each script. For example, the character A has one code in the Roman alphabet and a different code in the Greek alphabet. Note that this applies to scripts, not languages: English and French both use the Roman script, so they share a single code for the character A.

Arabic      Greek      Kana       Tamil
Armenian    Gujarati   Kannada    Telugu
Bengali     Gurmukhi   Lao        Thai
Cyrillic    Han        Malayalam  Zhuyinfuhao
Devanagari  Hangul     Oriya
Georgian    Hebrew     Roman
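
The per-script encoding described above can be checked with a short sketch. It uses Python's standard unicodedata module rather than any CommonPoint class, so the names LOOKALIKES and describe are illustrative assumptions, and the character data comes from a newer Unicode version than the one CommonPoint implements. The code points shown have been stable since Unicode 1.1.

```python
import unicodedata

# Three capital letters that look alike, one from each of three scripts.
# (Hypothetical example data, not part of CommonPoint.)
LOOKALIKES = ["\u0041", "\u0391", "\u0410"]


def describe(ch):
    """Return the code point (as U+XXXX) and the Unicode name of ch."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


for ch in LOOKALIKES:
    code, name = describe(ch)
    # Each of these characters fits in a single 16-bit code unit.
    width_in_bytes = len(ch.encode("utf-16-be"))
    print(code, name, width_in_bytes)
```

Each letter reports a different code and a name that begins with its script (LATIN, GREEK, CYRILLIC), even though the glyphs are usually drawn identically.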

In addition to the script name, Unicode associates other semantic information with each character. Each character has type properties that describe the usage of the character. Some examples of type properties are:

The CommonPoint implementation of Unicode characters provides access to all character property information. Never modify the semantic information associated with a character. If you need to provide a character with different semantics, add a new character to the private use area (see "Unicode private use area" on page 24).
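
As a rough illustration of read-only property lookup, the sketch below queries a few character properties with Python's standard unicodedata module. This is not the CommonPoint interface; the helper name properties and the choice of properties shown are assumptions made for the example, and the data reflects a current Unicode version in which the private use area begins at U+E000.

```python
import unicodedata


def properties(ch):
    """Collect a few properties of ch without modifying anything."""
    return {
        # General category, e.g. 'Lu' for an uppercase letter.
        "category": unicodedata.category(ch),
        # Bidirectional class, e.g. 'L' for left-to-right.
        "bidi": unicodedata.bidirectional(ch),
        # Digit value if the character is a digit, otherwise None.
        "digit": unicodedata.digit(ch, None),
    }


print(properties("A"))
print(properties("\u0663"))  # ARABIC-INDIC DIGIT THREE
print(properties("\uE000"))  # a private use code point
```

The lookup only reads the property tables. A character that needs different semantics gets its own private use code point, which is reported under a separate category, rather than a changed entry for an existing character.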

For details on Unicode design and usage, see The Unicode Standard, Version 1.0, 2 volumes (Addison-Wesley, 1991) and Unicode Technical Report #4 (Unicode Inc., 1993). Unicode Technical Report #4 describes the amendments to Unicode 1.0 that constitute Unicode 1.1, the version of Unicode implemented by the CommonPoint application system.


Copyright © 1995 Taligent, Inc. All rights reserved.
