Creating a text-ordering specification

Currently, a TTableBasedTextOrder instance can create its collation table from a text file that lists comparison elements in order, from least to greatest. Each entry comprises one or more characters and the ordering priority of that entry.

File syntax

To create an entry for a grouped character such as the Spanish ch, list all of the characters that make up the group together in the entry.

To create an entry for an expanding character, separate the main character from the expanded characters with a slash, for example Æ/E.
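
As a rough illustration of these two forms, the following Python sketch splits the character part of an entry into its main characters and its expansion. The name split_entry is hypothetical and not part of the CommonPoint API, and the sketch assumes that a slash never appears as a literal character in an entry.

```python
def split_entry(body: str) -> tuple[str, str]:
    """Split the character part of an entry into (main, expansion).

    Assumption: the first slash, if any, separates the main character(s)
    from the expanded characters. Grouped characters such as "ch" contain
    no slash, so their expansion is empty.
    """
    main, _, expansion = body.partition("/")
    return main, expansion


if __name__ == "__main__":
    print(split_entry("ch"))   # grouped character, no expansion
    print(split_entry("Æ/E"))  # expanding character
```

A grouped character comes back whole with an empty expansion, while an expanding character comes back as two parts.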

Indicate ordering priorities with the following symbols:

* Indicates a primary difference
+ Indicates a secondary difference
- Indicates a tertiary difference
= Indicates an equal relationship (or quaternary difference)
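
One way to read these symbols is as instructions for stepping through three weight levels while walking the file from least to greatest. The Python sketch below assumes that interpretation. It is not taken from the TTableBasedTextOrder implementation, and assign_weights is a hypothetical name.

```python
def assign_weights(entries):
    """Map each entry to a (primary, secondary, tertiary) weight tuple.

    `entries` is a list of (symbol, characters) pairs in file order.
    Assumption: each symbol describes the difference from the entry
    before it, and a difference at one level resets the lower levels.
    """
    primary = secondary = tertiary = 0
    weights = {}
    for symbol, chars in entries:
        if symbol == "*":      # primary difference
            primary, secondary, tertiary = primary + 1, 0, 0
        elif symbol == "+":    # secondary difference
            secondary, tertiary = secondary + 1, 0
        elif symbol == "-":    # tertiary difference
            tertiary += 1
        elif symbol != "=":    # "=" keeps the previous weights unchanged
            raise ValueError(f"unknown priority symbol: {symbol!r}")
        weights[chars] = (primary, secondary, tertiary)
    return weights


if __name__ == "__main__":
    table = assign_weights([("*", "a"), ("=", "á"), ("-", "A"), ("*", "b")])
    for chars, weight in table.items():
        print(chars, weight)
```

Under these assumptions, a and á share the same weights, A differs from a only at the third level, and b differs from all of them at the first level.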

You can also use the following symbols:

$ Indicates that the following four characters reference a hexadecimal Unicode value
@ Indicates the position within the ordering of all characters not listed in the table
# Indicates that this entire line is a comment
{} Encloses comments within a single line
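
The following Python sketch shows one way a reader of such a file might strip comments and decode the $ notation before interpreting an entry. The name clean_line is hypothetical, and the sketch assumes that # marks a comment only at the start of a line and that braces do not nest.

```python
import re

BRACE_COMMENT = re.compile(r"\{[^}]*\}")        # {comment} within a line
HEX_ESCAPE = re.compile(r"\$([0-9A-Fa-f]{4})")  # $ plus four hex digits


def clean_line(line: str) -> str:
    """Return the entry text of one line, or "" for a comment or blank line."""
    line = line.strip()
    if line.startswith("#"):
        return ""
    # Remove comments first so that a "$" inside a comment is never decoded.
    line = BRACE_COMMENT.sub("", line)
    # Replace each $XXXX with the character it names.
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), line)


if __name__ == "__main__":
    print(clean_line("=$00E7{LATIN SMALL LETTER C WITH CEDILLA}/$0327{COMBINING CEDILLA}"))
    print(repr(clean_line("# Accents have secondary differences.")))
```

The @ symbol needs no decoding and passes through unchanged, so the caller can treat it as the marker for unlisted characters.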

Sample entries

The following sample entries come from the text file specification for the English ordering object:

      # Control characters are all considered equal to NULL.
      =$0000{NULL}
      =$0001{START OF HEADING}
      =$0002{START OF TEXT}
          .
          .
          .
      # Accents have secondary differences.
      +$0301{COMBINING ACUTE ACCENT}
      +$0300{COMBINING GRAVE ACCENT}
      +$0302{COMBINING CIRCUMFLEX ACCENT}
      +$0308{COMBINING DIAERESIS}
          .
          .
          .
      # Rules for alphabetic character sorting.
      *a
      =$00E1{LATIN SMALL LETTER A WITH ACUTE}/$0301{COMBINING ACUTE ACCENT}
      =$00E0{LATIN SMALL LETTER A WITH GRAVE}/$0300{COMBINING GRAVE ACCENT}
      =$00E2{LATIN SMALL LETTER A WITH CIRCUMFLEX}/$0302{COMBINING CIRCUMFLEX ACCENT}
      =$00E3{LATIN SMALL LETTER A WITH TILDE}/$0303{COMBINING TILDE}
      =$00E4{LATIN SMALL LETTER A WITH DIAERESIS}/$0308{COMBINING DIAERESIS}
      -A
      =$00C1{LATIN CAPITAL LETTER A WITH ACUTE}/$0301{COMBINING ACUTE ACCENT}
      =$00C0{LATIN CAPITAL LETTER A WITH GRAVE}/$0300{COMBINING GRAVE ACCENT}
      =$00C2{LATIN CAPITAL LETTER A WITH CIRCUMFLEX}/$0302{COMBINING CIRCUMFLEX ACCENT}
      =$00C3{LATIN CAPITAL LETTER A WITH TILDE}/$0303{COMBINING TILDE}
      =$00C4{LATIN CAPITAL LETTER A WITH DIAERESIS}/$0308{COMBINING DIAERESIS}
      +$00E6{LATIN SMALL LIGATURE AE}/e
      -$00C6{LATIN CAPITAL LIGATURE AE}/e
      +$00E5{LATIN SMALL LETTER A WITH RING ABOVE}/a
      -$00C5{LATIN CAPITAL LETTER A WITH RING ABOVE}/a
      *b
      -B
      *c
      =$00E7{LATIN SMALL LETTER C WITH CEDILLA}/$0327{COMBINING CEDILLA}
      -C
      =$00C7{LATIN CAPITAL LETTER C WITH CEDILLA}/$0327{COMBINING CEDILLA}
      *d
      -D
      *$00F0{LATIN SMALL LETTER ETH (Icelandic)}
      -$00D0{LATIN CAPITAL LETTER ETH (Icelandic)}
      *e
      =$00E9{LATIN SMALL LETTER E WITH ACUTE}/$0301{COMBINING ACUTE ACCENT}
      =$00E8{LATIN SMALL LETTER E WITH GRAVE}/$0300{COMBINING GRAVE ACCENT}
      =$00EA{LATIN SMALL LETTER E WITH CIRCUMFLEX}/$0302{COMBINING CIRCUMFLEX ACCENT}
      =$00EB{LATIN SMALL LETTER E WITH DIAERESIS}/$0308{COMBINING DIAERESIS}
      -E
      =$00C9{LATIN CAPITAL LETTER E WITH ACUTE}/$0301{COMBINING ACUTE ACCENT}
      =$00C8{LATIN CAPITAL LETTER E WITH GRAVE}/$0300{COMBINING GRAVE ACCENT}
      =$00CA{LATIN CAPITAL LETTER E WITH CIRCUMFLEX}/$0302{COMBINING CIRCUMFLEX ACCENT}
      =$00CB{LATIN CAPITAL LETTER E WITH DIAERESIS}/$0308{COMBINING DIAERESIS}
      *f
      -F
      *g
      -G
      *h
      -H
          .
          .
          .
      # All unmentioned characters are sorted after the specified characters.
      @

NOTE Some of the CommonPoint application system files that contain collation tables were created using the Macintosh character set and may not display correctly on other platforms. These collation tables still work correctly, but you may not be able to view them. This is intended to be fixed in a later release.
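
The sample ends with an @ entry, which places every unlisted character after the listed ones. The Python sketch below illustrates how such a position might be applied when ranking single characters. The name make_rank is hypothetical, and both ordering unlisted characters among themselves by code point and defaulting to the end when @ is absent are assumptions of this sketch, not documented behavior.

```python
def make_rank(ordered_items):
    """Build a sort-key function from listed characters in table order.

    `ordered_items` holds the listed characters, with "@" marking where
    all unlisted characters belong.
    """
    position = {item: index for index, item in enumerate(ordered_items)}
    # Assumption: with no "@" entry, unlisted characters go last.
    unlisted = position.get("@", len(ordered_items))

    def rank(ch):
        if ch != "@" and ch in position:
            return (position[ch], 0)
        # Assumption: unlisted characters are ordered by code point.
        return (unlisted, ord(ch))

    return rank


if __name__ == "__main__":
    rank = make_rank(["a", "b", "c", "@"])
    print(sorted("c?ba!", key=rank))
```

Moving the @ entry to the top of the list would instead place unlisted characters before all listed ones.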


Copyright © 1995 Taligent, Inc. All rights reserved.
