In general, characters may be specified in one of six different ways
(the following examples all specify the ASCII character ``A''):
   65        decimal
   0101      octal
   0x41      hexadecimal
   'A'       quoted character
   '\101'    quoted octal
   '\x41'    quoted hexadecimal
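
The six forms above can be illustrated with a short Python sketch. This is not part of coltbl; the function name parse_char is hypothetical, and the rules for telling the forms apart (a leading 0x means hexadecimal, any other leading 0 means octal) are assumptions inferred from the examples.

```python
def parse_char(token: str) -> int:
    """Return the character code named by one character specification.

    Hypothetical helper: the disambiguation rules are inferred from the
    six examples in the text, not taken from the coltbl source.
    """
    if len(token) >= 3 and token[0] == "'" and token[-1] == "'":
        body = token[1:-1]
        if len(body) == 1:
            return ord(body)             # 'A'     quoted character
        if body.startswith("\\x"):
            return int(body[2:], 16)     # '\x41'  quoted hexadecimal
        if body.startswith("\\"):
            return int(body[1:], 8)      # '\101'  quoted octal
        raise ValueError(f"unrecognized quoted form: {token}")
    if token.startswith("0x"):
        return int(token[2:], 16)        # 0x41    hexadecimal
    if len(token) > 1 and token[0] == "0":
        return int(token[1:], 8)         # 0101    octal
    return int(token, 10)                # 65      decimal


# All six spellings from the text should name the same character.
forms = ["65", "0101", "0x41", "'A'", "'\\101'", "'\\x41'"]
codes = [parse_char(form) for form in forms]
print(codes)
```

Each of the six spellings resolves to the same code, 65, which is the ASCII code of ``A''.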
The specification file is largely free format. Each type of definition
is introduced by one of the following keywords:
   PRIM:   ZERO:   EQUIV:   DOUBLE:
The PRIM:, ZERO: and EQUIV: keywords directly set the collation ordering of characters.
Characters that are to collate as equal, except when every other
character in the two strings being compared is also equal, are grouped
together with the PRIM: keyword.
The position of a particular group in the specification file
is significant as far as the collation ordering is concerned.
Collating elements following the PRIM:
keyword are separated by white space.
A two-character collating element can be specified here by
(ab), where a and b
are the two characters making up the sequence. The order of the collating
elements defined in one group is significant in secondary collation
ordering.
It is also possible to define a range of characters, for example:
   PRIM: 'a' - 'z'
Collating elements following the ZERO: keyword are ignored
when collating. The format of the definitions is the same as for
PRIM:. Ranges of characters can also be defined, for example:
   ZERO: 0x80 - 0x9f
EQUIV: gives two collating elements identical positions
in the collation ordering. The syntax is:
   EQUIV: a = b
where a and b are the two equal collating elements. Each occurrence of this keyword can introduce only one definition.
Single characters that are to be collated as two characters, for example
the German sharp s, are defined with the DOUBLE: keyword.
The syntax is:
   DOUBLE: a = (b c)
where a is the single character, and b and c are the two characters in the collating sequence. Each occurrence of this keyword can introduce only one definition. The single character a must not also appear after a PRIM:, a ZERO: or an EQUIV: keyword.
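
The restriction on the DOUBLE: character can be checked mechanically. The Python sketch below is illustrative only and is not part of coltbl. The function name find_double_conflicts is hypothetical. It assumes one definition per line with comments already removed, compares tokens by spelling only (so 'A' and 65 are not recognized as the same character), and does not expand ranges.

```python
def find_double_conflicts(lines):
    """Return DOUBLE: single characters that also appear under another keyword.

    Hypothetical checker: assumes one definition per line, whitespace
    separated tokens, and matches tokens by spelling only.
    """
    doubles, others = set(), set()
    for line in lines:
        keyword, _, rest = line.strip().partition(" ")
        if keyword == "DOUBLE:":
            # Text to the left of '=' is the single character being defined.
            doubles.add(rest.split("=", 1)[0].strip())
        elif keyword in ("PRIM:", "ZERO:", "EQUIV:"):
            # Skip the range and equivalence operators, keep the elements.
            others.update(tok for tok in rest.split() if tok not in ("-", "="))
    return sorted(doubles & others)


# Made-up specification lines: 0xdf is defined with DOUBLE: and also
# listed after PRIM:, which the text forbids.
spec = [
    "PRIM: 's' 0xdf",
    "ZERO: 0x80 - 0x9f",
    "DOUBLE: 0xdf = ('s' 's')",
]
conflicts = find_double_conflicts(spec)
print(conflicts)
```

A fuller checker would first convert every element to its character code and expand ranges, so that different spellings of the same character are caught as well.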
All characters following the hash character (#) up to the end of the line are treated as a comment and ignored, unless the hash is within a quoted string.
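
The comment rule can be sketched in Python as follows. This is an illustration, not the coltbl implementation. The function name strip_comment and the sample lines are hypothetical, and the treatment of a backslash inside quotes (the next character is skipped, so an escaped quote does not end the string) is an assumption that the text does not confirm.

```python
def strip_comment(line: str) -> str:
    """Remove a trailing # comment unless the hash is inside single quotes.

    Hypothetical helper; backslash handling inside quotes is an assumption.
    """
    in_quote = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quote:
            if ch == "\\" and i + 1 < len(line):
                i += 2               # skip the escaped character
                continue
            if ch == "'":
                in_quote = False     # closing quote
        elif ch == "'":
            in_quote = True          # opening quote
        elif ch == "#":
            return line[:i].rstrip() # comment starts here
        i += 1
    return line.rstrip()


# Made-up sample lines using the syntax described in the text.
SAMPLE = """\
# sample collation specification
PRIM: 'a' - 'z'     # lower-case letters
PRIM: '#'           # the quoted hash is data, not a comment
ZERO: 0x80 - 0x9f
"""

cleaned = [strip_comment(line) for line in SAMPLE.splitlines()]
for line in cleaned:
    print(repr(line))
```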
The concise format locale table is placed in a file named collate in the current directory. This file should be copied or moved to the correct place in the setlocale file tree (see locale(M)). To prevent accidental corruption of the output data, the file is created with no write permission. If the coltbl utility is run in a directory containing a write-protected collate file, it asks whether the existing file should be replaced. Any response other than ``yes'' or ``y'' causes coltbl to terminate without overwriting the existing file.