To give even more flexibility to the
programmer in an Asian environment,
ANSI C provides wide character constants
and wide string literals.
These have the same form as their non-wide versions
except that they are immediately prefixed by the letter L:
'x' | regular character constant |
'¥' | regular character constant |
L'x' | wide character constant |
L'¥' | wide character constant |
"abc¥xyz" | regular string literal |
L"abc¥xyz" | wide string literal |
The sequence of bytes that represents ¥
is encoding-specific,
but if it consists of more than one byte,
the value of the character constant '¥'
is implementation defined,
just as the value of 'ab'
is implementation defined.
A regular string literal contains exactly the bytes (except for escape
sequences)
specified between the quotes,
including the bytes of each specified multibyte character.
When the compilation system encounters a wide character constant
or wide string literal,
each multibyte character is converted
(as if by calling the mbtowc()
function)
into a wide character.
Thus, the type of L'¥'
is wchar_t, and the type of L"abc¥xyz"
is array of wchar_t
with length eight.
(Just as with regular string literals,
each wide string literal has an extra zero-valued element appended,
but in these cases it is a wchar_t
with value zero.)
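The element count and the "as if by calling mbtowc()" rule can be checked with a short sketch. It rests on several assumptions that are not part of the text above: the yen sign is written as the C99 universal character name \u00a5 so that the result does not depend on the source file encoding, the run-time conversion is done in the default "C" locale on ASCII-only input, and widen(), wide_lit_elements(), and matches_literal() are hypothetical helper names.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Compile-time conversion: seven characters plus the appended
 * zero-valued wchar_t give an array of eight elements.
 * \u00a5 (C99) stands in for the yen sign. */
static const wchar_t wide_lit[] = L"abc\u00a5xyz";

size_t wide_lit_elements(void)
{
    return sizeof wide_lit / sizeof wide_lit[0];   /* 8 */
}

/* Hypothetical helper: convert a multibyte string to wide characters
 * one character at a time with mbtowc().  Returns the number of wide
 * characters stored (not counting the terminator), or (size_t)-1 on an
 * invalid sequence or if dst is too small. */
size_t widen(wchar_t *dst, size_t dst_len, const char *src)
{
    size_t n = 0;
    size_t remaining = strlen(src);

    if (dst_len == 0)
        return (size_t)-1;

    (void)mbtowc(NULL, NULL, 0);           /* reset any shift state */
    while (remaining > 0) {
        wchar_t wc;
        int used = mbtowc(&wc, src, remaining);

        /* stop on a bad sequence, or if the character plus the
         * terminator would not fit */
        if (used <= 0 || n + 1 >= dst_len)
            return (size_t)-1;
        dst[n++] = wc;
        src += used;
        remaining -= (size_t)used;
    }
    dst[n] = L'\0';
    return n;
}

/* For ASCII input in the "C" locale, the run-time conversion is
 * expected to produce the same elements as the wide literal. */
int matches_literal(void)
{
    wchar_t buf[8];

    return widen(buf, 8, "abcxyz") == 6 && wcscmp(buf, L"abcxyz") == 0;
}
```

For input that contains real multibyte characters, the program would first need to select a suitable locale with setlocale(); which locale names are available is system-specific.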
Just as regular string literals can be used as a short-hand method for character array initialization, wide string literals can be used to initialize wchar_t arrays:
    wchar_t *wp = L"a¥z";
    wchar_t x[] = L"a¥z";
    wchar_t y[] = {L'a', L'¥', L'z', 0};
    wchar_t z[] = {'a', L'¥', 'z', '\0'};

In the above example, the three arrays x, y, and z, and the array pointed to by wp, have the same length, and all are initialized with identical values.
Finally, adjacent wide string literals will be concatenated, just as with regular string literals. However, adjacent regular and wide string literals produce undefined behavior. A compiler is not even required to complain if it does not accept such concatenations.
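As a small illustration of the concatenation rule, the sketch below joins two adjacent wide string literals and compares the result with a single literal. The names joined, single, and literals_match() are hypothetical. Mixed regular and wide concatenation is deliberately left out, since the text above describes it as undefined in ANSI C.

```c
#include <wchar.h>

/* Two adjacent wide string literals: the compiler joins them into one
 * array with a single terminating zero-valued wchar_t. */
static const wchar_t joined[] = L"abc" L"xyz";
static const wchar_t single[] = L"abcxyz";

/* Returns 1 if the concatenated literal equals the single literal. */
int literals_match(void)
{
    return sizeof joined == sizeof single && wcscmp(joined, single) == 0;
}
```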