Complying with standard C

C language features

To give even more flexibility to the programmer in an Asian environment, ANSI C provides wide character constants and wide string literals. These have the same form as their non-wide versions except that they are immediately prefixed by the letter L:

   'x'          regular character constant
   '¥'          regular character constant
   L'x'         wide character constant
   L'¥'         wide character constant
   "abc¥xyz"    regular string literal
   L"abc¥xyz"   wide string literal

Notice that multibyte characters are valid in both the regular and wide versions. The sequence of bytes necessary to produce the ideogram ¥ is encoding-specific, but if it consists of more than one byte, the value of the character constant '¥' is implementation-defined, just as the value of 'ab' is implementation-defined. A regular string literal contains exactly the bytes specified between the quotes (apart from escape sequences, which are translated), including the bytes of each specified multibyte character.
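As a rough illustration of the last point, the sketch below builds a regular string literal whose middle character occupies two bytes. It assumes a UTF-8 encoding, in which ¥ is the byte pair 0xC2 0xA5, and spells those bytes with hexadecimal escapes so that the result does not depend on how the source file is saved. The function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 'a', the two assumed UTF-8 bytes of the yen sign, then 'z'. */
static const char narrow[] = "a\xc2\xa5z";

/* Size of the array in bytes, including the terminating '\0'. */
size_t narrow_size(void)
{
    return sizeof narrow;
}

/* Length in bytes, not in characters: the multibyte character counts twice. */
size_t narrow_length(void)
{
    return strlen(narrow);
}

/* Bounds-checked access to a single byte of the literal. */
unsigned char narrow_byte(size_t i)
{
    assert(i < sizeof narrow);
    return (unsigned char)narrow[i];
}
```

Under that assumption the string holds three characters but four bytes, which is why strlen() and a character count disagree for multibyte text.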

When the compilation system encounters a wide character constant or wide string literal, each multibyte character is converted (as if by calling the mbtowc() function) into a wide character. Thus, the type of L'¥' is wchar_t, and the type of L"abc¥xyz" is array of wchar_t with length eight. (Just as with regular string literals, each wide string literal has an extra zero-valued element appended, but in this case it is a wchar_t with value zero.)
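The element count can be checked with sizeof, as in the following minimal sketch. To keep the source independent of the file's encoding, it writes ¥ as the universal character name \u00a5, which is a C99 feature rather than part of the original ANSI C standard. That substitution and the function names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Seven wide characters plus the appended zero-valued wchar_t. */
static const wchar_t wide[] = L"abc\u00a5xyz";

/* Number of wchar_t elements in the array, terminator included. */
size_t wide_elements(void)
{
    return sizeof wide / sizeof wide[0];
}

/* A wide character constant has type wchar_t, so the sizes agree. */
int wide_constant_is_wchar_t_sized(void)
{
    return sizeof(L'x') == sizeof(wchar_t);
}

/* Bounds-checked access to one element of the array. */
wchar_t wide_element(size_t i)
{
    assert(i < wide_elements());
    return wide[i];
}
```

Each element is one wchar_t regardless of how many bytes the character needed in its multibyte form, so the count stays at eight even when sizeof(wchar_t) differs between platforms.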

Just as regular string literals can be used as a short-hand method for character array initialization, wide string literals can be used to initialize wchar_t arrays:

   wchar_t *wp = L"a¥z";
   wchar_t x[] = L"a¥z";
   wchar_t y[] = {L'a', L'¥', L'z', 0};
   wchar_t z[] = {'a', L'¥', 'z', '\0'};

In the above example, the three arrays x, y, and z, and the array pointed to by wp, have the same length and all are initialized with identical values.
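The claim that all four objects hold the same values can be checked with wmemcmp(), as in this sketch. It declares wp as a pointer to wchar_t and, as an assumption for portability of the example, writes ¥ as the C99 universal character name \u00a5. The helper function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

static const wchar_t *wp = L"a\u00a5z";
static const wchar_t x[] = L"a\u00a5z";
static const wchar_t y[] = {L'a', L'\u00a5', L'z', 0};
static const wchar_t z[] = {'a', L'\u00a5', 'z', '\0'};

/* Element count of x, including the terminating zero element. */
size_t x_elements(void)
{
    return sizeof x / sizeof x[0];
}

/* Nonzero if x, y, z, and the array that wp points to all match. */
int arrays_match(void)
{
    size_t n = x_elements();

    /* wp is a pointer, so sizeof cannot report the array's length;
       confirm it with wcslen() before comparing n elements through it. */
    assert(wcslen(wp) + 1 == n);

    return sizeof y == sizeof x
        && sizeof z == sizeof x
        && wmemcmp(x, y, n) == 0
        && wmemcmp(x, z, n) == 0
        && wmemcmp(x, wp, n) == 0;
}
```

The regular constants 'a', 'z', and '\0' in z are converted to wchar_t during initialization, which is why z compares equal to the arrays built from wide constants.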

Finally, adjacent wide string literals are concatenated, just as regular string literals are. However, under ANSI C, placing a regular string literal next to a wide string literal produces undefined behavior. A compiler that does not accept such a concatenation is not even required to issue a diagnostic.
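The well-defined case, in which every adjacent literal is wide, can be sketched as follows. The example again uses the C99 universal character name \u00a5 in place of ¥ as an assumption, and its function names are hypothetical. It deliberately avoids mixing regular and wide literals.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Three adjacent wide string literals become one wide string literal. */
static const wchar_t joined[] = L"abc" L"\u00a5" L"xyz";
static const wchar_t single[] = L"abc\u00a5xyz";

/* Element count of the concatenated array, terminator included. */
size_t joined_elements(void)
{
    return sizeof joined / sizeof joined[0];
}

/* Nonzero if the concatenated literal equals the single literal. */
int joined_equals_single(void)
{
    /* Only one zero element is appended, after concatenation. */
    assert(sizeof joined == sizeof single);
    return wcscmp(joined, single) == 0;
}
```

Splitting a long wide string across several source lines this way is safe as long as every piece carries the L prefix.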
