Developing distributed applications using ONC RPC and XDR

XDR standard

This section defines the XDR standard. This standard is independent of languages, operating systems and hardware architectures. Once data is shared among machines, it should not matter that the data was produced on an SCO OpenServer system but is consumed by a Sun workstation, or the reverse. Similarly, the choice of operating system should have no influence on how the data is represented externally. Likewise, data produced by a C program should be readable by a FORTRAN or Pascal program.

The XDR standard depends on the assumption that bytes (or octets) are portable. A byte is defined to be eight bits of data. It is assumed that the hardware that encodes bytes onto various media preserves the meaning of those bytes across hardware boundaries. For example, the Ethernet standard suggests that bytes be encoded in ``little endian'' style, that is, least significant bit first. Hardware implementations of both Sun workstation and SCO OpenServer platforms adhere to the standard.

The XDR standard also suggests a language used to describe data. The language is a variant of C; it is a data description language, not a programming language. In a similar way, the Xerox Courier Standard uses a variant of Mesa as its data description language.

Basic block size

The representation of all items requires a multiple of four bytes (or 32 bits) of data. The bytes are numbered 0 through n-1. The bytes are read from or written to some byte stream such that byte m always precedes byte m+1.
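The four-byte rule can be illustrated with a small Python sketch. The helper name padded_length is hypothetical and is not part of any XDR library; it only shows how a byte count is rounded up to the block size.

```python
def padded_length(n):
    """Round a byte count up to the next multiple of four (the XDR block size)."""
    return (n + 3) // 4 * 4
```

Under this rule a 5-byte item occupies 8 bytes in the stream, while a 4-byte item occupies exactly 4.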

Integer

An XDR signed integer is a 32-bit piece of data that encodes an integer in the range [-2147483648,2147483647]. The integer is represented in two's complement notation. The most and least significant bytes are 0 and 3, respectively. The data description of integers is integer.
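As an illustration only, the following Python sketch produces the byte layout described above. The helper names encode_int and decode_int are hypothetical; the sketch assumes Python's standard struct module, whose ">i" format yields a 4-byte, big-endian, two's complement value.

```python
import struct

def encode_int(value):
    """Return the 4-byte XDR encoding of a signed 32-bit integer."""
    # ">i": byte 0 is the most significant byte, two's complement.
    # struct.pack raises struct.error if value is outside the 32-bit range.
    return struct.pack(">i", value)

def decode_int(data):
    """Decode the first four bytes of data as an XDR signed integer."""
    return struct.unpack(">i", data[:4])[0]
```

With this sketch, the value 1 is written as the bytes 00 00 00 01, and -1 as ff ff ff ff.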

Unsigned integer

An XDR unsigned integer is a 32-bit piece of data that encodes a nonnegative integer in the range [0,4294967295]. It is represented by an unsigned binary number whose most and least significant bytes are 0 and 3, respectively. The data description of unsigned integers is unsigned.
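The difference from the signed case can be seen in a short Python sketch. The helper names are hypothetical, and the sketch assumes the standard struct module. The same four bytes decode to different values depending on whether they are read as unsigned or as integer.

```python
import struct

def encode_unsigned(value):
    """Return the 4-byte XDR encoding of an unsigned 32-bit integer."""
    return struct.pack(">I", value)

def decode_unsigned(data):
    """Read four bytes as an XDR unsigned integer."""
    return struct.unpack(">I", data[:4])[0]

def decode_signed(data):
    """Read the same four bytes as an XDR signed integer, for comparison."""
    return struct.unpack(">i", data[:4])[0]
```

For example, the bytes ff ff ff ff are 4294967295 when read as unsigned but -1 when read as integer, so both sides of a protocol must agree on the declared type.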

Enumerations

Enumerations are useful for describing subsets of the integers. Enumerations have the same representation as integers. The data description of enumerated data is as follows:

   typedef enum { name = value, .... } type-name;

For example, the three colors red, yellow, and blue could be described by an enumerated type:

   typedef enum { RED = 2, YELLOW = 3, BLUE = 5 } colors;
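A possible Python rendering of the colors example is sketched below. The class and function names are hypothetical; the sketch assumes the standard enum and struct modules and relies on the statement above that enumerations share the integer representation.

```python
import enum
import struct

class Colors(enum.IntEnum):
    RED = 2
    YELLOW = 3
    BLUE = 5

def encode_colors(member):
    """Encode an enumeration member exactly as an XDR integer."""
    return struct.pack(">i", int(member))

def decode_colors(data):
    """Decode four bytes; values outside the enumeration raise ValueError."""
    value = struct.unpack(">i", data[:4])[0]
    return Colors(value)
```

Rejecting values such as 4 on decode reflects the idea that the enumeration describes a subset of the integers.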

Booleans

Booleans are important enough and occur frequently enough to warrant their own explicit type in the standard. Boolean is an enumeration with the following form:

   typedef enum { FALSE = 0, TRUE = 1 } boolean;

Hyper integer and hyper unsigned

The standard also defines 64-bit (8-byte) numbers called hyper integer and hyper unsigned. Their representations are the obvious extensions of the integer and unsigned, defined above. The most and least significant bytes are 0 and 7, respectively.
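The 8-byte layout can be sketched in Python in the same way as the 4-byte types. The helper names are hypothetical; the sketch assumes the standard struct module, whose ">q" and ">Q" formats are the 64-bit signed and unsigned big-endian formats.

```python
import struct

def encode_hyper(value):
    """Return the 8-byte XDR encoding of a signed 64-bit integer."""
    # Byte 0 is the most significant byte, byte 7 the least significant.
    return struct.pack(">q", value)

def encode_unsigned_hyper(value):
    """Return the 8-byte XDR encoding of an unsigned 64-bit integer."""
    return struct.pack(">Q", value)
```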

Floating point and double precision

The standard defines the encoding for the floating-point data types float (32 bits or 4 bytes) and double (64 bits or 8 bytes). The encoding used is the IEEE standard for normalized single- and double-precision floating point numbers. (See the IEEE floating-point standard for more information.) The standard encodes the following three fields, which describe the floating point number:

S: The sign of the number. Values 0 and 1 represent positive and negative, respectively.
E: The exponent of the number, base 2. Floats devote 8 bits to this field, while doubles devote 11 bits. The exponents for float and double are biased by 127 and 1023, respectively.
F: The fractional part of the number's mantissa, base 2. Floats devote 23 bits to this field, while doubles devote 52 bits.

Therefore, the floating-point number is described by:

(-1)^S x 2^(E-Bias) x 1.F

Just as the most and least significant bytes of a number are 0 and 3, the most and least significant bits of a single-precision floating point number are 0 and 31. The beginning and most significant bit offsets of S, E, and F are 0, 1, and 9, respectively.

Doubles have the analogous extensions. The beginning and most significant bit offsets of S, E, and F are 0, 1, and 12, respectively.

The IEEE specification should be consulted concerning the encoding for signed zero, signed infinity (overflow) and denormalized numbers (underflow). Under IEEE specifications, the ``NaN'' (not a number) is system-dependent and should not be used.
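The field layout for single precision can be checked with a short Python sketch. The function names are hypothetical; the sketch assumes the standard struct module and handles only normalized numbers, in line with the formula given earlier.

```python
import struct

def encode_float(value):
    """Return the 4-byte XDR (IEEE single-precision, big-endian) encoding."""
    return struct.pack(">f", value)

def encode_double(value):
    """Return the 8-byte XDR (IEEE double-precision, big-endian) encoding."""
    return struct.pack(">d", value)

def float_fields(value):
    """Split a single-precision encoding into its (S, E, F) fields."""
    (bits,) = struct.unpack(">I", encode_float(value))
    s = bits >> 31            # bit offset 0: sign
    e = (bits >> 23) & 0xFF   # bit offsets 1-8: biased exponent
    f = bits & 0x7FFFFF       # bit offsets 9-31: fraction
    return s, e, f

def from_fields(s, e, f):
    """Evaluate (-1)^S x 2^(E-127) x 1.F for a normalized single."""
    return (-1) ** s * 2.0 ** (e - 127) * (1 + f / 2 ** 23)
```

For instance, -2.5 is -1.25 x 2^1, so its fields are S = 1, E = 128 and F = 0x200000 (the fraction .25 scaled to 23 bits).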

Standard opaque data

At times, fixed-sized uninterpreted data needs to be passed among machines. This data is called opaque and is described as:

   typedef opaque type-name[n];
   opaque name[n];

where n is the (static) number of bytes necessary to contain the opaque data. If n is not a multiple of four, the n bytes are followed by enough (up to 3) zero-valued bytes to make the total byte count of the opaque object a multiple of four.
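The padding rule for opaque data might be sketched in Python as follows. The function name encode_opaque is hypothetical. Because the size n is static, no length is written to the stream.

```python
def encode_opaque(data, n):
    """Encode exactly n bytes of opaque data, zero-padded to a multiple of four."""
    if len(data) != n:
        raise ValueError("opaque data must be exactly n bytes long")
    pad = (4 - n % 4) % 4     # 0 to 3 zero-valued bytes
    return bytes(data) + b"\x00" * pad
```

A 5-byte opaque object therefore occupies 8 bytes on the wire, the last three being zero.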

Counted-Byte strings

The standard defines a string of n (numbered 0 through n-1) bytes to be the number n encoded as unsigned, and followed by the n bytes of the string. If n is not a multiple of four, the n bytes are followed by enough (up to 3) zero-valued bytes to make the total byte count a multiple of four. The data description of strings is as follows:

   typedef string type-name<N>;
   typedef string type-name<>;
   string name<N>;
   string name<>;

The data description language uses angle brackets (< and >) to denote anything that is of varying length, as opposed to square brackets to denote fixed-length sequences of data.

The constant N denotes an upper bound of the number of bytes that a string may contain. If N is not specified, it is assumed to be 2^32 - 1, the maximum length. The constant N would normally be found in a protocol specification. For example, a filing protocol may state that a file name can be no longer than 14 bytes, such as:

   string filename<14>;

The XDR specification does not say what the individual bytes of a string represent; this important information is left to higher-level specifications. A reasonable default is to assume that the bytes encode ASCII characters.
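A counted-byte string such as the filename<14> example might be encoded as in the following Python sketch. The function names and the max_len parameter are hypothetical; the sketch assumes the standard struct module and treats the string as raw bytes, leaving the character set to the caller.

```python
import struct

def encode_string(data, max_len=2**32 - 1):
    """Encode bytes as an XDR counted-byte string with upper bound max_len."""
    if len(data) > max_len:
        raise ValueError("string exceeds its declared upper bound")
    pad = (4 - len(data) % 4) % 4
    # Length as unsigned, then the bytes, then 0 to 3 zero-valued pad bytes.
    return struct.pack(">I", len(data)) + data + b"\x00" * pad

def decode_string(buf, max_len=2**32 - 1):
    """Decode a counted-byte string from the start of buf."""
    (n,) = struct.unpack(">I", buf[:4])
    if n > max_len:
        raise ValueError("encoded length exceeds the declared upper bound")
    return buf[4:4 + n]
```

Encoding the 6-byte name "passwd" with a bound of 14 gives the length 00 00 00 06, the six bytes of the name, and two zero-valued pad bytes, for 12 bytes in total.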

Fixed arrays

The data description for fixed-size arrays of homogeneous elements is as follows:

   typedef elementtype type-name[n];
   elementtype name[n];

Fixed-size arrays of elements numbered 0 through n-1 are encoded by individually encoding the elements of the array in their natural order, 0 through n-1.
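For a fixed-size array whose elements are integers, the encoding might be sketched in Python as follows. The function name is hypothetical, and the choice of integer elements is an assumption made for illustration; the sketch assumes the standard struct module.

```python
import struct

def encode_fixed_int_array(values, n):
    """Encode a fixed-size array of n XDR integers, element 0 first."""
    if len(values) != n:
        raise ValueError("a fixed array must contain exactly n elements")
    # No element count is written: n is known from the data description.
    return b"".join(struct.pack(">i", v) for v in values)
```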

Counted arrays

Counted arrays provide the ability to encode variable-length arrays of homogeneous elements. The array is encoded as the element count n (an unsigned integer), followed by the encoding of each of the array's elements, starting with element 0 and progressing through element n-1. The data description for counted arrays is similar to that of counted strings:

   typedef elementtype type-name<N>;
   typedef elementtype type-name<>;
   elementtype name<N>;
   elementtype name<>;

Again, the constant N specifies the maximum acceptable element count of an array; if N is not specified, it is assumed to be 2^32 - 1.
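A counted array of integers might be encoded as in this Python sketch. The function name and the max_count parameter are hypothetical, and integer elements are assumed for illustration; the sketch relies on the standard struct module.

```python
import struct

def encode_counted_int_array(values, max_count=2**32 - 1):
    """Encode a variable-length array of XDR integers with upper bound max_count."""
    if len(values) > max_count:
        raise ValueError("array exceeds its declared maximum element count")
    count = struct.pack(">I", len(values))   # element count n, as unsigned
    return count + b"".join(struct.pack(">i", v) for v in values)
```

Unlike a fixed array, the encoding is self-describing in length: an empty array is the four bytes 00 00 00 00.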

Structures

The data description for structures is very similar to that of standard C:

   typedef struct {
           component-type component-name;
           ...
   } type-name;

The components of the structure are encoded in the order of their declaration in the structure.
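As a sketch only, consider a hypothetical structure with an unsigned component size followed by a component string name<14>. Neither the structure nor the function names below come from any protocol specification. In Python, assuming the standard struct module, the components are simply concatenated in declaration order:

```python
import struct

def encode_string(data, max_len):
    """Counted-byte string: length, bytes, zero padding to a multiple of four."""
    if len(data) > max_len:
        raise ValueError("string exceeds its declared upper bound")
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad

def encode_file_entry(size, name):
    """Encode the hypothetical structure { unsigned size; string name<14>; }."""
    # Components are written in the order in which they are declared.
    return struct.pack(">I", size) + encode_string(name, 14)
```

No tag or separator is written between components; the reader must know the structure's description in order to decode it.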

Standard discriminated unions

A discriminated union is a type composed of a discriminant followed by a type selected from a set of prearranged types according to the value of the discriminant. The type of the discriminant is always an enumeration. The component types are called ``arms'' of the union. The discriminated union is encoded as its discriminant followed by the encoding of the implied arm. The data description for discriminated unions is as follows:

   typedef union switch (discriminant-type) {
           discriminant-value: arm-type;
           ...
           default: default-arm-type;
   } type-name;

The default arm is optional. If it is not specified, then a valid encoding of the union cannot take on unspecified discriminant values. Most specifications neither need nor use default arms.
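The following Python sketch illustrates a hypothetical union with no default arm. The discriminant values STATUS_OK and STATUS_FAIL and the arm types (an integer result and an unsigned error code) are assumptions made for this example, and the sketch uses the standard struct module.

```python
import struct

STATUS_OK = 0      # hypothetical discriminant value; arm type: integer
STATUS_FAIL = 1    # hypothetical discriminant value; arm type: unsigned

def encode_result(status, payload):
    """Encode the discriminant followed by the arm it selects."""
    if status == STATUS_OK:
        arm = struct.pack(">i", payload)
    elif status == STATUS_FAIL:
        arm = struct.pack(">I", payload)
    else:
        # Without a default arm, other discriminant values have no valid encoding.
        raise ValueError("unspecified discriminant value")
    return struct.pack(">i", status) + arm
```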

Missing specifications

The standard lacks representations for bit fields and bitmaps, since the standard is based on bytes. This does not imply that no specification should be attempted.

Library primitive / XDR standard cross-reference

The following table shows the association between the C library primitives discussed in the section ``XDR library primitives'' and the standard data types defined in this section. It also shows the subsections within these two document sections where each primitive and data type is discussed.

C Primitive      XDR Type            Sections

xdr_int          integer             Number filters
xdr_long                             Integer
xdr_short

xdr_u_int        unsigned            Number filters
xdr_u_long                           Unsigned integer
xdr_u_short

-                hyper integer       Hyper integer and hyper unsigned
                 hyper unsigned

xdr_float        float               Floating-point filters
                                     Floating point and double precision

xdr_double       double              Floating-point filters
                                     Floating point and double precision

xdr_enum         enum_t              Enumeration filters
                                     Enumerations

xdr_bool         bool_t              Enumeration filters
                                     Booleans

xdr_string       string              Strings
                                     Counted-byte strings

xdr_bytes                            Byte arrays

xdr_array        (varying arrays)    Arrays
                                     Counted arrays

-                (fixed arrays)      Fixed-size arrays
                                     Fixed arrays

xdr_opaque       opaque              Constructed opaque data
                                     Standard opaque data

xdr_union        union               Constructed discriminated unions
                                     Standard discriminated unions

xdr_reference    -                   Pointers

-                struct              Structures

The record marking standard

A record is composed of one or more record fragments. A record fragment is a four-byte header followed by 0 to 2^31 - 1 bytes of fragment data. The four header bytes encode an unsigned binary number; as with XDR integers, the byte order is from highest to lowest. The number encodes two values: a boolean that indicates whether the fragment is the last fragment of the record (bit value 1 implying the fragment is the last fragment), and a 31-bit unsigned binary value that is the length in bytes of the fragment's data. The boolean value is the high-order bit of the header; the length is the 31 low-order bits.

NOTE: This record specification is not in XDR standard form and cannot be implemented using XDR primitives.
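The fragment header layout can be illustrated with a Python sketch that builds and reassembles fragments by hand. The function names are hypothetical; the sketch assumes the standard struct module and operates on an in-memory byte string rather than a live connection.

```python
import struct

LAST_FRAGMENT = 0x80000000   # high-order bit of the four-byte header
LENGTH_MASK = 0x7FFFFFFF     # 31 low-order bits: fragment data length

def encode_fragment(data, last):
    """Prefix data with a record-marking header."""
    if len(data) > LENGTH_MASK:
        raise ValueError("fragment data too long")
    header = len(data) | (LAST_FRAGMENT if last else 0)
    return struct.pack(">I", header) + data

def decode_record(stream):
    """Reassemble one record; return (record, remaining bytes)."""
    record = b""
    offset = 0
    while True:
        if len(stream) - offset < 4:
            raise ValueError("truncated fragment header")
        (header,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        length = header & LENGTH_MASK
        if len(stream) - offset < length:
            raise ValueError("truncated fragment data")
        record += stream[offset:offset + length]
        offset += length
        if header & LAST_FRAGMENT:
            return record, stream[offset:]
```

For example, a final fragment carrying the three bytes "abc" begins with the header 80 00 00 03. Fragment data is not padded, which is one way in which this format differs from XDR items.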