(gmp.info.gz) Binary to Radix
Info Catalog
(gmp.info.gz) Radix Conversion Algorithms
(gmp.info.gz) Radix Conversion Algorithms
(gmp.info.gz) Radix to Binary
Binary to Radix
---------------
Conversions from binary to a power-of-2 radix use a simple and fast
O(N) bit extraction algorithm.
Conversions from binary to other radices use one of two algorithms.
Sizes below `GET_STR_PRECOMPUTE_THRESHOLD' use a basic O(N^2) method.
Repeated divisions by b^n are made, where b is the radix and n is the
biggest power that fits in a limb. But instead of simply using the
remainder r from such divisions, an extra divide step is done to give a
fractional limb representing r/b^n. The digits of r can then be
extracted using multiplications by b rather than divisions. Special
case code is provided for decimal, allowing multiplications by 10 to
optimize to shifts and adds.
Above `GET_STR_PRECOMPUTE_THRESHOLD' a sub-quadratic algorithm is
used. For an input t, powers b^(n*2^i) of the radix are calculated,
until a power between t and sqrt(t) is reached. t is then divided by
that largest power, giving a quotient which is the digits above that
power, and a remainder which is those below. These two parts are in
turn divided by the second highest power, and so on recursively. When
a piece has been divided down to less than `GET_STR_DC_THRESHOLD'
limbs, the basecase algorithm described above is used.
The advantage of this algorithm is that big divisions can make use
of the sub-quadratic divide and conquer division ( Divide and
Conquer Division), and big divisions tend to have less overheads than
lots of separate single limb divisions anyway. But in any case the
cost of calculating the powers b^(n*2^i) must first be overcome.
`GET_STR_PRECOMPUTE_THRESHOLD' and `GET_STR_DC_THRESHOLD' represent
the same basic thing, the point where it becomes worth doing a big
division to cut the input in half. `GET_STR_PRECOMPUTE_THRESHOLD'
includes the cost of calculating the radix power required, whereas
`GET_STR_DC_THRESHOLD' assumes that's already available, which is the
case when recursing.
Since the base case produces digits from least to most significant
but they want to be stored from most to least, it's necessary to
calculate in advance how many digits there will be, or at least be sure
not to underestimate that. For GMP the number of input bits is
multiplied by `chars_per_bit_exactly' from `mp_bases', rounding up.
The result is either correct or one too big.
Examining some of the high bits of the input could increase the
chance of getting the exact number of digits, but an exact result every
time would not be practical, since in general the difference between
numbers 100... and 99... is only in the last few bits and the work to
identify 99... might well be almost as much as a full conversion.
`mpf_get_str' doesn't currently use the algorithm described here, it
multiplies or divides by a power of b to move the radix point to the
just above the highest non-zero digit (or at worst one above that
location), then multiplies by b^n to bring out digits. This is O(N^2)
and is certainly not optimal.
The r/b^n scheme described above for using multiplications to bring
out digits might be useful for more than a single limb. Some brief
experiments with it on the base case when recursing didn't give a
noticeable improvement, but perhaps that was only due to the
implementation. Something similar would work for the sub-quadratic
divisions too, though there would be the cost of calculating a bigger
radix power.
Another possible improvement for the sub-quadratic part would be to
arrange for radix powers that balanced the sizes of quotient and
remainder produced, ie. the highest power would be an b^(n*k)
approximately equal to sqrt(t), not restricted to a 2^i factor. That
ought to smooth out a graph of times against sizes, but may or may not
be a net speedup.
Info Catalog
(gmp.info.gz) Radix Conversion Algorithms
(gmp.info.gz) Radix Conversion Algorithms
(gmp.info.gz) Radix to Binary
automatically generated byinfo2html