(gmp.info.gz) Binary to Radix

(gmp.info.gz) Radix Conversion Algorithms
 
 Binary to Radix
 ---------------
 
 Conversions from binary to a power-of-2 radix use a simple and fast
 O(N) bit extraction algorithm.
 
    Conversions from binary to other radices use one of two algorithms.
 Sizes below `GET_STR_PRECOMPUTE_THRESHOLD' use a basic O(N^2) method.
 Repeated divisions by b^n are made, where b is the radix and n is the
 biggest power that fits in a limb.  But instead of simply using the
 remainder r from such divisions, an extra divide step is done to give a
 fractional limb representing r/b^n.  The digits of r can then be
 extracted using multiplications by b rather than divisions.  Special
 case code is provided for decimal, allowing multiplications by 10 to
 optimize to shifts and adds.
 
    Above `GET_STR_PRECOMPUTE_THRESHOLD' a sub-quadratic algorithm is
 used.  For an input t, powers b^(n*2^i) of the radix are calculated,
 until a power between t and sqrt(t) is reached.  t is then divided by
 that largest power, giving a quotient which is the digits above that
 power, and a remainder which is those below.  These two parts are in
 turn divided by the second highest power, and so on recursively.  When
 a piece has been divided down to less than `GET_STR_DC_THRESHOLD'
 limbs, the basecase algorithm described above is used.
 
    The advantage of this algorithm is that big divisions can make use
 of the sub-quadratic divide and conquer division ( Divide and
 Conquer Division), and big divisions tend to have less overheads than
 lots of separate single limb divisions anyway.  But in any case the
 cost of calculating the powers b^(n*2^i) must first be overcome.
 
    `GET_STR_PRECOMPUTE_THRESHOLD' and `GET_STR_DC_THRESHOLD' represent
 the same basic thing, the point where it becomes worth doing a big
 division to cut the input in half.  `GET_STR_PRECOMPUTE_THRESHOLD'
 includes the cost of calculating the radix power required, whereas
 `GET_STR_DC_THRESHOLD' assumes that's already available, which is the
 case when recursing.
 
    Since the base case produces digits from least to most significant
 but they want to be stored from most to least, it's necessary to
 calculate in advance how many digits there will be, or at least be sure
 not to underestimate that.  For GMP the number of input bits is
 multiplied by `chars_per_bit_exactly' from `mp_bases', rounding up.
 The result is either correct or one too big.
 
    Examining some of the high bits of the input could increase the
 chance of getting the exact number of digits, but an exact result every
 time would not be practical, since in general the difference between
 numbers 100... and 99... is only in the last few bits and the work to
 identify 99...  might well be almost as much as a full conversion.
 
    `mpf_get_str' doesn't currently use the algorithm described here, it
 multiplies or divides by a power of b to move the radix point to the
 just above the highest non-zero digit (or at worst one above that
 location), then multiplies by b^n to bring out digits.  This is O(N^2)
 and is certainly not optimal.
 
    The r/b^n scheme described above for using multiplications to bring
 out digits might be useful for more than a single limb.  Some brief
 experiments with it on the base case when recursing didn't give a
 noticeable improvement, but perhaps that was only due to the
 implementation.  Something similar would work for the sub-quadratic
 divisions too, though there would be the cost of calculating a bigger
 radix power.
 
    Another possible improvement for the sub-quadratic part would be to
 arrange for radix powers that balanced the sizes of quotient and
 remainder produced, ie. the highest power would be an b^(n*k)
 approximately equal to sqrt(t), not restricted to a 2^i factor.  That
 ought to smooth out a graph of times against sizes, but may or may not
 be a net speedup.
Info Catalog
(gmp.info.gz) Radix Conversion Algorithms
(gmp.info.gz) Radix to Binary
automatically generated byinfo2html