(gmp.info.gz) Assembler Floating Point
Info Catalog
(gmp.info.gz) Assembler Functional Units
(gmp.info.gz) Assembler Coding
(gmp.info.gz) Assembler SIMD Instructions
Floating Point
--------------
Floating point arithmetic is used in GMP for multiplications on CPUs
with poor integer multipliers. It's mostly useful for `mpn_mul_1',
`mpn_addmul_1' and `mpn_submul_1' on 64-bit machines, and
`mpn_mul_basecase' on both 32-bit and 64-bit machines.
With IEEE 53-bit double precision floats, integer multiplications
producing up to 53 bits will give exact results. Breaking a 64x64
multiplication into eight 16x32->48 bit pieces is convenient. With
some care, though, six 21x32->53 bit products can be used, if one of the
lower two 21-bit pieces also uses the sign bit.
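The eight-piece split can be modelled in portable C. This is only a sketch of the arithmetic, not GMP code: the names `mul_16x32_fp' and `mul_64x64_fp' are hypothetical, and a real implementation would keep the pieces in floating point registers rather than convert each product separately.

```c
#include <stdint.h>

/* Hypothetical helper: multiply a 16-bit piece by a 32-bit piece in
   double precision.  The product is below 2^48, so it fits the 53-bit
   significand and no rounding occurs. */
static uint64_t mul_16x32_fp(uint16_t v, uint32_t u)
{
    double p = (double)v * (double)u;
    return (uint64_t)(int64_t)p;
}

/* Hypothetical helper: full 64x64->128 bit product assembled from
   eight exact 16x32->48 bit floating point products. */
static void mul_64x64_fp(uint64_t u, uint64_t v, uint64_t *hi, uint64_t *lo)
{
    uint64_t h = 0, l = 0;

    for (int i = 0; i < 4; i++) {            /* 16-bit pieces of v */
        uint16_t vi = (uint16_t)(v >> (16 * i));
        for (int j = 0; j < 2; j++) {        /* 32-bit pieces of u */
            uint32_t uj = (uint32_t)(u >> (32 * j));
            uint64_t p = mul_16x32_fp(vi, uj);
            int s = 16 * i + 32 * j;         /* weight of this product */

            if (s >= 64) {
                h += p << (s - 64);          /* entirely in the high limb */
            } else {
                uint64_t pl = p << s;                   /* low 64 bits */
                uint64_t ph = s ? p >> (64 - s) : 0;    /* spill-over  */
                l += pl;
                h += ph + (l < pl);          /* propagate carry out of l */
            }
        }
    }
    *hi = h;
    *lo = l;
}
```

For example, calling `mul_64x64_fp' with both operands equal to 2^64-1 yields a high limb of 2^64-2 and a low limb of 1.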
For the `mpn_mul_1' family of functions on a 64-bit machine, the
invariant single limb is split at the start, into 3 or 4 pieces.
Inside the loop, the bignum operand is split into 32-bit pieces. Fast
conversion of these unsigned 32-bit pieces to floating point is highly
machine-dependent. In some cases, the only option is to read the data
into the integer unit, zero-extend it to 64 bits, then transfer it to
the floating point unit via memory.
Converting partial products back to 64-bit limbs is usually best
done as a signed conversion. Since all values are smaller than 2^53,
signed and unsigned are the same, but most processors lack unsigned
conversions.
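In C terms, the two conversions might look as follows. This is a sketch under the assumption that the machine only offers signed 64-bit conversions; the function names are hypothetical, and the compiler decides whether the transfer between register files actually goes through memory.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned 32-bit piece to double.  The value is
   zero-extended to a signed 64-bit integer first, so a signed
   conversion gives the right answer. */
static double u32_to_double(uint32_t x)
{
    int64_t wide = (int64_t)x;    /* zero-extend; always non-negative */
    return (double)wide;          /* signed integer to double */
}

/* Hypothetical helper: exact double below 2^53 back to a 64-bit limb.
   Because the value is non-negative and below 2^53, the signed and
   unsigned conversions agree. */
static uint64_t double_to_limb(double p)
{
    return (uint64_t)(int64_t)p;
}
```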
Here is a diagram showing 16x32 bit products for an `mpn_mul_1' or
`mpn_addmul_1' with a 64-bit limb. The single limb operand V is split
into four 16-bit parts. The multi-limb operand U is split in the loop
into two 32-bit parts.
                    +---+---+---+---+
                    |v48|v32|v16|v00|    V operand
                    +---+---+---+---+

                    +-------+-------+
                x   |  u32  |  u00  |    U operand (one limb)
                    +-------+-------+

    ---------------------------------

                        +-----------+
                        | u00 x v00 |    p00    48-bit products
                        +-----------+
                    +-----------+
                    | u00 x v16 |        p16
                    +-----------+
                +-----------+
                | u00 x v32 |            p32
                +-----------+
            +-----------+
            | u00 x v48 |                p48
            +-----------+
                +-----------+
                | u32 x v00 |            r32
                +-----------+
            +-----------+
            | u32 x v16 |                r48
            +-----------+
        +-----------+
        | u32 x v32 |                    r64
        +-----------+
    +-----------+
    | u32 x v48 |                        r80
    +-----------+

p32 and r32 can be summed using floating-point addition, and
likewise p48 and r48. p00 and p16 can be summed with r64 and r80 from
the previous iteration.
In each loop iteration, then, four 49-bit quantities are transferred to
the integer unit, aligned as follows,

    |-----64bits----|-----64bits----|
                       +------------+
                       | p00 + r64' |    i00
                       +------------+
                   +------------+
                   | p16 + r80' |        i16
                   +------------+
               +------------+
               | p32 + r32  |            i32
               +------------+
           +------------+
           | p48 + r48  |                i48
           +------------+

The challenge then is to sum these efficiently and add in a carry
limb, generating a low 64-bit result limb and a high 33-bit carry limb
(i48 extends 33 bits into the high half).
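The whole scheme can be written out as a C model. This is a sketch only: `mul_1_fp' and `add_shifted' are hypothetical names, the integer summation is done in the most straightforward way rather than an efficient one, and real code would be written in assembler for the particular CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add (x << shift) into the 128-bit pair (*hi, *lo).
   shift must be 0, 16, 32 or 48, and x must be below 2^53. */
static void add_shifted(uint64_t *hi, uint64_t *lo, uint64_t x, int shift)
{
    uint64_t low = x << shift;
    uint64_t high = shift ? x >> (64 - shift) : 0;
    *lo += low;
    *hi += high + (*lo < low);    /* carry out of the low limb */
}

/* Hypothetical model: rp[0..n-1] = up[0..n-1] * v, returning the high
   (carry) limb, using only 16x32 bit floating point products. */
static uint64_t mul_1_fp(uint64_t *rp, const uint64_t *up, size_t n,
                         uint64_t v)
{
    /* Split the invariant limb once, before the loop. */
    double v00 = (double)(int64_t)(v & 0xFFFF);
    double v16 = (double)(int64_t)((v >> 16) & 0xFFFF);
    double v32 = (double)(int64_t)((v >> 32) & 0xFFFF);
    double v48 = (double)(int64_t)(v >> 48);
    double r64 = 0.0, r80 = 0.0;  /* r64' and r80' from the previous limb */
    uint64_t cy = 0;              /* integer carry limb */

    for (size_t k = 0; k < n; k++) {
        double u00 = (double)(int64_t)(up[k] & 0xFFFFFFFF);
        double u32 = (double)(int64_t)(up[k] >> 32);

        /* Four sums of at most 49 bits, all exact in double precision. */
        double i00 = u00 * v00 + r64;          /* p00 + r64' */
        double i16 = u00 * v16 + r80;          /* p16 + r80' */
        double i32 = u00 * v32 + u32 * v00;    /* p32 + r32  */
        double i48 = u00 * v48 + u32 * v16;    /* p48 + r48  */
        r64 = u32 * v32;                       /* held for the next limb */
        r80 = u32 * v48;

        /* Sum the aligned pieces and the incoming carry in the integer unit. */
        uint64_t lo = cy, hi = 0;
        add_shifted(&hi, &lo, (uint64_t)(int64_t)i00, 0);
        add_shifted(&hi, &lo, (uint64_t)(int64_t)i16, 16);
        add_shifted(&hi, &lo, (uint64_t)(int64_t)i32, 32);
        add_shifted(&hi, &lo, (uint64_t)(int64_t)i48, 48);
        rp[k] = lo;
        cy = hi;
    }

    /* The products held over from the last limb belong to the high limb. */
    return cy + (uint64_t)(int64_t)r64 + ((uint64_t)(int64_t)r80 << 16);
}
```

For example, multiplying the two-limb operand {2^64-1, 2^64-1} by 2^64-1 gives result limbs {1, 2^64-1} and a returned carry limb of 2^64-2.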