| This is ../../gmp/doc/gmp.info, produced by makeinfo version 4.8 from |
| ../../gmp/doc/gmp.texi. |
| |
| This manual describes how to install and use the GNU multiple |
| precision arithmetic library, version 6.1.0. |
| |
| Copyright 1991, 1993-2015 Free Software Foundation, Inc. |
| |
| Permission is granted to copy, distribute and/or modify this |
| document under the terms of the GNU Free Documentation License, Version |
| 1.3 or any later version published by the Free Software Foundation; |
| with no Invariant Sections, with the Front-Cover Texts being "A GNU |
| Manual", and with the Back-Cover Texts being "You have freedom to copy |
| and modify this GNU Manual, like GNU software". A copy of the license |
| is included in *Note GNU Free Documentation License::. |
| |
| INFO-DIR-SECTION GNU libraries |
| START-INFO-DIR-ENTRY |
| * gmp: (gmp). GNU Multiple Precision Arithmetic Library. |
| END-INFO-DIR-ENTRY |
| |
| |
| File: gmp.info, Node: Exact Division, Next: Exact Remainder, Prev: Block-Wise Barrett Division, Up: Division Algorithms |
| |
| 15.2.5 Exact Division |
| --------------------- |
| |
| A so-called exact division is when the dividend is known to be an exact |
| multiple of the divisor. Jebelean's exact division algorithm uses this |
| knowledge to make some significant optimizations (*note References::). |
| |
| The idea can be illustrated in decimal for example with 368154 |
| divided by 543. Because the low digit of the dividend is 4, the low |
| digit of the quotient must be 8. This is arrived at from 4*7 mod 10, |
| using the fact 7 is the modular inverse of 3 (the low digit of the |
| divisor), since 3*7 == 1 mod 10. So 8*543=4344 can be subtracted from |
| the dividend leaving 363810. Notice the low digit has become zero. |
| |
| The procedure is repeated at the second digit, with the next |
| quotient digit 7 (7 == 1*7 mod 10), subtracting 7*543=3801, leaving |
| 325800. And finally at the third digit with quotient digit 6 (8*7 mod |
| 10), subtracting 6*543=3258 leaving 0. So the quotient is 678. |
| |
| Notice however that the multiplies and subtractions don't need to |
| extend past the low three digits of the dividend, since that's enough |
| to determine the three quotient digits. For the last quotient digit no |
| subtraction is needed at all. On a 2NxN division like this one, only |
| about half the work of a normal basecase division is necessary. |
| |
| For an NxM exact division producing Q=N-M quotient limbs, the saving |
| over a normal basecase division is in two parts. Firstly, each of the |
| Q quotient limbs needs only one multiply, not a 2x1 divide and |
| multiply. Secondly, the crossproducts are reduced when Q>M to |
| Q*M-M*(M+1)/2, or when Q<=M to Q*(Q-1)/2. Notice the savings are |
| complementary. If Q is big then many divisions are saved, or if Q is |
| small then the crossproducts reduce to a small number. |
| |
| The modular inverse used is calculated efficiently by `binvert_limb' |
| in `gmp-impl.h'. This does four multiplies for a 32-bit limb, or six |
| for a 64-bit limb. `tune/modlinv.c' has some alternate implementations |
| that might suit processors better at bit twiddling than multiplying. |
| |
| The sub-quadratic exact division described by Jebelean in "Exact |
| Division with Karatsuba Complexity" is not currently implemented. It |
| uses a rearrangement similar to the divide and conquer for normal |
| division (*note Divide and Conquer Division::), but operating from low |
| to high. A further possibility not currently implemented is |
| "Bidirectional Exact Integer Division" by Krandick and Jebelean which |
| forms quotient limbs from both the high and low ends of the dividend, |
| and can halve once more the number of crossproducts needed in a 2NxN |
| division. |
| |
| A special case exact division by 3 exists in `mpn_divexact_by3', |
| supporting Toom-3 multiplication and `mpq' canonicalizations. It forms |
| quotient digits with a multiply by the modular inverse of 3 (which is |
| `0xAA..AAB') and uses two comparisons to determine a borrow for the next |
| limb. The multiplications don't need to be on the dependent chain, as |
| long as the effect of the borrows is applied, which can help chips with |
| pipelined multipliers. |
| |
| |
| File: gmp.info, Node: Exact Remainder, Next: Small Quotient Division, Prev: Exact Division, Up: Division Algorithms |
| |
| 15.2.6 Exact Remainder |
| ---------------------- |
| |
| If the exact division algorithm is done with a full subtraction at each |
| stage and the dividend isn't a multiple of the divisor, then low zero |
| limbs are produced but with a remainder in the high limbs. For |
| dividend a, divisor d, quotient q, and b = 2^mp_bits_per_limb, this |
| remainder r is of the form |
| |
| a = q*d + r*b^n |
| |
| n represents the number of zero limbs produced by the subtractions, |
| that being the number of limbs produced for q. r will be in the range |
| 0<=r<d and can be viewed as a remainder, but one shifted up by a factor |
| of b^n. |
| |
| Carrying out full subtractions at each stage means the same number |
| of cross products must be done as a normal division, but there's still |
| some single limb divisions saved. When d is a single limb some |
| simplifications arise, providing good speedups on a number of |
| processors. |
| |
| The functions `mpn_divexact_by3', `mpn_modexact_1_odd' and the |
| internal `mpn_redc_X' functions differ subtly in how they return r, |
| leading to some negations in the above formula, but all are essentially |
| the same. |
| |
| Clearly r is zero when a is a multiple of d, and this leads to |
| divisibility or congruence tests which are potentially more efficient |
| than a normal division. |
| |
| The factor of b^n on r can be ignored in a GCD when d is odd, hence |
| the use of `mpn_modexact_1_odd' by `mpn_gcd_1' and `mpz_kronecker_ui' |
| etc (*note Greatest Common Divisor Algorithms::). |
| |
| Montgomery's REDC method for modular multiplications uses operands |
| of the form of x*b^-n and y*b^-n and on calculating (x*b^-n)*(y*b^-n) |
| uses the factor of b^n in the exact remainder to reach a product in the |
| same form (x*y)*b^-n (*note Modular Powering Algorithm::). |
| |
| Notice that r generally gives no useful information about the |
| ordinary remainder a mod d since b^n mod d could be anything. If |
| however b^n == 1 mod d, then r is the negative of the ordinary |
| remainder. This occurs whenever d is a factor of b^n-1, as for example |
| with 3 in `mpn_divexact_by3'. For a 32 or 64 bit limb other such |
| factors include 5, 17 and 257, but no particular use has been found for |
| this. |
| |
| |
| File: gmp.info, Node: Small Quotient Division, Prev: Exact Remainder, Up: Division Algorithms |
| |
| 15.2.7 Small Quotient Division |
| ------------------------------ |
| |
| An NxM division where the number of quotient limbs Q=N-M is small can |
| be optimized somewhat. |
| |
| An ordinary basecase division normalizes the divisor by shifting it |
| to make the high bit set, shifting the dividend accordingly, and |
| shifting the remainder back down at the end of the calculation. This |
| is wasteful if only a few quotient limbs are to be formed. Instead a |
| division of just the top 2*Q limbs of the dividend by the top Q limbs |
| of the divisor can be used to form a trial quotient. This requires |
| only those limbs normalized, not the whole of the divisor and dividend. |
| |
| A multiply and subtract then applies the trial quotient to the M-Q |
| unused limbs of the divisor and N-Q dividend limbs (which includes Q |
| limbs remaining from the trial quotient division). The starting trial |
| quotient can be 1 or 2 too big, but all cases of 2 too big and most |
| cases of 1 too big are detected by first comparing the most significant |
| limbs that will arise from the subtraction. An addback is done if the |
| quotient still turns out to be 1 too big. |
| |
| This whole procedure is essentially the same as one step of the |
| basecase algorithm done in a Q limb base, though with the trial |
| quotient test done only with the high limbs, not an entire Q limb |
| "digit" product. The correctness of this weaker test can be |
| established by following the argument of Knuth section 4.3.1 exercise |
| 20 but with the v2*q>b*r+u2 condition appropriately relaxed. |
| |
| |
| File: gmp.info, Node: Greatest Common Divisor Algorithms, Next: Powering Algorithms, Prev: Division Algorithms, Up: Algorithms |
| |
| 15.3 Greatest Common Divisor |
| ============================ |
| |
| * Menu: |
| |
| * Binary GCD:: |
| * Lehmer's Algorithm:: |
| * Subquadratic GCD:: |
| * Extended GCD:: |
| * Jacobi Symbol:: |
| |
| |
| File: gmp.info, Node: Binary GCD, Next: Lehmer's Algorithm, Prev: Greatest Common Divisor Algorithms, Up: Greatest Common Divisor Algorithms |
| |
| 15.3.1 Binary GCD |
| ----------------- |
| |
| At small sizes GMP uses an O(N^2) binary style GCD. This is described |
| in many textbooks, for example Knuth section 4.5.2 algorithm B. It |
| simply consists of successively reducing odd operands a and b using |
| |
| a,b = abs(a-b),min(a,b) |
| strip factors of 2 from a |
| |
| The Euclidean GCD algorithm, as per Knuth algorithms E and A, |
| repeatedly computes the quotient q = floor(a/b) and replaces a,b by v, |
| u - q v. The binary algorithm has so far been found to be faster than |
| the Euclidean algorithm everywhere. One reason the binary method does |
| well is that the implied quotient at each step is usually small, so |
| often only one or two subtractions are needed to get the same effect as |
| a division. Quotients 1, 2 and 3 for example occur 67.7% of the time, |
| see Knuth section 4.5.3 Theorem E. |
| |
| When the implied quotient is large, meaning b is much smaller than |
| a, then a division is worthwhile. This is the basis for the initial a |
| mod b reductions in `mpn_gcd' and `mpn_gcd_1' (the latter for both Nx1 |
| and 1x1 cases). But after that initial reduction, big quotients occur |
| too rarely to make it worth checking for them. |
| |
| |
| The final 1x1 GCD in `mpn_gcd_1' is done in the generic C code as |
| described above. For two N-bit operands, the algorithm takes about |
| 0.68 iterations per bit. For optimum performance some attention needs |
| to be paid to the way the factors of 2 are stripped from a. |
| |
| Firstly it may be noted that in twos complement the number of low |
| zero bits on a-b is the same as b-a, so counting or testing can begin on |
| a-b without waiting for abs(a-b) to be determined. |
| |
| A loop stripping low zero bits tends not to branch predict well, |
| since the condition is data dependent. But on average there's only a |
| few low zeros, so an option is to strip one or two bits arithmetically |
| then loop for more (as done for AMD K6). Or use a lookup table to get |
| a count for several bits then loop for more (as done for AMD K7). An |
| alternative approach is to keep just one of a or b odd and iterate |
| |
| a,b = abs(a-b), min(a,b) |
| a = a/2 if even |
| b = b/2 if even |
| |
| This requires about 1.25 iterations per bit, but stripping of a |
| single bit at each step avoids any branching. Repeating the bit strip |
| reduces to about 0.9 iterations per bit, which may be a worthwhile |
| tradeoff. |
| |
| Generally with the above approaches a speed of perhaps 6 cycles per |
| bit can be achieved, which is still not terribly fast with for instance |
| a 64-bit GCD taking nearly 400 cycles. It's this sort of time which |
| means it's not usually advantageous to combine a set of divisibility |
| tests into a GCD. |
| |
| Currently, the binary algorithm is used for GCD only when N < 3. |
| |
| |
| File: gmp.info, Node: Lehmer's Algorithm, Next: Subquadratic GCD, Prev: Binary GCD, Up: Greatest Common Divisor Algorithms |
| |
| 15.3.2 Lehmer's algorithm |
| ------------------------- |
| |
| Lehmer's improvement of the Euclidean algorithms is based on the |
| observation that the initial part of the quotient sequence depends only |
| on the most significant parts of the inputs. The variant of Lehmer's |
| algorithm used in GMP splits off the most significant two limbs, as |
| suggested, e.g., in "A Double-Digit Lehmer-Euclid Algorithm" by |
| Jebelean (*note References::). The quotients of two double-limb inputs |
| are collected as a 2 by 2 matrix with single-limb elements. This is |
| done by the function `mpn_hgcd2'. The resulting matrix is applied to |
| the inputs using `mpn_mul_1' and `mpn_submul_1'. Each iteration usually |
| reduces the inputs by almost one limb. In the rare case of a large |
| quotient, no progress can be made by examining just the most |
| significant two limbs, and the quotient is computed using plain |
| division. |
| |
| The resulting algorithm is asymptotically O(N^2), just as the |
| Euclidean algorithm and the binary algorithm. The quadratic part of the |
| work are the calls to `mpn_mul_1' and `mpn_submul_1'. For small sizes, |
| the linear work is also significant. There are roughly N calls to the |
| `mpn_hgcd2' function. This function uses a couple of important |
| optimizations: |
| |
| * It uses the same relaxed notion of correctness as `mpn_hgcd' (see |
| next section). This means that when called with the most |
| significant two limbs of two large numbers, the returned matrix |
| does not always correspond exactly to the initial quotient |
| sequence for the two large numbers; the final quotient may |
| sometimes be one off. |
| |
| * It takes advantage of the fact the quotients are usually small. |
| The division operator is not used, since the corresponding |
| assembler instruction is very slow on most architectures. (This |
| code could probably be improved further, it uses many branches |
| that are unfriendly to prediction). |
| |
| * It switches from double-limb calculations to single-limb |
| calculations half-way through, when the input numbers have been |
| reduced in size from two limbs to one and a half. |
| |
| |
| |
| File: gmp.info, Node: Subquadratic GCD, Next: Extended GCD, Prev: Lehmer's Algorithm, Up: Greatest Common Divisor Algorithms |
| |
| 15.3.3 Subquadratic GCD |
| ----------------------- |
| |
| For inputs larger than `GCD_DC_THRESHOLD', GCD is computed via the HGCD |
| (Half GCD) function, as a generalization to Lehmer's algorithm. |
| |
| Let the inputs a,b be of size N limbs each. Put S = floor(N/2) + 1. |
| Then HGCD(a,b) returns a transformation matrix T with non-negative |
| elements, and reduced numbers (c;d) = T^-1 (a;b). The reduced numbers |
| c,d must be larger than S limbs, while their difference abs(c-d) must |
| fit in S limbs. The matrix elements will also be of size roughly N/2. |
| |
| The HGCD base case uses Lehmer's algorithm, but with the above stop |
| condition that returns reduced numbers and the corresponding |
| transformation matrix half-way through. For inputs larger than |
| `HGCD_THRESHOLD', HGCD is computed recursively, using the divide and |
| conquer algorithm in "On Scho"nhage's algorithm and subquadratic |
| integer GCD computation" by Mo"ller (*note References::). The recursive |
| algorithm consists of these main steps. |
| |
| * Call HGCD recursively, on the most significant N/2 limbs. Apply the |
| resulting matrix T_1 to the full numbers, reducing them to a size |
| just above 3N/2. |
| |
| * Perform a small number of division or subtraction steps to reduce |
| the numbers to size below 3N/2. This is essential mainly for the |
| unlikely case of large quotients. |
| |
| * Call HGCD recursively, on the most significant N/2 limbs of the |
| reduced numbers. Apply the resulting matrix T_2 to the full |
| numbers, reducing them to a size just above N/2. |
| |
| * Compute T = T_1 T_2. |
| |
| * Perform a small number of division and subtraction steps to |
| satisfy the requirements, and return. |
| |
| GCD is then implemented as a loop around HGCD, similarly to Lehmer's |
| algorithm. Where Lehmer repeatedly chops off the top two limbs, calls |
| `mpn_hgcd2', and applies the resulting matrix to the full numbers, the |
| sub-quadratic GCD chops off the most significant third of the limbs (the |
| proportion is a tuning parameter, and 1/3 seems to be more efficient |
| than, e.g, 1/2), calls `mpn_hgcd', and applies the resulting matrix. |
| Once the input numbers are reduced to size below `GCD_DC_THRESHOLD', |
| Lehmer's algorithm is used for the rest of the work. |
| |
| The asymptotic running time of both HGCD and GCD is O(M(N)*log(N)), |
| where M(N) is the time for multiplying two N-limb numbers. |
| |
| |
| File: gmp.info, Node: Extended GCD, Next: Jacobi Symbol, Prev: Subquadratic GCD, Up: Greatest Common Divisor Algorithms |
| |
| 15.3.4 Extended GCD |
| ------------------- |
| |
| The extended GCD function, or GCDEXT, calculates gcd(a,b) and also |
| cofactors x and y satisfying a*x+b*y=gcd(a,b). All the algorithms used |
| for plain GCD are extended to handle this case. The binary algorithm is |
| used only for single-limb GCDEXT. Lehmer's algorithm is used for sizes |
| up to `GCDEXT_DC_THRESHOLD'. Above this threshold, GCDEXT is |
| implemented as a loop around HGCD, but with more book-keeping to keep |
| track of the cofactors. This gives the same asymptotic running time as |
| for GCD and HGCD, O(M(N)*log(N)) |
| |
| One difference to plain GCD is that while the inputs a and b are |
| reduced as the algorithm proceeds, the cofactors x and y grow in size. |
| This makes the tuning of the chopping-point more difficult. The current |
| code chops off the most significant half of the inputs for the call to |
| HGCD in the first iteration, and the most significant two thirds for |
| the remaining calls. This strategy could surely be improved. Also the |
| stop condition for the loop, where Lehmer's algorithm is invoked once |
| the inputs are reduced below `GCDEXT_DC_THRESHOLD', could maybe be |
| improved by taking into account the current size of the cofactors. |
| |
| |
| File: gmp.info, Node: Jacobi Symbol, Prev: Extended GCD, Up: Greatest Common Divisor Algorithms |
| |
| 15.3.5 Jacobi Symbol |
| -------------------- |
| |
| [This section is obsolete. The current Jacobi code actually uses a very |
| efficient algorithm.] |
| |
| `mpz_jacobi' and `mpz_kronecker' are currently implemented with a |
| simple binary algorithm similar to that described for the GCDs (*note |
| Binary GCD::). They're not very fast when both inputs are large. |
| Lehmer's multi-step improvement or a binary based multi-step algorithm |
| is likely to be better. |
| |
| When one operand fits a single limb, and that includes |
| `mpz_kronecker_ui' and friends, an initial reduction is done with |
| either `mpn_mod_1' or `mpn_modexact_1_odd', followed by the binary |
| algorithm on a single limb. The binary algorithm is well suited to a |
| single limb, and the whole calculation in this case is quite efficient. |
| |
| In all the routines sign changes for the result are accumulated |
| using some bit twiddling, avoiding table lookups or conditional jumps. |
| |
| |
| File: gmp.info, Node: Powering Algorithms, Next: Root Extraction Algorithms, Prev: Greatest Common Divisor Algorithms, Up: Algorithms |
| |
| 15.4 Powering Algorithms |
| ======================== |
| |
| * Menu: |
| |
| * Normal Powering Algorithm:: |
| * Modular Powering Algorithm:: |
| |
| |
| File: gmp.info, Node: Normal Powering Algorithm, Next: Modular Powering Algorithm, Prev: Powering Algorithms, Up: Powering Algorithms |
| |
| 15.4.1 Normal Powering |
| ---------------------- |
| |
| Normal `mpz' or `mpf' powering uses a simple binary algorithm, |
| successively squaring and then multiplying by the base when a 1 bit is |
| seen in the exponent, as per Knuth section 4.6.3. The "left to right" |
| variant described there is used rather than algorithm A, since it's |
| just as easy and can be done with somewhat less temporary memory. |
| |
| |
| File: gmp.info, Node: Modular Powering Algorithm, Prev: Normal Powering Algorithm, Up: Powering Algorithms |
| |
| 15.4.2 Modular Powering |
| ----------------------- |
| |
| Modular powering is implemented using a 2^k-ary sliding window |
| algorithm, as per "Handbook of Applied Cryptography" algorithm 14.85 |
| (*note References::). k is chosen according to the size of the |
| exponent. Larger exponents use larger values of k, the choice being |
| made to minimize the average number of multiplications that must |
| supplement the squaring. |
| |
| The modular multiplies and squarings use either a simple division or |
| the REDC method by Montgomery (*note References::). REDC is a little |
| faster, essentially saving N single limb divisions in a fashion similar |
| to an exact remainder (*note Exact Remainder::). |
| |
| |
| File: gmp.info, Node: Root Extraction Algorithms, Next: Radix Conversion Algorithms, Prev: Powering Algorithms, Up: Algorithms |
| |
| 15.5 Root Extraction Algorithms |
| =============================== |
| |
| * Menu: |
| |
| * Square Root Algorithm:: |
| * Nth Root Algorithm:: |
| * Perfect Square Algorithm:: |
| * Perfect Power Algorithm:: |
| |
| |
| File: gmp.info, Node: Square Root Algorithm, Next: Nth Root Algorithm, Prev: Root Extraction Algorithms, Up: Root Extraction Algorithms |
| |
| 15.5.1 Square Root |
| ------------------ |
| |
| Square roots are taken using the "Karatsuba Square Root" algorithm by |
| Paul Zimmermann (*note References::). |
| |
| An input n is split into four parts of k bits each, so with b=2^k we |
| have n = a3*b^3 + a2*b^2 + a1*b + a0. Part a3 must be "normalized" so |
| that either the high or second highest bit is set. In GMP, k is kept |
| on a limb boundary and the input is left shifted (by an even number of |
| bits) to normalize. |
| |
| The square root of the high two parts is taken, by recursive |
| application of the algorithm (bottoming out in a one-limb Newton's |
| method), |
| |
| s1,r1 = sqrtrem (a3*b + a2) |
| |
| This is an approximation to the desired root and is extended by a |
| division to give s,r, |
| |
| q,u = divrem (r1*b + a1, 2*s1) |
| s = s1*b + q |
| r = u*b + a0 - q^2 |
| |
| The normalization requirement on a3 means at this point s is either |
| correct or 1 too big. r is negative in the latter case, so |
| |
| if r < 0 then |
| r = r + 2*s - 1 |
| s = s - 1 |
| |
| The algorithm is expressed in a divide and conquer form, but as |
| noted in the paper it can also be viewed as a discrete variant of |
| Newton's method, or as a variation on the schoolboy method (no longer |
| taught) for square roots two digits at a time. |
| |
| If the remainder r is not required then usually only a few high limbs |
| of r and u need to be calculated to determine whether an adjustment to |
| s is required. This optimization is not currently implemented. |
| |
| In the Karatsuba multiplication range this algorithm is |
| O(1.5*M(N/2)), where M(n) is the time to multiply two numbers of n |
| limbs. In the FFT multiplication range this grows to a bound of |
| O(6*M(N/2)). In practice a factor of about 1.5 to 1.8 is found in the |
| Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range. |
| |
| The algorithm does all its calculations in integers and the resulting |
| `mpn_sqrtrem' is used for both `mpz_sqrt' and `mpf_sqrt'. The extended |
| precision given by `mpf_sqrt_ui' is obtained by padding with zero limbs. |
| |
| |
| File: gmp.info, Node: Nth Root Algorithm, Next: Perfect Square Algorithm, Prev: Square Root Algorithm, Up: Root Extraction Algorithms |
| |
| 15.5.2 Nth Root |
| --------------- |
| |
| Integer Nth roots are taken using Newton's method with the following |
| iteration, where A is the input and n is the root to be taken. |
| |
| 1 A |
| a[i+1] = - * ( --------- + (n-1)*a[i] ) |
| n a[i]^(n-1) |
| |
| The initial approximation a[1] is generated bitwise by successively |
| powering a trial root with or without new 1 bits, aiming to be just |
| above the true root. The iteration converges quadratically when |
| started from a good approximation. When n is large more initial bits |
| are needed to get good convergence. The current implementation is not |
| particularly well optimized. |
| |
| |
| File: gmp.info, Node: Perfect Square Algorithm, Next: Perfect Power Algorithm, Prev: Nth Root Algorithm, Up: Root Extraction Algorithms |
| |
| 15.5.3 Perfect Square |
| --------------------- |
| |
| A significant fraction of non-squares can be quickly identified by |
| checking whether the input is a quadratic residue modulo small integers. |
| |
| `mpz_perfect_square_p' first tests the input mod 256, which means |
| just examining the low byte. Only 44 different values occur for |
| squares mod 256, so 82.8% of inputs can be immediately identified as |
| non-squares. |
| |
| On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, |
| for a total 99.25% of inputs identified as non-squares. On a 64-bit |
| system 97 is tested too, for a total 99.62%. |
| |
| These moduli are chosen because they're factors of 2^24-1 (or 2^48-1 |
| for 64-bits), and such a remainder can be quickly taken just using |
| additions (see `mpn_mod_34lsub1'). |
| |
| When nails are in use moduli are instead selected by the `gen-psqr.c' |
| program and applied with an `mpn_mod_1'. The same 2^24-1 or 2^48-1 |
| could be done with nails using some extra bit shifts, but this is not |
| currently implemented. |
| |
| In any case each modulus is applied to the `mpn_mod_34lsub1' or |
| `mpn_mod_1' remainder and a table lookup identifies non-squares. By |
| using a "modexact" style calculation, and suitably permuted tables, |
| just one multiply each is required, see the code for details. Moduli |
| are also combined to save operations, so long as the lookup tables |
| don't become too big. `gen-psqr.c' does all the pre-calculations. |
| |
| A square root must still be taken for any value that passes these |
| tests, to verify it's really a square and not one of the small fraction |
| of non-squares that get through (i.e. a pseudo-square to all the tested |
| bases). |
| |
| Clearly more residue tests could be done, `mpz_perfect_square_p' only |
| uses a compact and efficient set. Big inputs would probably benefit |
| from more residue testing, small inputs might be better off with less. |
| The assumed distribution of squares versus non-squares in the input |
| would affect such considerations. |
| |
| |
| File: gmp.info, Node: Perfect Power Algorithm, Prev: Perfect Square Algorithm, Up: Root Extraction Algorithms |
| |
| 15.5.4 Perfect Power |
| -------------------- |
| |
| Detecting perfect powers is required by some factorization algorithms. |
| Currently `mpz_perfect_power_p' is implemented using repeated Nth root |
| extractions, though naturally only prime roots need to be considered. |
| (*Note Nth Root Algorithm::.) |
| |
| If a prime divisor p with multiplicity e can be found, then only |
| roots which are divisors of e need to be considered, much reducing the |
| work necessary. To this end divisibility by a set of small primes is |
| checked. |
| |
| |
| File: gmp.info, Node: Radix Conversion Algorithms, Next: Other Algorithms, Prev: Root Extraction Algorithms, Up: Algorithms |
| |
| 15.6 Radix Conversion |
| ===================== |
| |
| Radix conversions are less important than other algorithms. A program |
| dominated by conversions should probably use a different data |
| representation. |
| |
| * Menu: |
| |
| * Binary to Radix:: |
| * Radix to Binary:: |
| |
| |
| File: gmp.info, Node: Binary to Radix, Next: Radix to Binary, Prev: Radix Conversion Algorithms, Up: Radix Conversion Algorithms |
| |
| 15.6.1 Binary to Radix |
| ---------------------- |
| |
| Conversions from binary to a power-of-2 radix use a simple and fast |
| O(N) bit extraction algorithm. |
| |
| Conversions from binary to other radices use one of two algorithms. |
| Sizes below `GET_STR_PRECOMPUTE_THRESHOLD' use a basic O(N^2) method. |
| Repeated divisions by b^n are made, where b is the radix and n is the |
| biggest power that fits in a limb. But instead of simply using the |
| remainder r from such divisions, an extra divide step is done to give a |
| fractional limb representing r/b^n. The digits of r can then be |
| extracted using multiplications by b rather than divisions. Special |
| case code is provided for decimal, allowing multiplications by 10 to |
| optimize to shifts and adds. |
| |
| Above `GET_STR_PRECOMPUTE_THRESHOLD' a sub-quadratic algorithm is |
| used. For an input t, powers b^(n*2^i) of the radix are calculated, |
| until a power between t and sqrt(t) is reached. t is then divided by |
| that largest power, giving a quotient which is the digits above that |
| power, and a remainder which is those below. These two parts are in |
| turn divided by the second highest power, and so on recursively. When |
| a piece has been divided down to less than `GET_STR_DC_THRESHOLD' |
| limbs, the basecase algorithm described above is used. |
| |
| The advantage of this algorithm is that big divisions can make use |
| of the sub-quadratic divide and conquer division (*note Divide and |
| Conquer Division::), and big divisions tend to have less overheads than |
| lots of separate single limb divisions anyway. But in any case the |
| cost of calculating the powers b^(n*2^i) must first be overcome. |
| |
| `GET_STR_PRECOMPUTE_THRESHOLD' and `GET_STR_DC_THRESHOLD' represent |
| the same basic thing, the point where it becomes worth doing a big |
| division to cut the input in half. `GET_STR_PRECOMPUTE_THRESHOLD' |
| includes the cost of calculating the radix power required, whereas |
| `GET_STR_DC_THRESHOLD' assumes that's already available, which is the |
| case when recursing. |
| |
| Since the base case produces digits from least to most significant |
| but they want to be stored from most to least, it's necessary to |
| calculate in advance how many digits there will be, or at least be sure |
| not to underestimate that. For GMP the number of input bits is |
| multiplied by `chars_per_bit_exactly' from `mp_bases', rounding up. |
| The result is either correct or one too big. |
| |
| Examining some of the high bits of the input could increase the |
| chance of getting the exact number of digits, but an exact result every |
| time would not be practical, since in general the difference between |
| numbers 100... and 99... is only in the last few bits and the work to |
| identify 99... might well be almost as much as a full conversion. |
| |
| The r/b^n scheme described above for using multiplications to bring |
| out digits might be useful for more than a single limb. Some brief |
| experiments with it on the base case when recursing didn't give a |
| noticeable improvement, but perhaps that was only due to the |
| implementation. Something similar would work for the sub-quadratic |
| divisions too, though there would be the cost of calculating a bigger |
| radix power. |
| |
| Another possible improvement for the sub-quadratic part would be to |
| arrange for radix powers that balanced the sizes of quotient and |
| remainder produced, i.e. the highest power would be an b^(n*k) |
| approximately equal to sqrt(t), not restricted to a 2^i factor. That |
| ought to smooth out a graph of times against sizes, but may or may not |
| be a net speedup. |
| |
| |
| File: gmp.info, Node: Radix to Binary, Prev: Binary to Radix, Up: Radix Conversion Algorithms |
| |
| 15.6.2 Radix to Binary |
| ---------------------- |
| |
| *This section needs to be rewritten, it currently describes the |
| algorithms used before GMP 4.3.* |
| |
| Conversions from a power-of-2 radix into binary use a simple and fast |
| O(N) bitwise concatenation algorithm. |
| |
| Conversions from other radices use one of two algorithms. Sizes |
| below `SET_STR_PRECOMPUTE_THRESHOLD' use a basic O(N^2) method. Groups |
| of n digits are converted to limbs, where n is the biggest power of the |
| base b which will fit in a limb, then those groups are accumulated into |
| the result by multiplying by b^n and adding. This saves |
| multi-precision operations, as per Knuth section 4.4 part E (*note |
| References::). Some special case code is provided for decimal, giving |
| the compiler a chance to optimize multiplications by 10. |
| |
| Above `SET_STR_PRECOMPUTE_THRESHOLD' a sub-quadratic algorithm is |
| used. First groups of n digits are converted into limbs. Then adjacent |
| limbs are combined into limb pairs with x*b^n+y, where x and y are the |
| limbs. Adjacent limb pairs are combined into quads similarly with |
| x*b^(2n)+y. This continues until a single block remains, that being |
| the result. |
| |
| The advantage of this method is that the multiplications for each x |
| are big blocks, allowing Karatsuba and higher algorithms to be used. |
| But the cost of calculating the powers b^(n*2^i) must be overcome. |
| `SET_STR_PRECOMPUTE_THRESHOLD' usually ends up quite big, around 5000 |
| digits, and on some processors much bigger still. |
| |
| `SET_STR_PRECOMPUTE_THRESHOLD' is based on the input digits (and |
| tuned for decimal), though it might be better based on a limb count, so |
| as to be independent of the base. But that sort of count isn't used by |
| the base case and so would need some sort of initial calculation or |
| estimate. |
| |
| The main reason `SET_STR_PRECOMPUTE_THRESHOLD' is so much bigger |
| than the corresponding `GET_STR_PRECOMPUTE_THRESHOLD' is that |
| `mpn_mul_1' is much faster than `mpn_divrem_1' (often by a factor of 5, |
| or more). |
| |
| |
| File: gmp.info, Node: Other Algorithms, Next: Assembly Coding, Prev: Radix Conversion Algorithms, Up: Algorithms |
| |
| 15.7 Other Algorithms |
| ===================== |
| |
| * Menu: |
| |
| * Prime Testing Algorithm:: |
| * Factorial Algorithm:: |
| * Binomial Coefficients Algorithm:: |
| * Fibonacci Numbers Algorithm:: |
| * Lucas Numbers Algorithm:: |
| * Random Number Algorithms:: |
| |
| |
| File: gmp.info, Node: Prime Testing Algorithm, Next: Factorial Algorithm, Prev: Other Algorithms, Up: Other Algorithms |
| |
| 15.7.1 Prime Testing |
| -------------------- |
| |
| The primality testing in `mpz_probab_prime_p' (*note Number Theoretic |
| Functions::) first does some trial division by small factors and then |
| uses the Miller-Rabin probabilistic primality testing algorithm, as |
| described in Knuth section 4.5.4 algorithm P (*note References::). |
| |
| For an odd input n, and with n = q*2^k+1 where q is odd, this |
| algorithm selects a random base x and tests whether x^q mod n is 1 or |
| -1, or an x^(q*2^j) mod n is 1, for 1<=j<=k. If so then n is probably |
| prime, if not then n is definitely composite. |
| |
| Any prime n will pass the test, but some composites do too. Such |
| composites are known as strong pseudoprimes to base x. No n is a |
| strong pseudoprime to more than 1/4 of all bases (see Knuth exercise |
| 22), hence with x chosen at random there's no more than a 1/4 chance a |
| "probable prime" will in fact be composite. |
| |
| In fact strong pseudoprimes are quite rare, making the test much more |
| powerful than this analysis would suggest, but 1/4 is all that's proven |
| for an arbitrary n. |
| |
| |
| File: gmp.info, Node: Factorial Algorithm, Next: Binomial Coefficients Algorithm, Prev: Prime Testing Algorithm, Up: Other Algorithms |
| |
| 15.7.2 Factorial |
| ---------------- |
| |
| Factorials are calculated by a combination of two algorithms. An idea is |
| shared among them: to compute the odd part of the factorial; a final |
| step takes account of the power of 2 term, by shifting. |
| |
| For small n, the odd factor of n! is computed with the simple |
| observation that it is equal to the product of all positive odd numbers |
| smaller than n times the odd factor of [n/2]!, where [x] is the integer |
| part of x, and so on recursively. The procedure can be best illustrated |
| with an example, |
| |
| 23! = (23.21.19.17.15.13.11.9.7.5.3)(11.9.7.5.3)(5.3)2^19 |
| |
| Current code collects all the factors in a single list, with a loop |
| and no recursion, and compute the product, with no special care for |
| repeated chunks. |
| |
| When n is larger, computation pass trough prime sieving. An helper |
| function is used, as suggested by Peter Luschny: |
| |
| n |
| ----- |
| n! | | L(p,n) |
| msf(n) = -------------- = | | p |
| [n/2]!^2.2^k p=3 |
| |
| Where p ranges on odd prime numbers. The exponent k is chosen to |
| obtain an odd integer number: k is the number of 1 bits in the binary |
| representation of [n/2]. The function L(p,n) can be defined as zero |
| when p is composite, and, for any prime p, it is computed with: |
| |
| --- |
| \ n |
| L(p,n) = / [---] mod 2 <= log (n) . |
| --- p^i p |
| i>0 |
| |
| With this helper function, we are able to compute the odd part of n! |
| using the recursion implied by n!=[n/2]!^2*msf(n)*2^k. The recursion |
| stops using the small-n algorithm on some [n/2^i]. |
| |
| Both the above algorithms use binary splitting to compute the |
| product of many small factors. At first as many products as possible |
| are accumulated in a single register, generating a list of factors that |
| fit in a machine word. This list is then split into halves, and the |
| product is computed recursively. |
| |
| Such splitting is more efficient than repeated Nx1 multiplies since |
| it forms big multiplies, allowing Karatsuba and higher algorithms to be |
| used. And even below the Karatsuba threshold a big block of work can |
| be more efficient for the basecase algorithm. |
| |
| |
| File: gmp.info, Node: Binomial Coefficients Algorithm, Next: Fibonacci Numbers Algorithm, Prev: Factorial Algorithm, Up: Other Algorithms |
| |
| 15.7.3 Binomial Coefficients |
| ---------------------------- |
| |
| Binomial coefficients C(n,k) are calculated by first arranging k <= n/2 |
| using C(n,k) = C(n,n-k) if necessary, and then evaluating the following |
| product simply from i=2 to i=k. |
| |
| k (n-k+i) |
| C(n,k) = (n-k+1) * prod ------- |
| i=2 i |
| |
| It's easy to show that each denominator i will divide the product so |
| far, so the exact division algorithm is used (*note Exact Division::). |
| |
| The numerators n-k+i and denominators i are first accumulated into |
| as many fit a limb, to save multi-precision operations, though for |
| `mpz_bin_ui' this applies only to the divisors, since n is an `mpz_t' |
| and n-k+i in general won't fit in a limb at all. |
| |
| |
| File: gmp.info, Node: Fibonacci Numbers Algorithm, Next: Lucas Numbers Algorithm, Prev: Binomial Coefficients Algorithm, Up: Other Algorithms |
| |
| 15.7.4 Fibonacci Numbers |
| ------------------------ |
| |
| The Fibonacci functions `mpz_fib_ui' and `mpz_fib2_ui' are designed for |
| calculating isolated F[n] or F[n],F[n-1] values efficiently. |
| |
| For small n, a table of single limb values in `__gmp_fib_table' is |
| used. On a 32-bit limb this goes up to F[47], or on a 64-bit limb up |
| to F[93]. For convenience the table starts at F[-1]. |
| |
| Beyond the table, values are generated with a binary powering |
| algorithm, calculating a pair F[n] and F[n-1] working from high to low |
| across the bits of n. The formulas used are |
| |
| F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k |
| F[2k-1] = F[k]^2 + F[k-1]^2 |
| |
| F[2k] = F[2k+1] - F[2k-1] |
| |
| At each step, k is the high b bits of n. If the next bit of n is 0 |
| then F[2k],F[2k-1] is used, or if it's a 1 then F[2k+1],F[2k] is used, |
| and the process repeated until all bits of n are incorporated. Notice |
| these formulas require just two squares per bit of n. |
| |
| It'd be possible to handle the first few n above the single limb |
| table with simple additions, using the defining Fibonacci recurrence |
| F[k+1]=F[k]+F[k-1], but this is not done since it usually turns out to |
| be faster for only about 10 or 20 values of n, and including a block of |
| code for just those doesn't seem worthwhile. If they really mattered |
| it'd be better to extend the data table. |
| |
| Using a table avoids lots of calculations on small numbers, and |
| makes small n go fast. A bigger table would make more small n go fast, |
| it's just a question of balancing size against desired speed. For GMP |
| the code is kept compact, with the emphasis primarily on a good |
| powering algorithm. |
| |
| `mpz_fib2_ui' returns both F[n] and F[n-1], but `mpz_fib_ui' is only |
| interested in F[n]. In this case the last step of the algorithm can |
| become one multiply instead of two squares. One of the following two |
| formulas is used, according as n is odd or even. |
| |
| F[2k] = F[k]*(F[k]+2F[k-1]) |
| |
| F[2k+1] = (2F[k]+F[k-1])*(2F[k]-F[k-1]) + 2*(-1)^k |
| |
| F[2k+1] here is the same as above, just rearranged to be a multiply. |
| For interest, the 2*(-1)^k term both here and above can be applied |
| just to the low limb of the calculation, without a carry or borrow into |
| further limbs, which saves some code size. See comments with |
| `mpz_fib_ui' and the internal `mpn_fib2_ui' for how this is done. |
| |
| |
| File: gmp.info, Node: Lucas Numbers Algorithm, Next: Random Number Algorithms, Prev: Fibonacci Numbers Algorithm, Up: Other Algorithms |
| |
| 15.7.5 Lucas Numbers |
| -------------------- |
| |
| `mpz_lucnum2_ui' derives a pair of Lucas numbers from a pair of |
| Fibonacci numbers with the following simple formulas. |
| |
| L[k] = F[k] + 2*F[k-1] |
| L[k-1] = 2*F[k] - F[k-1] |
| |
| `mpz_lucnum_ui' is only interested in L[n], and some work can be |
| saved. Trailing zero bits on n can be handled with a single square |
| each. |
| |
| L[2k] = L[k]^2 - 2*(-1)^k |
| |
| And the lowest 1 bit can be handled with one multiply of a pair of |
| Fibonacci numbers, similar to what `mpz_fib_ui' does. |
| |
| L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k |
| |
| |
| File: gmp.info, Node: Random Number Algorithms, Prev: Lucas Numbers Algorithm, Up: Other Algorithms |
| |
| 15.7.6 Random Numbers |
| --------------------- |
| |
| For the `urandomb' functions, random numbers are generated simply by |
| concatenating bits produced by the generator. As long as the generator |
| has good randomness properties this will produce well-distributed N bit |
| numbers. |
| |
| For the `urandomm' functions, random numbers in a range 0<=R<N are |
| generated by taking values R of ceil(log2(N)) bits each until one |
| satisfies R<N. This will normally require only one or two attempts, |
| but the attempts are limited in case the generator is somehow |
| degenerate and produces only 1 bits or similar. |
| |
| The Mersenne Twister generator is by Matsumoto and Nishimura (*note |
| References::). It has a non-repeating period of 2^19937-1, which is a |
| Mersenne prime, hence the name of the generator. The state is 624 |
| words of 32-bits each, which is iterated with one XOR and shift for each |
| 32-bit word generated, making the algorithm very fast. Randomness |
| properties are also very good and this is the default algorithm used by |
| GMP. |
| |
| Linear congruential generators are described in many text books, for |
| instance Knuth volume 2 (*note References::). With a modulus M and |
| parameters A and C, an integer state S is iterated by the formula S <- |
| A*S+C mod M. At each step the new state is a linear function of the |
| previous, mod M, hence the name of the generator. |
| |
| In GMP only moduli of the form 2^N are supported, and the current |
| implementation is not as well optimized as it could be. Overheads are |
| significant when N is small, and when N is large clearly the multiply |
| at each step will become slow. This is not a big concern, since the |
| Mersenne Twister generator is better in every respect and is therefore |
| recommended for all normal applications. |
| |
| For both generators the current state can be deduced by observing |
| enough output and applying some linear algebra (over GF(2) in the case |
| of the Mersenne Twister). This generally means raw output is |
| unsuitable for cryptographic applications without further hashing or |
| the like. |
| |
| |
| File: gmp.info, Node: Assembly Coding, Prev: Other Algorithms, Up: Algorithms |
| |
| 15.8 Assembly Coding |
| ==================== |
| |
| The assembly subroutines in GMP are the most significant source of |
| speed at small to moderate sizes. At larger sizes algorithm selection |
| becomes more important, but of course speedups in low level routines |
| will still speed up everything proportionally. |
| |
| Carry handling and widening multiplies that are important for GMP |
| can't be easily expressed in C. GCC `asm' blocks help a lot and are |
| provided in `longlong.h', but hand coding low level routines invariably |
| offers a speedup over generic C by a factor of anything from 2 to 10. |
| |
| * Menu: |
| |
| * Assembly Code Organisation:: |
| * Assembly Basics:: |
| * Assembly Carry Propagation:: |
| * Assembly Cache Handling:: |
| * Assembly Functional Units:: |
| * Assembly Floating Point:: |
| * Assembly SIMD Instructions:: |
| * Assembly Software Pipelining:: |
| * Assembly Loop Unrolling:: |
| * Assembly Writing Guide:: |
| |
| |
| File: gmp.info, Node: Assembly Code Organisation, Next: Assembly Basics, Prev: Assembly Coding, Up: Assembly Coding |
| |
| 15.8.1 Code Organisation |
| ------------------------ |
| |
| The various `mpn' subdirectories contain machine-dependent code, written |
| in C or assembly. The `mpn/generic' subdirectory contains default code, |
| used when there's no machine-specific version of a particular file. |
| |
| Each `mpn' subdirectory is for an ISA family. Generally 32-bit and |
| 64-bit variants in a family cannot share code and have separate |
| directories. Within a family further subdirectories may exist for CPU |
| variants. |
| |
| In each directory a `nails' subdirectory may exist, holding code with |
| nails support for that CPU variant. A `NAILS_SUPPORT' directive in each |
| file indicates the nails values the code handles. Nails code only |
| exists where it's faster, or promises to be faster, than plain code. |
| There's no effort put into nails if they're not going to enhance a |
| given CPU. |
| |
| |
| File: gmp.info, Node: Assembly Basics, Next: Assembly Carry Propagation, Prev: Assembly Code Organisation, Up: Assembly Coding |
| |
| 15.8.2 Assembly Basics |
| ---------------------- |
| |
| `mpn_addmul_1' and `mpn_submul_1' are the most important routines for |
| overall GMP performance. All multiplications and divisions come down to |
| repeated calls to these. `mpn_add_n', `mpn_sub_n', `mpn_lshift' and |
| `mpn_rshift' are next most important. |
| |
| On some CPUs assembly versions of the internal functions |
| `mpn_mul_basecase' and `mpn_sqr_basecase' give significant speedups, |
| mainly through avoiding function call overheads. They can also |
| potentially make better use of a wide superscalar processor, as can |
| bigger primitives like `mpn_addmul_2' or `mpn_addmul_4'. |
| |
| The restrictions on overlaps between sources and destinations (*note |
| Low-level Functions::) are designed to facilitate a variety of |
| implementations. For example, knowing `mpn_add_n' won't have partly |
| overlapping sources and destination means reading can be done far ahead |
| of writing on superscalar processors, and loops can be vectorized on a |
| vector processor, depending on the carry handling. |
| |
| |
| File: gmp.info, Node: Assembly Carry Propagation, Next: Assembly Cache Handling, Prev: Assembly Basics, Up: Assembly Coding |
| |
| 15.8.3 Carry Propagation |
| ------------------------ |
| |
| The problem that presents most challenges in GMP is propagating carries |
| from one limb to the next. In functions like `mpn_addmul_1' and |
| `mpn_add_n', carries are the only dependencies between limb operations. |
| |
| On processors with carry flags, a straightforward CISC style `adc' is |
| generally best. AMD K6 `mpn_addmul_1' however is an example of an |
| unusual set of circumstances where a branch works out better. |
| |
| On RISC processors generally an add and compare for overflow is |
| used. This sort of thing can be seen in `mpn/generic/aors_n.c'. Some |
| carry propagation schemes require 4 instructions, meaning at least 4 |
| cycles per limb, but other schemes may use just 1 or 2. On wide |
| superscalar processors performance may be completely determined by the |
| number of dependent instructions between carry-in and carry-out for |
| each limb. |
| |
| On vector processors good use can be made of the fact that a carry |
| bit only very rarely propagates more than one limb. When adding a |
| single bit to a limb, there's only a carry out if that limb was |
| `0xFF...FF' which on random data will be only 1 in 2^mp_bits_per_limb. |
| `mpn/cray/add_n.c' is an example of this, it adds all limbs in |
| parallel, adds one set of carry bits in parallel and then only rarely |
| needs to fall through to a loop propagating further carries. |
| |
| On the x86s, GCC (as of version 2.95.2) doesn't generate |
| particularly good code for the RISC style idioms that are necessary to |
| handle carry bits in C. Often conditional jumps are generated where |
| `adc' or `sbb' forms would be better. And so unfortunately almost any |
| loop involving carry bits needs to be coded in assembly for best |
| results. |
| |
| |
| File: gmp.info, Node: Assembly Cache Handling, Next: Assembly Functional Units, Prev: Assembly Carry Propagation, Up: Assembly Coding |
| |
| 15.8.4 Cache Handling |
| --------------------- |
| |
| GMP aims to perform well both on operands that fit entirely in L1 cache |
| and those which don't. |
| |
| Basic routines like `mpn_add_n' or `mpn_lshift' are often used on |
| large operands, so L2 and main memory performance is important for them. |
| `mpn_mul_1' and `mpn_addmul_1' are mostly used for multiply and square |
| basecases, so L1 performance matters most for them, unless assembly |
| versions of `mpn_mul_basecase' and `mpn_sqr_basecase' exist, in which |
| case the remaining uses are mostly for larger operands. |
| |
| For L2 or main memory operands, memory access times will almost |
| certainly be more than the calculation time. The aim therefore is to |
| maximize memory throughput, by starting a load of the next cache line |
| while processing the contents of the previous one. Clearly this is |
| only possible if the chip has a lock-up free cache or some sort of |
| prefetch instruction. Most current chips have both these features. |
| |
| Prefetching sources combines well with loop unrolling, since a |
| prefetch can be initiated once per unrolled loop (or more than once if |
| the loop covers more than one cache line). |
| |
| On CPUs without write-allocate caches, prefetching destinations will |
| ensure individual stores don't go further down the cache hierarchy, |
| limiting bandwidth. Of course for calculations which are slow anyway, |
| like `mpn_divrem_1', write-throughs might be fine. |
| |
| The distance ahead to prefetch will be determined by memory latency |
| versus throughput. The aim of course is to have data arriving |
| continuously, at peak throughput. Some CPUs have limits on the number |
| of fetches or prefetches in progress. |
| |
| If a special prefetch instruction doesn't exist then a plain load |
| can be used, but in that case care must be taken not to attempt to read |
| past the end of an operand, since that might produce a segmentation |
| violation. |
| |
| Some CPUs or systems have hardware that detects sequential memory |
| accesses and initiates suitable cache movements automatically, making |
| life easy. |
| |
| |
| File: gmp.info, Node: Assembly Functional Units, Next: Assembly Floating Point, Prev: Assembly Cache Handling, Up: Assembly Coding |
| |
| 15.8.5 Functional Units |
| ----------------------- |
| |
| When choosing an approach for an assembly loop, consideration is given |
| to what operations can execute simultaneously and what throughput can |
| thereby be achieved. In some cases an algorithm can be tweaked to |
| accommodate available resources. |
| |
| Loop control will generally require a counter and pointer updates, |
| costing as much as 5 instructions, plus any delays a branch introduces. |
| CPU addressing modes might reduce pointer updates, perhaps by allowing |
| just one updating pointer and others expressed as offsets from it, or |
| on CISC chips with all addressing done with the loop counter as a |
| scaled index. |
| |
| The final loop control cost can be amortised by processing several |
| limbs in each iteration (*note Assembly Loop Unrolling::). This at |
| least ensures loop control isn't a big fraction the work done. |
| |
| Memory throughput is always a limit. If perhaps only one load or |
| one store can be done per cycle then 3 cycles/limb will the top speed |
| for "binary" operations like `mpn_add_n', and any code achieving that |
| is optimal. |
| |
| Integer resources can be freed up by having the loop counter in a |
| float register, or by pressing the float units into use for some |
| multiplying, perhaps doing every second limb on the float side (*note |
| Assembly Floating Point::). |
| |
| Float resources can be freed up by doing carry propagation on the |
| integer side, or even by doing integer to float conversions in integers |
| using bit twiddling. |
| |
| |
| File: gmp.info, Node: Assembly Floating Point, Next: Assembly SIMD Instructions, Prev: Assembly Functional Units, Up: Assembly Coding |
| |
| 15.8.6 Floating Point |
| --------------------- |
| |
| Floating point arithmetic is used in GMP for multiplications on CPUs |
| with poor integer multipliers. It's mostly useful for `mpn_mul_1', |
| `mpn_addmul_1' and `mpn_submul_1' on 64-bit machines, and |
| `mpn_mul_basecase' on both 32-bit and 64-bit machines. |
| |
| With IEEE 53-bit double precision floats, integer multiplications |
| producing up to 53 bits will give exact results. Breaking a 64x64 |
| multiplication into eight 16x32->48 bit pieces is convenient. With |
| some care though six 21x32->53 bit products can be used, if one of the |
| lower two 21-bit pieces also uses the sign bit. |
| |
| For the `mpn_mul_1' family of functions on a 64-bit machine, the |
| invariant single limb is split at the start, into 3 or 4 pieces. |
| Inside the loop, the bignum operand is split into 32-bit pieces. Fast |
| conversion of these unsigned 32-bit pieces to floating point is highly |
| machine-dependent. In some cases, reading the data into the integer |
| unit, zero-extending to 64-bits, then transferring to the floating |
| point unit back via memory is the only option. |
| |
| Converting partial products back to 64-bit limbs is usually best |
| done as a signed conversion. Since all values are smaller than 2^53, |
| signed and unsigned are the same, but most processors lack unsigned |
| conversions. |
| |
| |
| |
| Here is a diagram showing 16x32 bit products for an `mpn_mul_1' or |
| `mpn_addmul_1' with a 64-bit limb. The single limb operand V is split |
| into four 16-bit parts. The multi-limb operand U is split in the loop |
| into two 32-bit parts. |
| |
| +---+---+---+---+ |
| |v48|v32|v16|v00| V operand |
| +---+---+---+---+ |
| |
| +-------+---+---+ |
| x | u32 | u00 | U operand (one limb) |
| +---------------+ |
| |
| --------------------------------- |
| |
| +-----------+ |
| | u00 x v00 | p00 48-bit products |
| +-----------+ |
| +-----------+ |
| | u00 x v16 | p16 |
| +-----------+ |
| +-----------+ |
| | u00 x v32 | p32 |
| +-----------+ |
| +-----------+ |
| | u00 x v48 | p48 |
| +-----------+ |
| +-----------+ |
| | u32 x v00 | r32 |
| +-----------+ |
| +-----------+ |
| | u32 x v16 | r48 |
| +-----------+ |
| +-----------+ |
| | u32 x v32 | r64 |
| +-----------+ |
| +-----------+ |
| | u32 x v48 | r80 |
| +-----------+ |
| |
| p32 and r32 can be summed using floating-point addition, and |
| likewise p48 and r48. p00 and p16 can be summed with r64 and r80 from |
| the previous iteration. |
| |
| For each loop then, four 49-bit quantities are transferred to the |
| integer unit, aligned as follows, |
| |
| |-----64bits----|-----64bits----| |
| +------------+ |
| | p00 + r64' | i00 |
| +------------+ |
| +------------+ |
| | p16 + r80' | i16 |
| +------------+ |
| +------------+ |
| | p32 + r32 | i32 |
| +------------+ |
| +------------+ |
| | p48 + r48 | i48 |
| +------------+ |
| |
| The challenge then is to sum these efficiently and add in a carry |
| limb, generating a low 64-bit result limb and a high 33-bit carry limb |
| (i48 extends 33 bits into the high half). |
| |
| |
| File: gmp.info, Node: Assembly SIMD Instructions, Next: Assembly Software Pipelining, Prev: Assembly Floating Point, Up: Assembly Coding |
| |
| 15.8.7 SIMD Instructions |
| ------------------------ |
| |
| The single-instruction multiple-data support in current microprocessors |
| is aimed at signal processing algorithms where each data point can be |
| treated more or less independently. There's generally not much support |
| for propagating the sort of carries that arise in GMP. |
| |
| SIMD multiplications of say four 16x16 bit multiplies only do as much |
| work as one 32x32 from GMP's point of view, and need some shifts and |
| adds besides. But of course if say the SIMD form is fully pipelined |
| and uses less instruction decoding then it may still be worthwhile. |
| |
| On the x86 chips, MMX has so far found a use in `mpn_rshift' and |
| `mpn_lshift', and is used in a special case for 16-bit multipliers in |
| the P55 `mpn_mul_1'. SSE2 is used for Pentium 4 `mpn_mul_1', |
| `mpn_addmul_1', and `mpn_submul_1'. |
| |
| |
| File: gmp.info, Node: Assembly Software Pipelining, Next: Assembly Loop Unrolling, Prev: Assembly SIMD Instructions, Up: Assembly Coding |
| |
| 15.8.8 Software Pipelining |
| -------------------------- |
| |
| Software pipelining consists of scheduling instructions around the |
| branch point in a loop. For example a loop might issue a load not for |
| use in the present iteration but the next, thereby allowing extra |
| cycles for the data to arrive from memory. |
| |
| Naturally this is wanted only when doing things like loads or |
| multiplies that take several cycles to complete, and only where a CPU |
| has multiple functional units so that other work can be done in the |
| meantime. |
| |
| A pipeline with several stages will have a data value in progress at |
| each stage and each loop iteration moves them along one stage. This is |
| like juggling. |
| |
| If the latency of some instruction is greater than the loop time |
| then it will be necessary to unroll, so one register has a result ready |
| to use while another (or multiple others) are still in progress. |
| (*note Assembly Loop Unrolling::). |
| |
| |
| File: gmp.info, Node: Assembly Loop Unrolling, Next: Assembly Writing Guide, Prev: Assembly Software Pipelining, Up: Assembly Coding |
| |
| 15.8.9 Loop Unrolling |
| --------------------- |
| |
| Loop unrolling consists of replicating code so that several limbs are |
| processed in each loop. At a minimum this reduces loop overheads by a |
| corresponding factor, but it can also allow better register usage, for |
| example alternately using one register combination and then another. |
| Judicious use of `m4' macros can help avoid lots of duplication in the |
| source code. |
| |
| Any amount of unrolling can be handled with a loop counter that's |
| decremented by N each time, stopping when the remaining count is less |
| than the further N the loop will process. Or by subtracting N at the |
| start, the termination condition becomes when the counter C is less |
| than 0 (and the count of remaining limbs is C+N). |
| |
| Alternately for a power of 2 unroll the loop count and remainder can |
| be established with a shift and mask. This is convenient if also |
| making a computed jump into the middle of a large loop. |
| |
| The limbs not a multiple of the unrolling can be handled in various |
| ways, for example |
| |
| * A simple loop at the end (or the start) to process the excess. |
| Care will be wanted that it isn't too much slower than the |
| unrolled part. |
| |
| * A set of binary tests, for example after an 8-limb unrolling, test |
| for 4 more limbs to process, then a further 2 more or not, and |
| finally 1 more or not. This will probably take more code space |
| than a simple loop. |
| |
| * A `switch' statement, providing separate code for each possible |
| excess, for example an 8-limb unrolling would have separate code |
| for 0 remaining, 1 remaining, etc, up to 7 remaining. This might |
| take a lot of code, but may be the best way to optimize all cases |
| in combination with a deep pipelined loop. |
| |
| * A computed jump into the middle of the loop, thus making the first |
| iteration handle the excess. This should make times smoothly |
| increase with size, which is attractive, but setups for the jump |
| and adjustments for pointers can be tricky and could become quite |
| difficult in combination with deep pipelining. |
| |
| |
| File: gmp.info, Node: Assembly Writing Guide, Prev: Assembly Loop Unrolling, Up: Assembly Coding |
| |
| 15.8.10 Writing Guide |
| --------------------- |
| |
| This is a guide to writing software pipelined loops for processing limb |
| vectors in assembly. |
| |
| First determine the algorithm and which instructions are needed. |
| Code it without unrolling or scheduling, to make sure it works. On a |
| 3-operand CPU try to write each new value to a new register, this will |
| greatly simplify later steps. |
| |
| Then note for each instruction the functional unit and/or issue port |
| requirements. If an instruction can use either of two units, like U0 |
| or U1 then make a category "U0/U1". Count the total using each unit |
| (or combined unit), and count all instructions. |
| |
| Figure out from those counts the best possible loop time. The goal |
| will be to find a perfect schedule where instruction latencies are |
| completely hidden. The total instruction count might be the limiting |
| factor, or perhaps a particular functional unit. It might be possible |
| to tweak the instructions to help the limiting factor. |
| |
| Suppose the loop time is N, then make N issue buckets, with the |
| final loop branch at the end of the last. Now fill the buckets with |
| dummy instructions using the functional units desired. Run this to |
| make sure the intended speed is reached. |
| |
| Now replace the dummy instructions with the real instructions from |
| the slow but correct loop you started with. The first will typically |
| be a load instruction. Then the instruction using that value is placed |
| in a bucket an appropriate distance down. Run the loop again, to check |
| it still runs at target speed. |
| |
| Keep placing instructions, frequently measuring the loop. After a |
| few you will need to wrap around from the last bucket back to the top |
| of the loop. If you used the new-register for new-value strategy above |
| then there will be no register conflicts. If not then take care not to |
| clobber something already in use. Changing registers at this time is |
| very error prone. |
| |
| The loop will overlap two or more of the original loop iterations, |
| and the computation of one vector element result will be started in one |
| iteration of the new loop, and completed one or several iterations |
| later. |
| |
| The final step is to create feed-in and wind-down code for the loop. |
| A good way to do this is to make a copy (or copies) of the loop at the |
| start and delete those instructions which don't have valid antecedents, |
| and at the end replicate and delete those whose results are unwanted |
| (including any further loads). |
| |
| The loop will have a minimum number of limbs loaded and processed, |
| so the feed-in code must test if the request size is smaller and skip |
| either to a suitable part of the wind-down or to special code for small |
| sizes. |
| |
| |
| File: gmp.info, Node: Internals, Next: Contributors, Prev: Algorithms, Up: Top |
| |
| 16 Internals |
| ************ |
| |
| *This chapter is provided only for informational purposes and the |
| various internals described here may change in future GMP releases. |
| Applications expecting to be compatible with future releases should use |
| only the documented interfaces described in previous chapters.* |
| |
| * Menu: |
| |
| * Integer Internals:: |
| * Rational Internals:: |
| * Float Internals:: |
| * Raw Output Internals:: |
| * C++ Interface Internals:: |
| |
| |
| File: gmp.info, Node: Integer Internals, Next: Rational Internals, Prev: Internals, Up: Internals |
| |
| 16.1 Integer Internals |
| ====================== |
| |
| `mpz_t' variables represent integers using sign and magnitude, in space |
| dynamically allocated and reallocated. The fields are as follows. |
| |
| `_mp_size' |
| The number of limbs, or the negative of that when representing a |
| negative integer. Zero is represented by `_mp_size' set to zero, |
| in which case the `_mp_d' data is unused. |
| |
| `_mp_d' |
| A pointer to an array of limbs which is the magnitude. These are |
| stored "little endian" as per the `mpn' functions, so `_mp_d[0]' |
| is the least significant limb and `_mp_d[ABS(_mp_size)-1]' is the |
| most significant. Whenever `_mp_size' is non-zero, the most |
| significant limb is non-zero. |
| |
| Currently there's always at least one limb allocated, so for |
| instance `mpz_set_ui' never needs to reallocate, and `mpz_get_ui' |
| can fetch `_mp_d[0]' unconditionally (though its value is then |
| only wanted if `_mp_size' is non-zero). |
| |
| `_mp_alloc' |
| `_mp_alloc' is the number of limbs currently allocated at `_mp_d', |
| and naturally `_mp_alloc >= ABS(_mp_size)'. When an `mpz' routine |
| is about to (or might be about to) increase `_mp_size', it checks |
| `_mp_alloc' to see whether there's enough space, and reallocates |
| if not. `MPZ_REALLOC' is generally used for this. |
| |
| The various bitwise logical functions like `mpz_and' behave as if |
| negative values were twos complement. But sign and magnitude is always |
| used internally, and necessary adjustments are made during the |
| calculations. Sometimes this isn't pretty, but sign and magnitude are |
| best for other routines. |
| |
| Some internal temporary variables are setup with `MPZ_TMP_INIT' and |
| these have `_mp_d' space obtained from `TMP_ALLOC' rather than the |
| memory allocation functions. Care is taken to ensure that these are |
| big enough that no reallocation is necessary (since it would have |
| unpredictable consequences). |
| |
| `_mp_size' and `_mp_alloc' are `int', although `mp_size_t' is |
| usually a `long'. This is done to make the fields just 32 bits on some |
| 64 bits systems, thereby saving a few bytes of data space but still |
| providing plenty of range. |
| |
| |
| File: gmp.info, Node: Rational Internals, Next: Float Internals, Prev: Integer Internals, Up: Internals |
| |
| 16.2 Rational Internals |
| ======================= |
| |
| `mpq_t' variables represent rationals using an `mpz_t' numerator and |
| denominator (*note Integer Internals::). |
| |
| The canonical form adopted is denominator positive (and non-zero), |
| no common factors between numerator and denominator, and zero uniquely |
| represented as 0/1. |
| |
| It's believed that casting out common factors at each stage of a |
| calculation is best in general. A GCD is an O(N^2) operation so it's |
| better to do a few small ones immediately than to delay and have to do |
| a big one later. Knowing the numerator and denominator have no common |
| factors can be used for example in `mpq_mul' to make only two cross |
| GCDs necessary, not four. |
| |
| This general approach to common factors is badly sub-optimal in the |
| presence of simple factorizations or little prospect for cancellation, |
| but GMP has no way to know when this will occur. As per *Note |
| Efficiency::, that's left to applications. The `mpq_t' framework might |
| still suit, with `mpq_numref' and `mpq_denref' for direct access to the |
| numerator and denominator, or of course `mpz_t' variables can be used |
| directly. |
| |
| |
| File: gmp.info, Node: Float Internals, Next: Raw Output Internals, Prev: Rational Internals, Up: Internals |
| |
| 16.3 Float Internals |
| ==================== |
| |
| Efficient calculation is the primary aim of GMP floats and the use of |
| whole limbs and simple rounding facilitates this. |
| |
| `mpf_t' floats have a variable precision mantissa and a single |
| machine word signed exponent. The mantissa is represented using sign |
| and magnitude. |
| |
| most least |
| significant significant |
| limb limb |
| |
| _mp_d |
| |---- _mp_exp ---> | |
| _____ _____ _____ _____ _____ |
| |_____|_____|_____|_____|_____| |
| . <------------ radix point |
| |
| <-------- _mp_size ---------> |
| |
| The fields are as follows. |
| |
| `_mp_size' |
| The number of limbs currently in use, or the negative of that when |
| representing a negative value. Zero is represented by `_mp_size' |
| and `_mp_exp' both set to zero, and in that case the `_mp_d' data |
| is unused. (In the future `_mp_exp' might be undefined when |
| representing zero.) |
| |
| `_mp_prec' |
| The precision of the mantissa, in limbs. In any calculation the |
| aim is to produce `_mp_prec' limbs of result (the most significant |
| being non-zero). |
| |
| `_mp_d' |
| A pointer to the array of limbs which is the absolute value of the |
| mantissa. These are stored "little endian" as per the `mpn' |
| functions, so `_mp_d[0]' is the least significant limb and |
| `_mp_d[ABS(_mp_size)-1]' the most significant. |
| |
| The most significant limb is always non-zero, but there are no |
| other restrictions on its value, in particular the highest 1 bit |
| can be anywhere within the limb. |
| |
| `_mp_prec+1' limbs are allocated to `_mp_d', the extra limb being |
| for convenience (see below). There are no reallocations during a |
| calculation, only in a change of precision with `mpf_set_prec'. |
| |
| `_mp_exp' |
| The exponent, in limbs, determining the location of the implied |
| radix point. Zero means the radix point is just above the most |
| significant limb. Positive values mean a radix point offset |
| towards the lower limbs and hence a value >= 1, as for example in |
| the diagram above. Negative exponents mean a radix point further |
| above the highest limb. |
| |
| Naturally the exponent can be any value, it doesn't have to fall |
| within the limbs as the diagram shows, it can be a long way above |
| or a long way below. Limbs other than those included in the |
| `{_mp_d,_mp_size}' data are treated as zero. |
| |
| The `_mp_size' and `_mp_prec' fields are `int', although the |
| `mp_size_t' type is usually a `long'. The `_mp_exp' field is usually |
| `long'. This is done to make some fields just 32 bits on some 64 bits |
| systems, thereby saving a few bytes of data space but still providing |
| plenty of precision and a very large range. |
| |
| |
| The following various points should be noted. |
| |
| Low Zeros |
| The least significant limbs `_mp_d[0]' etc can be zero, though |
| such low zeros can always be ignored. Routines likely to produce |
| low zeros check and avoid them to save time in subsequent |
| calculations, but for most routines they're quite unlikely and |
| aren't checked. |
| |
| Mantissa Size Range |
| The `_mp_size' count of limbs in use can be less than `_mp_prec' if |
| the value can be represented in less. This means low precision |
| values or small integers stored in a high precision `mpf_t' can |
| still be operated on efficiently. |
| |
| `_mp_size' can also be greater than `_mp_prec'. Firstly a value is |
| allowed to use all of the `_mp_prec+1' limbs available at `_mp_d', |
| and secondly when `mpf_set_prec_raw' lowers `_mp_prec' it leaves |
| `_mp_size' unchanged and so the size can be arbitrarily bigger than |
| `_mp_prec'. |
| |
| Rounding |
| All rounding is done on limb boundaries. Calculating `_mp_prec' |
| limbs with the high non-zero will ensure the application requested |
| minimum precision is obtained. |
| |
| The use of simple "trunc" rounding towards zero is efficient, |
| since there's no need to examine extra limbs and increment or |
| decrement. |
| |
| Bit Shifts |
| Since the exponent is in limbs, there are no bit shifts in basic |
| operations like `mpf_add' and `mpf_mul'. When differing exponents |
| are encountered all that's needed is to adjust pointers to line up |
| the relevant limbs. |
| |
| Of course `mpf_mul_2exp' and `mpf_div_2exp' will require bit |
| shifts, but the choice is between an exponent in limbs which |
| requires shifts there, or one in bits which requires them almost |
| everywhere else. |
| |
| Use of `_mp_prec+1' Limbs |
| The extra limb on `_mp_d' (`_mp_prec+1' rather than just |
| `_mp_prec') helps when an `mpf' routine might get a carry from its |
| operation. `mpf_add' for instance will do an `mpn_add' of |
| `_mp_prec' limbs. If there's no carry then that's the result, but |
| if there is a carry then it's stored in the extra limb of space and |
| `_mp_size' becomes `_mp_prec+1'. |
| |
| Whenever `_mp_prec+1' limbs are held in a variable, the low limb |
| is not needed for the intended precision, only the `_mp_prec' high |
| limbs. But zeroing it out or moving the rest down is unnecessary. |
| Subsequent routines reading the value will simply take the high |
| limbs they need, and this will be `_mp_prec' if their target has |
| that same precision. This is no more than a pointer adjustment, |
| and must be checked anyway since the destination precision can be |
| different from the sources. |
| |
| Copy functions like `mpf_set' will retain a full `_mp_prec+1' limbs |
| if available. This ensures that a variable which has `_mp_size' |
| equal to `_mp_prec+1' will get its full exact value copied. |
| Strictly speaking this is unnecessary since only `_mp_prec' limbs |
| are needed for the application's requested precision, but it's |
| considered that an `mpf_set' from one variable into another of the |
| same precision ought to produce an exact copy. |
| |
| Application Precisions |
| `__GMPF_BITS_TO_PREC' converts an application requested precision |
| to an `_mp_prec'. The value in bits is rounded up to a whole limb |
| then an extra limb is added since the most significant limb of |
| `_mp_d' is only non-zero and therefore might contain only one bit. |
| |
| `__GMPF_PREC_TO_BITS' does the reverse conversion, and removes the |
| extra limb from `_mp_prec' before converting to bits. The net |
| effect of reading back with `mpf_get_prec' is simply the precision |
| rounded up to a multiple of `mp_bits_per_limb'. |
| |
| Note that the extra limb added here for the high only being |
| non-zero is in addition to the extra limb allocated to `_mp_d'. |
| For example with a 32-bit limb, an application request for 250 |
| bits will be rounded up to 8 limbs, then an extra added for the |
| high being only non-zero, giving an `_mp_prec' of 9. `_mp_d' then |
| gets 10 limbs allocated. Reading back with `mpf_get_prec' will |
| take `_mp_prec' subtract 1 limb and multiply by 32, giving 256 |
| bits. |
| |
| Strictly speaking, the fact the high limb has at least one bit |
| means that a float with, say, 3 limbs of 32-bits each will be |
| holding at least 65 bits, but for the purposes of `mpf_t' it's |
| considered simply to be 64 bits, a nice multiple of the limb size. |
| |
| |
| File: gmp.info, Node: Raw Output Internals, Next: C++ Interface Internals, Prev: Float Internals, Up: Internals |
| |
| 16.4 Raw Output Internals |
| ========================= |
| |
| `mpz_out_raw' uses the following format. |
| |
| +------+------------------------+ |
| | size | data bytes | |
| +------+------------------------+ |
| |
| The size is 4 bytes written most significant byte first, being the |
| number of subsequent data bytes, or the twos complement negative of |
| that when a negative integer is represented. The data bytes are the |
| absolute value of the integer, written most significant byte first. |
| |
| The most significant data byte is always non-zero, so the output is |
| the same on all systems, irrespective of limb size. |
| |
| In GMP 1, leading zero bytes were written to pad the data bytes to a |
| multiple of the limb size. `mpz_inp_raw' will still accept this, for |
| compatibility. |
| |
| The use of "big endian" for both the size and data fields is |
| deliberate, it makes the data easy to read in a hex dump of a file. |
| Unfortunately it also means that the limb data must be reversed when |
| reading or writing, so neither a big endian nor little endian system |
| can just read and write `_mp_d'. |
| |
| |
| File: gmp.info, Node: C++ Interface Internals, Prev: Raw Output Internals, Up: Internals |
| |
| 16.5 C++ Interface Internals |
| ============================ |
| |
| A system of expression templates is used to ensure something like |
| `a=b+c' turns into a simple call to `mpz_add' etc. For `mpf_class' the |
| scheme also ensures the precision of the final destination is used for |
| any temporaries within a statement like `f=w*x+y*z'. These are |
| important features which a naive implementation cannot provide. |
| |
| A simplified description of the scheme follows. The true scheme is |
| complicated by the fact that expressions have different return types. |
| For detailed information, refer to the source code. |
| |
| To perform an operation, say, addition, we first define a "function |
| object" evaluating it, |
| |
| struct __gmp_binary_plus |
| { |
| static void eval(mpf_t f, const mpf_t g, const mpf_t h) |
| { |
| mpf_add(f, g, h); |
| } |
| }; |
| |
| And an "additive expression" object, |
| |
| __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> > |
| operator+(const mpf_class &f, const mpf_class &g) |
| { |
| return __gmp_expr |
| <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g); |
| } |
| |
| The seemingly redundant `__gmp_expr<__gmp_binary_expr<...>>' is used |
| to encapsulate any possible kind of expression into a single template |
| type. In fact even `mpf_class' etc are `typedef' specializations of |
| `__gmp_expr'. |
| |
| Next we define assignment of `__gmp_expr' to `mpf_class'. |
| |
| template <class T> |
| mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr) |
| { |
| expr.eval(this->get_mpf_t(), this->precision()); |
| return *this; |
| } |
| |
| template <class Op> |
| void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval |
| (mpf_t f, mp_bitcnt_t precision) |
| { |
| Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t()); |
| } |
| |
| where `expr.val1' and `expr.val2' are references to the expression's |
| operands (here `expr' is the `__gmp_binary_expr' stored within the |
| `__gmp_expr'). |
| |
| This way, the expression is actually evaluated only at the time of |
| assignment, when the required precision (that of `f') is known. |
| Furthermore the target `mpf_t' is now available, thus we can call |
| `mpf_add' directly with `f' as the output argument. |
| |
| Compound expressions are handled by defining operators taking |
| subexpressions as their arguments, like this: |
| |
| template <class T, class U> |
| __gmp_expr |
| <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> > |
| operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2) |
| { |
| return __gmp_expr |
| <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> > |
| (expr1, expr2); |
| } |
| |
| And the corresponding specializations of `__gmp_expr::eval': |
| |
| template <class T, class U, class Op> |
| void __gmp_expr |
| <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval |
| (mpf_t f, mp_bitcnt_t precision) |
| { |
| // declare two temporaries |
| mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision); |
| Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t()); |
| } |
| |
| The expression is thus recursively evaluated to any level of |
| complexity and all subexpressions are evaluated to the precision of `f'. |
| |
| |
| File: gmp.info, Node: Contributors, Next: References, Prev: Internals, Up: Top |
| |
| Appendix A Contributors |
| *********************** |
| |
| Torbjo"rn Granlund wrote the original GMP library and is still the main |
| developer. Code not explicitly attributed to others, was contributed by |
| Torbjo"rn. Several other individuals and organizations have contributed |
| GMP. Here is a list in chronological order on first contribution: |
| |
| Gunnar Sjo"din and Hans Riesel helped with mathematical problems in |
| early versions of the library. |
| |
| Richard Stallman helped with the interface design and revised the |
| first version of this manual. |
| |
| Brian Beuning and Doug Lea helped with testing of early versions of |
| the library and made creative suggestions. |
| |
| John Amanatides of York University in Canada contributed the function |
| `mpz_probab_prime_p'. |
| |
| Paul Zimmermann wrote the REDC-based mpz_powm code, the |
| Scho"nhage-Strassen FFT multiply code, and the Karatsuba square root |
| code. He also improved the Toom3 code for GMP 4.2. Paul sparked the |
| development of GMP 2, with his comparisons between bignum packages. |
| The ECMNET project Paul is organizing was a driving force behind many |
| of the optimizations in GMP 3. Paul also wrote the new GMP 4.3 nth |
| root code (with Torbjo"rn). |
| |
| Ken Weber (Kent State University, Universidade Federal do Rio Grande |
| do Sul) contributed now defunct versions of `mpz_gcd', `mpz_divexact', |
| `mpn_gcd', and `mpn_bdivmod', partially supported by CNPq (Brazil) |
| grant 301314194-2. |
| |
| Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' |
| configure. He has also made valuable suggestions and tested numerous |
| intermediary releases. |
| |
| Joachim Hollman was involved in the design of the `mpf' interface, |
| and in the `mpz' design revisions for version 2. |
| |
| Bennet Yee contributed the initial versions of `mpz_jacobi' and |
| `mpz_legendre'. |
| |
| Andreas Schwab contributed the files `mpn/m68k/lshift.S' and |
| `mpn/m68k/rshift.S' (now in `.asm' form). |
| |
| Robert Harley of Inria, France and David Seal of ARM, England, |
| suggested clever improvements for population count. Robert also wrote |
| highly optimized Karatsuba and 3-way Toom multiplication functions for |
| GMP 3, and contributed the ARM assembly code. |
| |
| Torsten Ekedahl of the Mathematical department of Stockholm |
| University provided significant inspiration during several phases of |
| the GMP development. His mathematical expertise helped improve several |
| algorithms. |
| |
| Linus Nordberg wrote the new configure system based on autoconf and |
| implemented the new random functions. |
| |
| Kevin Ryde worked on a large number of things: optimized x86 code, |
| m4 asm macros, parameter tuning, speed measuring, the configure system, |
| function inlining, divisibility tests, bit scanning, Jacobi symbols, |
| Fibonacci and Lucas number functions, printf and scanf functions, perl |
| interface, demo expression parser, the algorithms chapter in the |
| manual, `gmpasm-mode.el', and various miscellaneous improvements |
| elsewhere. |
| |
| Kent Boortz made the Mac OS 9 port. |
| |
| Steve Root helped write the optimized alpha 21264 assembly code. |
| |
| Gerardo Ballabio wrote the `gmpxx.h' C++ class interface and the C++ |
| `istream' input routines. |
| |
| Jason Moxham rewrote `mpz_fac_ui'. |
| |
| Pedro Gimeno implemented the Mersenne Twister and made other random |
| number improvements. |
| |
| Niels Mo"ller wrote the sub-quadratic GCD, extended GCD and jacobi |
| code, the quadratic Hensel division code, and (with Torbjo"rn) the new |
| divide and conquer division code for GMP 4.3. Niels also helped |
| implement the new Toom multiply code for GMP 4.3 and implemented helper |
| functions to simplify Toom evaluations for GMP 5.0. He wrote the |
| original version of mpn_mulmod_bnm1, and he is the main author of the |
| mini-gmp package used for gmp bootstrapping. |
| |
| Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply |
| strategy, and found the optimal strategies for evaluation and |
| interpolation in Toom multiplication. |
| |
| Marco Bodrato helped implement the new Toom multiply code for GMP |
| 4.3 and implemented most of the new Toom multiply and squaring code for |
| 5.0. He is the main author of the current mpn_mulmod_bnm1, |
| mpn_mullo_n, and mpn_sqrlo. Marco also wrote the functions mpn_invert |
| and mpn_invertappr, and improved the speed of integer root extraction. |
| He is the author of the current combinatorial functions: binomial, |
| factorial, multifactorial, primorial. |
| |
| David Harvey suggested the internal function `mpn_bdiv_dbm1', |
| implementing division relevant to Toom multiplication. He also worked |
| on fast assembly sequences, in particular on a fast AMD64 |
| `mpn_mul_basecase'. He wrote the internal middle product functions |
| `mpn_mulmid_basecase', `mpn_toom42_mulmid', `mpn_mulmid_n' and related |
| helper routines. |
| |
| Martin Boij wrote `mpn_perfect_power_p'. |
| |
| Marc Glisse improved `gmpxx.h': use fewer temporaries (faster), |
| specializations of `numeric_limits' and `common_type', C++11 features |
| (move constructors, explicit bool conversion, UDL), make the conversion |
| from `mpq_class' to `mpz_class' explicit, optimize operations where one |
| argument is a small compile-time constant, replace some heap |
| allocations by stack allocations. He also fixed the eofbit handling of |
| C++ streams, and removed one division from `mpq/aors.c'. |
| |
| David S Miller wrote assembly code for SPARC T3 and T4. |
| |
| Mark Sofroniou cleaned up the types of mul_fft.c, letting it work |
| for huge operands. |
| |
| Ulrich Weigand ported GMP to the powerpc64le ABI. |
| |
| (This list is chronological, not ordered after significance. If you |
| have contributed to GMP but are not listed above, please tell |
| <gmp-devel@gmplib.org> about the omission!) |
| |
| The development of floating point functions of GNU MP 2, were |
| supported in part by the ESPRIT-BRA (Basic Research Activities) 6846 |
| project POSSO (POlynomial System SOlving). |
| |
| The development of GMP 2, 3, and 4.0 was supported in part by the |
| IDA Center for Computing Sciences. |
| |
| The development of GMP 4.3, 5.0, and 5.1 was supported in part by |
| the Swedish Foundation for Strategic Research. |
| |
| Thanks go to Hans Thorsen for donating an SGI system for the GMP |
| test system environment. |
| |
| |
| File: gmp.info, Node: References, Next: GNU Free Documentation License, Prev: Contributors, Up: Top |
| |
| Appendix B References |
| ********************* |
| |
| B.1 Books |
| ========= |
| |
| * Jonathan M. Borwein and Peter B. Borwein, "Pi and the AGM: A Study |
| in Analytic Number Theory and Computational Complexity", Wiley, |
| 1998. |
| |
| * Richard Crandall and Carl Pomerance, "Prime Numbers: A |
| Computational Perspective", 2nd edition, Springer-Verlag, 2005. |
| `http://www.math.dartmouth.edu/~carlp/' |
| |
| * Henri Cohen, "A Course in Computational Algebraic Number Theory", |
| Graduate Texts in Mathematics number 138, Springer-Verlag, 1993. |
| `http://www.math.u-bordeaux.fr/~cohen/' |
| |
| * Donald E. Knuth, "The Art of Computer Programming", volume 2, |
| "Seminumerical Algorithms", 3rd edition, Addison-Wesley, 1998. |
| `http://www-cs-faculty.stanford.edu/~knuth/taocp.html' |
| |
| * John D. Lipson, "Elements of Algebra and Algebraic Computing", The |
| Benjamin Cummings Publishing Company Inc, 1981. |
| |
| * Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, |
| "Handbook of Applied Cryptography", |
| `http://www.cacr.math.uwaterloo.ca/hac/' |
| |
| * Richard M. Stallman and the GCC Developer Community, "Using the |
| GNU Compiler Collection", Free Software Foundation, 2008, |
| available online `https://gcc.gnu.org/onlinedocs/', and in the GCC |
| package `https://ftp.gnu.org/gnu/gcc/' |
| |
| B.2 Papers |
| ========== |
| |
| * Yves Bertot, Nicolas Magaud and Paul Zimmermann, "A Proof of GMP |
| Square Root", Journal of Automated Reasoning, volume 29, 2002, pp. |
| 225-252. Also available online as INRIA Research Report 4475, |
| June 2002, `http://hal.inria.fr/docs/00/07/21/13/PDF/RR-4475.pdf' |
| |
| * Christoph Burnikel and Joachim Ziegler, "Fast Recursive Division", |
| Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022, |
| `http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022' |
| |
| * Torbjo"rn Granlund and Peter L. Montgomery, "Division by Invariant |
| Integers using Multiplication", in Proceedings of the SIGPLAN |
| PLDI'94 Conference, June 1994. Also available |
| `https://gmplib.org/~tege/divcnst-pldi94.pdf'. |
| |
| * Niels Mo"ller and Torbjo"rn Granlund, "Improved division by |
| invariant integers", IEEE Transactions on Computers, 11 June 2010. |
| `https://gmplib.org/~tege/division-paper.pdf' |
| |
| * Torbjo"rn Granlund and Niels Mo"ller, "Division of integers large |
| and small", to appear. |
| |
| * Tudor Jebelean, "An algorithm for exact division", Journal of |
| Symbolic Computation, volume 15, 1993, pp. 169-180. Research |
| report version available |
| `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz' |
| |
| * Tudor Jebelean, "Exact Division with Karatsuba Complexity - |
| Extended Abstract", RISC-Linz technical report 96-31, |
| `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz' |
| |
| * Tudor Jebelean, "Practical Integer Division with Karatsuba |
| Complexity", ISSAC 97, pp. 339-341. Technical report available |
| `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz' |
| |
| * Tudor Jebelean, "A Generalization of the Binary GCD Algorithm", |
| ISSAC 93, pp. 111-116. Technical report version available |
| `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz' |
| |
| * Tudor Jebelean, "A Double-Digit Lehmer-Euclid Algorithm for |
| Finding the GCD of Long Integers", Journal of Symbolic |
| Computation, volume 19, 1995, pp. 145-157. Technical report |
| version also available |
| `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz' |
| |
| * Werner Krandick and Tudor Jebelean, "Bidirectional Exact Integer |
| Division", Journal of Symbolic Computation, volume 21, 1996, pp. |
| 441-455. Early technical report version also available |
| `ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz' |
| |
| * Makoto Matsumoto and Takuji Nishimura, "Mersenne Twister: A |
| 623-dimensionally equidistributed uniform pseudorandom number |
| generator", ACM Transactions on Modelling and Computer Simulation, |
| volume 8, January 1998, pp. 3-30. Available online |
| `http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.ps.gz' |
| (or .pdf) |
| |
| * R. Moenck and A. Borodin, "Fast Modular Transforms via Division", |
| Proceedings of the 13th Annual IEEE Symposium on Switching and |
| Automata Theory, October 1972, pp. 90-96. Reprinted as "Fast |
| Modular Transforms", Journal of Computer and System Sciences, |
| volume 8, number 3, June 1974, pp. 366-386. |
| |
| * Niels Mo"ller, "On Scho"nhage's algorithm and subquadratic integer |
| GCD computation", in Mathematics of Computation, volume 77, |
| January 2008, pp. 589-607. |
| |
| * Peter L. Montgomery, "Modular Multiplication Without Trial |
| Division", in Mathematics of Computation, volume 44, number 170, |
| April 1985. |
| |
| * Arnold Scho"nhage and Volker Strassen, "Schnelle Multiplikation |
| grosser Zahlen", Computing 7, 1971, pp. 281-292. |
| |
| * Kenneth Weber, "The accelerated integer GCD algorithm", ACM |
| Transactions on Mathematical Software, volume 21, number 1, March |
| 1995, pp. 111-122. |
| |
| * Paul Zimmermann, "Karatsuba Square Root", INRIA Research Report |
| 3805, November 1999, |
| `http://hal.inria.fr/inria-00072854/PDF/RR-3805.pdf' |
| |
| * Paul Zimmermann, "A Proof of GMP Fast Division and Square Root |
| Implementations", |
| `http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz' |
| |
| * Dan Zuras, "On Squaring and Multiplying Large Integers", ARITH-11: |
| IEEE Symposium on Computer Arithmetic, 1993, pp. 260 to 271. |
| Reprinted as "More on Multiplying and Squaring Large Integers", |
| IEEE Transactions on Computers, volume 43, number 8, August 1994, |
| pp. 899-908. |
| |
| |
| File: gmp.info, Node: GNU Free Documentation License, Next: Concept Index, Prev: References, Up: Top |
| |
| Appendix C GNU Free Documentation License |
| ***************************************** |
| |
| Version 1.3, 3 November 2008 |
| |
| Copyright (C) 2000-2002, 2007, 2008 Free Software Foundation, Inc. |
| `http://fsf.org/' |
| |
| Everyone is permitted to copy and distribute verbatim copies |
| of this license document, but changing it is not allowed. |
| |
| 0. PREAMBLE |
| |
| The purpose of this License is to make a manual, textbook, or other |
| functional and useful document "free" in the sense of freedom: to |
| assure everyone the effective freedom to copy and redistribute it, |
| with or without modifying it, either commercially or |
| noncommercially. Secondarily, this License preserves for the |
| author and publisher a way to get credit for their work, while not |
| being considered responsible for modifications made by others. |
| |
| This License is a kind of "copyleft", which means that derivative |
| works of the document must themselves be free in the same sense. |
| It complements the GNU General Public License, which is a copyleft |
| license designed for free software. |
| |
| We have designed this License in order to use it for manuals for |
| free software, because free software needs free documentation: a |
| free program should come with manuals providing the same freedoms |
| that the software does. But this License is not limited to |
| software manuals; it can be used for any textual work, regardless |
| of subject matter or whether it is published as a printed book. |
| We recommend this License principally for works whose purpose is |
| instruction or reference. |
| |
| 1. APPLICABILITY AND DEFINITIONS |
| |
| This License applies to any manual or other work, in any medium, |
| that contains a notice placed by the copyright holder saying it |
| can be distributed under the terms of this License. Such a notice |
| grants a world-wide, royalty-free license, unlimited in duration, |
| to use that work under the conditions stated herein. The |
| "Document", below, refers to any such manual or work. Any member |
| of the public is a licensee, and is addressed as "you". You |
| accept the license if you copy, modify or distribute the work in a |
| way requiring permission under copyright law. |
| |
| A "Modified Version" of the Document means any work containing the |
| Document or a portion of it, either copied verbatim, or with |
| modifications and/or translated into another language. |
| |
| A "Secondary Section" is a named appendix or a front-matter section |
| of the Document that deals exclusively with the relationship of the |
| publishers or authors of the Document to the Document's overall |
| subject (or to related matters) and contains nothing that could |
| fall directly within that overall subject. (Thus, if the Document |
| is in part a textbook of mathematics, a Secondary Section may not |
| explain any mathematics.) The relationship could be a matter of |
| historical connection with the subject or with related matters, or |
| of legal, commercial, philosophical, ethical or political position |
| regarding them. |
| |
| The "Invariant Sections" are certain Secondary Sections whose |
| titles are designated, as being those of Invariant Sections, in |
| the notice that says that the Document is released under this |
| License. If a section does not fit the above definition of |
| Secondary then it is not allowed to be designated as Invariant. |
| The Document may contain zero Invariant Sections. If the Document |
| does not identify any Invariant Sections then there are none. |
| |
| The "Cover Texts" are certain short passages of text that are |
| listed, as Front-Cover Texts or Back-Cover Texts, in the notice |
| that says that the Document is released under this License. A |
| Front-Cover Text may be at most 5 words, and a Back-Cover Text may |
| be at most 25 words. |
| |
| A "Transparent" copy of the Document means a machine-readable copy, |
| represented in a format whose specification is available to the |
| general public, that is suitable for revising the document |
| straightforwardly with generic text editors or (for images |
| composed of pixels) generic paint programs or (for drawings) some |
| widely available drawing editor, and that is suitable for input to |
| text formatters or for automatic translation to a variety of |
| formats suitable for input to text formatters. A copy made in an |
| otherwise Transparent file format whose markup, or absence of |
| markup, has been arranged to thwart or discourage subsequent |
| modification by readers is not Transparent. An image format is |
| not Transparent if used for any substantial amount of text. A |
| copy that is not "Transparent" is called "Opaque". |
| |
| Examples of suitable formats for Transparent copies include plain |
| ASCII without markup, Texinfo input format, LaTeX input format, |
| SGML or XML using a publicly available DTD, and |
| standard-conforming simple HTML, PostScript or PDF designed for |
| human modification. Examples of transparent image formats include |
| PNG, XCF and JPG. Opaque formats include proprietary formats that |
| can be read and edited only by proprietary word processors, SGML or |
| XML for which the DTD and/or processing tools are not generally |
| available, and the machine-generated HTML, PostScript or PDF |
| produced by some word processors for output purposes only. |
| |
| The "Title Page" means, for a printed book, the title page itself, |
| plus such following pages as are needed to hold, legibly, the |
| material this License requires to appear in the title page. For |
| works in formats which do not have any title page as such, "Title |
| Page" means the text near the most prominent appearance of the |
| work's title, preceding the beginning of the body of the text. |
| |
| The "publisher" means any person or entity that distributes copies |
| of the Document to the public. |
| |
| A section "Entitled XYZ" means a named subunit of the Document |
| whose title either is precisely XYZ or contains XYZ in parentheses |
| following text that translates XYZ in another language. (Here XYZ |
| stands for a specific section name mentioned below, such as |
| "Acknowledgements", "Dedications", "Endorsements", or "History".) |
| To "Preserve the Title" of such a section when you modify the |
| Document means that it remains a section "Entitled XYZ" according |
| to this definition. |
| |
| The Document may include Warranty Disclaimers next to the notice |
| which states that this License applies to the Document. These |
| Warranty Disclaimers are considered to be included by reference in |
| this License, but only as regards disclaiming warranties: any other |
| implication that these Warranty Disclaimers may have is void and |
| has no effect on the meaning of this License. |
| |
| 2. VERBATIM COPYING |
| |
| You may copy and distribute the Document in any medium, either |
| commercially or noncommercially, provided that this License, the |
| copyright notices, and the license notice saying this License |
| applies to the Document are reproduced in all copies, and that you |
| add no other conditions whatsoever to those of this License. You |
| may not use technical measures to obstruct or control the reading |
| or further copying of the copies you make or distribute. However, |
| you may accept compensation in exchange for copies. If you |
| distribute a large enough number of copies you must also follow |
| the conditions in section 3. |
| |
| You may also lend copies, under the same conditions stated above, |
| and you may publicly display copies. |
| |
| 3. COPYING IN QUANTITY |
| |
| If you publish printed copies (or copies in media that commonly |
| have printed covers) of the Document, numbering more than 100, and |
| the Document's license notice requires Cover Texts, you must |
| enclose the copies in covers that carry, clearly and legibly, all |
| these Cover Texts: Front-Cover Texts on the front cover, and |
| Back-Cover Texts on the back cover. Both covers must also clearly |
| and legibly identify you as the publisher of these copies. The |
| front cover must present the full title with all words of the |
| title equally prominent and visible. You may add other material |
| on the covers in addition. Copying with changes limited to the |
| covers, as long as they preserve the title of the Document and |
| satisfy these conditions, can be treated as verbatim copying in |
| other respects. |
| |
| If the required texts for either cover are too voluminous to fit |
| legibly, you should put the first ones listed (as many as fit |
| reasonably) on the actual cover, and continue the rest onto |
| adjacent pages. |
| |
| If you publish or distribute Opaque copies of the Document |
| numbering more than 100, you must either include a |
| machine-readable Transparent copy along with each Opaque copy, or |
| state in or with each Opaque copy a computer-network location from |
| which the general network-using public has access to download |
| using public-standard network protocols a complete Transparent |
| copy of the Document, free of added material. If you use the |
| latter option, you must take reasonably prudent steps, when you |
| begin distribution of Opaque copies in quantity, to ensure that |
| this Transparent copy will remain thus accessible at the stated |
| location until at least one year after the last time you |
| distribute an Opaque copy (directly or through your agents or |
| retailers) of that edition to the public. |
| |
| It is requested, but not required, that you contact the authors of |
| the Document well before redistributing any large number of |
| copies, to give them a chance to provide you with an updated |
| version of the Document. |
| |
| 4. MODIFICATIONS |
| |
| You may copy and distribute a Modified Version of the Document |
| under the conditions of sections 2 and 3 above, provided that you |
| release the Modified Version under precisely this License, with |
| the Modified Version filling the role of the Document, thus |
| licensing distribution and modification of the Modified Version to |
| whoever possesses a copy of it. In addition, you must do these |
| things in the Modified Version: |
| |
| A. Use in the Title Page (and on the covers, if any) a title |
| distinct from that of the Document, and from those of |
| previous versions (which should, if there were any, be listed |
| in the History section of the Document). You may use the |
| same title as a previous version if the original publisher of |
| that version gives permission. |
| |
| B. List on the Title Page, as authors, one or more persons or |
| entities responsible for authorship of the modifications in |
| the Modified Version, together with at least five of the |
| principal authors of the Document (all of its principal |
| authors, if it has fewer than five), unless they release you |
| from this requirement. |
| |
| C. State on the Title page the name of the publisher of the |
| Modified Version, as the publisher. |
| |
| D. Preserve all the copyright notices of the Document. |
| |
| E. Add an appropriate copyright notice for your modifications |
| adjacent to the other copyright notices. |
| |
| F. Include, immediately after the copyright notices, a license |
| notice giving the public permission to use the Modified |
| Version under the terms of this License, in the form shown in |
| the Addendum below. |
| |
| G. Preserve in that license notice the full lists of Invariant |
| Sections and required Cover Texts given in the Document's |
| license notice. |
| |
| H. Include an unaltered copy of this License. |
| |
| I. Preserve the section Entitled "History", Preserve its Title, |
| and add to it an item stating at least the title, year, new |
| authors, and publisher of the Modified Version as given on |
| the Title Page. If there is no section Entitled "History" in |
| the Document, create one stating the title, year, authors, |
| and publisher of the Document as given on its Title Page, |
| then add an item describing the Modified Version as stated in |
| the previous sentence. |
| |
| J. Preserve the network location, if any, given in the Document |
| for public access to a Transparent copy of the Document, and |
| likewise the network locations given in the Document for |
| previous versions it was based on. These may be placed in |
| the "History" section. You may omit a network location for a |
| work that was published at least four years before the |
| Document itself, or if the original publisher of the version |
| it refers to gives permission. |
| |
| K. For any section Entitled "Acknowledgements" or "Dedications", |
| Preserve the Title of the section, and preserve in the |
| section all the substance and tone of each of the contributor |
| acknowledgements and/or dedications given therein. |
| |
| L. Preserve all the Invariant Sections of the Document, |
| unaltered in their text and in their titles. Section numbers |
| or the equivalent are not considered part of the section |
| titles. |
| |
| M. Delete any section Entitled "Endorsements". Such a section |
| may not be included in the Modified Version. |
| |
| N. Do not retitle any existing section to be Entitled |
| "Endorsements" or to conflict in title with any Invariant |
| Section. |
| |
| O. Preserve any Warranty Disclaimers. |
| |
| If the Modified Version includes new front-matter sections or |
| appendices that qualify as Secondary Sections and contain no |
| material copied from the Document, you may at your option |
| designate some or all of these sections as invariant. To do this, |
| add their titles to the list of Invariant Sections in the Modified |
| Version's license notice. These titles must be distinct from any |
| other section titles. |
| |
| You may add a section Entitled "Endorsements", provided it contains |
| nothing but endorsements of your Modified Version by various |
| parties--for example, statements of peer review or that the text |
| has been approved by an organization as the authoritative |
| definition of a standard. |
| |
| You may add a passage of up to five words as a Front-Cover Text, |
| and a passage of up to 25 words as a Back-Cover Text, to the end |
| of the list of Cover Texts in the Modified Version. Only one |
| passage of Front-Cover Text and one of Back-Cover Text may be |
| added by (or through arrangements made by) any one entity. If the |
| Document already includes a cover text for the same cover, |
| previously added by you or by arrangement made by the same entity |
| you are acting on behalf of, you may not add another; but you may |
| replace the old one, on explicit permission from the previous |
| publisher that added the old one. |
| |
| The author(s) and publisher(s) of the Document do not by this |
| License give permission to use their names for publicity for or to |
| assert or imply endorsement of any Modified Version. |
| |
| 5. COMBINING DOCUMENTS |
| |
| You may combine the Document with other documents released under |
| this License, under the terms defined in section 4 above for |
| modified versions, provided that you include in the combination |
| all of the Invariant Sections of all of the original documents, |
| unmodified, and list them all as Invariant Sections of your |
| combined work in its license notice, and that you preserve all |
| their Warranty Disclaimers. |
| |
| The combined work need only contain one copy of this License, and |
| multiple identical Invariant Sections may be replaced with a single |
| copy. If there are multiple Invariant Sections with the same name |
| but different contents, make the title of each such section unique |
| by adding at the end of it, in parentheses, the name of the |
| original author or publisher of that section if known, or else a |
| unique number. Make the same adjustment to the section titles in |
| the list of Invariant Sections in the license notice of the |
| combined work. |
| |
| In the combination, you must combine any sections Entitled |
| "History" in the various original documents, forming one section |
| Entitled "History"; likewise combine any sections Entitled |
| "Acknowledgements", and any sections Entitled "Dedications". You |
| must delete all sections Entitled "Endorsements." |
| |
| 6. COLLECTIONS OF DOCUMENTS |
| |
| You may make a collection consisting of the Document and other |
| documents released under this License, and replace the individual |
| copies of this License in the various documents with a single copy |
| that is included in the collection, provided that you follow the |
| rules of this License for verbatim copying of each of the |
| documents in all other respects. |
| |
| You may extract a single document from such a collection, and |
| distribute it individually under this License, provided you insert |
| a copy of this License into the extracted document, and follow |
| this License in all other respects regarding verbatim copying of |
| that document. |
| |
| 7. AGGREGATION WITH INDEPENDENT WORKS |
| |
| A compilation of the Document or its derivatives with other |
| separate and independent documents or works, in or on a volume of |
| a storage or distribution medium, is called an "aggregate" if the |
| copyright resulting from the compilation is not used to limit the |
| legal rights of the compilation's users beyond what the individual |
| works permit. When the Document is included in an aggregate, this |
| License does not apply to the other works in the aggregate which |
| are not themselves derivative works of the Document. |
| |
| If the Cover Text requirement of section 3 is applicable to these |
| copies of the Document, then if the Document is less than one half |
| of the entire aggregate, the Document's Cover Texts may be placed |
| on covers that bracket the Document within the aggregate, or the |
| electronic equivalent of covers if the Document is in electronic |
| form. Otherwise they must appear on printed covers that bracket |
| the whole aggregate. |
| |
| 8. TRANSLATION |
| |
| Translation is considered a kind of modification, so you may |
| distribute translations of the Document under the terms of section |
| 4. Replacing Invariant Sections with translations requires special |
| permission from their copyright holders, but you may include |
| translations of some or all Invariant Sections in addition to the |
| original versions of these Invariant Sections. You may include a |
| translation of this License, and all the license notices in the |
| Document, and any Warranty Disclaimers, provided that you also |
| include the original English version of this License and the |
| original versions of those notices and disclaimers. In case of a |
| disagreement between the translation and the original version of |
| this License or a notice or disclaimer, the original version will |
| prevail. |
| |
| If a section in the Document is Entitled "Acknowledgements", |
| "Dedications", or "History", the requirement (section 4) to |
| Preserve its Title (section 1) will typically require changing the |
| actual title. |
| |
| 9. TERMINATION |
| |
| You may not copy, modify, sublicense, or distribute the Document |
| except as expressly provided under this License. Any attempt |
| otherwise to copy, modify, sublicense, or distribute it is void, |
| and will automatically terminate your rights under this License. |
| |
| However, if you cease all violation of this License, then your |
| license from a particular copyright holder is reinstated (a) |
| provisionally, unless and until the copyright holder explicitly |
| and finally terminates your license, and (b) permanently, if the |
| copyright holder fails to notify you of the violation by some |
| reasonable means prior to 60 days after the cessation. |
| |
| Moreover, your license from a particular copyright holder is |
| reinstated permanently if the copyright holder notifies you of the |
| violation by some reasonable means, this is the first time you have |
| received notice of violation of this License (for any work) from |
| that copyright holder, and you cure the violation prior to 30 days |
| after your receipt of the notice. |
| |
| Termination of your rights under this section does not terminate |
| the licenses of parties who have received copies or rights from |
| you under this License. If your rights have been terminated and |
| not permanently reinstated, receipt of a copy of some or all of |
| the same material does not give you any rights to use it. |
| |
| 10. FUTURE REVISIONS OF THIS LICENSE |
| |
| The Free Software Foundation may publish new, revised versions of |
| the GNU Free Documentation License from time to time. Such new |
| versions will be similar in spirit to the present version, but may |
| differ in detail to address new problems or concerns. See |
| `https://www.gnu.org/copyleft/'. |
| |
| Each version of the License is given a distinguishing version |
| number. If the Document specifies that a particular numbered |
| version of this License "or any later version" applies to it, you |
| have the option of following the terms and conditions either of |
| that specified version or of any later version that has been |
| published (not as a draft) by the Free Software Foundation. If |
| the Document does not specify a version number of this License, |
| you may choose any version ever published (not as a draft) by the |
| Free Software Foundation. If the Document specifies that a proxy |
| can decide which future versions of this License can be used, that |
| proxy's public statement of acceptance of a version permanently |
| authorizes you to choose that version for the Document. |
| |
| 11. RELICENSING |
| |
| "Massive Multiauthor Collaboration Site" (or "MMC Site") means any |
| World Wide Web server that publishes copyrightable works and also |
| provides prominent facilities for anybody to edit those works. A |
| public wiki that anybody can edit is an example of such a server. |
| A "Massive Multiauthor Collaboration" (or "MMC") contained in the |
| site means any set of copyrightable works thus published on the MMC |
| site. |
| |
| "CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0 |
| license published by Creative Commons Corporation, a not-for-profit |
| corporation with a principal place of business in San Francisco, |
| California, as well as future copyleft versions of that license |
| published by that same organization. |
| |
| "Incorporate" means to publish or republish a Document, in whole or |
| in part, as part of another Document. |
| |
| An MMC is "eligible for relicensing" if it is licensed under this |
| License, and if all works that were first published under this |
| License somewhere other than this MMC, and subsequently |
| incorporated in whole or in part into the MMC, (1) had no cover |
| texts or invariant sections, and (2) were thus incorporated prior |
| to November 1, 2008. |
| |
| The operator of an MMC Site may republish an MMC contained in the |
| site under CC-BY-SA on the same site at any time before August 1, |
| 2009, provided the MMC is eligible for relicensing. |
| |
| |
| ADDENDUM: How to use this License for your documents |
| ==================================================== |
| |
| To use this License in a document you have written, include a copy of |
| the License in the document and put the following copyright and license |
| notices just after the title page: |
| |
| Copyright (C) YEAR YOUR NAME. |
| Permission is granted to copy, distribute and/or modify this document |
| under the terms of the GNU Free Documentation License, Version 1.3 |
| or any later version published by the Free Software Foundation; |
| with no Invariant Sections, no Front-Cover Texts, and no Back-Cover |
| Texts. A copy of the license is included in the section entitled ``GNU |
| Free Documentation License''. |
| |
| If you have Invariant Sections, Front-Cover Texts and Back-Cover |
| Texts, replace the "with...Texts." line with this: |
| |
| with the Invariant Sections being LIST THEIR TITLES, with |
| the Front-Cover Texts being LIST, and with the Back-Cover Texts |
| being LIST. |
| |
| If you have Invariant Sections without Cover Texts, or some other |
| combination of the three, merge those two alternatives to suit the |
| situation. |
| |
| If your document contains nontrivial examples of program code, we |
| recommend releasing these examples in parallel under your choice of |
| free software license, such as the GNU General Public License, to |
| permit their use in free software. |
| |
| |
| File: gmp.info, Node: Concept Index, Next: Function Index, Prev: GNU Free Documentation License, Up: Top |
| |
| Concept Index |
| ************* |
| |
| [index] |
| * Menu: |
| |
| * #include: Headers and Libraries. |
| (line 6) |
| * --build: Build Options. (line 52) |
| * --disable-fft: Build Options. (line 313) |
| * --disable-shared: Build Options. (line 45) |
| * --disable-static: Build Options. (line 45) |
| * --enable-alloca: Build Options. (line 274) |
| * --enable-assert: Build Options. (line 319) |
| * --enable-cxx: Build Options. (line 226) |
| * --enable-fat: Build Options. (line 161) |
| * --enable-profiling <1>: Build Options. (line 323) |
| * --enable-profiling: Profiling. (line 6) |
| * --exec-prefix: Build Options. (line 32) |
| * --host: Build Options. (line 66) |
| * --prefix: Build Options. (line 32) |
| * -finstrument-functions: Profiling. (line 66) |
| * 2exp functions: Efficiency. (line 43) |
| * 68000: Notes for Particular Systems. |
| (line 94) |
| * 80x86: Notes for Particular Systems. |
| (line 150) |
| * ABI <1>: Build Options. (line 168) |
| * ABI: ABI and ISA. (line 6) |
| * About this manual: Introduction to GMP. (line 57) |
| * AC_CHECK_LIB: Autoconf. (line 11) |
| * AIX <1>: Notes for Particular Systems. |
| (line 7) |
| * AIX: ABI and ISA. (line 178) |
| * Algorithms: Algorithms. (line 6) |
| * alloca: Build Options. (line 274) |
| * Allocation of memory: Custom Allocation. (line 6) |
| * AMD64: ABI and ISA. (line 44) |
| * Anonymous FTP of latest version: Introduction to GMP. (line 37) |
| * Application Binary Interface: ABI and ISA. (line 6) |
| * Arithmetic functions <1>: Rational Arithmetic. (line 6) |
| * Arithmetic functions <2>: Float Arithmetic. (line 6) |
| * Arithmetic functions: Integer Arithmetic. (line 6) |
| * ARM: Notes for Particular Systems. |
| (line 20) |
| * Assembly cache handling: Assembly Cache Handling. |
| (line 6) |
| * Assembly carry propagation: Assembly Carry Propagation. |
| (line 6) |
| * Assembly code organisation: Assembly Code Organisation. |
| (line 6) |
| * Assembly coding: Assembly Coding. (line 6) |
| * Assembly floating Point: Assembly Floating Point. |
| (line 6) |
| * Assembly loop unrolling: Assembly Loop Unrolling. |
| (line 6) |
| * Assembly SIMD: Assembly SIMD Instructions. |
| (line 6) |
| * Assembly software pipelining: Assembly Software Pipelining. |
| (line 6) |
| * Assembly writing guide: Assembly Writing Guide. |
| (line 6) |
| * Assertion checking <1>: Debugging. (line 79) |
| * Assertion checking: Build Options. (line 319) |
| * Assignment functions <1>: Assigning Integers. (line 6) |
| * Assignment functions <2>: Initializing Rationals. |
| (line 6) |
| * Assignment functions <3>: Simultaneous Float Init & Assign. |
| (line 6) |
| * Assignment functions <4>: Simultaneous Integer Init & Assign. |
| (line 6) |
| * Assignment functions: Assigning Floats. (line 6) |
| * Autoconf: Autoconf. (line 6) |
| * Basics: GMP Basics. (line 6) |
| * Binomial coefficient algorithm: Binomial Coefficients Algorithm. |
| (line 6) |
| * Binomial coefficient functions: Number Theoretic Functions. |
| (line 124) |
| * Binutils strip: Known Build Problems. |
| (line 28) |
| * Bit manipulation functions: Integer Logic and Bit Fiddling. |
| (line 6) |
| * Bit scanning functions: Integer Logic and Bit Fiddling. |
| (line 40) |
| * Bit shift left: Integer Arithmetic. (line 38) |
| * Bit shift right: Integer Division. (line 62) |
| * Bits per limb: Useful Macros and Constants. |
| (line 7) |
| * Bug reporting: Reporting Bugs. (line 6) |
| * Build directory: Build Options. (line 19) |
| * Build notes for binary packaging: Notes for Package Builds. |
| (line 6) |
| * Build notes for particular systems: Notes for Particular Systems. |
| (line 6) |
| * Build options: Build Options. (line 6) |
| * Build problems known: Known Build Problems. |
| (line 6) |
| * Build system: Build Options. (line 52) |
| * Building GMP: Installing GMP. (line 6) |
| * Bus error: Debugging. (line 7) |
| * C compiler: Build Options. (line 179) |
| * C++ compiler: Build Options. (line 250) |
| * C++ interface: C++ Class Interface. (line 6) |
| * C++ interface internals: C++ Interface Internals. |
| (line 6) |
| * C++ istream input: C++ Formatted Input. (line 6) |
| * C++ ostream output: C++ Formatted Output. |
| (line 6) |
| * C++ support: Build Options. (line 226) |
| * CC: Build Options. (line 179) |
| * CC_FOR_BUILD: Build Options. (line 213) |
| * CFLAGS: Build Options. (line 179) |
| * Checker: Debugging. (line 115) |
| * checkergcc: Debugging. (line 122) |
| * Code organisation: Assembly Code Organisation. |
| (line 6) |
| * Compaq C++: Notes for Particular Systems. |
| (line 25) |
| * Comparison functions <1>: Integer Comparisons. (line 6) |
| * Comparison functions <2>: Float Comparison. (line 6) |
| * Comparison functions: Comparing Rationals. (line 6) |
| * Compatibility with older versions: Compatibility with older versions. |
| (line 6) |
| * Conditions for copying GNU MP: Copying. (line 6) |
| * Configuring GMP: Installing GMP. (line 6) |
| * Congruence algorithm: Exact Remainder. (line 30) |
| * Congruence functions: Integer Division. (line 137) |
| * Constants: Useful Macros and Constants. |
| (line 6) |
| * Contributors: Contributors. (line 6) |
| * Conventions for parameters: Parameter Conventions. |
| (line 6) |
| * Conventions for variables: Variable Conventions. |
| (line 6) |
| * Conversion functions <1>: Converting Integers. (line 6) |
| * Conversion functions <2>: Converting Floats. (line 6) |
| * Conversion functions: Rational Conversions. |
| (line 6) |
| * Copying conditions: Copying. (line 6) |
| * CPPFLAGS: Build Options. (line 205) |
| * CPU types <1>: Introduction to GMP. (line 24) |
| * CPU types: Build Options. (line 108) |
| * Cross compiling: Build Options. (line 66) |
| * Cryptography functions, low-level: Low-level Functions. (line 507) |
| * Custom allocation: Custom Allocation. (line 6) |
| * CXX: Build Options. (line 250) |
| * CXXFLAGS: Build Options. (line 250) |
| * Cygwin: Notes for Particular Systems. |
| (line 57) |
| * Darwin: Known Build Problems. |
| (line 51) |
| * Debugging: Debugging. (line 6) |
| * Demonstration programs: Demonstration Programs. |
| (line 6) |
| * Digits in an integer: Miscellaneous Integer Functions. |
| (line 23) |
| * Divisibility algorithm: Exact Remainder. (line 30) |
| * Divisibility functions: Integer Division. (line 137) |
| * Divisibility testing: Efficiency. (line 91) |
| * Division algorithms: Division Algorithms. (line 6) |
| * Division functions <1>: Rational Arithmetic. (line 24) |
| * Division functions <2>: Integer Division. (line 6) |
| * Division functions: Float Arithmetic. (line 33) |
| * DJGPP <1>: Notes for Particular Systems. |
| (line 57) |
| * DJGPP: Known Build Problems. |
| (line 18) |
| * DLLs: Notes for Particular Systems. |
| (line 70) |
| * DocBook: Build Options. (line 346) |
| * Documentation formats: Build Options. (line 339) |
| * Documentation license: GNU Free Documentation License. |
| (line 6) |
| * DVI: Build Options. (line 342) |
| * Efficiency: Efficiency. (line 6) |
| * Emacs: Emacs. (line 6) |
| * Exact division functions: Integer Division. (line 112) |
| * Exact remainder: Exact Remainder. (line 6) |
| * Example programs: Demonstration Programs. |
| (line 6) |
| * Exec prefix: Build Options. (line 32) |
| * Execution profiling <1>: Build Options. (line 323) |
| * Execution profiling: Profiling. (line 6) |
| * Exponentiation functions <1>: Float Arithmetic. (line 41) |
| * Exponentiation functions: Integer Exponentiation. |
| (line 6) |
| * Export: Integer Import and Export. |
| (line 45) |
| * Expression parsing demo: Demonstration Programs. |
| (line 15) |
| * Extended GCD: Number Theoretic Functions. |
| (line 43) |
| * Factor removal functions: Number Theoretic Functions. |
| (line 104) |
| * Factorial algorithm: Factorial Algorithm. (line 6) |
| * Factorial functions: Number Theoretic Functions. |
| (line 112) |
| * Factorization demo: Demonstration Programs. |
| (line 25) |
| * Fast Fourier Transform: FFT Multiplication. (line 6) |
| * Fat binary: Build Options. (line 161) |
| * FFT multiplication <1>: FFT Multiplication. (line 6) |
| * FFT multiplication: Build Options. (line 313) |
| * Fibonacci number algorithm: Fibonacci Numbers Algorithm. |
| (line 6) |
| * Fibonacci sequence functions: Number Theoretic Functions. |
| (line 132) |
| * Float arithmetic functions: Float Arithmetic. (line 6) |
| * Float assignment functions <1>: Simultaneous Float Init & Assign. |
| (line 6) |
| * Float assignment functions: Assigning Floats. (line 6) |
| * Float comparison functions: Float Comparison. (line 6) |
| * Float conversion functions: Converting Floats. (line 6) |
| * Float functions: Floating-point Functions. |
| (line 6) |
| * Float initialization functions <1>: Simultaneous Float Init & Assign. |
| (line 6) |
| * Float initialization functions: Initializing Floats. (line 6) |
| * Float input and output functions: I/O of Floats. (line 6) |
| * Float internals: Float Internals. (line 6) |
| * Float miscellaneous functions: Miscellaneous Float Functions. |
| (line 6) |
| * Float random number functions: Miscellaneous Float Functions. |
| (line 27) |
| * Float rounding functions: Miscellaneous Float Functions. |
| (line 9) |
| * Float sign tests: Float Comparison. (line 34) |
| * Floating point mode: Notes for Particular Systems. |
| (line 34) |
| * Floating-point functions: Floating-point Functions. |
| (line 6) |
| * Floating-point number: Nomenclature and Types. |
| (line 21) |
| * fnccheck: Profiling. (line 77) |
| * Formatted input: Formatted Input. (line 6) |
| * Formatted output: Formatted Output. (line 6) |
| * Free Documentation License: GNU Free Documentation License. |
| (line 6) |
| * FreeBSD: Notes for Particular Systems. |
| (line 43) |
| * frexp <1>: Converting Integers. (line 43) |
| * frexp: Converting Floats. (line 24) |
| * FTP of latest version: Introduction to GMP. (line 37) |
| * Function classes: Function Classes. (line 6) |
| * FunctionCheck: Profiling. (line 77) |
| * GCC Checker: Debugging. (line 115) |
| * GCD algorithms: Greatest Common Divisor Algorithms. |
| (line 6) |
| * GCD extended: Number Theoretic Functions. |
| (line 43) |
| * GCD functions: Number Theoretic Functions. |
| (line 26) |
| * GDB: Debugging. (line 58) |
| * Generic C: Build Options. (line 152) |
| * GMP Perl module: Demonstration Programs. |
| (line 35) |
| * GMP version number: Useful Macros and Constants. |
| (line 12) |
| * gmp.h: Headers and Libraries. |
| (line 6) |
| * gmpxx.h: C++ Interface General. |
| (line 8) |
| * GNU Debugger: Debugging. (line 58) |
| * GNU Free Documentation License: GNU Free Documentation License. |
| (line 6) |
| * GNU strip: Known Build Problems. |
| (line 28) |
| * gprof: Profiling. (line 41) |
| * Greatest common divisor algorithms: Greatest Common Divisor Algorithms. |
| (line 6) |
| * Greatest common divisor functions: Number Theoretic Functions. |
| (line 26) |
| * Hardware floating point mode: Notes for Particular Systems. |
| (line 34) |
| * Headers: Headers and Libraries. |
| (line 6) |
| * Heap problems: Debugging. (line 24) |
| * Home page: Introduction to GMP. (line 33) |
| * Host system: Build Options. (line 66) |
| * HP-UX: ABI and ISA. (line 77) |
| * HPPA: ABI and ISA. (line 77) |
| * I/O functions <1>: I/O of Integers. (line 6) |
| * I/O functions <2>: I/O of Floats. (line 6) |
| * I/O functions: I/O of Rationals. (line 6) |
| * i386: Notes for Particular Systems. |
| (line 150) |
| * IA-64: ABI and ISA. (line 116) |
| * Import: Integer Import and Export. |
| (line 11) |
| * In-place operations: Efficiency. (line 57) |
| * Include files: Headers and Libraries. |
| (line 6) |
| * info-lookup-symbol: Emacs. (line 6) |
| * Initialization functions <1>: Initializing Floats. (line 6) |
| * Initialization functions <2>: Random State Initialization. |
| (line 6) |
| * Initialization functions <3>: Simultaneous Float Init & Assign. |
| (line 6) |
| * Initialization functions <4>: Simultaneous Integer Init & Assign. |
| (line 6) |
| * Initialization functions <5>: Initializing Rationals. |
| (line 6) |
| * Initialization functions: Initializing Integers. |
| (line 6) |
| * Initializing and clearing: Efficiency. (line 21) |
| * Input functions <1>: Formatted Input Functions. |
| (line 6) |
| * Input functions <2>: I/O of Rationals. (line 6) |
| * Input functions <3>: I/O of Floats. (line 6) |
| * Input functions: I/O of Integers. (line 6) |
| * Install prefix: Build Options. (line 32) |
| * Installing GMP: Installing GMP. (line 6) |
| * Instruction Set Architecture: ABI and ISA. (line 6) |
| * instrument-functions: Profiling. (line 66) |
| * Integer: Nomenclature and Types. |
| (line 6) |
| * Integer arithmetic functions: Integer Arithmetic. (line 6) |
| * Integer assignment functions <1>: Assigning Integers. (line 6) |
| * Integer assignment functions: Simultaneous Integer Init & Assign. |
| (line 6) |
| * Integer bit manipulation functions: Integer Logic and Bit Fiddling. |
| (line 6) |
| * Integer comparison functions: Integer Comparisons. (line 6) |
| * Integer conversion functions: Converting Integers. (line 6) |
| * Integer division functions: Integer Division. (line 6) |
| * Integer exponentiation functions: Integer Exponentiation. |
| (line 6) |
| * Integer export: Integer Import and Export. |
| (line 45) |
| * Integer functions: Integer Functions. (line 6) |
| * Integer import: Integer Import and Export. |
| (line 11) |
| * Integer initialization functions <1>: Initializing Integers. |
| (line 6) |
| * Integer initialization functions: Simultaneous Integer Init & Assign. |
| (line 6) |
| * Integer input and output functions: I/O of Integers. (line 6) |
| * Integer internals: Integer Internals. (line 6) |
| * Integer logical functions: Integer Logic and Bit Fiddling. |
| (line 6) |
| * Integer miscellaneous functions: Miscellaneous Integer Functions. |
| (line 6) |
| * Integer random number functions: Integer Random Numbers. |
| (line 6) |
| * Integer root functions: Integer Roots. (line 6) |
| * Integer sign tests: Integer Comparisons. (line 28) |
| * Integer special functions: Integer Special Functions. |
| (line 6) |
| * Interix: Notes for Particular Systems. |
| (line 65) |
| * Internals: Internals. (line 6) |
| * Introduction: Introduction to GMP. (line 6) |
| * Inverse modulo functions: Number Theoretic Functions. |
| (line 70) |
| * IRIX <1>: ABI and ISA. (line 141) |
| * IRIX: Known Build Problems. |
| (line 38) |
| * ISA: ABI and ISA. (line 6) |
| * istream input: C++ Formatted Input. (line 6) |
| * Jacobi symbol algorithm: Jacobi Symbol. (line 6) |
| * Jacobi symbol functions: Number Theoretic Functions. |
| (line 79) |
| * Karatsuba multiplication: Karatsuba Multiplication. |
| (line 6) |
| * Karatsuba square root algorithm: Square Root Algorithm. |
| (line 6) |
| * Kronecker symbol functions: Number Theoretic Functions. |
| (line 91) |
| * Language bindings: Language Bindings. (line 6) |
| * Latest version of GMP: Introduction to GMP. (line 37) |
| * LCM functions: Number Theoretic Functions. |
| (line 64) |
| * Least common multiple functions: Number Theoretic Functions. |
| (line 64) |
| * Legendre symbol functions: Number Theoretic Functions. |
| (line 82) |
| * libgmp: Headers and Libraries. |
| (line 22) |
| * libgmpxx: Headers and Libraries. |
| (line 27) |
| * Libraries: Headers and Libraries. |
| (line 22) |
| * Libtool: Headers and Libraries. |
| (line 33) |
| * Libtool versioning: Notes for Package Builds. |
| (line 9) |
| * License conditions: Copying. (line 6) |
| * Limb: Nomenclature and Types. |
| (line 31) |
| * Limb size: Useful Macros and Constants. |
| (line 7) |
| * Linear congruential algorithm: Random Number Algorithms. |
| (line 25) |
| * Linear congruential random numbers: Random State Initialization. |
| (line 32) |
| * Linking: Headers and Libraries. |
| (line 22) |
| * Logical functions: Integer Logic and Bit Fiddling. |
| (line 6) |
| * Low-level functions: Low-level Functions. (line 6) |
| * Low-level functions for cryptography: Low-level Functions. (line 507) |
| * Lucas number algorithm: Lucas Numbers Algorithm. |
| (line 6) |
| * Lucas number functions: Number Theoretic Functions. |
| (line 143) |
| * MacOS X: Known Build Problems. |
| (line 51) |
| * Mailing lists: Introduction to GMP. (line 44) |
| * Malloc debugger: Debugging. (line 30) |
| * Malloc problems: Debugging. (line 24) |
| * Memory allocation: Custom Allocation. (line 6) |
| * Memory management: Memory Management. (line 6) |
| * Mersenne twister algorithm: Random Number Algorithms. |
| (line 17) |
| * Mersenne twister random numbers: Random State Initialization. |
| (line 13) |
| * MINGW: Notes for Particular Systems. |
| (line 57) |
| * MIPS: ABI and ISA. (line 141) |
| * Miscellaneous float functions: Miscellaneous Float Functions. |
| (line 6) |
| * Miscellaneous integer functions: Miscellaneous Integer Functions. |
| (line 6) |
| * MMX: Notes for Particular Systems. |
| (line 156) |
| * Modular inverse functions: Number Theoretic Functions. |
| (line 70) |
| * Most significant bit: Miscellaneous Integer Functions. |
| (line 34) |
| * MPN_PATH: Build Options. (line 327) |
| * MS Windows: Notes for Particular Systems. |
| (line 57) |
| * MS-DOS: Notes for Particular Systems. |
| (line 57) |
| * Multi-threading: Reentrancy. (line 6) |
| * Multiplication algorithms: Multiplication Algorithms. |
| (line 6) |
| * Nails: Low-level Functions. (line 683) |
| * Native compilation: Build Options. (line 52) |
| * NetBSD: Notes for Particular Systems. |
| (line 100) |
| * NeXT: Known Build Problems. |
| (line 57) |
| * Next prime function: Number Theoretic Functions. |
| (line 19) |
| * Nomenclature: Nomenclature and Types. |
| (line 6) |
| * Non-Unix systems: Build Options. (line 11) |
| * Nth root algorithm: Nth Root Algorithm. (line 6) |
| * Number sequences: Efficiency. (line 147) |
| * Number theoretic functions: Number Theoretic Functions. |
| (line 6) |
| * Numerator and denominator: Applying Integer Functions. |
| (line 6) |
| * obstack output: Formatted Output Functions. |
| (line 81) |
| * OpenBSD: Notes for Particular Systems. |
| (line 109) |
| * Optimizing performance: Performance optimization. |
| (line 6) |
| * ostream output: C++ Formatted Output. |
| (line 6) |
| * Other languages: Language Bindings. (line 6) |
| * Output functions <1>: Formatted Output Functions. |
| (line 6) |
| * Output functions <2>: I/O of Rationals. (line 6) |
| * Output functions <3>: I/O of Integers. (line 6) |
| * Output functions: I/O of Floats. (line 6) |
| * Packaged builds: Notes for Package Builds. |
| (line 6) |
| * Parameter conventions: Parameter Conventions. |
| (line 6) |
| * Parsing expressions demo: Demonstration Programs. |
| (line 21) |
| * Particular systems: Notes for Particular Systems. |
| (line 6) |
| * Past GMP versions: Compatibility with older versions. |
| (line 6) |
| * PDF: Build Options. (line 342) |
| * Perfect power algorithm: Perfect Power Algorithm. |
| (line 6) |
| * Perfect power functions: Integer Roots. (line 28) |
| * Perfect square algorithm: Perfect Square Algorithm. |
| (line 6) |
| * Perfect square functions: Integer Roots. (line 37) |
| * perl: Demonstration Programs. |
| (line 35) |
| * Perl module: Demonstration Programs. |
| (line 35) |
| * Postscript: Build Options. (line 342) |
| * Power/PowerPC <1>: Known Build Problems. |
| (line 63) |
| * Power/PowerPC: Notes for Particular Systems. |
| (line 115) |
| * Powering algorithms: Powering Algorithms. (line 6) |
| * Powering functions <1>: Float Arithmetic. (line 41) |
| * Powering functions: Integer Exponentiation. |
| (line 6) |
| * PowerPC: ABI and ISA. (line 176) |
| * Precision of floats: Floating-point Functions. |
| (line 6) |
| * Precision of hardware floating point: Notes for Particular Systems. |
| (line 34) |
| * Prefix: Build Options. (line 32) |
| * Prime testing algorithms: Prime Testing Algorithm. |
| (line 6) |
| * Prime testing functions: Number Theoretic Functions. |
| (line 7) |
| * Primorial functions: Number Theoretic Functions. |
| (line 117) |
| * printf formatted output: Formatted Output. (line 6) |
| * Probable prime testing functions: Number Theoretic Functions. |
| (line 7) |
| * prof: Profiling. (line 24) |
| * Profiling: Profiling. (line 6) |
| * Radix conversion algorithms: Radix Conversion Algorithms. |
| (line 6) |
| * Random number algorithms: Random Number Algorithms. |
| (line 6) |
| * Random number functions <1>: Integer Random Numbers. |
| (line 6) |
| * Random number functions <2>: Random Number Functions. |
| (line 6) |
| * Random number functions: Miscellaneous Float Functions. |
| (line 27) |
| * Random number seeding: Random State Seeding. |
| (line 6) |
| * Random number state: Random State Initialization. |
| (line 6) |
| * Random state: Nomenclature and Types. |
| (line 46) |
| * Rational arithmetic: Efficiency. (line 113) |
| * Rational arithmetic functions: Rational Arithmetic. (line 6) |
| * Rational assignment functions: Initializing Rationals. |
| (line 6) |
| * Rational comparison functions: Comparing Rationals. (line 6) |
| * Rational conversion functions: Rational Conversions. |
| (line 6) |
| * Rational initialization functions: Initializing Rationals. |
| (line 6) |
| * Rational input and output functions: I/O of Rationals. (line 6) |
| * Rational internals: Rational Internals. (line 6) |
| * Rational number: Nomenclature and Types. |
| (line 16) |
| * Rational number functions: Rational Number Functions. |
| (line 6) |
| * Rational numerator and denominator: Applying Integer Functions. |
| (line 6) |
| * Rational sign tests: Comparing Rationals. (line 28) |
| * Raw output internals: Raw Output Internals. |
| (line 6) |
| * Reallocations: Efficiency. (line 30) |
| * Reentrancy: Reentrancy. (line 6) |
| * References: References. (line 6) |
| * Remove factor functions: Number Theoretic Functions. |
| (line 104) |
| * Reporting bugs: Reporting Bugs. (line 6) |
| * Root extraction algorithm: Nth Root Algorithm. (line 6) |
| * Root extraction algorithms: Root Extraction Algorithms. |
| (line 6) |
| * Root extraction functions <1>: Float Arithmetic. (line 37) |
| * Root extraction functions: Integer Roots. (line 6) |
| * Root testing functions: Integer Roots. (line 28) |
| * Rounding functions: Miscellaneous Float Functions. |
| (line 9) |
| * Sample programs: Demonstration Programs. |
| (line 6) |
| * Scan bit functions: Integer Logic and Bit Fiddling. |
| (line 40) |
| * scanf formatted input: Formatted Input. (line 6) |
| * SCO: Known Build Problems. |
| (line 38) |
| * Seeding random numbers: Random State Seeding. |
| (line 6) |
| * Segmentation violation: Debugging. (line 7) |
| * Sequent Symmetry: Known Build Problems. |
| (line 68) |
| * Services for Unix: Notes for Particular Systems. |
| (line 65) |
| * Shared library versioning: Notes for Package Builds. |
| (line 9) |
| * Sign tests <1>: Comparing Rationals. (line 28) |
| * Sign tests <2>: Float Comparison. (line 34) |
| * Sign tests: Integer Comparisons. (line 28) |
| * Size in digits: Miscellaneous Integer Functions. |
| (line 23) |
| * Small operands: Efficiency. (line 7) |
| * Solaris <1>: ABI and ISA. (line 208) |
| * Solaris: Known Build Problems. |
| (line 72) |
| * Sparc: Notes for Particular Systems. |
| (line 127) |
| * Sparc V9: ABI and ISA. (line 208) |
| * Special integer functions: Integer Special Functions. |
| (line 6) |
| * Square root algorithm: Square Root Algorithm. |
| (line 6) |
| * SSE2: Notes for Particular Systems. |
| (line 156) |
| * Stack backtrace: Debugging. (line 50) |
| * Stack overflow <1>: Debugging. (line 7) |
| * Stack overflow: Build Options. (line 274) |
| * Static linking: Efficiency. (line 14) |
| * stdarg.h: Headers and Libraries. |
| (line 17) |
| * stdio.h: Headers and Libraries. |
| (line 11) |
| * Stripped libraries: Known Build Problems. |
| (line 28) |
| * Sun: ABI and ISA. (line 208) |
| * SunOS: Notes for Particular Systems. |
| (line 144) |
| * Systems: Notes for Particular Systems. |
| (line 6) |
| * Temporary memory: Build Options. (line 274) |
| * Texinfo: Build Options. (line 339) |
| * Text input/output: Efficiency. (line 153) |
| * Thread safety: Reentrancy. (line 6) |
| * Toom multiplication <1>: Higher degree Toom'n'half. |
| (line 6) |
| * Toom multiplication <2>: Other Multiplication. |
| (line 6) |
| * Toom multiplication <3>: Toom 4-Way Multiplication. |
| (line 6) |
| * Toom multiplication: Toom 3-Way Multiplication. |
| (line 6) |
| * Types: Nomenclature and Types. |
| (line 6) |
| * ui and si functions: Efficiency. (line 50) |
| * Unbalanced multiplication: Unbalanced Multiplication. |
| (line 6) |
| * Upward compatibility: Compatibility with older versions. |
| (line 6) |
| * Useful macros and constants: Useful Macros and Constants. |
| (line 6) |
| * User-defined precision: Floating-point Functions. |
| (line 6) |
| * Valgrind: Debugging. (line 130) |
| * Variable conventions: Variable Conventions. |
| (line 6) |
| * Version number: Useful Macros and Constants. |
| (line 12) |
| * Web page: Introduction to GMP. (line 33) |
| * Windows: Notes for Particular Systems. |
| (line 70) |
| * x86: Notes for Particular Systems. |
| (line 150) |
| * x87: Notes for Particular Systems. |
| (line 34) |
| * XML: Build Options. (line 346) |
| |
| |
| File: gmp.info, Node: Function Index, Prev: Concept Index, Up: Top |
| |
| Function and Type Index |
| *********************** |
| |
| [index] |
| * Menu: |
| |
| * __GMP_CC: Useful Macros and Constants. |
| (line 23) |
| * __GMP_CFLAGS: Useful Macros and Constants. |
| (line 24) |
| * __GNU_MP_VERSION: Useful Macros and Constants. |
| (line 10) |
| * __GNU_MP_VERSION_MINOR: Useful Macros and Constants. |
| (line 11) |
| * __GNU_MP_VERSION_PATCHLEVEL: Useful Macros and Constants. |
| (line 12) |
| * _mpz_realloc: Integer Special Functions. |
| (line 14) |
| * abs <1>: C++ Interface Rationals. |
| (line 49) |
| * abs <2>: C++ Interface Integers. |
| (line 47) |
| * abs: C++ Interface Floats. |
| (line 83) |
| * ceil: C++ Interface Floats. |
| (line 84) |
| * cmp <1>: C++ Interface Floats. |
| (line 85) |
| * cmp <2>: C++ Interface Integers. |
| (line 48) |
| * cmp <3>: C++ Interface Rationals. |
| (line 51) |
| * cmp: C++ Interface Floats. |
| (line 86) |
| * floor: C++ Interface Floats. |
| (line 93) |
| * gcd: C++ Interface Integers. |
| (line 64) |
| * gmp_asprintf: Formatted Output Functions. |
| (line 65) |
| * gmp_errno: Random State Initialization. |
| (line 55) |
| * GMP_ERROR_INVALID_ARGUMENT: Random State Initialization. |
| (line 55) |
| * GMP_ERROR_UNSUPPORTED_ARGUMENT: Random State Initialization. |
| (line 55) |
| * gmp_fprintf: Formatted Output Functions. |
| (line 29) |
| * gmp_fscanf: Formatted Input Functions. |
| (line 25) |
| * GMP_LIMB_BITS: Low-level Functions. (line 713) |
| * GMP_NAIL_BITS: Low-level Functions. (line 711) |
| * GMP_NAIL_MASK: Low-level Functions. (line 721) |
| * GMP_NUMB_BITS: Low-level Functions. (line 712) |
| * GMP_NUMB_MASK: Low-level Functions. (line 722) |
| * GMP_NUMB_MAX: Low-level Functions. (line 730) |
| * gmp_obstack_printf: Formatted Output Functions. |
| (line 79) |
| * gmp_obstack_vprintf: Formatted Output Functions. |
| (line 81) |
| * gmp_printf: Formatted Output Functions. |
| (line 24) |
| * GMP_RAND_ALG_DEFAULT: Random State Initialization. |
| (line 49) |
| * GMP_RAND_ALG_LC: Random State Initialization. |
| (line 49) |
| * gmp_randclass: C++ Interface Random Numbers. |
| (line 7) |
| * gmp_randclass::get_f: C++ Interface Random Numbers. |
| (line 46) |
| * gmp_randclass::get_z_bits: C++ Interface Random Numbers. |
| (line 39) |
| * gmp_randclass::get_z_range: C++ Interface Random Numbers. |
| (line 42) |
| * gmp_randclass::gmp_randclass: C++ Interface Random Numbers. |
| (line 27) |
| * gmp_randclass::seed: C++ Interface Random Numbers. |
| (line 34) |
| * gmp_randclear: Random State Initialization. |
| (line 62) |
| * gmp_randinit: Random State Initialization. |
| (line 47) |
| * gmp_randinit_default: Random State Initialization. |
| (line 7) |
| * gmp_randinit_lc_2exp: Random State Initialization. |
| (line 18) |
| * gmp_randinit_lc_2exp_size: Random State Initialization. |
| (line 32) |
| * gmp_randinit_mt: Random State Initialization. |
| (line 13) |
| * gmp_randinit_set: Random State Initialization. |
| (line 43) |
| * gmp_randseed: Random State Seeding. |
| (line 8) |
| * gmp_randseed_ui: Random State Seeding. |
| (line 10) |
| * gmp_randstate_t: Nomenclature and Types. |
| (line 46) |
| * gmp_scanf: Formatted Input Functions. |
| (line 21) |
| * gmp_snprintf: Formatted Output Functions. |
| (line 46) |
| * gmp_sprintf: Formatted Output Functions. |
| (line 34) |
| * gmp_sscanf: Formatted Input Functions. |
| (line 29) |
| * gmp_urandomb_ui: Random State Miscellaneous. |
| (line 8) |
| * gmp_urandomm_ui: Random State Miscellaneous. |
| (line 14) |
| * gmp_vasprintf: Formatted Output Functions. |
| (line 66) |
| * gmp_version: Useful Macros and Constants. |
| (line 18) |
| * gmp_vfprintf: Formatted Output Functions. |
| (line 30) |
| * gmp_vfscanf: Formatted Input Functions. |
| (line 26) |
| * gmp_vprintf: Formatted Output Functions. |
| (line 25) |
| * gmp_vscanf: Formatted Input Functions. |
| (line 22) |
| * gmp_vsnprintf: Formatted Output Functions. |
| (line 48) |
| * gmp_vsprintf: Formatted Output Functions. |
| (line 35) |
| * gmp_vsscanf: Formatted Input Functions. |
| (line 31) |
| * hypot: C++ Interface Floats. |
| (line 94) |
| * lcm: C++ Interface Integers. |
| (line 65) |
| * mp_bitcnt_t: Nomenclature and Types. |
| (line 42) |
| * mp_bits_per_limb: Useful Macros and Constants. |
| (line 7) |
| * mp_exp_t: Nomenclature and Types. |
| (line 27) |
| * mp_get_memory_functions: Custom Allocation. (line 90) |
| * mp_limb_t: Nomenclature and Types. |
| (line 31) |
| * mp_set_memory_functions: Custom Allocation. (line 18) |
| * mp_size_t: Nomenclature and Types. |
| (line 37) |
| * mpf_abs: Float Arithmetic. (line 47) |
| * mpf_add: Float Arithmetic. (line 7) |
| * mpf_add_ui: Float Arithmetic. (line 9) |
| * mpf_ceil: Miscellaneous Float Functions. |
| (line 7) |
| * mpf_class: C++ Interface General. |
| (line 20) |
| * mpf_class::fits_sint_p: C++ Interface Floats. |
| (line 87) |
| * mpf_class::fits_slong_p: C++ Interface Floats. |
| (line 88) |
| * mpf_class::fits_sshort_p: C++ Interface Floats. |
| (line 89) |
| * mpf_class::fits_uint_p: C++ Interface Floats. |
| (line 90) |
| * mpf_class::fits_ulong_p: C++ Interface Floats. |
| (line 91) |
| * mpf_class::fits_ushort_p: C++ Interface Floats. |
| (line 92) |
| * mpf_class::get_d: C++ Interface Floats. |
| (line 95) |
| * mpf_class::get_mpf_t: C++ Interface General. |
| (line 66) |
| * mpf_class::get_prec: C++ Interface Floats. |
| (line 115) |
| * mpf_class::get_si: C++ Interface Floats. |
| (line 96) |
| * mpf_class::get_str: C++ Interface Floats. |
| (line 98) |
| * mpf_class::get_ui: C++ Interface Floats. |
| (line 99) |
| * mpf_class::mpf_class: C++ Interface Floats. |
| (line 47) |
| * mpf_class::operator=: C++ Interface Floats. |
| (line 60) |
| * mpf_class::set_prec: C++ Interface Floats. |
| (line 116) |
| * mpf_class::set_prec_raw: C++ Interface Floats. |
| (line 117) |
| * mpf_class::set_str: C++ Interface Floats. |
| (line 101) |
| * mpf_class::swap: C++ Interface Floats. |
| (line 104) |
| * mpf_clear: Initializing Floats. (line 37) |
| * mpf_clears: Initializing Floats. (line 41) |
| * mpf_cmp: Float Comparison. (line 7) |
| * mpf_cmp_d: Float Comparison. (line 9) |
| * mpf_cmp_si: Float Comparison. (line 11) |
| * mpf_cmp_ui: Float Comparison. (line 10) |
| * mpf_cmp_z: Float Comparison. (line 8) |
| * mpf_div: Float Arithmetic. (line 29) |
| * mpf_div_2exp: Float Arithmetic. (line 55) |
| * mpf_div_ui: Float Arithmetic. (line 33) |
| * mpf_eq: Float Comparison. (line 19) |
| * mpf_fits_sint_p: Miscellaneous Float Functions. |
| (line 20) |
| * mpf_fits_slong_p: Miscellaneous Float Functions. |
| (line 18) |
| * mpf_fits_sshort_p: Miscellaneous Float Functions. |
| (line 22) |
| * mpf_fits_uint_p: Miscellaneous Float Functions. |
| (line 19) |
| * mpf_fits_ulong_p: Miscellaneous Float Functions. |
| (line 17) |
| * mpf_fits_ushort_p: Miscellaneous Float Functions. |
| (line 21) |
| * mpf_floor: Miscellaneous Float Functions. |
| (line 8) |
| * mpf_get_d: Converting Floats. (line 7) |
| * mpf_get_d_2exp: Converting Floats. (line 17) |
| * mpf_get_default_prec: Initializing Floats. (line 12) |
| * mpf_get_prec: Initializing Floats. (line 62) |
| * mpf_get_si: Converting Floats. (line 28) |
| * mpf_get_str: Converting Floats. (line 38) |
| * mpf_get_ui: Converting Floats. (line 29) |
| * mpf_init: Initializing Floats. (line 19) |
| * mpf_init2: Initializing Floats. (line 26) |
| * mpf_init_set: Simultaneous Float Init & Assign. |
| (line 16) |
| * mpf_init_set_d: Simultaneous Float Init & Assign. |
| (line 19) |
| * mpf_init_set_si: Simultaneous Float Init & Assign. |
| (line 18) |
| * mpf_init_set_str: Simultaneous Float Init & Assign. |
| (line 26) |
| * mpf_init_set_ui: Simultaneous Float Init & Assign. |
| (line 17) |
| * mpf_inits: Initializing Floats. (line 31) |
| * mpf_inp_str: I/O of Floats. (line 39) |
| * mpf_integer_p: Miscellaneous Float Functions. |
| (line 14) |
| * mpf_mul: Float Arithmetic. (line 19) |
| * mpf_mul_2exp: Float Arithmetic. (line 51) |
| * mpf_mul_ui: Float Arithmetic. (line 21) |
| * mpf_neg: Float Arithmetic. (line 44) |
| * mpf_out_str: I/O of Floats. (line 19) |
| * mpf_pow_ui: Float Arithmetic. (line 41) |
| * mpf_random2: Miscellaneous Float Functions. |
| (line 37) |
| * mpf_reldiff: Float Comparison. (line 30) |
| * mpf_set: Assigning Floats. (line 10) |
| * mpf_set_d: Assigning Floats. (line 13) |
| * mpf_set_default_prec: Initializing Floats. (line 7) |
| * mpf_set_prec: Initializing Floats. (line 65) |
| * mpf_set_prec_raw: Initializing Floats. (line 72) |
| * mpf_set_q: Assigning Floats. (line 15) |
| * mpf_set_si: Assigning Floats. (line 12) |
| * mpf_set_str: Assigning Floats. (line 18) |
| * mpf_set_ui: Assigning Floats. (line 11) |
| * mpf_set_z: Assigning Floats. (line 14) |
| * mpf_sgn: Float Comparison. (line 34) |
| * mpf_sqrt: Float Arithmetic. (line 36) |
| * mpf_sqrt_ui: Float Arithmetic. (line 37) |
| * mpf_sub: Float Arithmetic. (line 12) |
| * mpf_sub_ui: Float Arithmetic. (line 16) |
| * mpf_swap: Assigning Floats. (line 52) |
| * mpf_t: Nomenclature and Types. |
| (line 21) |
| * mpf_trunc: Miscellaneous Float Functions. |
| (line 9) |
| * mpf_ui_div: Float Arithmetic. (line 31) |
| * mpf_ui_sub: Float Arithmetic. (line 14) |
| * mpf_urandomb: Miscellaneous Float Functions. |
| (line 27) |
| * mpn_add: Low-level Functions. (line 69) |
| * mpn_add_1: Low-level Functions. (line 64) |
| * mpn_add_n: Low-level Functions. (line 54) |
| * mpn_addmul_1: Low-level Functions. (line 150) |
| * mpn_and_n: Low-level Functions. (line 449) |
| * mpn_andn_n: Low-level Functions. (line 464) |
| * mpn_cmp: Low-level Functions. (line 295) |
| * mpn_cnd_add_n: Low-level Functions. (line 542) |
| * mpn_cnd_sub_n: Low-level Functions. (line 544) |
| * mpn_cnd_swap: Low-level Functions. (line 569) |
| * mpn_com: Low-level Functions. (line 489) |
| * mpn_copyd: Low-level Functions. (line 498) |
| * mpn_copyi: Low-level Functions. (line 494) |
| * mpn_divexact_1: Low-level Functions. (line 233) |
| * mpn_divexact_by3: Low-level Functions. (line 240) |
| * mpn_divexact_by3c: Low-level Functions. (line 242) |
| * mpn_divmod: Low-level Functions. (line 228) |
| * mpn_divmod_1: Low-level Functions. (line 212) |
| * mpn_divrem: Low-level Functions. (line 186) |
| * mpn_divrem_1: Low-level Functions. (line 210) |
| * mpn_gcd: Low-level Functions. (line 303) |
| * mpn_gcd_1: Low-level Functions. (line 313) |
| * mpn_gcdext: Low-level Functions. (line 319) |
| * mpn_get_str: Low-level Functions. (line 373) |
| * mpn_hamdist: Low-level Functions. (line 438) |
| * mpn_ior_n: Low-level Functions. (line 454) |
| * mpn_iorn_n: Low-level Functions. (line 469) |
| * mpn_lshift: Low-level Functions. (line 271) |
| * mpn_mod_1: Low-level Functions. (line 266) |
| * mpn_mul: Low-level Functions. (line 116) |
| * mpn_mul_1: Low-level Functions. (line 135) |
| * mpn_mul_n: Low-level Functions. (line 105) |
| * mpn_nand_n: Low-level Functions. (line 474) |
| * mpn_neg: Low-level Functions. (line 98) |
| * mpn_nior_n: Low-level Functions. (line 479) |
| * mpn_perfect_square_p: Low-level Functions. (line 444) |
| * mpn_popcount: Low-level Functions. (line 434) |
| * mpn_random: Low-level Functions. (line 423) |
| * mpn_random2: Low-level Functions. (line 424) |
| * mpn_rshift: Low-level Functions. (line 283) |
| * mpn_scan0: Low-level Functions. (line 408) |
| * mpn_scan1: Low-level Functions. (line 416) |
| * mpn_sec_add_1: Low-level Functions. (line 555) |
| * mpn_sec_div_qr: Low-level Functions. (line 632) |
| * mpn_sec_div_qr_itch: Low-level Functions. (line 633) |
| * mpn_sec_div_r: Low-level Functions. (line 649) |
| * mpn_sec_div_r_itch: Low-level Functions. (line 650) |
| * mpn_sec_invert: Low-level Functions. (line 664) |
| * mpn_sec_invert_itch: Low-level Functions. (line 665) |
| * mpn_sec_mul: Low-level Functions. (line 577) |
| * mpn_sec_mul_itch: Low-level Functions. (line 578) |
| * mpn_sec_powm: Low-level Functions. (line 607) |
| * mpn_sec_powm_itch: Low-level Functions. (line 609) |
| * mpn_sec_sqr: Low-level Functions. (line 592) |
| * mpn_sec_sqr_itch: Low-level Functions. (line 593) |
| * mpn_sec_sub_1: Low-level Functions. (line 557) |
| * mpn_sec_tabselect: Low-level Functions. (line 623) |
| * mpn_set_str: Low-level Functions. (line 388) |
| * mpn_sizeinbase: Low-level Functions. (line 366) |
| * mpn_sqr: Low-level Functions. (line 127) |
| * mpn_sqrtrem: Low-level Functions. (line 348) |
| * mpn_sub: Low-level Functions. (line 90) |
| * mpn_sub_1: Low-level Functions. (line 85) |
| * mpn_sub_n: Low-level Functions. (line 76) |
| * mpn_submul_1: Low-level Functions. (line 162) |
| * mpn_tdiv_qr: Low-level Functions. (line 175) |
| * mpn_xnor_n: Low-level Functions. (line 484) |
| * mpn_xor_n: Low-level Functions. (line 459) |
| * mpn_zero: Low-level Functions. (line 501) |
| * mpn_zero_p: Low-level Functions. (line 299) |
| * mpq_abs: Rational Arithmetic. (line 34) |
| * mpq_add: Rational Arithmetic. (line 8) |
| * mpq_canonicalize: Rational Number Functions. |
| (line 22) |
| * mpq_class: C++ Interface General. |
| (line 19) |
| * mpq_class::canonicalize: C++ Interface Rationals. |
| (line 43) |
| * mpq_class::get_d: C++ Interface Rationals. |
| (line 52) |
| * mpq_class::get_den: C++ Interface Rationals. |
| (line 66) |
| * mpq_class::get_den_mpz_t: C++ Interface Rationals. |
| (line 76) |
| * mpq_class::get_mpq_t: C++ Interface General. |
| (line 65) |
| * mpq_class::get_num: C++ Interface Rationals. |
| (line 65) |
| * mpq_class::get_num_mpz_t: C++ Interface Rationals. |
| (line 75) |
| * mpq_class::get_str: C++ Interface Rationals. |
| (line 53) |
| * mpq_class::mpq_class: C++ Interface Rationals. |
| (line 12) |
| * mpq_class::set_str: C++ Interface Rationals. |
| (line 55) |
| * mpq_class::swap: C++ Interface Rationals. |
| (line 57) |
| * mpq_clear: Initializing Rationals. |
| (line 16) |
| * mpq_clears: Initializing Rationals. |
| (line 20) |
| * mpq_cmp: Comparing Rationals. (line 7) |
| * mpq_cmp_si: Comparing Rationals. (line 18) |
| * mpq_cmp_ui: Comparing Rationals. (line 16) |
| * mpq_cmp_z: Comparing Rationals. (line 8) |
| * mpq_denref: Applying Integer Functions. |
| (line 18) |
| * mpq_div: Rational Arithmetic. (line 24) |
| * mpq_div_2exp: Rational Arithmetic. (line 28) |
| * mpq_equal: Comparing Rationals. (line 34) |
| * mpq_get_d: Rational Conversions. |
| (line 7) |
| * mpq_get_den: Applying Integer Functions. |
| (line 24) |
| * mpq_get_num: Applying Integer Functions. |
| (line 23) |
| * mpq_get_str: Rational Conversions. |
| (line 22) |
| * mpq_init: Initializing Rationals. |
| (line 7) |
| * mpq_inits: Initializing Rationals. |
| (line 12) |
| * mpq_inp_str: I/O of Rationals. (line 27) |
| * mpq_inv: Rational Arithmetic. (line 37) |
| * mpq_mul: Rational Arithmetic. (line 16) |
| * mpq_mul_2exp: Rational Arithmetic. (line 20) |
| * mpq_neg: Rational Arithmetic. (line 31) |
| * mpq_numref: Applying Integer Functions. |
| (line 17) |
| * mpq_out_str: I/O of Rationals. (line 19) |
| * mpq_set: Initializing Rationals. |
| (line 24) |
| * mpq_set_d: Rational Conversions. |
| (line 17) |
| * mpq_set_den: Applying Integer Functions. |
| (line 26) |
| * mpq_set_f: Rational Conversions. |
| (line 18) |
| * mpq_set_num: Applying Integer Functions. |
| (line 25) |
| * mpq_set_si: Initializing Rationals. |
| (line 31) |
| * mpq_set_str: Initializing Rationals. |
| (line 36) |
| * mpq_set_ui: Initializing Rationals. |
| (line 29) |
| * mpq_set_z: Initializing Rationals. |
| (line 25) |
| * mpq_sgn: Comparing Rationals. (line 28) |
| * mpq_sub: Rational Arithmetic. (line 12) |
| * mpq_swap: Initializing Rationals. |
| (line 56) |
| * mpq_t: Nomenclature and Types. |
| (line 16) |
| * mpz_2fac_ui: Number Theoretic Functions. |
| (line 110) |
| * mpz_abs: Integer Arithmetic. (line 45) |
| * mpz_add: Integer Arithmetic. (line 7) |
| * mpz_add_ui: Integer Arithmetic. (line 9) |
| * mpz_addmul: Integer Arithmetic. (line 26) |
| * mpz_addmul_ui: Integer Arithmetic. (line 28) |
| * mpz_and: Integer Logic and Bit Fiddling. |
| (line 11) |
| * mpz_array_init: Integer Special Functions. |
| (line 11) |
| * mpz_bin_ui: Number Theoretic Functions. |
| (line 122) |
| * mpz_bin_uiui: Number Theoretic Functions. |
| (line 124) |
| * mpz_cdiv_q: Integer Division. (line 13) |
| * mpz_cdiv_q_2exp: Integer Division. (line 26) |
| * mpz_cdiv_q_ui: Integer Division. (line 18) |
| * mpz_cdiv_qr: Integer Division. (line 16) |
| * mpz_cdiv_qr_ui: Integer Division. (line 22) |
| * mpz_cdiv_r: Integer Division. (line 14) |
| * mpz_cdiv_r_2exp: Integer Division. (line 28) |
| * mpz_cdiv_r_ui: Integer Division. (line 20) |
| * mpz_cdiv_ui: Integer Division. (line 24) |
| * mpz_class: C++ Interface General. |
| (line 18) |
| * mpz_class::fits_sint_p: C++ Interface Integers. |
| (line 50) |
| * mpz_class::fits_slong_p: C++ Interface Integers. |
| (line 51) |
| * mpz_class::fits_sshort_p: C++ Interface Integers. |
| (line 52) |
| * mpz_class::fits_uint_p: C++ Interface Integers. |
| (line 53) |
| * mpz_class::fits_ulong_p: C++ Interface Integers. |
| (line 54) |
| * mpz_class::fits_ushort_p: C++ Interface Integers. |
| (line 55) |
| * mpz_class::get_d: C++ Interface Integers. |
| (line 56) |
| * mpz_class::get_mpz_t: C++ Interface General. |
| (line 64) |
| * mpz_class::get_si: C++ Interface Integers. |
| (line 57) |
| * mpz_class::get_str: C++ Interface Integers. |
| (line 58) |
| * mpz_class::get_ui: C++ Interface Integers. |
| (line 59) |
| * mpz_class::mpz_class: C++ Interface Integers. |
| (line 7) |
| * mpz_class::set_str: C++ Interface Integers. |
| (line 61) |
| * mpz_class::swap: C++ Interface Integers. |
| (line 66) |
| * mpz_clear: Initializing Integers. |
| (line 49) |
| * mpz_clears: Initializing Integers. |
| (line 53) |
| * mpz_clrbit: Integer Logic and Bit Fiddling. |
| (line 56) |
| * mpz_cmp: Integer Comparisons. (line 7) |
| * mpz_cmp_d: Integer Comparisons. (line 8) |
| * mpz_cmp_si: Integer Comparisons. (line 9) |
| * mpz_cmp_ui: Integer Comparisons. (line 10) |
| * mpz_cmpabs: Integer Comparisons. (line 18) |
| * mpz_cmpabs_d: Integer Comparisons. (line 19) |
| * mpz_cmpabs_ui: Integer Comparisons. (line 20) |
| * mpz_com: Integer Logic and Bit Fiddling. |
| (line 20) |
| * mpz_combit: Integer Logic and Bit Fiddling. |
| (line 59) |
| * mpz_congruent_2exp_p: Integer Division. (line 137) |
| * mpz_congruent_p: Integer Division. (line 133) |
| * mpz_congruent_ui_p: Integer Division. (line 135) |
| * mpz_divexact: Integer Division. (line 110) |
| * mpz_divexact_ui: Integer Division. (line 112) |
| * mpz_divisible_2exp_p: Integer Division. (line 123) |
| * mpz_divisible_p: Integer Division. (line 120) |
| * mpz_divisible_ui_p: Integer Division. (line 122) |
| * mpz_even_p: Miscellaneous Integer Functions. |
| (line 18) |
| * mpz_export: Integer Import and Export. |
| (line 45) |
| * mpz_fac_ui: Number Theoretic Functions. |
| (line 109) |
| * mpz_fdiv_q: Integer Division. (line 30) |
| * mpz_fdiv_q_2exp: Integer Division. (line 43) |
| * mpz_fdiv_q_ui: Integer Division. (line 35) |
| * mpz_fdiv_qr: Integer Division. (line 33) |
| * mpz_fdiv_qr_ui: Integer Division. (line 39) |
| * mpz_fdiv_r: Integer Division. (line 31) |
| * mpz_fdiv_r_2exp: Integer Division. (line 45) |
| * mpz_fdiv_r_ui: Integer Division. (line 37) |
| * mpz_fdiv_ui: Integer Division. (line 41) |
| * mpz_fib2_ui: Number Theoretic Functions. |
| (line 132) |
| * mpz_fib_ui: Number Theoretic Functions. |
| (line 130) |
| * mpz_fits_sint_p: Miscellaneous Integer Functions. |
| (line 10) |
| * mpz_fits_slong_p: Miscellaneous Integer Functions. |
| (line 8) |
| * mpz_fits_sshort_p: Miscellaneous Integer Functions. |
| (line 12) |
| * mpz_fits_uint_p: Miscellaneous Integer Functions. |
| (line 9) |
| * mpz_fits_ulong_p: Miscellaneous Integer Functions. |
| (line 7) |
| * mpz_fits_ushort_p: Miscellaneous Integer Functions. |
| (line 11) |
| * mpz_gcd: Number Theoretic Functions. |
| (line 26) |
| * mpz_gcd_ui: Number Theoretic Functions. |
| (line 33) |
| * mpz_gcdext: Number Theoretic Functions. |
| (line 43) |
| * mpz_get_d: Converting Integers. (line 27) |
| * mpz_get_d_2exp: Converting Integers. (line 36) |
| * mpz_get_si: Converting Integers. (line 18) |
| * mpz_get_str: Converting Integers. (line 47) |
| * mpz_get_ui: Converting Integers. (line 11) |
| * mpz_getlimbn: Integer Special Functions. |
| (line 23) |
| * mpz_hamdist: Integer Logic and Bit Fiddling. |
| (line 29) |
| * mpz_import: Integer Import and Export. |
| (line 11) |
| * mpz_init: Initializing Integers. |
| (line 26) |
| * mpz_init2: Initializing Integers. |
| (line 33) |
| * mpz_init_set: Simultaneous Integer Init & Assign. |
| (line 27) |
| * mpz_init_set_d: Simultaneous Integer Init & Assign. |
| (line 30) |
| * mpz_init_set_si: Simultaneous Integer Init & Assign. |
| (line 29) |
| * mpz_init_set_str: Simultaneous Integer Init & Assign. |
| (line 35) |
| * mpz_init_set_ui: Simultaneous Integer Init & Assign. |
| (line 28) |
| * mpz_inits: Initializing Integers. |
| (line 29) |
| * mpz_inp_raw: I/O of Integers. (line 62) |
| * mpz_inp_str: I/O of Integers. (line 31) |
| * mpz_invert: Number Theoretic Functions. |
| (line 70) |
| * mpz_ior: Integer Logic and Bit Fiddling. |
| (line 14) |
| * mpz_jacobi: Number Theoretic Functions. |
| (line 79) |
| * mpz_kronecker: Number Theoretic Functions. |
| (line 87) |
| * mpz_kronecker_si: Number Theoretic Functions. |
| (line 88) |
| * mpz_kronecker_ui: Number Theoretic Functions. |
| (line 89) |
| * mpz_lcm: Number Theoretic Functions. |
| (line 62) |
| * mpz_lcm_ui: Number Theoretic Functions. |
| (line 64) |
| * mpz_legendre: Number Theoretic Functions. |
| (line 82) |
| * mpz_limbs_finish: Integer Special Functions. |
| (line 48) |
| * mpz_limbs_modify: Integer Special Functions. |
| (line 41) |
| * mpz_limbs_read: Integer Special Functions. |
| (line 35) |
| * mpz_limbs_write: Integer Special Functions. |
| (line 40) |
| * mpz_lucnum2_ui: Number Theoretic Functions. |
| (line 143) |
| * mpz_lucnum_ui: Number Theoretic Functions. |
| (line 141) |
| * mpz_mfac_uiui: Number Theoretic Functions. |
| (line 112) |
| * mpz_mod: Integer Division. (line 100) |
| * mpz_mod_ui: Integer Division. (line 102) |
| * mpz_mul: Integer Arithmetic. (line 19) |
| * mpz_mul_2exp: Integer Arithmetic. (line 38) |
| * mpz_mul_si: Integer Arithmetic. (line 20) |
| * mpz_mul_ui: Integer Arithmetic. (line 22) |
| * mpz_neg: Integer Arithmetic. (line 42) |
| * mpz_nextprime: Number Theoretic Functions. |
| (line 19) |
| * mpz_odd_p: Miscellaneous Integer Functions. |
| (line 17) |
| * mpz_out_raw: I/O of Integers. (line 46) |
| * mpz_out_str: I/O of Integers. (line 19) |
| * mpz_perfect_power_p: Integer Roots. (line 28) |
| * mpz_perfect_square_p: Integer Roots. (line 37) |
| * mpz_popcount: Integer Logic and Bit Fiddling. |
| (line 23) |
| * mpz_pow_ui: Integer Exponentiation. |
| (line 31) |
| * mpz_powm: Integer Exponentiation. |
| (line 8) |
| * mpz_powm_sec: Integer Exponentiation. |
| (line 18) |
| * mpz_powm_ui: Integer Exponentiation. |
| (line 10) |
| * mpz_primorial_ui: Number Theoretic Functions. |
| (line 117) |
| * mpz_probab_prime_p: Number Theoretic Functions. |
| (line 7) |
| * mpz_random: Integer Random Numbers. |
| (line 42) |
| * mpz_random2: Integer Random Numbers. |
| (line 51) |
| * mpz_realloc2: Initializing Integers. |
| (line 57) |
| * mpz_remove: Number Theoretic Functions. |
| (line 104) |
| * mpz_roinit_n: Integer Special Functions. |
| (line 69) |
| * MPZ_ROINIT_N: Integer Special Functions. |
| (line 84) |
| * mpz_root: Integer Roots. (line 8) |
| * mpz_rootrem: Integer Roots. (line 14) |
| * mpz_rrandomb: Integer Random Numbers. |
| (line 31) |
| * mpz_scan0: Integer Logic and Bit Fiddling. |
| (line 38) |
| * mpz_scan1: Integer Logic and Bit Fiddling. |
| (line 40) |
| * mpz_set: Assigning Integers. (line 10) |
| * mpz_set_d: Assigning Integers. (line 13) |
| * mpz_set_f: Assigning Integers. (line 15) |
| * mpz_set_q: Assigning Integers. (line 14) |
| * mpz_set_si: Assigning Integers. (line 12) |
| * mpz_set_str: Assigning Integers. (line 21) |
| * mpz_set_ui: Assigning Integers. (line 11) |
| * mpz_setbit: Integer Logic and Bit Fiddling. |
| (line 53) |
| * mpz_sgn: Integer Comparisons. (line 28) |
| * mpz_si_kronecker: Number Theoretic Functions. |
| (line 90) |
| * mpz_size: Integer Special Functions. |
| (line 31) |
| * mpz_sizeinbase: Miscellaneous Integer Functions. |
| (line 23) |
| * mpz_sqrt: Integer Roots. (line 18) |
| * mpz_sqrtrem: Integer Roots. (line 21) |
| * mpz_sub: Integer Arithmetic. (line 12) |
| * mpz_sub_ui: Integer Arithmetic. (line 14) |
| * mpz_submul: Integer Arithmetic. (line 32) |
| * mpz_submul_ui: Integer Arithmetic. (line 34) |
| * mpz_swap: Assigning Integers. (line 37) |
| * mpz_t: Nomenclature and Types. |
| (line 6) |
| * mpz_tdiv_q: Integer Division. (line 47) |
| * mpz_tdiv_q_2exp: Integer Division. (line 60) |
| * mpz_tdiv_q_ui: Integer Division. (line 52) |
| * mpz_tdiv_qr: Integer Division. (line 50) |
| * mpz_tdiv_qr_ui: Integer Division. (line 56) |
| * mpz_tdiv_r: Integer Division. (line 48) |
| * mpz_tdiv_r_2exp: Integer Division. (line 62) |
| * mpz_tdiv_r_ui: Integer Division. (line 54) |
| * mpz_tdiv_ui: Integer Division. (line 58) |
| * mpz_tstbit: Integer Logic and Bit Fiddling. |
| (line 62) |
| * mpz_ui_kronecker: Number Theoretic Functions. |
| (line 91) |
| * mpz_ui_pow_ui: Integer Exponentiation. |
| (line 33) |
| * mpz_ui_sub: Integer Arithmetic. (line 16) |
| * mpz_urandomb: Integer Random Numbers. |
| (line 14) |
| * mpz_urandomm: Integer Random Numbers. |
| (line 23) |
| * mpz_xor: Integer Logic and Bit Fiddling. |
| (line 17) |
| * operator"" <1>: C++ Interface Floats. |
| (line 56) |
| * operator"" <2>: C++ Interface Rationals. |
| (line 38) |
| * operator"": C++ Interface Integers. |
| (line 30) |
| * operator%: C++ Interface Integers. |
| (line 35) |
| * operator/: C++ Interface Integers. |
| (line 34) |
| * operator<<: C++ Formatted Output. |
| (line 33) |
| * operator>> <1>: C++ Formatted Input. (line 14) |
| * operator>> <2>: C++ Interface Rationals. |
| (line 85) |
| * operator>>: C++ Formatted Input. (line 25) |
| * sgn <1>: C++ Interface Integers. |
| (line 62) |
| * sgn <2>: C++ Interface Rationals. |
| (line 56) |
| * sgn: C++ Interface Floats. |
| (line 102) |
| * sqrt <1>: C++ Interface Integers. |
| (line 63) |
| * sqrt: C++ Interface Floats. |
| (line 103) |
| * swap <1>: C++ Interface Floats. |
| (line 105) |
| * swap <2>: C++ Interface Rationals. |
| (line 58) |
| * swap: C++ Interface Integers. |
| (line 67) |
| * trunc: C++ Interface Floats. |
| (line 106) |
| |
| |