<Chapter Label="ch:Methods">
<Heading>The Routines for Specific Factorization Methods</Heading>
Descriptions of the factoring methods implemented in this package can
be found in <Cite Key="Bressoud89"/> and <Cite Key="Cohen93"/>.
Cohen's book contains also descriptions of the other methods mentioned
in the preface.
<ManSection>
<Func Name="FactorsTD" Arg="n [, Divisors ]" Label="trial division"/>
<Returns>
a list of two lists: The first list contains the prime factors found,
and the second list contains remaining unfactored parts of <A>n</A>,
if there are any.
</Returns>
<Description>
This function tries to factor <A>n</A> by trial division.
The optional argument <A>Divisors</A> is the list of trial divisors.
If not given, it defaults to the list of primes <M>p < 1000</M>.
<Example>
<![CDATA[
gap> FactorsTD(12^25+25^12);
[ [ 13, 19, 727 ], [ 5312510324723614735153 ] ]
]]>
</Example>
</Description>
</ManSection>
<ManSection>
<Func Name="FactorsPminus1" Arg="n [, [ a, ] Limit1 [, Limit2 ] ]"
Label="Pollard's p-1"/>
<Returns>
a list of two lists: The first list contains the prime factors found,
and the second list contains remaining unfactored parts of <A>n</A>,
if there are any.
</Returns>
<Description>
This function tries to factor <A>n</A> using Pollard's p-1.
It uses <A>a</A> as base for exponentiation, <A>Limit1</A> as first
stage limit and <A>Limit2</A> as second stage limit.
If the function is called with three arguments, these arguments are
interpreted as <A>n</A>, <A>Limit1</A> and <A>Limit2</A>. Defaults are
chosen for all arguments which are omitted. <P/>
Pollard's p-1 is based on the fact that exponentiation
(mod <M>n</M>) can be done efficiently enough to compute
<M>a^{k!}</M> mod <M>n</M> for sufficiently large <M>k</M>
in a reasonable amount of time. Assume that <M>p</M> is a prime factor
of <M>n</M> which does not divide <M>a</M>, and that <M>k!</M>
is a multiple of <M>p-1</M>. Then
<Index Key="Lagrange's Theorem">Lagrange's Theorem
Lagrange's Theorem states that a^{k!} \equiv 1
(mod <M>p</M>). If <M>k!</M> is not a multiple of <M>q-1</M> for
another prime factor <M>q</M> of <M>n</M>, it is likely that the
factor <M>p</M> can be determined by computing
<M>\gcd(a^{k!}-1,n)</M>. A prime factor <M>p</M> is usually found if
the largest prime factor of <M>p-1</M> is not larger than <A>Limit2</A>,
and the second-largest one is not larger than <A>Limit1</A>.
(Compare with <Ref Func="FactorsPplus1" Label="Williams' p+1"/>
and <Ref Func="FactorsECM" Label="Elliptic Curves Method, ECM"/>.)
<Example>
<![CDATA[
gap> FactorsPminus1( Factorial(158) + 1, 100000, 1000000 );
[ [ 2879, 5227, 1452486383317, 9561906969931, 18331561438319,
4837142997094837608115811103417329505064932181226548534006749213\
4508231090637045229565481657130504121732305287984292482612133314325471\
3674832962773107806789945715570386038565256719614524924705165110048148\
7161609649806290811760570095669 ], [ ] ]
gap> List( last[ 1 ]{[ 3, 4, 5 ]}, p -> Factors( p - 1 ) );
[ [ 2, 2, 3, 3, 81937, 492413 ], [ 2, 3, 3, 3, 5, 7, 7, 1481, 488011 ]
, [ 2, 3001, 7643, 399613 ] ]
]]>
</Example>
</Description>
</ManSection>
<ManSection>
<Func Name="FactorsPplus1" Arg="n [, [ Residues, ] Limit1 [, Limit2 ] ]"
Label="Williams' p+1"/>
<Returns>
a list of two lists: The first list contains the prime factors found,
and the second list contains remaining unfactored parts of <A>n</A>,
if there are any.
</Returns>
<Description>
This function tries to factor <A>n</A> using Williams' p+1.
It tries <A>Residues</A> different residues, and uses <A>Limit1</A>
as first stage limit and <A>Limit2</A> as second stage limit.
If the function is called with three arguments, these arguments are
interpreted as <A>n</A>, <A>Limit1</A> and <A>Limit2</A>. Defaults are
chosen for all arguments which are omitted. <P/>
Williams' p+1 is very similar to Pollard's <M>p-1</M>
(see <Ref Func="FactorsPminus1" Label="Pollard's p-1"/>).
The difference is that the underlying group here can either have order
<M>p+1</M> or <M>p-1</M>, and that the group operation takes more time.
A prime factor <M>p</M> is usually found if the largest prime factor
of the group order is at most <A>Limit2</A> and the second-largest one
is not larger than <A>Limit1</A>.
(Compare also with <Ref Func="FactorsECM"
Label="Elliptic Curves Method, ECM"/>.)
<Example>
<![CDATA[
gap> FactorsPplus1( Factorial(55) - 1, 10, 10000, 100000 );
[ [ 73, 39619, 277914269, 148257413069 ],
[ 106543529120049954955085076634537262459718863957 ] ]
gap> List( last[ 1 ], p -> [ Factors( p - 1 ), Factors( p + 1 ) ] );
[ [ [ 2, 2, 2, 3, 3 ], [ 2, 37 ] ],
[ [ 2, 3, 3, 31, 71 ], [ 2, 2, 5, 7, 283 ] ],
[ [ 2, 2, 2207, 31481 ], [ 2, 3, 5, 9263809 ] ],
[ [ 2, 2, 47, 788603261 ], [ 2, 3, 5, 13, 37, 67, 89, 1723 ] ] ]
]]>
</Example>
</Description>
</ManSection>
<ManSection>
<Func Name="FactorsECM"
Arg="n [, Curves [, Limit1 [, Limit2 [, Delta ] ] ] ]"
Label="Elliptic Curves Method, ECM"/>
<Func Name="ECM"
Arg="n [, Curves [, Limit1 [, Limit2 [, Delta ] ] ] ]"
Label="shorthand for FactorsECM"/>
<Returns>
a list of two lists: The first list contains the prime factors found,
and the second list contains remaining unfactored parts of <A>n</A>,
if there are any.
</Returns>
<Description>
This function tries to factor <A>n</A> using the Elliptic Curves
Method (ECM).
The argument <A>Curves</A> is the number of curves to be tried.
The argument <A>Limit1</A> is the initial
<Index Key="first stage limit">first stage limit</Index>
first stage limit, and <A>Limit2</A> is the initial
<Index Key="second stage limit">second stage limit</Index>
second stage limit.
The argument <A>Delta</A> is the increment per curve for the first stage
limit. The second stage limit is adjusted appropriately. Defaults are
chosen for all arguments which are omitted. <P/>
<C>FactorsECM</C> recognizes the option <A>ECMDeterministic</A>.
If set, the choice of the curves is deterministic.
This means that in repeated runs of <C>FactorsECM</C> the same
curves are used, and hence for the same <M>n</M> the same
factors are found after the same number of trials. <P/>
The Elliptic Curves Method is based on the fact that exponentiation
in the
<Index Key="elliptic curve groups">elliptic curve groups</Index>
elliptic curve groups <M>E(a,b)/n</M> can be performed fast enough
to compute for example <M>g^{k!}</M> for <M>k</M> large enough
(e.g. 100000 or so) in a reasonable amount of time and without
using much memory, and on Lagrange's Theorem.
Assume that <M>p</M> is a prime divisor of <M>n</M>.
Then Lagrange's Theorem states that if k! is a multiple of
<M>|E(a,b)/p|</M>, then for any
<Index Key="elliptic curve point">elliptic curve point</Index>
elliptic curve point <M>g</M>, the power <M>g^{k!}</M> is the
identity element of <M>E(a,b)/p</M>.
In this situation -- under reasonable circumstances --
the factor <M>p</M> can be determined by taking an appropriate gcd.
<P/>
In practice, the algorithm chooses in some sense <Q>better</Q>
products <M>P_k</M> of small primes rather than <M>k!</M> as exponents.
After reaching the first stage limit with <M>P_{Limit1}</M>, it
considers further products <M>P_{Limit1}q</M> for primes <M>q</M>
up to the second stage limit <A>Limit2</A>, which is usually set equal
to something like 100 times the first stage limit.
The prime <M>q</M> corresponds to the largest prime factor of the
order of the group <M>E(a,b)/p</M>. <P/>
A prime divisor <M>p</M> is usually found if the largest prime factor
of the order of one of the examined elliptic curve groups <M>E(a,b)/p</M>
is at most <A>Limit2</A> and the second-largest one is at most
<A>Limit1</A>. Thus trying a larger number of curves increases the chance
of factoring <A>n</A> as well as choosing a larger value
for <A>Limit1</A> and/or <A>Limit2</A>. It turns out to be not optimal
either to try a large number of curves with very small <A>Limit1</A>
and <A>Limit2</A> or to try only a few curves with very large limits.
(Compare with <Ref Func="FactorsPminus1"
Label="Pollard's p-1"/>.)
<P/>
The elements of the group <M>E(a,b)/n</M> are the points <M>(x,y)</M>
given by the solutions of <M>y^2 = x^3 + ax + by</M> in the residue
class ring (mod <M>n</M>), and an additional point <M>\infty</M>
at infinity, which serves as identity element.
To turn this set into a group, define the product (although elliptic
curve groups are usually written additively, I prefer using the
multiplicative notation here to retain the analogy
to <Ref Func="FactorsPminus1" Label="Pollard's p-1"/> and
<Ref Func="FactorsPplus1" Label="Williams' p+1"/>)
of two points <M>p_1</M> and <M>p_2</M> as follows:
If <M>p_1 \neq p_2</M>, let <M>l</M> be the line through <M>p_1</M>
and <M>p_2</M>, otherwise let <M>l</M> be the tangent to the
curve <M>C</M> given by the above equation in the point
<M>p_1 = p_2</M>. The line <M>l</M> intersects <M>C</M> in a third
point, say <M>p_3</M>. If <M>l</M> does not intersect the curve in
a third affine point, then set <M>p_3 := \infty</M>.
Define <M>p_1 \cdot p_2</M> by the image of <M>p_3</M> under
the reflection across the <M>x</M>-axis.
Define the product of any curve point <M>p</M> and <M>\infty</M> by
<M>p</M> itself. This -- more or less obviously, checking associativity
requires some calculation -- turns the set of points on the given curve
into an abelian group <M>E(a,b)/n</M>. <P/>
However, the calculations are done in
<Index Key="projective coordinates">projective coordinates</Index>
projective coordinates to have an explicit representation of the
identity element and to avoid calculating inverses (mod <M>n</M>)
for the group operation. Otherwise this would require using an
<M>O((log \ n)^3)</M>-algorithm, while multiplication (mod <M>n</M>)
is only <M>O((log \ n)^2)</M>. The corresponding equation is given by
<M>bY^2Z = X^3 + aX^2Z + XZ^2</M>. This form allows even more efficient
computations than the
<Index Key="Weierstrass model">Weierstrass model</Index>
Weierstrass model <M>Y^2Z = X^3 + aXZ^2 + bZ^3</M>, which is the
projective equivalent of the affine representation
<M>y^2 = x^3 + ax + by</M> mentioned above. The algorithm only keeps
track of two of the three coordinates, namely <M>X</M> and <M>Z</M>.
The curves are chosen in a way that ensures the order of the
corresponding group to be divisible by 12. This increases the chance
that it is smooth enough to find a factor of <M>n</M>.
The implementation follows the description of R. P. Brent given in
<Cite Key="Brent96"/>, pp. 5 -- 8. In terms of this paper,
for the second stage the <Q>improved standard continuation</Q> is used.
<Example>
<![CDATA[
gap> FactorsECM(2^256+1,100,10000,1000000,100);
[ [ 1238926361552897,
93461639715357977769163558199606896584051237541638188580280321 ]
, [ ] ]
]]>
</Example>
</Description>
</ManSection>
<Section Label="sec:CFRAC">
<Heading>The Continued Fraction Algorithm (CFRAC)</Heading>
<Index Key="Continued Fraction Algorithm (CFRAC)">
Continued Fraction Algorithm (CFRAC)
</Index>
<ManSection>
<Func Name="FactorsCFRAC" Arg="n"
Label="Continued Fraction Algorithm, CFRAC"/>
<Func Name="CFRAC" Arg="n"
Label="shorthand for FactorsCFRAC"/>
<Returns>
a list of the prime factors of <A>n</A>.
</Returns>
<Description>
This function tries to factor <A>n</A> using the Continued Fraction
Algorithm (CFRAC), also known as Brillhart-Morrison Algorithm.
In case of failure an error is signalled. <P/>
Caution: The run time of this function depends only on the size
of <A>n</A>, and not on the size of the factors.
Thus if a small factor is not found during the preprocessing
which is done before invoking the sieving process, the run time is
as long as if <A>n</A> would have two prime factors of roughly
equal size. <P/>
The Continued Fraction Algorithm tries to find integers <M>x</M>
and <M>y</M> such that <M>x^2 \equiv y^2</M> (mod <M>n</M>),
but not <M>\pm x \equiv \pm y</M> (mod <M>n</M>).
In this situation, taking <M>\gcd(x - y,n)</M> yields a nontrivial
divisor of <M>n</M>. For determining such a pair <M>(x,y)</M>,
the algorithm uses the continued fraction expansion of the square root
of <M>n</M>. If <M>x_i/y_i</M> is a
<Index Key="continued fraction approximation">
continued fraction approximation
</Index>
continued fraction approximation of the square root of <M>n</M>,
then <M>c_i := x_i^2 - ny_i^2</M> is bounded by a small constant times
the square root of <M>n</M>.
The algorithm tries to find as many <M>c_i</M> as possible which factor
completely over a chosen
<Index Key="factor base">factor base</Index>
factor base (a list of small primes) or with only one factor not in the
factor base. The latter ones can be used if and only if a second
<M>c_i</M> with the same <Q>large</Q> factor is found.
<Index Key="factor base" Subkey="large factors">factor base</Index>
Once enough values <M>c_i</M> have been factored, as a final stage
<Index Key="Gaussian Elimination">Gaussian Elimination</Index>
Gaussian Elimination over GF(2) is used to determine which of the
congruences <M>x_i^2 \equiv c_i</M> (mod <M>n</M>) have to be
multiplied together to obtain a congruence of the desired form
<M>x^2 \equiv y^2</M> (mod <M>n</M>).
Let <M>M</M> be the corresponding matrix. Then the entries
of <M>M</M> are given by <M>M_{ij} = 1</M> if an odd power of
the <M>j</M>-th element of the factor base divides the <M>i</M>-th
usable factored value, and <M>M_{ij} = 0</M> otherwise.
To obtain the desired congruence, it is necessary that the rows
of <M>M</M> are linearly dependent.
In other words, this means that the number of factored <M>c_i</M>
needs to be larger than the rank of <M>M</M>, which is
approximately given by the size of the factor base. (Compare
with <Ref Func="FactorsMPQS"
Label="Multiple Polynomial Quadratic Sieve, MPQS"/>.)
<Example>
<![CDATA[
gap> FactorsCFRAC( Factorial(34) - 1 );
[ 10398560889846739639, 28391697867333973241 ]
]]>
</Example>
</Description>
</ManSection>
<ManSection>
<Func Name="FactorsMPQS" Arg="n"
Label="Multiple Polynomial Quadratic Sieve, MPQS"/>
<Func Name="MPQS" Arg="n"
Label="shorthand for FactorsMPQS"/>
<Returns>
a list of the prime factors of <A>n</A>.
</Returns>
<Description>
This function tries to factor <A>n</A> using the Single Large Prime
Variation of the Multiple Polynomial Quadratic Sieve (MPQS).
In case of failure an error is signalled. <P/>
Caution: The run time of this function depends only on the size
of <A>n</A>, and not on the size of the factors.
Thus if a small factor is not found during the preprocessing
which is done before invoking the sieving process, the run time is
as long as if <A>n</A> would have two prime factors of roughly
equal size. <P/>
The intermediate results of a computation can be saved by
interrupting it with <C>[Ctrl][C]</C> and calling <C>Pause();</C>
from the break loop. This causes all data needed for resuming
the computation again to be pushed as a record <A>MPQSTmp</A>
on the options stack.
When called again with the same argument <A>n</A>, <C>FactorsMPQS</C>
takes the record from the options stack and continues with the previously
computed factorization data. For continuing the factorization process
in another session, one needs to write this record to a file.
This can be done by the function <C>SaveMPQSTmp(<A>filename</A>)</C>.
The file written by this function can be read by the standard
<C>Read</C>-function of &GAP;. <P/>
The Multiple Polynomial Quadratic Sieve tries to find integers <M>x</M>
and <M>y</M> such that <M>x^2 \equiv y^2</M> (mod <M>n</M>),
but not <M>\pm x \equiv \pm y</M> (mod <M>n</M>).
In this situation, taking <M>\gcd(x - y,n)</M> yields a nontrivial
divisor of <M>n</M>. For determining such a pair <M>(x,y)</M>,
the algorithm chooses polynomials <M>f_a</M> of the form
<M>f_a(r) = ar^2 + 2br + c</M> with suitably chosen coefficients
<M>a</M>, <M>b</M> and <M>c</M> which satisfy <M>b^2 \equiv n</M>
(mod <M>a</M>) and <M>c = (b^2 - n)/a</M>.
The identity <M>a \cdot f_a(r) = (ar + b)^2 - n</M> yields
a congruence (mod <M>n</M>) with a perfect square on one side
and <M>a \cdot f_a(r)</M> on the other. The algorithm uses a sieving
technique similar to the Sieve of Eratosthenes over an appropriately
chosen
<Index Key="sieving interval">sieving interval</Index>
sieving interval to search for factorizations of values <M>f_a(r)</M>
over a chosen factor base. Any two factorizations with the same single
<Q>large</Q> factor which does not belong to the factor base can also be
used. Taking more polynomials and hence shorter sieving intervals has
the advantage of having to factor smaller values <M>f_a(r)</M> over the
factor base. <P/>
Once enough values <M>f_a(r)</M> have been factored, as a final stage
Gaussian Elimination over GF(2) is used to determine which congruences
have to be multiplied together to obtain a congruence of the desired
form <M>x^2 \equiv y^2</M> (mod <M>n</M>). Let <M>M</M> be the
corresponding matrix. Then the entries of <M>M</M> are given by
<M>M_{ij} = 1</M> if an odd power of the <M>j</M>-th element of the
factor base divides the <M>i</M>-th usable factored value, and
<M>M_{ij} = 0</M> otherwise.
To obtain the desired congruence, it is necessary that the rows
of <M>M</M> are linearly dependent.
In other words, this means that the number of usable factorizations of
values <M>f_a(r)</M> needs to be larger than the rank of <M>M</M>.
The latter is approximately equal to the size of the factor base.
(Compare with <Ref Func="FactorsCFRAC"
Label="Continued Fraction Algorithm, CFRAC"/>.)
<Example>
<![CDATA[
gap> FactorsMPQS( Factorial(38) + 1 );
[ 14029308060317546154181, 37280713718589679646221 ]
]]>
</Example>
</Description>
</ManSection>
Die Informationen auf dieser Webseite wurden
nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit,
noch Qualität der bereit gestellten Informationen zugesichert.
Bemerkung:
Die farbliche Syntaxdarstellung ist noch experimentell.