Sparse polynomial interpolation consists in recovering a sparse representation of a polynomial $f$ given by a blackbox program which computes values of $f$ at as many points as necessary. In practice, $f$ is typically represented by DAGs (Directed Acyclic Graphs) or SLPs (Straight Line Programs). For polynomials with at most $T$ terms over a field $\mathbb{K}$, this task typically involves $O(T)$ evaluations of $f$, operations on integers for discovering the exponents of the nonzero terms of $f$, and an additional number of operations in $\mathbb{K}$, softly linear in $T$, for recovering the coefficients.
However, the latter complexity analysis does not take into account the sizes of the coefficients of $f$ and the expression swell that might occur when evaluating $f$ at points of large height. For practical implementations, it is important to perform most of the computations over a prime finite field $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$, where $p$ fits into a single machine register. On modern architectures, this typically means that $p < 2^{64}$. There are mainly two approaches which can be used over finite fields: the “prime number” approach and the “Kronecker substitution” approach. For more references and practical points of view on known algorithms, we refer the reader to the literature (see also the references treating a more restricted context). We recall these approaches in Section 2. In Section 3, we present new techniques which may be used to further enhance and combine these classical approaches.
We report on our implementations in the multimix library of the Mathemagix system. For the sake of conciseness, some discussions of the probabilistic aspects will be informal; the interested reader might consult the literature on this topic. Of course, once a candidate for the sparse representation is found, the Schwartz-Zippel lemma can be used to verify it with a low probability of failure.
Throughout this paper, we will use vector notation for $n$-tuples: given variables $x_1, \ldots, x_n$, we write $x = (x_1, \ldots, x_n)$. If $a = (a_1, \ldots, a_n)$ is a vector of $n$ elements in $\mathbb{K}$ or the sequence of the variables $x_1, \ldots, x_n$, and if $e = (e_1, \ldots, e_n) \in \mathbb{N}^n$, then we write $a^e = a_1^{e_1} \cdots a_n^{e_n}$ and $a^k = (a_1^k, \ldots, a_n^k)$ for $k \in \mathbb{N}$. If $f = c_1 x^{e_1} + \cdots + c_T x^{e_T}$ with all the exponents $e_i$ pairwise distinct and $c_i \neq 0$, then for all $k \in \mathbb{N}$ we have
$$f(a^k) = c_1 (a^{e_1})^k + \cdots + c_T (a^{e_T})^k.$$
If the values $a^{e_1}, \ldots, a^{e_T}$ are pairwise distinct, then it is classical that the $a^{e_i}$ and the coefficients $c_i$ can be computed efficiently from the first $2T$ terms of the power series
$$\sum_{k \geq 0} f(a^k) z^k = \frac{c_1}{1 - a^{e_1} z} + \cdots + \frac{c_T}{1 - a^{e_T} z}. \qquad (1)$$
One may further recover the exponents $e_i$ from the values $a^{e_i}$ for suitable choices of $a$. If no bound for $T$ is known in advance, then we try to interpolate for successive upper bounds on $T$ in geometric progression.
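As a concrete illustration of this classical recovery scheme, the following Python sketch (our own toy code, not the paper's implementation) recovers the values $a^{e_i}$ and the coefficients $c_i$ from the first $2T$ terms of the sequence $f(a^k)$ over a small prime field, using the Berlekamp-Massey algorithm; the blackbox and all parameters are illustrative assumptions.

```python
# Toy Prony-style sparse interpolation over F_p (illustrative parameters).
p = 101                    # small prime, so roots can be found by search

def blackbox(x1, x2):      # hypothetical blackbox for f = 3*x1^2*x2 + 5*x2^4
    return (3 * x1 ** 2 * x2 + 5 * x2 ** 4) % p

T = 2                      # bound on the number of terms
a = (7, 11)                # evaluation point; the a^{e_i} must be distinct

# first 2T terms of the sequence f(a^k)
v = [blackbox(pow(a[0], k, p), pow(a[1], k, p)) for k in range(2 * T)]

def berlekamp_massey(s):
    """Connection polynomial C with C[0] = 1 annihilating s over F_p."""
    C, B, L, m, b = [1], [1], 0, 1, 1
    for n in range(len(s)):
        d = sum(C[i] * s[n - i] for i in range(L + 1)) % p
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, p - 2, p) % p
        newC = C + [0] * (len(B) + m - len(C))
        for i, Bi in enumerate(B):
            newC[i + m] = (newC[i + m] - coef * Bi) % p
        if 2 * L <= n:
            L, B, b, m, C = n + 1 - L, C, d, 1, newC
        else:
            C, m = newC, m + 1
    return C, L

C, L = berlekamp_massey(v)
# the a^{e_i} are the inverses of the roots of C
roots = [r for r in range(1, p)
         if sum(c * pow(r, j, p) for j, c in enumerate(C)) % p == 0]
nodes = sorted(pow(r, p - 2, p) for r in roots)

def solve(A, rhs):         # tiny Gaussian elimination modulo p
    n = len(A)
    M = [row[:] + [x] for row, x in zip(A, rhs)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] % p)
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], p - 2, p)
        M[c] = [x * inv % p for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] % p:
                fac = M[r][c]
                M[r] = [(x - fac * y) % p for x, y in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]

# transposed Vandermonde system for the coefficients
coeffs = solve([[pow(b, k, p) for b in nodes] for k in range(L)], v[:L])
```

Here `nodes` recovers $\{7^2 \cdot 11, 11^4\} \bmod 101 = \{34, 97\}$ and `coeffs` the corresponding coefficients $3$ and $5$.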
Over many common fields such as $\mathbb{Q}$, the numbers $f(a^k)$ can become quite large. For efficiency reasons, it is better to work over a finite field $\mathbb{F}_p$ as much as possible. Moreover, we wish to take a prime $p$ which fits into a single machine register, so that computations in $\mathbb{F}_p$ can be done very efficiently.
When using modular reduction, the problems of computing the exponents $e_i$ and computing the coefficients $c_i$ are actually quite distinct, and are best treated separately. The $e_i$ mostly do not depend on $p$ (except for a finite number of bad values) and only need to be computed for one $p$. The $c_i$ are potentially large numbers in $\mathbb{Q}$, so we may need to compute their reductions modulo several primes $p$ and then recover them using rational number reconstruction. From now on, we focus on the computation of the support of $f$, written $\operatorname{supp} f = \{e_1, \ldots, e_T\}$. Let $d_i$ represent the partial degree of $f$ in $x_i$, and let $d$ denote the total degree of $f$.
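The modular strategy for the coefficients can be sketched as follows in Python (our own illustration: the coefficient $-22/7$, the primes, and the helper names are all assumptions, not the paper's code); residues modulo several primes are combined by the Chinese remainder theorem and then lifted back to $\mathbb{Q}$ by rational number reconstruction.

```python
# Toy sketch: recover a rational coefficient from its images modulo primes,
# via CRT followed by rational number reconstruction (half-extended Euclid).
from fractions import Fraction
from math import isqrt

def crt(residues, moduli):
    """Combine residues into one residue modulo the product of the moduli."""
    x, m = 0, 1
    for r, q in zip(residues, moduli):
        t = (r - x) * pow(m, -1, q) % q   # solve x' = x (m), x' = r (q)
        x, m = x + m * t, m * q
    return x % m, m

def rational_reconstruction(u, m):
    """Find a/b with a = u*b (mod m) and |a|, b bounded by sqrt(m/2)."""
    bound = isqrt(m // 2)
    r0, r1, s0, s1 = m, u % m, 0, 1
    while r1 > bound:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
    if abs(s1) > bound:
        return None                       # reconstruction failed
    a, b = r1, s1
    if b < 0:
        a, b = -a, -b
    return a, b

primes = [10007, 10009, 10037]
c = (-22, 7)                              # the unknown coefficient -22/7
residues = [c[0] * pow(c[1], -1, q) % q for q in primes]
u, m = crt(residues, primes)
a_rec, b_rec = rational_reconstruction(u, m)
assert Fraction(a_rec, b_rec) == Fraction(-22, 7)
```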
For the determination of a suitable point $a$ such that the $e_i$ can be reconstructed from the values $a^{e_i}$, the “prime number” approach is to take $a = (p_1, \ldots, p_n)$, where $p_1 = 2 < p_2 = 3 < \cdots$ and $p_i$ is the $i$-th prime number. As long as $a^{e_i} < p$, we may then recover $e_i$ from $a^{e_i}$ by repeated divisions. If $d$ is the total degree of $f$, then it suffices that $p_n^d < p$. We recall that $p_n$ asymptotically grows as $n \log n$, so that $p$ can be taken of bit-size in $O(d \log (n \log n))$. This approach is therefore efficient when $d$ is sufficiently small.
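The repeated-division decoding can be made concrete with a short Python sketch (our own toy, with illustrative parameters): each variable is assigned a prime, and the exponent vector is read off the integer $a^e$ by dividing out each prime as often as possible.

```python
# Toy illustration of the "prime number" approach: with a_i the i-th prime,
# the integer a^e determines the exponent vector e by repeated division.
primes = [2, 3, 5, 7, 11]            # a_1, ..., a_n for n = 5 variables

def encode(e):
    """a^e = prod_i p_i^{e_i}, assumed to remain below the working prime p."""
    v = 1
    for pi, ei in zip(primes, e):
        v *= pi ** ei
    return v

def decode(v):
    """Recover e from a^e by dividing out each prime as often as possible."""
    e = []
    for pi in primes:
        ei = 0
        while v % pi == 0:
            v //= pi
            ei += 1
        e.append(ei)
    assert v == 1, "input was not a product of the chosen primes"
    return e

assert encode([3, 0, 2, 1, 0]) == 1400           # 2^3 * 5^2 * 7
assert decode(encode([3, 0, 2, 1, 0])) == [3, 0, 2, 1, 0]
```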
If $(d_1 + 1) \cdots (d_n + 1) \leq p - 1$, the “Kronecker substitution” approach is to take $a = (g, g^{d_1+1}, g^{(d_1+1)(d_2+1)}, \ldots, g^{(d_1+1) \cdots (d_{n-1}+1)})$, where $g$ is a primitive root of the multiplicative group $\mathbb{F}_p^*$. Assuming that each exponent satisfies $0 \leq e_{i,j} \leq d_j$, we can compute all the values $a^{e_i} = g^{k_i}$ with $k_i = e_{i,1} + e_{i,2} (d_1+1) + \cdots + e_{i,n} (d_1+1) \cdots (d_{n-1}+1)$. Moreover, assuming that the prime number $p$ is “smooth”, meaning that $p - 1$ has only small prime factors, the discrete logarithm problem can be solved efficiently in $\mathbb{F}_p^*$: given $g^k$, there exists a relatively efficient algorithm (such as Pohlig-Hellman) to compute $k$ with $0 \leq k < p - 1$. In this way we deduce all the exponents $e_i$. This approach is efficient when the partial degrees are small. More specifically, we prefer it over the prime number approach if $(d_1+1) \cdots (d_n+1)$ is smaller than $p_n^d$.
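The packing and unpacking of exponents under the Kronecker substitution can be illustrated with a small Python toy (our own parameters; a brute-force discrete logarithm stands in for a Pohlig-Hellman solver over a smooth prime).

```python
# Toy Kronecker substitution: exponents are packed into a single discrete
# logarithm k modulo p - 1, then unpacked by mixed-radix divisions.
p, g = 101, 2          # 2 is a primitive root modulo 101
d = [3, 4, 2]          # partial degree bounds d_1, d_2, d_3
assert (d[0] + 1) * (d[1] + 1) * (d[2] + 1) <= p - 1   # 4*5*3 = 60

# a = (g, g^{d1+1}, g^{(d1+1)(d2+1)})
strides = [1, d[0] + 1, (d[0] + 1) * (d[1] + 1)]
a = [pow(g, s, p) for s in strides]

def pack(e):           # k = e1 + e2 (d1+1) + e3 (d1+1)(d2+1)
    return sum(ei * s for ei, s in zip(e, strides))

def dlog(b):           # brute-force discrete log (stand-in for Pohlig-Hellman)
    x = 1
    for k in range(p - 1):
        if x == b:
            return k
        x = x * g % p
    raise ValueError("not a power of g")

def unpack(k):         # invert the mixed-radix encoding by divisions
    e = []
    for di in d:
        e.append(k % (di + 1))
        k //= di + 1
    return e

e = [2, 3, 1]
value = a[0] ** e[0] * a[1] ** e[1] * a[2] ** e[2] % p   # a^e = g^{pack(e)}
assert dlog(value) == pack(e) == 34
assert unpack(dlog(value)) == e
```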
We notice that bounds for $d$ and $d_1, \ldots, d_n$ are sometimes known in advance (e.g. from a syntactic expression for $f$) and sometimes have to be guessed themselves. The total degree $d$ can be guessed efficiently by using sparse interpolation for the univariate polynomial $f(a_1 t, \ldots, a_n t)$, where $a$ is a random point in $(\mathbb{F}_p^*)^n$. Similarly, the partial degrees $d_i$ are obtained by interpolating $f(a_1, \ldots, a_{i-1}, t, a_{i+1}, \ldots, a_n)$.
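The degree-guessing idea can be checked with a small Python experiment (our own construction, with a hypothetical blackbox; a dense finite-difference computation stands in for the sparse univariate interpolation suggested in the text): the total degree of $f$ is the degree of $g(t) = f(a_1 t, \ldots, a_n t)$ for random nonzero $a_i$.

```python
# Toy degree guessing: deg g is the largest k whose k-th finite differences
# of g at consecutive points do not vanish (valid since deg g < p).
import random

p = 10007

def f(x1, x2, x3):            # hypothetical blackbox of total degree 7
    return (x1 ** 2 * x2 ** 4 * x3 + 3 * x1 * x2 + 5) % p

a = [random.randrange(1, p) for _ in range(3)]   # random nonzero a_i
bound = 16                    # trial upper bound, to be grown geometrically
vals = [f(*[ai * t % p for ai in a]) for t in range(bound + 1)]

row, k = vals, 0
while any(row):               # k-th differences vanish iff deg g < k
    row = [(row[i + 1] - row[i]) % p for i in range(len(row) - 1)]
    k += 1
deg = k - 1
assert deg == 7
```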
In this section we carry on with the same notation, and $\mathbb{K}$ is the finite field $\mathbb{F}_p$. We aim at decomposing the interpolation problem into several problems involving fewer variables, in order to interpolate more polynomials than with the previously known techniques. For all our algorithms, we assume that the support $\operatorname{supp} f(a_1 t, \ldots, a_n t)$ of $f$ with respect to a single variable $t$ is known for some random $a \in (\mathbb{F}_p^*)^n$. We will write $d_x$ and $d_y$ for the total degrees of $f$ in $x = (x_1, \ldots, x_m)$ resp. $y = (x_{m+1}, \ldots, x_n)$, where $1 \leq m < n$. Let $\bar{f} = f(x_1, \ldots, x_m, b_{m+1}, \ldots, b_n)$ and assume that $T d_y < p$. By the Schwartz-Zippel lemma, taking $b = (b_{m+1}, \ldots, b_n)$ random in $\mathbb{F}_p^{n-m}$ ensures that $\operatorname{supp} \bar{f}$ coincides with the projection of $\operatorname{supp} f$ on the first $m$ coordinates with probability at least $1 - T d_y / p$.
Assume that $T \geq 2$ and let $\lambda \in \mathbb{N}^n$ be such that the integers $\lambda \cdot e_1, \ldots, \lambda \cdot e_T$ are pairwise distinct. Such a vector $\lambda$, seen as a linear form acting on the exponents, is said to separate the exponents of $f$. A random vector in $\{0, \ldots, N-1\}^n$ separates exponents with probability at least $1 - T^2/(2N)$. Since one can quickly verify that $\lambda$ is a separating form, it is worth spending time to search for $\lambda$ with small entries.
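Such a search can be sketched in a few lines of Python (our own toy, with an assumed example support): candidate vectors are enumerated by increasing largest entry, so the first separating form found has small entries.

```python
# Toy search for a separating form: lambda separates the support if the
# dot products lambda . e_i are pairwise distinct.
from itertools import product

support = [(3, 0, 2), (1, 4, 0), (0, 0, 5), (2, 2, 2)]   # example exponents

def separates(lam, support):
    dots = [sum(l * e for l, e in zip(lam, ev)) for ev in support]
    return len(set(dots)) == len(dots)

def smallest_separating_form(support, n):
    # enumerate vectors by increasing maximal entry, smallest first
    for bound in range(1, 64):
        for lam in product(range(bound + 1), repeat=n):
            if max(lam) == bound and separates(lam, support):
                return lam
    return None

lam = smallest_separating_form(support, 3)
assert lam == (1, 0, 0) and separates(lam, support)
```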
Now consider a new variable $u$ and let $g = f(x_1 u^{\lambda_1}, \ldots, x_n u^{\lambda_n})$. The partial degree of $g$ in $u$ is $\max_i \lambda \cdot e_i$, hence its total degree is at most $d + \max_i \lambda \cdot e_i$. If $(e, k)$ is in the support of $g$, then $e$ is the unique exponent of $f$ such that $k = \lambda \cdot e$. We can thus compute $\operatorname{supp} f$ efficiently after sorting the exponents according to the values $\lambda \cdot e_i$. We have thus shown how to reduce the sparse interpolation problem for $f$ to the “smaller” sparse interpolation problem for $\bar{f}$. If $p$ is sufficiently large, then the method succeeds with high probability.
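The effect of the substitution can be visualized with a tiny Python check (our own notation; the support and the form $\lambda$ are assumed examples): each term of $f$ is tagged with the extra exponent $\lambda \cdot e$, and since $\lambda$ separates, the tags determine the original exponents.

```python
# Toy illustration of x_i -> x_i u^{lambda_i}: supp g consists of the
# exponents of f extended by the tag lambda . e.
f_support = [(3, 0, 2), (1, 4, 0), (0, 0, 5)]   # assumed exponents of f
lam = (1, 3, 2)                                 # a separating form for them

g_support = {e + (sum(l * ei for l, ei in zip(lam, e)),) for e in f_support}

# the partial degree of g in u is the largest tag lambda . e
assert max(ge[-1] for ge in g_support) == 13
# each tag k determines a unique exponent e of f
recovered = sorted(ge[:-1] for ge in g_support)
assert recovered == sorted(f_support)
```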
Let $b \in (\mathbb{F}_p^*)^n$ and let $F(z) = \sum_{i=1}^T c_i / (1 - a^{e_i} z)$ be the generating series associated to $f$ and $a$ as in (1). In a similar way, we may associate a generating series $F^*(z)$ to $f$ and the point $a b = (a_1 b_1, \ldots, a_n b_n)$. Using sparse interpolation, we find the $c_i$ and the $a^{e_i}$ such that (1) holds, as well as the analogous numbers $c_i^*$ and $(a b)^{e_i}$ for $F^*$. If the $c_i$ are pairwise distinct, then these data allow us to recover the values $b^{e_i}$ from the ratios $(a b)^{e_i} / a^{e_i}$. We may next compute the complete exponent vectors $e_i$ from the $b^{e_i}$, since we assumed $\operatorname{supp} f(a_1 t, \ldots, a_n t)$ to be known. This strategy also applies to the “Kronecker substitution” approach. In that case, we take $b = (g, g^{d_1+1}, \ldots, g^{(d_1+1) \cdots (d_{n-1}+1)})$, and we require that $(d_1+1) \cdots (d_n+1) \leq p - 1$. If none of the conditions $p_n^d < p$ or $(d_1+1) \cdots (d_n+1) \leq p - 1$ holds, then we let $m < n$ be maximal such that either $p_m^d < p$ or $(d_1+1) \cdots (d_m+1) \leq p - 1$, and we recursively apply the strategy to the remaining variables $x_{m+1}, \ldots, x_n$ and then to $f$ itself.
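The ratio mechanism can be sanity-checked with a short Python toy (our own reconstruction; the support, the coefficients, and the points are assumptions): interpolating at $a$ and at $a b$ yields matching coefficients, and dividing the matching nodes yields $b^{e_i}$.

```python
# Toy check of the ratio trick: (a*b)^{e_i} / a^{e_i} = b^{e_i} in F_p,
# with terms matched through their pairwise distinct coefficients c_i.
p = 10007
terms = {(2, 1): 3, (0, 4): 5, (3, 3): 11}   # c_i chosen pairwise distinct
a, b = (2, 3), (5, 7)

def nodes(pt):                                # c_i -> pt^{e_i}
    return {c: pow(pt[0], e[0], p) * pow(pt[1], e[1], p) % p
            for e, c in terms.items()}

na = nodes(a)
nab = nodes((a[0] * b[0] % p, a[1] * b[1] % p))
ratios = {c: nab[c] * pow(na[c], p - 2, p) % p for c in na}
expected = {c: pow(b[0], e[0], p) * pow(b[1], e[1], p) % p
            for e, c in terms.items()}
assert ratios == expected                     # the ratios are exactly b^{e_i}
```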
Let $a$ and $g$ be as in the “Kronecker substitution” approach. For each value $a^{e_i}$ in the support of the series (1), let $k_i$ be such that $a^{e_i} = g^{k_i}$ and $0 \leq k_i < p - 1$, and let $\delta$ be the smallest distance between any two distinct points of $\{k_1, \ldots, k_T\}$. If $p$ is large and the exponents of $f$ are random, then $\delta$ is expected to be of the order $(p-1)/T^2$ (this follows from the birthday problem: taking $T$ random points out of $N$, the probability of a collision becomes significant as soon as $T$ is of the order $\sqrt{N}$). Now assume that $m$ is a divisor of $p - 1$ with $m < p - 1$. Performing sparse interpolation on the sequence $f(a^{s k})$ with $s = (p-1)/m$, we obtain the numbers $g^{s k_i}$, where $g^s$ has order $m$, as well as the corresponding coefficients $c_i$. For each $i$, there is a unique quotient $q_i$ and remainder $r_i = k_i \bmod m$ with $k_i = q_i m + r_i$. We may thus recover $k_i$ from $q_i$ and $r_i$. In order to improve the randomness of the numbers $k_i$ modulo $m$, it is recommended to replace $m$ by a random larger number.
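The birthday-problem heuristic behind the estimate for $\delta$ can be checked empirically with a small Python simulation (our own experiment; all parameters are illustrative): the minimal gap between $T$ random residues out of $N$ concentrates around $N/T^2$.

```python
# Quick experiment backing the birthday-problem heuristic: the minimal gap
# between T random points out of N has expected order N / T^2.
import random
random.seed(0)

N, T, trials = 1_000_003, 1000, 50
gaps = []
for _ in range(trials):
    pts = sorted(random.randrange(N) for _ in range(T))
    gaps.append(min(b - a for a, b in zip(pts, pts[1:])))
avg = sum(gaps) / trials
# expected order of magnitude: N / T^2, about 1 here
assert 0.05 < avg / (N / T ** 2) < 20
```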
The Mathemagix platform divides into three main components: low level data structures and efficient libraries for classical mathematical objects, a compiler and an interpreter of the eponymous language, and interfaces to external libraries. Mathemagix is an open source academic project hosted at http://gforge.inria.fr/projects/mmx/. The Internet home page concerning installation and documentation is http://www.mathemagix.org.
Multivariate polynomials, power series, jets, and related algorithms are integrated in the multimix library of Mathemagix. Algorithms for sparse and dense representations have been previously reported in our earlier article, in which we focused on fast polynomial products assuming that an overset of the support is given. This short note reports on the implementation of DAGs, SLPs, and sparse interpolation. At the Mathemagix level, a DAG with coefficients in a given ring has a dedicated type. Arithmetic operations, evaluation, substitution, and also construction from a polynomial in sparse representation are available. For efficiency purposes, the evaluation of one or several DAGs at several points can be sped up by building an SLP from them, which again has a dedicated type. These classes are available from the corresponding source files of the library. An overview of the main features of multimix, together with examples and useful links to browsable source code, is available from http://www.mathemagix.org/www/multimix/doc/html/index.en.html.
Sparse interpolation is implemented in the same library, and complete examples of use are distributed with it. If $F$ represents the polynomial function that we want to interpolate as a polynomial in $x_1, \ldots, x_n$, then our main routines take as input a function which, given a point, returns the value of $F$ at this point computed modulo a prime number $p$. If computing this value modulo $p$ involves a division by zero, then an error is raised. For efficiency reasons, such a function can be essentially a pointer to a compiled function. Of course, DAGs over the integers or the rationals can be converted to such functions via SLPs.
In the next examples, which are also distributed with the library, we illustrate the behaviour of the aforementioned algorithms on an Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz platform with 8 GB of 1600 MHz DDR3 memory.