Sparse polynomial interpolation consists in recovering of a sparse representation of a polynomial given by a blackbox program which computes values of at as many points as necessary. In practice is typically represented by DAGs (Directed Acyclic Graphs) or SPLs (Straight Line Programs). For polynomials with at most terms over a field, this task typically involves evaluations of, operations on integers for discovering the exponents of the nonzero terms of, and an additional number of operations in that can be bounded by for recovering the coefficients.

However, the latter complexity analysis does not take into account the sizes of coefficients in and the expression swell that might occur when evaluating at points with high height. For practical implementations, it is important to perform most of the computations over a prime finite field , where fits into a single machine register. On modern architectures, this typically means that . There mainly are two approaches which can be used over finite fields: the “prime number” approach and the “Kronecker substitution” approach. For more references and practical point of views on known algorithms we refer the reader to (see also in a more restricted context). We recall these approaches in Section2. In Section3, we present new techniques which may be used to further enhance and combine these classical approaches.

We report on our implementations in the library of the system. For the sake of conciseness some discussions on the probabilistic aspects will be informal. The interested reader might consult on this topic. Of course, once a candidate for the sparse representation is found, then the Schwartz-Zippel lemma can be used to verify it with a low level of failure.

Throughout this paper, we will use vector notation for -tuples: given variables , we write . If is a vector of elements in or the sequence of the variables , and if , then we write and . If with all the exponents pairwise distinct and , then for all we have

If the values are pairwise distinct, then it is classical that can be computed efficiently from the first terms of the power series . One may further recover the from the with suitable values of. If no bound for is known in advance, then we try to interpolate for successive upper bounds in geometric progression.

Over many common fields such as , the numbers
can become quite large. For efficiency reasons,
it is better to work over a finite field as much
as possible. Moreover, we wish to take a which
fits into a single machine register, so that computations in can be done very efficiently. When using modular
reduction, the problems of computing the and
computing the are actually quite distinct, and
are best treated separately. The do not mostly
depend on (except for a finite number of bad
values) and only need to be computed for one .
The are potentially large numbers in , so we may need to compute their reductions modulo several
and then recover them using rational number
reconstruction. From now on, we focus on the computation of the
*support* of written .
Let represent the partial degree of in , and let .

For the determination of suitable such that the can be reconstructed from the , the “prime number” approach is to take , where and is the -th prime number. As long as , we may then recover from by repeated divisions. If is the total degree of , then it suffices that . We recall that asymptotically grows as , so that can be taken of bit-size in . This approach is therefore efficient when is sufficiently small.

If , the “Kronecker substitution” approach is to take , where is a primitive root of the multiplicative group . Assuming that , we can compute all the value . Moreover, assuming that the prime number is “smooth”, meaning that has only small prime factors, the discrete logarithm problem can be solved efficiently in : given there exists a relatively efficient algorithm to compute with . In this way we deduce all the . This approach is efficient when the partial degrees are small. More specifically, we prefer it over the prime number approach if .

We notice that bounds for , are sometimes known in advance (e.g. from a syntactic expression for ) and sometimes have to be guessed themselves. The total degree can be guessed efficiently by using sparse interpolation for the univariate polynomial , where is a random point in . Similarly, the partial degrees are obtained by interpolating .

In this section we carry on with the same notation, and is the finite field . We aim at decomposing the interpolation problem into several ones with less variables in order to interpolate more polynomials than with the previously known techniques. For all our algorithms, we assume that the support of is known for some and random . We will write and for the total degrees of in resp. . Let and assume that . By the SchwartzZippel lemma, taking random in ensures that coincides with the projection of on with probability at least .

Assume that and let be such that are pairwise distinct. Such avector, seen as a linear form acting on the exponents, is said to separate the exponents of. A random vector in separates exponents with probability at least . Since one can quickly verify that is a separating form, it is worth spending time to search for with small entries.

Now consider a new variable and . The partial degree of in is , hence its total degree is . If is in the support of then there exists a unique exponent of such that . We can compute efficiently after sorting the accordingly to the values . We have thus shown how to reduce the sparse interpolation problem for to the “smaller” sparse interpolation problem for . If , then the method succeeds with high probability.

Let and let be the generating series associated to and as in(1). In a similar way, we may associate a generating series to . Using sparse interpolation, we find and such that(1) holds, as well as the analogous numbers and for . If , then these data allow us to recover the vectors from the ratios . We may next compute the complete from , since we assumed to be known. This strategy also applies to the “Kronecker substitution” approach. In that case, we take , and we require that . If none of the conditions or holds, then we let be maximal such that either or , and we recursively apply the strategy to and then to itself.

Let be as in the “Kronecker substitution” approach. For each in the support of , let be such that , and let be the smallest distance between any two distinct points of . If is large and the exponents of are random, then is expected to be of the order (this follows from the birthday problem: taking random points out of , the probability of a collision is as soon as ). Now assume that . Performing sparse interpolation on, we obtain numbers , as well as the corresponding with . For each , there is unique and with . We may thus recover from and . In order to improve the randomness of the numbers modulo , it is recommended to replace by a random larger number.

The platform divides into three main components: low level structuring
and efficient libraries for classical mathematical objects, a compiler
and an interpreter of the eponym language, and interfaces to external
libraries. is an open source academic project hosted at `http://gforge.inria.fr/projects/mmx/`.
The Internet home page concerning installation and documentation is
`http://www.mathemagix.org`.

Multivariate polynomials, power series, jets, and related algorithms are
integrated in the library of . Algorithms for sparse and dense
representations have been previously reported in our article, in which
we focused on fast polynomial products assuming given an overset of the
support. This short note reports on the implementation of DAGs, SLPs,
and sparse interpolation. At the level, a DAG with coefficients in has
type . Arithmetic operations, evaluation, substitution, and also
construction from a polynomial in sparse representation are available.
For efficiency purposes, the evaluation of one or several DAGs on
several points can be speeded up by building a SLP from it, whose type
is . These classes are available from and . An overview of the main
features in together with examples and useful links to browsable
source code is available from `http://www.mathemagix.org/www/multimix/doc/html/index.en.html`.

Sparse interpolation is implemented in . Complete examples of use can be
found in . If represents the polynomial function
that we want to interpolate as a polynomial in ,
then our main routines take as input afunction
such that is the preimage of
computed modulo aprime number. If computing modulo involves a division by
zero, then an error is raised. For efficiency reasons, such
afunction can be essentially a pointer to a
compiled function. Of course DAGs over or can be converted to such functions *via* SLPs.

In the next examples, which can be found in , we illustrate the
behaviours of the aforementioned algorithms on an *Intel(R) Core(TM)
i7-3720QM CPU@2.60GHz* platform with 8GB of *1600MHz DDR3*.