This paper is part of a project that has received funding from the French "Agence de l'innovation de défense".

CNRS (UMI 3069, PIMS), Department of Mathematics, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia V5A 1S6, Canada

Abstract. In this paper, we present a probabilistic algorithm to multiply two sparse polynomials almost as efficiently as two dense univariate polynomials with a result of approximately the same size. The algorithm depends on unproven heuristics that will be made precise.

Let $P, Q \in \mathbb{Z}[x_1,\ldots,x_n]$ be polynomials that are represented in the usual way as linear combinations of power products. The problem of sparse polynomial multiplication is to compute the product $R = P\,Q$ in a way that is as efficient as possible in terms of the total bitsize of $P$, $Q$, and $R$ (and where we use a similar representation for $R$ as for $P$ and $Q$). For pedagogical reasons, we mainly restrict our attention to polynomials with integer coefficients. Together with polynomials with rational coefficients, this is indeed the most important case for practical implementations inside computer algebra systems. Nevertheless, it is not hard to adapt our techniques to coefficients in more general rings (some indications to that effect are given in the last section). Still for pedagogical reasons, we will carry out our complexity analysis in the RAM model. We expect our algorithms to adapt to the Turing model, but more work will be needed to prove this and some of the constant factors might deteriorate.

For polynomials of modest size, naive algorithms are often most efficient. We refer to the literature for implementation techniques that are efficient in practice. Various types of faster algorithms have been proposed for polynomials with special supports. Asymptotically fast methods for polynomials of large sizes usually rely on sparse interpolation.
The seminal paper by Ben-Or and Tiwari triggered the development of many fast algorithms for the sparse interpolation of polynomial blackbox functions. In this framework, the unknown polynomial is given through a blackbox function that can be evaluated at points in suitable extensions of the coefficient ring. We refer to the literature for a nice survey on sparse interpolation and other algorithms to compute with sparse polynomials. The present paper grew out of our recent preprint with Grégoire Lecerf on this topic; the idea to "exploit colliding terms" in section 6.6 of that preprint forms the starting point of our work.

The most efficient algorithms for sparse interpolation are mostly probabilistic. Here we note that it is usually easy to check that the result is correct with high probability: just evaluate both the blackbox function and its supposed interpolation at a random point and verify that both evaluations coincide. In this paper, all algorithms will be probabilistic, which is suitable for the practical purposes that we are interested in. The running times of our algorithms also rely on suitable heuristics that we will make precise.

Although the multiplication problem for sparse polynomials does not directly fit into the usual blackbox model, it does benefit from the techniques that have been developed for sparse interpolation. Practical algorithms along these lines have appeared in the literature. Most algorithms operate in two phases: one first needs to determine the exponents of the product and then its coefficients. The first phase is typically more expensive when the coefficients of the product are small, but it becomes cheap for large coefficients, due to the fact that we may first reduce modulo a suitable prime. It is also customary to distinguish between the supersparse case, in which the total degree of the product is allowed to become huge, and the normally sparse case, in which the total degree remains small. In this paper, we mainly focus on the second problem, which is the most important one for practical applications.
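The random-evaluation check just described can be sketched as follows, for integer polynomials stored as dictionaries mapping exponent tuples to coefficients (the representation, the helper names, and the choice of prime are ours, not the paper's):

```python
import random

def eval_sparse(poly, point, p):
    """Evaluate a sparse polynomial {exponent_tuple: coeff} at a point modulo p."""
    total = 0
    for exps, c in poly.items():
        term = c % p
        for x, e in zip(point, exps):
            term = term * pow(x, e, p) % p
        total = (total + term) % p
    return total

def probably_equal(P, Q, R, p=2**61 - 1, trials=3):
    """Check R = P*Q with high probability by comparing evaluations
    of P*Q and R at random points modulo the prime p."""
    n = len(next(iter(P)))  # number of variables
    for _ in range(trials):
        pt = [random.randrange(1, p) for _ in range(n)]
        if eval_sparse(P, pt, p) * eval_sparse(Q, pt, p) % p != eval_sparse(R, pt, p):
            return False
    return True
```

A wrong candidate product is rejected as soon as one random evaluation disagrees, which happens with overwhelming probability for a large prime $p$.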
In order to describe the complexity results, let us introduce some notation. Given a polynomial $P \in \mathbb{Z}[x_1,\ldots,x_n]$, we will write $d_P$ for its total degree, $t_P$ for its number of terms, $s_P$ for the number of powers $x_i^k$ that occur in its representation, and $N_P$ for the maximal absolute value of a coefficient. For instance, if $P = 3\,x_1^4 - 20\,x_1 x_2 x_3 x_4 + x_2^3$, then we have $d_P = 4$, $t_P = 3$, $s_P = 6$, and $N_P = 20$. For our multiplication problem $R = P\,Q$, the degree $d = d_R \leq d_P + d_Q$ of the result is easily determined, but we usually only assume a bound $t \geq t_R$ with $t = O(t_R)$ for its number of terms. It is interesting to keep track of the dependency of our running times on logarithmic factors in certain parameters, but it is also convenient to ignore less important logarithmic and sublogarithmic factors. We do this through a soft-Oh type notation $O^{\flat}(\cdot)$, which is defined like the usual $\tilde{O}(\cdot)$, except that it also absorbs sublogarithmic factors in the main parameters.

We also wish to compare the cost of our algorithms with the cost of multiplying dense univariate polynomials of approximately the same size. Given integers $r, N \geq 1$, we therefore also introduce the following two complexities:

- $\mathsf{M}_r(N)$ stands for the cost of multiplying two non-zero polynomials in $\mathbb{Z}[u]/(u^r - 1)$ under the assumption that the product $R$ satisfies $N_R \leq N$.
- $\mathsf{M}'_r(N)$ stands for the cost of multiplying two polynomials in $(\mathbb{Z}/N\mathbb{Z})[u]/(u^r - 1)$.

We make the customary assumption that $\mathsf{M}_r(N)/r$ and $\mathsf{M}'_r(N)/r$ are non-decreasing as functions in $r$. One may take $\mathsf{M}'_r(N) = O^{\flat}(r \log N)$; one then also has $\mathsf{M}_r(N) = O^{\flat}(r \log N)$, using Kronecker substitution.

One traditional approach for sparse polynomial multiplication is to evaluate $P$, $Q$, and $R$ at points in a geometric progression modulo a sufficiently large prime number $p$. In combination with the tangent Graeffe method, this approach allows the exponents of $R$ to be computed in time
\[ O^{\flat}\big(\mathsf{M}'_T(p)\, \log d\big), \tag{1} \]
where $T \geq t_R$. The coefficients can then be recovered using fast Vandermonde system solving, in time
\[ O^{\flat}\big(\mathsf{M}_T(N')\, \log t\big), \tag{2} \]
where $N'$ bounds the sizes of the intermediate results. In our case, when $d$ is small, the prime $p$ may be taken of moderate size, in which case (1) simplifies into $O^{\flat}(T \log p\, \log d)$.
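With a dictionary representation mapping exponent tuples to coefficients, the quantities $d_P$, $t_P$, $s_P$, and $N_P$ are computed in one pass (a sketch; the function name is ours):

```python
def poly_stats(P):
    """Compute (d_P, t_P, s_P, N_P) for a sparse polynomial given as
    {exponent_tuple: coefficient}: total degree, number of terms,
    number of distinct powers x_i^k with k > 0, max |coefficient|."""
    d = max(sum(e) for e in P)
    t = len(P)
    powers = {(i, k) for e in P for i, k in enumerate(e) if k > 0}
    s = len(powers)
    N = max(abs(c) for c in P.values())
    return d, t, s, N
```

For the example polynomial above this returns $(4, 3, 6, 20)$.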
The dependence of the complexity on $d$ can be reduced using various techniques from the literature. The main results of this paper are two faster probabilistic algorithms. The shorter running times rely on two heuristics $\mathbf{H}_1$ and $\mathbf{H}_2$ that will be detailed below. For any $\lambda > \lambda_c$, we show (Theorem 2) that the exponents of $R$ can be computed in time
\[ 6 \lambda\, \mathsf{M}'_T(\pi) + O^{\flat}\big((s_P + s_Q) \log \pi + (t_P + t_Q) \log N + T\, n\big), \tag{3} \]
where $r = \lceil \lambda T \rceil$ and $\pi$ is a sufficiently large prime. This algorithm is probabilistic of Monte Carlo type. Based on numerical evidence, we conjecture that the critical value satisfies $\lambda_c \approx 0.407265$. We also show (Theorem 1) that the coefficients may be computed in expected time
\[ 3 \lambda\, \mathsf{M}_t(N) + O^{\flat}\big((s_P + s_Q) \log t + (t_P + t_Q + t) \log N\big), \tag{4} \]
using a probabilistic algorithm of Las Vegas type. In practice, when $n$ is small and $N$ is not too large with respect to $t$, the corrective terms in (3) are negligible and the cost reduces to $(6\lambda + o(1))\, \mathsf{M}'_T(\pi)$. Similarly, the cost (4) usually simplifies to $(3\lambda + o(1))\, \mathsf{M}_t(N)$. If we also have $\mathsf{M}'_T(\pi) = o(\mathsf{M}_t(N))$, then this means that the cost of the entire sparse multiplication becomes $(3\lambda + o(1))\, \mathsf{M}_t(N)$. Here we note that $\mathsf{M}_t(N)$ also corresponds to the time needed to multiply two dense univariate polynomials, provided that the product satisfies $t_R \leq t$ and $N_R \leq N$.

The proof of these bounds relies on the evaluation of $P$, $Q$, and $R$ at three points of the form $(u^{\alpha_1}, \ldots, u^{\alpha_n})$ in algebras of the form $\mathbb{Z}[u]/(u^r - 1)$, where $r = \lceil \lambda t \rceil$. If $\lambda$ is sufficiently large (namely $\lambda > \lambda_c$ for some critical value $\lambda_c$) and we already know the exponents of $R$, then we show how to recover the coefficients with high probability. One interesting feature of our algorithm is that three evaluations are sufficient with high probability. A logarithmic number of evaluations is necessary when using the more obvious iterative approach, for which every additional evaluation allows us to compute a constant fraction of the unknown coefficients (with high probability). Our algorithm is first explained on an example and then in general, after which we carry out its probabilistic analysis.
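The evaluations underlying these bounds map every term $c\,x^e$ to $c\,u^{\alpha \cdot e \bmod r}$; a minimal sketch with schoolbook multiplication in $\mathbb{Z}[u]/(u^r - 1)$ (the stated bounds assume fast multiplication instead; the function names are ours):

```python
def eval_cyclic(P, alpha, r):
    """Evaluate a sparse polynomial at (u^alpha_1, ..., u^alpha_n) in Z[u]/(u^r - 1):
    the term c*x^e contributes c to the coefficient of u^(alpha . e mod r)."""
    R = [0] * r
    for e, c in P.items():
        R[sum(a * k for a, k in zip(alpha, e)) % r] += c
    return R

def mul_cyclic(A, B, r):
    """Multiply two elements of Z[u]/(u^r - 1) (schoolbook; FFT-based in practice)."""
    C = [0] * r
    for i, a in enumerate(A):
        if a:
            for j, b in enumerate(B):
                C[(i + j) % r] += a * b
    return C
```

Since $x^e \mapsto u^{\alpha \cdot e \bmod r}$ is a ring homomorphism, the image of $P Q$ equals the product of the images, which is how the three evaluations of $R$ are actually obtained.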
We then extend our approach to the computation of the exponents, using three additional evaluations. The last part of the paper is devoted to variants and extensions of our approach, and further remarks. The present paper works out an idea that was first mentioned in our preprint with Lecerf, in the context of general sparse interpolation. The application to polynomial multiplication is particularly suitable because of the low amortized cost of blackbox evaluations. The idea of using evaluations in (small) cyclic algebras has been used before, but with a less optimal complexity. This approach also seems close to binning techniques that have recently been applied to compute sparse Fourier transforms; we plan to investigate this parallel in future work.

Consider two sparse polynomials $P, Q \in \mathbb{Z}[x, y, z]$ whose product $R = P\,Q$ consists of ten terms. Assume that the monomials of $R$ are known, but not the corresponding coefficients. Our aim is to determine $R$ through its evaluations at "points" of the form $(u^{\alpha_1}, u^{\alpha_2}, u^{\alpha_3})$ in $\mathbb{Z}[u]/(u^r - 1)$ for suitable $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{N}$ and lengths $r$. These evaluations are obtained by evaluating $P$ and $Q$ at the same points and multiplying the results in $\mathbb{Z}[u]/(u^r - 1)$. In what follows, we will use three evaluation points.
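For experimenting with such examples, the product itself can be obtained by the naive algorithm (a baseline sketch, not the algorithm of this paper):

```python
def sparse_mul(P, Q):
    """Naive sparse product of two {exponent_tuple: coefficient} dictionaries,
    combining equal exponents and dropping exact cancellations."""
    R = {}
    for e1, c1 in P.items():
        for e2, c2 in Q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            R[e] = R.get(e, 0) + c1 * c2
    return {e: c for e, c in R.items() if c != 0}
```

This costs $O(t_P\, t_Q)$ coefficient operations, which is precisely what the evaluation-based approach avoids.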
[Figure 1: Playing the game of mystery balls. At every round, we remove the balls that ended up in a private box for at least one of the three throws.]

Let us show how to turn the problem of computing the coefficients of $R$ into a "game of mystery balls". At the start of the game, we have one numbered ball $b_1, \ldots, b_{10}$ for each of the ten terms of $R$. For each ball, say $b_i$, the corresponding "mystery coefficient" $c_i$ needs to be determined (it might be hidden inside the ball), whereas the corresponding exponents are known (and stored in a table or painted on the ball). In fact, our game has three identical sets of balls, one for each of the three evaluation points. For each of these evaluation points, we also have a set of $r$ boxes, labeled by $1, u, \ldots, u^{r-1}$.
Now consider the evaluation of $R$ at a point as above, say at $(u^{\alpha_1}, u^{\alpha_2}, u^{\alpha_3})$ in the ring $\mathbb{Z}[u]/(u^r - 1)$. Then each term $c\, x^{k_1} y^{k_2} z^{k_3}$ evaluates to a term $c\, u^e$ with $e = \alpha_1 k_1 + \alpha_2 k_2 + \alpha_3 k_3$ modulo $r$. In our game, we throw the corresponding ball into the box that is labeled by $u^e$. For instance, our first ball $b_1$ goes into the box labeled by the power of $u$ to which its term evaluates, and similarly for the other balls. Continuing this way, we obtain the upper left distribution in Figure 1. The complete evaluation of $R$ at this point is the sum of the evaluations of its terms. For each box, this means that we also know the sum of all coefficients hidden in the balls in that box. Indeed, in our example, the first box contains three balls, whose coefficients sum up to the corresponding coefficient of the evaluation. In Figure 1, we indicated these sums below the boxes. In round one of our game, we actually took our chances three times, by using three evaluation points of the above form, with a different choice of $(\alpha_1, \alpha_2, \alpha_3)$ for each throw, and throwing our balls accordingly. This corresponds to the top row in Figure 1.

Now we play our game as follows. If, in a certain round, a ball ends up alone in its box (we will also say that the ball has a private box), then the number below it coincides with the secret coefficient inside. At that point, we may remove the ball, as well as its copies from the two other throws, and update the numbers below accordingly. In round one of our running example, one ball ends up in a box of its own for our first throw, two further balls have private boxes for the second throw, and yet another ball has a private box for the third throw. Removing these balls from the game, we obtain the second row in Figure 1. We also updated the numbers below the boxes: for every box, the number below it still coincides with the sum of the mystery coefficients of the balls inside that box. Now that these balls have been removed, we observe that two more balls have private boxes in their turn.
We may thus determine their mystery coefficients and remove them from the game as well. This brings us to round three of our game and the third row in Figure 1. Going on like this, we win our game when all balls eventually get removed. We lose whenever there exists a round in which there are still some balls left, but all non-empty boxes contain at least two balls. In our example, we win after five rounds.

Remark 1. When implementing a computer program to play the game, we maintain a table that associates to each ball its exponents and the three boxes where it ended up for the three throws. Conversely, for each box, we maintain a linked list with all balls inside that box. We finally maintain a list with balls inside a private box, and which are about to be removed. In this way, the total amount of work that needs to be done in each round remains proportional to the number of balls that are removed instead of the total number of boxes.

Remark 2. Playing our game in several "rounds" is convenient for the probabilistic analysis below. But the order in which balls in private boxes are removed actually does not matter, as long as we remove all balls in private boxes. Assume for instance that we win our game and that we replay it by removing balls in another order. Assume for contradiction that a ball $b$ does not get removed in our modified game, and choose $b$ in such a way that the round $i$ in which it gets removed in the original game is minimal. Then we observe that all balls that were removed before round $i$ in the original game also get removed in the modified version, eventually. When this happens, $b$ is in a private box for one of the throws: a contradiction.

Have we been lucky in our example with our throws of balls into boxes? For the probabilistic analysis in this section, we will assume that our throws are random and independent. We will do our analysis for three throws, because this turns out to be best, although a similar analysis could be carried out for other numbers of throws.
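The game itself, given the exponents and the per-box coefficient sums, is a peeling process. A sketch follows; for brevity it rescans a box when that box becomes private, instead of maintaining the linked lists of Remark 1, so its bookkeeping is quadratic in the worst case (all names are ours):

```python
from collections import defaultdict

def play_mystery_balls(exponents, alphas, r, sums):
    """Recover the mystery coefficients by repeatedly removing balls that sit
    alone in a box for at least one of the three throws. `exponents[j]` is the
    exponent vector of ball j, `alphas` holds the three random vectors, and
    `sums[i][b]` is the sum of the coefficients that throw i sent to box b."""
    t = len(exponents)
    box = [[sum(a * k for a, k in zip(alphas[i], e)) % r for e in exponents]
           for i in range(3)]
    count = [defaultdict(int) for _ in range(3)]
    for i in range(3):
        for j in range(t):
            count[i][box[i][j]] += 1
    coeffs = [None] * t
    # Worklist of (throw, ball) pairs whose box is currently private.
    stack = [(i, j) for i in range(3) for j in range(t)
             if count[i][box[i][j]] == 1]
    while stack:
        i, j = stack.pop()
        if coeffs[j] is not None:
            continue  # already removed via another throw
        c = sums[i][box[i][j]]
        coeffs[j] = c
        for i2 in range(3):  # remove ball j from all three throws
            b = box[i2][j]
            sums[i2][b] -= c
            count[i2][b] -= 1
            if count[i2][b] == 1:
                for j2 in range(t):  # rescan: find the now-private ball
                    if coeffs[j2] is None and box[i2][j2] == b:
                        stack.append((i2, j2))
    return coeffs
```

By Remark 2, the order in which the worklist is processed does not affect the outcome.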
From now on, we will assume that we have $t$ balls and $r = \lceil \lambda t \rceil$ boxes for each throw. The experiment of throwing $t$ balls into $r$ boxes has been widely studied in the literature on hash tables. For a fixed ball, the probability that all other balls end up in another box is given by
\[ p_1 = \Big(1 - \frac{1}{r}\Big)^{t-1} = e^{-1/\lambda} + O(t^{-1}). \]
More generally, for any fixed $k \geq 1$, the probability that $k - 1$ other balls end up in the same box and all the others in other boxes is given by
\[ p_k = \binom{t-1}{k-1}\, r^{1-k}\, \Big(1 - \frac{1}{r}\Big)^{t-k} = \frac{\lambda^{1-k}}{(k-1)!\, e^{1/\lambda}} + O(t^{-1}). \]
Stated otherwise, we may expect with high probability that approximately $p_1 t$ balls end up in a private box, approximately $p_2 t$ balls inside a box with one other ball, and so on. This shows how we can expect our balls to be distributed in the first round of our game, at the limit when $t$ gets large.

Assume more generally that we know the distribution $(p_k^{(i)})_{k \geq 1}$ in round $i$ and let us show how to determine the distribution $(p_k^{(i+1)})_{k \geq 1}$ for the next round. More precisely, assume that $p_k^{(i)} t$ is the expected number of balls in a box with $k$ balls in round $i$, where we start with
\[ p_k^{(1)} = \frac{\lambda^{1-k}}{(k-1)!\, e^{1/\lambda}}. \]
Setting
\[ \sigma_i = p_1^{(i)} + p_2^{(i)} + \cdots, \]
we notice that $\sigma_1 \geq \sigma_2 \geq \cdots$, where $(\sigma_i - \sigma_{i+1})\, t$ stands for the expected number of balls that are removed during round $i$. Now let us focus on the first throw in round $i$ (the two other throws behave similarly, since they follow the same probability distribution). There are $p_1^{(i)} t$ balls that are in a private box for this throw. For each of the remaining balls, the probability $\mu_i$ that it ended up in a private box for at least one of the two other throws is
\[ \mu_i = 1 - \Big(1 - \frac{p_1^{(i)}}{\sigma_i}\Big)^{2}. \]
The probability that a box with $k \geq 2$ balls containing a given surviving ball contains $l$ balls in the next round is therefore given by
\[ \beta_{k,l} = \binom{k-1}{l-1}\, \mu_i^{\,k-l}\, (1 - \mu_i)^{\,l-1}. \]
For all $l \geq 1$, this yields
\[ p_l^{(i+1)} = \sum_{k \geq \max(l, 2)} (1 - \mu_i)\, \beta_{k,l}\; p_k^{(i)}, \]
where the extra factor $1 - \mu_i$ accounts for the survival of the given ball itself. If $\sigma_i$ tends to zero for large $i$ and $t$ gets large, then we win our game with high probability.
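The recurrence above is easy to iterate numerically; a sketch in plain floating point (the paper's tables use interval or ball arithmetic instead, and the recurrence written here is our reading of the garbled original):

```python
from math import comb, exp, factorial

def initial_distribution(lam, kmax=60):
    """Round-one distribution: entry k-1 is the expected fraction of balls
    lying in a box that contains exactly k balls (k = 1, ..., kmax)."""
    return [lam**(1 - k) / (factorial(k - 1) * exp(1 / lam))
            for k in range(1, kmax + 1)]

def next_round(p):
    """One round: remove every ball that is private in at least one of the
    two other throws, each of which follows the same distribution p."""
    sigma = sum(p)
    if sigma == 0:
        return [0.0] * len(p)
    mu = 1 - (1 - p[0] / sigma)**2
    q = [0.0] * len(p)
    for k in range(2, len(p) + 1):      # boxes with k >= 2 balls
        for l in range(1, k + 1):       # l balls survive in such a box
            q[l - 1] += comb(k - 1, l - 1) * mu**(k - l) * (1 - mu)**l * p[k - 1]
    return q
```

Iterating this reproduces the phase change: for $\lambda = 1/2$ the total mass $\sum_k p_k^{(i)}$ collapses to zero, while for $\lambda = 1/3$ it stabilizes near $0.7835$, in agreement with the numbers quoted below.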
If $\sigma_i$ tends to a limit $\sigma_\infty > 0$, then we will probably lose and end up with approximately $\sigma_\infty t$ balls that can't be removed (for each of the three throws). We have not yet been able to fully describe the asymptotic behavior of the distribution $(p_k^{(i)})_{k \geq 1}$ for $i \to \infty$, which follows a non-linear dynamics. Nevertheless, it is easy to compute reliable approximations for the coefficients $p_k^{(i)}$ using interval or ball arithmetic; for this purpose, it suffices to replace each coefficient $p_k^{(i)}$ in the tail of the distribution (i.e. for large $k$) by an interval enclosure. Tables 1, 2, and 3 show some numerical data that we computed in this way (with rigorously bounded errors).

[Table 1: The probability distributions $(p_k^{(i)})_{k \geq 1}$ in rounds $i = 1, \ldots, 11$ for $\lambda = 1/2$.]

Our numerical experiments indicate that the "phase change" between winning and losing occurs at a critical value $\lambda_c$ with
\[ 0.407264 < \lambda_c < 0.407265. \]
Table 1 shows what happens for $\lambda = 1/2$: until the seventh round, a bit less than half of the balls get removed at every round. After round eight, the remaining balls are removed at an accelerated rate. For $\lambda = 1/3$, the distributions $(p_k^{(i)})_{k \geq 1}$ numerically tend to a non-zero limit distribution $(p_k^{(\infty)})_{k \geq 1}$ with $\sigma_\infty = p_1^{(\infty)} + p_2^{(\infty)} + \cdots \approx 0.78350$. In Table 3, we show some of the distributions in round ten, for $\lambda$ near the critical point $\lambda_c$. We also computed an approximation of the limit distribution at the critical point itself.

[Table 2: The probability distributions $(p_k^{(i)})_{k \geq 1}$ in rounds $i = 1, \ldots, 11$ for $\lambda = 1/3$.]
[Table 3: The probability distributions $(p_k^{(10)})_{k \geq 1}$ in round $10$ for various $\lambda$, as well as an approximation of the limit distribution for $\lambda = \lambda_c$, obtained by taking the distribution in a late round for $\lambda \approx 0.407264 \lessapprox \lambda_c$.]

For $\lambda < \lambda_c$, we note that reliable numerical computations can be turned into an actual proof that we lose with high probability. Indeed, assume that $p_1^{(i)} = o(1)$ gets very small for some round $i$, whereas $\sigma_i = O(1)$ remains bounded (for instance, in Table 2, we have a very small $p_1^{(i)}$ and $\sigma_i = 0.78351$ in the last displayed round). Then $\mu_i = O(p_1^{(i)})$ also gets very small and
\begin{align*}
p_1^{(i+1)} &= \sum_{k \geq 2} \mu_i^{\,k-1} (1 - \mu_i)\, p_k^{(i)} = \mu_i\, p_2^{(i)} + O(\mu_i^2) = \frac{2\, p_2^{(i)}}{\sigma_i}\, p_1^{(i)} + O\big((p_1^{(i)})^2\big), \\
p_k^{(i+1)} &= p_k^{(i)} + O(\mu_i), \qquad k \geq 2.
\end{align*}
If $p_1^{(i+1)}$ happens to be $\gamma$ times smaller than $p_1^{(i)}$ for some fixed $\gamma > 1$, then a standard contracting ball type argument can be used to prove that $p_1^{(j)}$ decreases to zero with geometric speed for $j \geq i$, while $\sigma_j$ remains bounded away from zero.

Conversely, given $\lambda > \lambda_c$, it seems harder to prove in a similar way that we win. Nevertheless, for any $\epsilon > 0$, reliable computations easily allow us to determine some round $i$ with $\sigma_i < \epsilon$. This is good enough for practical purposes: for large $t$, it means that we indeed win with high probability. Here we also note that a large number of rounds is generally necessary to win when $\lambda$ approaches the critical value $\lambda_c$. For instance, for $\lambda = 0.42$, many rounds are needed before $\sigma_i$ becomes small. Nevertheless, after a certain number of rounds, it seems that $\sigma_i$ always converges to zero with superlinear speed. In Table 4, we conclude with the result of a computer simulation in which we played our game for a large number $t$ of balls.
As one can see, the results are close to the theoretically expected ones above.

[Table 4: Carrying out a random simulation of our game. The table shows the number $N_k^{(i)}$ of balls that are in a box with $k$ balls in round $i$ (for the first throw). The last column also shows the sum $N^{(i)} = N_1^{(i)} + N_2^{(i)} + \cdots$.]

Remark 3. Our game of mystery balls readily generalizes to different numbers of throws and different numbers of boxes for each throw. However, if our aim is to take the total number of boxes as small as possible, then computer experiments suggest that using three throws with the same number of boxes is optimal. For simplicity, we therefore restricted ourselves to this case, although it should be easy to extend our analysis to a more general setting.

Let us now turn to the general problem of multiplying sparse polynomials. We will focus on the multiplication of integer polynomials $P, Q \in \mathbb{Z}[x_1,\ldots,x_n]$ in at least two variables. We define $d$ to be the total degree of $R = P\,Q$ and assume that we have a bound $T \geq t_R$ for the number of terms of $R$. As explained in the introduction, we proceed in two phases: we first determine the exponents of the unknown product $R$. This part is probabilistic of Monte Carlo type, where we tolerate a rate of failure $\epsilon > 0$ that is fixed in advance. In the second phase, we determine the unknown coefficients, using a probabilistic algorithm of Las Vegas type. We start with the second phase, which has already been explained on an example above.

Assume that our product $R$ has $t$ terms, so that $R = c_1 x^{e_1} + \cdots + c_t x^{e_t}$ for certain $c_1, \ldots, c_t \in \mathbb{Z} \setminus \{0\}$ and $e_1, \ldots, e_t \in \mathbb{N}^n$, where $x^{e_i} := x_1^{e_{i,1}} \cdots x_n^{e_{i,n}}$ for $i = 1, \ldots, t$.
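Such a simulation can be sketched as follows (plain pseudo-random throws rather than the evaluation-induced ones; the helper name is ours):

```python
import math
import random
from collections import Counter

def simulate_game(t, lam, seed=1):
    """Throw t balls into r = ceil(lam*t) boxes, three times independently,
    then repeatedly remove every ball that is alone in its box for at least
    one of the three throws. Returns the number of balls that remain stuck."""
    rng = random.Random(seed)
    r = math.ceil(lam * t)
    throws = [[rng.randrange(r) for _ in range(t)] for _ in range(3)]
    alive = set(range(t))
    while True:
        counts = [Counter(throws[i][j] for j in alive) for i in range(3)]
        removable = {j for j in alive
                     if any(counts[i][throws[i][j]] == 1 for i in range(3))}
        if not removable:
            return len(alive)
        alive -= removable
```

With $t = 10000$ balls, $\lambda = 1/2$ wins (essentially no ball remains), while $\lambda = 1/3$ leaves a stuck core of roughly $0.78\, t$ balls, matching the theoretical limit.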
It is obvious how to generalize the algorithm from our example to this case: for some fixed $\lambda > \lambda_c$, we distribute "our balls" over $r = \lceil \lambda t \rceil$ boxes, through the evaluation of $P$ and $Q$ at three points $(u^{\alpha_{i,1}}, \ldots, u^{\alpha_{i,n}})$ in $\mathbb{Z}[u]/(u^r - 1)$ for $i = 1, 2, 3$. The vectors $\alpha_1, \alpha_2, \alpha_3$ are essentially chosen at random, but it is convenient to take them pairwise non-collinear modulo $r$, so as to avoid any "useless throws". We assume the following heuristic:

$\mathbf{H}_1$. For random vectors $\alpha_1, \alpha_2, \alpha_3$ as above, each of the three throws randomly distributes the balls over the boxes, and the three distributions are independent.

We then play our game of mystery balls as usual, which yields the coefficients $c_1, \ldots, c_t$ if we win, and only a subset of these coefficients if we lose. In view of Remark 1, this leads to Algorithm 1.

Algorithm 1
Input: $P, Q \in \mathbb{Z}[x_1,\ldots,x_n]$ and $e_1, \ldots, e_t \in \mathbb{N}^n$ such that $R = P\,Q = c_1 x^{e_1} + \cdots + c_t x^{e_t}$ for unknown coefficients $c_1, \ldots, c_t$
Output: the product $R$, with high probability (Las Vegas)

1. For a fixed $\lambda > \lambda_c$, let $r = \lceil \lambda t \rceil$.
2. Let $\alpha_1, \alpha_2, \alpha_3 \in \{0, \ldots, r-1\}^n$ be random vectors that are pairwise non-collinear modulo $r$.
3. Compute $P_i = P(u^{\alpha_{i,1}}, \ldots, u^{\alpha_{i,n}})$ and $Q_i = Q(u^{\alpha_{i,1}}, \ldots, u^{\alpha_{i,n}})$ in $\mathbb{Z}[u]/(u^r - 1)$, for $i = 1, 2, 3$.
4. Multiply $R_i \leftarrow P_i\, Q_i$ in $\mathbb{Z}[u]/(u^r - 1)$, for $i = 1, 2, 3$.
5. Let $J \leftarrow \{1, \ldots, t\}$. While there exist $i \in \{1, 2, 3\}$ and $j \in J$ with $\alpha_i \cdot e_j \not\equiv \alpha_i \cdot e_{j'} \pmod{r}$ for all $j' \in J \setminus \{j\}$, do:
   a. Let $c_j$ be the coefficient of $u^{\alpha_i \cdot e_j \bmod r}$ in $R_i$.
   b. For $i' = 1, 2, 3$, replace $R_{i'} \leftarrow R_{i'} - c_j\, u^{\alpha_{i'} \cdot e_j \bmod r}$.
   c. Replace $J \leftarrow J \setminus \{j\}$.
6. Return $c_1 x^{e_1} + \cdots + c_t x^{e_t}$ if $J = \emptyset$ and "failed" otherwise.

Theorem 1. Assume heuristic $\mathbf{H}_1$. Let $\lambda > \lambda_c$ and $r = \lceil \lambda t \rceil$. Then Algorithm 1 is correct and runs in time
\[ 3 \lambda\, \mathsf{M}_t(N) + O^{\flat}\big((s_P + s_Q) \log t + (t_P + t_Q + t) \log N\big). \]

Proof. The correctness of the algorithm has been explained above. As to the running time, we first note that none of the integer coefficients encountered during our computation exceeds $N$ in absolute value. Now steps 1 and 2 have negligible cost and the running time for the remaining steps is as follows:

- In step 3, for every term $c\, x^e$ in $P$ or $Q$ and every $i \in \{1, 2, 3\}$, we first have to compute $\alpha_i \cdot e = \alpha_{i,1} e_1 + \cdots + \alpha_{i,n} e_n$ modulo $r$.
Since we only have to accumulate $\alpha_{i,k} e_k$ when $e_k \neq 0$, this can be done in time $O^{\flat}((s_P + s_Q) \log r)$. We next have to add the coefficients of all terms that end up in the same box, which amounts to $O(t_P + t_Q)$ additions of total cost $O((t_P + t_Q) \log N)$.

- In step 4, we do three multiplications of cost $\mathsf{M}_r(N)$ each. Since $\mathsf{M}_r(N)/r$ is non-decreasing, the cost of these multiplications is bounded by $3 \lambda\, \mathsf{M}_t(N)$.

- In steps 5 and 6, we play our game of mystery balls, where $J$ stands for the set of balls that are still in play. Whenever a ball $j$ ends up in a private box for throw number $i$, we have to determine where this ball landed for the other throws, by computing $\alpha_{i'} \cdot e_j$ modulo $r$ for $i' \neq i$. We then have to update the numbers below the corresponding boxes, which corresponds to setting $R_{i'} \leftarrow R_{i'} - c_j\, u^{\alpha_{i'} \cdot e_j \bmod r}$ in step 5b. Since this eventually has to be done for each of the $t$ balls, step 5 takes $O(t\, n \log r + t \log N)$ bit-operations, using a similar analysis as for step 3.

Let us finally investigate bookkeeping costs that are implicit in our description of the algorithm. Above all, we have to maintain the linked lists with the balls inside each box (see Remark 1). This can be done in time
\[ O\Big( \sum_{i \geq 1} \sum_{k \geq 1} k\, p_k^{(i)}\, t \Big) = O(t). \]
The other implicit costs to maintain various tables are also bounded by $O(t)$. $\Box$

In our preprint with Lecerf, we surveyed several strategies for computing the exponents of the product $R = P\,Q$. Most of those approaches can be adapted to the present setting. We will focus on a probabilistic strategy that we expect to be one of the most efficient ones for practical purposes (a few variants will be discussed at the end of the paper). For $k = 1, \ldots, n$, let $p_k$ be the $k$-th prime number, and let $B := p_n^d$. We let $\pi$ be a fixed prime number with $\pi > B\, T / \epsilon$ (for practical implementations, we may also take $\pi$ to be a product of prime numbers that fit into machine words, and use multi-modular arithmetic to compute modulo $\pi$). For some fixed $\lambda > \lambda_c$, we again use $r = \lceil \lambda T \rceil$ boxes, and evaluate over the ring $(\mathbb{Z}/\pi\mathbb{Z})[u]/(u^r - 1)$.
This time, we use six evaluation points: three of the form $(u^{\alpha_{i,1}}, \ldots, u^{\alpha_{i,n}})$ and three of the form $(p_1 u^{\alpha_{i,1}}, \ldots, p_n u^{\alpha_{i,n}})$ for $i = 1, 2, 3$, where the $\alpha_1, \alpha_2, \alpha_3$ are chosen at random and pairwise non-collinear modulo $r$. Now consider a term $c\, x_1^{k_1} \cdots x_n^{k_n}$ of $R$. Its evaluation at $(u^{\alpha_{i,1}}, \ldots, u^{\alpha_{i,n}})$ is $c\, u^e$, where $e = \alpha_{i,1} k_1 + \cdots + \alpha_{i,n} k_n$ modulo $r$. Meanwhile, its evaluation at $(p_1 u^{\alpha_{i,1}}, \ldots, p_n u^{\alpha_{i,n}})$ is $\tilde{c}\, u^e$ with $\tilde{c} = p_1^{k_1} \cdots p_n^{k_n}\, c$. If there is no other term that evaluates to an expression of the form $c' u^e$ at $(u^{\alpha_{i,1}}, \ldots, u^{\alpha_{i,n}})$, then the same holds for the evaluation at $(p_1 u^{\alpha_{i,1}}, \ldots, p_n u^{\alpha_{i,n}})$. Consequently, the unique representative $q \in \{0, \ldots, \pi - 1\}$ of the quotient $\tilde{c}/c$ in $\mathbb{Z}/\pi\mathbb{Z}$ coincides with $p_1^{k_1} \cdots p_n^{k_n}$ (the quotient is well defined with high probability). From this, we can determine the exponents $k_1, \ldots, k_n$ by factoring $q$. As additional safeguards, we also check that $q \leq B$, that all prime factors of $q$ are in $\{p_1, \ldots, p_n\}$, and that $e = \alpha_{i,1} k_1 + \cdots + \alpha_{i,n} k_n$ modulo $r$.

Conversely, assume that there are at least two terms of $R$ that evaluate to an expression of the form $c' u^e$ at $(u^{\alpha_{i,1}}, \ldots, u^{\alpha_{i,n}})$. Let $c$ and $\tilde{c}$ now be the coefficients of $u^e$ in $R(u^{\alpha_{i,1}}, \ldots, u^{\alpha_{i,n}})$ and $R(p_1 u^{\alpha_{i,1}}, \ldots, p_n u^{\alpha_{i,n}})$. Then the quotient $\tilde{c}/c$ is essentially a random element of $\mathbb{Z}/\pi\mathbb{Z}$ (see the heuristic $\mathbf{H}_2$ below for more details), so its unique representative in $\{0, \ldots, \pi - 1\}$ is higher than $B$ with probability $1 - B/\pi \geq 1 - \epsilon/T$. This allows our algorithm to detect that we are dealing with colliding terms; for the $O(T)$ quotients that we need to consider, the probability of failure becomes $O((B/\pi)\, T) = O(\epsilon)$.

Using the above technique, the three additional evaluations at $(p_1 u^{\alpha_{i,1}}, \ldots, p_n u^{\alpha_{i,n}})$ allow us to determine which balls end up in a private box in our game of mystery balls. Moreover, since we can determine the corresponding exponents, we can also find where these balls landed for the two other throws. This allows us to play our game modulo minor modifications. Besides maintaining the numbers below the boxes for the first three throws, we also maintain the corresponding numbers for the evaluations at $(p_1 u^{\alpha_{i,1}}, \ldots, p_n u^{\alpha_{i,n}})$.
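The quotient test, including the safeguards, can be sketched as follows (plain trial division for clarity; the function name is ours):

```python
def exponents_from_quotient(q, primes, d, B):
    """Try to write the canonical representative q in {0, ..., pi-1} as
    p_1^{k_1} * ... * p_n^{k_n} with k_1 + ... + k_n <= d. Returns the
    exponent vector, or None when q fails the safeguards (collision)."""
    if not (1 <= q <= B):
        return None
    ks = []
    for p in primes:
        k = 0
        while q % p == 0:
            q //= p
            k += 1
        ks.append(k)
    if q != 1 or sum(ks) > d:
        return None
    return tuple(ks)
```

For example, a singleton term with exponents $(1, 0, 2)$ yields the quotient $\tilde{c}/c \equiv 2 \cdot 5^2 \pmod{\pi}$, from which the exponents are read off directly.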
For every round of the game, the same technique then allows us to determine which balls have private boxes, and we iterate. In our probabilistic analysis, it is important that the quotients $\tilde{c}/c$ are essentially random elements of $\mathbb{Z}/\pi\mathbb{Z}$ in case of collisions. This assumption might fail when the coefficients of $P$ and $Q$ are special (e.g. either zero or one). Nevertheless, it becomes plausible after a change of variables $x_k \mapsto v_k x_k$, $k = 1, \ldots, n$, where $v_1, \ldots, v_n$ are random invertible elements of $\mathbb{Z}/\pi\mathbb{Z}$. Let us formulate this as our second heuristic:

$\mathbf{H}_2$. For random vectors $\alpha_1, \alpha_2, \alpha_3$ and after a random change of variables $x_k \mapsto v_k x_k$ ($k = 1, \ldots, n$), the quotients $\tilde{c}/c$ as above are uniformly distributed over $\mathbb{Z}/\pi\mathbb{Z}$, and the distributions for different quotients are independent.

Algorithm 2 summarizes our method, while incorporating the random change of variables.

Algorithm 2
Input: $P, Q \in \mathbb{Z}[x_1,\ldots,x_n]$ and a bound $T \geq t_R$
Output: the exponents of the product $R = P\,Q$, with high probability (Monte Carlo)

1. For a fixed $\lambda > \lambda_c$, let $r = \lceil \lambda T \rceil$, $B = p_n^d$, and let $\pi > B\, T / \epsilon$ be prime.
2. Let $\alpha_1, \alpha_2, \alpha_3 \in \{0, \ldots, r-1\}^n$ be random vectors that are pairwise non-collinear modulo $r$.
3. Compute the reductions $\bar{P}, \bar{Q} \in (\mathbb{Z}/\pi\mathbb{Z})[x_1,\ldots,x_n]$ of $P, Q$ modulo $\pi$.
4. Let $v_1, \ldots, v_n$ be random invertible elements of $\mathbb{Z}/\pi\mathbb{Z}$. Replace $\bar{P} \leftarrow \bar{P}(v_1 x_1, \ldots, v_n x_n)$ and $\bar{Q} \leftarrow \bar{Q}(v_1 x_1, \ldots, v_n x_n)$.
5. Compute $\tilde{P} \leftarrow \bar{P}(p_1 x_1, \ldots, p_n x_n)$ and $\tilde{Q} \leftarrow \bar{Q}(p_1 x_1, \ldots, p_n x_n)$ in $(\mathbb{Z}/\pi\mathbb{Z})[x_1,\ldots,x_n]$.
6. Compute the evaluations $\bar{P}_i$, $\bar{Q}_i$, $\tilde{P}_i$, $\tilde{Q}_i$ of $\bar{P}$, $\bar{Q}$, $\tilde{P}$, $\tilde{Q}$ at $(u^{\alpha_{i,1}}, \ldots, u^{\alpha_{i,n}})$, for $i = 1, 2, 3$.
7. Multiply $\bar{R}_i \leftarrow \bar{P}_i\, \bar{Q}_i$ and $\tilde{R}_i \leftarrow \tilde{P}_i\, \tilde{Q}_i$ in $(\mathbb{Z}/\pi\mathbb{Z})[u]/(u^r - 1)$, for $i = 1, 2, 3$.
8. Let $Z_i \leftarrow \{e \in \{0, \ldots, r-1\} : (\bar{R}_i)_e \neq 0\}$ for $i = 1, 2, 3$, and let $E \leftarrow \emptyset$.
9. For all $i$ and $e \in Z_i$, compute the representative $q_{i,e} \in \{0, \ldots, \pi - 1\}$ of $(\tilde{R}_i)_e / (\bar{R}_i)_e$.
10. For all $i$ and $e \in Z_i$, try factoring $q_{i,e} = p_1^{k_1} \cdots p_n^{k_n}$ whenever $q_{i,e} \leq B$.
11. While there exist $i$ and $e \in Z_i$ with $q_{i,e} = p_1^{k_1} \cdots p_n^{k_n}$, $k_1 + \cdots + k_n \leq d$, and $\alpha_i \cdot k \equiv e \pmod{r}$, do:
    a. Let $c$ and $\tilde{c}$ be the coefficients of $u^e$ in $\bar{R}_i$ and $\tilde{R}_i$.
    b. For $i' = 1, 2, 3$, replace $\bar{R}_{i'} \leftarrow \bar{R}_{i'} - c\, u^{\alpha_{i'} \cdot k \bmod r}$ and $\tilde{R}_{i'} \leftarrow \tilde{R}_{i'} - \tilde{c}\, u^{\alpha_{i'} \cdot k \bmod r}$. Also update $q_{i', e'}$ and its factorization for $e' := \alpha_{i'} \cdot k \bmod r$ whenever $(\bar{R}_{i'})_{e'} \neq 0$.
    c. Replace $Z_{i'} \leftarrow Z_{i'} \setminus \{e'\}$ whenever $(\bar{R}_{i'})_{e'} = 0$, for $i' = 1, 2, 3$, and set $E \leftarrow E \cup \{k\}$.
12. Return $E$ if $Z_1 \cup Z_2 \cup Z_3 = \emptyset$ and "failed" otherwise.
Lemma 1. Given a number $q \in \{1, \ldots, \pi - 1\}$ with $s$ distinct prime factors, it takes
\[ O^{\flat}(s \log \pi + \log n) \]
bit operations to determine the existence of a prime factorization $q = p_1^{k_1} \cdots p_n^{k_n}$ with $k_1 + \cdots + k_n \leq d$, and to compute it in case there is one.

Proof. We first determine the indices $i$ with $k_i > 0$ using a divide and conquer technique. At the top level, we start with the remainder of the division of $p_1 \cdots p_n$ by $q$. We next compute $g_1 = \gcd(q, p_1 \cdots p_{\lfloor n/2 \rfloor})$ and $g_2 = \gcd(q, p_{\lfloor n/2 \rfloor + 1} \cdots p_n)$. If $g_1 \neq 1$, then we go on with the computation of $\gcd(g_1, p_1 \cdots p_{\lfloor n/4 \rfloor})$ and $\gcd(g_1, p_{\lfloor n/4 \rfloor + 1} \cdots p_{\lfloor n/2 \rfloor})$, and similarly for $g_2$. We repeat this dichotomic process until we have found all prime factors of $q$. For each of the prime factors $p_i$ of $q$, we next compute $p_i^{k_i} = \gcd(q, p_i^d)$ and $k_i = \log(p_i^{k_i}) / \log p_i$. The total running time of this algorithm is bounded by $O(\log n + s \log \pi \log \log \pi) = O^{\flat}(s \log \pi + \log n)$. $\Box$

Theorem 2. Assume the heuristics $\mathbf{H}_1$ and $\mathbf{H}_2$. Let $\lambda > \lambda_c$, $r = \lceil \lambda T \rceil$, and let $\pi > B\, T / \epsilon$ be prime. Then Algorithm 2 is correct and runs in time
\[ 6 \lambda\, \mathsf{M}'_T(\pi) + O^{\flat}\big((s_P + s_Q) \log \pi + (t_P + t_Q) \log N + T\, n\big). \]

Proof. We have already explained why our algorithm returns the correct answer with probability $1 - O(\epsilon)$. The complexity analysis for steps 2, 6, 7, and 11 is similar to the one for Algorithm 1. The running time for the other steps is as follows:

- The reductions of $P$ and $Q$ modulo $\pi$ can be computed in time $O^{\flat}((t_P + t_Q)(\log N + \log \pi))$, in step 3.
- In steps 4 and 5, we have to compute $v_k^e$ and $p_k^e$ modulo $\pi$ for every power $x_k^e$ occurring in the representation of $P$ or $Q$. Using binary powering, such a power can be computed in time $O^{\flat}(\log \pi)$. The total cost of these steps is therefore bounded by $O^{\flat}((s_P + s_Q) \log \pi)$.
- In steps 10 and 11, we have already shown that one factorization can be done in time $O^{\flat}(s \log \pi + \log n)$. Altogether, the cost of these steps is therefore bounded by $O^{\flat}(T\, n + T \log \pi)$. $\Box$

So far, we have described an efficient algorithm for the case when the total degree $d$ is small. This indeed holds for most practical applications, but it is attractive to also study the case when our polynomials are "truly sparse" with potentially large exponents.
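The lemma's dichotomic gcd process can be sketched as follows (the subproducts are recomputed on the fly for brevity, whereas an efficient implementation would precompute a subproduct tree; function names are ours):

```python
from math import gcd, prod

def prime_support(q, primes):
    """Indices i with primes[i] | q, found by divide-and-conquer gcds with
    subproducts of p_1 * ... * p_n (the dichotomic process of the lemma)."""
    def rec(lo, hi):
        # Called only when gcd(q, product of primes[lo:hi]) > 1.
        if hi - lo == 1:
            return [lo]
        mid = (lo + hi) // 2
        out = []
        if gcd(q, prod(primes[lo:mid])) > 1:
            out += rec(lo, mid)
        if gcd(q, prod(primes[mid:hi])) > 1:
            out += rec(mid, hi)
        return out
    return rec(0, len(primes)) if gcd(q, prod(primes)) > 1 else []

def factor_over(q, primes, d):
    """Full factorization q = prod p_i^{k_i} with sum k_i <= d, else None."""
    ks = [0] * len(primes)
    for i in prime_support(q, primes):
        while q % primes[i] == 0:
            q //= primes[i]
            ks[i] += 1
    return ks if q == 1 and sum(ks) <= d else None
```

Only the branches of the tree that actually contain a prime factor are explored, which is what makes the cost proportional to the number $s$ of distinct prime factors rather than to $n$.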
This is already interesting for univariate polynomials, a case that we have not considered so far. Modulo the technique of Kronecker substitution 7.1>, it actually suffices to consider the univariate case. For simplicity, let us restrict ourselves to this case. When > gets moderately large, it can also be interesting to consider the incremental approach from and 7.3>. A first observation in the univariate case is that we can no longer take the same number of boxes for our three throws. The next best option is to take =2*|\*t/2|\>+i> boxes for our -th throw, where . By construction, this ensures that >, >, and > are pairwise coprime. If the exponents of are large, then it is also reasonable to assume that our heuristic still holds. For the efficient determination of the exponents, let \d*T/\>. As before, we may take > to be prime or a product of prime numbers that fit into machine words. Following Huang, we now evaluate both >=>*>> and >=x*>*>+>*>|)>> at in /|\*\|\>|)>/>-1|)>>> for . Any term > of >> then gives rise to a term *x> of >> with =e*c>, so we can directly read off from the quotient /c>. Modulo the above changes, this allows us to proceed as in Algorithm. One may prove, similarly as before, that the bit complexity of determining the exponents in this way is bounded by <\equation*> 9*\*>+O>+s+s|)>*log t++t+t|)>*log *N|)>|)>. The computation of the coefficients can be done in time <\equation*> 3*\*|)>+O>+s+s|)>*log t++t+t|)>*log N|)>. Using the combination of modular reduction and rational number reconstruction5>, the new algorithms generalize in a standard way to polynomials with rational coefficients. Given a fixed working precision , it is also possible to use coefficients in >, using a fixed-point number representation. Provided that all coefficients have the same order of magnitude, the floating-point case can be reduced to this one.
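The Kronecker substitution invoked above, which reduces multivariate to univariate multiplication, can be sketched as follows. This is a minimal illustration of ours for sparse polynomials stored as dictionaries from exponent tuples to coefficients; the univariate product here is the naive one, whereas in the paper's setting any fast univariate method may be substituted.

```python
def kronecker_multiply(P, Q, n, D):
    """Multiply two sparse n-variate polynomials via the Kronecker
    substitution x_i -> y^(D^(i-1)).  P and Q map exponent tuples to
    coefficients; D must exceed the degree of the product P*Q in each
    variable, so that packed exponents decode uniquely."""
    def pack(e):
        return sum(ei * D**i for i, ei in enumerate(e))
    def unpack(k):
        e = []
        for _ in range(n):
            e.append(k % D)
            k //= D
        return tuple(e)
    # sparse univariate product of the packed polynomials
    prod_uni = {}
    for eP, cP in P.items():
        for eQ, cQ in Q.items():
            k = pack(eP) + pack(eQ)
            prod_uni[k] = prod_uni.get(k, 0) + cP * cQ
    return {unpack(k): c for k, c in prod_uni.items() if c != 0}
```

For example, multiplying 1 + x·y by x + y with D = 4 recovers x + y + x²y + x·y² after unpacking.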
Indeed, the reductions modulo -1> are remarkably stable from a numerical point of view, and we may use FFT multiplication in /-1|)>>. Algorithm can also be adapted, provided that \|B*T/\|\>>. Our algorithms adapt in a straightforward way to coefficients in a finite field of sufficiently large characteristic. If the characteristic is moderately large (say, of the size of a machine word), then one may keep > small in Algorithm by using the incremental technique from and 7.3>. Over finite fields > of small characteristic, one needs to use other techniques from. Instead of taking our three additional evaluation points of the form *u>,\,p*u>|)>>, we may for instance take them of the form *u>,\*u>,\,\>*u>|)>>, where > is a primitive root of unity of large smooth order (this may force us to work over an extension field; alternatively, one may use the aforementioned incremental technique and roots > of lower orders). Thanks to the smoothness assumption, discrete logarithms of powers of > can be computed efficiently using the Pohlig–Hellman algorithm. This allows us to recover exponents \d,\,k\d> from quotients of the form /c=\+k*+\+k*>>. Finally, it is possible to use coefficients in a general ring >, in which case complexities are measured in terms of the number of operations in >. For rings of sufficiently large characteristic, one may directly adapt the algorithms from section. For rings of small characteristic, it is possible to generalize the techniques from the finite field case. The techniques from this paper can also be used for general-purpose sparse interpolation. In that case, the polynomial \,\,x|]>> is given through an arbitrary blackbox function that can be evaluated at points in >-algebras over some ring >. This problem was studied in detail in. In section 6, we investigated in particular how to replace expensive evaluations in cyclic >-algebras of the form /-1|)>> by cheaper evaluations at suitable -th roots of unity in > or a small extension of >.
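The Pohlig–Hellman method appealed to above exploits the smoothness of the group order: the discrete logarithm is computed digit by digit modulo each prime power and recombined by Chinese remaindering. The following compact Python sketch is ours; the concrete prime and generator in the usage example are illustrations, not values from the paper.

```python
def pohlig_hellman(g, h, p, n):
    """Return x with g**x = h mod p, where g has smooth order n mod p."""
    # factor n by trial division (cheap, since n is smooth)
    fac, m, d = {}, n, 2
    while d * d <= m:
        while m % d == 0:
            fac[d] = fac.get(d, 0) + 1
            m //= d
        d += 1
    if m > 1:
        fac[m] = fac.get(m, 0) + 1
    residues, moduli = [], []
    for q, e in fac.items():
        qe = q**e
        gi = pow(g, n // qe, p)           # element of order q^e
        hi = pow(h, n // qe, p)
        gamma = pow(gi, q**(e - 1), p)    # element of order q
        x = 0
        for k in range(e):
            # strip the digits found so far, then project onto <gamma>
            t = pow(hi * pow(gi, qe - x, p) % p, q**(e - 1 - k), p)
            dk = next(dd for dd in range(q) if pow(gamma, dd, p) == t)
            x += dk * q**k
        residues.append(x)
        moduli.append(qe)
    # Chinese remaindering of x mod q^e over all prime powers
    x, M = 0, 1
    for r, m_ in zip(residues, moduli):
        x += M * (((r - x) * pow(M, -1, m_)) % m_)
        M *= m_
    return x
```

In the test below, 2 is a primitive root modulo 101 and its order 100 = 2²·5² is smooth, so all logarithms are found by brute force over at most 5 candidates per digit.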
The main new techniques from this paper were also anticipated in section 6.6. The algorithms from this paper become competitive with the geometric sequence approach if the blackbox function is particularly cheap to evaluate. Typically, the cost of one evaluation should be |)>|)>> or |)>|)>>, depending on the specific variant of the geometric sequence approach that we use. Technically speaking, it is interesting to note that the problem from this paper actually does fit into this framework. Indeed, if and are sparse polynomials that are represented in their standard expanded form, then the evaluation of at a single point requires +s+1|)>>> operations in >, and we typically have >|)>>. This is due to the fact that the standard blackbox model does not take into account possible speed-ups when we evaluate our function at many points in geometric progression or in a cyclic algebra /-1|)>>. Both in theory and for practical applications, it would be better to extend the blackbox model by allowing for polynomials of the form <\equation*> R=f,\,x,U,\,x|)>,\,U,\,x|)>|)>, where \,\,x,,\,y>|]>> is a blackbox function and ,\,U\\,\,x|]>> are sparse polynomials in their usual expanded representation. Within this model, the techniques from this paper should become efficient for many other useful operations on sparse polynomials, such as the computation of gcds or of determinants of matrices whose entries are large sparse polynomials. The heuristic plays an important role in our complexity analysis. It is an interesting question whether it is satisfied for polynomials whose support is highly regular and not random at all. A particularly important case is when and are dense polynomials in variables of total degrees >, > and =d+d>. Such polynomials are often considered to be sparse, because they contain considerably fewer terms than fully dense polynomials of the same degree in each variable.
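The speed-up at points in geometric progression, which the standard blackbox model fails to capture, can be made concrete with a small sketch of ours: after one binary powering per term, each further evaluation point costs only one multiplication and one addition per term.

```python
def eval_geometric(terms, u, N, mod):
    """Evaluate the sparse univariate polynomial sum of c*x^e (terms is
    a list of (e, c) pairs) at the N points u^0, u^1, ..., u^(N-1)
    modulo `mod`.  After precomputing u^e once per term, moving from
    u^i to u^(i+1) costs one multiplication per term, so the total cost
    is O(s*N) ring operations instead of O(s*N*log(max e)) for N
    independent evaluations with binary powering."""
    steps = [pow(u, e, mod) for e, _ in terms]   # u^e for each term
    vals = [c % mod for _, c in terms]           # term values at u^0
    out = []
    for _ in range(N):
        out.append(sum(vals) % mod)
        vals = [v * s % mod for v, s in zip(vals, steps)]
    return out
```

For the polynomial 3 + 2x⁵ and u = 2 modulo 101, the amortized evaluations agree with direct binary-powering evaluations at 1, 2, 4, 8.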
If is bivariate or trivariate and of very high degree, then it has been shown in that can be computed in approximately the same time as the product of two dense univariate polynomials with the same number of terms as . An algorithm for arbitrary dimensions has been presented in, whose cost is approximately > times larger than the cost of a univariate product of the same size. The problem has also been studied in and it is often used as a benchmark. What about the techniques from this paper? We first note that the exponents of are known, so we can directly focus on the computation of the coefficients. If we need to do several product computations for the same and , we can also spend some time on computing a small > and vectors ,\,\\\> for which we know beforehand that we will win our game of mystery balls. For our simulations, we found it useful to choose each > among a dozen random vectors in a way that minimizes the sum of the squares of the numbers of balls in the boxes (this sum equals for the first throw in Figure). For various , , and >, we played our game several times for different triples of vectors ,\,\|)>>>. The critical value for > for this specific type of support seems to be close to . In Table below, we report several cases for which we managed to win for this value of >. This suggests that optimal polynomial multiplication algorithms for supports of this type are almost as efficient as dense univariate polynomial products of the same size. <\big-table|>|||||||||||>|>|||||||||||>|>|||||||||||>|>>|>|||||>|||||>>>>> The particular case when our sparse polynomials are dense polynomials in variables of given total degrees >, > and =d+d>. The table shows values of > for which it is possible to win the game of mystery balls for suitable evaluation points (we do not claim these values to be optimal). <\bibliography|bib|tm-plain|> <\bib-list|37> A. Arnold, M. Giesbrecht and D. S. Roche. Sparse interpolation over finite fields via low-order roots of unity. , 27–34.
ACM Press, 2014. A. Arnold, M. Giesbrecht and D. S. Roche. Faster sparse multivariate polynomial interpolation of straight-line programs. , 75:4–24, 2016. M. Asadi, A. Brandt, R. Moir and M. M. Maza. Sparse polynomial arithmetic with the BPAS library. , 32–50. Springer, 2018. M. Ben-Or and P. Tiwari. A deterministic algorithm for sparse multivariate polynomial interpolation. , 301–309. ACM Press, 1988. J. Canny, E. Kaltofen and Y. Lakshman. Solving systems of non-linear polynomial equations faster. , 121–128. ACM Press, 1989. C. Fieker, W. Hart, T. Hofmann and F. Johansson. Nemo/Hecke: computer algebra and number theory packages for the Julia programming language. , 157–164. 2017. Ph. Flajolet and R. Sedgewick. . Addison Wesley, Reading, Massachusetts, 2nd edition, 1996. T. S. Freeman, G. M. Imirzian, E. Kaltofen and Y. Lakshman. DAGWOOD: a system for manipulating polynomials given by straight-line programs. , 14:218–240, 1988. S. Garg and É. Schost. Interpolation of polynomials given by straight-line programs. , 410(27-29):2659–2662, 2009. Mickaël Gastineau and Jacques Laskar. Development of TRIP: fast sparse multivariate polynomial multiplication using burst tries. , 446–453. Springer, 2006. J. von zur Gathen and J. Gerhard. . Cambridge University Press, 2nd edition, 2002. A. W. Groves and D. S. Roche. Sparse polynomials in FLINT. , 50(3):105–108, 2016. C. H. Papadimitriou. . Addison-Wesley, 1994. D. Harvey and J. van der Hoeven. Integer multiplication in time >. , HAL, 2019. . H. Hassanieh, P. Indyk, D. Katabi and E. Price. Nearly optimal sparse Fourier transform. , 563–578. 2012. J. van der Hoeven. The truncated Fourier transform and applications. J. Gutierrez, , 290–296. Univ. of Cantabria, Santander, Spain, July 4–7 2004. J. van der Hoeven. Ball arithmetic. , HAL, 2009. . J. van der Hoeven and G. Lecerf. On the complexity of blockwise polynomial multiplication. , 211–218. Grenoble, France, July 2012. J. van der Hoeven and G. Lecerf. On the bit-complexity of sparse polynomial multiplication. , 50:227–254, 2013. J. van der Hoeven and G. Lecerf. Sparse polynomial interpolation in practice. , 48(3/4):187–191, 2015. J. van der Hoeven and G. Lecerf. Sparse polynomial interpolation. Exploring fast heuristic algorithms over finite fields. , HAL, 2019. . J. van der Hoeven et al. GNU TeXmacs. , 1998. J. van der Hoeven and É. Schost. Multi-point evaluation in higher dimensions. , 24(1):37–52, 2013. M. A. Huang and A. J. Rao. Interpolation of sparse multivariate polynomials over large finite fields with applications. , 508–517. Philadelphia, PA, USA, 1996. Society for Industrial and Applied Mathematics. Q. L. Huang and X. S. Gao. Sparse polynomial interpolation with finitely many values for the coefficients. V. Gerdt, V. Koepf, W. Seiler and E. Vorozhtsov, , 10490. Springer, Cham, 2017. Q.-L. Huang. Sparse polynomial interpolation over fields with large or zero characteristic. , 219–226. ACM, 2019. M. Javadi and M. Monagan. Parallel sparse polynomial interpolation over finite fields. M. Moreno Maza and J.-L. Roch, , 160–168. ACM Press, 2010. S. C. Johnson. Sparse polynomial arithmetic. , 8(3):63–71, 1974. E. Kaltofen and L. Yagati. Improved sparse multivariate polynomial interpolation algorithms. , 467–474. Springer-Verlag, 1988. M. Kapralov. Sparse Fourier transform in any constant dimension with nearly-optimal sample complexity in sublinear time. , 264–277. 2016. Mohamed Khochtali, Daniel S. Roche and Xisen Tian. Parallel sparse interpolation using small primes. , 70–77. 2015. M. Monagan and R. Pearce. Parallel sparse polynomial multiplication using heaps. , 263–270. ACM Press, 2009. M. Monagan and R. Pearce. Sparse polynomial multiplication and division in Maple 14. Available from , 2009. H. Murao and T. Fujise. Modular algorithm for sparse multivariate polynomial interpolation and its parallel implementation. , 21:377–396, 1996. D. S. Roche. Chunky and equal-spaced polynomial multiplication. , 46(7):791–806, 2011. D. S. Roche. What can (and can't) we do with sparse polynomials? C. Arreche, , 25–30. ACM Press, 2018. T. Yan. The geobucket data structure for polynomials. , 25(3):285–293, 1998.