> <\body> with sparse exponents>|<\doc-note> This work has been partly supported by the French NODE project. ||>||<\author-affiliation> Laboratoire d'informatique de l'École polytechnique (LIX, UMR 7161) CNRS, École polytechnique, Institut Polytechnique de Paris Bâtiment Alan Turing, CS35003 1, rue Honoré d'Estienne d'Orves 91120 Palaiseau, France |>>|||<\author-affiliation> Laboratoire d'informatique de l'École polytechnique (LIX, UMR 7161) CNRS, École polytechnique, Institut Polytechnique de Paris Bâtiment Alan Turing, CS35003 1, rue Honoré d'Estienne d'Orves 91120 Palaiseau, France >>|<\doc-note> > Consider a sparse multivariate polynomial with integer coefficients. Assume that is represented as a \Pmodular black box polynomial\Q, via an algorithm to at arbitrary integer points, modulo arbitrary positive integers. The problem of sparse interpolation is to recover in its usual sparse representation, as a sum of times monomials. For the first time we present a quasi-optimal algorithm for this task in term of the product of the number of terms of by the maximum of the bit-size of the terms of . > Consider a multivariate integer polynomial \,\,x|]>>. Then can uniquely be writtenas a sum <\equation> f=c*x>+\+c*x>, where |c,\,c|\>\\>\\\i\0|}>> and ,\,e\\> are pairwise distinct. Here we understand that >\x>*\*x>> for any =,\,\|)>\\>. We call() the of . In this paper, we assume that is explicitly given through its sparse representation and that we only have a program for evaluating . The goal of is to recover the sparse representation of . Theoretically speaking, we could simply evaluate at a single point ,\,a|)>> with =2,a=2>>,\,a=2>>> for some sufficiently large positive integer . Then the sparse representation of can directly be read off from the binary digits of >. However, the bit-complexity of this method is terrible, since the bit-size of > typically becomes huge. In order to get a better grip on the bit-complexity to evaluate and then interpolate, we will assume that we actually have a program to evaluate modulo for any positive integer modulus . We denote by > the cost to evaluate for a modulus with 2> and we assume that the \/s> is a non-decreasing function that grows not too fast as a function of . Since the blackbox function should at least read its input values, we also assume that |)>>. As in, for complexity estimates we use a (RAM) machine over a finite alphabet along with the notation: >(g(z))> means that |)>>>>>. The machine is expected to have an instruction for generating a random bit in constant time (see section for the precise computational model and hypotheses that we use). Assuming that we are given a bound for the number of terms of and > for a bound of the bit-size of the terms of , the main result of this paper is the following: <\theorem> There is a Monte Carlo algorithm that takes a modular blackbox representation of asparse polynomial \,\,x|]>> of T> terms of bit-size \> as input and that interpolates in time *T|)>+n|)>**T+n|)>> with a probability of success at least>. If the variables all appear in the sparse representation of , then *T|)>|)>> and *T|)>>, so the bound further simplifies into *T|)>**T|)>>. In addition, detecting useless variables can be done fast, using the heuristic method from7.4>. The problem of sparse interpolation has a long history and goes back to work by Prony in the > century. The first modern fast algorithm is due to Ben-Or and Tiwari. Their work spawned a new area of research in computer algebra together with early implementations. We refer to and3> for nice surveys. Modern research on sparse interpolation has developed in two directions. The first theoretical line of research has focused on rigorous and general complexity bounds. The second direction concerns implementations, practical efficiency, and applications of sparse interpolation. The present paper mainly falls in the first line of research, although we will briefly discuss practical aspects in section. The proof of our main theorem relies on some number theoretical facts about prime numbers that will be recalled in section. There is actually a big discrepancy between empirical observations about prime numbers and hard theorems that one is able to prove. Because of this, our algorithms involve constant factors that are far more pessimistic than the ones that can be used in practice. Our algorithms also involve a few technical complications in order to cover all possible cases, including very large exponents that are unlikely to occur in practice. Our paper borrows many techniques from that deal with the particular case when is a univariate polynomial. In principle, the multivariate case can be reduced to this case: setting =f,\,z>>|)>> for a sufficiently large \>>, the interpolation of reduces to the interpolation of . However, this reduction is optimal only if the entries of the vector exponents \\> \ are all approximately of the same bit-size. One interesting case that is not well covered by this reduction is when the number of variables is large and when the exponent vectors > are themselves sparse in the sense that only a few entries are non-zero. The main contribution of the present paper is the introduction of aquasi-optimal technique to address this problem. Another case that is not well covered by is when the bit-sizes of the coefficients or exponents vary wildly. Recent progress on this issue has been made in and it is plausible that these improvements can be extended to our multivariate setting (see also Remark). Theorem also provides a quasi-optimal solution for an approximation of the sparse interpolation problem: for a fixed constant \0>, we only require the determination of at least|)>*t> correct terms of(). In order to cover sparse exponent vectors in a more efficient way, we will introduce a new technique in section. The idea is to compress such exponent vectors using random projections. With high probability, it will be possible to reconstruct the actual exponent vectors from their projections. We regard this section as the central technical contribution of our paper. Let us further mention that random projections of the exponents were previously implemented in in order to reduce the multivariate case to the univariate one: monomial collisions were avoided in this way but the reconstruction of the exponents needed linear algebra and could not really catch sparse or unbalanced exponents. The proof of Theorem will be completed at the end of section. Section will address the practical aspects of our new method. An important inspiration behind the techniques from section and its practical variants is the mystery ball game from; this connection will also be discussed in section. Our paper focuses on the case when has integer coefficients, but our algorithm can be easily adapted to rational coefficients as well, essentially by appealing to rational reconstruction5, section5.10> during the proof of Lemma. However when has rational coefficients, its blackbox might include divisions and therefore raise \Pdivision by zero\Q errors occasionally. This makes the probability analysis and the worst case complexity bounds more difficult to analyze, so we preferred to postpone this study to another paper. Our algorithm should also be applicable to various other coefficient rings of characteristic zero. However it remains an open challenge to develop similar algorithms for coefficient rings of small positive characteristic. Throughout this paper, we will use the following notation: <\eqnarray*> >|>||}>>>|>>|>||}>>>>> For any \>, we also define <\eqnarray*> >|>|,k-1|}>>>|>|>|-||\>.>>>> We may use both > and > as sets of canonical representatives modulo . Given \>> and depending on the context, we write for the unique \> or \> with k*\>>. This section presents sparse data structures, computational models, and quantitative results about prime distributions. At the end, an elementary fast algorithm is presented for testing the divisibility of several integers by several prime numbers. We order > lexicographically by >. Given formal indeterminates ,\,x> and an ,\,e|)>\\>, we define \x>*\*x>>. We define the of an integer \> as \min \\i\2|}>>. In particular, =0>, =1>, =2>, etc. We define the size> of an exponent tuple \> by \e+\+e>. We extend these definitions to the cases when \> and \> by setting \\>> and \\>>, where =|\|>,\,|\|>|)>>>. Now consider a multivariate polynomial \,\,x|]>>. Then can uniquely be writtenas a sum <\equation*> f=c*x>+\+c*x>, where ,\,c\\>> and ,\,e\\> are such that \\\e>. We call this the of . We call ,\,e> the of and ,\,c> the corresponding . We also say that *x>,\,c*x>> are the of and we call ,\,e|}>> the of . Any non-zero > with \>> and \> is called an of . We define \\>+\>+\+\>+\>> to be the size> of . <\remark> For the complexity model we could have chosen a multi-tape Turing machine, but this would have led to more tedious cost analyses. In fact on a Turing tape, we would actually need to indicate the ends of numbers using adequate markers. Using a liberal notion of \Pbit\Q, which allows for the storage of such markers in a single bit, the of an integer \> then becomes >\\+1>>. For \>>, we also define >\\>+1>. Exponents ,\,e|)>> can be stored by appending the representations of ,\,e>, but this is suboptimal in the case when only afew entries of are non-zero. For such \Psparse exponents\Q, one prefers to store the pairs |)>> for which \0>, again using suitable markers. For this reason, the of becomes >\min>+\+\>>,\0>>+\>>|)>|)>+1>. Throughout this paper, we will analyze bit complexities in the RAM model as in. In this model, it is known that two -bit integers can be multiplied in time >. As a consequence, given an -bit modulus \>>, the ring operations in /|m*\|\>> \ can be done with the same complexity. Inverses can be computed in time n|)>>, whenever they exist. For randomized algorithms, we assume that we have an instruction for generating a random bit in constant time. Consider a polynomial \,\,x|]>>. A for is aprogram that takes a modulus \>> and integers ,\,a\,m-1|}>> as input, and that returns ,\,a|)> rem m\\>. A is apolynomial that is represented in this way. The (or, better, a cost function) of such apolynomial is a function > such that > yields an upper bound for the running time if has bitsize s>. It will be convenient to always assume that the \/s>> per bit of the modulus is non-decreasing and that \k>*> for any1>>. Since should at least read its input values, we also assume that |)>>>. <\remark> A popular type of modular blackboxes are straight-line programs (SLPs). For an SLP of length that only uses ring operations, the above average cost function usually becomes \C*L*log s>, for some fixed constant that does not depend on. If the SLP also allows for divisions, then we rather obtain \C*L*log s>, but this is out of the scope of this paper, due to the \Pdivision by zero\Q issue. In fact, computation trees are more suitable than SLPs in this context. For instance, the computation of determinants using Gaussian elimination naturally fits in this setting, since the chosen computation path may then depend on the modulus . However, although these bounds \Pusually\Q hold ( for all common algebraic algorithms that we are aware of, including the RAM model), they may fail in pathological cases when the SLP randomly accesses data that are stored at very distant locations on the Turing tape. For this reason, the blackbox cost model may be preferred in order to study bit complexities. In this model, a suitable replacement for the length of an SLP is the average cost function>, which typically involves only a logarithmic overhead in the bit-length . All along this paper (resp. ) will bound the bit-size (resp. number of terms) of the polynomial to be interpolated, and and will denote random prime numbers that satisfy: <\itemize> r=O>, \*r+1>, |)>>. The construction of and will rely on the following number theoretic theorems, where stands for the natural logarithm, that is =1>. We will also use x\log x/log 2>. <\theorem> (3.8)>> For \21>, there exist at least *\/log \> distinct prime numbers in the open interval ,2*\|)>>. <\theorem> Let \> if \>> and \1.538*\> otherwise. For all 1>, the number of prime divisors of is bounded by >. <\proof> The function > is increasing for \>> and >|)>=1.538*\>. So it is always nondecreasing and continuous. The number of prime divisor of any 15> it at most 1.538*\>>. Let > and > respectively be the number of divisors and prime divisors of. Then clearly >\d>. Now for all 3> we know from that <\equation*> \\log d\\. We will need the following slightly modified version of2.1>, which is itself based on a result from. <\theorem> There exists a Monte Carlo algorithm which, given \0> and |\>>, produces atriple |)>> that has the following properties with probability at least >, and returns fail otherwise: <\enumerate-alpha> is uniformly distributed amongst the primes of >; there are at least /> primes in |)>\+1|)>> and is uniformly distributed amongst them; > is a primitive -th root of unity in >. Its worst-case bit complexity is >>. <\proof> In2.1> the statement() is replaced by the simpler condition that R>. But by looking at step2 of the algorithm on page4 of, we observe that is actually uniformly distributed amongst the primes of |)>\+1|)>> and that there are at least /> such primes with high probability |1-|4>|\>>. <\lemma> Let > be a real number in > and let 22> be such that 4*n>. There exists a Monte Carlo algorithm that computes distinct random prime numbers ,\,p> in>> in time <\equation*> O|)>|)>|)>*>, with a probability of success of at least 1-\>. <\proof> Theorem asserts that there are at least *P/log P> primes in the interval >>. The probability to fetch a prime number in > while avoiding at most fixed numbers is at least <\equation*> *-n|P>\-|)>*|P>=. The probability of failure after trials is at most |)>>. By using the AKS algorithm each primality test takes time >>. The probability of success for picking distinct prime numbers in this way is at least <\equation*> |)>|)>. In order to guarantee this probability of success to be at least >, it suffices to take <\equation*> k\|)>|)>|-log |)>>. The concavity of the function yields -log\> for >, whence <\equation*> |)>|n>\|n>\|2*n>|1-|2*n>>\-log |2*n>|)>, and consequently, <\equation*> -log |)>|)>\-log |2*n>|)>. On the other hand we have |)>\>. It therefore suffices to take <\equation*> k\|*+log |)>|)>*log P|\>. Let \\\p> be prime numbers and let \\\a> be strictly positive integers. The aim of this subsection is to show that the set of pairs \i\\,k\\,p\a|}>> can be computed in quasi-linear time using the following algorithm named >. <\vgroup> <\named-specified-algorithm|> non empty subsets \\> and \\>. the set \\\\\p\a|}>>. <|named-specified-algorithm> <\enumerate> If |\|>=1>, then return \\\\\p\a|}>>. Let ||\|>/2|\>>, let > be a subset of > of cardinality , and let \\\\>. Compute \\>a> and \\>a>. Compute \\\p\A|}>> and \\\p\A|}>>. Return ,\|)>\,\|)>>. <\lemma> The algorithm > is correct and runs in time s|)>>, where log*\*p*a*\*a|)>>>. <\proof> Let \log\>a|)>> and \log\>p|)>>. Step1 costs +\|)>*log+\|)>|)>> by using fast multi-remaindering10>. Using fast sub-product trees10>, step3 takes *log \|)>>. By means of fast multi-remaindering again, step4 takes +\|)>*log+\|)>|)>>. Since the > are distinct prime numbers, when entering step5 we have <\equation*> \>p\\>afor>m=1,2. Let |)>> denote the cost of the recursive calls occurring during the execution of the algorithm. So far we have shown that <\equation*> |)>=|)>+|)>+O*log \|)>. Unrolling this inequality and taking into account that the depth of the recursion is =O*\*a|)>|)>>, we deduce that |)>=O*log \|)>>. Finally the total cost of the algorithm is obtained by adding the cost of the top level call, that is <\equation*> |)>+O+\|)>*log+\|)>|)>=O s|)>. Consider a sparse polynomial \>c*x> that we wish to interpolate. In the next section, we will describe a method that allows us to efficiently compute most of the exponents in an encoded form >. The simplest codes > are of the form <\equation> \=e*\+\+e*\. When \>, the most common such encoding is the , with =E>> for all \>. However, this encoding may incur large bit-size n*\> with respect to the bit-size of . The aim of this section is to introduce more compact codes >. These codes will be \Pprobabilistic\Q in the sense that we will only be able to recover from > with high probability, under suitable assumptions on . Moreover, the recovery algorithm is only efficient if we wish to simultaneously \Pbulk recover\Q exponents from their codes. Throughout this section, the number of variables \>> and the target number of exponents\>> are assumed to be given. We allow exponents to be vectors of arbitrary integers in >. Actual computations on exponents will be done modulo >> for a fixed odd base and a flexible -adic precision >. We also fix a constant 22> such that <\equation> 2*n*log B\P\\B and we note that this assumption implies <\equation> \4*n and <\equation> B\1937and\11>.> We finally assume \1> to be a parameter that will be specified in section. The exponent encoding will depend on one more parameter <\equation*> 1\m\n that will be fixed in section and flexible in section. We define <\equation*> \\|\*|\>. Our encoding also depends on the following random parameters: <\itemize> For each \>>, let ,\,i> be random elements in > and \,\,i|}>>. Let ,\,p> be pairwise distinct random prime numbers in the interval >; such primes do exist thanks to Lemma and. Now consider an exponent ,\,e|)>\\>. We encode as <\eqnarray*> |>|>||I>p*e|> rem B>\\>>k\\>>>|>|>|,\,\-1>|)>\\>>>.>>>> We will sometimes write ,p,I|]>>> and ,p,I|]>>> instead of > and > in order to make the dependence on >, and explicit. Given an exponent ,\,e|)>\\> of , our first aim is to determine the individual exponents > of small size. More precisely, assuming that <\equation*> #e\\\e\0|}>|\|>\, we wish to determine all > with |\|>\B>>. We say that > is if for each \> with \0>, there exists a \>> such that I\e\0|}>=>. This property does only depend on the random choices of the >. <\lemma> Assume that n/m>. Then, for random choices of ,\,I>>>, the probability that> is transparent is at least *\/\>>. <\proof> Let \\\e\0|}>> and |\|>\n/m>. Given \>> and \>, the probability that \\=> is <\equation*> |n>=*|)>\*|)>\\*. The probability that \\=>> for some is therefore at least <\equation*> 1-*|)>*>\1-\/\>. We conclude that the probability that this holds for all \> is at least *\/\>>. We say that > is if for every \>> and \> such that |\|>\B>> and \\>, we have =p*e>. <\lemma> For random choices of ,\,p>, the code > is faithful with probability at least <\equation*> 1-*\*n*log B|P>. <\proof> Let > be the set of all primes strictly between and . Let > be the set of all ,\,p|)>\\> such that ,\,p> are pairwise distinct. Let > be the subset of > of all choices of ,\,p> for which > is not faithful. Consider \>>, \>, and ,\,p|)>\\> be such that \\>, |\|>\B>> and \p*e>. Let \\-p*e\0>. For each \>, let <\equation*> pq>\,\,p,q,p,\,p|)> and >\\,pq>,I|]>>>, so that |]>>=\>. For each \>, using |\|>\B>>, we observe that <\equation*> \>=\+q*e+\*B> necessarily holds with \>. Now consider the set ,\,p,p,\,p>> of \> such that >> is divisible by . Any such is a divisor of either -B>>, >, or +B>>. Since is odd we have <\equation*> |\|>\>-1|2>, and therefore <\equation*> |\|>\|\|>+*e|\|>\>-1|2>+>|2>=B>-. It follows that +\*B>\0>. Consequently, <\eqnarray*> ,\,p,p,\,p>|\|>>|>||>|)>|log P>|\>>>||>|>|)>|log P>+1|)>>>|||*log B|log P>+|log P>|)>>>||>|*log B|log P>+|)>)>>>||>|*log B|2*log P>.>>>> Now let <\equation*> \\,\,p|)>\\\p\\,\,p,p,\,p>|}>, so that \\>. By what precedes, we have <\equation*> |\|>\*|\|>|n-1>**log B|log P>, whence <\equation*> |\|>\*\*\*n*|\|>|n-1>*. From |\|>=|\|>|n>> we deduce that <\equation*> |\|>||\|>>\*\*n*log B|2*|\|>-n+1|)>*log P>. From Theorem we know that |\|>\*P/log P>. This yields |\|>\*P/log P>, as well as |\|>\2*n>, thanks to. We conclude that\ <\equation*> |\|>||\|>>\*\*n*log B||\|>*log P>\*\*n*log B|P>. Given \\>>>> and \>, let ,i>> be the smallest index such that \\,i>>\0> and ,i>>|\|>/p\B>>. If no such,i>> exists, then we let ,i>\\>. We define >\,0>,\,k,n-1>|)>>. Assume that =\> is transparent and faithful for some \>. Given \>, let \\,i>>/p> if ,i>\\> and \0> otherwise. Then the condition ,i>>|\|>/p\B>> implies that *|~>|\|>\B>> always holds. Moreover, if |\|>\B>>>, then the definitions of \Ptransparent\Q and \Pfaithful\Q imply that =|~>>. In other words, these > can efficiently be recovered from> and >>. <\lemma> Let ,\,\|)>\>>>|)>>, where \m>. Then we can compute >,\,k>|)>> in time *\*log B|)>>. <\proof> Note that the hypotheses \m> and \1> imply that *\|)>>. Using Lemma, we can compute all triples > with \\> in time <\equation*> *\*log B+n*log P|)>=*\*log B|)>, thanks to. Using *\*log B|)>> further operations we can filter out the triples >> for which |\|>/p\B>>>. We next sort the resulting triples for the lexicographical ordering, which can again be done in time *\*log B|)>>. For each pair >, we finally only retain the first triple of the form >. This can once more be done in time *\*log B|)>>. Then the remaining triples are precisely those ,i>|)>> with ,i>\\>. The following combination of the preceding lemmas will be used in the sequel: <\lemma> Let =,\,e|)>\|)>> and let =,\,\|)>\>>>|)>> be the corresponding values as above. Let be the set of indices \> such that =\|)>> and \n/m>. Then there exists an algorithm that takes > as input and that computes |~>=,\,|)>>\|)>> such that *|\|>\B>> for all >\\\\> and =e> forall J> with |\|>\B>>. The algorithm runs in time *\*log B|)>> and succeeds with probability at least <\equation*> 1-*\/\>>-*\*T*n*log B|P>. <\proof> Lemmas and bound the probability of failure for the transparency and the faithfulness of the random choices for each of the terms. The complexity bound follows from Lemma. Our next aim is to determine all exponents > with >\\>, for some fixed threshold>. For this, we will apply Lemma several times, for different choices of ,p,I|]>>>. Let <\equation*> U\|log min,n|)>|\>+2. For ,U>, let <\eqnarray*> >>|>|||2>|\>>>|>>|>||>|\>>>|>>|>||\*>>|\>.>>>> We also choose >> and >> independently at random as explained in section, with >>, >>, and >> in the roles of >, , and >. We finally define <\equation*> \>\\>,p>,I>|]>>, for ,U>. Note that the above definitions imply <\equation> 4*min,n|)>\2\8*min,n|)> and m>\n>. The inequality >\n/2+1> implies <\equation> 2\>>,whenever >m>\2. If >=1>, then \2\4*n=4*n/m>>, so, in general, <\equation> 2\>>. By we have \2>, whence <\equation> \>\\>\\\\>. <\lemma> For ,U>, we have >*\>\18*\*\>. <\proof> From \1>, we get <\equation> \>\|\*2|\>\\*2. Now <\eqnarray*> >*\>>|>||2>+1|)>*\*2)>>>||>|*+2|)>>>||>|*\,)>>>>> which concludes the proof. <\theorem> Let =,\,e|)>\|)>> and let >=>,\,\>|)>\>>>>|)>> for ,U>> be as above. Assume that \n>. Let be the set of indices \> such that >\\> and >=\>|)>>> for all ,U>. Then there exists an algorithm that takes >,\,\>>> as input and that computes |^>=,\,|)>\|)>>> such that >\\>> for all \> and =e> for all J>. The algorithm runs in time *T*\*log B|)>> and succeeds with probability at least <\equation*> 1-/\>>-*U*\*T*n*log B|P>. <\proof> We compute successive approximations >,\>,\,\>> of > as follows. We start with >\0>. Assuming that >> is known, we compute <\equation*> |\>\>-\>>|)>|)>|)> rem B>>\\>>>> for all \>, and then apply the algorithm underlying Lemma to |\>\\-\>> and |\>\|\>,\,|\>|)>>. Note that for all J> the equality >=\>|)>> implies <\eqnarray*> |\>>||>|)>-\>>|)>|)>|)> rem B>>>>|||>->|)>|)>>>|||>>|)>.>>>> Our choice of >> and >>, the inequality, and the assumption \n> successively ensure that <\equation*> >|\>>\>+1||2>>\|5*\>\>\T, so we can actually apply Lemma. Let>> be the set of indices \> such that |\>=\>>|)>> and >\n/m>>. Lemma provides us with |~>=,\,|)>>\|)>> such that *|\|>\B>>> for all >\\\\>. In addition =>> holds whenever J>>> and >|\|>\B>>> with probability at least <\equation*> 1->>*\/\>>-*\>*T*n*log B|P>. Now we set <\equation> \>\\>+|~>. At the end, we return |^>\\>> as our guess for >. For the analysis of this algorithm, we first assume that all applications of Lemma succeed. Let us show by induction (and with the above definitions of |\>> and |\>>) that, for all \>, J>, and ,U>, we have: <\enumerate-roman> >\n/m>>; >|)>|\|>\B>>>>; >|)>=e> whenever |\|>\B>>*|)>>. If and >=1> then() clearly holds. If and >\2>, then and imply >\2\min,n|)>>. Since J> we have >\\>. Now we clearly have >\\>> and >\n>, so() holds. If 2>, then let \> be such that >\0>, so we have >|)>\e>. The induction hypothesis() and yield <\equation*> 4*P*|\|>\B>>*|)>\*B>>, whence |\|>\B>>>. Consequently, <\eqnarray*> |\|>>|>|>*log B-log>>||>|>*log >\1>)>>>||>|>*log |3>)>>>||>|>*-3|)>|2>2-1>>)>>>||>|*-3|)>|2>>>||>||2>)>>>||>||n/m>>.)>>>>> Hence the total bit-size of all |\|>> such that >\0> is at least >|)>*\/>|)>>> and at most> by definition of . This concludes the proof of(). Now if J>, then >=\>|)>=|\>>>, so Lemma and() imply that J>>. In other words, we obtain an approximation |~>> of |\>> such that >*|\|>\B>>> for all \\\J>, and =>> holds whenever >|\|>\B>>>. Let us prove(). If , then >|)>|\|>=2*P*|\|>\B>>>. If 2>, then yields <\eqnarray*> >|)>|\|>>|>|>|)>|\|>+2*P*|\|>>>||>|>>+2*P*|\|>())>>>||>|>>+>>*B>>>>||>|>>+*B>>>\P+1>)>>>||>|>>|B>+*B>>))>>>||>|>>|P+1>+*B>>)>>>|||>>.>>>> As to(), assume that |\|>\B>>*|)>>>. If , then >|)>=>=e> is immediate. If 2>, then the induction hypothesis() yields >|)>|\|>\B>>>>, whence <\eqnarray*> >|\|>>|>||\|>+4*P*>|)>|\|>>>||>|>>-2*B>-1>+2*B>>>>||>|>>.))>>>>> We deduce that =>> holds, hence implies(). This completes our induction. At the end of the induction, we have >=5*\> and, for all \\\J>, <\eqnarray*> |\|>>|>|>>\\>)>>>||>|*B*2>)>>>|||*B>/ B|)>+1/2>>>||>|*B>/50+1/2>)>>>||>|>/50+3/4>)>>>||>|>>.>\1>)>>>>> By() and, this implies the correctness of the overall algorithm. As to the complexity bound, Lemma shows that >*\>=O*\|)>>, when applying Lemma. Consequently, the cost of all these applications for ,U> is bounded by <\equation*> \>*T*\>*log B|)>=*T*\*U*log B|)>=*T*\*log B|)>, since |)>>. The cost of the evaluations of >>|)>|)>> and all further additions, subtractions, and modular reductions is not more expensive. The bound for the probability of success follows by induction from Lemmas and, while using the fact that all >> and >> are chosen independently. Throughout this section, we assume that is a modular blackbox polynomial in ,\,x|]>>> with at most terms and of bit-size at most T>. In order to simplify the cost analyses, it will be convenient to assume that <\equation*> S\max|)>. Our main goal is to interpolate . From now on will stand for the actual number of non-zero terms in the sparse representation of . Our strategy is based on the computation of increasingly good approximations of the interpolation of , as in, for instance. The idea is to determine an approximation > of, such that >> contains roughly half the number of terms as , and then to recursively re-run the same algorithm for > instead of . Our first approximation will only contain terms of small bit-size. During later iterations, we will include terms of larger and larger bit-sizes. Throughout this section, we set <\equation> \\2*\and>\\|\*S/T|\>, so that at most > of the terms of have size\>. Our main technical aim will be to determine at least >> terms of of size \>, with high probability. Our interpolation algorithm is based on an extension of the univariate approach from. One first key ingredient is to homomorphically project the polynomial to an element of /-1,M|)>> for a suitable cycle length \>> and a suitable modulus (of the form >>, where is as in the previous section and \\>>). More precisely, we fix <\equation> R\max|)>*\ and compute |)>> as in Theorem. Now let ,\,\> be random elements of ,r-1|}>>, and consider the map <\eqnarray*> ,r>:\,\,x|]>>|>|/-1|)>>>||>|>,\,t>|)> mod -1|)>.>>>> We call ,r>> a . <\lemma> The bit-size of the product of the non-zero coefficients of ,r>> is at most >. <\proof> Given +\+a*t\\/-1|)>>, let \\>+\+\>> and \\0>a>. Note that \\+\> and \\+\> for any \>. The first inequality yields ,r>>\\>, whereas the second one implies ,r>|)>>\\,r>>>. Given a modulus \>>, we also define <\eqnarray*> ,r,M>:\,\,x|]>>|>|/-1,M|)>>>||>|>,\,t>|)> mod -1,M|)>>>>> and call ,r,M>> a . If =\,r>>, then we say that a term > of and the corresponding exponent are > if there is no other term *x>> of such that |)>=\>|)>>. If =\,r,M>>, then we define >-faithfulness in the same way, while requiring in addition that be invertible modulo . For any \2>, we note that > is ,r,M>>faithful if and only if > is ,r,M>>>faithful. We say that is >-faithful if all its terms are >-faithful. In asimilar way, we say that >\\,r>> is faithful if is invertible for any non-zero term>>>of>>. The first step of our interpolation algorithm is similar to the one from and consists of determining the exponents of >\\,r>>. Let be a prime number. If >> is faithful, then the exponents of >> are precisely those of > rem q=\,r,q>>. <\lemma> We can compute ,r,q>> in time <\equation*> |)>*r*\+n*. <\proof> We first precompute ,\,\> in time >. We compute \\,r,q>> by evaluating |)>=f>,\,\>|)>>> for ,r-1>>. This takes |)>*r*\+n*r*> bit-operations. We next retrieve > from these values using an inverse discrete Fourier transform (DFT) of order . This takes > further operations in >, using Bluestein's method. Assuming that >> is -faithful and that we know ,r,q>>, consider the computation of ,r,q>>> for higher precisions >. Now >> is also >>-faithful, so the exponents of ,r,q>>> and ,r,q>> coincide. One crucial idea from is to compute ,r,q>>> using only \#\,r,q>> instead of evaluations of modulo >>. This is the purpose of the nextlemma. <\lemma> Assume that >> is -faithful and that ,r,q>> is known. Let \#\,r,q>\T> and let \f*x|)>>, where =,\,\|)>\\>>> is such that rem q\0> for all \>. Then we may compute ,r,q>>> in time <\equation*> *\|)>*T*\*\+n*+log r|)>*\*log q|)>. <\proof> We first Hensel lift the primitive -th root of unity > in > to a principal -th root of unity |~>> in /|q>*\|\>> in time *log q|)>>, as detailed in2.2>. We next compute |~>>,\,|~>>> in time *log q|)>>, using binary powering. We pursue with the evaluations \f*|~>>,\,\*|~>>|)>> for ,T-1>. This can be done in time <\equation*> *\|)>*T*\*\+n*T**log q|)>. Now the exponents ,\,e>\\> of ,r,q>>> are known, since they coincide with those of ,r,q>>, and we have the relation <\equation*> |||||>|>||~>>>||~>>>|>||~>-1>>>>|>|>||>>||~>-1|)>*e>>||~>-1|)>*e>>|>||~>-1|)>*e-1>>>>>>>*||>>|>>|>>|-1>>>>>>=||>>|>>|>>|-1>>>>>>, where > denotes the coefficient of >> in ,r,q>>>, for ,T-1>. It is well known that this linear system can be solved in quasi-linear time *\*log q|)>>: in fact this problem reduces to the usual interpolation problem thanks to the transposition principle; see for instance5.1>. The next lemma is devoted to bounding the probability of picking up random prime moduli that yield -faithful projections. <\lemma> Let ( >>) be the number of terms ( ,r,q>>-faithful terms) of of size\>. Let the cycle length and the modulus be given as described in Theorem. If the ,\,\> are uniformly and independently taken at random ,r-1|}>>, then, with probability 1->>>, the projection >> is -faithful and <\equation*> N>\>>|)>*N->. \; <\proof> We let \1/\> and 2/\> as in, which allows us to apply Theorem. Let c*x> and \\c\0\\>+\\\|}>>. For any \>, let <\equation*> \\\0>|\|>. Given E>, consider \\*\E\>\-e>> and note that \2*\>>. We say that is if 0> and -e|)> rem r\0> for all \E\>. This is the case if and only if > is not divisible by . Now > is divisible by at most *\>|)>> distinct prime numbers, by Theorem. Since there are at least <\equation*> *R/log R>\*\*S/log *S|)> prime numbers in >, by Theorem, the probability that > is divisible by is at most <\equation*> *\>|)>|*\*S/log *S|)>>. From we obtain \\\64>, whence *\>\\>>. It follows that <\eqnarray*> *\>|)>|*\*S/log *S|)>>>|>|*\>|)>/log *\>|)>|)>|0.6*\*S/log *S|)>>>>||>|*\/log |*\|)>|\>|0.6*\*S/log *S|)>>>>||>|*\/log |*\|)>|\>|0.6*\*S/log *S|)>>>>|||*\/log |*\|)>|\>|0.3*\*S/log *S|)>>>>||>|>**T/log *T|)>|\*S/log *S|)>>\T>)>>>||>|>,>>>> since *S\\*T> from. Now consider two admissible exponents e> in and let \> with -e|)> rem r\0>>. For fixed values of > with i>, there is a single choice of \,r-1|}>> with \|)> rem r=0>>. Hence the probability that this happens with random ,\,\> is >. Consequently, for fixed E>, the probability that \|)> rem r=0>> for some \E\> is at most <\equation> |r-1>\|R>\|\*S>\|\*\*T>\*\>\>, thanks to. Assuming now that is fixed, let us examine the probability that >> is -faithful. Let> be the product of all non-zero coefficients of >>. Then >>is faithful if and only if does not divide >. Now the bit-size of the product > is bounded by \S>, by Lemma. Hence> is divisible by at most > prime numbers, by Theorem and our assumption that 2>. With probability at least <\equation> 1->, there are at least /> prime numbers amongst which is chosen at random, by Theorem(). Assuming this, >>is not -faithful with probability at most <\equation> />\>\>, since S\2>, by. Let > be the set of E> such that rem q\0>> and \|)> rem r\0> for all \E\>>. Altogether, the bounds, (), , and() imply that the probability that agiven E> belongs to > is at least >. It follows that the expectation of |\|>> is at least|)>*>>. For <\equation*> \\>=>>, this further implies that the probability that |\|>\/\|)>*> cannot exceed>: otherwise the expectation of |\|>> would be |\*/\|)>*+|)>*=|)>*|\>>. We finally observe that E> is ,r,q>>-faithful whenever \|)> rem r\0> for all \supp f> such that >>+\>\\>. Now for every > with >>+\>\\>, there is at most one E> with \|)> rem r=0>: if \-e|)> rem r=0> for \E>, then \-e|)> rem r=0>, whence =e>. By(), there are at most > exponents > with >>+\>\\>. We conclude that >\/\|)>*N-T/\>, whenever |\|>\/\|)>*>. Lemma allows us to compute the coefficients of ,r,q>>*x|)>|)>> with good complexity. In the univariate case, it turns out that the exponents of ,r,q>>-faithful terms of can be recovered as quotients of matching terms of ,r,q>>>|)>*f|)>> and ,r,q>>> by taking> sufficiently large. In the multivariate case, this idea still works, for a suitable Kronecker-style choice of>. However, we only reach a suboptimal complexity when the exponents of are themselves sparse and/or of unequal magnitudes. The remedy is to generalize the \Pquotient trick\Q from the univariate case, by combining it with the algorithms from section: the quotients will now yield the exponents in encoded form. Let us now specify the remaining parameters from section. First of all, we take q>>, where <\equation*> \\||\>. Consequently, <\equation> 2*n*S\B\2*n*S*q. We also take |/2|\>> and \|6*\*log S|\>>. Since is odd, the inequalities hold. We compute and >,p>,I>> for ,U> as in section. For ,U>, \>>>, and \>, let <\eqnarray*> >>|>|,r,B>>>>>|>>|>|>*B>>>|i\I>>>||.>>>>>>>>> For any term > with \> and \>, note that <\eqnarray*> >|)>>||\e>>>|>>*x|)>|)>>||>*B>>|)>*c*t\e>.>>>> Whenever 0>, it follows that >> can be read off modulo >>> from the quotient of >>*x|)>|)>> and >|)>>. <\lemma> Let , , >, >, be as in )>, )> and let |)>> be as in Theorem. Then we can compute the random parameters >>, >> (,U>, \>>>) and ,\,\> in time> and with a probability of success 1-1/S>. <\proof> The cost to generate random elements ,\,\> in ,r-1|}>> is <\equation*> O|)>=O, since we assumed n>. The generation of >,\,I>-1>>> can be done in time <\equation*> O>*m>*\|)>=O*n*\|)>=. For ,U>, we compute >> using Lemma with \1/>. The computation of a single >> takes time <\equation*> O|)>|)>|)>*>= and succeeds with probability at least >. The probability of success for all ,U>> is at least |)>\1-1/S>, because <\equation*> log|)>\|1-\>=\|U>. We conclude with the observation that >. <\lemma> Assume that >> is -faithful and that ,r,q>> is known. Let \#\,r,q>\T>. Then we can compute >> and >>*x|)>|)>> for all ,U|}>> and \>>> in time <\equation*> **\*S|)>. <\proof> For a fixed ,U|}>>, we can compute >> and >>*x|)>|)>> for all \>>> in time <\equation*> 2*\>*>*\*\|)>*T*\>*\*\+\>*n*+log r|)>*\>*\*log q|)> by Lemma. From Lemma and *\=O>, we get <\equation*> \>*\>*\*\\18*\*\*\*\=O*\*log S|)>. By definition of and >, we have =O>. By we also have =O S|)>>. Hence the cost for computing >> and >>*x|)>|)>> simplifies to <\eqnarray*> ||>*\*\|)>*O*\*\*log S|)>+n*+log r|)>*\*\*log S|)>>>|||+log n*log S|)>**\*S|)>+n**\*S|)>)>>>|||+n|)>**\*S|)>>>|||**\*S|)>.|)>>)>>>>> Since ,n|)>|)>=O>, the total computation time for ,U> is also bounded by **\*S|)>>. Consider a family of numbers \\>>>>, where ,U>>, \>>>, and \>>. We say that |)>> is a for if we have =\>> whenever is a ,r,B>>>>-faithful exponent of with >>=\,r,B>>>|)>>. We require nothing for the remaining numbers >, which should be considered as garbage. <\corollary> Assume that >> is -faithful and that ,r,q>> is known. Then we may compute a faithful exponent encoding for in time **\*S|)>>. <\proof> We compute all >> and >>*x|)>|)>> using Lemma. Let > and > be the coefficients of >>> in >> and >>*x|)>|)>>. Then we take \c/c> if > is divisible by > and \0> if not. For a fixed ,U|}>> all these divisions take *\*S|)>> bit-operations, using a similar reasoning as in the proof of Lemma. Since >, the result follows. Let c*x>. We say that =*x\\,\,x|]>> is a > of if \#f> and |)>\>. <\lemma> Let |)>> be as in Theorem, with as in. There is a Monte Carlo algorithm that computes a -approximation of in time **\*S|)>> and which succeeds with probability at least >>->. <\proof> We first compute the required random parameters as in Lemma. This takes time> and succeeds with probability at least . We next compute ,r,q>> using Lemma, which can be done in time <\equation*> |)>*r*log q+n*=*O+n*=+n|)>*. We next apply Corollary and compute a faithful exponent encoding for , in time **\*S|)>>. By Lemma, this computation fails with probability at most >>>. Moreover, in case of success, we have >\>>|)>*N->>>, still with the notation of Lemma. From() and T>, we get \\*S-T\-1|)>*S\S\n>. This allows us to apply Theorem to the faithful exponent encoding for . Let >,\,>-1>> be the exponents of ,r,q>>. Consider \>> such that there exists a,r,q>>-faithful term> of with ,r,q>|)>=t>>> and \\>. Theorem produces aguess for for every such . With probability at least <\equation*> 1-T*n*U*\/\>-*U*\*T*n*log B|P> these guesses are all correct. The running time of this step is bounded by <\equation*> *T*\*log B|)>=*\*S|)>, since >, n>, and imply >. Below we will show that <\equation> T*n*U*\/\>+*U*\*T*n*log B|P>\. Let > be the smallest integer such that >\2+1>>. We finally compute ,r,q>>> using Lemma, which can be done in time <\equation*> *\|)>*T*\*\+n*+log r|)>*\*log q|)>=+n|)>**S|)>=**S|)>. Let ,\,c-1>\\>>> be such that <\equation*> \,r,q>>=*t>>+\+c-1>*t>-1>>|)> mod q>. For every ,r,q>>-faithful term > of with ,r,q>|)>=t>>> and \\>, we have <\equation*> \,r,q>>|)>= mod q>|)>*t>>, so we can recover \\>>> from ,r,q>>>. With a probability at least >>->, all the above steps succeed. In that case, we clearly have =T\T=#f>. Let +f>>, where > ( >>) is the sum of the terms of bit-size\>(\>), so that >>. Then \>> and <\equation*> #>-|)>\N-N>+>\>>*N+>>. Consequently, <\equation> #|)>\>>*N+>\, which means that \c*x>+\+c-1>*x-1>>> is a -approximation of . In order to conclude, it remains to prove the claimed inequality. Using the definition of > and the inequalities S>, S>, log S+2\S>, we have <\equation> T*n*U*\/\>\>\>. From we have 2> and therefore /\/+2|)>\1+2>. So the inequality *T\\*S> yields \; <\equation> *U*\*T*n*log B|P>\*\*U*n*S*log B|>. Let us analyze the right-hand side of. Without further mention, we will frequently use that 2>. First of all, we have <\eqnarray*> >|>| S+1\|)>|)>*log S\1.54*log S>>|>||\152*log S,>>|>|||6*\*log S|\>\|)>|)>*log S\17*log S,>>||>| \+3\log \+log S+3>>||| \+log S+9\2*log S+1|)>+log S+9\3*log S.>>>> It follows that <\equation> 577*\*\*U\577\152\17\3*log S\2*log S. Now the function x|)>/x> is decreasing for \>. Consequently, <\equation*> |S>\|)>|2*S>\|2>\2. Similarly, /S\17/2\2>. Hence, <\eqnarray*> |||)>*\=max*S*\,2*\|)>\2*S>>||>|\2*S>>||>|*n*S*q\2*S*q\2*S.))>>>>> This yields <\equation> log B\22*log S+382*log 2\46*log S. Combining, , and, we deduce that <\equation> *\*U*n*S*log B|>\2*n*S*log S|>\2*n*S*log S|2*n*S>\ S|S>. The inequalities, , and finally yield the claimed bound: <\equation*> T*n*U*\/\>+*U*\*T*n*log B|P>\>+ S|S>=>+ S|S>|S>\. Instead of just half of the coefficients, it is possible to compute any constant portion of the terms with the same complexity. More precisely, given \0>, we say that > is a |)>>-approximation of if \#f> and |)>\\*T>. <\theorem> Let |)>> be as in Theorem, with as in. Given \0>, there is a Monte Carlo algorithm that computes a |)>>-approximation of in time +n|)>*> and which succeeds with probability at least >. <\proof> When =2*\\>> \ we have >>+>\\>, so() becomes |)>\\*>. In addition, we have >>+\\>, whenever >>. In other words, if is sufficiently large, then the proof of Lemma actually yields a |)>>-approximation of . Once an approximation > of the sparse interpolation of is known, we wish to obtain better approximations by applying the same algorithm with > in the role of . This requires an efficient algorithm to evaluate >. In fact, we only need to compute projections of > as in Lemma and to evaluate > at geometric sequences as in Lemma. Building a blackbox for > turns out to be suboptimal for these tasks. Instead, it is better to use the following lemma. <\lemma> Let \\,\,x|]>> be written <\equation*> =c*x>+\+c*x> where |c,\,c|\>\\>> and ,\,e\\>, let \max,T-1>>+\>|)>>, let |)>> be as in Theorem, and let ,\,\> be in ,r-1|}>>. <\enumerate-roman> We can compute ,r,q>|)>> in time <\equation*> T*n*+log q|)>. Given \0>, \0>, an element |~>\\>*\>> of order , and =,\,\|)>\\>>>, we can compute *|~>>,\,\*|~>>|)>> for ,T-1> in time <\equation*> T*|)>+*log r**log q|)>+*\*log q|)>. <\proof> Given \>, the projections ,r,q>>|)>=t*\+\+e*\|)> rem r>> can be computed in time +log r|)>>. The projections ,r,q>*x>|)>= rem q|)>*\,r,q>>|)>> can be obtained using +log q|)>> further operations. Adding up these projections for ,T-1> takes > operations. Altogether, this yields(). As for part() we follow the proof of Lemma. We first compute \|~>>,\,|~>>|)>>> with *log q|)>> by binary powering. The reductions rem r,\,e rem r> of exponents are obtained in time |)>>. Then, the powers >,\,\>> and >,\,\>> can be deduced in time *log q|)>>. Let \*x|)>>. Then we have <\equation*> ||>>||)>>>|>>|-1>|)>>>>>>=|||||>|>|>>|>>|>|>>>|>|>||>>|-1|)>*e>>|-1|)>*e>>|>|-1|)>*e>>>>>>*||>*c>>|>*c>>|>>|>*c>>>>>. By the transposition principle, this matrix-vector product can be computed with the same cost as multi-point evaluation of a univariate polynomial. We conclude that \ ,g|)>,\,g|)>> can be computed in time |)>*\*log q|)>>. We are now ready to complete the proof of our main result. <\render-proof|Proof of Theorem> We set \*T> and take >, as in. Thanks to Theorem, we may compute a suitable triple |)>> in time >|)>=O>|)>>, with probability of success at least >, where \1/\>. Let |log T|\>+1>. Let |j|\>>\|T/2|\>> and |j|\>>\|\*S/T|j|\>>|\>> for ,J>. Starting with |0|\>>\0>, we compute asequence |0|\>>,f|1|\>>,\,f|J|\>>> of successive approximations of as follows. Assuming that |j|\>>> is known for some J>, we apply Lemma with |j|\>>> and |j|\>>> in the roles of and . With high probability, this yields a |j|\>>>-approximation |j|\>>> of |j|\>>> and we set |j+1|\>>\f|j|\>>+\|j|\>>>. Taking |j|\>>> into account in Lemma requires the following changes in the complexity analysis: <\itemize> Since |j|\>>> has T> terms of size \>, part() of Lemma implies that the complexity bound stated in Lemma becomes <\equation*> |)>*+T*\*n**log q|)>=|)>*+n*. Part() of Lemma implies that the complexity bound stated in Lemma becomes <\equation*> *\|)>*T*\*\++n*+log r|)>*\*log q|)>+T*log r**log q|)>. It follows that the complexity bounds of Lemmas, , and Corollary remain unchanged. The total running time is therefore bounded by <\equation*> J***\*S|)>=*. Using the inequalities \+1> and 2>, the probability of failure is bounded by <\equation*> J*>+>>+|)>\+1|)>**\>+>+|)>\>. If none of the steps fail, then |j+1|\>>|)>\|j|\>>|2>\T|j+1|\>>> for ,J-1>>, by induction. In particular, |J|\>>|)>\|T/2|log T|\>>|\>|2>=>, so |J|\>>>. <\remark> As interesting open problem is whether the complexity bound in Theorem can be replaced by *>, where is a given bound on the bit-size of. This was erroneously asserted in a prior version of this paper and such a bound would be more efficient in cases where the sizes of the coefficients and exponents of vary widely. Here, the first difficulty appears when a first approximation > of has > terms with small coefficients and > has just a few terms but huge coefficients of bit-size >: the evaluation cost of > in part() of Lemma would need to remain in >. Recent advances in this problem can be found in for the univariate case. Note that the bound *> is still achieved in the approximate version (Theorem) of our main theorem. For practical purposes, the choice of 2*\> is not realistic. The high constant > is due to the fact that we rely on for unconditional proofs for the existence of prime numbers with the desirable properties from Theorem. Such unconditional number theoretic proofs are typically very hard and lead to pessimistic constants. Numerical evidence shows that a much smaller constant would do in practice: see4> and2.2.2>. For the univariate case the complexity of the sparse interpolation algorithm in is made precise in term of the logarithmic factors. The exposition so far has also been optimized for simplicity of presentation rather than practical efficiency: some of the other constant factors might be sharpened further and some of the logarithmic factors in the complexity bounds might be removed. In practical implementations, one may simply squeeze all thresholds until the error rate becomes unacceptably high. Here one may exploit the \Pauto-correcting\Q properties of the algorithm. For instance, although the |j|\>>>-approximation |j|\>>> from the proof of Theorem may contain incorrect terms, most of these terms will be removed at the nextiteration. Our exposition so far has also been optimized for full generality. For applications, ahigh number of, say , variables may be useful, but the bit-size of individual exponents rarely exceeds the machine precision. In fact, most multivariate polynomials of practical interest are of low or moderately large total degree. A lot of the technical difficulties from the previous sections disappear in that case. In this section we will describe some practical variants of our sparse interpolation algorithm, with a main focus on this special case. In practice, the evaluation of our modular blackbox polynomial is typically an order of magnitude faster if the modulus fits into a double precision number ( bits if we rely on floating point arithmetic and bits when using integer arithmetic). In this subsection, we describe some techniques that can be used to minimize the use of multiple precision arithmetic. If the individual exponents of are small, but its coefficients are allowed to be large, then it is classical to proceed in two phases. We first determine the exponents using double precision arithmetic only. Knowing these exponents, we next determine the coefficients using modular arithmetic and the Chinese remainder theorem: modulo any small prime, we may efficiently compute using only evaluations of modulo on a geometric progression, followed by5.1>. Doing this for enough small primes, we may then reconstruct the coefficients of using Chinese remaindering. Only the Chinese remaindering step involves a limited but unavoidable amount of multi-precision arithmetic. By determining only the coefficients of size \>, where > is repeatedly doubled until we reach , the whole second phase can be accomplished in time |)>*O+O S|)>>. One important trick that was used in section was to encode >> inside an integer >*B>>> modulo >>> of doubled precision >*\> instead of >*\>. In practice, this often leads us to exceed the machine precision. An alternative approach (which is reminiscent of the ones from) is to work with tangent numbers: let us now take <\eqnarray*> >>|>|,r,B>>>>>|>>|>|>*\>|i\I>>>||>>>>>>>>> where >> is extended to ,\,x|]>|]>/|)>> and where >\\>>>|]>/|)>>. Then, for any term >, we have <\eqnarray*> >|)>>||\e>>>|>>*x|)>|)>>||>*\|)>*c*t\e>,>>>> so we may again obtain >> from >>*x|)>|)>> and >|)>> using one division. Of course, this requires our ability to evaluate at elements of /|B|}>>>*\|\>|)>|]>/|)>|)>>, which is indeed the case if is given by an SLP. Note that arithmetic in /|B|}>>>*\|\>|)>|]>/|)>>> is about three times as expensive as arithmetic in /|B|}>>>*\|\>>. Although the algorithm > from section is asymptotically efficient, it relies heavily on multiple precision arithmetic. If all > and > fit within machine numbers and > is not too large, then we expect it to be more efficient to simply compute all remainders rem p>. After the computation of pre-inverses for>, such remainders can be computed using only a few hardware instructions, and these computations can easily be vectorized. As a consequence, we only expect the asymptotically faster algorithm > to become interesting for very large sizes like \1000>>. Of course, we may easily modify > to fall back on the naive algorithm below a certain threshold (recursive calls included); vectorization can still be beneficial even for moderate sizes. As explained above, if has only small exponents, then multiple precision arithmetic is only needed during the Chinese remaindering step that recovers the coefficients from modular projections. If actually does contain some exponents that exceed machine precision, is it still possible to avoid most of the multiple precision arithmetic? Let |)>> be a triple as in Theorem. In order to avoid evaluations of modulo large integers of the form >>, we wish to use Chinese remaindering. Let ,\|)>,\,>,\>|)>>> be > triples as in Theorem with *\*q>\q>>, the > pairwise distinct, and where is shared by all triples. Since there are many primes for which |)>\+1|)>>> contains at least />, such triples can still be found with high probability. In practice, |)>|)>\+1|)>> already contains enough prime numbers. Evaluations of modulo >> are now replaced by separate evaluations modulo ,\,q>> after which we can reconstruct evaluations modulo *\*q>> using Chinese remaindering. However, one crucial additional idea that we used in Lemma is that >>isautomatically >>-faithful as soon as it is -faithful. In the multi-modular setting, if >> is >faithful, then it is still true that the exponents of > rem q> are contained in the exponents of > rem q> for ,\>. This is sufficient for Lemma to generalize. Let ,\,d> be bounds for the degree of in ,\,x>, let be a bound for the maximal number of variables that can occur in a term of , and let ,\,p> the prime numbers from section. Assume that our polynomial has only nonnegative \Pmoderately large exponents\Q in the sense that *p,\,d*p|)>> fits into a machine number. Then we may simplify the setup from section by taking <\eqnarray*> |>||n/V|\>>>|>|>||\*V|\>>>||>||\*n*log n|\>>>|>|>|I>p*e\\>>|>|>|,\,\-1>|)>\\>,>>>> where \2> and \2> are small constants and where we forget about and >. For any =,\,\-1>|)>\\>>, let \\>\\\0|}>|\|>>. In this simplified setup, one may use the following algorithm to retrieve from >: <\specified-algorithm> \\>>. with =\> or >. <|specified-algorithm> Let 0>, \\>, and \\>. While there exist \>> and \> with \\\0> and \\> do: <\indent> Let \/p>. Let \\|\>,0,q,0,\,0|)>>. If \\> for some \>> or -\|)>\#\>, then let \\\|}>>. Otherwise, update \e+q>, \\-\>, and \\>. If =0>, then return . Otherwise, return >. The probability of success is non-trivial to analyze due to the interplay of the choices of ,\,p> and the updates of >. For this reason, we consider the algorithm to be heuristic. Nevertheless, returned values are always correct: <\proposition> Algorithm is correct. <\proof> By construction, we have the loop invariant that +\=\>, so the answer is clearly correct in case of success. The set of \Pproblematic pairs\Q > was introduced in order to guarantee progress and avoid infinite loops. Indeed, > strictly decreases at every update of >. For definiteness, we also ensured that > remains in >> throughout the algorithm. (One may actually release this restriction, since incorrect decisions may be corrected later during the execution.) <\remark> Algorithm has one interesting advantage with respect to the method from section: the correct determination of some of the > leads to a simplification of>, which diminishes the number of collisions ( entries =I>p*> such that the sum contains at least two non-zero terms). This allows the algorithm to discover more coefficients > that might have been missed without the updates of >. As a result, the algorithm may succeed for lower values of > and >, for a more compressed encoding of . <\remark> From a complexity perspective, some adaptations are needed to make Algorithm run in quasi-linear time. Firstly, one carefully has to represent the sets > so as to make the updates \\-\>> efficient. Secondly, from a theoretical point of view, it is better to collect pairs>> with \\\0> in one big pass (thereby benefiting from Lemma) and perform the updates during a second pass. However, this second \Poptimization\Q is only useful in practice when becomes very large ( 10000>), as explained in the . Algorithm is very similar to the mystery ball game algorithm from. This analogy suggests to choose the random sets > in a slightly different way: let |\,\,\|\>:\\\>> be random maps, where \|\*n/3|\>> and \3*\=3*|\*n/3|\>>, and take <\equation*> I+k>\\|)>,j\\,k\\>. Assuming for simplicity that =\/p> whenever \\> in our algorithm, the probability of success was analyzed in. It turns out that there is aphase change around \1.22179> (and /3\0.407265>). For any \\> and \0>, numeric evidence suggests that the probability of success exceeds > for sufficiently large . This should be compared with our original choice of the >, for which the mere probability that a given index \> belongs to none of the > is roughly |)>*V>\\>>. We clearly will not be able to determine > whenever this happens. Moreover, the probability that this does not happen for any \> is roughly >>. In order to ensure that >\1->, say, this requires us to take \log n+7>. Ben-Or and Tiwari's seminal algorithm for sparse interpolation contained another way to encode multivariate exponents, based on the prime factorization of integers: givenpairwise distinct prime numbers ,\,p>, we encode an exponent ,\,e|)>\\>> as \p>*\*p>>. Ben-Or and Tiwari simply chose > to be the >-th prime number. For our purposes, it is better to pick pairwise distinct small random primes p,\,p\P> with >. Using (a further extension of) Lemma, we may efficiently bulk retrieve from > for large sets of exponents . The Ben-Or\UTiwari encoding can also be used in combination with the ideas from section. The idea is to compute both ,r,q>> and ,r,q>> with \f=*x,\,p*x|)>>>. For monomials >, we have <\eqnarray*> ,r,q>|)>>||\e>>>|,r,q>|)>>||*t\e>,>>>> so we can again compute > as the quotient of ,r,q>|)>> and ,r,q>|)>>. If the total degree of is bounded by a small number , then the Ben-Or\UTiwari encoding is very compact. In that case, all exponents of indeed satisfy \P>, whence >\D*\>. However, if is a bit larger (say and ), then > might not fit into a machine integer and there is no obvious way to break the encoding > up in smaller parts that would fit into machine integers. By contrast, the encoding > from the previous subsection uses a vector of numbers that do fit into machine integers, under mild conditions. Another difference is that >\|\*V|\>*+\+\|)>> only grows linearly with instead of , and only logarithmically with. As soon as is large and not very small, this encoding should therefore be more efficient for practical purposes. We wish to thank the authors of who pointed out a mistake in our original version of Theorem: see Remark. <\bibliography|bib|tm-plain|> <\bib-list|44> M.Agrawal, N.KayalN.Saxena. PRIMES is in P. , 160(2):781\U793, 2004. A.Arnold, M.GiesbrechtD.S.Roche. Sparse interpolation over finite fields via low-order roots of unity. , 27\U34. New York, NY, USA, 2014. ACM Press. A.Arnold, M.GiesbrechtD.S.Roche. Faster sparse multivariate polynomial interpolation of straight-line programs. Symb. Comput.>, 75:4\U24, 2016. A.ArnoldD.S.Roche. Multivariate sparse interpolation using randomized Kronecker substitutions. , 35\U42. New York, NY, USA, 2014. ACM Press. M.Asadi, A.Brandt, R.MoirM.Moreno Maza. Sparse polynomial arithmetic with the BPAS library. V.Gerdt, W.Koepf, W.SeilerE.Vorozhtsov, , 11077, 32\U50. Springer, Cham, 2018. M.Ben-OrP.Tiwari. A deterministic algorithm for sparse multivariate polynomial interpolation. , 301\U309. ACM Press, 1988. L.I.Bluestein. A linear filtering approach to the computation of discrete Fourier transform. , 18(4):451\U455, 1970. A.Bostan, G.LecerfÉ.Schost. Tellegen's principle into practice. J.R.Sendra, , ISSAC '03, 37\U44. New York, NY, USA, 2003. ACM Press. R.P.BrentP.Zimmermann. . Cambridge University Press, 2010. P.Bürgisser, M.ClausenM.A.Shokrollahi. . Springer-Verlag, Berlin, 1997. J.Canny, E.KaltofenY.Lakshman. Solving systems of non-linear polynomial equations faster. , 121\U128. New York, NY, USA, 1989. ACM Press. J.Doliskani, P.Giorgi, R.LebretonÉ.Schost. Simultaneous conversions with the residue number system using linear algebra. , 44(3), 2018. T.S.Freeman, G.M.Imirzian, E.KaltofenY.Lakshman. DAGWOOD: a system for manipulating polynomials given by straight-line programs. , 14:218\U240, 1988. S.GargÉ.Schost. Interpolation of polynomials given by straight-line programs. , 410(27-29):2659\U2662, 2009. J.vonzur GathenJ.Gerhard. . Cambridge University Press, New York, 3rd, 2013. P.Giorgi, B.Grenet, A.PerretDu CrayD.S.Roche. Random primes in arithmetic progressions. , Arxiv, 2022. P.Giorgi, B.Grenet, A.Perret du CrayD.S.Roche. Sparse polynomial interpolation and division in soft-linear time. , ISSAC'22, 459\U468. New York, NY, USA, 2022. ACM Press. P.Giorgi, B.Grenet, A.Perret du CrayD.S.Roche. Fast interpolation and multiplication of unbalanced polynomials. , arXiv, 2024. . B.Grenet, J.vander HoevenG.Lecerf. Randomized root finding over finite fields using tangent Graeffe transforms. , 197\U204. New York, NY, USA, 2015. ACM Press. D.Y.Grigoriev, M.KarpinskiM.F.Singer. Fast parallel algorithms for sparse multivariate polynomial interpolation over finite fields. , 19(6):1059\U1063, 1990. A.W.GrovesD.S.Roche. Sparse polynomials in FLINT. , 50(3):105\U108, 2016. D.HarveyJ.vander Hoeven. Integer multiplication in time >. , 193(2):563\U617, 2021. J.vander Hoeven. Probably faster multiplication of sparse polynomials. , HAL, 2020. . J.vander Hoeven. . Scypress, 2020. J.vander HoevenG.Lecerf. On the bit-complexity of sparse polynomial multiplication. , 50:227\U254, 2013. J.vander HoevenG.Lecerf. Sparse polynomial interpolation in practice. , 48(3/4):187\U191, 2015. J.vander HoevenG.Lecerf. Sparse polynomial interpolation. Exploring fast heuristic algorithms over finite fields. , HAL, 2019. . J.vander HoevenG.Lecerf. Fast interpolation of sparse multivariate polynomials. , HAL, 2023. . J.vander Hoeven, G.LecerfG.Quintin. Modular SIMD arithmetic in Mathemagix. , 43(1):5\U1, 2016. J.HuM.B.Monagan. A fast parallel sparse polynomial GCD algorithm. S.A.Abramov, E.V.ZimaX.-S.Gao, , 271\U278. New York, NY, USA, 2016. ACM Press. M.A.HuangA.J.Rao. Interpolation of sparse multivariate polynomials over large finite fields with applications. , 508\U517. Philadelphia, PA, USA, 1996. Society for Industrial and Applied Mathematics. Q.-L.Huang. Sparse polynomial interpolation over fields with large or zero characteristic. , 219\U226. New York, NY, USA, 2019. ACM Press. M.JavadiM.Monagan. Parallel sparse polynomial interpolation over finite fields. M.Moreno MazaJ.-L.Roch, , 160\U168. New York, NY, USA, 2010. ACM Press. E.Kaltofen, Y.N.LakshmanJ.-M.Wiley. Modular rational sparse multivariate polynomial interpolation. , 135\U139. New York, NY, USA, 1990. ACM Press. E.KaltofenL.Yagati. Improved sparse multivariate polynomial interpolation algorithms. P.M.Gianni, , 358, 467\U474. Springer-Verlag, Berlin, Heidelberg, 1988. M.MonaganB.Tuncer. Using sparse interpolation to solve multivariate diophantine equations. , 49(3):94\U97, 2015. M.MonaganB.Tuncer. Sparse multivariate Hensel lifting: a high-performance design and implementation. J.Davenport, M.Kauers, G.LabahnJ.Urban, , 10931, 359\U368. Springer, Cham, 2018. H.MuraoT.Fujise. Modular algorithm for sparse multivariate polynomial interpolation and its parallel implementation. Symb. Comput.>, 21:377\U396, 1996. J.-L.NicolasG.Robin. Majorations explicites pour le nombre de diviseurs de . , 26:485\U492, 1983. A.Perret du Cray. : interpolation, arithmétique, test d'identité>. , Université de Montpellier (France), 2023. R.Prony. Essai expérimental et analytique sur les lois de la dilatabilité des fluides élastiques et sur celles de la force expansive de la vapeur de l'eau et de la vapeur de l'alkool, à différentes températures. de l'École Polytechnique Floréal et Plairial, an III>, 1(cahier 22):24\U76, 1795. D.S.Roche. What can (and can't) we do with sparse polynomials? C.Arreche, , ISSAC '18, 25\U30. New York, NY, USA, 2018. ACM Press. J.B.RosserL.Schoenfeld. Approximate formulas for some functions of prime numbers. , 6(1):64\U94, 1962. A.Sedunova. A partial Bombieri\UVinogradov theorem with explicit constants. , 101\U110, 2018. K.Werther. The complexity of sparse polynomial interpolation over finite fields. , 5(2):91\U103, 1994. <\initial> <\collection> <\attachments> <\collection> <\associate|bib-bibliography> <\db-entry|+58kEKjZ1NKEM4dq|book|TeXmacs:vdH:book> <|db-entry> > <\db-entry|+2WheVMn2twB3YIJ|book|GathenGerhard2013> <|db-entry> J. > <\db-entry|+2DPRky2cs1xL3Pp|article|Pro1795> <|db-entry> > de l'École Polytechnique Floréal et Plairial, an III> <\db-entry|+if2lDqI1qmMqAaW|inproceedings|BenOrTiwari1988> <|db-entry> P. > <\db-entry|+21ptCDVw16ELG4O6|inproceedings|CKL89> <|db-entry> E. Y. > <\db-entry|+aBpkOnY2VUnXTuN|article|FrImKaLa88> <|db-entry> G. M. E. Y. > <\db-entry|+qYhUkmR1lNMNv0r|article|GrigorievKarpinskiSinger1990> <|db-entry> M. M. F. > <\db-entry|+8lDHqURSvmeX09|inproceedings|HuangRao1996> <|db-entry> A. J. > <\db-entry|+2MmazzPzwkzLNWg|inproceedings|KaltofenLakshmanWiley1990> <|db-entry> Y. N. J.-M. > <\db-entry|+21ptCDVw16ELG4OA|inproceedings|KaltofenYagati1988> <|db-entry> L. > > <\db-entry|+1QNR1I4u1FLRwI5m|article|MF96> <|db-entry> T. > Symb. Comput.> <\db-entry|+qYhUkmR1lNMNvBP|article|Werther1994> <|db-entry> > <\db-entry|+tgCudvXqpzK6N3|inproceedings|Roche2018> <|db-entry> > > <\db-entry|+2NThGgQo9xz5mXG|phdthesis|PerretduCray2023> <|db-entry> > : interpolation, arithmétique, test d'identité> <\db-entry|+21ptCDVw16ELG4O3|inproceedings|ArnoldGiesbrechtRoche2014> <|db-entry> M. D. S. > <\db-entry|+tgCudvXqpzK6Mb|article|AGR16> <|db-entry> M. D. S. > Symb. Comput.> <\db-entry|+21ptCDVw16ELG4O5|inproceedings|ArnoldRoche2014> <|db-entry> D. S. > <\db-entry|+2MmazzPzwkzLNWZ|article|GargSchost2009> <|db-entry> É. > <\db-entry|+21ptCDVw16ELG4O7|inproceedings|GiorgiGrenetPerretduCray2022> <|db-entry> B. A. D. S. > '22> <\db-entry|+1QNR1I4u1FLRwI5e|inproceedings|Huang19> <|db-entry> > <\db-entry|+21ptCDVw16ELG4O1|inproceedings|ABMM18> <|db-entry> A. R. M. > W. W. E. > <\db-entry|+1QNR1I4u1FLRwI5Z|inproceedings|vdH:rmodroots> <|db-entry> J. van der G. > <\db-entry|+21ptCDVw16ELG4O8|article|GR16> <|db-entry> D. S. > <\db-entry|+1Hcl3U922Lc9q617|techreport|vdH:smul> <|db-entry> > > <\db-entry|+8lDHqURSvmeXA0|article|vdH:spinpol> <|db-entry> G. > <\db-entry|+tgCudvXqpzK6MJ|techreport|vdH:ffsparse> <|db-entry> G. > > <\db-entry|+1QNR1I4u1FLRwI5d|inproceedings|HM16> <|db-entry> M. B. > E. V. X.-S. > <\db-entry|+1QNR1I4u1FLRwI5f|inproceedings|JaMo10> <|db-entry> M. > J.-L. > <\db-entry|+qYhUkmR1lNMNv67|article|MT15> <|db-entry> B. > <\db-entry|+21ptCDVw16ELG4OB|inproceedings|MT18> <|db-entry> B. > M. G. J. > <\db-entry|+1VNBJ3wg2X8MAzEw|techreport|GiorgiGrenetPerretduCrayRoche2024> <|db-entry> B. A. D. S. > > <\db-entry|+1PsNR9OI1vmeGVuk|article|vdH:nlogn> <|db-entry> J. van der > >> <\db-entry|+qYhUkmR1lNMNuwQ|book|BZ10> <|db-entry> P. > <\db-entry|+qYhUkmR1lNMNuua|book|BCS97> <|db-entry> M. M. A. > <\db-entry|+28fKGNvA1Suxj1n7|article|RS62> <|db-entry> L. > <\db-entry|+FiSRDNK22l63TuZ|article|NR83> <|db-entry> G. > > <\db-entry|+QmuwJvh23GrcaN0|techreport|GGPR22b> <|db-entry> B. A. Perret Du D. S. > > <\db-entry|+tgCudvXqpzK6MP|article|Sedunova2018> <|db-entry> > <\db-entry|+tgCudvXqpzK6MZ|article|AKS04> <|db-entry> N. N. > <\db-entry|+21ptCDVw16ELG4O2|article|Blue70> <|db-entry> > <\db-entry|+1WH9NdJ117cAxby3|inproceedings|BoLeSc2003> <|db-entry> G. É. > R. > <\db-entry|+2DPRky2cs1xL3QZ|article|vdH:sparsemult> <|db-entry> G. > <\db-entry|+1VNBJ3wg2X8MAzEh|techreport|vdH:mvsparse-v1> <|db-entry> G. > > <\db-entry|+1c9ZcFhm1wjIEk2e|article|vdH:simd> <|db-entry> G. G. > <\db-entry|+21ptCDVw16ELG4Nw|article|DoliskaniGiorgiLebretonSchost2018> <|db-entry> P. R. É. > <\references> <\collection> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > <\auxiliary> <\collection> <\associate|bib> TeXmacs:vdH:book GathenGerhard2013 vdH:ffsparse Pro1795 BenOrTiwari1988 CKL89 FrImKaLa88 GrigorievKarpinskiSinger1990 HuangRao1996 KaltofenLakshmanWiley1990 KaltofenYagati1988 MF96 Werther1994 Roche2018 PerretduCray2023 ArnoldGiesbrechtRoche2014 AGR16 AGR16 ArnoldRoche2014 GargSchost2009 GiorgiGrenetPerretduCray2022 Huang19 ABMM18 vdH:rmodroots GR16 vdH:smul vdH:spinpol vdH:ffsparse HM16 JaMo10 MT15 MT18 GiorgiGrenetPerretduCray2022 PerretduCray2023 GiorgiGrenetPerretduCray2022 PerretduCray2023 GiorgiGrenetPerretduCrayRoche2024 ArnoldRoche2014 vdH:smul GathenGerhard2013 GathenGerhard2013 vdH:nlogn BZ10 GathenGerhard2013 BZ10 GathenGerhard2013 BCS97 BCS97 RS62 NR83 GGPR22b Sedunova2018 GGPR22b Sedunova2018 AKS04 GathenGerhard2013 GathenGerhard2013 AGR16 GiorgiGrenetPerretduCray2022 GiorgiGrenetPerretduCray2022 Blue70 GiorgiGrenetPerretduCray2022 GiorgiGrenetPerretduCray2022 BoLeSc2003 KaltofenYagati1988 vdH:sparsemult vdH:mvsparse-v1 GiorgiGrenetPerretduCrayRoche2024 Sedunova2018 GGPR22b PerretduCray2023 PerretduCray2023 vdH:sparsemult vdH:rmodroots Huang19 vdH:simd DoliskaniGiorgiLebretonSchost2018 vdH:smul vdH:smul BenOrTiwari1988 GiorgiGrenetPerretduCrayRoche2024 <\associate|toc> |math-font-series||font-shape||1.Introduction> |.>>>>|> |Notation. |.>>>>|> > |math-font-series||font-shape||2.Preliminaries> |.>>>>|> |2.1.Sparse polynomials |.>>>>|> > |2.2.Modular blackbox polynomials |.>>>>|> > |2.3.Number theoretic reminders |.>>>>|> > |2.4.Amortized determination of prime divisors in a fixed set |.>>>>|> > |math-font-series||font-shape||3.Probabilistic codes for exponents> |.>>>>|> |3.1.The exponent encoding |.>>>>|> > |3.2.Guessing individual exponents of prescribed size |.>>>>|> > |3.3.Guessing exponents of prescribed size |.>>>>|> > |math-font-series||font-shape||4.Sparse interpolation> |.>>>>|> |4.1.Cyclic modular projections |.>>>>|> > |4.2.Probability of faithfulness |.>>>>|> > |4.3.Computing probabilistic codes for the exponents |.>>>>|> > |4.4.Approximate sparse interpolation |.>>>>|> > |4.5.Sparse interpolation |.>>>>|> > |math-font-series||font-shape||5.Practical variants> |.>>>>|> |5.1.Conducting most computations in double precision |.>>>>|> > |Chinese remaindering. |.>>>>|> > |Tangent numbers. |.>>>>|> > |Small prime divisors. |.>>>>|> > |Chinese remaindering, bis. |.>>>>|> > |5.2.The mystery exponent game |.>>>>|> > |5.3.The Ben-Or\UTiwari encoding |.>>>>|> > |Acknowledgments |.>>>>|> > |math-font-series||font-shape||Bibliography> |.>>>>|>