This work has been partially supported by the ANR Gecko project.

CNRS, Département de Mathématiques (Bâtiment 425), Université Paris-Sud, 91405 Orsay Cedex, France

Abstract. Consider two polynomials $P$ and $Q$ with multiple precision floating point coefficients. Although the product $PQ$ can in principle be computed efficiently using FFT multiplication, this algorithm is numerically stable only if the coefficients of $P$ are of the same order of magnitude, and similarly for $Q$. In this paper, we present a new asymptotically fast multiplication algorithm which has a "better" numerical stability. We also provide some theoretical support for what we feel to be "better", by introducing the concept of "relative Newton error".

1. Introduction

Let $\mathbb{D} = \mathbb{D}_{n,s}$ be the set of floating point numbers with an $n$-bit mantissa and an $s$-bit exponent. Each non-zero number $x \in \mathbb{D}$ of this kind can be written as $x = \pm m\,2^e$, with

$$m \in \left\{ \tfrac{2^{n-1}}{2^n}, \ldots, \tfrac{2^n-1}{2^n} \right\}$$
$$e \in \{-2^{s-1}, \ldots, -1, 0, 1, \ldots, 2^{s-1}\}.$$

In what follows, $n$ is assumed to be fixed once and for all. For the sake of simplicity, we will not study overflows and underflows in detail, and instead make the assumption that $s$ has been chosen large enough with respect to the input data.

Consider two polynomials

$$P = P_0 + \cdots + P_{p-1} z^{p-1}$$
$$Q = Q_0 + \cdots + Q_{q-1} z^{q-1}$$

with coefficients in $\mathbb{D}$. For definiteness, we will assume $P_0 \neq 0$, $P_{p-1} \neq 0$, $Q_0 \neq 0$ and $Q_{q-1} \neq 0$. In this paper, we study the problem of multiplying $P$ and $Q$. Using FFT multiplication [CT65], the polynomials $P$ and $Q$ can in principle be multiplied in time $O(\mathsf{I}(n)\, d \log d)$, where $d = \max(p,q)$. Here $\mathsf{I}(n) = O(n \log n \log\log n)$ is the complexity of $n$-bit integer multiplication [SS71], and it can be assumed that $\mathsf{I}(n)/n$ increases with $n$. Alternatively, one may rewrite $P$ and $Q$ as "floating point polynomials", with an $s$-bit integer exponent and polynomials with $n$-bit integer coefficients as mantissas. Using Kronecker's method [Kron1882, Sch82], we may then multiply $P$ and $Q$ in time $O(\mathsf{I}(n d))$. If $p > q$, then this complexity can be improved a bit further into $O(\mathsf{I}(n q)\, p/q)$, by using $O(p/q)$ multiplications of degrees $< q$.

In the case when the coefficients of $P$ are all of the same order of magnitude, and similarly for $Q$, the above algorithms are very efficient and numerically stable. In section 2, we will give a more detailed uniform error analysis. Unfortunately, the algorithms are very unstable if the coefficients of $P$ or $Q$ are not of the same order of magnitude. For instance, taking $P = Q = z + \epsilon$ with $\epsilon = 2^{-2n}$, the transformation of $P$ and $Q$ into floating point polynomials $\tilde P = z \approx P$ and $\tilde Q = z \approx Q$ amounts to neglecting the coefficients $P_0$ and $Q_0$. In particular, when approximating $PQ \approx \tilde P \tilde Q$, only the leading coefficient $(PQ)_2$ of the product will be computed with a satisfactory accuracy.

A situation which arises often in practice is that the above fast multiplication algorithms can be made numerically stable using a suitable scaling

$$P \mapsto P_{\times\alpha}(z) = P(\alpha z)$$
$$Q \mapsto Q_{\times\alpha}(z) = Q(\alpha z).$$

For instance, in the above example, we may take $\alpha = \epsilon = 2^{-2n}$ and multiply $P$ and $Q$ using

$$PQ = (P_{\times\alpha}\, Q_{\times\alpha})_{\times\alpha^{-1}}.$$

This scaling method is particularly useful in the case when $P$ and $Q$ are truncations of formal power series. In section 3, we will briefly explain why, and we refer to [vdH:relax, section 6.2] for a more detailed analysis. However, fast multiplication with scaling only works in the case when the coefficients of $P_{\times\alpha}$ have similar orders of magnitude for a suitable $\alpha$. Geometrically speaking, this means that the roots of $P$ all lie in a "not so large" annulus around the circle of radius $\alpha$.
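The loss of the small coefficient in the example above is easy to reproduce. The following toy Python fragment (a hypothetical setup of our own, not part of the paper) represents $P = z + 2^{-2n}$ with one common exponent and $n$-bit integer mantissas, which is exactly the "floating point polynomial" representation discussed above; the constant coefficient rounds to zero, while the scaled polynomial keeps both coefficients.

```python
# Toy illustration (assumed parameters): an n-bit mantissa window attached to
# a single common exponent cannot hold coefficients that are 2n bits apart.
n = 24
P = [2.0 ** (-2 * n), 1.0]                       # P = 2^{-2n} + z
e_P = 0                                          # maximum coefficient exponent
mantissas = [round(c * 2 ** (n - e_P)) for c in P]
print(mantissas)                                 # [0, 16777216]: P_0 was lost

# After the scaling P(alpha*z) with alpha = 2^{-2n}, both coefficients share
# the exponent -2n and fit in the same mantissa window.
alpha = 2.0 ** (-2 * n)
Ps = [c * alpha ** k for k, c in enumerate(P)]   # [2^{-48}, 2^{-48}]
e_Ps = -2 * n
print([round(c * 2 ** (n - e_Ps)) for c in Ps])  # [16777216, 16777216]
```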
In practice, this condition is not always satisfied, which leaves us with the question of designing a more general multiplication algorithm which is both asymptotically fast and numerically stable. In its full generality, this problem is ill-posed: in cases when cancellation occurs, it is impossible to design an algorithm which computes all coefficients $(PQ)_k$ with a relative error bounded by $2^{-n}$. A simple example of this situation is the computation of $(PQ)_1 = 0$ for $P = z + 1$, $Q = z - 1$. Nevertheless, such cancellations can only occur "below" the "numerical Newton polygon" of $PQ$: when using the naive $O(p\,q\,\mathsf{I}(n))$ multiplication algorithm, based on the convolution sums

$$(PQ)_k = \sum_{i+j=k} P_i Q_j,$$

the coefficients $(PQ)_k$ which correspond to vertices of this Newton polygon are computed accurately, with a relative error bounded by $2^{\log d + O(1) - n}$. Numerical Newton polygons will be defined and discussed in section 4. They are the natural numeric counterpart of Newton polygons for polynomials with power series coefficients, based on the analogy between floating point numbers and (Laurent) series, which goes back to Newton himself.

A reasonable problem is therefore to compute $PQ$ with a similar kind of accuracy as the naive $O(p\,q\,\mathsf{I}(n))$ multiplication algorithm does, but faster. This also leads to the question of formalizing what can be meant by "similar kind of accuracy". In section 5, we define the notion of "relative Newton error" $\varepsilon$ for the computation of $R \approx PQ$. We regard this notion as the natural generalization of the notion of "relative error", when numbers are replaced by numerical polynomials. In section 9, we will state a few interesting properties of this concept, which provide further evidence for its usefulness.

The central part of the paper is devoted to the description of an asymptotically fast multiplication algorithm which has a good numerical stability in terms of the relative Newton error. The main ideas behind the algorithm are the following:

1. The product $PQ$ is modeled by a rectangle $0..p \times 0..q = \{0, \ldots, p-1\} \times \{0, \ldots, q-1\}$, in which each pair $(i,j)$ corresponds to the computation of $P_i Q_j$.
2. Only a subregion $\mathcal{Z} \subseteq 0..p \times 0..q$, which can be determined precisely (section 5), contributes substantially to the product $PQ$. The complement can be neglected.
3. The region $\mathcal{Z}$ can be covered by a suitable disjoint union of rectangles $i_1..i_2 \times j_1..j_2$ (see sections 6 and 7).
4. The contribution of each such rectangle $i_1..i_2 \times j_1..j_2$ to $PQ$ can be computed efficiently, using a fast multiplication algorithm with scaling (section 8).

The main result of the paper is the following:

Theorem 1. There exists an algorithm to compute a floating point approximation $R \approx PQ$ with relative Newton error $\varepsilon \leq 2^{-n}$ in time $O(\mathsf{I}(n d))$, provided that $\log d = O(n)$.

2. Multiplication of pre-conditioned polynomials

Let us describe in more detail the second classical multiplication algorithm mentioned in the introduction, when the coefficients of $P$ are of the same order of magnitude, and similarly for $Q$. We start with the determination of the maximum exponents

$$e_P = \max\{e_{P_0}, \ldots, e_{P_{p-1}}\}$$
$$e_Q = \max\{e_{Q_0}, \ldots, e_{Q_{q-1}}\}$$

and the approximation of $P\,2^{n - e_P}$ and $Q\,2^{n - e_Q}$ by integer polynomials $\check P$ and $\check Q$, whose coefficients have bit-sizes bounded by $n + 1$. At the second step, we use Kronecker's method for the multiplication of $\check P$ and $\check Q$. Taking $N \geq 2(n+2) + \log d$, the product $\check P \check Q$ can be reconstructed from the product of the integers $\check P(2^N)$ and $\check Q(2^N)$. Assuming that $\log d = O(n)$, these integers have sizes $O(n p)$ and $O(n q)$, whence $\check P$ and $\check Q$ can be multiplied in time $O(\mathsf{I}(n d))$. We finally obtain an approximation $R$ of the product $PQ$ by fitting the coefficients of $\check P \check Q\, 2^{e_P + e_Q - 2n}$ into the nearest available floating point numbers in $\mathbb{D}$.
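As an illustration of the Kronecker step in isolation, here is a minimal Python sketch. It is a simplified rendition of our own, not the paper's code: it assumes nonnegative integer coefficients (signed mantissas need a standard offset trick, which we omit), and it takes the packing width $N$ as a parameter, which must exceed the bit-size of every coefficient of the product, as in the bound $N \geq 2(n+2) + \log d$ above.

```python
# A minimal sketch of Kronecker substitution for integer polynomials with
# nonnegative coefficients: pack the coefficients into N-bit slots of one big
# integer, multiply once, and unpack the slots of the result.

def kronecker_mul(P, Q, N):
    """Multiply integer polynomials P and Q (coefficient lists, constant term
    first) through a single big-integer product."""
    pack = lambda C: sum(c << (N * k) for k, c in enumerate(C))
    prod = pack(P) * pack(Q)                 # one large integer multiplication
    mask = (1 << N) - 1
    R = []
    for _ in range(len(P) + len(Q) - 1):     # unpack the N-bit slots
        R.append(prod & mask)
        prod >>= N
    return R

# Example: (1 + 2z)(3 + z) = 3 + 7z + 2z^2
assert kronecker_mul([1, 2], [3, 1], N=16) == [3, 7, 2]
```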
Let us now analyze the errors induced by the above scheme, where we recall our assumption that no overflows or underflows occur during our computations. When computing $\check P$ and $\check Q$ by rounding to the nearest, we obtain the uniform absolute error bounds

$$\|\check P - P\,2^{n - e_P}\| \leq \tfrac12$$
$$\|\check Q - Q\,2^{n - e_Q}\| \leq \tfrac12,$$

where
$$\|P\| = \max\{|P_0|, \ldots, |P_{p-1}|\}$$
denotes the sup-norm of $P$, if we consider $P$ as a vector of coefficients. Consequently,

$$\|\check P \check Q - PQ\,2^{2n - e_P - e_Q}\| \leq \|(\check P - P\,2^{n-e_P})\,\check Q\| + \|P\,2^{n-e_P}(\check Q - Q\,2^{n-e_Q})\| \leq \tfrac{d}{2}\,\|\check Q\| + \tfrac{d}{2}\,\|P\|\,2^{n - e_P} \leq d\,2^{n}.$$

On the other hand, we also have the bounds

$$\|PQ\| \leq d\,\|P\|\,\|Q\|$$
$$\|PQ\| \geq \tfrac{1}{2d}\,\|P\|\,\|Q\|.$$

The second bound follows by considering the FFT transforms of $P$, $Q$ and $PQ$ with respect to a $2d$-th root of unity $\omega$. Indeed, the sup-norms of these transforms satisfy

$$\|\mathrm{FFT}_\omega(P)\| \leq d\,\|P\|$$
$$\|P\| = \tfrac{1}{2d}\,\|\mathrm{FFT}_{\omega^{-1}}(\mathrm{FFT}_\omega(P))\| \leq \|\mathrm{FFT}_\omega(P)\|,$$

whence
$$\|P\|\,\|Q\| \leq \|\mathrm{FFT}_\omega(P)\|\,\|\mathrm{FFT}_\omega(Q)\| = \|\mathrm{FFT}_\omega(PQ)\| \leq 2d\,\|PQ\|.$$

When $e_{PQ} \leq e_P + e_Q$, the combination of these estimates yields the (rough) bound
$$\|R - PQ\| \leq 4\,d^2\,2^{-n}\,\|PQ\|.$$
In the case when $e_{PQ} > e_P + e_Q$, the combination of the same estimates leads to the same bound. In all cases, we thus obtain a uniform relative error bound

$$\varepsilon = \frac{\|R - PQ\|}{\|PQ\|} \leq 2^{2\log d + 2 - n}. \qquad (1)$$

Remark 2. We have chosen Kronecker's method for the multiplication of integer polynomials, because it is relatively straightforward to implement on top of a fast library for integer arithmetic such as GMP [GMP]. Internally, such integer libraries rewrite numbers as polynomials, and rely on FFT multiplication of these polynomials. At the price of additional implementation work, better performance can be achieved by using a bivariate FFT. In general, this yields only a constant speed-up, but in the somewhat exotic case when $\log d \gg n$, this method still admits a complexity $O(\mathsf{I}(n d))$ instead of $O(\mathsf{I}(d \log d))$.

3. Multiplication with scaling

In the case when the polynomials $P$ and $Q$ are truncations of formal power series $f$ and $g$, the asymptotic behaviour of the coefficients is usually of the form

$$\log |P_k| = \sigma k + O(\varphi(k)), \qquad (2)$$

for some suitable slowly increasing function $\varphi(k) = o(k)$. It is natural to rescale $P$ by considering the polynomial $P_{\times\alpha}$,

$$P_{\times\alpha}(z) = P(\alpha z),$$

with $\alpha = 2^{-\sigma}$, since its coefficients $P_{\times\alpha,k} = P_k\,2^{-\sigma k}$ are comparable in magnitude:

$$\log |P_{\times\alpha,k}| = O(\varphi(k)).$$

If $Q$ has a similar asymptotic behaviour

$$\log |Q_k| = \sigma k + O(\varphi(k)), \qquad (3)$$

we may then compute $R \approx PQ$ using the formula

$$R \approx (P_{\times\alpha}\, Q_{\times\alpha})_{\times\alpha^{-1}}$$

and the algorithm from the previous section for the multiplication of $P_{\times\alpha}$ and $Q_{\times\alpha}$. Now assume that we also have an asymptotic formula

$$\log |(PQ)_k| = \sigma k + O(\varphi(k)).$$

Then it can be shown that the relative error of each coefficient $R_k$ is bounded by

$$\varepsilon_k = \frac{|R_k - (PQ)_k|}{|(PQ)_k|} = 2^{O(\varphi(k)) + \log d - n} \qquad (4)$$

for all sufficiently large $k$. In other words, the multiplication method with scaling amounts to a loss of $O(\varphi(k)) + \log d$ bits of precision. Under the assumption that $n \geq C\,\varphi(d) + 2\log d$ for a sufficiently large constant $C$, we are thus left with at least $n/2$ correct leading bits in the mantissa of each coefficient. The asymptotic behaviour of $\varphi(k)$ is closely related to the dominant singularities of $f$ and $g$, and for many interesting applications in combinatorics or physics, one has $\varphi(k) = O(\log k)$. For more details, and the application of this technique to the relaxed computation of formal power series, we refer to [vdH:relax, section 6.2].
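As a concrete reading of the scaling formula above, here is a small Python wrapper, a sketch under assumed conventions (coefficient lists with constant term first, IEEE doubles standing in for $\mathbb{D}$). It rescales by $\alpha = 2^{-\sigma}$, calls any multiplication routine `mul` that is accurate for coefficients of comparable magnitude, and scales back; in a real implementation the powers $2^{\sigma k}$ would of course be handled through exponent arithmetic rather than double precision floats.

```python
def scaled_mul(P, Q, mul, sigma):
    """Multiply P and Q through the scaling P(z) -> P(2^-sigma z):
    rescale both inputs, multiply with `mul`, and scale the result back."""
    alpha = 2.0 ** -sigma
    Ps = [c * alpha ** k for k, c in enumerate(P)]
    Qs = [c * alpha ** k for k, c in enumerate(Q)]
    return [c * 2.0 ** (sigma * k) for k, c in enumerate(mul(Ps, Qs))]
```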
It sometimes happens that only "limsup versions"

$$\max_{k' \leq k} \log |P_{k'}| = \sigma k + O(\varphi(k))$$
$$\max_{k' \leq k} \log |Q_{k'}| = \sigma k + O(\varphi(k))$$

of (2) and (3) are verified. In that case, the individual relative error estimates (4) cease to hold, but we still have the scaled uniform bound

$$\varepsilon_{\times\alpha} = \frac{\|R_{\times\alpha} - (PQ)_{\times\alpha}\|}{\|(PQ)_{\times\alpha}\|} \leq 2^{2\log d + 2 - n}, \qquad (5)$$

which is a direct consequence of the corresponding bound (1) in the previous section.

It also frequently occurs that $P$ does not have the same radius of convergence as $Q$, in which case we rather have an asymptotic form

$$\log |Q_k| = \tau k + O(\varphi(k)).$$

Without loss of generality, we may assume that $\tau < \sigma$. When considering the convolution product

$$(PQ)_k = \sum_{i+j=k} P_i Q_j,$$

we observe that the summands $P_i Q_j$ approximately form a geometric progression. In first approximation, only the last $n/(\sigma - \tau)$ terms therefore contribute to the $n$ leading bits of the result (see figure 1). Setting $w = \lceil n/(\sigma - \tau) \rceil$, the computation of $PQ$ can thus be reduced to the computation of the products

$$P\,Q_{0..w},\; P_{p-2w..p}\,Q_{w..2w},\; P_{p-2w..p}\,Q_{2w..3w},\; \ldots,\; P_{p-2w..p}\,Q_{\lfloor q/w \rfloor w..q},$$

using the notation

$$i..j = \{i, \ldots, j-1\}$$
$$P_{i..j} = P_i z^i + \cdots + P_{j-1} z^{j-1}.$$

When computing each of these products using a suitable scaling $P \mapsto P_{\times\alpha}$ and a doubled precision $2n$, the individual coefficients $R_k$ of $R \approx PQ$ are again obtained with a relative precision of the form (4). The total cost of this algorithm is $O(\mathsf{I}(n w)\,d/w)$, which reduces to $O(\mathsf{I}(n)\,d)$ for large $d$ and fixed $n$. A generalization of this idea will be analyzed in more detail in section 8.

[Figure 1. Illustration of the computation of $PQ$ for two polynomials $P$ and $Q$ whose coefficients satisfy $\log|P_k| = \sigma k + o(k)$ and $\log|Q_k| \approx \tau k + o(k)$ with $\tau < \sigma$. Each square $(i,j)$ corresponds to the contribution of the product $P_i Q_j$ to $(PQ)_{i+j}$. Setting $w = n/(\sigma - \tau)$, only the terms $P_i Q_j$ in the shaded area contribute to the $n$ leading bits of $(PQ)_{i+j}$, in first approximation.]

4. Numeric Newton polygons

Roughly speaking, the algorithms from sections 2 and 3 are numerically stable when the roots of $P$ and $Q$ lie in an annulus around the unit circle, resp. the circle of radius $\alpha$. When this is no longer the case, one may still try to apply the algorithm from section 3 for partial products $P_{i_1..i_2} Q_{j_1..j_2}$, using scales $\alpha$ which vary as a function of $i_1..i_2$ and $j_1..j_2$. A central concept for doing this is the numeric Newton polygon $N_P$ of $P$, which is the convex hull of the half-lines $\{i\} \times (\log|P_i| - \mathbb{R}^{\geq})$ for all $0 \leq i < p$, with the convention that $\log|0| = -\infty$ (see figure 2 for two examples). The numeric Newton polygon $N_P$ is characterized by its exponent polynomial

$$E = E_P = E_0 + \cdots + E_{p-1} z^{p-1}$$

with coefficients

$$E_i = \max\{y : (i, y) \in N_P\}.$$

Equivalently, we may characterize $N_P$ by the first exponent $E_0$ and the slopes $\rho_i = E_i - E_{i-1}$ with $0 < i < p$. Notice that $\rho_1 \geq \cdots \geq \rho_{p-1}$. It is easy to compute the vertices $(i_1, y_1), \ldots, (i_l, y_l)$, $i_1 < \cdots < i_l$, of the Newton polygon of $P_{0..k}$ by induction over $k$: at each step, we add the pair $(k, \log|P_k|)$ at the right of the sequence, and repeatedly remove the before-last pair $(i_{l-1}, y_{l-1})$ as long as it lies below the edge from $(i_{l-2}, y_{l-2})$ to $(i_l, y_l)$. Using this method, the vertices of $N_P$, as well as the numbers $E_i$ and $\rho_i$, can be computed in linear time.
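The incremental construction just described is the standard linear-time hull scan. Here is a Python sketch under assumed conventions (coefficient lists indexed from 0, base-2 logarithms, $P_0 \neq 0$ and $P_{p-1} \neq 0$ as above; the helper names are ours):

```python
import math

def newton_polygon(P):
    """Vertices [(i, y), ...] of the numeric Newton polygon of P, i.e. the
    concave majorant of the points (i, log2 |P_i|)."""
    pts = [(i, math.log2(abs(c)) if c else -math.inf) for i, c in enumerate(P)]
    hull = []
    for i, y in pts:
        # Drop the before-last vertex while it lies (weakly) below the edge
        # joining its neighbours; the cross-product test avoids divisions.
        while len(hull) >= 2:
            (i1, y1), (i2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (i - i1) <= (y - y1) * (i2 - i1):
                hull.pop()
            else:
                break
        hull.append((i, y))
    return hull

def exponents(P):
    """Exponent polynomial E_0, ..., E_{p-1}: linear interpolation between
    consecutive vertices of the Newton polygon."""
    hull, E = newton_polygon(P), []
    for (i1, y1), (i2, y2) in zip(hull, hull[1:]):
        E += [y1 + (y2 - y1) * (i - i1) / (i2 - i1) for i in range(i1, i2)]
    E.append(hull[-1][1])
    return E
```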
The scaling operation $P \mapsto P_{\times\alpha}$ has a simple impact on exponent polynomials:

$$E_{P_{\times\alpha}} = E_P^{+\log\alpha}, \qquad E^{+\beta}(z) := \sum_{0 \leq i < p} (E_i + i\beta)\,z^i.$$

The norms of $P$ and $P_{\times\alpha}$ and their exponent polynomials are related by

$$\log \|P\| = \|E_P\|$$
$$\log \|P_{\times\alpha}\| = \|E_P^{+\log\alpha}\|,$$

since the maximum $y$-value of a convex hull of a set of pairs coincides with the maximum $y$-value of the points themselves.

[Figure 2. Illustrations of the numeric Newton polygons of $P = 2 + 8z - 8z^2 + \cdots$ and $Q = 2 - z - 2z^2 + \cdots$, and of the Minkowski sum $N_P + N_Q$ of these polygons.]

Recall that the Minkowski sum of the convex sets $N_1$ and $N_2$ is defined by

$$N_1 + N_2 = \{(i_1 + i_2,\, j_1 + j_2) : (i_1, j_1) \in N_1,\, (i_2, j_2) \in N_2\}.$$

An example of a Minkowski sum is shown in figure 2. It is convenient to regard $E_P$ and $E_Q$ as "max-plus polynomials" over the semi-algebra $\mathbb{R} \cup \{-\infty\}$, in which multiplication and addition are replaced by addition and the operator $\max$. These polynomials are convex, in the sense that their sequences of coefficients are convex. The exponent polynomial of $N_P + N_Q$ may then be reinterpreted as a "product" of max-plus polynomials:

$$E_{N_P + N_Q} = E_P \odot E_Q$$
$$(E_P \odot E_Q)_k = \max_{i+j=k} (E_{P,i} + E_{Q,j}).$$

Clearly, $(E_P \odot E_Q)_0 = E_{P,0} + E_{Q,0}$, and it is easily shown that the slopes of $N_P + N_Q$ are obtained by merging the ordered lists $\rho_{P,1} \geq \cdots \geq \rho_{P,p-1}$ and $\rho_{Q,1} \geq \cdots \geq \rho_{Q,q-1}$. This provides us with a linear-time algorithm for the computation of $E_P \odot E_Q$ as a function of $E_P$ and $E_Q$ (see the sketch below). The crucial property of $N_P + N_Q$ is that it approximates $N_{PQ}$ well.

Proposition 3. We have $\|E_{PQ} - E_P \odot E_Q\| \leq \log d + 1$.

Proof. Let $\alpha \in \mathbb{R}^{>}$. Applying the norm bounds from section 2 to $P_{\times\alpha}$ and $Q_{\times\alpha}$, we have

$$\bigl|\log\|(PQ)_{\times\alpha}\| - \log\|P_{\times\alpha}\| - \log\|Q_{\times\alpha}\|\bigr| \leq \log d + 1.$$

Using the above relations between norms and exponent polynomials, together with the fact that $\|E_P \odot E_Q\| = \|E_P\| + \|E_Q\|$, we obtain

$$\bigl|\,\|E_{PQ}^{+\log\alpha}\| - \|(E_P \odot E_Q)^{+\log\alpha}\|\,\bigr| \leq \log d + 1.$$

Given arbitrary convex max-plus polynomials $E$ and $F$ with $\deg E = \deg F$, it therefore suffices to show that

$$\|E - F\| \leq m := \max_{\beta \in \mathbb{R}} \bigl|\,\|E^{+\beta}\| - \|F^{+\beta}\|\,\bigr|.$$

Indeed, for $0 \leq i \leq \deg E$, let $\beta$ be such that $\|E^{+\beta}\| = E_i + i\beta$. Then

$$F_i \leq \|F^{+\beta}\| - i\beta \leq \|E^{+\beta}\| + m - i\beta = E_i + m.$$

The relation $E_i \leq F_i + m$ is proved similarly. □

5. Relative Newton error and truncation

It is classical that the numeric properties of a polynomial can be greatly affected by slightly modifying one of its coefficients. For instance, if $P = 2^{-n} z - 1$, then its root can move from $2^n$ to $\infty$, when modifying its coefficient $P_1$ by an amount as small as $2^{-n}\,\|P\|$. Nevertheless, as we will detail in section 9, one may expect that many of the numeric properties of $P$ do not sensibly change if we modify a coefficient $P_i$ by less than $2^{E_i - n}$.
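Before turning to truncation, here is the linear-time slope-merging product promised in section 4, as a Python sketch. It assumes finite exponents (no zero coefficients) and concave coefficient sequences, such as those produced by the hypothetical `exponents` helper above.

```python
from heapq import merge

def maxplus_mul(EP, EQ):
    """Exponent polynomial of the Minkowski sum N_P + N_Q: merge the two
    decreasing slope lists and prepend E_P[0] + E_Q[0]."""
    slopes_P = [EP[i] - EP[i - 1] for i in range(1, len(EP))]
    slopes_Q = [EQ[i] - EQ[i - 1] for i in range(1, len(EQ))]
    E = [EP[0] + EQ[0]]
    for s in merge(slopes_P, slopes_Q, reverse=True):  # merge sorted lists
        E.append(E[-1] + s)
    return E

# Example: EP = [0, 1], EQ = [0, 2]  ->  E = [0, 2, 3],
# since (E_P ⊙ E_Q)_1 = max(0 + 2, 1 + 0) = 2 and (E_P ⊙ E_Q)_2 = 1 + 2 = 3.
```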
If the coefficients $P_i$ of $P$ are known with absolute errors $\delta_i$, this motivates us to introduce the relative Newton error of $P$ by

$$\varepsilon = \max\left\{\frac{\delta_0}{2^{E_0}}, \ldots, \frac{\delta_{p-1}}{2^{E_{p-1}}}\right\}.$$

For instance, if $R$ is an approximation of the product $PQ$, with $E = E_{PQ}$, then its relative Newton error would typically be given by

$$\varepsilon = \max\left\{\frac{|R_0 - (PQ)_0|}{2^{E_0}}, \ldots, \frac{|R_{p+q-2} - (PQ)_{p+q-2}|}{2^{E_{p+q-2}}}\right\}.$$

Alternatively, we may regard $\varepsilon$ as the minimal uniform error of the approximations $R_{\times\alpha} \approx P_{\times\alpha} Q_{\times\alpha}$ for all possible scales $\alpha \in \mathbb{R}^{>}$:

$$\varepsilon = \min_{\alpha \in \mathbb{R}^{>}} \varepsilon_{\times\alpha}.$$

In this section and the subsequent sections, we propose an efficient algorithm for computing an approximation $R \approx PQ$, which is numerically stable in the sense that $\varepsilon$ is small.

The first main idea behind Newton multiplication is that only part of the products $P_i Q_j$ substantially contribute to $PQ$: the pair $(i,j)$ and the corresponding product $P_i Q_j$ are said to be negligible if

$$E_{P,i} + E_{Q,j} < (E_P \odot E_Q)_{i+j} - n.$$

Indeed, when neglecting the sum of all such products in the computation of $PQ$, the corresponding increase of the relative Newton error is bounded by $2^{2\log d + 1 - n}$. This follows from proposition 3 and the fact that there are at most $d$ pairs $(i,j)$ which contribute to a fixed coefficient $(PQ)_k$. The left-hand picture in figure 3 illustrates a typical zone of non-negligible contributions $P_i Q_j$. The lower and upper limits $j^-(i)$ and $j^+(i)$ of this zone are increasing functions in $i$, and can therefore be determined in linear time as a function of $E_P$ and $E_Q$.

A set of pairs $(i,j)$ is said to be negligible if each of its elements is negligible. A partial product $P_{i_1..i_2} Q_{j_1..j_2}$ is said to be admissible if $i_1..i_2 \times j_1..j_2$ is non-negligible and if there exists no proper subrectangle $\tilde i_1..\tilde i_2 \times \tilde j_1..\tilde j_2$ of $i_1..i_2 \times j_1..j_2$ such that $(i_1..i_2 \times j_1..j_2) \setminus (\tilde i_1..\tilde i_2 \times \tilde j_1..\tilde j_2)$ is negligible. Clearly, any non-negligible partial product $P_{i_1..i_2} Q_{j_1..j_2}$ gives rise to a unique admissible partial product $P_{\tilde i_1..\tilde i_2} Q_{\tilde j_1..\tilde j_2}$, by taking $\tilde i_1..\tilde i_2 \times \tilde j_1..\tilde j_2$ to be the smallest subrectangle whose complement in $i_1..i_2 \times j_1..j_2$ is negligible. Any subdivision

$$0..p \times 0..q = \coprod_k i_{1,k}..i_{2,k} \times j_{1,k}..j_{2,k} \qquad (6)$$

therefore gives rise to a new disjoint union

$$\coprod_k \tilde i_{1,k}..\tilde i_{2,k} \times \tilde j_{1,k}..\tilde j_{2,k} \qquad (7)$$

of admissible subrectangles (the negligible rectangles in (6) being discarded), whose complement in $0..p \times 0..q$ is negligible. The second disjoint union will be called the admissible part of the first one. The right-hand side of figure 3 illustrates the admissible part of a so-called product subdivision (see section 7 below).
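To make the negligibility test concrete, here is a small Python sketch reusing the hypothetical `maxplus_mul` helper from the previous section. For clarity it scans all pairs, which is quadratic; the monotonicity of $j^-$ and $j^+$ allows a linear-time two-pointer scan instead.

```python
def nonnegligible_limits(EP, EQ, n):
    """For each i, the window [jlo, jhi) of indices j such that (i, j) is
    non-negligible, i.e. E_P[i] + E_Q[j] >= (E_P (.) E_Q)[i+j] - n."""
    E = maxplus_mul(EP, EQ)          # exponent polynomial of N_P + N_Q
    q, lims = len(EQ), []
    for i in range(len(EP)):
        js = [j for j in range(q) if EP[i] + EQ[j] >= E[i + j] - n]
        lims.append((js[0], js[-1] + 1) if js else (q, q))  # empty window
    return lims
```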
[Figure 3. On the left-hand side, we display the non-negligible contributions $P_i Q_j$ to a typical product $PQ$. The dark squares correspond to products $P_i Q_j$ for which $E_{P,i} + E_{Q,j} = (E_P \odot E_Q)_{i+j}$, and the light squares to products for which we only have $E_{P,i} + E_{Q,j} \geq (E_P \odot E_Q)_{i+j} - n$. On the right-hand side, we show the admissible part of a product subdivision of $0..p \times 0..q$, formed by the thin black rectangles.]

Our next task is to find a suitable subdivision (6) with admissible part (7), such that $\sum_k P_{\tilde i_{1,k}..\tilde i_{2,k}} Q_{\tilde j_{1,k}..\tilde j_{2,k}}$ can be computed both efficiently and in a numerically stable way. The second main idea behind Newton multiplication is to search for a subdivision such that the products $P_{\tilde i_{1,k}..\tilde i_{2,k}} Q_{\tilde j_{1,k}..\tilde j_{2,k}}$ can be performed using suitable scalings, and such that the rectangles $\tilde i_{1,k}..\tilde i_{2,k} \times \tilde j_{1,k}..\tilde j_{2,k}$ are yet reasonably large.

6. Fitting the Newton polygon using fewer edges

As a first step in this construction, we will approximate the Newton polygons of $P$ and $Q$ by simpler ones, merging those edges whose slopes differ only by a small amount. Given a slope $\rho \in \mathbb{R}$ and a convex max-plus polynomial $E$, we define the $\rho$-discrepancy $h^\rho = h_E^\rho$ of $E$ by

$$h^\rho = \max_i\,(E_i + i\rho) - \min_i\,(E_i + i\rho).$$

Then the individual exponents $E^{+\rho}_i$ of $E^{+\rho}$ lie in a strip of height $h^\rho$:

$$\|E^{+\rho}\| - h^\rho \leq E^{+\rho}_i \leq \|E^{+\rho}\|. \qquad (8)$$

In the particular case when $\rho$ equals the opposite of the main slope of $E$,

$$\rho = \frac{E_0 - E_{\deg E}}{\deg E},$$

we call $h^\rho$ the discrepancy of $E$, and also denote it by $h_E$. For any $\rho \in \mathbb{R}$ and any second convex max-plus polynomial $F$ with $\deg F = \deg E$, the convexity of $E$ and $F$ implies

$$h_E \leq h_E^\rho \qquad (9)$$
$$h_{E+F} \geq \max(h_E, h_F). \qquad (10)$$

More generally, one defines the $\rho$-discrepancy $h^\rho_{E, i_1..i_2}$ of $E$ on a range $i_1..i_2 \subseteq 0..(\deg E + 1)$ by $h^\rho_{E, i_1..i_2} = h^\rho_{z^{-i_1} E_{i_1..i_2}}$, and similarly $h_{E, i_1..i_2} = h_{z^{-i_1} E_{i_1..i_2}}$. We have

$$h^\rho_{E, i_1..i_3} \leq h^\rho_{E, i_1..i_2} + h^\rho_{E, i_2..i_3} \qquad (11)$$

for any subdivision of the range $i_1..i_3$ into $i_1..i_2$ and $i_2..i_3$.

Let us now return to the approximation of the Newton polygon $N_P$ by a new one with fewer edges. Let $\lambda > 0$ be an arbitrary but fixed constant, such as $\lambda = 1$, and denote $\eta = \lceil \lambda n \rceil$. We now construct the sequence $0 = i_0 < \cdots < i_l = p$, by taking each $i_{r+1}$ maximal such that $h_{E_P, i_r..i_{r+1}} \leq \eta$ (see figure 4 below). Using the convexity of $E_P$, it is not hard to check that this construction can be done in linear time, using a single traversal of $0..p$.
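A direct transcription of this greedy construction follows (helper names are ours, not the paper's). The simple version below recomputes the discrepancy from scratch and is therefore quadratic; the single linear traversal mentioned above exploits the convexity of $E_P$.

```python
def seg_disc(E, a, b):
    """Discrepancy of E on the segment a..b (indices a..b-1): height of the
    strip, parallel to the chord through the segment's endpoints."""
    if b - a < 2:
        return 0.0
    rho = (E[b - 1] - E[a]) / (b - 1 - a)
    dev = [E[i] - (E[a] + rho * (i - a)) for i in range(a, b)]
    return max(dev) - min(dev)

def greedy_segments(E, eta):
    """Cut points 0 = i_0 < ... < i_l = p of maximal segments i_r..i_{r+1}
    with discrepancy <= eta."""
    cuts, a = [0], 0
    while a < len(E):
        b = a + 1
        while b < len(E) and seg_disc(E, a, b + 1) <= eta:
            b += 1
        cuts.append(b)
        a = b
    return cuts
```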
The edges of $N_P$ between $i_r$ and $i_{r+1} - 1$ are well approximated by a single edge from $(i_r, E_{i_r})$ to $(i_{r+1}-1, E_{i_{r+1}-1})$. A similar sequence $0 = j_0 < \cdots < j_m = q$ is constructed for $Q$.

[Figure 4. Illustration of the sequence $0 = i_0 < \cdots < i_l = p$ for a given Newton polygon and a maximal discrepancy $\eta = 1$. The border of the Newton polygon corresponds to the thick black line. The parts corresponding to the segments $i_0..i_1$ and $i_1..i_2$ fit into the indicated strips of height at most $\eta$. The last part, which corresponds to the segment $i_2..i_3$, is a straight segment, which fits into a strip of height 0.]

7. The product and Newton subdivisions

A first candidate for the subdivision (6) is the product subdivision:

$$0..p \times 0..q = \coprod_{r,s} i_r..i_{r+1} \times j_s..j_{s+1}, \qquad (12)$$

where the $i_r$ and $j_s$ are as in the previous section. Figure 3 shows an example of a product subdivision.

Lemma 4. Given $k$, there exist at most $2(1/\lambda + 2)$ rectangles in the product subdivision, for which the admissible part contains a pair $(i,j)$ with $i + j = k$.

Proof. Assume the contrary, and let $i_r..i_{r+1} \times j_s..j_{s+1}$ and $i_{r'-1}..i_{r'} \times j_{s'-1}..j_{s'}$ be the top-left and bottom-right rectangles in the product subdivision, for which the admissible parts contribute to $(PQ)_k$. We have $r' - r > 1/\lambda + 2$ or $s' - s > 1/\lambda + 2$. By symmetry, we may assume $r' - r > 1/\lambda + 2$. Now consider the convex max-plus polynomials

$$E = E_{P, i_{r+1}} + E_{P, i_{r+1}+1}\,z + \cdots + E_{P, i_{r'-1}-1}\,z^{i_{r'-1}-1-i_{r+1}}$$
$$F = E_{Q, k-i_{r+1}} + E_{Q, k-i_{r+1}-1}\,z + \cdots + E_{Q, k-i_{r'-1}+1}\,z^{i_{r'-1}-1-i_{r+1}}.$$

Using the maximality of the segments $i_t..i_{t+1}$, we have $h_E = h_{E_P, i_{r+1}..i_{r'-1}} \geq (1/\lambda)\,\eta \geq n$. From (10), we thus get

$$h_{E+F} \geq h_E \geq n.$$

Now $\|E + F\|$ coincides with $(E_P \odot E_Q)_k$, whence there exists an index $i_{r+1} \leq i \leq i_{r'-1} - 1$ with

$$E_{P,i} + E_{Q,k-i} = (E + F)_{i - i_{r+1}} \leq \|E + F\| - h_{E+F} \leq (E_P \odot E_Q)_k - n,$$

which means that the pair $(i, k-i)$ is negligible. But this is only possible if either $(i_{r+1}, k - i_{r+1})$ or $(i_{r'-1}-1, k - i_{r'-1}+1)$ is also negligible. This, in its turn, implies that either the rectangle $i_r..i_{r+1} \times j_s..j_{s+1}$ or the rectangle $i_{r'-1}..i_{r'} \times j_{s'-1}..j_{s'}$ is negligible, in contradiction with our hypothesis. □

Corollary 5. There are at most $O(d)$ non-negligible rectangles in the product subdivision, and they can be computed in linear time.

Proof. The first assertion follows from the facts that each diagonal $\Delta_k = \{(i,j) : i+j = k\}$ with $k \in 0..(p+q)$ intersects at most $2(1/\lambda + 2)$ rectangles and that there are less than $2d$ such diagonals. The rectangles can be computed in time $O(d)$, by following the lower and upper limits $j^-$ and $j^+$ of the zone of non-negligible pairs. □

The subdivision we are really after refines (12). More precisely, consider a non-negligible rectangle $i_1..i_2 \times j_1..j_2$ of the form $i_r..i_{r+1} \times j_s..j_{s+1}$. Let

$$\rho = \rho_{i_1..i_2} = \frac{E_{i_2-1} - E_{i_1}}{i_2 - 1 - i_1}$$

be the main slope of $E_P$ on $i_1..i_2$. Swapping $P$ and $Q$ if necessary, we may assume without loss of generality that $i_2 - i_1 \geq j_2 - j_1$. We now construct the sequence $j_1 = \hat\jmath_0 < \cdots < \hat\jmath_{\hat m} = j_2$, by taking each $\hat\jmath_{t+1}$ maximal such that $h^{-\rho}_{E_Q, \hat\jmath_t..\hat\jmath_{t+1}} \leq \eta$, and subdivide

$$i_1..i_2 \times j_1..j_2 = \coprod_t i_1..i_2 \times \hat\jmath_t..\hat\jmath_{t+1}. \qquad (13)$$

We may re-interpret figure 4 as an illustration of the non-negligible pairs inside the rectangle $i_1..i_2 \times j_1..j_2$, together with the subdivision (13) and its corresponding admissible part. Performing the replacement (13) for all non-negligible rectangles $i_r..i_{r+1} \times j_s..j_{s+1}$ in (12), while leaving the negligible rectangles untouched, we obtain a finer subdivision of $0..p \times 0..q$, which we call the Newton subdivision.
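The refinement step can be sketched as follows in Python (assumed helpers from the previous fragments; again a quadratic transcription for clarity):

```python
def rho_disc(E, a, b, rho):
    """rho-discrepancy of E on a..b: strip height after removing slope rho."""
    vals = [E[i] - rho * i for i in range(a, b)]
    return max(vals) - min(vals)

def newton_subdivision(EP, EQ, eta, rects):
    """Refine each product-subdivision rectangle (i1, i2, j1, j2), assumed
    non-negligible, into slices i1..i2 x jt..jt+1 of rho-discrepancy <= eta,
    where rho is the main slope of E_P on i1..i2."""
    out = []
    for i1, i2, j1, j2 in rects:
        rho = (EP[i2 - 1] - EP[i1]) / max(i2 - 1 - i1, 1)
        a = j1
        while a < j2:
            b = a + 1
            while b < j2 and rho_disc(EQ, a, b + 1, rho) <= eta:
                b += 1
            out.append((i1, i2, a, b))
            a = b
    return out
```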
Lemma 6. Given $k$, there exist at most $2(1/\lambda + 2)(1/\lambda + 3)$ rectangles in the Newton subdivision, for which the admissible part contains a pair $(i,j)$ with $i + j = k$.

Proof. Consider a non-negligible rectangle $i_1..i_2 \times j_1..j_2 = i_r..i_{r+1} \times j_s..j_{s+1}$ in the product subdivision. In view of lemma 4, it suffices to show that there exist at most $1/\lambda + 3$ rectangles in the admissible part of its subdivision (13) which contribute to $(PQ)_k$. Now assume that $i_1..i_2 \times \hat\jmath_t..\hat\jmath_{t+1}$ and $i_1..i_2 \times \hat\jmath_{t'}..\hat\jmath_{t'+1}$ satisfy $t' - t > 1/\lambda + 3$, and assume that the pairs $(\hat\imath, \hat\jmath)$ and $(\hat\imath', \hat\jmath')$, with $\hat\imath, \hat\imath' \in i_1..i_2$, $\hat\jmath \in \hat\jmath_t..\hat\jmath_{t+1}$ and $\hat\jmath' \in \hat\jmath_{t'}..\hat\jmath_{t'+1}$, satisfy $\hat\imath + \hat\jmath = \hat\imath' + \hat\jmath' = k$. Setting $\alpha = 2^{-\rho}$, we then have

$$E^{+\log\alpha}_{P, \hat\imath'} \leq E^{+\log\alpha}_{P, \hat\imath} + \eta$$
$$E^{+\log\alpha}_{Q, \hat\jmath'} \leq E^{+\log\alpha}_{Q, \hat\jmath} - (1/\lambda + 1)\,\eta \leq E^{+\log\alpha}_{Q, \hat\jmath} - n - \eta,$$

whence

$$E_{P,\hat\imath'} + E_{Q,\hat\jmath'} \leq E_{P,\hat\imath} + E_{Q,\hat\jmath} - n \leq (E_P \odot E_Q)_k - n.$$

This shows that the pair $(\hat\imath', \hat\jmath')$ is negligible. We conclude that the diagonal $\Delta_k$ intersects at most $1/\lambda + 3$ admissible rectangles. □

Corollary 7. There are at most $O(d)$ non-negligible rectangles in the Newton subdivision, and they can be computed in linear time.

8. Newton multiplication

We can now describe our algorithm for computing the product $R \approx PQ$. We start with the computation of the admissible part of the Newton subdivision. For each rectangle $i_1..i_2 \times j_1..j_2$ in this admissible part, we compute its contribution to $R$ as follows. By construction, we have $h^{-\rho}_{E_P, i_1..i_2} \leq \eta$ and $h^{-\rho}_{E_Q, j_1..j_2} \leq \eta$, for $\rho = \rho_{i_1..i_2}$ or $\rho = \rho_{j_1..j_2}$. Setting $\alpha = 2^{-\rho}$, we compute the contribution $R_{i_1..i_2 \times j_1..j_2}$ of $P_{i_1..i_2} Q_{j_1..j_2}$ to $R$ using the multiplication method with scaling from section 3,

$$R_{i_1..i_2 \times j_1..j_2} \approx (P_{i_1..i_2, \times\alpha}\, Q_{j_1..j_2, \times\alpha})_{\times\alpha^{-1}}, \qquad (14)$$

using a precision of $n + 2\eta$ bits instead of $n$ bits. The final result $R \approx PQ$ is obtained by summing all partial products $R_{i_1..i_2 \times j_1..j_2}$ using straightforward floating point arithmetic. This algorithm leads to the following refinement of theorem 1.

Theorem 8. There exists an algorithm of time complexity $O(\mathsf{I}(n d))$, which computes $R \approx PQ$ with relative Newton error

$$\varepsilon \leq 2^{O(\log d) - n}. \qquad (15)$$

Proof. We claim that the computation of the product (14) using a precision of $n + 2\eta$ bits contributes to the relative Newton error of the global product by an amount which is bounded by $2^{O(\log d) - n}$. This clearly implies (15), since for a fixed coefficient $R_k$ in the product, at most $2(1/\lambda+2)(1/\lambda+3) = O(1)$ partial products of the form $R_{i_1..i_2 \times j_1..j_2}$ contribute to $R_k$. Setting $\bar P = P_{i_1..i_2}$, $\bar Q = Q_{j_1..j_2}$ and $\bar R = R_{i_1..i_2 \times j_1..j_2}$, the bound (5) yields

$$\frac{\|\bar R_{\times\alpha} - (\bar P \bar Q)_{\times\alpha}\|}{\|(\bar P \bar Q)_{\times\alpha}\|} \leq 2^{2\log d + O(1) - n - 2\eta}.$$

For any $i_1 + j_1 \leq k < i_2 + j_2$, the bound (8), the construction of the Newton subdivision and proposition 3 also imply, for a suitable $(i,j) \in i_1..i_2 \times j_1..j_2$ with $i + j = k$,

$$\log \|(\bar P \bar Q)_{\times\alpha}\| - \log d \leq \log\|\bar P_{\times\alpha}\| + \log\|\bar Q_{\times\alpha}\| \leq \|E^{+\log\alpha}_{\bar P}\| + \|E^{+\log\alpha}_{\bar Q}\| \leq E^{+\log\alpha}_{P,i} + E^{+\log\alpha}_{Q,j} + 2\eta \leq (E_P \odot E_Q)^{+\log\alpha}_k + 2\eta \leq E^{+\log\alpha}_{PQ,k} + 2\eta + \log d + 1.$$

Combining both estimates, we obtain

$$\frac{|\bar R_k - (\bar P \bar Q)_k|}{2^{E_{PQ,k}}} = \frac{|\bar R_{\times\alpha,k} - (\bar P \bar Q)_{\times\alpha,k}|}{2^{E^{+\log\alpha}_{PQ,k}}} \leq 2^{2\log d + O(1) - n - 2\eta}\,\frac{\|(\bar P \bar Q)_{\times\alpha}\|}{2^{E^{+\log\alpha}_{PQ,k}}} \leq 2^{4\log d + O(1) - n}.$$

This completes the proof of our claim. The proof of the complexity bound relies on lemma 6. Let

$$\coprod_k \tilde i_{1,k}..\tilde i_{2,k} \times \tilde j_{1,k}..\tilde j_{2,k}$$

be the admissible part of the Newton subdivision, and let $K = 2(1/\lambda+2)(1/\lambda+3)$. Consider the total length

$$L = \sum_k (\tilde i_{2,k} - \tilde i_{1,k}) + (\tilde j_{2,k} - \tilde j_{1,k})$$

of the left borders and the lower borders of all rectangles $\tilde i_{1,k}..\tilde i_{2,k} \times \tilde j_{1,k}..\tilde j_{2,k}$. Since every diagonal hits the union of these borders in at most $2K$ points, we have $L \leq 4 d K$. The total cost of all multiplications $P_{\tilde i_{1,k}..\tilde i_{2,k}} Q_{\tilde j_{1,k}..\tilde j_{2,k}}$ is therefore bounded by

$$\sum_k O(\mathsf{I}((\tilde i_{2,k} - \tilde i_{1,k} + \tilde j_{2,k} - \tilde j_{1,k})\,n)) \leq O(\mathsf{I}(L n)) \leq O(\mathsf{I}(d n)).$$

This completes the proof of our theorem. □

Remark 9. Theorem 1 follows from theorem 8, when applying it for the increased precision $n + O(\log d)$.

9. Properties of the relative Newton error

Now that we have an efficient multiplication which is numerically stable in terms of the relative Newton error, it is interesting to investigate this concept a bit closer.
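Putting the pieces together, the following driver shows how the fragments sketched in the previous sections would be assembled (again a sketch of ours with assumed helpers: `exponents`, `greedy_segments`, `newton_subdivision`, `scaled_mul`, and the `import math` from the earlier fragment). For brevity it refines every rectangle of the product subdivision and skips the negligibility filtering, so it performs some superfluous work but computes the same result.

```python
def naive_mul(P, Q):
    """Schoolbook multiplication, used as the accurate base case."""
    R = [0.0] * (len(P) + len(Q) - 1)
    for i, a in enumerate(P):
        for j, b in enumerate(Q):
            R[i + j] += a * b
    return R

def newton_mul(P, Q, n, lam=1.0):
    """Sketch of Newton multiplication: subdivide 0..p x 0..q and multiply
    each rectangle with the scaling 2^-rho adapted to its P-slice."""
    eta = math.ceil(lam * n)
    EP, EQ = exponents(P), exponents(Q)
    I = greedy_segments(EP, eta)             # product subdivision, P side
    J = greedy_segments(EQ, eta)             # product subdivision, Q side
    rects = [(i1, i2, j1, j2) for i1, i2 in zip(I, I[1:])
                              for j1, j2 in zip(J, J[1:])]
    R = [0.0] * (len(P) + len(Q) - 1)
    for i1, i2, j1, j2 in newton_subdivision(EP, EQ, eta, rects):
        rho = (EP[i2 - 1] - EP[i1]) / max(i2 - 1 - i1, 1)
        part = scaled_mul(P[i1:i2], Q[j1:j2], naive_mul, rho)
        for k, c in enumerate(part):         # add the partial product into R
            R[i1 + j1 + k] += c
    return R
```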
First of all, its name is justified by the following property, which follows from proposition 3: if $P$ and $Q$ are known with relative Newton errors $\varepsilon_P$ and $\varepsilon_Q$, then the induced relative Newton error of the product satisfies

$$\varepsilon \leq 2d\,(\varepsilon_P + \varepsilon_Q + O(\varepsilon_P\,\varepsilon_Q)).$$

Some natural numeric quantities associated to a numeric polynomial are its coefficients, its roots and its evaluations at points. For our notion of relative Newton error to be satisfactory, a small relative Newton error should therefore imply small relative errors for the coefficients, the roots and the evaluations at points. Now the example from the introduction shows that it is impossible to obtain good relative errors for all coefficients. Nevertheless, given a polynomial $P$ with a small relative Newton error $\varepsilon$, we can at least guarantee small relative errors $\varepsilon_i \leq \varepsilon$ for those coefficients $P_i$ which correspond to vertices $(i, E_i)$ of the Newton polygon. For most practical purposes, we expect this to be sufficient. However, theoretically speaking, the naive multiplication algorithm sometimes provides better relative errors for other coefficients. This is for instance the case when squaring a polynomial of the form $A(z^2) + B(z^2)\,z$, where $B$ is very small compared to $A$.

For what follows, it will be convenient to redefine $d = \deg P$. Let $U = \{u_1, \ldots, u_d\}$ denote the set of roots of $P$, ordered by increasing magnitudes $|u_1| \leq \cdots \leq |u_d|$. By analogy with the power series setting, the norms $|u_i|$ roughly correspond to the slopes $\rho_i$ of the Newton polygon. We expect the existence of a bound

$$\bigl|\log|u_i| + \rho_i\bigr| \leq O(\log d),$$

although we have not yet proved this. The worst case arises when $P$ admits a single root of multiplicity $d$. We define the relative distance $\delta_{z,z'}$ between $z, z' \in \mathbb{C}$ (not both zero) by

$$\delta_{z,z'} = \frac{|z - z'|}{|z| + |z'|} \leq 1,$$

and the relative distance $\delta_{z,P}$ between $z$ and $U$, counted with multiplicities, by

$$\delta_{z,P} = \delta_{z,u_1} \cdots \delta_{z,u_d}.$$

Clearly, the relative error of the evaluation of $P$ at $z$ increases when $z$ approaches $U$. The following proposition shows that the relative error of $P(z)$ can be kept small, as long as $\delta_{z,P}$ is sufficiently large.

Proposition 10. The relative error of the evaluation of $P$ at $z \in \mathbb{C}$ is bounded by

$$\varepsilon_{P(z)} \leq \frac{d\,\varepsilon}{\delta_{z,P}}.$$

Proof. Modulo a suitable scaling and multiplication of $P$ by a constant, we may assume without loss of generality that $|z| = 1$ and $\|P\| = 1$. We claim that $|P(z)| \geq \delta_{z,P}$. Indeed,

$$P(z) = P_d\,(z - u_1) \cdots (z - u_d),$$

so, considering FFT transforms with respect to a $2d$-th root of unity $\omega$, we get

$$1 = \|P\| \leq \|\mathrm{FFT}_\omega(P)\| \leq |P_d|\,(1 + |u_1|) \cdots (1 + |u_d|),$$

whence

$$|P(z)| = |P_d| \prod_i |z - u_i| = |P_d| \prod_i \delta_{z,u_i}\,(1 + |u_i|) \geq \delta_{z,P}.$$

Since $\|P\| = 1$, its coefficients are known with absolute errors $\delta_i \leq \varepsilon$. We can therefore evaluate $P$ with absolute error $\delta \leq d\,\varepsilon$ and relative error $\varepsilon_{P(z)} \leq \delta / |P(z)| \leq d\,\varepsilon / \delta_{z,P}$. □

Let us finally consider the computation of the roots $u_1, \ldots, u_d$, in the case when $P$ is known with relative Newton error $\varepsilon$. The worst case again arises when $P = P_d\,(z - u)^d$ admits a single root of multiplicity $d$. Indeed, a relative error $\varepsilon$ inside $P_0$ gives rise to a relative error $\varepsilon^{1/d}$ for $u$, so that we cannot expect anything better than $\varepsilon_{u_i} \approx \varepsilon^{1/d}$ for each $u_i$. Nevertheless, given an isolated root $u_i$ and $Q = P/(z - u_i)$, we have $P'(u_i) = Q(u_i)$, and a perturbation of $P$ of absolute size $\delta$ moves $u_i$ by approximately $\delta / |Q(u_i)|$. Using a similar argument as in the proof of proposition 10, it follows that

$$\varepsilon_{u_i} \leq \frac{d\,\varepsilon}{\delta_{u_i, U\setminus\{u_i\}}}.$$

Indeed, after reduction to the case when $|u_i| = 1$ and $\|P\| = 1$, we combine the facts that the absolute error of the evaluation of $P$ at $u_i$ is bounded by $d\,\varepsilon$, that $|P'(u_i)| = |Q(u_i)| \geq \delta_{u_i, U\setminus\{u_i\}}$, and that $\varepsilon_{u_i} = \delta_{P(u_i)} / |P'(u_i)|$ in first approximation.

10. Conclusion

In this paper, we have presented a fast multiplication algorithm for polynomials over $\mathbb{D}$, which is numerically stable in a wide variety of cases. In order to capture the increased stability in a precise theorem, we have shown the usefulness of numeric Newton polygons and the relative Newton error.
Even though our algorithm was presented in the case of real coefficients, it is straightforward to generalize it to the complex case. Multiplication being the most fundamental operation on polynomials, it can be hoped that fast and numerically stable algorithms for other operations (division, multi-point evaluation, etc.) can be developed along similar lines. This might for instance have applications to polynomial root finding [Pan96]. Indeed, an operation such as the Graeffe transform can be done both efficiently and accurately using our algorithm.

The multiplication algorithm with scaling from [vdH:relax, section 6.2] and section 3 has been implemented inside the Mathemagix system [vdH:mmx]. We have experienced it to be very efficient for the expansion of power series. For instance, the accurate computation of a large number of terms of a typical power series at a high precision typically takes about one minute. One reason behind this efficiency stems from the reduction to Kronecker multiplication. Indeed, multiple precision libraries for floating point numbers, such as MPFR [MPFR], involve a lot of overhead for every single operation. We rather reduce the whole problem to one big integer multiplication.

Only a preliminary version of Newton multiplication has been implemented so far inside Mathemagix. Although this implementation uses a simpler, but not always efficient, subdivision, it already produces encouraging results. Work is in progress to improve the implementation and to reduce its combinatorial overhead. It also remains worth investigating whether there exist other, simpler and/or more efficient subdivision schemes.

Bibliography

[CT65] J. W. Cooley and J. W. Tukey. An algorithm for the machine calculation of complex Fourier series. Math. Comp., 19:297–301, 1965.

[GMP] T. Granlund et al. GMP, the GNU multiple precision arithmetic library, 1991.

[Kron1882] L. Kronecker. Grundzüge einer arithmetischen Theorie der algebraischen Grössen. J. reine angew. Math., 92:1–122, 1882.

[MPFR] G. Hanrot, V. Lefèvre, K. Ryde, and P. Zimmermann. MPFR, a C library for multiple-precision floating-point computations with exact rounding, 2000.

[Pan96] Victor Y. Pan. On approximating complex polynomial zeros: Modified quadtree (Weyl's) construction and improved Newton's iteration. Technical Report RR-2894, INRIA Sophia Antipolis, 1996.

[Sch82] A. Schönhage. Asymptotically fast algorithms for the numerical multiplication and division of polynomials with complex coefficients. In J. Calmet, editor, EUROCAM '82, volume 144 of Lecture Notes in Computer Science, pages 3–15, Marseille, France, April 1982. Springer.

[SS71] A. Schönhage and V. Strassen. Schnelle Multiplikation grosser Zahlen. Computing, 7:281–292, 1971.

[vdH:relax] J. van der Hoeven. Relax, but don't be too lazy. J. Symbolic Comput., 34:479–542, 2002.

[vdH:mmx] J. van der Hoeven et al. Mathemagix, 2002.