<\body>

<\author-address>
  Département de Mathématiques (Bât. 425), Université Paris-Sud, 91405 Orsay Cedex, France
</author-address>

<\abstract>
  In previous work, we have introduced the technique of relaxed power series computations. With this technique, it is possible to solve implicit equations almost as quickly as doing the operations which occur in the implicit equation. In this paper, we present a new relaxed multiplication algorithm for the resolution of linear equations. The algorithm has the same asymptotic time complexity as our previous algorithms, but we improve the space overhead in the divide and conquer model and the constant factor in the FFT model.
</abstract>

<section|Introduction>

Let $\mathbb{A}$ be an effective ring and consider two power series $f = f_0 + f_1 z + \cdots$ and $g = g_0 + g_1 z + \cdots$ in $\mathbb{A}[[z]]$. In this paper we will be concerned with the efficient computation of the first $n$ coefficients of the product $h = f g = h_0 + h_1 z + \cdots$. If the first $n$ coefficients of $f$ and $g$ are known beforehand, then we may use any fast multiplication algorithm for polynomials in order to achieve this goal, such as divide and conquer multiplication [KO63, Knu97], which has a time complexity $O(n^{\log 3/\log 2})$, or FFT multiplication [CT65, SS71, CK91], which has a time complexity $O(n \log n \log \log n)$. For simplicity, ``time complexity'' stands for the required number of operations in $\mathbb{A}$. Similarly, ``space complexity'' will stand for the number of elements of $\mathbb{A}$ which need to be stored.

The required number of multiplications $K^{\times}(n)$ in the divide and conquer algorithm satisfies the following recurrence relations:

<\eqnarray*>
  K^{\times}(1) &=& 1;\\
  K^{\times}(n) &=& 2\,K^{\times}(\lceil n/2\rceil) + K^{\times}(\lfloor n/2\rfloor).
</eqnarray*>

When computing only the product truncated at order $n$, the number of multiplications $K^{\times,\mathrm{tr}}(n)$ needed by the divide and conquer algorithm becomes

<\eqnarray*>
  K^{\times,\mathrm{tr}}(1) &=& 1;\\
  K^{\times,\mathrm{tr}}(n) &=& K^{\times}(\lceil n/2\rceil) + 2\,K^{\times,\mathrm{tr}}(\lfloor n/2\rfloor).
</eqnarray*>

For certain computations, and most importantly the resolution of implicit equations, it is interesting to have so called ``relaxed algorithms'', which output the first $n$ coefficients of $h$ as soon as the first $n$ coefficients of $f$ and $g$ are known, for each $n$.
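To make the relaxed input--output behaviour concrete, here is a minimal naive sketch (not one of the fast algorithms of this paper; the class name and interface are ours, for illustration only): each time a new pair of coefficients $(f_i, g_i)$ is fed in, the coefficient $h_i$ of the product becomes available.

```python
class NaiveRelaxedProduct:
    """Naive O(n^2) relaxed multiplication: after the coefficients
    f_0, ..., f_i and g_0, ..., g_i have been fed in, the coefficient
    h_i of h = f*g is immediately available."""

    def __init__(self):
        self.f, self.g, self.h = [], [], []

    def feed(self, fi, gi):
        self.f.append(fi)
        self.g.append(gi)
        i = len(self.f) - 1
        # h_i = sum_{a+b=i} f_a g_b only involves coefficients known so far
        self.h.append(sum(self.f[a] * self.g[i - a] for a in range(i + 1)))
        return self.h[i]
```

Fast relaxed algorithms achieve exactly this input--output behaviour, but with far fewer operations in $\mathbb{A}$.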
This allows for instance the computation of the exponential $g = \exp f$ of a series $f$ with $f_0 = 0$ using the formula

<\equation>
  g = \int f' g.
</equation>

In [vdH97, vdH02], we proved the following two theorems:

<\theorem>
  There exists a relaxed multiplication algorithm of time complexity $O(n^{\log 3/\log 2})$ and space complexity $O(n \log n)$, and which uses $K^{\times}(n)$ multiplications.
</theorem>

<\theorem>
  There exists a relaxed multiplication algorithm of time complexity $O(M(n) \log n)$ and space complexity $O(n)$.
</theorem>

Here $M(n)$ stands for the time complexity of non-relaxed multiplication at order $n$. Although these theorems are satisfactory from a theoretical point of view, they can be improved in two directions: by removing the logarithmic space overhead in the divide and conquer model and by improving the constant factor in the FFT model.

In this paper, we will present such an improved algorithm in the case of relaxed multiplication with a fixed series. More precisely, let $f$ and $g$ be power series, such that $g$ is known up to order $n$. Then our algorithm will compute the product $h = f g$ up to order $n$ and output $h_i$ as soon as $f_0, \ldots, f_i$ are known, for all $i < n$. We will prove the following:

<\theorem>
  There exists a relaxed multiplication algorithm with a fixed series of time complexity $O(n^{\log 3/\log 2})$, of space complexity $O(n)$, and which uses $K^{\times,\mathrm{tr}}(n)$ multiplications.
</theorem>

We also obtain a better constant factor in the asymptotic complexity in the FFT model, but this result is harder to state as a precise theorem. The algorithm is useful for the relaxed resolution of linear differential or difference equations. For instance, the exponential of a series can be computed using $\sim K^{\times,\mathrm{tr}}(n)$ multiplications in $\mathbb{A}$. Moreover, the new algorithm is very simple to implement, so it is likely to require less overhead than the algorithm from theorem 2.

Our algorithm is based on the recent middle product algorithm [HQZ00, HQZ02], which is recalled in section 2. In section 3 we present our new algorithm and in section 5 we give some applications.

In our algorithms we will use the following notations: the data type $\mathbb{A}(n)$ stands for truncated power series of order $n$, like $f = f_0 + \cdots + f_{n-1} z^{n-1}$. Given $f \in \mathbb{A}(n)$ and $0 \leq i \leq j \leq n$, we will denote $f_{i \ldots j} = f_i + \cdots + f_{j-1} z^{j-i-1} \in \mathbb{A}(j-i)$.
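The integral formula for the exponential recalled above can be exercised directly with a naive relaxed product: extracting coefficients in $g = \int f' g$ gives $i\,g_i = \sum_{j=1}^{i} j f_j g_{i-j}$, in which $g_i$ only depends on $g_0, \ldots, g_{i-1}$. The following sketch (function name is ours) implements this with exact rational arithmetic:

```python
from fractions import Fraction

def relaxed_exp(f, n):
    """Coefficients of g = exp(f) through order n, for a truncated series f
    (list of coefficients) with f[0] == 0, via g = int(f' * g), i.e.
    i * g_i = sum_{j=1}^{i} j * f_j * g_{i-j}."""
    assert f[0] == 0
    g = [Fraction(1)]
    for i in range(1, n):
        # only g_0, ..., g_{i-1} are used: the product is relaxed
        s = sum((j * f[j] * g[i - j] for j in range(1, min(i, len(f) - 1) + 1)),
                Fraction(0))
        g.append(s / i)
    return g
```

For $f = z$ this reproduces the coefficients $1/i!$ of $\exp z$.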
Given $f \in \mathbb{A}(m)$ and $g \in \mathbb{A}(n)$, we also denote $f \boxplus g = f + g z^m \in \mathbb{A}(m+n)$. We will denote by $\mathrm{Ref}(\mathbb{A}(n))$ the type whose elements are references to elements of type $\mathbb{A}(n)$. If $\varphi \in \mathrm{Ref}(\mathbb{A}(n))$ and $0 \leq i \leq j \leq n$, then we assume that $\varphi_{i \ldots j} \in \mathrm{Ref}(\mathbb{A}(j-i))$.

<section|The middle product>

Let $f = f_0 + \cdots + f_{n-1} z^{n-1}$ and $g = g_0 + \cdots + g_{2n-2} z^{2n-2}$ be two truncated power series at orders $n$ resp. $2n-1$. The middle product of $f$ and $g$ is defined to be the truncated power series $h = f \bowtie g = h_0 + \cdots + h_{n-1} z^{n-1}$ of order $n$, such that

<\equation*>
  h_i = \sum_{k=0}^{n-1} f_k\, g_{n-1+i-k}
</equation*>

for all $i \in \{0, \ldots, n-1\}$. In figure 1, $f \bowtie g$ corresponds to the colored region.

[Figure 1. Illustration of the middle product.]

The middle product of $f_0 + f_1 z$ and $g_0 + g_1 z + g_2 z^2$ can be computed using only three multiplications, using the following trick:

<\eqnarray*>
  \alpha &=& f_1 (g_0 + g_1)\\
  \beta &=& (f_1 - f_0)\, g_1\\
  \gamma &=& f_0 (g_1 + g_2)\\
  h_0 &=& \alpha - \beta\\
  h_1 &=& \gamma + \beta.
</eqnarray*>

This trick may be applied recursively in order to yield an algorithm which needs exactly the same number of multiplications $K^{\times}(n)$ as the divide and conquer algorithm for the computation of the product of two polynomials of degree $n-1$. More precisely, the following recursive algorithm comes from [HQZ00, HQZ02].

<\algorithm|middle>
  Input: $f \in \mathbb{A}(n)$ and $g \in \mathbb{A}(2n-1)$.

  Output: their middle product $h = f \bowtie g \in \mathbb{A}(n)$.

  if $n = 1$ then return $f_0 g_0$

  $k := \lfloor n/2 \rfloor$, $l := \lceil n/2 \rceil$

  $\alpha := f_{k \ldots n} \bowtie (g_{0 \ldots 2l-1} + g_{l \ldots 3l-1})$

  if $n$ is even then

  <\indent>
    $\beta := (f_{k \ldots n} - f_{0 \ldots k}) \bowtie g_{l \ldots 3l-1}$
  </indent>

  else

  <\indent>
    $\beta := [f_k \boxplus (f_{l \ldots n} - f_{0 \ldots k})] \bowtie g_{l \ldots 3l-1}$
  </indent>

  $\gamma := f_{0 \ldots k} \bowtie (g_{l \ldots l+2k-1} + g_{2n-2k \ldots 2n-1})$

  return $(\alpha - \beta) \boxplus (\gamma + \beta_{0 \ldots k})$
</algorithm>

In [HQZ02] it is also shown that, in the FFT model, the middle product can still be computed in essentially the same time as the product of two polynomials.

<section|Relaxed multiplication with a fixed series>

Let $f$ and $g$ be power series, such that $g$ is known up to order $n$. In this section, we present an algorithm which computes the product $h = f g$ up to order $n$. For each $i < n$, the algorithm outputs $h_i$ as soon as $f_0, \ldots, f_i$ are known.

The idea of our algorithm is similar to the idea behind fast relaxed multiplication in [vdH97, vdH02] and is based on a subdivision of the triangular area which corresponds to the computation of the truncated power series product. This subdivision is shown in figures 2 and 3, where each parallelogram corresponds to the computation of a middle product.
[Figure 2. The subdivision used for the new relaxed multiplication algorithm.]

More precisely, let $k = \lfloor n/2 \rfloor$, $l = \lceil n/2 \rceil$ and assume that $f_0, \ldots, f_k$ are known. Then the contribution of $f_{0 \ldots l} \bowtie g_{n-2l+1 \ldots n}$ to $h_{k \ldots n}$ may be computed using the middle product algorithm from the previous section. The relaxed truncated products $f_{0 \ldots k}\, g_{0 \ldots k}$ and $f_{l \ldots n}\, g_{0 \ldots k}$ may be computed recursively.

In order to implement this idea, we will use an in-place algorithm, which adds the result of $f g$ to a reference $\varphi$ to an element of $\mathbb{A}(n)$. Denote by $\tilde{\varphi}$ the initial value of $\varphi$. Then the in-place algorithm should be called successively for $i = 0, \ldots, n-1$. After the last call, we have $\varphi = \tilde{\varphi} + h$. Taking $\tilde{\varphi} = 0$, the algorithm computes $h$.

<\algorithm|relaxed>
  Input: $(f, g, \varphi, i)$ with $f, g \in \mathbb{A}(n)$, $\varphi \in \mathrm{Ref}(\mathbb{A}(n))$, $i < n$.

  Output: we have $\varphi_{0 \ldots i+1} = \tilde{\varphi}_{0 \ldots i+1} + (f g)_{0 \ldots i+1}$ on exit.

  if $n = 1$ then $\varphi_0 \mathrel{+}= f_0 g_0$ and return

  $k := \lfloor n/2 \rfloor$, $l := \lceil n/2 \rceil$

  if $i < k$ then relaxed$(f_{0 \ldots k}, g_{0 \ldots k}, \varphi_{0 \ldots k}, i)$

  if $i = k$ then $\varphi_{k \ldots n} \mathrel{+}= f_{0 \ldots l} \bowtie g_{n-2l+1 \ldots n}$

  if $i \geq l$ then relaxed$(f_{l \ldots n}, g_{0 \ldots k}, \varphi_{l \ldots n}, i - l)$
</algorithm>

The number of multiplications $R^{\times}(n)$ used by relaxed is determined by the relations

<\eqnarray*>
  R^{\times}(1) &=& 1;\\
  R^{\times}(n) &=& K^{\times}(\lceil n/2 \rceil) + 2\,R^{\times}(\lfloor n/2 \rfloor).
</eqnarray*>

By induction, it follows that $R^{\times}(n) = K^{\times,\mathrm{tr}}(n)$. The overall time complexity satisfies

<\equation*>
  R(n) \leq K(\lceil n/2 \rceil) + 2\,R(\lfloor n/2 \rfloor) + O(n),
</equation*>

so $R(n) = O(n^{\log 3/\log 2})$. The algorithm being in-place, its space complexity is clearly $O(n)$. This proves theorem 3.

It is also interesting to use the above algorithm in the FFT model. We then have the estimation

<\equation*>
  R(n) \leq M(\lceil n/2 \rceil) + 2\,R(\lfloor n/2 \rfloor) + O(n)
</equation*>

for the asymptotic complexity $R(n)$. If $M(n) \sim c\, n \log n \log \log n$, this yields

<\equation*>
  R(n) \lesssim \tfrac{1}{4}\, M(n) \log_2 n.
</equation*>

This should be compared with the complexity $\sim M(n) \log_2 n$ of the previously best algorithm [vdH02] and with the complexity $\sim 2\, M(n) \log_2 n$ of the standard fast relaxed multiplication algorithm. Notice that we rarely obtain the complexity $M(n) \sim c\, n \log n \log \log n$ in practice. In the range where $M(n) \sim c\, n^{\alpha}$ with $\alpha > 1$, we obtain

<\equation*>
  R(n) \lesssim \frac{1}{2^{\alpha} - 2}\, M(n).
</equation*>

<section|A worked example>

Let us consider the computation of $f$ up till order $6$ using our algorithm and the formula

<\equation*>
  f = \int f g,
</equation*>

with $f_0 = 1$ and $g = 1 + 2 z + 3 z^2 + \cdots$, i.e. $g_i = i + 1$.
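The relaxed algorithm above can be sketched compactly as follows (our own illustrative Python rendering: the naive middle product stands in for the fast one so that the sketch stays self-contained, the list `phi` plays the role of the reference $\varphi$, and `off` encodes the sub-reference $\varphi_{l \ldots n}$):

```python
def middle_product(f, g):
    # naive middle product: h_i = sum_k f_k * g_{n-1+i-k}
    n = len(f)
    return [sum(f[k] * g[n - 1 + i - k] for k in range(n)) for i in range(n)]

def relaxed(f, g, phi, off, i):
    """In-place relaxed multiplication with the fixed series g.
    Calling relaxed(f, g, phi, 0, i) for i = 0, ..., n-1 accumulates the
    truncated product f*g into phi; coefficient i is complete after call i,
    and call i only reads f[0..i]."""
    n = len(f)
    if n == 1:
        phi[off] += f[0] * g[0]
        return
    k, l = n // 2, (n + 1) // 2          # k = floor(n/2), l = ceil(n/2)
    if i < k:                            # top-left triangle, recursively
        relaxed(f[:k], g[:k], phi, off, i)
    if i == k:                           # the parallelogram: one middle product
        for j, c in enumerate(middle_product(f[:l], g[n - 2 * l + 1:n])):
            phi[off + k + j] += c
    if i >= l:                           # bottom triangle, recursively
        relaxed(f[l:], g[:k], phi, off + l, i - l)
```

Running the calls for $i = 0, \ldots, n-1$ and comparing against the naive truncated product provides a direct correctness check of the subdivision.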
We start with $\tilde{\varphi} = 0$ in $\varphi$ and perform the following computations at the successive calls for $i = 0, \ldots, 5$:

<\enumerate>
  <item>We set $\varphi_0 \mathrel{+}= f_0 g_0 = 1$, so that $\varphi = 1$ and $f_1 = 1$.

  <item>We recursively apply relaxed to $f_{0 \ldots 3}$, $g_{0 \ldots 3}$, $\varphi_{0 \ldots 3}$ and $i = 1$. This requires the computation of $f_{0 \ldots 2} \bowtie g_{0 \ldots 3} = (1 + z) \bowtie (1 + 2 z + 3 z^2) = 3 + 5 z$. We thus increase $\varphi_{1 \ldots 3} \mathrel{+}= 3 + 5 z$, so that $\varphi = 1 + 3 z + 5 z^2$ and $f_2 = \frac{3}{2}$.

  <item>The two nested recursive calls to relaxed now lead to the increase of $\varphi_2$ by $f_2 g_0 = \frac{3}{2}$, so that $\varphi = 1 + 3 z + \frac{13}{2} z^2$ and $f_3 = \frac{13}{6}$.

  <item>We now have both $i = 3$ and $l = 3$. So we first compute $f_{0 \ldots 3} \bowtie g_{1 \ldots 6} = 10 + \frac{27}{2} z + 17 z^2$ and set $\varphi_{3 \ldots 6} \mathrel{+}= 10 + \frac{27}{2} z + 17 z^2$. We next recursively apply relaxed to $f_{3 \ldots 6}$, $g_{0 \ldots 3}$, $\varphi_{3 \ldots 6}$ and $i = 0$, which leads to an increase of $\varphi_3$ by $f_3 g_0 = \frac{13}{6}$. Altogether, we obtain $\varphi = 1 + 3 z + \frac{13}{2} z^2 + \frac{73}{6} z^3 + \frac{27}{2} z^4 + 17 z^5$ and $f_4 = \frac{73}{24}$.

  <item>We recursively apply relaxed to $f_{3 \ldots 6}$, $g_{0 \ldots 3}$, $\varphi_{3 \ldots 6}$ and $i = 1$. This leads to the increase $\varphi_{4 \ldots 6} \mathrel{+}= f_{3 \ldots 5} \bowtie g_{0 \ldots 3} = \frac{59}{8} + \frac{151}{12} z$, so that $\varphi = 1 + 3 z + \frac{13}{2} z^2 + \frac{73}{6} z^3 + \frac{167}{8} z^4 + \frac{355}{12} z^5$ and $f_5 = \frac{167}{40}$.

  <item>The two nested recursive calls lead to the increase $\varphi_5 \mathrel{+}= f_5 g_0 = \frac{167}{40}$, so that $\varphi = 1 + 3 z + \frac{13}{2} z^2 + \frac{73}{6} z^3 + \frac{167}{8} z^4 + \frac{4051}{120} z^5$ and $f_6 = \frac{4051}{720}$.
</enumerate>

The entire computation is represented schematically in figure 3.

[Figure 3. Illustration of an order $6$ relaxed multiplication.]

<section|Applications>

First of all, let us consider the problem of relaxed division by a fixed power series. In other words, we are given two power series $f$ and $g$, where $f$ is known up to order $n$ and $g_0 = 1$. We want an algorithm for the computation of $h = f/g$ up to order $n$, such that $h_i$ is computed as soon as $f_0, \ldots, f_i$ are known, for each $i < n$. Now we have

<\equation*>
  h = f - z\, (\hat{g}\, h),
</equation*>

where $\hat{g} = (g - 1)/z \in \mathbb{A}[[z]]$. We may thus compute $h$ in a relaxed way using the algorithm from the previous section. Computing $h$ up till $n$ terms will then necessitate $\sim K^{\times,\mathrm{tr}}(n)$ multiplications in $\mathbb{A}$.

Let us next consider a linear differential equation

<\equation>
  L_r f^{(r)} + \cdots + L_0 f = 0,
</equation>

with $L_0, \ldots, L_r \in \mathbb{A}[[z]]$ and $L_r(0) = 1$. Given initial conditions for $f(0), \ldots, f^{(r-1)}(0)$, there exists a unique solution $f$ to this equation. We may compute this solution using the relaxed algorithm from the previous section, the above algorithm for relaxed division, and the formula

<\equation*>
  f = -\int^{(r)} L_r^{-1}\, (L_{r-1} f^{(r-1)} + \cdots + L_0 f),
</equation*>

where $\int^{(r)}$ stands for $r$-fold integration, which takes the prescribed initial conditions into account.
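The coefficients appearing in the worked example can be double-checked by running the defining recursion $f_{i+1} = (f g)_i / (i+1)$ directly, with exact rational arithmetic (a plain quadratic loop, independent of the relaxed machinery):

```python
from fractions import Fraction

# f' = f*g with f_0 = 1 and g = 1 + 2z + 3z^2 + ..., i.e. g_b = b + 1
f = [Fraction(1)]
for i in range(6):
    h_i = sum(f[a] * (i - a + 1) for a in range(i + 1))   # coefficient (f*g)_i
    f.append(h_i / (i + 1))
# f == [1, 1, 3/2, 13/6, 73/24, 167/40, 4051/720]
```

The values agree with those obtained step by step in the relaxed computation above.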
In order to compute $n$ coefficients of $f$, we need to perform $\sim (r+1)\, K^{\times,\mathrm{tr}}(n)$ multiplications in $\mathbb{A}$ and $O(r n)$ multiplications and divisions by integers. If $L_r = 1$, then we only need $\sim r\, K^{\times,\mathrm{tr}}(n)$ multiplications. For instance, the exponential $g = \exp f$ of a series $f$ with $f_0 = 0$ satisfies the equation

<\equation*>
  g' - f' g = 0,
</equation*>

so $g$ can be computed using $\sim K^{\times,\mathrm{tr}}(n)$ multiplications, using the formula (1).

More generally, consider the solution $f$ to (2) with the prescribed initial conditions, and let $g$ be another series with $g_0 = 0$. Then the composition $h = f \circ g$ again satisfies a linear differential equation. Indeed, we have the relations

<\eqnarray*>
  f \circ g &=& h\\
  f' \circ g &=& \frac{h'}{g'}\\
  f'' \circ g &=& \frac{h''}{g'^2} - \frac{h' g''}{g'^3}\\
  &\vdots&
</eqnarray*>

Postcomposing (2) with $g$ and using these relations, we obtain a linear differential equation for $h$. In fact, our algorithm may be used to solve far more general linear equations, such as linear partial differential equations, or linear differential-difference equations. In the case of difference equations, we notice that the relaxed multiplications in the algorithms from [vdH02] for relaxed right composition with a fixed series all have one fixed argument. So we may indeed apply the algorithm from section 3.

We finally notice that our algorithm can even be used in a non-linear context. Indeed, after computing $\lceil n/2 \rceil$ coefficients of a truncated relaxed product, the computation of the remaining products reduces to the computation of two truncated relaxed products with one fixed argument. Actually, this corresponds to an implicit application of Newton's method.

<section|Conclusion and open questions>

We have presented a new algorithm for relaxed multiplication. Although the new algorithm does not yield a significant improvement from the asymptotic complexity point of view, we do expect it to be very useful for practical applications, such as the exponentiation of power series. First of all, the algorithm is easy to implement. Secondly, it only needs a linear amount of memory in the range where divide and conquer multiplication is appropriate.
In combination with FFT multiplication, the algorithm yields a better constant factor in the asymptotic complexity. When implementing a library for power series computations, it is interesting to incorporate a mechanism which automatically detects relaxed and fixed multiplicands in a complex computation. This is possible by examining the dependency graph. With such a mechanism, one may use the new algorithm whenever possible.

Some interesting questions remain open in the divide and conquer model: can we apply Mulders' trick [Mul00] for the computation of ``short products'' in our setting while maintaining the linear space complexity (see figure 4)? In that case, we might improve the number of multiplications in theorem 3 to $\sim 0.808\, K^{\times}(n)$.

[Figure 4. Using Mulders' trick in combination with the middle product.]

In a similar vein, does there exist a relaxed multiplication algorithm of time complexity $\sim K^{\times}(n)$ and linear space complexity? This would be so, if the middle product algorithm could be made relaxed in an in-place way (the algorithm is already ``essentially relaxed'' in the sense of [vdH97, vdH02] in the divide and conquer model).

As it stands now, with the above questions still unanswered, the original relaxed multiplication algorithm from theorem 1 remains best from the time complexity point of view in the divide and conquer model. Moreover, Mulders' trick can be applied in this setting, so as to yield a short relaxed multiplication algorithm of complexity $\sim 0.808\, K^{\times}(n)$, or even better [HZ02]. This has surprising consequences for the complexities of several operations like short division and square roots: we obtain algorithms of time complexities $\sim 0.808\, K^{\times}(n)$ resp. $\sim 0.404\, K^{\times}(n)$ when using $O(n \log n)$ space, while the best known algorithms which use linear space have time complexities $\sim K^{\times}(n)$ resp. $\sim \frac{1}{2}\, K^{\times}(n)$. In order to obtain the complexity $\sim 0.404\, K^{\times}(n)$ in the case of square roots, one should use a relaxed version of the fast squaring algorithm from [HQZ00, HQZ02], which is based on middle products.
We finally remark that this relaxed version of squaring using middle products is also interesting in the FFT model. In this case, the relaxed middle product corresponds to a full relaxed product with one fixed argument. Such products can be computed in time $\sim 2\, R(n)$, so that we obtain a relaxed squaring algorithm of time complexity $\sim 2\, R(n)$. This is twice as good as general relaxed multiplication. In the non-relaxed setting, squares can be computed in a time between $\frac{1}{2} M(n)$ and $\frac{2}{3} M(n)$, depending on whether most time is spent on the inner multiplications or on the fast Fourier transforms, respectively.

<\bibliography|bib|alpha|relaxed>
<\bib-list|[99]>
  [CK91] D. G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Informatica, 28:693--701, 1991.

  [CT65] J. W. Cooley and J. W. Tukey. An algorithm for the machine calculation of complex Fourier series. Math. Comp., 19:297--301, 1965.

  [HQZ00] G. Hanrot, M. Quercia, and P. Zimmermann. Speeding up the division and square root of power series. Research Report 3973, INRIA, July 2000.

  [HQZ02] G. Hanrot, M. Quercia, and P. Zimmermann. The middle product algorithm I. Speeding up the division and square root of power series. Accepted for publication in AAECC, 2002.

  [HZ02] G. Hanrot and P. Zimmermann. A long note on Mulders' short product. Research Report 4654, INRIA, December 2002. Available from hanrot/Papers/mulders.ps.

  [Knu97] D. E. Knuth. The Art of Computer Programming, volume 2: Seminumerical Algorithms. Addison-Wesley, 3rd edition, 1997.

  [KO63] A. Karatsuba and J. Ofman. Multiplication of multidigit numbers on automata. Soviet Physics Doklady, 7:595--596, 1963.

  [Mul00] T. Mulders. On short multiplication and division. AAECC, 11(1):69--88, 2000.

  [SS71] A. Schönhage and V. Strassen. Schnelle Multiplikation grosser Zahlen. Computing, 7:281--292, 1971.

  [vdH97] J. van der Hoeven. Lazy multiplication of formal power series. In W. W. Küchlin, editor, Proc. ISSAC '97, pages 17--20, Maui, Hawaii, July 1997.

  [vdH02] J. van der Hoeven. Relax, but don't be too lazy. J. Symbolic Computation, 34:479--542, 2002.
</bib-list>