Relaxed resolution of implicit equations

Joris van der Hoeven

LIX, CNRS

École polytechnique

91128 Palaiseau Cedex

France

Email: vdhoeven@lix.polytechnique.fr

Web: http://lix.polytechnique.fr/~vdhoeven

December 17, 2009

This work has been supported by the ANR-09-JCJC-0098-01 MaGiX project, as well as by a Digiteo 2009-36HD grant and by the Région Ile-de-France.

The technique of relaxed power series expansion provides an efficient way to solve equations of the form $F = \Phi(F)$, where the unknown $F$ is a vector of power series, and where the solution can be obtained as the limit of the sequence $0, \Phi(0), \Phi(\Phi(0)), \ldots$ With respect to other techniques, such as Newton's method, two major advantages are its generality and the fact that it takes advantage of possible sparseness of $\Phi$. In this paper, we extend the relaxed expansion mechanism to more general implicit equations of the form $\Phi(F) = 0$.

Keywords: Implicit equation, relaxed power series, algorithm

A.M.S. subject classification: 68W25, 42-04, 68W30, 30B10, 33F05, 11Y55

1. Introduction

Let $\mathbb{K}$ be an effective field of constants of characteristic zero. Let $F$ be a column vector of $r$ indeterminate series in $\mathbb{K}[[z]]$. We may also consider $F$ as a power series $F = F_0 + F_1 z + F_2 z^2 + \cdots$ with coefficients $F_i \in \mathbb{K}^r$. Let $\Phi$ be a column vector of $r$ expressions built up from $F$, $z$ and constants in $\mathbb{K}$ using ring operations, differentiation and integration (with constant term zero). Finally, let $C_0, \ldots, C_{l-1} \in \mathbb{K}^r$ be a finite number of initial conditions. Assume that the system

$\Phi(F) = 0, \qquad F_0 = C_0,\ \ldots,\ F_{l-1} = C_{l-1}$  (1)

admits a unique solution $F \in \mathbb{K}[[z]]^r$. In this paper, we are interested in the efficient computation of this solution up to a given order $n$.

In the most favourable case, the equation $\Phi(F) = 0$ is of the form

$F = \Psi(F),$  (2)

where the coefficient $\Psi(F)_i$ of $z^i$ in $\Psi(F)$ only depends on the earlier coefficients $F_0, \ldots, F_{i-1}$ of $F$, for each $i$. In that case,

$F_i = \Psi(F)_i$

actually provides us with a recurrence relation for the computation of the solution. Using the technique of relaxed power series expansions [vdH02a, vdH07], which will briefly be recalled in section 2, it is then possible to compute the expansion up till order $n$ in time

$s\, \mathsf{R}(n) + O(t\, n),$  (3)

where $s$ is the number of multiplications occurring in $\Psi$, where $t$ is the total size of $\Psi$ as an expression, and $\mathsf{R}(n)$ denotes the complexity of relaxed multiplication of two power series up till order $n$. Here we assume that $\Psi$ is represented by a directed acyclic graph, with possible common subexpressions. For large $n$, we have $\mathsf{R}(n) = O(\mathsf{M}(n) \log n)$, where $\mathsf{M}(n)$ denotes the complexity [CT65, SS71, CK91] of multiplying two polynomials of degrees $< n$. If $\mathbb{K}$ admits sufficiently many $2^p$-th roots of unity, then we even have $\mathsf{R}(n) = O(n \log n\, \mathrm{e}^{2 \sqrt{\log 2\, \log \log n}})$ and $\mathsf{M}(n) = O(n \log n)$. For moderate $n$, when polynomial multiplication is done naively or using Karatsuba's method, relaxed multiplication is as efficient as the truncated multiplication of polynomials at order $n$.
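To make the recursive case (2) concrete, here is a minimal sketch in Python (our own illustration, not part of the paper's Mathemagix implementation): for the equation $f = 1 + z f^2$ defining the generating series of the Catalan numbers, the coefficient of $z^i$ in the right-hand side only depends on $f_0, \ldots, f_{i-1}$, so the recurrence $f_i = \Psi(f)_i$ can be unrolled directly.

    def catalan_series(n):
        # Solve f = 1 + z*f^2 coefficient by coefficient: the i-th coefficient
        # of the right-hand side only involves f_0, ..., f_{i-1}.
        f = [1]                                    # f_0 = (1 + z f^2)_0 = 1
        for i in range(1, n):
            # (1 + z f^2)_i = (f^2)_{i-1} = sum_{j=0}^{i-1} f_j f_{i-1-j}
            f.append(sum(f[j] * f[i - 1 - j] for j in range(i)))
        return f

    print(catalan_series(6))                       # [1, 1, 2, 5, 14, 42]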

One particularly important example of an equation of the above type is the integration of a dynamical system

$F' = \Psi(F), \qquad F_0 = C_0,$  (4)

where $\Psi$ is algebraic (i.e. does not involve differentiation or integration). In that case, given the solution $\tilde{F} = F_0 + \cdots + F_{n-1} z^{n-1}$ up till order $n$, we may consider the linearized system

$E' = J_\Psi(\tilde{F})\, E$

up till order $n$, where $J_\Psi$ stands for the Jacobian matrix associated to $\Psi$ at $\tilde{F}$. If we also have a fundamental system of solutions of the linearized system up till order $n$, then one step of Newton's method allows us to find the solution of (4) and a new fundamental system of solutions of the linearized equation up till order $2 n$ [BK78, BCO+06]. A careful analysis shows that this leads to an algorithm of time complexity

(5)

In [vdH06], this bound has been further improved to

(6)

under suitable assumptions, including that $\mathbb{K}$ admits sufficiently many $2^p$-th roots of unity.

Although the complexity (5) is asymptotically better than (3) for very large $n$, the relaxed approach often turns out to be more efficient in practice. Indeed, Newton's method suffers both from a larger constant factor and from the fact that it profits less from the potential sparsity of the system. Moreover, as long as multiplications are done in the naive or Karatsuba models, the relaxed approach is optimal in the sense that the computation of the solution takes roughly the same time as its verification. Another advantage of the relaxed approach is that it generalizes to more general functional equations and partial differential equations.

Let us now return to our original implicit system (1). A first approach for its resolution is to keep differentiating the system with respect to $z$ until it becomes equivalent to a system of the form (2). For instance, if $\Phi$ is algebraic, then differentiation of (1) yields

$J_\Phi(F)\, F' + \frac{\partial \Phi}{\partial z}(F) = 0.$

Consequently, if the Jacobian matrix $J_\Phi(F)$ is invertible, then

$F' = -J_\Phi(F)^{-1}\, \frac{\partial \Phi}{\partial z}(F)$

provides us with an equivalent system which can be solved by one of the previous methods. Unfortunately, this method requires the computation of the Jacobian, so we no longer exploit the potential sparsity of the original system.

If (1) is a system of differentially algebraic equations, then we may also seek to apply Newton's method directly. For non degenerate systems, and assuming that we have computed the solution and a fundamental system of solutions for the linearized equation up till order $n$, one step of Newton's method yields an extension of the solutions up till order $2 n - c$, for a fixed constant $c$. From an asymptotic point of view, this means that the complexities (5) and (6) remain valid.

It is natural to ask for a relaxed algorithm for the resolution of (1), with a complexity similar to (3). We will restrict our attention to so-called “quasi-linear equations”, for which the linearized system is “non degenerate”. This concept will be introduced formally in section 3 and studied in more detail in section 6. In section 4, we present the main algorithm of this paper for the relaxed resolution of (1).

The idea behind the algorithm is simple: considering the not yet computed coefficients of $F$ as formal unknowns, we solve the systems of equations formed by the first coefficients of $\Phi(F) = 0$ for increasing orders. In particular, the coefficients of the power series involved in the resolution process are no longer in $\mathbb{K}$, but rather polynomials in the formal unknowns. For each subexpression $\Omega$ of $\Phi$ and modulo adequate substitution of known coefficients by their values, it turns out that there exists a constant $d$ such that, for large $k$, the coefficient $\Omega_k$ is a constant plus a linear combination of the unknowns $F_k, \ldots, F_{k+d}$. Moreover, each relaxed multiplication with symbolic coefficients can be reduced to a relaxed multiplication with constant coefficients and a finite number of scalar multiplications with symbolic coefficients. The main result is stated in theorem 5 and generalizes the previous complexity bound (3).

In section 6, we provide a more detailed study of the linearized system associated to (1). This will allow us to make the dependency of the coefficients $\Phi(F)_k$ on the formal unknowns more explicit. On the one hand, given a quasi-linear system on input, this will enable us to provide a certificate that the system is indeed quasi-linear. On the other hand, the asymptotic complexity bounds can be further sharpened in lucky situations (see theorem 11). Finally, in the last section 7, we outline how to generalize our approach to more general functional equations and partial differential equations.

2. Relaxed power series

Throughout this article, $\mathbb{K}$ will denote an effective field of characteristic zero. This means that elements in $\mathbb{K}$ can be encoded by data structures on a computer and that we have algorithms for performing the field operations in $\mathbb{K}$.

Let us briefly recall the technique of relaxed power series computations, which is explained in more detail in [vdH02a]. In this computational model, a power series $f \in \mathbb{K}[[z]]$ is regarded as a stream of coefficients $f_0, f_1, \ldots$ When performing an operation $h = \Phi(f, g)$ on power series, it is required that the coefficient $h_n$ of the result is output as soon as sufficiently many coefficients of the inputs are known, so that the computation of $h_n$ does not depend on the later coefficients. For instance, in the case of a multiplication $h = f g$, we require that $h_n$ is output as soon as $f_0, \ldots, f_n$ and $g_0, \ldots, g_n$ are known. In particular, we may use the naive formula $h_n = \sum_{i=0}^{n} f_i g_{n-i}$ for the computation of $h_n$.

The additional constraint on the time when coefficients should be output admits the important advantage that the inputs may depend on the output, provided that we add a small delay. For instance, the exponential $g = \exp f$ of a power series $f$ with $f_0 = 0$ may be computed in a relaxed way using the formula

$g = 1 + \int f' g.$

Indeed, when using the naive formula for products, the coefficient $g_n$ is given by

$g_n = \frac{1}{n} \sum_{i=1}^{n} i\, f_i\, g_{n-i} \qquad (n \geq 1),$

and the right-hand side only depends on the previously computed coefficients $g_0, \ldots, g_{n-1}$.

The main drawback of the relaxed approach is that we cannot directly use fast algorithms on polynomials for computations with power series. For instance, assuming that $\mathbb{K}$ has sufficiently many $2^p$-th roots of unity and that field operations in $\mathbb{K}$ can be done in time $O(1)$, two polynomials of degrees $< n$ can be multiplied in time $\mathsf{M}(n) = O(n \log n)$, using FFT multiplication [CT65]. Given the truncations $f_{;n}$ and $g_{;n}$ at order $n$ of power series $f, g \in \mathbb{K}[[z]]$, we may thus compute the truncated product $(f g)_{;n}$ in time $\mathsf{M}(n)$ as well. This is much faster than the naive relaxed multiplication algorithm for the computation of $(f g)_{;n}$, which takes time $O(n^2)$. However, the result of an FFT multiplication depends on all input coefficients $f_0, \ldots, f_{n-1}$ and $g_0, \ldots, g_{n-1}$, so the fast algorithm is not relaxed. Fortunately, efficient relaxed multiplication algorithms do exist:

Theorem 1. [vdH97, vdH02a] Let $\mathsf{M}(n)$ be the time complexity for the multiplication of polynomials of degrees $< n$ in $\mathbb{K}[z]$. Then there exists a relaxed multiplication algorithm for series in $\mathbb{K}[[z]]$ of time complexity $\mathsf{R}(n) = O(\mathsf{M}(n) \log n)$.

Theorem 2. [vdH07] If $\mathbb{K}$ admits a primitive $2^p$-th root of unity for all $p$, then there exists a relaxed multiplication algorithm of time complexity $\mathsf{R}(n) = O(n \log n\, \mathrm{e}^{2 \sqrt{\log 2\, \log \log n}})$. In practice, the existence of a $2^p$-th root of unity with $2^p \geq n$ suffices for multiplication up to order $n$.

An efficient C++ implementation of relaxed power series is available in the Mathemagix system [vdH+02b]. Leaving low-level pointer and memory management details apart, we will outline how the implementation works, using an informal pseudo-language. Relaxed power series in $\mathbb{K}[[z]]$ are implemented as an abstract base class series, which contains the vector of already computed coefficients $f_0, \ldots, f_{k-1}$ and a protected virtual method next for the computation of the next coefficient $f_k$. For instance, the naive product of two series $f, g \in \mathbb{K}[[z]]$ can be implemented using the following concrete derived class product:

Class product ▷ series

Fields f, g: series

Constructor product (f~: series, g~: series)

f := f~ and g := g~

Method next ()

Return $f_0 g_n + f_1 g_{n-1} + \cdots + f_n g_0$

Let us briefly explain this code. In addition to the vector $(f_0, \ldots, f_{k-1})$ of already computed coefficients (which is inherited from the base class series), the class product contains two data fields f and g for the multiplicands. The constructor product (f~, g~) constructs the product of two given series and the method next computes the next coefficient $(f g)_n$ using the naive relaxed method. The method next does not take care of remembering previously computed coefficients and does not make sure that coefficients are computed in order. Therefore, a different public function coefficient is used for the computation of the coefficients $f_0, \ldots, f_n$ of a series:

Function coefficient (f: series, n: ℕ)

Let $(f_0, \ldots, f_{k-1})$ denote the already computed coefficients of f

If $n \geq k$ then for $i = k, \ldots, n$ do $f_i :=$ f.next ()

Return $f_n$
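In Python, the same mechanism might be rendered as follows (a minimal sketch under our own naming conventions; the actual Mathemagix classes are more elaborate): the base class stores the vector of already computed coefficients, the subscript operator plays the role of the function coefficient, and concrete series override next.

    from fractions import Fraction

    class Series:
        """A relaxed power series, represented by its computed coefficients."""
        def __init__(self):
            self.phi = []              # already computed coefficients f_0, ..., f_{k-1}
        def next(self):                # compute f_k for k = len(self.phi)
            raise NotImplementedError
        def __getitem__(self, n):      # the public function 'coefficient'
            while len(self.phi) <= n:  # compute missing coefficients in order
                self.phi.append(self.next())
            return self.phi[n]

    class Product(Series):
        """Naive relaxed product h = f g, with h_n = f_0 g_n + ... + f_n g_0."""
        def __init__(self, f, g):
            super().__init__()
            self.f, self.g = f, g
        def next(self):
            n = len(self.phi)
            return sum(self.f[i] * self.g[n - i] for i in range(n + 1))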

In the case of implicitly defined power series, the method next may involve the series itself. For instance, exponentiation can be implemented as follows:

Class exp ▷ series

Fields f, w: series

Constructor exp (f~: series)

f := f~ and w := integral (product (derivative (f~), this))

Method next ()

If $n = 0$ then return $1$

Else return $w_n$
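Continuing the Python sketch (again with hypothetical names), the self-referential field can simply point to the object under construction; the recurrence $g_n = \frac{1}{n} \sum_{i=1}^{n} i\, f_i\, g_{n-i}$ then only ever reads coefficients of $g$ that were computed at earlier stages:

    class Exp(Series):
        """g = exp(f) for f_0 = 0, via the recurrence n g_n = sum_i i f_i g_{n-i}."""
        def __init__(self, f):
            super().__init__()
            self.f, self.g = f, self   # g refers to the series itself
        def next(self):
            n = len(self.phi)
            if n == 0:
                return Fraction(1)     # g_0 = exp(f_0) = exp(0) = 1
            # only uses g_0, ..., g_{n-1}, which are already stored in self.phi
            return sum(i * self.f[i] * self.g[n - i] for i in range(1, n + 1)) / n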

Example 3. Let us expand the exponential $g = \exp f$ of $f = z$ using the above algorithms. Besides $f$ and $g$, three auxiliary series $u = f'$, $v = u g$ and $w = \int v$ are created, with $g = 1 + w$. Now the computation of $g_0, \ldots, g_4$ using coefficient (g, 4) gives rise to the following sequence of assignments:

$g_0 := 1$
$u_0 := 1, \quad v_0 := u_0 g_0 = 1, \quad w_1 := v_0 / 1 = 1, \quad g_1 := w_1 = 1$
$u_1 := 0, \quad v_1 := u_0 g_1 + u_1 g_0 = 1, \quad w_2 := v_1 / 2 = \tfrac{1}{2}, \quad g_2 := w_2 = \tfrac{1}{2}$
$u_2 := 0, \quad v_2 := u_0 g_2 + u_1 g_1 + u_2 g_0 = \tfrac{1}{2}, \quad w_3 := v_2 / 3 = \tfrac{1}{6}, \quad g_3 := w_3 = \tfrac{1}{6}$
$u_3 := 0, \quad v_3 := u_0 g_3 + \cdots + u_3 g_0 = \tfrac{1}{6}, \quad w_4 := v_3 / 4 = \tfrac{1}{24}, \quad g_4 := w_4 = \tfrac{1}{24}$
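With the toy Python classes above, the expansion of example 3 can be reproduced as follows:

    class Z(Series):
        """The explicit series f = z."""
        def next(self):
            return Fraction(1) if len(self.phi) == 1 else Fraction(0)

    g = Exp(Z())
    print([g[n] for n in range(5)])    # [1, 1, 1/2, 1/6, 1/24]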

3. Implicit power series equations

Let $F$ be a column vector of $r$ indeterminate series in $\mathbb{K}[[z]]$. Alternatively, $F$ may be considered as a series with formal coefficients

$F = F_0 + F_1 z + F_2 z^2 + \cdots, \qquad F_k = (F_{1,k}, \ldots, F_{r,k}).$

We will denote by $\mathcal{E}$ the set of expressions built up from $F_1, \ldots, F_r$, $z$ and constants in $\mathbb{K}$ using ring operations, differentiation and integration (with $(\int \Omega)_0 = 0$ for all $\Omega$). Setting

$\mathbb{P} = \mathbb{K}[F_{i,k} : 1 \leq i \leq r,\ k \in \mathbb{N}],$

any expression in $\mathcal{E}$ may then be regarded as an element of $\mathbb{P}[[z]]$.

For each $\Omega \in \mathcal{E}$, we define a shift $d_\Omega \in \mathbb{Z} \cup \{-\infty\}$ using the following rules:

$d_{F_i} = 0, \qquad d_z = d_c = -\infty \quad (c \in \mathbb{K}),$
$d_{\Omega + \Psi} = d_{\Omega \Psi} = \max(d_\Omega, d_\Psi),$
$d_{\Omega'} = d_\Omega + 1,$
$d_{\int \Omega} = d_\Omega - 1.$

By induction, we have

$\Omega_k \in \mathbb{K}[F_{i,j} : 1 \leq i \leq r,\ j \leq k + d_\Omega]$

for all $\Omega \in \mathcal{E}$ and $k \in \mathbb{N}$.
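The rules above are directly effective. As an illustration (our own sketch, with expressions encoded as nested tuples, which is not the representation used by the paper), the shift $d_\Omega$ can be computed by a straightforward recursion:

    NEG_INF = float('-inf')            # stands for d = -infinity

    def shift(omega):
        """d_Omega, so that Omega_k only depends on the F_j with j <= k + d_Omega."""
        op = omega[0]
        if op == 'F':                  # an indeterminate series F_i
            return 0
        if op in ('z', 'const'):       # the variable z or a constant in K
            return NEG_INF
        if op in ('+', '*'):           # ring operations: take the maximum
            return max(shift(arg) for arg in omega[1:])
        if op == 'diff':               # differentiation: d + 1
            return shift(omega[1]) + 1
        if op == 'int':                # integration: d - 1
            return shift(omega[1]) - 1
        raise ValueError('unknown operation ' + repr(op))

    # For Omega = F_1' * F_2 + int(F_1), we get max(0 + 1, 0, 0 - 1) = 1.
    print(shift(('+', ('*', ('diff', ('F', 1)), ('F', 2)), ('int', ('F', 1)))))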

Let $\Phi = (\Phi_1, \ldots, \Phi_r)$ be a column vector of expressions in $\mathcal{E}$. We will assume that $\Phi$ depends on each of the indeterminates $F_1, \ldots, F_r$. Given $C_0, \ldots, C_{l-1} \in \mathbb{K}^r$, consider the implicit system with initial conditions

$\Phi(F) = 0, \qquad F_0 = C_0,\ \ldots,\ F_{l-1} = C_{l-1}.$  (7)

For any $k \in \mathbb{N}$, this system implies the following system $\mathcal{E}_k$ of equations in the coefficients $F_0, \ldots, F_k$:

$\Phi_j(F)_i = 0 \qquad (1 \leq j \leq r,\ i + d_{\Phi_j} \leq k).$  (8)

The system (7) is equivalent to the systems $\mathcal{E}_k$ for all $k$, together with the initial conditions $F_0 = C_0, \ldots, F_{l-1} = C_{l-1}$.

In what follows, we will assume that (7) admits a unique solution $\tilde F \in \mathbb{K}[[z]]^r$. Given $\Omega \in \mathcal{E}$ and $m \in \mathbb{N}$, we will denote by $\Omega^{[m]}$ the series in $\mathbb{P}[[z]]$ such that $\Omega^{[m]}_k$ is the result of the substitution of $F_{i,j}$ by $\tilde F_{i,j}$ in $\Omega_k$, for all $i$ and $j < m$. If, for all $m$, there exists an $n_m$ such that $\Phi_j(F)^{[m]}_k$ is linear in the remaining unknowns for all $k \geq n_m$ and $j \in \{1, \ldots, r\}$, then we say that (7) is ultimately linear. In that case, the substituted system $\mathcal{E}_k^{[m]}$ becomes a linear system of equations in the remaining unknowns. More generally, given $d \in \mathbb{N}$, the combined system

$\mathcal{S}_{k,d} = \mathcal{E}_{k+d}^{[k]} \setminus \mathcal{E}_{k-1}^{[k]}$  (9)

is a linear system of equations in $F_k, \ldots, F_{k+d}$ for all sufficiently large $k$. If (9) is linear and $F_k$ can be eliminated from (9) for all sufficiently large $k$, then we say that (7) is quasi-linear. The minimal $d$ for which $F_k$ can be eliminated from (9) for all sufficiently large $k$ will then be called the index of (7). The minimal $n_0$ such that $F_k$ can be eliminated from (9) for all $k \geq n_0$ will be called the offset.

Example 4. Let $\Phi$ be algebraic, $l = 1$, and assume that the constant matrix

$J = J_\Phi(C_0),$

where $J_\Phi$ stands for the Jacobian matrix associated to $\Phi$, is invertible. Then for all $k \geq 1$, we have

$\Phi(F)_k = J F_k + P_k,$

with $P_k \in \mathbb{K}[F_{i,j} : j < k]$. Hence, $F_k$ can be computed from the previous coefficients using

$F_k = -J^{-1} P_k(\tilde F_0, \ldots, \tilde F_{k-1}).$

The system (9) for $d = 0$ consists of the equation $J F_k + P_k(\tilde F_0, \ldots, \tilde F_{k-1}) = 0$,

from which $F_k$ can be eliminated. We conclude that (7) is quasi-linear, of index $0$.

4. Relaxed resolution of implicit equations

Consider a quasi-linear system (7) of index $d$, with unique solution $\tilde F$. We want to solve the system in a relaxed way, by computing the systems (9) for increasing values of $k$ and eliminating $F_k$ from (9) using linear algebra. For each subexpression $\Omega$ of $\Phi$, we need to evaluate $\Omega^{[k]}$ in a relaxed way. The main challenge is to take advantage of the fact that $\Omega^{[k]}_k$ is really a constant plus a linear combination of $F_k, \ldots, F_{k+d}$, and to integrate the necessary substitutions of newly computed coefficients $F_k$ by their values $\tilde F_k$ into the relaxed resolution process.

Denote the inner product of vectors $u, v \in \mathbb{K}^r$ by $\langle u, v \rangle = u_1 v_1 + \cdots + u_r v_r$. For each $m \in \mathbb{N}$, let $L_m$ be the subset of $\mathbb{P}$ of all elements $a$ such that

$a = c_a + \langle a_m, F_m \rangle + \cdots + \langle a_{m+d}, F_{m+d} \rangle$  (10)

for suitable $c_a \in \mathbb{K}$ and $a_m, \ldots, a_{m+d} \in \mathbb{K}^r$. Then $L_m$ is a $\mathbb{K}$-vector space and

$\dim_{\mathbb{K}} L_m = 1 + (d + 1)\, r.$

Given $a \in L_m$ with (10), we define the one-step substitution $\sigma a \in L_{m+1}$ by

$\sigma a = (c_a + \langle a_m, \tilde F_m \rangle) + \langle a_{m+1}, F_{m+1} \rangle + \cdots + \langle a_{m+d}, F_{m+d} \rangle.$

In particular, $\sigma^i a \in L_{m+i}$ for all $i$, and the iterate $\sigma^{d+1} a \in \mathbb{K}$ coincides with the full substitution of $F_m, \ldots, F_{m+d}$ by $\tilde F_m, \ldots, \tilde F_{m+d}$ in $a$.

It remains to be shown how to multiply two series $p$ and $q$ with $p_k, q_k \in L_k$ for all $k$ in a suitable way, without introducing quadratic terms in the unknowns. Given such a series $p$, it will be convenient to introduce shift operators

$(S p)_k = p_{k+1}, \qquad (\sigma p)_k = \sigma(p_k).$

We recursively define the substitution product $p \odot q$ of $p$ and $q$ by

$p \odot q = p_0 \odot q + z\, \big( (S p) \odot (\sigma q) \big), \qquad (p_0 \odot q)_k = \sigma^k(p_0)\, q_k,$  (11)

using the fact that $\sigma^k(p_0) \in \mathbb{K}$ for all $k > d$. Unrolling (11), we have

$(p \odot q)_k = \sum_{i+j=k} \sigma^j(p_i)\, \sigma^i(q_j).$  (12)

The substitution product satisfies the important property

$\sigma^{\infty}(p \odot q) = \sigma^{\infty}(p)\, \sigma^{\infty}(q),$

where $\sigma^{\infty}$ stands for the full substitution of all unknowns by their values. Moreover, it respects the constraint that $(p \odot q)_k$ can be computed as soon as $p_i$ and $q_i$ are known for $i \leq k$. Recall that the computation of $\sigma^j(p_i)$ requires the previous computation of the values $\tilde F_i, \ldots, \tilde F_{i+j-1}$.

From the implementation point of view, we proceed as follows. We introduce a new data type lin, whose instances are of the form

$a = c_a + \langle a_m, F_m \rangle + \cdots + \langle a_{m+d}, F_{m+d} \rangle,$

where $F$ stands for the relaxed power series solution of (7). Such an instance represents the value in $\mathbb{K}$ which is obtained after the substitution of the still unknown coefficients $F_m, \ldots, F_{m+d}$ by their values. Denoting by lin$_m$ the subtype of instances in lin with (10), we may thus view series with $k$-th coefficients in $L_k$ as relaxed series with coefficients of type lin. We have a natural inclusion of $\mathbb{K}$ into lin, where we notice that the index $m$ does not matter if $a_m = \cdots = a_{m+d} = 0$, and a constructor for the unknown coefficients $F_k$. The $\mathbb{K}$-vector space operations on lin are implemented in a straightforward way. The one-step substitution operator is implemented by

$\sigma a = (c_a + \langle a_m, \tilde F_m \rangle) + \langle a_{m+1}, F_{m+1} \rangle + \cdots + \langle a_{m+d}, F_{m+d} \rangle$

if the value $\tilde F_m$ is already available and by raising an error otherwise. For a fixed index $d$, this allows us to implement the substitution product using (11). Moreover, by casting $\sigma^{d+1} p$ and $\sigma^{d+1} q$ to relaxed series in $\mathbb{K}[[z]]$, we may compute the bulk of the substitution product using a fast relaxed product in $\mathbb{K}[[z]]$.
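As a concrete (and much simplified) illustration of this data type, the following Python sketch represents an element $c + \sum_j \lambda_j F_j$ for $r = 1$ by a constant and a dictionary of unknowns, with the one-step substitution as the basic operation; this is our own toy rendering, not the Mathemagix type:

    from fractions import Fraction

    class Lin:
        """c + sum(lam[j] * F_j) over a finite window of unknown coefficients."""
        def __init__(self, c=Fraction(0), lam=None):
            self.c, self.lam = c, dict(lam or {})
        def __add__(self, other):
            lam = dict(self.lam)
            for j, v in other.lam.items():
                lam[j] = lam.get(j, Fraction(0)) + v
            return Lin(self.c + other.c, lam)
        def scale(self, a):            # scalar multiplication by a constant in K
            return Lin(a * self.c, {j: a * v for j, v in self.lam.items()})
        def subst(self, m, value):     # one-step substitution F_m := value
            lam = dict(self.lam)
            coeff = lam.pop(m, Fraction(0))
            return Lin(self.c + coeff * value, lam)

    # (2 + 3 F_5 + F_6) with F_5 := 1/2 becomes 7/2 + F_6.
    a = Lin(Fraction(2), {5: Fraction(3), 6: Fraction(1)})
    b = a.subst(5, Fraction(1, 2))
    print(b.c, b.lam)                  # 7/2 {6: Fraction(1, 1)}

We are now in a position to state our relaxed algorithm for solving (7).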

Class solution ▷ series

Fields $\Phi$: vector of expressions, $k$: ℕ, $\mathcal{S}$: system of linear equations over lin

Constructor solution ($\Phi$~, $C_0, \ldots, C_{l-1}$)

$\Phi := \Phi$~, $k := 0$ and $\mathcal{S} := \{F_0 - C_0 = 0, \ldots, F_{l-1} - C_{l-1} = 0\}$

Method next ()

While true

If $k > n + d$ then raise an error

$\mathcal{S} := \mathcal{S} \cup \{\text{coefficient}(\Phi_j(F), i) = 0 : 1 \leq j \leq r,\ i + d_{\Phi_j} = k\}$ and $k := k + 1$

Triangularize $\mathcal{S}$ by eliminating the unknowns $F_{i,j}$ with large $j$ first

If $\mathcal{S}$ contains an equation $c = 0$ with $c \in \mathbb{K} \setminus \{0\}$ then raise an error

If $\mathcal{S} = \mathcal{S}_n \amalg \mathcal{S}'$ where $\mathcal{S}_n$ determines $F_n$ and $\mathcal{S}'$ only involves $F_{n+1}, F_{n+2}, \ldots$ then

Let $\tilde F_n$ be the unique solution to $\mathcal{S}_n$ as a system in $F_n$

Let $\mathcal{S} := \mathcal{S}'$ and substitute $\tilde F_n$ for $F_n$ in the coefficients of all auxiliary series

Return $\tilde F_n$

The following subalgorithm is used for the symbolic construction of the unknown series $F$:

Class unknown ▷ series

Fields $i$: $\{1, \ldots, r\}$, $\varphi$: solution

Constructor unknown ($i$~, $\varphi$~)

$i := i$~, $\varphi := \varphi$~

Method next ()

If the value $\tilde F_{i,n}$ has already been computed then return $\tilde F_{i,n}$

Else return the symbolic unknown $F_{i,n}$, as an instance of lin

These algorithms require a few comments. First of all, we denoted by $\Phi(F)$ the replacement of the indeterminates by the unknown series in the expression $\Phi$, modulo suitable implicit conversions from $\mathbb{K}$ into lin. Throughout the algorithm, the set $\mathcal{S}$ stands for the current system of “not yet solved equations”. Each equation is represented by an instance of lin which only involves unknowns $F_j$ with $j \geq n$. New equations are progressively added, while keeping $\mathcal{S}$ in a triangular form. As soon as $F_n$ can be eliminated from $\mathcal{S}$, the next coefficient $\tilde F_n$ can be computed.

Theorem 5. Let (7) be an equation of index $d$ and offset $n_0$. Then the above algorithm for the resolution of (7) is correct. If $\Phi$ involves $s$ multiplications and is of size $t$ as an expression, then $\tilde F_0, \ldots, \tilde F_{n-1}$ are computed in time

$s\, \mathsf{R}(n) + O\big( (s\, d^2\, r + t\, d\, r + d^2\, r^3)\, n \big).$

Proof. By construction, just after the update of $\mathcal{S}$ and $k$, the system $\mathcal{S}$ is equivalent to the combined system (9) between the stages $n$ and $k - 1$. If $n \geq n_0$ and $k - 1 \geq n + d$, we may thus eliminate $F_n$ from the system and compute $\tilde F_n$. This proves the correctness. As to the complexity bound, we observe that each coefficient of a substitution product in lin amounts to $O(d)$ scalar multiplications in lin and one coefficient of a relaxed product in $\mathbb{K}[[z]]$. Similarly, each linear operation (addition, subtraction, derivation and integration) on a coefficient in lin amounts to $O(d\, r)$ similar operations in $\mathbb{K}$. If $\mathcal{S}$ is a triangular system of size $O(d\, r)$ and we add $r$ new rows, then the triangularization of the new system can be done in time $O(d^2\, r^3)$.

Remark 6. In practice, the user does not necessarily have any a priori bound for the index of (7). Instead of assuming a fixed index $d$ for the substitution products, it is actually possible to automatically increase $d$ whenever a coefficient $\Omega^{[k]}_k$ involves an unknown $F_j$ with $j > k + d$, for some subexpression $\Omega$ of $\Phi$.

Remark 7. A prototype of the algorithm has been implemented in the Mathemagix system. For various systems of low index with rational coefficients, we have compared the time needed for the computation of the solution up to order $n$ with the time needed for the mere verification of this solution. For small orders $n$, we observed a non negligible overhead with respect to verification. For large orders $n$, the observed ratios were in correspondence with the asymptotic complexity bound.

5. A worked example

Consider the system

It is not hard to find the unique explicit solution of this system. Indeed,

whence . Since , it follows that . Plugging this into the first equation , we get , whence and .

During the computations below, we will see that the system is quasi-linear of index . Denoting the relaxed solution by , we will have to compute , and the series , , .

Initialization
At the very start, we have .

Step 1
The evaluations of and yield

These relations do not yet enable us to determine and .

Step 2
The evaluations of and yield

After triangularization, we get

The two last equations imply and .

Step 3
Evaluations of and and triangularization of yield

From the equations and , we get .

Further steps
For , the evaluations of and yield

After triangularization, we thus get

Consequently, .
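Although the original system of this section cannot be restated here, the resolution process can also be illustrated on a hypothetical toy example of our own: the scalar implicit equation $f^2 - f + z = 0$ with $f_0 = 0$, which is quasi-linear of index $0$, since stage $k$ contributes the linear equation $(2 f_0 - 1)\, f_k + c_k = 0$ in the single unknown $f_k$, where $c_k$ only involves previously computed coefficients. A short Python sketch:

    from fractions import Fraction

    def solve_implicit(n):
        """Solve f^2 - f + z = 0 with f_0 = 0, coefficient by coefficient."""
        f = [Fraction(0)]              # initial condition f_0 = 0
        for k in range(1, n):
            # known part of Phi(f)_k: c_k = sum_{0<i<k} f_i f_{k-i} + [k == 1]
            c = sum(f[i] * f[k - i] for i in range(1, k)) + (1 if k == 1 else 0)
            # eliminate f_k from (2 f_0 - 1) f_k + c_k = 0; here 2 f_0 - 1 = -1
            f.append(-c / (2 * f[0] - 1))
        return f

    print(solve_implicit(6))           # f = z + z^2 + 2 z^3 + 5 z^4 + 14 z^5 + ...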

6. Symbolic linearization

Assume that the system (7) is quasi-linear of index $d$. Given a subexpression $\Omega$ of $\Phi$ and an integer $i$, we claim that the coefficient of the unknown $F_{k+i}$ in $\Omega^{[k]}_k$ (and which corresponds to $a_{k+i}$ in (10)) is a rational function in $k$, for sufficiently large $k$. There are two ways to see this.

Let $\mathcal{R}$ denote the set of expressions $\Omega \in \mathcal{E}$, such that there exist vectors of rational functions $\omega_{\Omega,i} \in \mathbb{K}(k)^r$, only a finite number of which are non zero, and a sequence $(c_{\Omega,k})$ of constants with

$\Omega^{[k]}_k = c_{\Omega,k} + \sum_{i \geq 0} \langle \omega_{\Omega,i}(k), F_{k+i} \rangle$  (13)

for all sufficiently large $k$. In other words,

$\omega_{\Omega,i}(k) = a_{k+i}$

for the decomposition (10) of $\Omega^{[k]}_k$ and sufficiently large $k$. We also define $\omega_{\Omega,i}(k)$ for $i < 0$, as the coefficient of $F_{k+i}$ in $\Omega^{[k+i]}_k$. We clearly have $F_1, \ldots, F_r \in \mathcal{R}$ and $z, c \in \mathcal{R}$ for all $c \in \mathbb{K}$. Assume that $\Omega, \Psi \in \mathcal{R}$. Then $\Omega + \Psi$, $\Omega \Psi$, $\Omega'$ and $\int \Omega$ again belong to $\mathcal{R}$, and we may explicitly compute the corresponding rational functions using

$\omega_{\Omega + \Psi, i}(k) = \omega_{\Omega,i}(k) + \omega_{\Psi,i}(k),$
$\omega_{\Omega \Psi, i}(k) = \sum_{q \geq 0} \big( \tilde{\Psi}_q\, \omega_{\Omega, i+q}(k - q) + \tilde{\Omega}_q\, \omega_{\Psi, i+q}(k - q) \big),$
$\omega_{\Omega', i}(k) = (k + 1)\, \omega_{\Omega, i-1}(k + 1),$
$\omega_{\int \Omega, i}(k) = \omega_{\Omega, i+1}(k - 1) / k,$

where $\tilde{\Omega} = \Omega(\tilde F)$ and $\tilde{\Psi} = \Psi(\tilde F)$, and where we use that, for large $k$, at most one factor in each term of a product contributes a non constant part.

If $\Omega$ is a polynomial, then we notice that $\omega_{\Omega,i}(k) \in \mathbb{K}^r$ for all $i$ and $k$. If $\Omega$ is a differential polynomial of order $\rho$, then $\omega_{\Omega,i}$ is a polynomial in $k$ of degree $\leq \rho$. In general, the degrees of the numerator and denominator of $\omega_{\Omega,i}$ are bounded by the maximal number of nested differentiations resp. integrations occurring in $\Omega$.

An alternative way to determine the $\omega_{\Omega,i}$ is to consider $F = \tilde F + \varepsilon E$ as a perturbation of the solution $\tilde F$ and perform a Taylor series expansion

$\Omega(\tilde F + \varepsilon E) = \Omega(\tilde F) + \varepsilon\, \mathrm{D}\Omega(\tilde F)(E) + O(\varepsilon^2).$

The coefficients $\omega_{\Omega,i}$ can then be read off from the linear term using

$\omega_{\Omega,i}(k) = \text{the coefficient of } E_{k+i} \text{ in } \big( \mathrm{D}\Omega(\tilde F)(E) \big)_k.$  (14)

For instance, consider the expression (with $r = 1$)

$\Omega = F F'.$

Then we have

$\mathrm{D}\Omega(\tilde F)(E) = \tilde F' E + \tilde F E',$

with $\omega_{\Omega,i}(k) = (k + 1)\, \tilde F_{1-i}$ if $i \leq 1$ and $\omega_{\Omega,i} = 0$ otherwise.

A first theoretical consequence of our ability to compute symbolic expressions for the coefficients $\omega_{\Omega,i}$ is the following:

Theorem 8. There exists an algorithm which takes a quasi-linear system (7) on input and computes its index and its offset.

Proof. The system (9) can be rewritten as a matrix-vector equation

$M_k\, (F_{k+d}, \ldots, F_k)^{\top} = b_k.$  (15)

Here $(F_{k+d}, \ldots, F_k)^{\top}$ is a column vector with the unknowns ordered with the highest index first, and $b_k$ has entries in $\mathbb{K}$. The entries of the matrix $M_k$ are coefficients of the form $\omega_{\Phi_j, i}(k)$. In particular, we can compute a matrix $M(k)$ with entries in $\mathbb{K}(k)$ such that the matrix $M_k$ is given by the specialization of $M(k)$ at $k$, for sufficiently large $k$.

Let $T(k)$ be the symbolic triangularization of $M(k)$. For sufficiently large $k$, the triangularization $T_k$ of $M_k$ coincides with the specialization of $T(k)$ at $k$. Now $F_k$ may be eliminated from the equation (15) if and only if the last non zero rows and the last $r$ columns of $T_k$ form an invertible triangular matrix. This is the case for all sufficiently large $k$ if and only if the last non zero rows and the last $r$ columns of $T(k)$ form an invertible triangular matrix over $\mathbb{K}(k)$. We compute the index of (7) as being the smallest $d$ for which this is the case.

As to the offset, we first observe that we may explicitly compute an $n_1$ such that $T_k$ coincides with the specialization of $T(k)$ at $k$ and $F_k$ can be eliminated from (15), for all $k \geq n_1$, since the values of $k$ for which these properties fail are roots of a polynomial with coefficients in $\mathbb{K}$. Using the algorithm from section 4, we may compute the solution up to any given order $n$. We thus compute the offset as being the smallest $n_0$ such that $F_k$ can be eliminated from (15) for all $k \geq n_0$, which it suffices to check for $k \leq n_1$.

Example 9. For the example from section 5, the equation (15) becomes

Remark 10. An interesting theoretical question is how to test whether a given system is quasi-linear. The proof technique of the theorem at least gives us a partial answer: if a system is quasi-linear, then we will be able to provide a certificate for this fact. Most systems encountered in practice are indeed quasi-linear, provided that we have sufficiently many initial conditions. In the contrary case, it is usually possible to find a simpler equivalent quasi-linear system. It would be nice to prove a general theorem in this direction.

A second consequence of our ability to compute symbolic expressions for the $\omega_{\Omega,i}$ is that we can avoid the systematic computation of the coefficients of the linear forms using arithmetic in lin: we rather compute the $\omega_{\Omega,i}(k)$ on demand, by evaluating $\omega_{\Omega,i}$ at $k$. The coefficients are essentially needed at two places: for the computation of substitution products and for the computation of the system (15). Let $e$ be the cost of an evaluation of one rational function $\omega_{\Omega,i}$ at a point: if $\Omega$ is a polynomial, then $e = O(1)$; if $\Omega$ is a differential polynomial, then $e$ is its order plus one; etc.

When computing the $\omega_{\Omega,i}(k)$ by evaluating $\omega_{\Omega,i}$ at $k$, the computation of one coefficient of a one-step substitution amounts to $O(d\, r)$ evaluations of rational functions of the form $\omega_{\Omega,i}$. Consequently, every substitution product amounts to a cost $O(e\, d\, r\, n)$ in the final complexity.

As to the computation of the coefficients $\tilde F_k$ themselves, we may compute the system (15) using $O(d^2 r^2)$ evaluations of cost $e$ and then solve a linear system of size $O(d\, r)$. This gives rise to a cost $O((e\, d^2 r^2 + d^2 r^3)\, n)$ in the final complexity. Alternatively, as a side effect of the symbolic triangularization of $M(k)$, with the notations from (15), we may compute a symbolic matrix $N(k)$ such that $(F_{k+d}, \ldots, F_k)^{\top} = N(k)\, b_k$ for all sufficiently large $k$. If the system (7) is algebraic, then $N(k)$ actually has constant coefficients, so the complexity of this part further reduces to $O(d^2 r^2 n)$. In general, the evaluation of $N(k)$ will be more expensive, so it is not clear whether this strategy pays off. Altogether, we have proved:

Theorem 11. Let (7) be an equation of index $d$, of total size $t$, and containing at most $s$ multiplications. Assume that the equations involve strictly less than $\rho$ nested derivations or integrations. Then $\tilde F_0, \ldots, \tilde F_{n-1}$ can be computed in time

$s\, \mathsf{R}(n) + O\big( (s\, d\, r\, \rho + t + d^2\, r^2\, (\rho + r))\, n \big).$

If (7) is algebraic, then the complexity further drops down to

$s\, \mathsf{R}(n) + O\big( (s\, d\, r + t + d^2\, r^2)\, n \big).$
Remark 12. The algorithm from this section has not been implemented yet. The new complexity bound constitutes an improvement mainly in the algebraic case. Since the manipulation of non constant coefficients in lin gives rise to a small but non negligible amount of overhead for small and moderate orders $n$ (see remark 7), we indeed expect further gains in this case.

7. Generalizations

For simplicity, the presentation of this paper has been focused on ordinary differential equations. Nevertheless, the techniques admit generalizations in several directions. We will outline two such generalizations.

Functional equations
Let $\mathcal{G}$ be a set of relaxed power series $g \in \mathbb{K}[[z]]$ with $g_0 = 0$, and replace $\mathcal{E}$ by the set of expressions built up from $F_1, \ldots, F_r$, $z$ and constants in $\mathbb{K}$ using ring operations, differentiation, integration and right composition with series in $\mathcal{G}$.

Assume first that $u_g := g_1 \neq 0$ for all $g \in \mathcal{G}$. In a similar way as in section 6, there exists a symbolic expression of the form (13) for each $\Omega \in \mathcal{E}$, except that we now have

$\omega_{\Omega, i} \in \mathbb{K}\big(k, u_{g_1}^k, \ldots, u_{g_m}^k\big),$  (16)

where $g_1, \ldots, g_m$ are the functions which occur as postcomposers in $\Omega$. In particular, if $\mathbb{K} \subseteq \mathbb{R}$ and $u_{g_1}, \ldots, u_{g_m} > 0$, then the $\omega_{\Omega,i}$, regarded as functions of $k$, are contained in a Hardy field, and theorem 8 generalizes.

The above observation further generalizes to the case when $u_g = 0$ for certain $g \in \mathcal{G}$. In non degenerate cases, expressions $\Omega \circ g$ with $u_g = 0$ only occur as perturbations, and (16) still holds. In general, we also have to consider degenerate situations, such as the case when the dominant part of the equation only involves the postcomposition $F_j \circ g$, for a certain $g \in \mathcal{G}$ and all $j$.

One may even consider functional equations in which we also allow postcompositions with general expressions $\Omega \in \mathcal{E}$ with $\Omega_0 = 0$. Although the theory from section 6 becomes more and more intricate, the algorithm from section 4 generalizes in a straightforward way.

Partial differential equations
We may also consider power series in several variables $z_1, \ldots, z_p$. Given a multivariate power series $f \in \mathbb{K}[[z_1, \ldots, z_p]]$ and $k \in \mathbb{N}^p$, we denote by $f_k$ the coefficient of $z^k = z_1^{k_1} \cdots z_p^{k_p}$ in $f$, and let $F_k = (F_{1,k}, \ldots, F_{r,k})$. The expressions in $\mathcal{E}$ are built up from $F_1, \ldots, F_r$, $z_1, \ldots, z_p$ and constants in $\mathbb{K}$ using ring operations and partial differentiation or integration with respect to the $z_i$. The shift $d_\Omega$ now becomes a vector in $\mathbb{Z}^p$.

The simplest, blockwise generalization proceeds as follows. Indices are compared using the product ordering: $k \leq k'$ if and only if $k_1 \leq k'_1, \ldots, k_p \leq k'_p$. Given $\Omega \in \mathcal{E}$ and $m \in \mathbb{N}^p$, we let $\Omega^{[m]}$ be the series such that $\Omega^{[m]}_k$ is the result of the substitution of $F_{i,j}$ by $\tilde F_{i,j}$ in $\Omega_k$, for all $i$ and all $j \not\geq m$. Given $m \in \mathbb{N}^p$ and $d \in \mathbb{N}^p$, we let $L_m$ be the subset of all polynomials of the form

$a = c_a + \sum_{m \leq j \leq m + d} \langle a_j, F_j \rangle.$  (17)

For each $i \in \{1, \ldots, p\}$, let $\epsilon_i \in \mathbb{N}^p$ be such that $(\epsilon_i)_i = 1$ and $(\epsilon_i)_{i'} = 0$ for $i' \neq i$. We define the one-step partial substitution $\sigma_i a \in L_{m + \epsilon_i}$ by substituting the values $\tilde F_j$ for the unknowns $F_j$ with $j_i = m_i$ in $a$. The partial shifts $S_i$ are defined similarly, and we denote by $\sigma_i^{d_i + 1} a$ the full substitution in the $i$-th direction. The substitution product $f \odot g$ is defined recursively. If $p = 1$, then we set $f \odot g$ to be the substitution product from section 4. Otherwise, we let $i$ be smallest such that $d_i$ is maximal and take the substitution product (11) with respect to the variable $z_i$, using the recursive substitution product in the remaining variables for the multiplication of the coefficients.

Using this substitution product, the algorithm from section 4 generalizes. The theory from section 6 can also be adapted. However, theorem 8 admits no simple analogue, due to the fact that there is no algorithm for determining the integer roots of a system of multivariate polynomials.

Several variants are possible, depending on the application. For instance, it is sometimes possible to consider only the unknowns $F_j$ up till a certain total degree $j_1 + \cdots + j_p \leq \nu$ in (17), instead of a block of coefficients. For some applications, it may also be interesting to store the coefficients $a_j$ in a sparse vector.

Bibliography

[BCO+06]

A. Bostan, F. Chyzak, F. Ollivier, B. Salvy, É. Schost, and A. Sedoglavic. Fast computation of power series solutions of systems of differential equations. Preprint, April 2006. Submitted, 13 pages.

[BK78]

R.P. Brent and H.T. Kung. Fast algorithms for manipulating formal power series. Journal of the ACM, 25:581–595, 1978.

[CK91]

D.G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Informatica, 28:693–701, 1991.

[CT65]

J.W. Cooley and J.W. Tukey. An algorithm for the machine calculation of complex Fourier series. Math. Computat., 19:297–301, 1965.

[SS71]

A. Schönhage and V. Strassen. Schnelle Multiplikation grosser Zahlen. Computing, 7:281–292, 1971.

[vdH97]

J. van der Hoeven. Lazy multiplication of formal power series. In W. W. Küchlin, editor, Proc. ISSAC '97, pages 17–20, Maui, Hawaii, July 1997.

[vdH02a]

J. van der Hoeven. Relax, but don't be too lazy. JSC, 34:479–542, 2002.

[vdH+02b]

J. van der Hoeven et al. Mathemagix, 2002. http://www.mathemagix.org.

[vdH06]

J. van der Hoeven. Newton's method and FFT trading. Technical Report 2006-17, Univ. Paris-Sud, 2006. Submitted to JSC.

[vdH07]

J. van der Hoeven. New algorithms for relaxed multiplication. JSC, 42(8):792–802, 2007.