Relaxed resolution of implicit equations

The technique of relaxed power series expansion provides an efficient way to solve equations of the form , where the unknown is a vector of power series, and where the solution can be obtained as the limit of the sequence . With respect to other techniques, such as Newton's method, two major advantages are its generality and the fact that it takes advantage of possible sparseness of . In this paper, we extend the relaxed expansion mechanism to more general implicit equations of the form .

Keywords: Implicit equation, relaxed power series, algorithm

A.M.S. subject classification: 68W25, 42-04, 68W30, 30B10, 33F05, 11Y55

1.Introduction

Let

be an effective field of constants of characteristic zero. Let

be a column vector of

indeterminate series in

. We may also consider

as a power series

. Let

be a column vector of expressions built up from

and constants in

using ring operations, differentiation and integration (with constant term zero). Finally, let

be a finite number of initial conditions. Assume that the system

admits a unique solution

. In this paper, we are interested in the efficient computation of this solution up to a given order

where the coefficient

only depends on earlier coefficients

, for each

. In that case,

actually provides us with a recurrence relation for the computation of the solution. Using the technique of relaxed power series expansions [vdH02a, vdH07], which will briefly be recalled in section 2, it is then possible to compute the expansion

up till order

in time

where

is the number of multiplications occurring in

, where is

the total size of

as an expression, and

denotes the complexity of relaxed multiplication of two power series up till order

. Here we assume that

is represented by a directed acyclic graph, with possible common subexpressions. For large

, we have

, where

denotes the complexity [CT65, SS71, CK91] of multiplying two polynomials of degrees

. If

admits sufficiently many

-th roots of unity, then we even have

and

. For moderate

, when polynomial multiplication is done naively or using Karatsuba's method, relaxed multiplication is as efficient as the truncated multiplication of polynomials at order

One particularly important example of an equation of the above type is the integration of a dynamical system

where

is algebraic (i.e. does not involve differentiation or integration). In that case, given the solution

up till order

, we may consider the linearized system

up till order

, where

stands for the Jacobian matrix associated to

. If we also have a fundamental system of solutions of

up till order

, then one step of Newton's method allows us to find the solution of (4) and a new fundamental system of solutions of the linearized equation up till order

[BK78, BCO+06]. A careful analysis shows that this leads to an algorithm of time complexity

under the assumptions that

admits sufficiently many

-th roots of unity and that

Although the complexity (5) is asymptotically better than (3) for very large

, the relaxed approach often turns out to be more efficient in practice. Indeed, Newton's method both suffers from a larger constant factor and the fact that we profit less from the potential sparsity of the system. Moreover, as long as multiplications are done in the naive or Karatsuba model, the relaxed approach is optimal in the sense that the computation of the solution takes roughly the same time as its verification. Another advantage of the relaxed approach is that it generalizes to more general functional equations and partial differential equations.

Let us now return to our original implicit system (1). A first approach for its resolution is to keep differentiating the system with respect to

until it becomes equivalent to a system of the form (2). For instance, if

is algebraic, then differentiation of (1) yields

provides us with an equivalent system which can be solved by one of the previous methods. Unfortunately, this method requires the computation of the Jacobian, so we do not longer exploit the potential sparsity of the original system.

is a system of differentially algebraic equations, then we may also seek to apply Newton's method. For non degenerate systems and assuming that we have computed the solution

and a fundamental system of solutions for the linearized equation up till order

, one step of Newton's method yields an extension of the solutions up till order

, for a fixed constant

. From an asymptotic point of view, this means that the complexities (5) and (6) remain valid.

It is natural to ask for a relaxed algorithm for the resolution of (1), with a similar complexity as (3). We will restrict our attention to so-called “quasi-linear equations”, for which the linearized system is “non degenerate”. This concept will be introduced formally in section 3 and studied in more detail in section 6. In section 4, we present the main algorithm of this paper for the relaxed resolution of (1).

The idea behind the algorithm is simple: considering not yet computed coefficients of

as formal unknowns, we solve the system of equations

for increasing values of

. In particular, the coefficients of the power series involved in the resolution process are no longer in

, but rather polynomials in

. For each subexpression

and modulo adequate substitution of known coefficients

by their values

, it turns out that there exist constants

and

, such that

is a constant plus a linear combination of

, for large

. Moreover, each relaxed multiplication with symbolic coefficients can be reduced to a relaxed multiplication with constant coefficients and a finite number of scalar multiplications with symbolic coefficients. The main result is stated in theorem 5 and generalizes the previous complexity bound (3).

In section 6, we provide a more detailed study of the linearized system associated to (1). This will allow us to make the dependency of

more explicit. On the one hand, given a quasi-linear system on input, this will enable us to provide a certificate that the system is indeed quasi-linear. On the other hand, the asymptotic complexity bounds can be further sharpened in lucky situations (see theorem 11). Finally, in the last section 7, we outline how to generalize our approach to more general functional equations and partial differential equations.

2.Relaxed power series

Throughout this article,

will denote an effective field of characteristic zero. This means that elements in

can be encoded by data structures on a computer and that we have algorithms for performing the field operations in

Let us briefly recall the technique of relaxed power series computations, which is explained in more detail in [vdH02a]. In this computational model, a power series

is regarded as a stream of coefficients

. When performing an operation

on power series it is required that the coefficient

of the result is output as soon as sufficiently many coefficients of the inputs are known, so that the computation of

does not depend on the further coefficients. For instance, in the case of a multiplication

, we require that

is output as soon as

and

are known. In particular, we may use the naive formula

for the computation of

The additional constraint on the time when coefficients should be output admits the important advantage that the inputs may depend on the output, provided that we add a small delay. For instance, the exponential

of a power series

may be computed in a relaxed way using the formula

The main drawback of the relaxed approach is that we cannot directly use fast algorithms on polynomials for computations with power series. For instance, assuming that

has sufficiently many

-th roots of unity and that field operations in

can be done in time

, two polynomials of degrees

can be multiplied in time

, using FFT multiplication [CT65]. Given the truncations

and

at order

of power series

, we may thus compute the truncated product

in time

as well. This is much faster than the naive

relaxed multiplication algorithm for the computation of

. However, the formula for

when using FFT multiplication depends on all input coefficients

and

, so the fast algorithm is not relaxed. Fortunately, efficient relaxed multiplication algorithms do exist:

An efficient C++ implementation of relaxed power series is available in the Mathemagix system [vdH+02b]. Leaving low-level pointer and memory management details apart, we will outline how the implementation works, using an informal pseudo-language. Relaxed power series in

are implemented as an abstract base class

which contains the already computed coefficients and a protected virtual method

for computing the next coefficient. For instance, the naive product of

can be implemented using the following concrete derived class

Let us briefly explain this code. In addition to the vector with the already computed coefficients (which is derived from

), the class

contains two data fields for the multiplicands

and

. The constructor

returns the product of two series

and the method

computes

using the naive relaxed method. The method

does not take care of remembering previously computed coefficients and does not make sure that coefficients are computed in order. Therefore, a different public function

is used for the computation of the coefficients

and

In the case of implicitly defined power series

, the method

involves the series

itself. For instance, exponentiation

can be implemented as follows:

Example 3. Let us expand the exponential of

using the above algorithms. Besides

and

, three auxiliary series

and

are created. Now the computation of

gives rise to the following sequence of assignments:

3.Implicit power series equations

Let

be a column vector of

indeterminate series in

. Alternatively,

may be considered as a series with formal coefficients

We will denote by

the set of expressions built up from

and constants in

using ring operations, differentiation and integration (with

for all

). Setting

Let

be a column vector of

expressions in

. We will assume that

depends on each of the indeterminates

. Given

, consider the implicit system with initial conditions

The system (7) is equivalent to the systems

together with the initial conditions

In what follows, we will assume that (7) admits a unique solution

. Given

and

, we will denote by

the series in

such that

is the result of the substitution of

, for all

and

. If, for all

, there exists an

such that

is linear in

for all

and

, then we say that (7) is ultimately linear. In that case,

becomes a linear system of equations in

. More generally, the combined system

is a linear system of equations in

for all sufficiently large

. If

is linear and

can be eliminated from

for all sufficiently large, then we say that (7) is quasi-linear. The minimal

for which

can be eliminated from

for all sufficiently large

will then be called the index of (7). The minimal

such that

can be eliminated from

for all

will be called the offset.

4.Relaxed resolution of implicit equations

Consider a quasi-linear system (7) of index

with unique solution

. We want to solve the system in a relaxed way, by computing the systems

for increasing values

and eliminating

from

using linear algebra. For each subexpression

, we need to evaluate

in a relaxed way. The main challenge is to take advantage of the fact that

is really a constant plus a linear combination of

and to integrate the necessary substitutions of newly computed coefficients by their values into the relaxed resolution process.

Denote the inner product of vectors by

. For each

and

, let

be the subset of

such that

In particular,

and the the iterate

coincides with the full substitution

It remains to be shown how to multiply two series

and

in a suitable way, without introducing quadratic terms in

. Given

, it will be convenient to introduce shift operators

Moreover, it respects the constraint that

can be computed as soon as

and

are known for

. Recall that the computation of

requires the previous computation of

From the implementation point of view, we proceed as follows. We introduce a new data type

, whose instances are of the form

where

stands for the relaxed power series solution of (7). Such an instance

represents

Denoting by

the subtype of instances

with

, we may thus view series

as elements of

. We have a natural inclusion

, where we notice that

does not matter if

, and a constructor

for the unknown

. The

-vector space operations on

are implemented in a straightforward way. The one-step substitution operator

is implemented by

and

otherwise. On a fixed

, this allows us to implement the substitution product

using (11). Moreover, by casting

and

to relaxed series in

, we may compute the product

using a fast relaxed product in

. We are now in a position to state our relaxed algorithm for solving (7).

Class

Fields , ,

Constructor

and

Method

While true

If then raise an error

and

Triangularize by eliminating with large first

If then raise an error

If where and only involves then

Let be the unique solution to as a system in

Let and substitute for in

Return

The following subalgorithm is used for the symbolic construction of the unknown series

These algorithms require a few comments. First of all, we denoted by

the replacement of

in the expression

, modulo suitable implicit conversions

. Throughout the algorithm, the set

stands for the current system of “not yet solved equations”. Each equation is represented by an instance in

which only involves

with

. New equations are progressively added while keeping

in a triangular form. As soon as

can be eliminated from

, then the next coefficient

can be computed.

Proof. By construction, just after setting

, the system

is equivalent to

from (9). If

and

, we may thus eliminate

from the system

and compute

. This proves the correctness. As to the complexity bound, we observe that the substitution product in

amounts to

scalar products and one relaxed product in

. Similarly, each linear operation (addition, subtraction, derivation and integration) in

amounts to

similar operations in

. If

is a triangular system of size

and we add

new rows, then the triangularization of the new system can be done in time

Remark 6. In practice, the user does not necessarily have any a priori bound for the index of (7). Instead of assuming a fixed index

for the substitution products, it is actually possible to automatically increase

whenever

contains a non constant coefficient for some subexpression

Remark 7. A prototype of the algorithm has been implemented in the Mathemagix system. For various systems of low index with rational coefficients, we have compared the time

to compute the solution up to order

with the time

to verify its correctness. For small orders

, we observed ratios

. For large orders

, we achieved

, in correspondence with the asymptotic complexity bound.

5.A worked example

whence

. Since

, it follows that

. Plugging this into the first equation

, we get

, whence

and

During the computations below, we will see that the system is quasi-linear of index

. Denoting the relaxed solution by

, we will have to compute

and the series

Initialization

Step 1

Step 2

Step 3

Further steps

6.Symbolic linearization

Assume that the system (7) is quasi-linear. Given a subexpression

an integer

and

, we claim that the coefficient

(and which corresponds to

in (10)) is a rational function in

, for sufficiently large

. There are two ways to see this.

Let

denote the set of expressions

, such that for all

there exist vectors of rational functions

and a sequence

with

for

and sufficiently large

. We define

. We clearly have

and

. Assume that

. Then

and we may explicitly compute the corresponding rational functions using

is a polynomial, then we notice that

for all

and

. If

is a differential polynomial of order

, then

is a polynomial in

of degree

. In general, the degrees of the numerator and denominator of

are bounded by the maximal number of nested differentiations resp. integrations occurring in

An alternative way to determine the

is to consider

as a perturbation of the solution and perform a Taylor series expansion

A first theoretical consequence of our ability to compute symbolic expressions for

is the following:

Here

is a column vector with entries

and

. The entries of the matrix

are coefficients of the form

. In particular, we can compute a matrix

such that the matrix

is given by the specialization

for sufficiently large

Let

be the symbolic triangularization of

. For sufficiently large

, the triangularization

coincides with

for

. Now

may be eliminated from the equation (15) if and only if the last

non zero rows and the last

columns of

are an invertible triangular matrix. This is the case for all sufficiently large

if and only if the last

non zero rows and the last

columns of

are an invertible triangular matrix in

. We compute the index of (7) as being the smallest

for which this is the case.

As to the offset, we first observe that we may explicitly compute an

such that

and

for all

, since the values

for which these equations do not hold are roots of a polynomial with coefficients in

. Using the algorithm from section 4, we may compute the solution

up to any given order

. We thus compute the offset as being the smallest

such that

can be eliminated from (15) for all

Remark 10. An interesting theoretical question is how to test whether a given system is quasi-linear. The proof technique of the theorem at least gives us a partial answer: if a system is quasi-linear, then we will be able to provide a certificate for this fact. Most systems encountered in practice are indeed quasi-linear, provided that we have sufficiently many initial conditions. In the contrary case, it is usually possible to find a simpler equivalent quasi-linear system. It would be nice to prove a general theorem in this direction.

A second consequence of our ability to compute symbolic expressions for

is that we can avoid the systematic computation of the coefficients

using arithmetic in

: we rather compute

on demand, by evaluating

. The coefficients

are essentially needed at two places: for the computation of substitution products and for the computation of the system

. Let

be the cost of an evaluation of

: if

is a polynomial, then

; if

is a differential polynomial, then

is its order plus one; etc..

When computing

by evaluating

, the computation of one coefficient of a one-step substitution

amounts to

evaluations of rational functions of the form

. Consequently, every substitution product amounts to a cost

in the final complexity.

As to the computation of a coefficients

, we may compute

as in (15) using

evaluations of cost

and then solving a linear system of size

. This gives rise to a cost

in the final complexity. Alternatively, as a side effect of the triangularization of

with the notations from (15), we may compute a symbolic matrix

such that

for all sufficiently large

. If the system (7) is algebraic, then

actually has constant coefficients, so the complexity further reduces to

. In general, the evaluation of

will be more expensive, so it is not clear whether this strategy pays off. Altogether, we have proved:

Theorem 11. Let (7) be an equation of index of total size and containing at most multiplications. Assume that the equations involve strictly less than nested derivations or integrations. Then can be computed in time

If (7) is algebraic, then the complexity further drops down to

Remark 12. The algorithm from this section has not been implemented yet. The new complexity bound constitutes an improvement mainly in the algebraic case. Since the manipulation of non constant coefficients in

gives rise to a small but non negligible amount of overhead for small and moderate

(see remark 7), we indeed expect further gains in this case.

7.Generalizations

For simplicity, the presentation of this paper has been focused on ordinary differential equations. Nevertheless, the techniques admit generalizations in several directions. We will outline two such generalizations.

Functional equations

Assume first that

for all

. In a similar way as in section 6, there exists a symbolic expression of the form (13) for each

, except that we now have

where

are the functions which occur as postcomposers in

. In particular, if

, then the

are contained in a Hardy field, and theorem 8 generalizes.

The above observation further generalizes to the case when

for certain

. In non degenerate cases, expressions

with

only occur as perturbations, and (16) still holds. In general, we also have to consider degenerate situations, such as the case when

One may even consider functional equations in which we also allow postcompositions with general expressions

with

. Although the theory from section 6 becomes more and more intricate, the algorithm from section 4 generalizes in a straightforward way.

Partial differential equations

The simplest, blockwise generalization proceeds as follows. Indices

are compared using the product ordering

. Given

and

, we let

be the series in

such that

is the result of the substitution of

, for all

and

. Given

and

, we let

be the subset of

such that

For each

, let

be such that

and

for

. We define the one-step partial substitution

The partial shifts

and

are defined similarly and we denote by

the substitution of

for

. The substitution product is defined recursively. If

, then we set

. Otherwise, we let

be smallest such that

is maximal and take

Using this substitution product, the algorithm from section 4 generalizes. The theory from section 6 can also be adapted. However, theorem 8 admits no simple analogue, due to the fact that there is no algorithm for determining the integer roots of a system of multivariate polynomials.

Several variants are possible depending on the application. For instance, it is sometimes possible to consider only the

up till a certain total degree

in (17), instead of a block of coefficients. For some applications, it may also be interesting to store the

in a sparse vector.

Bibliography

[BCO+06]: A. Bostan, F. Chyzak, F. Ollivier, B. Salvy, É. Schost, and A. Sedoglavic. Fast computation of power series solutions of systems of differential equation. preprint, april 2006. submitted, 13 pages.
[BK78]: R.P. Brent and H.T. Kung. Fast algorithms for manipulating formal power series. Journal of the ACM, 25:581–595, 1978.
[CK91]: D.G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Informatica, 28:693–701, 1991.
[CT65]: J.W. Cooley and J.W. Tukey. An algorithm for the machine calculation of complex Fourier series. Math. Computat., 19:297–301, 1965.
[SS71]: A. Schönhage and V. Strassen. Schnelle Multiplikation grosser Zahlen. Computing, 7:281–292, 1971.
[vdH97]: J. van der Hoeven. Lazy multiplication of formal power series. In W. W. Küchlin, editor, Proc. ISSAC '97, pages 17–20, Maui, Hawaii, July 1997.
[vdH02a]: J. van der Hoeven. Relax, but don't be too lazy. JSC, 34:479–542, 2002.
[vdH+02b]: J. van der Hoeven et al. Mathemagix, 2002. http://www.mathemagix.org.
[vdH06]: J. van der Hoeven. Newton's method and FFT trading. Technical Report 2006-17, Univ. Paris-Sud, 2006. Submitted to JSC.
[vdH07]: J. van der Hoeven. New algorithms for relaxed multiplication. JSC, 42(8):792–802, 2007.

			(10)



			(11)


			(12)


			(14)