.

This work has been supported by the ANR-09-JCJC-0098-01 MaGiX project, the Digiteo 2009-36HD grant and Région Ile-de-France.

Reliable homotopy continuation
Third preliminary version

Joris van der Hoeven

LIX, CNRS

École polytechnique

91128 Palaiseau Cedex

France

Email: vdhoeven@lix.polytechnique.fr

Web: http://lix.polytechnique.fr/~vdhoeven

December 27, 2015

In this paper, we present several algorithms for certified homotopy continuation. One typical application is to compute the roots of a zero dimensional system of polynomial equations. We both present algorithms for the certification of single and multiple roots. We also present several ideas for improving the underlying numerical path tracking algorithms, especially in the case of multiple roots.

Keywords: Homotopy continuation, polynomial system solving, ball arithmetic, interval arithmetic, reliable computing

A.M.S. subject classification: 65H20, 65H04, 65G20, 30C15, 13P15

Disclaimer. This paper is a preliminary version, which is mainly intended as a working program for an ongoing implementation in the Mathemagix system. It is expected that several adjustments and corrections will have to be made during the implementation of the algorithms in this paper and their proof reading by colleagues.

1.Introduction

Besides Gröbner basis computations, homotopy methods are a popular technique for solving systems of polynomial equations. In this paper, we will only consider zero dimensional systems. Given such a system

with and , the idea is to find a suitable starting system

of which all solutions are known, to introduce the homotopy

and to compute the solutions to by following the solutions of from to . Two main approaches exist:

Algebraic homotopies

In this setting, the polynomial equations have exact rational or algebraic coefficients. The homotopy continuation is done exactly using suitable resultants. At the end of the homotopy, the solutions of the system are again given exactly, as the solutions of simpler systems. The theory was developed in [GHMP95, GHH+97, Lec01, Dur08] and a concrete implementation is available in the Kronecker system [Lec01].

Numeric homotopies

An alternative approach is to follow the solution paths using a numeric path tracking algorithm; see [Mor87, Ver96, SW05] and references therein. This approach is usually faster, partly because most of the operations can be done at a significantly lower precision. However, the end result is only approximate. In particular, it cannot be used for the reliable resolution of overdetermined systems. Several implementations exist for numeric path tracking [Ver99, BHSW06, Ley09].

It is surprising that little effort has been undertaken so far in order to bring both approaches closer together. Particularly interesting challenges are how to make numeric homotopy as reliable as possible and how to reconstruct exact end results from the numeric output. Part of this situation might be due to the fact that interval analysis [Moo66, AH83, Neu90, JKDW01, Kul08, MKC09, Rum10] is not so well-known in the communities where homotopy methods were developed, with the exception of one early paper [Kea94]. The main objective paper is to systematically exploit interval analysis techniques in the context of homotopy continuation. We will show how to certify homotopy continuations as well as single and multiple solutions of the polynomial system .

Section 3 is devoted to preliminaries from the area of reliable computation. In section 3.1, we start by recalling the basic principles of ball arithmetic [Hoe09], which is a more suitable variant of interval arithmetic for our purposes. In section 3.2, we pursue by recalling the concept of a Taylor model [MB96, MB04], which is useful in order to compute with reliable enclosures of multivariate analytic functions on polydisks. We also introduce a variant of Taylors models in section 3.3, which simultaneously encloses an analytic function and a finite number of its derivatives. In sections 3.4 and 3.5, we discuss the well known problem of overestimation which is inherent to ball arithmetic. We will provide some techniques to analyze, quantify and reduce overestimation.

Before attacking the topic of certified path tracking, it is useful to review the theory of numeric path tracking first. In section 4, we start with the case of non singular paths, in which case we use a classical predictor corrector approach based on Euler-Newton's method. The goal of a numeric path tracker is to advance as fast as possible on the solution path while minimizing the risk of errors. Clearly, the working precision has to be sufficiently large in order to ensure that function evaluations are reasonably accurate. In section 4.4, we show how to find a suitable working precision using ball arithmetic. We consider this approach to be simpler, more robust and more general than the one proposed in [BSHW08]. In order to reduce the risk of jumping from one path to another path, we also need a criterion for checking whether our numeric approximations stay reasonably close to the true solution path. A numerically robust way to do this is to ensure that the Jacobian of does not change to rapidly during each step; see section 4.5 and [BSHW08] for a related approach. Another technique is to detect near collisions of paths and undertake special action in this case; see section 4.6.

In section 5, we turn our attention to homotopies (3) such that the end system (1) admits multiple solutions. We will see that Euler-Newton iterations only admit a linear convergence near multiple solutions. Therefore, it is useful to search for alternative iterations which admit a better convergence. Now the solution path near a multiple solution is given by a convergent Puiseux series in . When letting turn around the origin, we thus fall on another solution path. The collection of paths which are obtained through repeated rotations of this kind is called a herd. In sections 5.2 and 5.3, we will describe a new path tracking method with quadratic convergence, which operates simultaneously on all paths in a herd. The remaining issue of how to detect clusters and herds will be described in sections 5.4, 5.5 and 5.6.

In section 6, we turn our attention to the certification of single roots of (1) and single steps of a path tracker. An efficient and robust method for the certification of solutions to systems of non linear equations is Krawczyk's method [Kra69], with several improvements by Rump [Rum80]. In section 6.1, we adapt this classical method to the setting of ball arithmetic. In section 6.2, we will see that an easy generalization of this method provides an algorithm for certified path tracking. An alternative such algorithm was given in [Kea94], but the present algorithm presents similar advantages as Krawczyk's method with respect to other methods for the certification of solutions to systems of non linear equations. However, both methods still suffer from overestimation due to the fact that error bounds are computed on a polydisk which contains the solution path. Using the technique of Taylor models, we will show in section 6.3 that it possible to compute the error bounds in small tubes around the actual solution path, thereby reducing the problem of overestimation.

In section 7, we first consider the more difficult problem of certifying multiple roots in the univariate case. We will describe two methods based on Rouché's theorem and a third method which rather aims at certifying a local factorization. The last method also serves as an important ingredient for making the Weierstrass preparation theorem effective in an analytic context.

Before we turn our attention to the certification of multiple roots in the multivariate case, it will be convenient to have a general toolbox for effective complex analysis in several variables. In sections 8.1 and 8.2, we first propose two ways how to formalize “computable analytic functions” We next propose several basic algorithms for computations with such functions.

In section 9, we consider the more difficult problem of certifying multiple roots in the multivariate setting. Several algorithms have been proposed for this problem [OWM83, Lec02, GLSY05, GLSY07, LVZ06, LVZ08, RG10, MM11]. However, most existing strategies require the computation of a large number of derivatives of the system, which becomes prohibitive for large clusters of solutions. In the simplest case when the Jacobian matrix of the polynomial system has corank at most one at the singularity, our solution is based on the simultaneous consideration of all solution paths in a herd. The general case will essentially be reduced to this case using analytic elimination techniques which are similar to the geometric resolution method of polynomial systems [GHMP95, GHH+97, Dur08]. In section 9.8, we will also outline a bridge between numeric and algebraic solutions which provides an alternative way to certify solutions.

In section 10, we study generalizations to the resolution of systems of analytic equations in a given polydisk. In section 10.2, we first consider a system of analytic equations as a perturbation of a system of polynomial equations. If, for each of the solutions to the system of polynomial equations, we can control the speed with which the solution moves under perturbations, then we can solve the perturbed system. Unfortunately, this forces us to keep track of solutions which are far away from the region of interest. In section 10.3, we present an alternative strategy based on incremental resolution, as in [GHMP95, GHH+97, Dur08]. Although this strategy also may require to work outside the region of interest, it does stay closer.

2.Notations

Positive elements
Given a subset , we denote

Vector notation
Unless stated otherwise, we will use the -norm for vectors :

This norm should not be confused with taking componentwise absolute values

For we also define

If are formal variables, then we write

Matrix notation
We write for the set of matrices over a set . The matrix norm of a matrix corresponding to the -norm (4) for vectors

Directed acyclic graphs
We recall that labeled directed acyclic graphs are often used for the representation of symbolic expressions with potential common subexpressions. For instance,

is a typical dag for the expression . We will denote by the size of a dag . For instance, the size of the above dag is .

3.Reliable arithmetic

3.1.Ball arithmetic

Let us briefly recall the principles behind ball arithmetic. Given a normed vector space , we will denote by or the set of closed balls with centers in and radii in . Given such a ball , we will denote its center by and its radius by . Conversely, given and , we will denote by the closed ball with center and radius .

A continuous operation is said to lift into an operation on balls, which is usually also denoted by , if the inclusion property

is satisfied for any and . We also say that is an enclosure for the set , whenever (5) holds. For instance, if is a Banach algebra, then we may take

Similar formulas can be given for division and elementary functions. Certified upper and lower bounds for will be denoted by and .

It is convenient to extend the notion of a ball to more general radius types, which only carry a partial ordering. This allows us for instance to regard a vector of balls as a “vectorial ball” with center and radius . If , then we write if and only if for all . A similar remark holds for matrices and power series with ball coefficients.

In concrete machine computations, numbers are usually approximated by floating point numbers with a finite precision. Let be the set of floating point numbers at a given working precision, which we will assume fixed. It is customary to include the infinities in as well. The IEEE754 standard [ANS08] specifies how to perform basic arithmetic with floating point numbers in a predictable way, by specifying a rounding mode among “down”, “up” and “nearest”. A multiple precision implementation of this standard is available in the Mpfr library [HLRZ00]. Given an operation , we will denote by its approximation using floating pointing arithmetic with rounding mode . This notation extends to the case when and are replaced by their complexifications and .

Let and or and . We will denote by or the set of closed balls in with centers in and radii in . In this case, we will also allow for balls with an infinite radius. A continuous operation is again said to lift to an operation on balls if (5) holds for any and . The formulas for the ring operations may now be adapted to

where , and are reliable bounds for the rounding errors induced by the corresponding floating point operations on the centers; see [Hoe09] for more details.

In order to ease the remainder of our exposition, we will avoid technicalities related to rounding problems, and compute with “idealized” balls with centers in and radii in . For those who are familiar with rounding errors, it should not be difficult though to adapt our results to more realistic machine computations.

Remark 1. In classical interval analysis so called interval lifts of operations are sometimes required to satisfy the inclusion monotonicity property

for all , which clearly implies the usual inclusion property (5). For floating intervals, it is easy to ensure this stronger property using correct rounding. In the ball setting, the exact ring operations in and are clearly inclusion monotonic, but it seems cumbersome to preserve this stronger property for floating balls. For this reason, we systematically develop our theory without assuming inclusion monotonicity.

3.2.Taylor models

If we are computing with analytic functions on a disk, or multivariate analytic functions on a polydisk, then Taylor models [MB96, MB04] provide a suitable functional analogue for ball arithmetic. We will use a multivariate setup with as our coordinates and a polydisk for a fixed . Taylor models come in different blends, depending on whether we use a global error bound on or individual bounds for the coefficients of the polynomial approximation. Individual bounds are sharper (especially if we truncate up to an small order such that the remainder is not that small), but more expensive to compute. Our general setup covers all possible blends of Taylor models.

We first need some more definitions and notations. Assume that is given the natural partial ordering. Let denote the -th canonical basis vector of , so that and for . For every , recall that . A subset is called an initial segment, if for any and with , we have . In that case, we write and . In what follows, we assume that and are fixed initial segments of with . For instance, we may take and or or .

Let or . Given a series , we will write for its support. Given a subset and a subset , we write and . If is analytic on , then we denote its sup-norm by

A Taylor model is a tuple , where , and are as above, and . We will write for the set of such Taylor models. Given and , we will also denote and . Given an analytic function on , we write , if there exists a decomposition

with and for all . In particular, if , then

for any . Given two Taylor models , we will say that is included in , and we write if for any . This holds in particular if for all , in which case we say that is strongly included in and write . We finally define by

so that for all and .

Addition, subtraction and scalar multiplication are defined in a natural way on Taylor models. For multiplication, we need a projection with for all and if . One way to construct such a mapping is as follows. For , we must take . For , let be largest such that . Then we recursively define . Given , we now define their product by

Using the observation that , this product satisfies the inclusion property that for any analytic functions and on .

3.3.-stable Taylor models

For some applications, it is convenient to use Taylor models for enclosing both an analytic function and a certain number of its derivatives. Let us show how to incorporate this in our formalism. Throughout this section, we assume that and that is an initial segment with .

Given a Taylor model and , we notice that can be regarded as a Taylor model in with . Let be an analytic function and . We define the relations and by

Clearly, for all and .

Let be an operation. Then is said to -lift to , if for all and all , we have . Addition, subtraction and scalar multiplication -lift in the usual way. As to multiplication, we take

with

In order to see that satisfies the -inclusion property, it suffices to check that

for all . This is clear if . Otherwise,

For any with , there exists a with . Hence,

In the particularly useful case when , we notice that for all and for all .

3.4.Overestimation

The major problem in the area of ball arithmetic is overestimation. For example, even though the expression evaluates to zero for any , its evaluation at any ball in with a non zero radius is not identically equal to zero. For instance,

Algorithms which rely on ball arithmetic have to be designed with care in order to avoid this kind of overly pessimistic error bounds. In particular, if we evaluate a dag using ball arithmetic, then a symbolically equivalent dag might lead to better error bounds.

Consider a continuous function with as in section 3.1. We recall that is said to lift into an operation if the inclusion property

is satisfied for all and . Clearly, such a lift is not unique: for any with for all , the function is also a lift of . If we require that , then the best possible lift is given by

In general, this lift may be expensive to compute. Nevertheless, its existence suggest the following definition of the quality of a lift. The overestimation of at is defined by

This quantity is easier to study if we let tend to zero. Accordingly, we also define the pointwise overestimation function by

Here means that and .

If is computed by evaluating a dag , then it would be nice to have explicit formulas for the pointwise overestimation. For and assuming that the lift is evaluated using the default ball implementations of and from section 3.1 , we claim that there exists a dag with

for . Indeed, we may compute using the rules

where stands for the -th coordinate function. Now we also have

for . Consequently,

If , then this formula simplifies to

Example 2. With , let us compare the dags and . We have and , whence

The example shows that we have an infinite amount of overestimation near double zeros, except if the dag is explicitly given as a square near the double zero. More generally, for the dag with an -fold zero, we obtain

At a distance of the zero, ball arithmetic thus produces bounds which are times too pessimistic.

Remark 3. An interesting problem is whether a good understanding of the pointwise overestimation also helps us to bound the overestimation on more general balls. One concrete question is whether we have

for all polynomial dags and balls . This inequality seems to hold in all easy cases that we have looked at, but we do not have a proof that it holds in general.

3.5.Reducing the overestimation

The example 2 shows that standard ball arithmetic generally produces an infinite amount of overestimation near double or multiple zeros. This raises the problem how to compute better ball lifts which do not present this drawback.

One possible remedy is to systematically compute the ball lifts using Taylor models. Indeed, assume that we want to evaluate at the ball . Let , and be as in section 3.2 and let be the corresponding domain of Taylor models in . Let and consider the Taylor model evaluation of at

Then

yields an enclosure of . Although the evaluation of is usually far more expensive than the evaluation of , let us now study how much the overestimation has been reduced.

Let and let us introduce the operator , which generalizes the mapping . The operator is defined by induction over the size of :

For as above, we then have

Now assume that and let be the valuation of at . If , then we have

If , then we still have (8), but (9) and (10) become

If , then we generally have

although may occur in lucky cases.

4.Numeric path tracking

4.1.General framework

Let be an open subset of and an analytic function. We consider as a function in and the time , where and , and also call a homotopy. Assuming that for some and that we are not in a “degenerate” case, there exists a unique analytic function with for all . We are interested in the value of when . More generally, given a vector of vectors, there exists a unique function with for all .

The goal of a numeric path tracker is to approximate the function as well and as quickly possible and, above all, to compute its value at the “end point” . In what follows, we will denote by the set of floating point numbers with bit mantissas. We also define , and assume that we have a program for computing a numeric approximation of . Given with , we thus want to compute with , by following the homotopy.

In many cases, we will be interested in homotopies for solving a system

(11)

of polynomial equations. The number of solutions to a generic system of this kind is given by the Bezout number , where is the total degree of for each . For suitable scaling parameters , we now define by

Let

For any , the point

clearly satisfies , whereas any with satisfies .

If the system (11) is zero dimensional and the values are complex and sufficiently random (we also say that the homotopy is in general position), then the system is also zero dimensional for every . In what follows we will always assume that the homotopy has been chosen in such a way.

4.2.Solutions at infinity

One classical difficulty with homotopy methods for solving a polynomial system (11) is that many of the solution paths may tend to infinity in the sense that for some and . Computations which infinities can be avoided by rewriting the equations in projective coordinates. More precisely, setting , the projectivation of a polynomial is defined by

Applying this to the system (11), we obtain a new system

(12)

of homogeneous equations in . For a random hyperplane

the composite system (1213) is again zero dimensional, but without solutions at infinity. It is easy to reconstruct solutions to (11) from solutions to (1213) and vice versa.

4.3.Predictor corrector methods

Assume that we have a way to approximate the Jacobian of by . For instance, if is given by a dag, then a dag for can be computed using forward differentiation, and just corresponds to the approximated evaluation of this dag.

Assume that we are given and at a certain point where . We may write as the horizontal join of two matrices and . Given close to , we may find a for which using Euler-Newton's method

The replacement is called a prediction step. We may still apply the formula when , in which case is usually a better approximation than to a genuine zero of at than . In this situation, the replacement is called a correction step.

From the computational point of view, the evaluation of the Jacobian is usually about times more expensive than the evaluation of the function itself (except for large and sparse ). Instead of reevaluating the Jacobian after the prediction step at , it may therefore be worth it to perform a few correction steps using the Jacobian at instead:

Since the convergence of is only linear, the number is typically chosen quite small (). One full prediction-correction cyclus now just consists of the replacement .

From the complexity point of view, the evaluation of and is usually far more expensive than the cost of linear algebra at size , at least for the examples we will be interested in here. Therefore, it will not be necessary to device the linear algebra algorithms with special care (for instance, we may simply compute the inverse once and for all, instead of using LU decompositions). On the other hand, we typically want to increase the step size as much as possible, while trying to stay reasonably close to the true solution path.

4.4.Precision control

One obvious source of numeric errors is when the numeric precision being used is insufficient for producing sensible results. In [BSHW08], a strategy has been proposed for selecting a sufficient precision for homotopy methods to be numerically reliable. We will now propose an alternative method for finding such a precision, whose justification is based on a simpler argument.

Let be the current working precision. Our method is based on the following idea: when evaluating , the actual precision of the result is usually smaller than and of the form for some fixed constant. We will call the effective precision and we may expect the numeric evaluations to be reliable as long as is picked sufficiently large such that remains above a certain threshold (e.g. ).

We still need a more precise definition of the effective precision or a simple way to compute it. Assuming that admits a ball lift, we may evaluate at the ball . Then

provides an estimate for the relative precision of . If , then this precision is potentially quite low. In that case, we may also consider at the next time . Instead of performing one extra ball evaluation, we may also use the following approximation of :

We now take

for the current effective precision at and assuming a current step size .

4.5.Step size control

Since purely numeric homotopy methods are usually being designed for speed, the main focus is not on being 100% fool proof. Nevertheless, it remains worth it to search for cheap ways in order to detect errors and adapt the stepsize so as to avoid potential errors.

Now assume that we perform one full prediction correction cyclus . We first need a criterion for when to accept such a step. The main problem with the design of numeric criteria is there is no way to decide whether a numeric quantity is small or large; such checks can only be performed with respect to other quantities. Instead of checking whether we remain close to the genuine solution path, it is therefore more robust to check that the Jacobian does not change not change to quickly on the interval .

More precisely, let , , and . Then it is natural to only accept steps for which

for a fixed threshold (e.g. ). Here we may use any matrix norm , so it is most convenient to chose one which is easy to compute:

The condition (14) is not fully satisfactory yet, since it relies on the expensive computation of a Jacobian . This is acceptable if the step has a good chance of being accepted (since we will need the Jacobian anyway for the next step), but annoying if the step is to be rejected. Before checking (14), it is therefore wise to perform a few cheaper checks in order to increase the probability that (14) will hold indeed. In particular, if , then we may verify that

for the max-norm on vectors, where (e.g. ) and . This simplified check is linked to (14) by remarking that . The new check (15) should not be applied when and are too close for and to be computed with sufficient precision. More precisely, it should really be replaced by the check

where is slightly smaller than one (e.g. ) and stands for the “effective working precision” from section 4.4.

In addition to the above checks, one might wish to ensure that is reasonably small after each step. Unfortunately, there is no satisfactory reference with respect which smallness can be checked, except for . The best we can do therefore consists of checking whether tend to at some indicated rate:

for all , where (e.g. ). Again, we need to insert a safety exemption for the case when the convergence is exceptionally good.

Once that we have a criterion on whether a step should be accepted, an algorithm for automatic stepsize control is easily implemented: assuming that we are walking from to , we start by setting . Given and , we try a step until . If the step fails, then we set with (e.g. ), and retry for the smaller stepsize. Otherwise, we accept the step and set for the next step, where (e.g. ).

4.6.Near collisions

Another way to look at the numerical error problem is to investigate what can actually go wrong. Theoretically speaking, around each true solution path , there exists a small tube of variable polyradius , where Newton's method converges to the true solution . As long as our current approximation at time remains in this tube , no errors will occur. Now the Newton iterations have a strong tendency of projecting back into the tubes, especially if we use the additional safeguard (17). Nevertheless, it might happen that we jump from one tube into another tube, whenever two solution paths come close together.

If we are considering a homotopy for solving a polynomial system , then various solution paths will actually meet at if the system admits multiple roots. Such multiple roots are an intrinsic difficulty and we will need dedicated “end game” strategies to ensure good numeric convergence in this case (see section 5 below).

For , and for suitably prepared functions , the Lebesgue probability that two solutions paths meet at a point is zero. Nevertheless, we may have near collisions, which usually occur in pairs: the probability that more than two paths simultaneously pass close to a same point is extremely low.

So assume that we have a near collision of two solution paths. Then we have a true collision at for some complex time near the real axis. Locally around this collision point, the two paths are then given by

for some vector . If we only know at a few points, then we may try to compute , and , and also check whether the second path indeed exists.

Now assume that we have approximated and derivative at two times . Denote these approximations by , , and . Then

for , whence we may use the following approximations for and :

We next perform several safety checks. First of all, we obtained as the division of two vectors; we may use the mean value of the componentwise divisions and check that the variance remain small. We next verify that and are reasonably close. We also verify that the Newton iteration starting at converges to a solution close to . We finally verify that the same thing holds for instead of , where .

We will not go into technical details on the precise numerical checks here, since section 5.3 below contains a similar discussion for the case of multiple roots at . We may also adapt the herd iteration from section 5.2 below to near collisions, which allows for the simultaneous continuation of and . Contrary to the case when , we also need to recompute better estimations of at every step, which can be done via the simultaneous computation of and the two “conjugate” paths with . Indeed, using the higher order expansion

we get

from which we may deduce high quality approximations of and . As soon as is small with respect to , then the junction between paths and their conjugates occurs and we know how to traverse the near collision.

5.Multiple roots

5.1.Straightforward Euler-Newton type methods

Consider a homotopy induced by a polynomial system (11) with a zero dimensional set of solutions. It frequently occurs that some of the solutions are multiple roots, in which case the predictor corrector algorithm slows down significantly when approaches . This is due to the fact that Newton's method only has a linear convergence if we are approaching a multiple root, whereas the convergence is quadratic for single roots.

In order to get a better understanding of this phenomenon, it is instructive to quantify the slow down in the case of an -fold root of a univariate polynomial , which is more or less representative for the general case. In the neighbourhood of the root , we have

with . Hence, the Newton iteration becomes

In particular, we see that we need roughly iterations in order to divide by . We also notice that is roughly divided by at every iteration. For complexity measures, it is more reasonable to study the speed of convergence of rather than itself. Indeed, the relative precision of an -fold root is intrinsically times smaller than the working precision.

If we are rather considering a homotopy , then we usually have . Locally, we may thus write

Assume that we have for small and , so that

Then the Euler-Newton iteration for step size yields

Following our criterion (14), we should have

Roughly speaking, this means that . Hence, is multiplied by at every step and is multiplied by every steps.

5.2.The herd iteration

For high precision computations, it would be nice to have an algorithm with quadratic convergence in . Before we give such an algorithm, let us first introduce some terminology and study the behaviour of the solutions paths when .

By assumption, we are given a system (11) with an -fold root . Consider a solution path for the homotopy with . Since is algebraic in , we may expand

as a Puiseux series in for a certain ramification index (which we assume to be taken minimal). Now letting turn around once, we have

where . When turning repeatedly, we thus obtain pairwise distinct solutions paths with . We will call such a family of solution paths a herd.

Contrary to the homotopy methods from section 4, which operate on individual paths, the iteration that we will present now simultaneously operates on all paths in a herd. Consider a solution path with as above and the corresponding herd with . We assume that both and are known for a given and all . Let and denote the FFT-transforms of the vectors and with respect to . Then we have

for all . We now compute using the formulas

For of the order of , we now have

for all . We call (18) the herd prediction. This prediction may be corrected using conventional Newton iterations at time , for a fixed constant . A complete cyclus of this type will be called a herd iteration.

5.3.Step size control for herd iterations

Several technical details need to be settled in order to obtain a robust implementation of herd iterations. First of all, we need a numeric criterion for deciding when the approximations and are of a sufficient quality for starting our herd iteration. Clearly, the error of the approximation should be in .

We may first ensure ourselves that the approximation can not substantially be improved using Newton iterations: let be the result of applying one Newton iteration to at time . Then we check whether

for some threshold , such as (although this check becomes unstable if , we notice that this situation cannot arise systematically for ).

The check (19) for does not yet guarantee that the correspond to approximate evaluations of the Puiseux expansions. In order to check that this is indeed the case, we first compute the as described in the previous section. Defining

we next evaluate for all and apply one Newton iteration at time to the results, yielding . We now check whether

for some threshold , such as , and all . Of course, this second check is more expensive than the first check (19). The thresholds should therefore be adjusted in such a way that the second check is likely to succeed whenever the first one does.

The above criteria can also be used for deciding whether a proposed herd iteration from to should be accepted or not. We still have to decide how to chose . For a fixed constant and a positive integer which may change at every step, we will take

If a step is accepted, then we increase by one or a larger integer smaller than . If a step is not accepted, then we decrease by one and repeat the same procedure until acceptance or . If , then we have either reached the best possible accuracy for the current working precision, or our paths did not really converge to the same point . The first case occurs whenever the effective precision from section 4.4 drops below a given threshold. In the latter case, we revert to individual homotopies for further continuation.

5.4.Detection of clusters

Let us now go back to the initial polynomial system (11) and assume that we have computed numerical approximations of all individual homotopies up till a certain time . We need a way to partition the individual paths into herds. One obvious way is to follow all solution paths from to and deduce the corresponding permutation of . However, this computation is quite expensive, so it would be nice to have something faster.

A first step towards the detection of herds is to find all clusters, i.e. all groups of paths which tend to the same limit . Here we notice that one cluster may contain several herds, as in the example

where all four solution paths with tend to the quadruple root of . This cluster contains two herds and .

Now let and for all . For each , we consider the ball

The radii of these balls has been chosen with care, such that, with high probability, any two paths which belong to the same herd are also in the same connected component of . This is best verified on the case of path . Then the next path in the cluster is and

An efficient way to separate different connected components of is via projection. Let be a random vector of real numbers of length . Then any point may be projected to the vector product . Applying this projection to our balls , we obtain intervals . We may sort the (and the corresponding ) on their centers in time and compute the various connected components of using a linear pass. Whenever and are in different connected components, then so are and . Assuming that is sufficiently small, application of this procedure for random vectors results with probability one in the separation of all connected components corresponding to different clusters.

5.5.Detection of herds

Let be a set of indices such that the with form a cluster with limit . We still need a way to find the various herds inside the cluster. In a similar way as in section 5.3, we may improve the quality of our approximations and via Newton iteration until and . From now on, we assume that we have done this.

For each and , we may write

for some and . We obtain a good approximation using

If is not too large (so that has a small numerator and denominator), then we also obtain reasonably accurate approximations and by

and check whether

is indeed close to some with . Doing this for all , we thus obtain a candidate permutation with for all . Each cycle in this permutation induces a candidate herd. Using the criteria from 5.3, we may next check whether the quality of the candidate herd is sufficient. If not, then we may always resort to the more expensive computation of the solution path from to .

5.6.Synchronization

Our algorithms for the previous sections for cluster and herd detection rely on the availability of approximations on all paths at the same time . Usually the individual homotopies are launched in parallel and advance at different speeds. Consequently, the synchronization of all paths at the same time is a non trivial matter.

Strictly speaking, we notice that it is not necessary to synchronize all paths, but rather those paths which belong to the same cluster or herd. In particular, we will concentrate on those paths which tend to multiple roots.

So consider a path which tends to a multiple root . As long as is approximated using an individual continuation, we have seen that the convergence to is linear. For a fixed (such as ), the computation of at all “checkpoints” thus only requires a constant overhead. At every checkpoint, we may now launch the algorithm for the detection of clusters. For every candidate cluster , we next determine the checkpoint with highest at which is available for all . We launch our algorithm for the detection of herds at this checkpoint .

In addition, it is a good practice to check that we still have points on all paths at every checkpoint. For paths which tend to a single root, we may approximate for large using a single step continuation from to . For the approximation of using (21), we notice that it important that no paths of the cluster are missing or counted twice. Indeed, in the contrary case, we only have with for all , which is insufficient for the computation of accurate approximations of and .

6.Certified homotopies

6.1.Certification of Newton's method

Consider an analytic function on some open subset of and assume that admits a ball lift. Given an isolated root of , it is well known that Newton's method converges to in a small neighbourhood of . It is a natural question to explicitly compute a ball neighbourhood for which this is the case (where we notice that a “ball” in is really a compact polydisk). One method which is both efficient and quite tight was proposed by Krawczyk [Kra69]. Recall that denotes the Jacobian of .

Theorem 4. Let , and let be an analytic function. Let be a ball enclosure of the set . If

then admits a fixed point .

Proof. For any , we have

Since is convex, we also have

Hence

It follows that is an analytic function from the compact ball into itself. By Brouwer's fixed point theorem, we conclude that there exists a with .

Corollary 5. Let , and let be an invertible matrix with . If and

then the equation admits a root .

Proof. We apply the theorem for .

The above method is still a bit unsatisfactory in the sense that it does not guarantee the uniqueness of the solution. Denoting by the interior of a subset of , the following sharpening of the method is due to Rump [Rum80].

Theorem 6. With the notations from theorem 4, if

then admits a unique fixed point in .

Proof. Let us first show that the spectral norm (i.e. the norm of the largest eigenvalue) of any is . Indeed, our assumption implies

Now consider the norm on . Then, for any and with , we have

whence . This is only possible if the spectral norm of is .

Now consider . By what precedes, any matrix in is invertible. For any two distinct points , we have

Since is convex, there exists a matrix with

By what precedes, it follows that . We conclude that or . The existence of a fixed point follows from theorem 4.

Corollary 7. With the notations of corollary 5, if

then the equation admits a unique root .

Proof. Application of theorem 6 for .

Assuming that we have computed a numeric approximation to a root of , a second question is how to find a suitable ball for which the corollaries apply. Starting with , a simple solution is consider the sequence defined by

where

Whenever , then we are done. In order to ensure the convergence of this method, we need to tweak the recurrence (22) and replace it by

for suitable small positive constants and . We refer to [Rum80] for more details on this technique, which is called -inflation.

6.2.Certification of a numeric homotopy continuation

Assume that the polynomial system (11) admits only simple roots and that we have obtained numeric approximations for all these roots using a numeric path tracker. Then theorem 5 suffices for the joint certification of the numeric approximations . Indeed, using the above technique, we first compute balls for which theorem 5 applies. To conclude, it then suffices to check that these balls are pairwise disjoint. This can be done using the same algorithm as for the detection of clusters, which was described in section 5.4.

In the case when two balls and do intersect, then we recompute approximations for the paths and using a smaller step size, that is, by lowering the constant in (14). We keep doing so until none of the balls intersect; even if some of the paths may have been permuted due to numerical errors, the final set of all is correct if none of the balls intersect. Indeed, each of the balls contains a solution and there can be no more solutions than the number predicted by the Bezout bound.

If and intersect then, instead of recomputing the paths and using smaller and smaller step sizes, we may also search for a way to certify the entire homotopy computations. This will be the topic of the remainder of this section. Let us first show how to adapt the theory from the previous section to certified path tracking. From now on, we assume that is an analytic function which admits a ball lift.

Theorem 8. Let be such that . Let and let be an invertible matrix with . If

then the equation admits a unique root for each .

Proof. Let and consider the function . Then encloses and encloses , and we conclude by theorem 6.

Clearly, for any , theorem 8 ensures the existence of a unique solution path from to in the tube . As at the end of the previous section, the question again arises how to compute balls and for which the conditions of the theorem are likely to be satisfied. Since the computation of is expensive, it is important to keep down the number of iterations of the type (22) or (23) as much as possible (say at most one iteration).

Now assume that we performed a numeric homotopy computation from to . Then a reasonable first guess is to take

for some , say . Unfortunately, if one of the components of tends to zero, then this guess turns out to be inadequate. Therefore, it is recommended to use an additional inflation proportional to the norm of :

for some small , say . Another idea is to use the radius of the previous step as a reference (except for the very first step, of course). For instance, if our previous step went from to , then we may take

for some small , say .

6.3.Certification via tubular models

One important disadvantage of the method from the previous section for the certification of one path tracking step is that we use global error bounds on the tube . Consequently, the inaccuracy of is proportional to the step size , whence any overestimation in the evaluation of or due to the inaccuracy in requires a reduction of the step size.

For this reason, it is much better to follow the solution path as closely as possible instead of enclosing it in a “square tube”. This can be achieved via the use of Taylor models. Using -stable Taylor models, it is possible to simultaneously compute of accurate enclosures for and on the tube.

More precisely, let , and . For a fixed in , let be an initial segment of of the form

and let . A -stable Taylor model in will also be called a tubular model. We will write for the set of tubular models. Given , we let and be such that

for all .

Figure 1. Illustration of a solution path in a tube.

Theorem 9. Let , , and let

be an approximation of the solution to . For instance, if and , then we may take , with and . Consider and with

Let , , . If

then the equation admits a unique solution , for every .

Proof. For an illustration of the proof, see figure 1. Let and . By construction, and using the facts that and , we have

for any and . For a fixed , it follows that encloses on the disk . Our hypothesis (24) also implies that

From theorem 6, we conclude that admits a unique fixed point .

In order to apply the theorem, it remains to be shown how to find a good tube, i.e. how to choose , , , and . For a fixed order of the approximation, the idea is to adjust and such that can be chosen minimal.

Let us first consider the first order case . Assume that we performed a numeric path continuation from to and that both and are approximatively known. Then there exists a unique curve of degree three with , , and . Let be a linear curve which minimizes the maximum of on for every . Then we take , , and . We may also take for some fixed such as . However, for better performance it is recommended to apply an additional inflation to , similar to what we did in the previous section.

For higher orders , we proceed in an essentially similar way. We first compute a high order numeric polynomial approximation of . For orders , this may require the accurate approximation of additional points with on the solution path. We next find a -th order polynomial which approximates as good as possible and choose our tube in a similar way as above. It should be noticed that the evaluation in theorem 9 is at least thrice as expensive as the numeric evaluation of . This makes it worth it to improve the quality of the numeric approximations of points on the curve using one or more additional Newton iterations. The use of higher order approximations makes it possible to choose very small, thereby avoiding a great deal of the overestimation due to the use of ball arithmetic.

6.4.Numeric spearhead computations and certification

We insist once more on the importance of performing all certifications as late as possible rather than along with the numeric computations themselves. One should regard the numeric computation (the spearhead) as an educated guess of what is happening and the certification as an independent problem at a second stage. In particular, only the numeric results which interests us (i.e. the solutions of the polynomial system) need to be certified and not the way we obtained them (i.e. the homotopies).

Even in the case when we are interested in certifying the homotopies themselves, it is best to do so once the numeric computations have already been completed. This allows for instance for more parallelism in the computations. Indeed, we may cut the entire homotopy path in several pieces and certify these pieces in parallel. More numeric data may also be available once all numeric computations have been completed, which might be useful for guiding and accelerating the certification stage.

Similar remarks apply more generally for certified integration of dynamical systems. In that setting, one is often interested in the flow in the neighbourhood of some initial condition. The sequential part of such a computation resides in the numeric integration for a particular initial condition. Once the corresponding numeric trajectory is known numerically, we may again cut it in pieces which and compute the corresponding flows and certifications in parallel. Whenever the dependence on the initial conditions is very strong, the actual numeric integration should be done using accurate high order Taylor methods using a multiple working precision.

7.Certification of multiple univariate roots

In section 5.1, we have studied in detail the numeric determination of a multiple root of a univariate polynomial. It is instructive to take up this study and examine how we certify such multiple roots. Since the property of being an -fold root is lost under small perturbations, this is actually impossible using ball arithmetic. The best we can hope for is to certify the existence of roots in a small ball, or the existence of an -fold root of a small perturbation of the polynomial (see also [Rum10]).

7.1.Applying Rouché's theorem

So consider a polynomial with an approximate -fold root at and assume that we wish to certify that admits exactly roots in the ball , for some . One first strategy is to make use of the Taylor series expansion of at . More precisely, let be the set of univariate Taylor models in with and for some . Evaluating at , we obtain a Taylor model with the property that for any . It remains to be shown that any admits roots in . We claim that this is the case if

Indeed, assume that we have (25) and let . Then

for all with . By Rouché's theorem [Lan76, page 158], it follows that and admit the same number of roots on . If becomes large, or if admits other roots close to , then the bound (25) often does not hold. In that case, one may use more sophisticated techniques from [Sch82, Hoe11] in order to certify that admits roots in . From the complexity point of view, the series expansion method requires evaluations of , where denotes the cost of multiplying two polynomials of degrees .

7.2.Computing the winding number

Another approach is to apply Rouché's theorem in a more direct way by computing on a path starting at and which circles around the origin once. If the reliable image of this path avoids the origin, then the number of roots of coincides with the number of times that turns around the origin. More precisely, let for a suitable (see also below) and let for . Then we evaluate and check whether for all . If this is the case, then

yields the exact number of roots of inside . This method requires evaluations of , but needs to be sufficiently large if we want to ensure a reasonable chance of success for the method.

Let us investigate the choice of an appropriate in more detail on the simplest example when and

Consider the evaluation of at . We have

For small , the condition thus implies

Roughly speaking, for , this means that

We recall from example 2 that also corresponds to the punctual overestimation of the ball evaluation of at . If we want to reduce to a quantity which does not depend on , then it follows from the considerations in section 3.5 that we need to evaluate using Taylor models of order at least . However, in that case, we might just as well use the first method based on a direct series expansion of at .

7.3.Certifying a local factorization

Whenever a polynomial admits roots in a disk, then the polynomial is a monic factor of . Instead of trying to determine and certify the roots individually, another idea is to determine and certify this monic factor of .

Assuming that admits no other roots near , we will show in this section that forms an isolated zero of , when regarding this equation as a system of equations in variables . On one hand, this approach has the advantage that we may apply Corollary 7 in order to certify the factorization. On the other hand, we may use fast polynomial arithmetic for the actual evaluation of .

Theorem 10. Let be roots of a polynomial of multiplicities , and . Then is an isolated zero of the system , when considered as a system of equations in variables .

Proof. Given the euclidean division of by , we have for small perturbations of , with . Consequently,

Computing in the quotient algebra , this equation can be reread as

Since with , the multiplication mapping with in is invertible. By (26), the Jacobian matrix of

is precisely the matrix of this multiplication mapping with respect to the canonical basis .

8.Effective complex analysis

For the sequel, it will sometimes be convenient to consider the more general context of analytic functions instead of mere polynomials. We provide two formalizations of computable multivariate analytic functions: an abstract one relying on analytic continuation in section 8.1, as well as a more concrete evaluation based one in section 8.2. In the remainder of the section, we will discuss evaluation of analytic functions in zero dimensional algebras, effective Weierstrass preparation, and computable meromorphic functions.

8.1.Computable multivariate analytic functions

We recall [Wei00] that a real number is said to be left (resp. right) computable if there exists an increasing (resp. decreasing) computable sequence with . We say that is computable if is both left and right computable. We denote the sets of computable, left computable and right computable real numbers by , and . We define to be the set of computable complex numbers. The definitions also adapt in a straightforward way to extended real numbers . The theory of computable real numbers provides a suitable abstract framework for studying which analytic problems can be solved.

In [Hoe05, Hoe07], we proposed a similar concept of computable analytic functions. Given an analytic function at the origin, we say that is computable if there exists methods for computing the power series expansion of , a lower bound for its convergence radius, an upper bound for on any closed disk on which converges, and a method for the analytic continuation of . Formally speaking, denoting by the set of such functions, this means that we may compute

Given , we call its computable radius of convergence. Usually, is smaller than the genuine radius of convergence of .

This definition admits several variants. In practice, it is usually most convenient to provide a method for the computation of bounds using ball arithmetic and allow for infinite bounds. In that case, we automatically obtain an algorithm for the computation of , by taking . This definition is also convenient to extend to the case of multivariate analytic functions in . In this case, we require algorithms for the computation of:

Recall that the function is necessarily upper continuous (e.g. [Hoe07, Theorem 2.3]). In particular, for every with there exists an with .

8.2.Multivariate analytic functions as evaluable functions

In practice, multivariate analytic functions such as are often built up as dags from univariate analytic functions such as and . In that case, it would be very expensive to systematically use power series expansions in several variables in order to compute with such functions. Instead, it would be better to represent such multivariate analytic functions as objects which can be evaluated at analytic functions in an arbitrary number of variables, or even at points in more general algebras.

More precisely, let be an effective Banach algebra over . This means that is the set of computable points in a Banach algebra over , and that the operations and the norm can be computed by algorithm. Recall that

for all . We do not necessarily assume to be commutative. Given a multivariate analytic function in with and pairwise commuting with for , the evaluation

is well defined. What is more: if and , then . Indeed, we start by computing and such that . For fixed and

we then have

By choosing sufficiently large, we may thus make as small as desired.

Conversely, assume now that, in the definition of computable multivariate analytic functions, we replace the method by an evaluation method for any effective Banach algebra over . In particular, given , we may take to be the algebra of all formal power series for which

is finite. Given with , it follows that the evaluation is well defined, and this evaluation yields the power series expansion of at the origin. This shows that providing an evaluation method is essentially equivalent to providing a method for series expansion.

Remark 11. In fact, analytic continuation and bound computation can also be regarded as evaluations in suitable “Banach algebras”. Indeed, the analytic continuation at corresponds to the evaluation at the analytic function . The computation of the bound can be done by evaluating at the ball and taking , where . Nevertheless, explicit methods for analytic continuation and bound computations are usually of a better quality. They may also be needed for the implementation of more general evaluation methods.

Following the same line of ideas, it may also be useful to consider the evaluation of analytic functions at broken line paths (or more general piecewise analytic paths) instead of ordinary points, thereby combining analytic continuation and ordinary evaluation in a single method. One might even consider evaluations at “paths” of successive Banach algebras of a similar type, such as with the and the sufficiently close, and .

8.3.Evaluation in commutative zero dimensional algebras

In order to simplify notations, we will now stop our digression on the abstract notions of computability and drop the superscripts “com”. In view of what we have seen in section 9.2, it is particularly interesting to evaluate multivariate analytic functions in commutative zero dimensional algebras over . Therefore, we will now study this special case in more detail.

Let be a finite dimensional -algebra. Given any basis for , the elements of can be represented by matrices, so may be identified with a commutative subalgebra of the algebra of matrices for some . In particular, any matrix norm on induces a norm on .

Given a multivariate analytic function at the origin which is convergent on a polydisk of polyradius , we may use (27) in order to evaluate at any points with for all . Since the commute, it is actually possible to do a bit better. Indeed, it is classical that there exists an invertible matrix (corresponding to a base change), such that

where , each is a nilpotent triagonal matrix, and . We may thus compute using

For any with , we notice that the evaluation reduces to an evaluation at an ordinary point. If , then we also notice that

where is minimal with .

8.4.Weierstrass preparation

8.4.1.Implicit functions

Let us start with a special case of Weierstrass preparation: the implicit function theorem. Consider a computable analytic function in in the neighbourhood of a point . Assume that but . Then the implicit function theorem states that there exists a unique analytic function in such that in a neighbourhood of and . It is not hard to check that is computable, for instance in the sense of section 8.1:

More generally, given analytic functions such that the Jacobian matrix has full rank at , then it can be shown in a similar way that the system of equations in unknowns

admits a unique analytic solution at which is again computable. Notice that this more general case also follows through a -fold application of the case of a single function.

8.4.2.Weierstrass preparation

Let us still consider a computable analytic function in in the neighbourhood of a point . Assume now that but . Then the Weierstrass preparation theorem states that there exist unique analytic functions in and an invertible analytic function in such that

We may regard (28) as a local factorization of the analytic function in , which depends on as parameters. Now Theorem 10 for the computation of local factorizations readily generalizes to computable analytic functions. More precisely, is the solution of the system , which we consider as a system of analytic equations in unknowns . It can be checked that this system of equations admits full rank, so that we may compute by applying the effective implicit function.

8.4.3.Certifying the multiplicity in a small neighbourhood

Effective Weierstrass preparation as described in the previous section can only be applied on rare occasions. Indeed, the assumption that we have an exact multiple cancellation is too ambitious, and needs to be replaced by the weaker condition that has a multiplicity in in a small neighbourhood of . This condition is still enough for the existence of a unique solution to the system in a neighbourhood of .

This naturally leads us to the question how to certify that has multiplicity in , in a small neighbourhood of , and uniformly in . In fact, paying some care, all methods from section 7 can be generalized to this setting. First of all, when using ball arithmetic, the parameters are simply replaced by small balls around , which reduces the problem to a univariate one. Secondly, when working with Taylor models at any chosen truncation order , the tail of an analytic function near is replaced by a ball of the form or . This reduction allows us to work with polynomials instead of series.

We also notice that the above certification methods can actually prove something slightly stronger than a uniform multiplicity: they can usually be used to verify that the zero set does not intersect . If such is the case, then we say that the equation is equisolvable in on .

8.4.4.Analytic elimination on small balls

Given a ball (and where we recall that such a “ball” is really a polydisk) such that we can certify a constant multiplicity of in on , the unique (and computable) analytic function such that is an analytic unit on will be called the Weierstrass normalization of on .

Given two analytic functions and on , assume that we can certify a constant multiplicity for at least one of them, say for , and let be as above. Then Weierstrass division of by yields unique analytic functions and such that and is polynomial in with . We may obtain (the residue class of) as a computable analytic function through evaluation of in the quotient algebra of computable analytic functions in by the relation .

Having computed the remainder of the Weierstrass division of by , we may finally form the resultant of the polynomials and in . This resultant is a computable analytic function in whose zeros are precisely the projections of the common zeros of and on . We will also call the analytic resultant of and on . Whenever is also defined, it can be checked that for some analytic unit on . Indeed, writing and for analytic units and , and is a unit. Similarly, .

8.5.Simplifying computable meromorphic functions

A computable meromorphic function is simply the quotient of two computable analytic functions such that the denominator is non zero. The computable meromorphic functions on a given domain form an effective field (but without a zero test). Using an external argument, we sometimes know that a computable meromorphic function is actually analytic on some domain. In this case, we would like to conclude that is computably analytic on this domain.

8.5.1.A reliable version of l'Hôpital's rule

Let us first consider the case when and are both univariate computable analytic functions in , in the neighbourhood of a single isolated zero . We start with the subcase when on a ball with . Let be the ball evaluation of at . For all , we then we have

Consequently, . For general , we obtain in a similar way that . We conclude that

8.5.2.Higher multiplicities

It would be interesting to know whether there exist generalizations of (29) to higher multiplicities or involving the successive derivatives of . In this section we will present a more ad hoc strategy for computing a reliable Taylor series expansion of in the case when , and are all analytic.

Let and be all small and a larger ball with the same center and radii and . Assume that admits exactly zeros both on and on . Let and . Writing , and for each , we have

Let be an upper bound for . Since achieves its maximum on at its border, we also have

From Cauchy's formula, it follows that

for all . Plugging this into (30), we obtain

For the small balls, we may thus use (32) for the computation of an enclosure for , and switch to (31) whenever this enclosure becomes better.

8.5.3.Recovering analytic functions from their values on a circle

More generally, it is in principle possible to recover any analytic function on a closed disk from its values on its boundary using Cauchy's formula. We will now describe an efficient method to do this effectively. When applied to computable meromorphic functions as above, this method has the advantage that we do not need to know the numerator and denominator explicitly; it is sufficient that we can compute its ball values on the boundary circle. Without loss of generally, we may assume that is the unit disk.

For a given number of evaluation points (preferably a power of two) and with , we evaluate at the disks for , yielding . We next compute ball enclosures for the coefficients of the unique polynomial with and for all using one reliable discrete Fourier transform. The remainder is bounded by on the unit circle, whence on the unit disk. This yields a Taylor model for . Taking sufficiently large, the so computed Taylor model can be made as accurate as needed.

8.5.4.Multivariate meromorphic functions

Let us now consider a meromorphic function in several variables which is actually analytic on some small ball. Modulo a linear change of coordinates and shrinking the domain to a smaller ball , we may first ensure that has constant multiplicity in on . We may then use any of the methods from the previous section for computing as an analytic function in , which depends analytically on the parameters .

9.Certification of multiple multivariate roots

9.1.Jacobians of corank one

Consider a system (11) and assume that we have numerically isolated a solution of multiplicity . Based on numerical computations, assume also that the Jacobian is expected to have corank at most one at .

In a small ball around (and where we recall that is really a polydisk) we may certify the corank assumption by computing using ball arithmetic and then checking that an submatrix admits a non zero determinant (still using ball arithmetic).

At the next stage, we would like to certify that there exist indeed exactly solutions in a suitable neighbourhood of , when counted with multiplicities. Now using our effective version of the implicit function theorem from section 8.4.1, we may analytically eliminate at least variables from the equations (11), say from the equations . This yields computable analytic functions such that

for all . We next compute the computable analytic function

Using the techniques from section 8.4.3, we finally check that has multiplicity in a suitable neighbourhood of .

Remark 12. One of the simplest examples when the corank one hypothesis fails is the following system:

9.2.Certification of herd homotopies

Let us still consider the system (11) and assume that we are given a herd of numeric solution paths which all tend to the same limit . In this section, we will investigate the problem of simultaneously certifying each of the paths in the herd.

We would like the follow the same strategy as in section 7.3, where we certified a single factor of order instead of separate roots. In the current setting, instead of viewing the as distinct individual paths, this amounts to regarding the whole herd as a single multivalued path. From the algebraic point of view, this means that we need to consider the ideal which annihilates . There are several ways to represent this ideal by a system of polynomial equations. We will use so called univariate representations.

Now recall that each can be considered as a vector of Puiseux series in of valuations . Setting

we notice that is invariant if we turn once around the origin. This means that is really an analytic function in at the origin. Assuming general position, is actually the minimal annihilator of . Furthermore, for sufficiently small, the roots are pairwise distinct, so there are unique interpolating polynomials of degree with

for all . In what follows, we will represent by the system of polynomials

where is monomic of degree and for each .

Solving the original system for of this form now amounts to a new system of equations in the unknowns and . This system can also be considered as a system of equations in unknowns over the algebra , augmented with one special unknown and one special equation . In analogy with Theorem 10, we may hope that this new system is non singular near , so that we can certify the herd using Corollary 7. It can be checked that this is always so in the corank one case from the previous section. Unfortunately, Theorem 10 fails to generalize in general. In particular, the new system remains singular whenever several herds converge to the same singularity.

9.3.Equisolvability

Let us now return to the general case when the corank of the Jacobian matrix at the solution may be larger than one. First of all, it will be useful to generalize the notion of equisolvability from section 8.4.3 to the case of several equations. Consider the system

(34)

on , where might actually be analytic functions instead of mere polynomials. We say that (34) is equisolvable in on if the zero set

(35)

does not intersect , where and . In particular, this means that the number of solutions of (34) as a function of does not depend on . Furthermore, the following lemma shows that there are no exceptional fibers with an infinite number of solutions.

Lemma 13. Assume that (34) is equisolvable in on and that is the zero set of a non zero analytic function on . Then none of the components of can be contained entirely in the set .

Proof. Assume for contradiction that has dimension . Taking , the intersection has dimension , whence it contains an analytic solution curve whose image has dimension one. Since (34) is equisolvable in on , the curve is entirely contained in . Consequently, the analytic functions are all bounded, whence constant. This contradicts our assumption that has dimension one.

Example 14. The equation

is equisolvable in on , since there are always two solutions, but

is not equisolvable, since the equation has one solution for , but no solutions for . Of course, is equisolvable on for on a larger disk (or on a smaller disk). Graphically speaking, an equation is equisolvable if and only if all solution curves are “nicely horizontal” in .

9.4.Kronecker representations

With the notations from the previous section, consider polynomials

where the coefficients and are computable analytic functions in on , such that (with ) and the system (34) is equivalent to

(36)

for all , where . Then we call (36) the Kronecker representation for (34) on .

Lemma 15. Assume that the system admits a finite number of solutions in and that the -coordinates of these solutions are pairwise distinct. Then there exists a Kronecker representation for on .

Proof. Let be such that are pairwise distinct. Then we define the Kronecker representation for to be the unique -tuple of univariate polynomials with , , , such that is annihilated by the system

We may compute such a Kronecker representation as follows. If , then we simply take and for all . For , we decompose with and compute in terms of and using

This yields an efficient dichotomic algorithm for the computation of . The result follows by applying the method to the solutions of in .

Theorem 16. Assume that (34) is equisolvable in on . For the zeroset of some non zero analytic function on , assume that for any , the solutions of (34) with admit pairwise distinct -coordinates. Then there exists a Kronecker representation for (34).

Proof. For any , the above lemma yields a Kronecker representation

for the system (34) with . Moreover, the equisolvability assumption implies that the degree of does not depend on . Furthermore, since for , the way we compute the functions depends analytically on . We thus obtain a “Kronecker representation” for (34) on the subset instead of .

Let be such that . Expanding and with , we have for all . Consequently, remains bounded on . Since is a removable isolated singularity, it follows that admits a unique analytic continuation on . It also follows that we may also choose to be the zero set of .

It remains to be shown that the with can also extended analytically from to . For this, we consider a new coordinate where are parameters. For sufficiently close to , we may apply the above result to the system rewritten as equations in . Notice that this new system is defined on a domain which is slightly smaller than . We thus obtain a Kronecker representation

which is equivalent to for with , where . The function is defined on and analytic both in and . Differentiating the equation with respect to with , we obtain

Letting tend to , we obtain that .

Proposition 17. Assume that is equisolvable in on . Assume that we are given a Kronecker representation (36) for (34) such that is equisolvable in on and such that vanishes on . Then (34) is equisolvable in on .

Proof. Assume for contradiction that . Since is equisolvable in on , we cannot have . Consequently, . Hence, we both have and . This contradicts the assumption that is equisolvable in on .

9.5.Univariate representations

The univariate representation is a variant of the Kronecker representation. It consists of polynomials

such that the coefficients are computable analytic functions in on , the coefficients are computable meromorphic functions in which are analytic on , where and . Finally, the system (34) is assumed to be equivalent to

(37)

for all . Given a Kronecker representation, we may compute a univariate representation as follows. We first compute the inverse of modulo , whose coefficients are meromorphic and analytic on . Taking , we may then retrieve the univariate representation from the Kronecker representation.

9.6.Intersection with a new equation

Given a Kronecker representation satisfying the hypothesis of Proposition 17, let us now study the intersection problem with a new equation . We first compute a univariate representation for (34), as detailed in section 9.5. Now consider the resultant

In practice, we may compute by evaluating at in the algebra where is the set of computable analytic functions on , and then take the norm of this evaluation. Let denote the solutions of (34) in as a function of , considered as analytic functions in on some Riemann surface above . By a classical property of resultants, we have

on . Whereas the functions generally admit a non trivial monodromy, the value of is uniquely determined as a function on on . Moreover, since for , the analytic function is bounded on . Since the complement of in is isolated, this means that extends uniquely into an analytic function on .

Lemma 18. Given a Kronecker representation (36) for and assuming that is equisolvable in on , the resultant vanishes on .

Proof. If , then (38) directly yields . For general , let be a vector such such that for all sufficiently small . Then is a bounded function, given by a Puiseux series of valuation in . Since is equisolvable in on , one of the curves tends to for . Since , it follows that and therefore tend to for . We conclude that , using the analyticity of .

We have shown above how to find an analytic relation for which is implied by on . We still need to show how to express as a function of . For this we use a similar technique as in the proof of Theorem 16. Consider formal variables with for all and new coordinates such that

With respect to these new coordinates, the univariate representation for (34) may be rewritten as

In a similar way as above above, we next evaluate the resultant

and prove that are analytic functions in on . The common zeros of (34) and correspond to the zeros of this resultant with respect to the new coordinates. Setting and leads to

where . Assuming that is not the zero function on , this almost yields a Kronecker representation for .

In order to obtain an actual Kronecker representation, we finally compute the Weierstrass normal form of with respect to , together with an invertible analytic function in such that . Setting , we then obtain the Kronecker representation

for on , where . If is equisolvable in on , then Lemma 18 also implies that vanishes on .

9.7.Incremental multiplicity certification

We are now in a position to certify the multiplicity of the system (11) on a sufficiently small ball. We first construct a homotopy for the system (11) which is in sufficiently general position. For notational reasons, it will be convenient to use instead of for the time parameter. We take

where .

For our main objective is to certify that the system is equisolvable in on a sufficiently small ball with around a numerical solution. For , we proceed as in section 8.4.3. Assume now that we have proved equisolvability in for the system on and that we also have a Kronecker representation

for the same system, where . Using the algorithms from section 8.4.3, we may again ensure that is equisolvable in on . Using the algorithm from section 9.6, we next compute a Kronecker representation

for the intersection , where . We will show below that on . Furthermore, vanishes on , by Lemma 18. We conclude that the system is equisolvable in on , by Proposition 17. By induction, this completes our algorithm to certify (almost) multiple roots on a sufficiently small polydisk around .

We still have to check that is not identical to zero on . Assume the contrary, so that vanishes everywhere, by analytic continuation. In other words, for any fixed , the equation admits a multiple solution in . In particular, taking , the generator of the ideal admits a multiple solution. Denoting by the solutions of the system , we have . Now was taken sufficiently small so as to ensure that for all , contradicting the assumption that admits a multiple root.

9.8.Global algebraic certification

Even if the coefficients of the system (11) are all rational or algebraic, then the computed solutions are only numeric approximations. For some applications, it is useful to have exact representations of the solutions. This allows for instance to check whether a given other polynomial with rational coefficients vanishes on the solution set or on some points of the solutions set.

The Kronecker representation provides one useful exact representation for the set of solutions. Modulo a generic linear change of coordinates, we may assume without loss of generality that for any distinct solutions of (11), their first coordinates and are also distinct. Using the algorithm in the proof of Lemma 15, we may then compute a Kronecker representation for the system (11).

Assume now that (11) has rational coefficients and that we have computed numeric approximations of with bit precision . Using the above method, we may thus compute a numeric approximation of the Kronecker representation with an accuracy of approximately bits. We apply rational number reconstruction [GG02, Chapter 5] in order to provide a guess for the Kronecker representation with rational coefficients. We may check whether this guess is correct by evaluating at over . If the check fails, then we double the bit precision, use Newton's method to improve the approximations , and keep iterating.

In the case that we suspect our system to admit multiple roots, which means that we have at least one herd of numerical solutions, the centers of herds of solutions are still known numerically with a precision of approximately bits. Consequently, we may use the above method in order to compute the radical of the zero dimensional ideal generated by the . By evaluating the over suitable algebras with nilpotent elements, we may check in a similar way as above whether the multiplicity of each root matches with the numeric multiplicity and obtain an exact algebraic representation for . In the projective setting with no solutions at infinity, this strategy can be used to provide a full algebraically certification of all solutions.

10.Systems of analytic equations

It is possible to generalize the techniques of this paper to the local resolution of a system of analytic equations on a polydisk. In this section, we outline some ideas for generalizations of this kind. As a general rule, such generalizations usually work better when the local solutions in the polydisk are well separated from the other solutions outside the polydisk.

10.1.Univariate equations

Consider the equation

where is an effective analytic function on the closed unit disk , without any roots on the unit circle. Assume that we wish to find all roots of in . Using the methods from section 8.4.3, the first step is to compute and certify the number of roots in .

Having determined the exact number of roots inside , most standard numerical methods for finding the roots of a polynomial of degree can be mimicked. For instance, we may use the homotopy

for and . In the unlucky case that we only find a subset of the roots with , we set and keep repeating the algorithm using a homotopy of the form

and for a new random choice of with . This method will ultimately pick random sufficiently close to any of the roots, thereby ensuring the termination of the method. We may also use Aberth's method, while resetting points that fall outside the unit disk to random points inside the disk.

10.2.Direct reduction to the polynomial case

Now consider a system

(39)

of equations, where are computable analytic functions on the closed unit polydisk . From now on, our aim is to determine the solutions of this system in .

For any choice of degrees , we may consider the truncated polynomials

and consider the truncated system

(40)

Using the homotopy methods from this paper, we may compute the solutions of this system and keep only those ones which are in . At a second stage, we form the homotopy

and follow the solutions of the truncated systems from to . This yields an uncertified candidate for the set of solutions to (39).

Remark 19. The above method admits several variants. For instance, if we want to avoid the explicit computation of truncated polynomials, then we may directly use the homotopy

for suitable and with . While following the homotopies, we may also decide to drop any paths which lead “too far” outside .

Given , let us now investigate how to pick . For any , we define

Typically, we now take to be the degree for which is maximal (and in any case larger than this value). We may determine this degree by first computing a bound for on a polydisk with , so that for all , whence for all . Starting with and , we now repeat the following: if , then . Else, if , then we stop and take . Otherwise, set and continue.

Let us now investigate how to certify that we found all solutions in . Since (39) is really a perturbation of (40), one idea would be ensure that for each solution of (40), the solutions remain either inside or outside for small perturbations. More precisely, for each , we start by computing a bound for on (for instance, we may take with the above notations, but is sometimes better to take more explicit coefficients for a sharper bound). We next consider the system

For each solution to (40), we now compute a ball enclosure of this solution such that any solution to with near lies in . This can be done using the ball version of Newton's method. We next check whether each of the obtained enclosures is either entirely contained in or in its complement. If this is the case, then we have obtained the desired certification. Indeed, consider a solution to (39). Then setting , we have and , for all . Hence for one of the above enclosures.

Remark 20. One disadvantage of the above certification method is that we have to keep track of both the solutions inside and outside for small perturbations of (40). An alternative idea would be to consider the set of all solutions to (39) inside as a generalized point which is the solution of a suitably deflated system, as in we did in section 9.2 for the certification of herd homotopies. This generalized solution is usually unique in a large polydisk, corresponding to large perturbations of the system of equations for . With some luck, this polydisk actually contains all sets of points in , thereby certifying that is the set of all solutions to (39) inside .

Unfortunately, this idea only works in dimension . For instance, in dimension two, the ball of systems with and does not contain a system which admits the set as its solutions. Nevertheless, it can be checked that any set of two elements in is the solution of a system in either or , where with for any .

10.3.Incremental resolution

As noticed in remark 20, one major disadvantage of the certification method from the previous section is that we need to consider the behaviour of all solutions to (40) under small perturbations, and even of those which lie far outside .

The incremental geometric resolution technique from the section 9.7 for the certification of multiple roots can also be used for finding and certifying an arbitrary set of roots in a polydisk . For this to work, it is necessary that the system (40) is equisolvable on , which means in practice that we chose a sufficiently general position and that the other roots of (40) outside are sufficiently far away from .

Remark 21. In the analytic setting, it can be interesting to compute the numeric solutions to (40) in an incremental way as well. A heuristic way to do the intersection step numerically goes as follows. We assume general position and consider the last intersection (the other intersections are done similarly, in fibers with ). Suppose that we are given a Kronecker representation

for the system , in the fiber where . Let be an upper bound for the number of roots of . For each of the roots of the equation , we continue our Kronecker representation to a new one in the fiber with . We finally use the homotopy

and perform the analytic continuation of the known solutions at until .

In the case when our method fails to prove equisolvability, we need to decompose the system (39) into simpler systems which are equisolvable. This can be done using the following techniques:

In principle, it is always possible to reduce to the equisolvable case using these techniques, but the number of required subdivisions may quickly grow out of hands. Indeed, for each solution (which we assumed to be single), there exists a sufficiently small neighbourhood where the are essentially linear functionals. The design of a good algorithm for keeping the number of subdivisions as small as possible is beyond the scope of this paper.

Bibliography

[AH83] G. Alefeld and J. Herzberger. Introduction to interval analysis. Academic Press, New York, 1983.

[ANS08] ANSI/IEEE. IEEE standard for binary floating-point arithmetic. Technical Report, ANSI/IEEE, New York, 2008. ANSI-IEEE Standard 754-2008. Revision of IEEE 754-1985, approved on June 12, 2008 by IEEE Standards Board.

[BHSW06] D. Bates, J. Hauenstein, A. Sommese and C. Wampler. Bertini: software for numerical algebraic geometry. http://www.nd.edu/~sommese/bertini/, 2006.

[BSHW08] D.J. Bates, A.J. Sommese, J.D. Hauenstein and C.W. Wampler. Adaptive multiprecision path tracking. SIAM Journal on Numerical Mathematics, 46(2):722–746, 2008.

[Dur08] C. Durvye. Algorithmes pour la décomposition primaire des idéaux polynomiaux de dimension nulle donnés en évaluation. PhD thesis, Univ. de Versailles (France), 2008.

[GG02] J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, 2-nd edition, 2002.

[GHH+97] M. Giusti, K. Hägele, J. Heintz, J.E. Morais, J.L Montaña and L.M. Pardo. Lower bounds for diophantine approximation. Journal of Pure and Applied Algebra, 117–118:277–317, 1997.

[GHMP95] M. Giusti, J. Heintz, J.E. Morais and L.M. Pardo. When polynomial equation systems can be solved fast? In G. Cohen, M. Giusti and T. Mora, editors, Proc. AAECC'11, volume 948 of Lecture Notes in Computer Science. Springer Verlag, 1995.

[GLSY05] M. Giusti, G. Lecerf, B. Salvy and J.-C. Yakoubsohn. On location and approximation of clusters of zeros of analytic functions. Foundations of Computational Mathematics, 5(3):257–311, 2005.

[GLSY07] M. Giusti, G. Lecerf, B. Salvy and J.-C. Yakoubsohn. On location and approximation of clusters of zeros: case of embedding dimension one. Foundations of Computational Mathematics, 7(1):1–49, 2007.

[HLRZ00] G. Hanrot, V. Lefèvre, K. Ryde and P. Zimmermann. MPFR, a C library for multiple-precision floating-point computations with exact rounding. http://www.mpfr.org, 2000.

[Hoe02] J. van der Hoeven. Relax, but don't be too lazy. JSC, 34:479–542, 2002.

[Hoe05] J. van der Hoeven. Effective complex analysis. JSC, 39:433–449, 2005.

[Hoe07] J. van der Hoeven. On effective analytic continuation. MCS, 1(1):111–175, 2007.

[Hoe09] J. van der Hoeven. Ball arithmetic. Technical Report, HAL, 2009. Http://hal.archives-ouvertes.fr/hal-00432152.

[Hoe11] J. van der Hoeven. Efficient root counting for analytic functions on a disk. Technical Report, HAL, 2011. Http://hal.archives-ouvertes.fr/hal-00583139.

[JKDW01] L. Jaulin, M. Kieffer, O. Didrit and E. Walter. Applied interval analysis. Springer, London, 2001.

[Kea94] R.B. Kearfott. An interval step control for continuation methods. SIAM J. Numer. Anal., 31(3):892–914, 1994.

[Kra69] R. Krawczyk. Newton-algorithmen zur bestimmung von nullstellen mit fehler-schranken. Computing, 4:187–201, 1969.

[Kul08] U.W. Kulisch. Computer Arithmetic and Validity. Theory, Implementation, and Applications. Number 33 in Studies in Mathematics. De Gruyter, 2008.

[Lan76] S. Lang. Complex analysis. Addison-Wesley, 1976.

[Lec01] G. Lecerf. Une alternative aux méthodes de réécriture pour la résolution des systes algébriques. PhD thesis, École polytechnique, 2001.

[Lec02] G. Lecerf. Quadratic Newton iteration for systems with multiplicity. Foundations of Computational Mathematics, 2(3):247–293, 2002.

[Ley09] A. Leykin. NAG. http://www.math.uic.edu/~leykin/NAG4M2, 2009. Macaulay 2 package for numerical algebraic geometry.

[LVZ06] A. Leykin, J. Verschelde and A. Zhao. Newton's method with deflation for isolated singularities of polynomial systems. TCS, 359(1–3):111–122, 2006.

[LVZ08] A. Leykin, J. Verschelde and A. Zhao. Algorithms in algebraic geometry, chapter Higher-order deflation for polynomial systems with isolated singular solutions, pages 79–97. Springer, New York, 2008.

[MB96] K. Makino and M. Berz. Remainder differential algebras and their applications. In M. Berz, C. Bischof, G. Corliss and A. Griewank, editors, Computational differentiation: techniques, applications and tools, pages 63–74. SIAM, Philadelphia, 1996.

[MB04] K. Makino and M. Berz. Suppression of the wrapping effect by Taylor model-based validated integrators. Technical Report MSU Report MSUHEP 40910, Michigan State University, 2004.

[MKC09] R.E. Moore, R.B. Kearfott and M.J. Cloud. Introduction to Interval Analysis. SIAM Press, 2009.

[MM11] A. Mantzaflaris and B. Mourrain. Deflation and certified isolation of singular zeros of polynomial systems. In A. Leykin, editor, Proc. ISSAC'11, pages 249–256. San Jose, CA, US, Jun 2011. ACM New York.

[Moo66] R.E. Moore. Interval Analysis. Prentice Hall, Englewood Cliffs, N.J., 1966.

[Mor87] A.P. Morgan. Solving polynomial systems using continuation for] engineering and scientific problems. Prentice-Hall, Englewood Cliffs, N.J., 1987.

[Neu90] A. Neumaier. Interval methods for systems of equations. Cambridge university press, Cambridge, 1990.

[OWM83] T. Ojika, S. Watanabe and T. Mitsui. Deflation algorithm for multiple roots of a system of nonlinear equations. J. of Math. An. and Appls., 96(2):463–479, 1983.

[RG10] S. Rump and S. Graillat. Verified error bounds for multiple roots of systems of nonlinear equations. Num. Algs., 54:359–377, 2010.

[Rum80] S.M. Rump. Kleine Fehlerschranken bei Matrixproblemen. PhD thesis, Universität Karlsruhe, 1980.

[Rum10] S.M. Rump. Verification methods: rigorous results using floating-point arithmetic. Acta Numerica, 19:287–449, 2010.

[Sch82] A. Schönhage. The fundamental theorem of algebra in terms of computational complexity. Technical Report, Math. Inst. Univ. of Tübingen, 1982.

[SW05] A.J. Sommese and C.W. Wampler. The Numerical Solution of Systems of Polynomials Arising in Engineering and Science. World Scientific, 2005.

[Ver96] J. Verschelde. Homotopy continuation methods for solving polynomial systems. PhD thesis, Katholieke Universiteit Leuven, 1996.

[Ver99] J. Verschelde. PHCpack: a general-purpose solver for polynomial systems by homotopy continuation. ACM Transactions on Mathematical Software, 25(2):251–276, 1999. Algorithm 795.

[Wei00] K. Weihrauch. Computable analysis. Springer-Verlag, Berlin/Heidelberg, 2000.