Effective real numbers in Mmxlib

by Joris van der Hoeven

Mathématiques, CNRS

Bâtiment 425

Université Paris-Sud

91405 Orsay Cedex

France

Email: joris@texmacs.org

Web: http://www.math.u-psud.fr/~vdhoeven

Abstract

Until now, the area of symbolic computation has mainly focused on the manipulation of algebraic expressions. Based on earlier, theoretical work, the author has started to develop a systematic C++ library Mmxlib for mathematically correct computations with more analytic objects, like complex numbers and analytic functions. While implementing the library, we found that several of our theoretical ideas had to be further improved or adapted. In this paper, we report on the current implementation, we present several new results and suggest directions for future improvements.

1 Introduction

Although the field of symbolic computation has given rise to several software packages for mathematically correct computations with algebraic expressions, similar tools for analytic computations are still largely non-existent.

Of course, a large amount of software for numerical analysis does exist, but the user generally has to make several error estimates by hand in order to guarantee the applicability of the method being used. There are also several systems for interval arithmetic, but the vast majority of them works only for fixed precisions. Finally, several systems have been developed for certified arbitrary precision computations with polynomial systems. However, such systems cannot cope with transcendental functions or differential equations.

The central concept of a systematic theory for certified computational analysis is the notion of an effective real number [17, 22, 4]. Such a number x is given by an approximation algorithm, which takes a tolerance ε > 0 on input and which produces a dyadic ε-approximation of x, i.e. a dyadic number at distance at most ε from x. One defines effective complex numbers in a similar way.

Effective real and complex numbers are a bit tricky to manipulate: although they can easily be added, multiplied, etc., there exists no test for deciding whether an effective real number is identically zero. Some interesting questions from a complexity point of view are also raised: if we want to compute an ε-approximation of a number x = f(x_1, ..., x_n), how should we determine tolerances ε_1, ..., ε_n so that the computation of ε_i-approximations of the x_i is most efficient?

Concrete approaches and implementations for computations with effective real numbers have been proposed, often independently, by several authors [12, 3, 2, 13, 11, 14, 20]. A first step in these approaches is often to implement some interval arithmetic [1, 16, 7, 15, 18]. As an optional second step, one may then provide a class real for which a real number x is given by an interval approximation algorithm which, given a tolerance ε > 0, computes a closed interval with dyadic endpoints, of radius at most ε, which contains x.

In this paper, we report on the implementation of a C++ class for effective real numbers in the Mmxlib library [21]. This implementation is based on [20], but it also contains some new ideas.

In section 2, we start by quickly reviewing interval arithmetic and the computationally more efficient variant of “ball arithmetic” (see also [1, 16, 2]). We also try to establish a more precise semantics for this kind of arithmetic in the multi-precision context and discuss the use of interval classes as parameters of template classes such as complex numbers, matrices or polynomials. Our implementation relies on the Gmp and Mpfr libraries [6, 8].

In section 3, we mainly review previous work: equivalent definitions of effective real numbers, representation by dags and the different techniques for a priori and a posteriori error estimations. We also state improved versions of the “global approximation problem”, which serve as a base for the complexity analysis of our library. We finally correct an error which slipped into [20].

In section 4, we describe the current implementation, which is based on the sole use of a posteriori error estimates.

In section 5, we prove that our implementation is optimal up to a linear overhead in the input size (4) and a logarithmic overhead in the time complexity (3). It is interesting to compare these results with previous theoretical work on the complexity of interval arithmetic [5, 9].

In the last section, we indicate how to use a priori error estimates in a more efficient way than in [20]. In a forthcoming paper, we hope to work out the details and remove the linear overhead (4) in the input size, at least under certain additional assumptions.

2 Interval arithmetic

Since real numbers cannot be stored in finite memory, a first approach to certified computations with real numbers is to compute with intervals instead of numbers. For instance, when using interval arithmetic, a real number such as π would be approximated by an interval such as [3.14159, 3.14160]. Evaluation of a real function f at a point of an interval [a, b] then corresponds to finding an interval [c, d] with f(x) ∈ [c, d] for all x ∈ [a, b]. When all functions under consideration are Lipschitz continuous and when all computations are performed using a sufficiently high precision, this technique provides arbitrarily good approximations for the real numbers we want to compute.

2.1 Representation issues

Our current interval library is based on MPFR [8], which implements a generalization of the IEEE 754 standard. More precisely, let w be the word precision of the computer under consideration. Given a bit precision n and a maximal exponent fixed by the user, the MPFR library implements arithmetic on n-bit floating point numbers with exponents in a bounded range, with exact rounding. We recall that the IEEE 754 norm (and thus MPFR) includes the special numbers +∞, −∞ and NaN (not a number). Assuming that the maximal exponent has been fixed once and for all, we will denote by 𝔻_n the set of n-bit floating point numbers. Several representations are possible for the computation with n-bit intervals:

Endpoint representation
In this representation, an n-bit interval [a, b] is determined by its end-points a, b ∈ 𝔻_n with a ≤ b. We also allow for the exceptional value . Since the MPFR library provides exact rounding, it is particularly simple to base an interval library on it, when using this representation: whenever a function is monotonic on a certain range, it suffices to consider the values at the end-points with opposite rounding modes. However, every high precision evaluation of an interval function requires two high precision evaluations of floating point numbers.

Ball representation
In this representation, an n-bit interval is represented by a ball with center c ∈ 𝔻_n and a low precision radius r. If and , then we also require that the ball is normal in the sense that . The underlying idea is that the endpoints of a high precision interval are usually identical apart from a few bits at the end, whence it is more efficient to only store the common part and the difference. As a consequence, one high precision operation on balls reduces to one high precision operation on floating point numbers and several operations on low precision numbers.

However, the ball representation has essentially less expressive power. For instance, it is impossible to represent the interval . Also, positivity is not naturally preserved by addition if . This may be problematic in the case we want to compute quantities like .

Remark 1. The normality condition admits several variants. For instance, given , we define the step as the exponent of minus . A ball may then be called normal if instead of . Alternatively, one may require .

Hybrid representation
In Mmxlib, we currently use the endpoint representation for low precision numbers and the ball representation for high precision numbers . Modulo some additional overhead, this combines the advantages of both representations while removing their drawbacks.
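
To make the difference between the two representations concrete, here is a minimal sketch of interval addition in both styles on top of MPFR; the types endpoint_interval and ball, as well as the way the rounding error of the center is bounded, are simplifications made for the purpose of illustration and not the actual Mmxlib data structures.

#include <mpfr.h>
#include <cmath>

struct endpoint_interval { mpfr_t lo, hi; };   // the interval [lo, hi]
struct ball { mpfr_t c; double r; };           // n-bit center, low precision radius

// Endpoint representation: two high precision operations with outward rounding.
void add (endpoint_interval& z, const endpoint_interval& x,
          const endpoint_interval& y, mpfr_prec_t n) {
  mpfr_inits2 (n, z.lo, z.hi, (mpfr_ptr) 0);
  mpfr_add (z.lo, x.lo, y.lo, MPFR_RNDD);      // round towards -infinity
  mpfr_add (z.hi, x.hi, y.hi, MPFR_RNDU);      // round towards +infinity
}

// Ball representation: one high precision operation plus low precision bookkeeping.
void add (ball& z, const ball& x, const ball& y, mpfr_prec_t n) {
  mpfr_init2 (z.c, n);
  mpfr_add (z.c, x.c, y.c, MPFR_RNDN);         // the only n-bit operation
  double ulp = mpfr_zero_p (z.c) ? 0.0         // bound for the rounding error of the center
             : std::ldexp (1.0, (int) (mpfr_get_exp (z.c) - n));
  z.r = x.r + y.r + ulp;                       // a real implementation would round r upwards
}

In the ball case, the radius bookkeeping only involves a few machine operations, which is why one high precision ball operation costs about one, rather than two, high precision floating point operations.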

2.2 Precision changes and semantics

Let denote the set of -bit intervals for the hybrid representation. If , then it should be noticed that the set is not stable under the usual arithmetic operations, due to the phenomenon of precision loss. Indeed, the sum of is typically computed using and , where is a small bound for the rounding error. However, if , then is not necessarily normal.

Nevertheless, the set is stable under all usual arithmetic operations. Indeed, any ball or interval may be normalized by replacing by a lower bit approximation. More precisely, consider an abnormal interval with and . Given , let and , so that . Let be minimal such that , and for some . Then is called the normalization of . If no such exists, then is defined to be the smallest interval with endpoints in , which contains . It can be shown that and .

Remark 2. The normalization procedure is not very canonical and it admits several variants (in particular, it has to be adapted whenever we change the definition of normality). Nevertheless, all good normalization procedures share the property that for some small fixed constant .

Dually, it may occur that the result of an operation can be given with more precision than the argument. For instance, if , then can be computed with a precision of about binary digits. Similarly, can be computed with a precision of about binary digits. We call this the phenomenon of precision gain. The point here is that it is not necessarily desirable to compute the results of and with the maximal possible precision.

Taking into account the phenomena of precision loss and gain, we propose the following “ideal” semantics for operations on intervals. First of all, a default precision for computations is fixed by the user. Now assume that we wish to evaluate an -ary function at intervals , where . Then there exists a smallest interval with , which satisfies either

Whenever , then is taken to be the smallest interval of which contains . Otherwise, we take .

Remark 3. For some purposes one may use the alternative convention for exceptions that is the smallest interval which contains all non exceptional values.

Since the normalization procedure is somewhat arbitrary (see remark 2), the ideal semantics may be loosened a little bit for implementation purposes. Instead of requiring the optimal return value , we rather propose to content oneself with a return value with and , for some fixed small constant , in the case when has precision . This remark also applies to the underlying MPFR layer: exact rounding is not really necessary for our purpose. It would be sufficient to have a “looser” rounding mode, guaranteeing results up to times the last bit.

2.3 Complex numbers and other template types

A tempting way to implement the complex analogue of interval arithmetic in C++ is to use the complex template class from the standard library. Unfortunately, this approach leads to a lot of overestimation for the bounding rectangles. In order to see this, consider a small complex rectangle around a point z and the sequence z, uz, u²z, ..., where u = (1 + i)/√2. Because multiplication with u “turns” the bounding rectangle by 45 degrees, the error is roughly multiplied by √2 at each step. In other words, we lose one bit of precision every two steps.

The above phenomenon can be reduced in two ways. First of all, one may use a better algorithm for computing the powers of u, like repeated squaring. In the case of complex numbers though, the best solution is to systematically use complex ball representations. However, standardization of the operations requires more effort. Indeed, given an operation on balls of a given precision, it can be non-trivial to design an algorithm which computes a ball of almost minimal radius (up to a small relative error).
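
The following sketch illustrates the complex ball idea on multiplication, using machine doubles for readability; the cball type and the crude bound for the rounding error of the center are assumptions, not the Mmxlib implementation. Since a disk remains a disk under rotation, only the genuine terms |c_1| r_2 + |c_2| r_1 + r_1 r_2 enter the radius, and the systematic loss of one bit every two steps disappears.

#include <complex>
#include <limits>

struct cball {                               // the set of all z with |z - c| <= r
  std::complex<double> c;
  double r;
};

cball mul (const cball& x, const cball& y) {
  cball z;
  z.c = x.c * y.c;
  // exact bound for the product of the two balls ...
  double exact = std::abs (x.c) * y.r + std::abs (y.c) * x.r + x.r * y.r;
  // ... plus a crude bound for the rounding errors made while computing the center
  double rounding = 8.0 * std::numeric_limits<double>::epsilon ()
                        * std::abs (x.c) * std::abs (y.c);
  z.r = exact + rounding;
  return z;
}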

The precision loss phenomenon is encountered more generally when combining interval arithmetic with template types. The best remedy is again to modify the algorithms and/or data types in such a way that the errors in the data are all of a similar order of magnitude. For instance, when computing a monodromy matrix as a product of connection matrices, it is best to compute this product by dichotomy (see the sketch below). Similarly, when computing the product of two truncated power series, it is good to first perform a change of variables which makes the errors in the coefficients of the two series of the same order of magnitude [19, Section 6.2].
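
For the matrix example, a generic sketch of the dichotomic product might look as follows; Matrix is a placeholder for an interval or ball matrix type with a (non-commutative) operator *, and the order of the factors is preserved.

#include <cstddef>
#include <vector>

// Product ms[lo] * ms[lo+1] * ... * ms[hi-1], computed by dichotomy so that the
// two operands of every multiplication carry errors of comparable magnitude.
// Precondition: lo < hi.
template<typename Matrix>
Matrix dichotomic_product (const std::vector<Matrix>& ms, std::size_t lo, std::size_t hi) {
  if (hi - lo == 1) return ms[lo];
  std::size_t mid = lo + (hi - lo) / 2;
  return dichotomic_product (ms, lo, mid) * dichotomic_product (ms, mid, hi);
}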

3 Effective real numbers

For users of computer algebra systems, it would be convenient to provide a data type for real numbers which can be used in a similar way as the types of rational numbers, polynomials, etc. Since interval arithmetic already provides a way to perform certified computations with “approximate real numbers”, this additional level of abstraction should mainly be thought of as a convenient interface. However, due to the fact that real numbers can only be represented by infinite structures like Cauchy sequences, their manipulation needs more care. Also, the actual implementation of a library of functions on effective real numbers raises several interesting computational complexity issues. In this section, we review some previous work on this matter.

3.1 Definitions and theoretical properties

Let 𝔻 = ℤ[1/2] denote the set of dyadic numbers. Given x ∈ ℝ and ε ∈ 𝔻 with ε > 0, we recall from the introduction that an ε-approximation of x is a dyadic number at distance at most ε from x. We say that x is effective, if it admits an approximation algorithm, which takes ε on input and which returns an ε-approximation for x. The asymptotic time complexity of such an approximation algorithm is the time it takes to compute a 2^(-n)-approximation for x, when n → ∞. We denote by ℝ_eff the set of effective real numbers.

The above definition admits numerous variants [22, section 4.1]. For instance, instead of requiring an approximation algorithm, one may require the existence of an algorithm which, for each ε ∈ 𝔻 with ε > 0, computes a closed interval with end-points in 𝔻 which contains x and has radius at most ε (such an interval will also be called an ε-bounding interval for x). Similarly, one may require the existence of an effective and rapidly converging Cauchy sequence x_0, x_1, ..., i.e. a sequence for which there exists a number C with |x_n - x| ≤ C 2^(-n) for all n.

All these definitions have in common that an effective real number is determined by an algorithm which provides more and more precise approximations of on demand. In an object oriented language like C++, this can be implemented by providing an abstract representation class real_rep with a purely virtual method approximate which corresponds to this approximation algorithm. The class real is implemented as a pointer to real_rep.
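
In C++, this interface may be sketched as follows, ignoring memory management and reference counting; the approximation method returns a dyadic number, as in the definition above, whereas the actual Mmxlib classes, described in section 4, work with bounding intervals instead.

class real_rep {
public:
  virtual ~real_rep () {}
  // return a dyadic approximation of the represented number with error at most err
  virtual dyadic approximate (const dyadic& err) = 0;
};

class real {
  real_rep* rep;                       // reference counted in the actual library
public:
  real (real_rep* rep2): rep (rep2) {}
  real_rep* operator -> () const { return rep; }
};

With these declarations, an expression like x->approximate (err/2) simply calls the approximation method of the representation object behind x, as in the addition example below.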

Since effective real numbers should be thought of as algorithms, the halting problem for Turing machines can be reduced to the zero-test problem for effective real numbers. Consequently, there exist no algorithms for the basic relations =, ≠, ≤, <, ≥ and > on ℝ_eff.

Given an open domain Ω of ℝ, a real function f: Ω → ℝ is said to be effective if there exists an algorithm which takes an approximation algorithm for x ∈ Ω on input and which produces an approximation algorithm for f(x). Here we understand that the produced algorithm approximates the same number f(x) whenever the input is replaced by another approximation algorithm for the same x.

Most common operations, like +, -, ×, /, exp, log, sin, cos, etc., can easily be shown to be effective. On the other hand, without any of the operations for comparison, it seems more difficult to implement discontinuous functions like the sign function. In fact, it turns out that effective real functions are necessarily continuous [22, theorem 1.3.4].

3.2 Dag models

A concrete library for computations with effective real numbers consists of a finite number of functions like +, -, ×, /, exp, log, etc. Given inputs x_1, ..., x_n of type real, such an operation f should produce a new instance f(x_1, ..., x_n) of real. Usually, the representation class for f(x_1, ..., x_n) in particular contains members for x_1, ..., x_n, which can then be used in the method which implements the approximation algorithm for f(x_1, ..., x_n). For instance, a very simple implementation of addition might look as follows:

class add_real_rep: public real_rep {
  real x, y;
public:
  add_real_rep (const real& x2, const real& y2):
    x (x2), y (y2) {}
  dyadic approximate (const dyadic& err) {
    // an (err/2)-approximation of x plus an (err/2)-approximation of y
    // yields an err-approximation of x + y
    return x->approximate (err/2) + y->approximate (err/2);
  }
};

When implementing a library of effective real functions in this way, we notice in particular that any effective real number computed by the library keeps the entire expression by which it was computed in memory. Such effective real numbers may therefore be modeled faithfully by rooted dags (directed acyclic graphs), whose nodes are labeled by the functions of the library. More generally, finite sets of effective real numbers can be modeled by general dags of this type. Figure 1 shows an example of such a dag, together with some parameters for measuring its complexity.

Since storing entire computations in memory may require a lot of space, the bulk of computations should not be done on the effective real numbers themselves, but rather in their approximation methods. In particular, real should not be thought of as some kind of improved double type, which can be plugged into existing numerical algorithms: the real class rather provides a user-friendly high-level interface, for which new algorithms need to be developed.

Figure 1. Example of a dag with roots. The dag has size (i.e. the total number of nodes) and depth 3 (i.e. the longest path from a root to a leaf). The weight of the dag corresponds to the sum of the sizes of the trees obtained by “copying” each of the roots. In our example, the weight is . Finally, the ancestrality of the dag is defined to be the maximum number of ancestors of a leaf. In our example, the ancestrality is .
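
To make these parameters concrete, the following sketch computes the size, the weight and the depth for a hypothetical node type which only models the dag structure; the ancestrality would in addition require collecting, for each leaf, its set of ancestors.

#include <cstddef>
#include <algorithm>
#include <map>
#include <set>
#include <vector>

struct node {                                   // hypothetical dag node
  std::vector<const node*> children;            // empty for a leaf
};

// size: number of distinct nodes reachable from the given roots
void collect (const node* n, std::set<const node*>& seen) {
  if (!seen.insert (n).second) return;          // node already visited
  for (const node* c : n->children) collect (c, seen);
}

std::size_t dag_size (const std::vector<const node*>& roots) {
  std::set<const node*> seen;
  for (const node* r : roots) collect (r, seen);
  return seen.size ();
}

// weight of one root: number of nodes of the tree obtained by "copying" the dag;
// the weight of the whole dag is the sum of the weights of its roots
long weight (const node* n, std::map<const node*, long>& memo) {
  std::map<const node*, long>::iterator it = memo.find (n);
  if (it != memo.end ()) return it->second;
  long w = 1;
  for (const node* c : n->children) w += weight (c, memo);
  return memo[n] = w;
}

// depth: length of the longest path from n down to a leaf (memoization omitted)
long depth (const node* n) {
  long d = 0;
  for (const node* c : n->children) d = std::max (d, 1 + depth (c));
  return d;
}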

3.3 The global approximation problems

Let be a library of effective real functions as in the previous section, based on a corresponding library of functions on intervals. In order to study the efficiency of our library, it is important to have a good model for the computational complexity. In this section, we will describe a static and a dynamic version of the global approximation problem, which are two attempts to capture the computational complexity issues in a precise manner.

In its static version, the input of the global approximation problem consists of

Denoting by the function associated to the node and by its children, we may recursively associate a real value to . On output, we require for each node an interval with endpoints in , such that

Notice that the second condition implies in particular that .

The dynamic version of the global approximation problem consists of a sequence of static global approximation problems for a fixed labeled dag , when we incrementally add challenges for nodes . More precisely, we are given

On output, we require for each a solution to the -th static global approximation problem, which consists of the labeled dag with challenges . Here we understand that a solution at the stage may be presented as the set of changes w.r.t. the solution at stage .

Let us explain why we think that the dynamic global approximation problem models the complexity of the library in an adequate way. For this, consider a computation by the library. The set of all effective real numbers constructed during the computation forms a labeled dag . The successive calls of the approximation methods of these numbers naturally correspond to the sequence .

It is reasonable to assume that the library itself does not construct any real numbers, i.e. all nodes of correspond to explicit creations of real numbers by the user. Indeed, if new numbers are created from inside an approximation method, then all computations which are done with these numbers can be seen as parts of the approximation method, so they should not be taken into account during the complexity analysis. Similarly, if the constructor of a number induces the construction of other real numbers, then may be expressed in terms of more basic real functions, so we may consider as a function outside our library.

Now assume that another, possibly better library were used for the same computation. It is reasonable to assume that the corresponding dag and challenges coincide with the previous ones. Indeed, even though it might happen that the first and second library return different bounding intervals and for a given challenge , the libraries cannot know what the user wants to do with the result. Hence, for a fair comparison between the first and second library, we should assume that the user does not take advantage out of possible differences between and . This reduces to assuming that , and for all .

Finally, it is reasonable to assume that all actual approximations of the are done using a fixed interval library . This means for instance that the second library has no better algorithms for multiplication, exponentiation, etc. than the first one. When putting all our “reasonable assumptions” together, the time of the computation which was spent in the library now corresponds to the time which was required to solve the corresponding dynamic global approximation problem.

3.4 A priori error estimates

Let us now consider the problem of obtaining an ε-approximation for the result of an operation f(x_1, ..., x_n). For simplicity, we will focus on the case of an addition x_1 + x_2. In this and the next section, we briefly recall several strategies, which are discussed in more detail in [20].

In the case of a priori error estimates, the tolerance ε is distributed a priori over x_1 and x_2. In other words, we first determine ε_1 and ε_2 with ε_1 + ε_2 ≤ ε, next compute ε_i-approximations for the x_i, and finally add the results. The systematic choice ε_1 = ε_2 = ε/2 can be very inefficient: in the case of badly balanced trees like in figure 2 (this occurs in practice when evaluating polynomials using Horner's rule), this requires the approximation of the deepest leaf with a much lower tolerance than the topmost one (of the order of ε/2^d for a tree of depth d, versus ε/2).

This problem can be removed by balancing the error according to the weights w_1 and w_2 of x_1 and x_2 (i.e., by taking ε_i = ε w_i / (w_1 + w_2)). For “non degenerate” cases of the global approximation problem for a dag of weight and size , it can be shown that this technique requires tolerances which are never worse than times the optimal ones.
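
In the style of the simple add_real_rep from section 3.2, the weight-balanced splitting could be sketched as follows; the weight fields and the mixed dyadic/double arithmetic are assumptions made for the purpose of illustration, and the weights would in practice be maintained by the library.

class bal_add_real_rep: public real_rep {
  real x, y;
  double wx, wy;                          // assumed: weights of the subdags of x and y
public:
  bal_add_real_rep (const real& x2, const real& y2, double wx2, double wy2):
    x (x2), y (y2), wx (wx2), wy (wy2) {}
  dyadic approximate (const dyadic& err) {
    dyadic ex= err * (wx / (wx + wy));    // the heavier argument receives a larger share
    dyadic ey= err - ex;                  // so that ex + ey = err
    return x->approximate (ex) + y->approximate (ey);
  }
};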

Unfortunately, while implementing the algorithms from [20], it turned out that is often of the same order as and therefore far from good enough. This is for instance the case when the expressions are obtained via some iterative process or as the coefficient of a lazy power series. For this reason, we have currently abandoned the use of a priori error estimates in our implementation. However, this situation is quite unsatisfactory, since this technique is still most efficient in many cases. We will come back to this problem in section 6.

Figure 2. A badly balanced tree.

3.5 A posteriori error estimates and relaxed evaluation

A second strategy consists of computing error estimates a posteriori: if we want to compute an -approximation for , we start with the computation of a bounding interval for at precision . As long as the obtained result is not sufficiently precise, we keep doubling the precision and repeating the same computation.

As explained in [20], this strategy can be optimized in two ways. First of all, the strategy may be carried out locally, by storing a “best available approximation” (together with the corresponding precision) for each instance of real. Indeed, when increasing the precision for the computation of , sufficiently precise approximations for and might already be known, in which case their recomputation is unnecessary.

Secondly, instead of doubling the precision at each step, it is better to double the expected computation time. For instance, consider the computation of , where has time complexity (i.e. admits an approximation algorithm, whenever admits a approximation algorithm). Evaluate at successive precisions , where and is the smallest precision at which the evaluation yields a sufficiently precise result. Then the total computation time never exceeds by a factor more than (see also [10]).

Unfortunately, an error slipped into [20], because the successive recursive approximations of may not be sufficiently precise in order to allow for evaluations of at successive precisions . For instance, if is given by an algorithm of exponential time complexity , then successive approximations of will only yield one more bit at every step. This error can be repaired up to a logarithmic factor in two ways. First of all, we notice that the error only concerns the cumulative cost of the successive reevaluation of . In section 5, we will prove that the total cost of reevaluating all nodes of the dag remains good.

Secondly, it is possible to adapt the technique of relaxed formal power series to real numbers. Roughly speaking, this approach relies on the recursive decomposition of a “relaxed mantissa” of length into a fixed part of length and a relaxed remainder (so that ). Given an operation , we then compute and at precision and obtain a formula for the relaxed decomposition , since is a truncation of and . As soon as the precision of exceeds , we take a new value for and recompute and at a doubled precision. Working out the details of this construction shows that most common real functions can be evaluated in a relaxed way with the same complexity as usual, multiplied by an overhead.

However, the relaxed strategy requires a lot of additional implementation work, while yielding no noticeable improvement with respect to the global bound (3) which will be proved in section 5. Therefore, it is mainly interesting from a theoretical point of view.

4 Effective real numbers in Mmxlib

4.1 The classes real and real_rep

Inside Mmxlib, dyadic numbers in are represented using generalized floating point numbers in , where is bounded by a precision of the order of or .

Effective real numbers (of type real) are implemented as pointers to an abstract representation class real_rep with a virtual method for the computation of ε-bounding intervals. Usually, such a number is of the form f(x_1, ..., x_n), where f is an effective real function and x_1, ..., x_n are other effective real numbers. The number f(x_1, ..., x_n) is concretely represented by an instance of a class f_real_rep which derives from real_rep and which has fields corresponding to x_1, ..., x_n.

The current implementation is based on the technique of a posteriori error bounds from section 3.5, with the two optimizations mentioned there: remembering the best currently available approximation for each real number and doubling computation times instead of precisions. These strategies are reflected as follows in the real_rep data type:

class real_rep {
protected:
  double   cost;   // intended cost of the last call of compute ()
  interval best;   // best bounding interval computed so far
  real_rep (): cost (1.0) { compute (); }
  virtual int as_precision (double cost);  // convert an intended cost into a bit precision
  virtual interval compute ();             // recompute best at the current precision
public:
  interval improve (double new_cost);
  interval approximate (const dyadic& err);
};

The field best corresponds to the best currently available bounding interval for the represented number. The value of best is recomputed several times by the purely virtual method compute at increasing intended costs, the last one of which is stored in cost. More precisely, best is recomputed as a function of approximations of the arguments x_1, ..., x_n at the same costs. When these approximations are sufficiently precise, then the cost of the computation of best will be more or less equal to cost. Otherwise, the actual computation may take less time (see the discussion at the end of section 3.5).

The costs are normalized (we start with 1.0) and doubled at each iteration. The purely virtual method as_precision is used to convert an intended cost to the corresponding intended precision.

The user interface is given by the routines improve and approximate. The first one computes an approximation of at intended cost new_cost:

interval real_rep::improve (double new_cost) {
  if (new_cost <= cost) return best;
  cost= max (new_cost, 2.0 * cost);
  set_precision (as_precision (cost));
  best= compute ();
  restore_precision ();
  return best;
}

The method approximate returns an ε-bounding interval for the represented number as a function of the tolerance ε:

interval real_rep::approximate (const dyadic& eps) {
  while (radius (best) >= eps)
    (void) improve (2.0 * cost);
  return best;
}

Remark 4. In practice, the method improve also avoids the call of compute if the new precision associated to cost is equal to the old one. This may indeed happen if the cost of the operation increases more than linearly as a function of the bit precision.
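
One possible way to realize this optimization is sketched below; this is a variant of the improve method above rather than the literal Mmxlib code.

interval real_rep::improve (double new_cost) {
  if (new_cost <= cost) return best;
  int old_prec= as_precision (cost);
  cost= max (new_cost, 2.0 * cost);
  int new_prec= as_precision (cost);
  if (new_prec == old_prec) return best;   // same precision: recomputation is useless
  set_precision (new_prec);
  best= compute ();
  restore_precision ();
  return best;
}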

4.2 Examples of derived classes of real_rep

Let us illustrate the mechanism from the previous section in the case of exponentiation. The exponential of a number is represented by an instance of

class exp_real_rep: public real_rep {
  real x;
  int as_precision (double cost);
  interval compute ();
public:
  exp_real_rep (const real& x2): x (x2) {}
};

The computation of n bits of exp x takes a time which is roughly proportional to n² for small values of n and more or less linear for large values of n. Therefore, a simple implementation of as_precision would be as follows:

int exp_real_rep::as_precision (double cost) {
  if (cost <= 256.0) return (int) sqrt (cost);    // quadratic regime: cost ~ prec^2
  return min (MAX_PREC, (int) (cost / 16.0));     // roughly linear regime: cost ~ 16 prec
}

Of course, this is a very rough approximation of the real time complexity of the exponential function. For the theoretical bounds in the next sections, better approximations are required. In practice however, a simple implementation like the above one is quite adequate. If necessary, one may implement a more precise algorithm, based on benchmarks. One may also gradually increase precisions and use a timer. The actual approximation of exp x is done using the overloaded function exp on intervals:

interval exp_real_rep::compute () {
  return exp (x->improve (cost));
}

In the case of functions with arity more than one it is often possible to avoid unnecessarily precise computations of one of the arguments, when the approximations of the other argument are far less precise. For instance, in the case of addition, compute may be implemented as follows:

interval add_real_rep::compute () {
  dyadic eps= pow (2.0, -BITS_IN_WORD);
  if (radius (y->best) < eps * radius (x->best)) {
    // y is already far more precise than x: improve x at full cost and
    // improve y only until its error becomes negligible again
    (void) x->improve (cost);
    while (y->cost < cost &&
           radius (y->best) >= eps * radius (x->best))
      (void) y->improve (2.0 * y->cost);
    return x->best + y->best;
  }
  else if (radius (x->best) < eps * radius (y->best)) {
    // symmetric case with x and y interchanged
    (void) y->improve (cost);
    while (x->cost < cost &&
           radius (x->best) >= eps * radius (y->best))
      (void) x->improve (2.0 * x->cost);
    return x->best + y->best;
  }
  else return x->improve (cost) + y->improve (cost);
}

5 Complexity analysis

Notice that our Mmxlib implementation, as outlined in the previous section, naturally solves both the static and the dynamic versions of the global approximation problem: we first construct the dag and then either compute an -approximation for each , or successive -approximations for each (). In this section, we examine the efficiency of this approach.

5.1 Total complexity versus final complexity

Since is approximated several times during our algorithm, let us first study the difference between the total computation time and the time taken by the final and most precise approximations of the .

For each node , let be the successive timings for the approximation of . We will also denote by the intended computation times and precisions. By construction, we have for all and . For each , let and . We define , and .

We already warned against the possibility that . Nevertheless, we necessarily have if is a leaf. Also, any operation of cost triggers an operation of cost for one of the children of . By induction, it follows that there exists at least one leaf descending from which really spends a time . Hence, denoting by the ancestrality of the dag and by its subset of leaves, we have

(1)

We also have since for all . Consequently,

(2)

The bound (1) is sharp in the case when the dag has only one leaf and a) the computation of an digit approximation of requires exponential time; b) all other operations can be performed in linear or polynomial time. A similar situation occurs when cancellations occur during the computation of , in which case the computation of at many bits precision still produces a -bit result.

A variant of (2), which is usually better, is obtained as follows. Since the precision of the result of an operation on intervals increases with the precision of the arguments, and similarly for the computation times, we have . Let be a node (which can be assumed to be a leaf by what precedes) for which is maximal. Then

It follows that

(3)

since .

5.2 Final complexity versus optimal complexity

Let us now compare with the computation time for an optimal solution to the global approximation problem. In fact, it suffices to compare with an optimal solution for the static version: in the dynamic case, we consider the last static global approximation problem.

Denote by the computation time at each node for a fixed optimal solution, so that . If is the size of the dag, then we claim that

(4)

For simplicity, we will assume that whenever is a descendant from . This is no longer the case if we apply the optimization from the end of section 4.1, but the reasoning can probably be adapted to that case.

Assume for contradiction that (4) does not hold. Now consider the moment during the execution at which we first call improve with a maximal cost for some node . At that point, the current cost of each of the descendants of is . When such a is a leaf, it follows that . By structural induction over the descendants of , it follows that and the best available (resp. optimal) approximation (resp. ) for satisfies . In particular . On the other hand, the first call of improve with a maximal cost was necessarily triggered by approximate, whence . This contradiction proves our claim.

Up to a constant factor, the bound (4) is sharp. Indeed, consider the case of a multiplication of numbers which are all zero. When gradually increasing the precisions for the computation of , it can happen that one of the produces bounding intervals whose radii quickly converge to zero, contrary to each of the other . In that case, the time spent on improving each of the () is a waste, whence we lose a factor with respect to the optimal solution. On the other hand, without additional knowledge about the functions , it is impossible to design a deterministic procedure for choosing the most efficient index . In this sense, our current solution is still optimal. However, under additional monotonicity hypotheses on the cost functions, efficient indices can be found, by taking into account the “cost per digit”.

6 Back to a priori error estimates

Although the approach from the previous section has the advantage of never being extremely bad and rather easy to implement on top of an existing layer for interval arithmetic, there are even simple cases in which the factor in the bound (3) is not necessary: in the truncated power series evaluation with for all , the computation of a -approximation of induces the computation of -bit approximations of each of the . If , this means that we spend a time instead of .

In order to remedy this problem, we suggest improving the balanced a priori estimate technique from [20] and cleverly recombining it with the current approach. In this section, we will briefly sketch how this could be done. The results are based on joint ideas with V. Kreinovich, which we plan to work out in a forthcoming paper.

6.1 Rigid dags

Let us start by isolating those situations in which a priori error estimates should be efficient. Consider a labeled dag so that admits an initial interval approximation for each . Assume also that for each node and each child of , we have an interval with . If , then we say that (together with the ) is a Lipschitz dag. If, in addition, we have for some and all , then we say that is -rigid.

A typical obstruction to the Lipschitz property occurs in dags like . Similarly, a dag like is typically Lipschitz, but not rigid. Given a Lipschitz dag, a variant of automatic differentiation provides us with bounds for the error in in terms of the errors in the , where ranges over the leaves below . If is -rigid, and especially when , then these bounds actually become very sharp.

For instance, given a rooted Lipschitz dag and a challenge at the root , one may compute a sufficient precision for obtaining an -approximation of as follows. Let be the error in at each leaf , when computing with precision . We have for some which depends on . Then we recursively estimate the error at each node by

This provides us with a bound of the form for the error at the root . We may thus take . The approach can be further improved using similar ideas as in the implementation of addition in section 4.2.
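
A sketch of this forward error estimation is given below; the node type is hypothetical and we assume that, at each node, bounds lip[i] for the partial derivatives of the corresponding function on the initial enclosures are already available, as provided by the Lipschitz dag structure.

#include <cstddef>
#include <vector>

struct lnode {                         // hypothetical node of a Lipschitz dag
  std::vector<const lnode*> children;  // empty for a leaf
  std::vector<double> lip;             // lip[i] bounds |df/dx_i| on the enclosures
  double leaf_err;                     // error of the initial approximation (leaves only)
};

// error bound at n obtained by propagating the leaf errors upwards
// (memoization omitted for brevity)
double error_bound (const lnode* n) {
  if (n->children.empty ()) return n->leaf_err;
  double e = 0.0;
  for (std::size_t i = 0; i < n->children.size (); i++)
    e += n->lip[i] * error_bound (n->children[i]);
  return e;
}

Choosing the working precision so that the propagated bound at the root drops below ε then yields the procedure described above.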

6.2 Backward error bounds

Instead of working with a fixed precision , a better idea is to compute the contribution of the error at each leaf to the error at . This problem is dual to the problem of automatic differentiation, since it requires us to look at the opposite dag of , which is obtained by inverting the direction of the edges. Indeed, if denote all parents of a node , and if is the -th child of for each , then we take

Together with the initial condition , this allows us to compute for all leaves . In order to compute with error , we may now balance over the leaves according to the . More precisely, we compute an -approximation of each , where is the number of leaves, and recompute all other nodes using interval arithmetic. As an additional optimization, one may try to balance according to the computational complexities of the .
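
Dually to the forward sketch of section 6.1, the contributions of the individual nodes to the error at the root can be accumulated in a single pass over the opposite dag; in the sketch below, the node type is hypothetical, the nodes are assumed to be listed in an order in which every parent precedes its children, and lip_from_parent[i] is assumed to bound the partial derivative of the i-th parent with respect to the node.

#include <cstddef>
#include <vector>

struct bnode {                              // hypothetical node for the backward pass
  std::vector<const bnode*> parents;        // edges of the opposite dag
  std::vector<double> lip_from_parent;      // |d parent_i / d this| on the enclosures
  double contrib;                           // contribution factor to the error at the root
};

// propagate contributions from the root (contrib = 1) down to the leaves;
// `order' lists the nodes so that every parent occurs before its children
void backward_contributions (const std::vector<bnode*>& order, bnode* root) {
  for (bnode* n : order) n->contrib = 0.0;
  root->contrib = 1.0;
  for (bnode* n : order)
    if (n != root)
      for (std::size_t i = 0; i < n->parents.size (); i++)
        n->contrib += n->parents[i]->contrib * n->lip_from_parent[i];
}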

The above strategy is a bit trickier to implement in an incremental way. Indeed, in the dynamic global approximation problem, we ask for -approximations at different nodes and, since good previous approximations may already be present, it is not always necessary to compute the complete dag below . A solution to this problem is to keep track of the “creation date” of each node and to compute the from the top down to the leaves, while first considering nodes with the latest creation date (i.e. by means of a heap). Whenever the computed is so small that the current error at contributes only marginally to the error at the top (i.e. ), then it is not necessary to consider the descendants of .

Bibliography

[1]

G. Alefeld and J. Herzberger. Introduction to interval analysis. Academic Press, 1983.

[2]

J. Blanck. General purpose exact real arithmetic. Technical Report CSR 21-200, Luleå University of Technology, Sweden, 2002. http://www.sm.luth.se/~jens/.

[3]

J. Blanck, V. Brattka, and P. Hertling, editors. Computability and complexity in analysis, volume 2064 of Lect. Notes in Comp. Sc. Springer, 2001.

[4]

A. Edalat and P. Sünderhauf. A domain-theoretic approach to real number computation. TCS, 210:73–98, 1998.

[5]

A. Gaganov. Computational complexity of the range of the polynomial in several variables. Cybernetics, pages 418–425, 1985.

[6]

T. Granlund et al. GMP, the GNU multiple precision arithmetic library. http://www.swox.com/gmp, 1991–2006.

[7]

M. Grimmer, K. Petras, and N. Revol. Multiple precision interval packages: Comparing different approaches. Technical Report RR 2003-32, LIP, École Normale Supérieure de Lyon, 2003.

[8]

G. Hanrot, V. Lefèvre, K. Ryde, and P. Zimmermann. MPFR, a C library for multiple-precision floating-point computations with exact rounding. http://www.mpfr.org, 2000–2006.

[9]

V. Kreinovich. For interval computations, if absolute accuracy is NP-hard, then so is relative accuracy+optimization. Technical Report UTEP-CS-99-45, UTEP-CS, 1999.

[10]

V. Kreinovich and S. Rump. Towards optimal use of multi-precision arithmetic: a remark. Technical Report UTEP-CS-06-01, UTEP-CS, 2006.

[11]

B. Lambov. The RealLib project. http://www.brics.dk/~barnie/RealLib, 2001–2006.

[12]

V. Ménissier-Morain. Arbitrary precision real arithmetic: design and algorithms. http://www-calfor.lip6.fr/~vmm/documents/submission_JSC.ps.gz, 1996.

[13]

N. Müller. iRRAM, exact arithmetic in C++. http://www.informatik.uni-trier.de/iRRAM/, 2000–2006.

[14]

R. O'Connor. A monadic, functional implementation of real numbers. Technical report, Institute for Computing and Information Science, Radboud University Nijmegen, 2005.

[15]

N. Revol. MPFI, a multiple precision interval arithmetic library. http://perso.ens-lyon.fr/nathalie.revol/software.html, 2001–2006.

[16]

S. Rump. Fast and parallel interval arithmetic. BIT, 39(3):534–554, 1999.

[17]

A. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc., 2(42):230–265, 1936.

[18]

J. van der Hoeven. GMPX, a C-extension library for gmp. http://www.math.u-psud.fr/~vdhoeven/, 1999. No longer maintained.

[19]

J. van der Hoeven. Relax, but don't be too lazy. JSC, 34:479–542, 2002.

[20]

J. van der Hoeven. Computations with effective real numbers. TCS, 351:52–60, 2006.

[21]

J. van der Hoeven et al. Mmxlib: the standard library for Mathemagix, 2002–2006. http://www.mathemagix.org/mml.html.

[22]

K. Weihrauch. Computable analysis. Springer-Verlag, Berlin/Heidelberg, 2000.