LIX, CNRS, École polytechnique, 91128 Palaiseau Cedex, France

This work has been supported by the ANR-09-JCJC-0098-01 project, as well as a Digiteo 2009-36HD grant and Région Ile-de-France.

Abstract. The project aims at the development of a ``computer analysis'' system, in which numerical computations can be done in a mathematically sound manner. A major challenge for such systems is to conceive algorithms which are both efficient, reliable and available at any working precision. In this paper, we survey several older and newer such algorithms. We mainly concentrate on the automatic and efficient computation of high quality error bounds, based on a variant of interval arithmetic which we like to call ``ball arithmetic''.

Computer algebra systems are widely used today in order to perform mathematically correct computations with objects of algebraic or combinatorial nature. It is tempting to develop a similar system for analysis. On input, the user would specify the problem in a high level language, such as computable analysis or interval analysis. The system should then solve the problem in a certified manner. In particular, all necessary computations of error bounds must be carried out automatically.

There are several specialized systems and libraries which contain work in this direction. For instance, Taylor models have been used with success for the validated long term integration of dynamical systems. A fairly complete interval arithmetic library for linear algebra and polynomial computations is . Another historical interval library, which continues to be developed, is . There exist several libraries for multiple precision arithmetic with correct or at least well specified rounding, as well as a library for multiple precision interval arithmetic, and libraries for computable real numbers. There are also libraries for very specific problems, such as the library, which allows for the certified computation of the roots of multiple precision polynomials.

The system intends to develop a new ``computer analysis'' system, distributed under a free licence and not depending on proprietary software such as . Ultimately, we hope to cover most of the functionality present in the software mentioned above and add several new features, such as computable analytic functions and transseries. We also insist on performance and on the possibility to easily switch between machine doubles and multiple precision arithmetic (without being penalized too much). In this paper, we will discuss several older and newer ideas for combining efficiency, reliability and multiple precision arithmetic (when necessary). Many algorithms are already present in the system, or will be incorporated soon.

Most of the algorithms in this paper will be based on ``ball arithmetic'', which provides a systematic tool for the automatic computation of error bounds. In section , we provide a short introduction to this topic and describe its relation with computable analysis. Although ball arithmetic is really a variant of interval arithmetic, we will give several arguments in section  why we actually prefer balls over intervals for most applications. Roughly speaking, balls should be used for the reliable approximation of numbers, whereas intervals are mainly useful for certified algorithms which rely on the subdivision of space.
Although we will mostly focus on ball arithmetic in this paper, section presents a rough overview of the other kinds of algorithms that are needed in acomplete computer analysis system. After basic ball arithmetic for real and complex numbers, the next challenge is to develop efficient algorithms for reliable linear algebra, polynomials, analytic functions, resolution of differential equations, etc. In sections and we will survey several basic techniques which can be used to implement efficient ball arithmetic for matrices and formal power series. Usually, there is a trade-off between efficiency and the quality of the obtained error bounds. One may often start with a very efficient algorithm which only computes rough bounds, or bounds which are only good in favourable well-conditioned cases. If the obtained bounds are not good enough, then we may switch to a more expensive and higher quality algorithm. Of course, the main algorithmic challenge in the area of ball arithmetic is to reduce the overhead of the bound computations as much as possible with respect to the principal numeric computation. In favourable cases, this overhead indeed becomes negligible. For instance, for high working precisions , the centers of real and complex balls are stored with the full precision, but we only need a small precision for the radii. Consequently, the cost of elementary operations(>, >, >>, , ) on balls is dominated by the cost of the corresponding operations on their centers. Similarly, crude error bounds for large matrix products can be obtained quickly with respect to the actual product computation, by using norm bounds for the rows and columns of the multiplicands; see(). Another algorithmic challenge is to use fast algorithms for the actual numerical computations on the centers of the balls. In particular, it is important to use existing high performance libraries, such as , , , whenever this is possible. Similarly, we should systematically rely on asymptotically efficient algorithms for basic arithmetic, such as fast integer and polynomial multiplication. There are several techniques to achieve this goal: <\enumerate> The representations of objects should be chosen with care. For instance, should we rather work with ball matrices or matricial balls (see sections and)? If the result of our computation satisfies an equation, then we may first solve the equation numerically and only perform the error analysis at the end. In the case of matrix inversion, this method is known as Hansen's method; see section. When considering a computation as a tree or dag, then the error tends to increase with the depth of the tree. If possible, algorithms should be designed so as to keep this depth small. Examples will be given in sections and. Notice that this kind of algorithms are usually also more suitable for parallelization. When combining the above approaches to the series of problems considered in this paper, we are usually able to achieve a constant overhead for sharp error bound computations. In favourable cases, the overhead becomes negligible. For particularly ill conditioned problems, we need a logarithmic overhead. It remains an open question whether we have been lucky or whether this is a general pattern. In this paper, we will be easygoing on what is meant by ``sharp error bound''. Regarding algorithms as functions \\;x\y=f(x)>, an error \\> in the input automatically gives rise to an error \J(x)*\> in the output. 
When performing our computations with bit precision , we have to consider that the input error > is as least of the order of p>*\|x\|\(\>)> (where =\|x\|> for all ). Now given an error bound \|\\> for the input, an error bound \|\\> is considered to be sharp if \max\|\\>\|J(x)*\\|>. More generally, condition numbers provide a similar device for measuring the quality of error bounds. A detailed investigation of the quality of the algorithms presented in this paper remains to be carried out. Notice also that the computation of optimal error bounds is usually NP-hard. Of course, the development of asymptotically efficient ball arithmetic presupposes the existence of the corresponding numerical algorithms. This topic is another major challenge, which will not be addressed in this paper. Indeed, asymptotically fast algorithms usually suffer from lower numerical stability. Besides, switching from machine precision to multiple precision involves a huge overhead. Special techniques are required to reduce this overhead when computing with large compound objects, such as matrices or polynomials. It is also recommended to systematically implement single, double, quadruple and multiple precision versions of all algorithms (see for a double-double and quadruple-double precision library). The system has been designed so as to ease this task, since most current packages are C++ template libraries. The aim of our paper is to provide a survey on how to implement an efficient library for ball arithmetic. Even though many of the ideas are classical in the interval analysis community, we do think that the paper sheds a new light on the topic. Indeed, we systematically investigate several issues which have received limited attention until now: <\enumerate> Using ball arithmetic instead of interval arithmetic, for a wide variety of center and radius types. The trade-off between algorithmic complexity and quality of the error bounds. The complexity of multiple precision ball arithmetic. Throughout the text, we have attempted to provide pointers to existing literature, wherever appropriate. We apologize for possible omissions and would be grateful for any suggestions regarding existing literature. We will denote by <\equation*> \=\*2>={m*2:m,e\\} the set of . Given fixed bit precisions \> and \> for the unsigned mantissa (without the leading ) and signed exponent, we also denote by <\eqnarray*> >||:m,e\\,\|m\|\2,1-2\e+p\2}>>>> the corresponding set of floating point numbers. Numbers in > can be stored in bit space and correspond to ``ordinary numbers'' in the IEEE 754 norm. For double precision numbers, we have and . For multiple precision numbers, we usually take to be the size of a machine word, say , and denote =\>. The IEEE 754 norm also defines several special numbers. The most important ones are the infinities \> and ``not a number'' , which corresponds to the result of an invalid operation, such as 2>>. We will denote <\eqnarray*> >>||\{\\,\\,NaN}.>>>> Less important details concern the specification of signed zeros 0> and several possible types of NaNs. For more details, we refer to. The other main feature of the IEEE 754 is that it specifies how to round results of operations which cannot be performed exactly. There are four basic rounding modes >(upwards), > (downwards), > (nearest) and (towards zero). Assume that we have an operation \> with \>. Given U\\> and , we define >(x)> to be the smallest number > in >>, such that \y>. 
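On IEEE 754 compliant hardware, the rounding modes (upwards, downwards, to the nearest, towards zero) can be selected at run time. As a minimal illustration, not tied to any particular library, the following C++ fragment uses the standard <cfenv> interface; depending on the compiler, a flag such as -frounding-math may be needed for the rounding mode to be honoured in optimized code.

```cpp
#include <cfenv>
#include <cstdio>

// Tell the compiler that we access the floating point environment.
#pragma STDC FENV_ACCESS ON

int main() {
    volatile double x = 1.0, y = 1e-20;   // volatile prevents constant folding

    std::fesetround(FE_UPWARD);           // rounding mode "upwards"
    double up = x + y;                    // smallest double >= the exact sum

    std::fesetround(FE_DOWNWARD);         // rounding mode "downwards"
    double down = x + y;                  // largest double <= the exact sum

    std::fesetround(FE_TONEAREST);        // restore the default mode
    std::printf("down = %.17g\nup   = %.17g\n", down, up);
    // down < up, since 1 + 1e-20 is not representable in double precision
}
```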
More generally, we will use the notation >=f>(x)+>y> to indicate that all operations are performed by rounding upwards. The other rounding modes are specified in a similar way. One major advantage of IEEE 754 arithmetic is that it completely specifies how basic operations are done, making numerical programs behave exactly in the same way on different architectures. Most current microprocessors implement the IEEE 754 norm for single and double precision numbers. A conforming multiple precision implementation also exists. Using Schönhage-Strassen multiplication, it is classical that the product of two -bit> integers can be computed in time (n)=O(n*log n*log log n)>. A recent algorithm by Fürer further improves this bound to (n)=O(n*log n*2> n)>)>, where >> denotes the iterator of the logarithm (we have > n=log> log n+1>). Other algorithms, such as division two >n>-bit> integers, have a similar complexity (n))>. Given prime numbers \\\p> and a number q\p*\*p>, the computation of > for ,k> can be done in time (n)*log n)> using binary splitting10.25>, where *\*p)>. It is also possible to reconstruct from the remainders > in time (n)*log n)>. Let > be an effective ring, in the sense that we have algorithms for performing the ring operations in >. If > admits >-th roots of unity for \n>, then it is classical that the product of two polynomials \[z]> with n> can be computed using ring operations in >, using the fast Fourier transform. For general rings, this product can be computed using (n)=O(n*log n*log log n)> operations, using avariant of Schönhage-Strassen multiplication. A formal power series \[[z]]> is said to be , if there exists an algorithm which takes \> on input and which computes \\>. We denote by [[z]]> the set of computable power series. Many simple operations on formal power series, such as multiplication, division, exponentiation, , as well as the resolution of algebraic differential equations can be done up till order using a similar time complexity (n))> as polynomial multiplication. Alternative multiplication algorithms of time complexity (n)=O((n))> are given in. In this case, the coefficients > are computed gradually and > is output as soon as ,\,f> and ,\,g> are known. This strategy is often most efficient for the resolution of implicit equations. Let n>> denote the set of n> matrices with entries in >. Given two n> matrices \n>>>, the naive algorithm for computing requires >)> ring operations with =3>. The exponent > has been reduced by Strassen to =log 7/log 2>. Using more sophisticated techniques, > can be further reduced to \2.376> . However, this exponent has never been observed in practice. Given y> in a totally ordered set , we will denote by the closed interval <\eqnarray*> ||R:x\z\y}.>>>> Assume now that is a totally ordered field and consider a normed vector space over. Given V> and R>={r\\:r\0}>, we will denote by <\eqnarray*> >||V:\<\|\|\>x-c\<\|\|\>\r}>>>> the closed ball with center and radius . Notice that we will never work with open balls in what follows. We denote the set of all balls of the form() by >. Given >>, we will denote by > and > the center radius of . We will use the standard euclidean norm <\eqnarray*> x\<\|\|\>>||\|+\+\|x\||n>>>>>> as the default norm on > and >. 
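As a small illustration of how this norm is used in reliable computations, the following sketch computes an upper bound for the Euclidean norm of a vector by performing all operations with rounding upwards; the extra call to nextafter is our own safety margin for the final square root and is not prescribed by any standard.

```cpp
#include <cfenv>
#include <cmath>
#include <cstdio>
#include <vector>

#pragma STDC FENV_ACCESS ON

// Upper bound for the Euclidean norm ||x|| = sqrt(x_1^2 + ... + x_n^2).
// All operations are performed with rounding upwards, so the returned value
// is >= the exact norm; one extra ulp absorbs a possibly not correctly
// rounded library sqrt.
double norm2_upper(const std::vector<double>& x) {
    int old = std::fegetround();
    std::fesetround(FE_UPWARD);
    double s = 0.0;
    for (double xi : x) s += xi * xi;        // rounded upwards
    double b = std::nextafter(std::sqrt(s), INFINITY);
    std::fesetround(old);
    return b;
}

int main() {
    std::vector<double> x = {0.1, 0.2, 0.3};
    std::printf("||x|| <= %.17g\n", norm2_upper(x));
}
```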
Occasionally, we will also consider the max-norm <\eqnarray*> x\<\|\|\>>>||\|,\,\|x\|}.>>>> For matrices \n>> or \n>>, the default operator norm is defined by <\eqnarray*> M\<\|\|\>>||x\<\|\|\>=1>\<\|\|\>M*x\<\|\|\>.>>>> Occasionally, we will also consider the max-norm <\eqnarray*> M\<\|\|\>>>||\|M\|,>>>> which satisfies <\equation*> \<\|\|\>M\<\|\|\>>\\<\|\|\>M\<\|\|\>\n*\<\|\|\>M\<\|\|\>>. We also define the max-norm for polynomials \[x]> or \[x]> by <\eqnarray*> P\<\|\|\>>>||\|.>>>> For power series \[[z]]> or \[[z]]> which converge on the compact disk >, we define the norm <\eqnarray*> f\<\|\|\>>||r>\|f(z)\|.>>>> After rescaling f(r*z)>, we will usually work with the norm \\<\|\|\>=\<\|\|\>\\<\|\|\>>. In some cases, it may be useful to generalize the concept of a ball and allow for radii in partially ordered rings. For instance, a ball |\>> stands for the set <\eqnarray*> ||\\|\1\i\n,\|x-()\|\()}.>>>> The setsn>|\n>>>, [x]|\[x]>>, [[x]]|\[[x]]>>, are defined in a similar component wise way. Given \>, it will also be convenient to denote by the vector with =\|x\|>. For matrices \n>>, polynomials \[x]> and power series \[[x]]>, we define , and similarly. Let > be a normed vector space over > and recall that |\>> stands for the set of all balls with centers in > and radii in >. Given an operation :\\\>, the operation is said to lift to an operation <\equation*> f>:|\>\|\>, if we have <\eqnarray*> >(X,\,X)>|>|,\,x):x\X,\,x\X},>>>> for all ,\,X\|\>>. For instance, both the addition :\\\> and subtraction >:\\\>> admits lifts <\eqnarray*> +>>||>>|->>||.>>>> Similarly, if > is a normed algebra, then the multiplication lifts to <\eqnarray*> \>>||>>>> The lifts >>>, >>>, >>>, are also said to be the ball arithmetic counterparts of addition, subtractions, multiplication, etc. Ball arithmetic is a systematic way for the computation of error bounds when the input of a numerical operation is known only approximately. These bounds are usually not sharp. For instance, consider the mathematical function \\x-x> which evaluates to zero everywhere, even if is only approximately known. However, taking >, we have >x=\>. This phenomenon is known as . In general, ball algorithms have to be designed carefully so as to limit overestimation. In the above definitions, > can be replaced by a subfield \>, > by an -vector space \>, and the domain of by an open subset of >. If and are in the sense that we have algorithms for the additions, subtractions and multiplications in and , then basic ball arithmetic in >> is again effective. If we are working with finite precision floating point numbers in > rather than a genuine effective subfield , we will now show how to adapt the formulas(),() and() in order to take into account rounding errors; it may also be necessary to allow for an infinite radius in this case. Let us detail what needs to be changed when using IEEE conform finite precision arithmetic, say >. We will denote <\eqnarray*> >|||\>>>|>|||\>>>|[\]>||[\]|\>.>>>> When working with multiple precision numbers, it usually suffices to use low precision numbers for the radius type. Recalling that =\>, we will therefore denote <\eqnarray*> >|||\>>>|[\]>||[\]|\>.>>>> We will write =\=2p>> for the machine accuracy and =\=2-p>> for the smallest representable positive number in >. 
Given an operation\\> as in section, together with balls =|r>>, it is natural to compute the center of <\eqnarray*> >||>(X,\,X)>>>> by rounding to the nearest: <\eqnarray*> ||>(x,\,x).>>>> One interesting point is that the committed error <\eqnarray*> >||,\,x)\|>>>> does not really depend on the operation itself: we have the universal upper bound <\eqnarray*> >|>|(y)>>|(y)>|>|>\)*\>\.>>>> It would be useful if this > were present in the hardware. For the computation of the radius , it now suffices to use the sum of (y)> and the theoretical bound formulas for the infinite precision case. For instance, <\eqnarray*> +>>||>y|r+>s>>>|->>||>y|r+>s>>>|\>>||>y|[(\|x\|+r)*s+r*\|y\|]>>,>>>> where >> stands for the ``adjusted constructor'' <\eqnarray*> >||>\(y)>.>>>> The approach readily generalizes to other ``normed vector spaces'' > over >, as soon as one has a suitable rounded arithmetic in > and a suitable adjustment function> attachedto it. Notice that (y)=\>, if > or is the largest representable real number in >. Consequently, the finite precision ball computations naturally take place in domains of the form >=>|\>>> rather than >. Of course, balls with infinite radius carry no useful information about the result. In order to ease the reading, we will assume the absence of overflows in what follows, and concentrate on computations with ordinary numbers in>. We will only consider infinities if they are used in an essential way during the computation. Similarly, if we want ball arithmetic to be a natural extension of the IEEE 754 norm, then we need an equivalent of . One approach consists of introducing a (not a ball) object, which could be represented by > or \>>. A ball function >> returns if returns for one selection of members of the input balls. For instance, >()=NaB>. An alternative approach consists of the attachment of an additional flag to each ball object, which signals a possible invalid outcome. Following this convention, >()>> yields >. Using the formulas from the previous section, it is relatively straightforward to implement ball arithmetic as a C++ template library, as we have done in the system. However, in the case of multiple precision arithmetic this is far from optimal. Let us discuss several possible optimizations: <\enumerate> Multiple precision libraries such as suffer from a huge overhead when it comes to moderate ( quadruple) precision computations. Since the radii are always stored in low precision, it is recommended to inline all computations on the radii. In the case of multiplication, this divides the number of function calls byfour. When computing with complex numbers \[\]>, one may again save several function calls. Moreover, it is possible to regard as an element of [\]*2>> rather than*2>)[\]>, use a single exponent for both the real and imaginary parts of . This optimization reduces the time spent on exponent computations and mantissa normalizations. Consider a ball \\> and recall that \>, \>. If 2*r>, then the log r+p-log \|c\|-64\> least significant binary digits of are of little interest. Hence, we may replace by its closest approximation in >>, with =\log \|c\|+64-log r\>, and reduce the working precision to >. Modulo slightly more work, it is also possible to share the exponents of the center and the radius. If we don't need large exponents for our multiple precision numbers, then it is possible to use machine doubles > as our radius type and further reduce the overhead of bound computations. 
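As mentioned above, implementing these formulas as a C++ (template) library is relatively straightforward. The following non-template, double precision sketch shows the core of such an implementation: centers are computed with rounding to the nearest, radii with rounding upwards, and the adjustment function is over-approximated by the machine accuracy times the absolute value of the center plus the smallest representable positive number. The names are ours and overflow is not handled; a genuine implementation would be templated over the center and radius types.

```cpp
#include <cfenv>
#include <cmath>
#include <cstdio>
#include <limits>

#pragma STDC FENV_ACCESS ON

// A ball B(c, r) represents all real numbers y with |y - c| <= r.
struct Ball { double c, r; };

static const double EPS = std::numeric_limits<double>::epsilon();      // machine accuracy
static const double ETA = std::numeric_limits<double>::denorm_min();   // smallest positive number

// Upper bound for the rounding error committed when the center was computed
// with rounding to the nearest (the "adjustment" of the formulas above).
static double adjust(double c) { return EPS * std::fabs(c) + ETA; }

// Radius computations are carried out with rounding upwards, so that the
// inclusion property of ball arithmetic is preserved.
Ball add(Ball x, Ball y) {
    double c = x.c + y.c;                       // rounding to nearest
    std::fesetround(FE_UPWARD);
    double r = x.r + y.r + adjust(c);
    std::fesetround(FE_TONEAREST);
    return {c, r};
}

Ball mul(Ball x, Ball y) {
    double c = x.c * y.c;                       // rounding to nearest
    std::fesetround(FE_UPWARD);
    double r = (std::fabs(x.c) + x.r) * y.r + x.r * std::fabs(y.c) + adjust(c);
    std::fesetround(FE_TONEAREST);
    return {c, r};
}

int main() {
    Ball x = {1.0 / 3.0, adjust(1.0 / 3.0)};    // enclosure of 1/3
    Ball y = mul(add(x, x), x);                 // enclosure of (1/3 + 1/3) * 1/3
    std::printf("center = %.17g, radius = %.3g\n", y.c, y.r);
}
```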
When combining the above optimizations, it can be hoped that multiple precision ball arithmetic can be implemented almost as efficiently as standard multiple precision arithmetic. However, this requires a significantly higher implementation effort. Given \>, \\> and \\>={\\\:\\0}>, we say that > is an >-approximation> of if -x\|\\>. A real number \> is said to be if there exists an which takes an absolute tolerance \\>> on input and which returns an >-approximation of. We denote by > the set of computable real numbers. We may regard > as a data type, whose instances are represented by approximation algorithms (this is also known as the 9.6>). In practice, it is more convenient to work with so called ball approximation algorithms: a real number is computable if and only if it admits a , which takes a working precision \> on input and returns a |r>\\> with |r>> and \> r=0>. Indeed, assume that we have a ball approximation algorithm. In order to obtain an >-approximation, it suffices to compute ball approximations at precisions > which double at every step, until \\>. Conversely, given an approximation algorithm >:\>\\> of \>, we obtain a ball approximation algorithm |r>> by taking =2p>> and =>(r)>. Given \> with ball approximation algorithms >,>:\\\>, we may compute ball approximation algorithms for simply by taking <\eqnarray*> >>>)(p)>||>(p) +> >(p)>>|>>>)(p)>||>(p) -> >(p)>>|>>>)(p)>||>(p) \> >(p).>>>> More generally, assuming a good library for ball arithmetic, it is usually easy to write awrapper library with the corresponding operations on computable numbers. From the efficiency point of view, it is also convenient to work with ball approximations. Usually, the radius > satisfies <\eqnarray*> r>||p+o(1),>>>> or at least <\eqnarray*> r>||c*p+o(1),>>>> for some \>>. In that case, doubling the working precision until a sufficiently good approximation is found is quite efficient. An even better strategy is to double the ``expected running time'' at every step. Yet another approach will be described in section below. The concept of computable real numbers readily generalizes to more general normed vector spaces. Let > be a normed vector space and let > be an effective subset of in >, the elements of > admit a computer representation. For instance, if =\>, then we take =\>. Similarly, if =\n>> is the set of real n> matrices, with one of the matrix norms from section, then it is natural to take =(\)n>=\n>>. A point \> is said to be , if it admits an which takes \\>> on input, and returns an >-approximation> \\> of (satisfying -x\<\|\|\>\\>, as above). A real number is said to be if there exists an algorithm for computing an increasing sequence >:\\\;n\>> with \> >=x> (and >> is called a left approximation algorithm). Similarly, is said to be if x> is left computable. A real number is computable if and only if it is both left and right computable. Left computable but non computable numbers occur frequently in practice and correspond to ``computable lower bounds'' (see also ). We will denote by > and > the data types of left and right computable real numbers. It is convenient to specify and implement algorithms in computable analysis in terms of these data types, whenever appropriate. For instance, we have computable functions <\eqnarray*> :\\\>|>|>>|>:\\\>|>|>>||>|>>> More generally, given a subset \>, we say that S> is left computable in if there exists a left approximation algorithm >:\\S> for . We will denote by > and> the data types of left and computable numbers in , and define =S\S>. 
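The following C++ sketch illustrates the point of view of the previous paragraphs: a computable real number is represented by its ball approximation algorithm, operations are lifted pointwise, and epsilon-approximations are obtained by doubling the working precision. For brevity the centers are plain doubles and rounding errors are ignored, so this is only a toy model of the actual scheme.

```cpp
#include <cmath>
#include <cstdio>
#include <functional>

// A ball B(c, r); for this sketch the center is a plain double, so the
// attainable accuracy is limited.  A real implementation would use a
// multiple precision center whose precision follows the parameter p.
struct Ball { double c, r; };

// A computable real number is represented by its ball approximation
// algorithm: a function mapping a working precision p to a ball whose
// radius tends to zero as p grows.
using Real = std::function<Ball(int)>;

// Lifting addition: approximate both arguments at precision p and add the
// balls (the rounding error of the center addition is ignored in this toy).
Real add(Real x, Real y) {
    return [=](int p) {
        Ball a = x(p), b = y(p);
        return Ball{a.c + b.c, a.r + b.r};
    };
}

// Compute an eps-approximation by doubling the working precision until the
// radius of the ball approximation becomes small enough.
double approximate(Real x, double eps) {
    for (int p = 32; ; p *= 2) {
        Ball b = x(p);
        if (b.r <= eps) return b.c;
    }
}

int main() {
    const double PI = 3.14159265358979323846;
    Real pi = [=](int p) { return Ball{PI, std::ldexp(4.0, -p)}; };
    std::printf("2*pi ~ %.17g\n", approximate(add(pi, pi), 1e-10));
}
```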
Identifying the type of boolean numbers > with , we have =\=\> as sets, but as data types. For instance, it is well known that equality is non computable for computable real numbers. Nevertheless, equality ``ultimately computable'' in the sense that there exists a computable function <\eqnarray*> :\\\>|>|.>>>> Indeed, given \> with ball approximation algorithms >> and >>, we may take <\eqnarray*> >)>|||(>\>\\\>\>)\\>>||>>>>>>>>> Similarly, the ordering relation >> is ultimately computable. This asymmetric point of view on equality testing also suggest a semantics for the relations >, >>, on balls. For instance, given balls \>, it is natural to take <\eqnarray*> |>|y\\>>|y>|>|y=\>>|y>|>|a\x,b\y,a\b>>||>|>>> These definitions are interesting if balls are really used as successive approximations of areal number. An alternative application of ball arithmetic is for modeling non-deterministic computations: the ball models a set of possible values and we are interested in the set of possible outcomes of an algorithm. In that case, the natural return type of a relation on balls becomes a ``boolean ball''. In the area of interval analysis, this second interpretation is more common. <\remark> We notice that the notions of computability and asymmetric computability do not say anything about the speed of convergence. In particular, it is usually impossible to give useful complexity bounds for algorithms which are based on these mere concepts. In the case of asymmetric computability, there even do not exist any recursive complexity bounds, in general. Given a computable function \\>, \> and \\>>, let us return to the problem of efficiently computing an approximation \\> of with -y\|\\>. In section, we suggested to compute ball approximations of at precisions which double at every step, until a sufficiently precise approximation is found. This computation involves an implementation >:\\\> of on the level of balls, which satisfies <\eqnarray*> >(X)>|>|X},>>>> for every ball \>. In practice, is often differentiable, with (x)\0>. In that case, given a ball approximation of , the computed ball approximation >(X)> of typically has a radius <\eqnarray*> >|>|(x)\|*,>>>> for \0>. This should make it possible to directly predict a sufficient precision at which \\>. The problem is that() needs to be replaced by a more reliable relation. This can be done on the level of ball arithmetic itself, by replacing the usual condition() by <\eqnarray*> >(X)>|>|)|M*>>>|||X>\|f(x)\|.>>>> Similarly, the multiplication of balls is carried out using <\eqnarray*> \> >||.>>>> instead of(). A variant of this kind of ``Lipschitz ball arithmetic'' has been implemented in. Although a constant factor is gained for high precision computations at regular points , the efficiency deteriorates near singularities ( the computation of >). In the area of reliable computation, interval arithmetic has for long been privileged with respect to ball arithmetic. Indeed, balls are often regarded as a more or less exotic variant of intervals, based on an alternative midpoint-radius representation. Historically, interval arithmetic is also preferred in computer science because it is easy to implement if floating point operations are performed with correct rounding. Since most modern microprocessors implement the IEEE754 norm, this point of view is well supported by hardware. 
Not less historically, the situation in mathematics is inverse: whereas intervals are the standard in computer science, balls are the standard in mathematics, since they correspond to the traditional >->-calculus. Even in the area of interval analysis, one usually resorts (at least implicitly) to balls for more complex computations, such as the inversion of a matrix. Indeed, balls are more convenient when computing error bounds using perturbation techniques. Also, we have a great deal of flexibility concerning the choice of anorm. For instance, a vectorial ball is not necessarily a Cartesian product of one dimensional balls. In this section, we will give a more detailed account on the respective advantages and disadvantages of interval and ball arithmetic. One advantage of interval arithmetic is that the IEEE 754 norm suggests a natural and standard implementation. Indeed, let be a real function which is increasing on some interval . Then the natural interval lift >> of is given by <\eqnarray*> >([l,h])>||>(l),f>(h)].>>>> This implementation has the property that >([l,h])> is the smallest interval with end-points in \{\\}>, which satisfies <\eqnarray*> >([l,h])>|>|[l,h]}.>>>> For not necessarily increasing functions this property can still be used as a requirement for the ``standard'' implementation of >>. For instance, this leads to the following implementation of the cosine function on intervals: <\eqnarray*> >([l,h])>||> l,cos> h]>|\l/2*\\=\h/2*\\\2*\-1>>|> h,cos> l]>|\l/2*\\=\h/2*\\\2*\>>|> l,cos> h),1]>|\l/2*\\=\h/2*\\\1\2*\-1>>|1,max(cos> l,cos> h)]>|\l/2*\\=\h/2*\\\1\2*\>>|1,1]>|\l/2*\\\\h/2*\\\1>>>>>>>>> Such a standard implementation of interval arithmetic has the convenient property that programs will execute in the same way on any platform which conforms to the IEEE 754 standard. By analogy with the above approach for standardized interval arithmetic, we may standardize the ball implementation >> of by taking <\eqnarray*> >()>||>(c)|s>,>>>> where the radius is smallest in \{\\}> with <\eqnarray*> >(c)|s>>|>|}.>>>> Unfortunately, the computation of such an optimal is not always straightforward. Inparticular, the formulas(),() and() do not necessarily realize this tightest bound. In practice, it might therefore be better to achieve standardization by fixing once and for all the formulas by which ball operations are performed. Of course, more experience with ball arithmetic is required before this can happen. The respective efficiencies of interval and ball arithmetic depend on the precision at which we are computing. For high precisions and most applications, ball arithmetic has the advantage that we can still perform computations on the radius at single precision. By contrast, interval arithmetic requires full precision for operations on both end-points. This makes ball arithmetic twice as efficient at high precisions. When working at machine precision, the efficiencies of both approaches essentially depend on the hardware. , interval arithmetic is better supported by current computers, since most of them respect the IEEE 754 norm, whereas the function > from() usually has to be implemented by hand. However, changing the rounding mode is often highly expensive (over hundred cycles). Therefore, additional gymnastics may be required in order to always work with respect to a fixed rounding mode. For instance, if > is our current rounding mode, then we may take <\eqnarray*> >y>||((\x)+>(\y)),>>>> since the operation \x> is always exact ( does not depend on the rounding mode). 
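The following sketch shows how this trick allows interval addition to be carried out with a single, fixed rounding mode (here upwards); as before, the data type and the use of <cfenv> are only meant as an illustration.

```cpp
#include <cfenv>
#include <cstdio>

#pragma STDC FENV_ACCESS ON

struct Interval { double lo, hi; };

// Interval addition with a fixed rounding mode (upwards).  The lower
// end-point must be rounded downwards; instead of switching the rounding
// mode (which may cost over a hundred cycles), we use the identity
//     x +down y = -((-x) +up (-y)),
// which is valid because negation is always exact.
Interval add(Interval x, Interval y) {
    double lo = -((-x.lo) + (-y.lo));   // rounded downwards "for free"
    double hi = x.hi + y.hi;            // rounded upwards
    return {lo, hi};
}

int main() {
    std::fesetround(FE_UPWARD);         // fixed once and for all
    Interval a = {1.0, 1.0}, b = {1e-20, 1e-20};
    Interval c = add(a, b);
    std::printf("[%.17g, %.17g]\n", c.lo, c.hi);
}
```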
As a consequence, interval arithmetic becomes slightly more expensive. By contrast, when releasing the condition that centers of balls are computed using rounding to the nearest, we may replace() by <\eqnarray*> ||>(x,\,x)>>>> and() by <\eqnarray*> (y)>|>|>\)*\>(2*\).>>>> Hence, ball arithmetic already allows us to work with respect to a fixed rounding mode. Of course, using() instead of() does require to rethink the way ball arithmetic should be standardized. An alternative technique for avoiding changes in rounding mode exists when performing operations on compound types, such as vectors or matrices. For instance, when adding two vectors, we may first add all lower bounds while rounding downwards and next add the upper bounds while rounding upwards. Unfortunately, this strategy becomes more problematic in the case of multiplication, because different rounding modes may be needed depending on the signs of the multiplicands. As a consequence, matrix operations tend to require many conditional parts of code when using interval arithmetic, with increased probability of breaking the processor pipeline. On the contrary, ball arithmetic highly benefits from parallel architecture and it is easy to implement ball arithmetic for matrices on top of existing libraries: see and section below. Besides the efficiency of ball and interval arithmetic for basic operations, it is also important to investigate the quality of the resulting bounds. Indeed, there are usually differences between the sets which are representable by balls and by intervals. For instance, when using the extended IEEE 754 arithmetic with infinities, then it is possible to represent]>> as an interval, but not as a ball. These differences get more important when dealing with complex numbers or compound types, such as matrices. For instance, when using interval arithmetic for reliable computations with complex numbers, it is natural to enclose complex numbers by rectangles >, where and are intervals. For instance, the complex number > may be enclosed by <\eqnarray*> |>|,1+\]+[1-\,1+\]*\,>>>> for some small number >. When using ball arithmetic, we would rather enclose by <\eqnarray*> |>||\>.>>>> Now consider the computation of >. The computed rectangular and ball enclosures are given by <\eqnarray*> |>|2*\,2*\]+[2-2*\,2+2*\]*\+o(\)>>||>|*\>+o(\).>>>> Consequently, ball arithmetic yields a much better bound, which is due to the fact that multiplication by > turns the rectangular enclosure by degrees, leading to an overestimation by a factor > when re-enclosing the result by a horizontal rectangle(see figure). This is one of the simplest instances of the . For similar reasons, one may prefer to compute the square of the matrix <\equation> M=|>||>>>> in 2>|\>>>> rather than |\>2>>, while using the operator norm() for matrices. This technique highlights another advantage of ball arithmetic: we have a certain amount of flexibility regarding the choice of the radius type. By choosing a simple radius type, we do not only reduce the wrapping effect, but also improve the efficiency: when computing with complex balls in [\]>, we only need to bound one radius instead of two for every basic operation. More precisely, we replace () and () by <\eqnarray*> ||>(x,\,x)>>|||>(x,\,x)+(Im f)>(x,\,x)*\>>|(y)>|>|>(y)+>\)*\>(*\).>>>> On the negative side, generalized norms may be harder to compute, even though a rough bound often suffices ( replacing >(y)> by >\|Im y\|> in the above formula). 
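As an illustration, the following sketch multiplies complex balls having a single radius, using the rough bound |Re z| + |Im z| instead of |z|. The constant in the adjustment term is a crude margin of our own choosing, meant to remain safe even in the presence of cancellation in the complex product; it is not a sharp bound.

```cpp
#include <cfenv>
#include <cmath>
#include <complex>
#include <cstdio>
#include <limits>

#pragma STDC FENV_ACCESS ON

// A complex ball: a disk with complex center and a single real radius.
struct CBall { std::complex<double> c; double r; };

static const double EPS = std::numeric_limits<double>::epsilon();
static const double ETA = std::numeric_limits<double>::denorm_min();

// Cheap upper bound for |z|; the rough bound |Re z| + |Im z| avoids a
// square root and is good enough for radius computations.
static double cabs_upper(std::complex<double> z) {
    return std::fabs(z.real()) + std::fabs(z.imag());
}

// Multiplication of complex balls.  The center is computed with rounding to
// the nearest; the radius, computed with rounding upwards, combines the
// exact formula (|x| + r) s + r |y| with a margin for the rounding errors
// made on the real and imaginary parts of the center.  The margin is
// proportional to the product of the operand magnitudes, so that it stays
// valid under cancellation (our own crude choice).
CBall mul(CBall x, CBall y) {
    std::complex<double> c = x.c * y.c;
    std::fesetround(FE_UPWARD);
    double adjust = 4 * EPS * cabs_upper(x.c) * cabs_upper(y.c) + ETA;
    double r = (cabs_upper(x.c) + x.r) * y.r + x.r * cabs_upper(y.c) + adjust;
    std::fesetround(FE_TONEAREST);
    return {c, r};
}

int main() {
    CBall z = {{1.0, 1.0}, 1e-10};     // enclosure of 1 + i
    CBall w = mul(z, z);               // enclosure of (1 + i)^2 = 2i
    std::printf("center = (%g, %g), radius = %.3g\n",
                w.c.real(), w.c.imag(), w.r);
}
```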
In the case of matricial balls, a more serious problem concerns overestimation when the matrix contains entries of different orders of magnitude. In such badly conditioned situations, it is better to work with entry-wise radii in Ball(R^{n×n}, R^{n×n}) rather than with matricial balls in Ball(R^{n×n}, R). Another, more algorithmic, technique for reducing the wrapping effect will be discussed in sections and below.

[Figure: Illustration of the computation of (1+i)^k using interval and ball arithmetic.]

Even though personal taste in the choice between balls and intervals cannot be discussed, the elegance of the chosen approach for a particular application can partially be measured in terms of the human time which is needed to establish the necessary error bounds. We have already seen that interval enclosures are particularly easy to obtain for monotonic real functions. Another typical algorithm where interval arithmetic is more convenient is the resolution of a system of equations using dichotomy. Indeed, it is easier to cut an n-dimensional block [a_1,b_1]×...×[a_n,b_n] into 2^n smaller blocks than to perform a similar operation on balls. For most other applications, ball representations are more convenient. Indeed, error bounds are usually obtained by perturbation methods. For any mathematical proof where error bounds are explicitly computed in this way, it is generally easy to derive a certified algorithm based on ball arithmetic. We will see several illustrations of this principle in the sections below.

Implementations of interval arithmetic often rely on floating point arithmetic with correct rounding. One may question how good correct rounding actually is in order to achieve reliability. One major benefit is that it provides a simple and elegant way to specify what a mathematical function precisely does at limited precision. In particular, it allows numerical programs to execute in exactly the same way on many different hardware architectures. On the other hand, correct rounding does have a certain cost. Although the cost is limited for field operations and elementary functions, it increases for more complex special functions, especially if one seeks numerical methods with a constant operation count. For arbitrary computable functions on R, correct rounding even becomes impossible. Another disadvantage is that correct rounding is lost as soon as we perform more than one operation: in general, f↑ ∘ g↑ and (f ∘ g)↑ do not coincide. In the case of ball arithmetic, we only require an upper bound for the error, not necessarily the best possible representable one. In principle, this is just as reliable and usually more economic.
Now in an ideal world, the development of numerical codes goes hand in hand with the systematic development of routines which compute the corresponding error bounds. In such a world, correct rounding becomes superfluous, since correctness is no longer ensured at the micro-level of the hardware functions, but rather at the top level, by mathematical proof.

Computable analysis provides a good high-level framework for the automatic and certified resolution of analytic problems. The user states the problem in a formal language and specifies a required absolute or relative precision. The program should return a numerical result which is certified to meet the requirement on the precision. A simple example is to compute an ε-approximation of a given real number, for a given ε > 0.

The system aims at the implementation of efficient algorithms for the certified resolution of numerical problems. Our ultimate goal is that these algorithms become as efficient as more classical numerical methods, which are usually not certified and only operate at limited precision. A naive approach is to systematically work with computable real numbers. Although this approach is convenient for theoretical purposes in the area of computable analysis, the computation with functions instead of ordinary floating point numbers is highly inefficient. In order to address this efficiency problem, the libraries for basic arithmetic on analytic objects (real numbers, matrices, polynomials, etc.) are subdivided into four layers of the so called numerical hierarchy (see figure). We will illustrate this decomposition on the problem of multiplying two n×n computable real matrices. The numerical hierarchy turns out to be a convenient framework for more complex problems as well, such as the analytic continuation of the solution to a dynamical system. As a matter of fact, the framework incites the developer to restate the original problem at the different levels, which is generally a good starting point for designing an efficient solution.

[Figure: The numerical hierarchy.]

On the mathematical top level, we are given two computable real n×n matrices and an absolute error ε > 0. The aim is to compute ε-approximations for all entries of their product. The simplest approach to this problem is to use a generic formal matrix multiplication algorithm, using the fact that the computable reals form an effective ring. However, as stressed above, computable real numbers are really functions, so that ring operations on them are quite expensive. Instead, when working at precision p, we may first compute ball approximations for all the entries of both matrices, after which we form two approximation matrices with ball entries.
The multiplication problem then reduces to the problem of multiplying two matrices in n>>. This approach has the advantage that )> operations on ``functions'' in > are replaced by a single multiplication in n>>. The aim of this layer is to implement efficient algorithms on balls. Whereas the actual numerical computation is delegated to the numerical level below, the reliable level should be able to perform the corresponding error analysis automatically. When operating on non-scalar objects, such as matrices of balls, it is often efficient to rewrite the objects first. For instance, when working in fixed point arithmetic ( all entries of and admit similar orders of magnitude), a matrix in n>> may also be considered as a in n>|\>> for the matrix norm \\<\|\|\>>>. We multiply two such balls using the formula <\eqnarray*> \> >||x\<\|\|\>>+r)*s+r*\<\|\|\>y\<\|\|\>>)]>>,>>>> which is a corrected version of(), taking into account that the matrix norm \\<\|\|\>>> only satisfies x*y\<\|\|\>>\n*\<\|\|\>x\<\|\|\>>*\<\|\|\>y\<\|\|\>>>. Whereas the naive multiplication in n>> involves > multiplications in>, the new method reduces this number to +2*n>>: one ``expensive'' multiplication in n>> and two scalar multiplications of matrices by numbers. This type of tricks will be discussed in more detail in section below. Essentially, the new method is based on the isomorphism <\eqnarray*> |\>n>>|>|n>|\>.>>>> A similar isomorphism exists on the mathematical level: <\eqnarray*> )n>>|>|n>)>>>> As a variant, we directly may use the latter isomorphism at the top level, after which ball approximations of elements of n>)> are already in n>|\>>. Being able to perform an automatic error analysis, the actual numerical computations are done at the numerical level. In our example, we should implement an efficient algorithm to multiply two matrices in n>>. In single or double precision, we may usually rely on highly efficient numerical libraries (, , ). In higher precisions, new implementations are often necessary: even though there are efficient libraries for multiple precision floating point numbers, these libraries usually give rise to a huge overhead. For instance, when using at double precision, the overhead with respect to machine doubles is usually comprised between and . When working with matrices with multiple precision floating point entries, this overhead can be greatly reduced by putting the entries under a common exponent using the isomorphism <\eqnarray*> n>=(\*2>)n>>|>|n>*2>.>>>> This reduces the matrix multiplication problem for floating point numbers to a purely arithmetic problem. Of course, this method becomes numerically unstable when the exponents differ wildly; in that case, row preconditioning of the first multiplicand and column preconditioning of the second multiplicand usually helps. After the above succession of reductions, we generally end up with an arithmetic problem such as the multiplication of two n> matrices in n>>. The efficient resolution of this problem for all possible and integer bit lengths is again non-trivial. Indeed, libraries such as do implement Schönhage-Strassen's algorithm algorithm for integer multiplication. However, the corresponding naive algorithm for the multiplication of matrices has a time complexity (p)*n)>, which is far from optimal for large values of . Indeed, for large , it is better to use multi-modular methods. 
For instance, choosing sufficiently many small primes ,\,q\2> (or >) with *\*q\2*n*4>, the multiplication of the two integer matrices can be reduced to multiplications of matrices in /q*\)n>>>. Recall that a -bit number can be reduced modulo all the> and reconstructed from these reductions in time (p)*log p)>. The improved matrix multiplication algorithm therefore admits a time complexity +n*(p)*log p)> and has been implemented in , and several other systems. FFT-based methods achieve similar practical time complexities +n*(p))> when and are of the same order of magnitude. In this section, we will start the study of ball arithmetic for non numeric types, such as matrices. We will examine the complexity of common operations, such as matrix multiplication, linear system solving, and the computation of eigenvectors. Ideally, the certified variants of these operations are only slightly more expensive than the non certified versions. As we will see, this objective can sometimes be met indeed. In general however, there is atrade-off between the efficiency of the certification and its quality, the sharpness of the obtained bound. As we will see, the overhead of bound computations also tends to diminish for increasing bit precisions . Let us first consider the multiplication of two n> double precision matrices <\equation*> M,N\\n>. The simplest multiplication strategy is to compute using the naive symbolic formula <\eqnarray*> >||M*N.>>>> Although this strategy is efficient for very small , it has the disadvantage that we cannot profit from high performance libraries which might be available on the computer. Reconsidering and as balls with matricial radii <\equation*> M,N\n>|\n>>, we may compute using <\eqnarray*> ||*]>|[\|\|**+*(\|\|+)+n*\*\|\|*\|\|]>>,>>>> where is given by =\|M\|>. A similar approach was first proposed in. Notice that the additional term *\|\|*\>\|\|> replaces (M*N)>. This extra product is really required: the computation of *)> may involve cancellations, which prevent a bound for the rounding errors to be read off from the end-result. The formula() does assume that the underlying library computes *> using the naive formula() and correct rounding, with the possibility to compute the sums in any suitable order. Less naive schemes, such as Strassen multiplication, may give rise to additional rounding errors. The above naive strategies admit the disadvantage that they require four non certified n> matrix products in n>>. If and are well-conditioned, then the following formula may be used instead: <\eqnarray*> ||*|R>>>|>||>)>\<\|\|\>*\<\|\|\>,k>)>\<\|\|\>+\<\|\|\>>)>\<\|\|\>*\<\|\|\>,k>)>\<\|\|\>+n*\*\<\|\|\>>)>\<\|\|\>*\<\|\|\>>)>\<\|\|\>]>,>>>> where >> and ,k>> stand for the -th row of and the -th column of . Since the norms can be computed using only )> operations, the cost of the bound computation is asymptotically negligible with respect to the )> cost of the multiplication*>. For large n> matrices, chances increase that or gets badly conditioned, in which case the quality of the error bound() decreases. Nevertheless, we may use a compromise between the naive and the fast strategies: fix a not too small constant , such as 16>, and rewrite and as \\\\> matrices whose entries are K> matrices. Now multiply and using the revisited naive strategy, but use the fast strategy on each of the K> block coefficients. Being able to choose , the user has an explicit control over the trade-off between the efficiency of matrix multiplication and the quality of the computed bounds. 
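The following C++ sketch implements this ``naive strategy revisited'' for ball matrices with matricial radii; the center product would in practice be delegated to an optimized numerical library, and underflow is ignored for simplicity.

```cpp
#include <cfenv>
#include <cmath>
#include <limits>
#include <vector>

#pragma STDC FENV_ACCESS ON

using Mat = std::vector<std::vector<double>>;   // dense n x n matrix

// A matrix ball: a center matrix C and an entrywise radius matrix R >= 0.
struct MatBall { Mat C, R; };

static Mat zero(int n) { return Mat(n, std::vector<double>(n, 0.0)); }

// Plain O(n^3) product; in practice this call would be delegated to an
// optimized numerical library (BLAS etc.).
static Mat mul(const Mat& A, const Mat& B) {
    int n = A.size();
    Mat C = zero(n);
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

static Mat abs(const Mat& A) {
    Mat B = A;
    for (auto& row : B) for (auto& x : row) x = std::fabs(x);
    return B;
}

static Mat add(const Mat& A, const Mat& B, double scale = 1.0) {
    int n = A.size();
    Mat C = zero(n);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            C[i][j] = A[i][j] + scale * B[i][j];
    return C;
}

// Product of two matrix balls: the center is the numerical product of the
// centers, the radius matrix is |Mc| Rn + Rm (|Nc| + Rn) + n eps |Mc| |Nc|,
// where the last term accounts for the rounding errors committed in mul()
// (underflow terms are ignored in this sketch).
MatBall mul(const MatBall& M, const MatBall& N) {
    int n = M.C.size();
    Mat C = mul(M.C, N.C);                      // rounding to nearest
    std::fesetround(FE_UPWARD);                 // radii rounded upwards
    double eps = std::numeric_limits<double>::epsilon();
    Mat aM = abs(M.C), aN = abs(N.C);
    Mat R = add(mul(aM, N.R), mul(M.R, add(aN, N.R)));
    R = add(R, mul(aM, aN), n * eps);           // adjustment term
    std::fesetround(FE_TONEAREST);
    return {C, R};
}
```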
As an additional, but important observation, we notice that the user often has the means to perform an `` quality check''. Starting with a fast but low quality bound computation, we may then check whether the computed bound is suitable. If not, then we recompute a better bound using a more expensive algorithm. Assume now that we are using multiple precision arithmetic, say \n>>>. Computing using() requires one expensive multiplication in n>> and three cheap multiplications in n>>. For large , the bound computation therefore induces no noticeable overhead. Assume that we want to invert an n> ball matrix \n>>. This a typical situation where the naive application of a symbolic algorithm (such as gaussian elimination or LR-decomposition) may lead to overestimation. An efficient and high quality method for the inversion of is called Hansen's method. The main idea is to first compute the inverse of> using a standard numerical algorithm. Only at the end, we estimate the error using aperturbative analysis. The same technique can be used for many other problems. More precisely, we start by computing an approximation \\n>> of )1>>. Putting |0>>, we next compute the product using ball arithmetic. This should yield a matrix of the form , where is small. If is indeed small, say E\<\|\|\>\2p/2>>, then <\eqnarray*> (1-E)1>-1-E\<\|\|\>>|>|E\<\|\|\>|1-\<\|\|\>E\<\|\|\>>.>>>> Denoting by > the n> matrix whose entries are all >, we may thus take <\eqnarray*> 1>>|>|E\<\|\|\>|1-\<\|\|\>E\<\|\|\>>*\.>>>> Having inverted , we may finally take 1>=N*(1-E)1>>. Notice that the computation of E\<\|\|\>> can be quite expensive. It is therefore recommended to replace the check E\<\|\|\>\2p/2>> by a cheaper check, such as E\<\|\|\>>\2p/2>>. Unfortunately, the matrix is not always small, even if is nicely invertible. For instance, starting with a matrix of the form <\eqnarray*> >|||||>|||>|>|||>|>||||>>>>,>>>> with large, we have <\eqnarray*> 1>>|||K>|>|>|K)>>|||K>|>|>>||||>|>>||||>|K>>|||||>>>>.>>>> Computing )1>> using bit precision , this typically leads to <\eqnarray*> E\<\|\|\>>|>|*2p>.>>>> In such cases, we rather reduce the problem of inverting to the problem of inverting >, using the formula <\eqnarray*> 1>>||)1>.>>>> More precisely, applying this trick recursively, we compute ,E,E,\> until E>\<\|\|\>> becomes small (say E>\<\|\|\>\2p/2>>) and use the formula <\eqnarray*> 1>>||)*\*(1+E>)*(1-E>)1>>>|>)1>>|>|>+E>\<\|\|\>|1-\<\|\|\>E>\<\|\|\>>*\.>>>> We may always stop the algorithm for \n>, since -1)=0>. We may also stop the algorithm whenever E>\<\|\|\>\1> and E>\<\|\|\>\\<\|\|\>E>\<\|\|\>>, since usually fails to be invertible in that case. In general, the above algorithm requires ball n> matrix multiplications, where is the size of the largest block of the kind > in the Jordan decomposition of . The improved quality therefore requires an additional overhead in the worst case. Nevertheless, for a fixed matrix and\>, the norm E\<\|\|\>> will eventually become sufficiently small ( p/2>>) for () to apply. Again, the complexity thus tends to improve for high precisions. An interesting question is whether we can avoid the ball matrix multiplication > altogether, if gets really large. Theoretically, this can be achieved by using a symbolic algorithm such as Gaussian elimination or LR-decomposition using ball arithmetic. Indeed, even though the overestimation is important, it does not depend on the precision . Therefore, we have 1>)>=O(2p>)> and the cost of the bound computations becomes negligible for large . 
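The following sketch illustrates Hansen's method in its simplest form: the input matrix is assumed to be exactly representable (input radii are ignored), the approximate inverse comes from an arbitrary numerical routine, infinity norms are used throughout, and the recursive refinement for badly conditioned matrices described above is omitted.

```cpp
#include <cfenv>
#include <cmath>
#include <stdexcept>
#include <vector>

#pragma STDC FENV_ACCESS ON

using Mat = std::vector<std::vector<double>>;

// Naive product, evaluated in the current rounding mode.  Only the exact
// input entries are multiplied, so running it once with rounding upwards
// and once with rounding downwards yields an entrywise enclosure of the
// exact product.
static Mat mul(const Mat& A, const Mat& B) {
    int n = A.size();
    Mat C(n, std::vector<double>(n, 0.0));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

// Hansen-style certification of an approximate inverse N of M.
// With E = 1 - M N we have M^{-1} = N (1 - E)^{-1} whenever ||E|| < 1, so
// every entry of M^{-1} - N is bounded by ||N|| ||E|| / (1 - ||E||).
// The function returns this uniform radius (infinity norms throughout).
double certify_inverse(const Mat& M, const Mat& N) {
    int n = M.size();
    std::fesetround(FE_UPWARD);
    Mat up = mul(M, N);
    std::fesetround(FE_DOWNWARD);
    Mat down = mul(M, N);
    std::fesetround(FE_UPWARD);                // all bounds rounded upwards
    double normE = 0.0, normN = 0.0;
    for (int i = 0; i < n; i++) {
        double rowE = 0.0, rowN = 0.0;
        for (int j = 0; j < n; j++) {
            double e = (i == j) ? 1.0 : 0.0;   // entry of the identity matrix
            rowE += std::fmax(std::fabs(e - up[i][j]), std::fabs(e - down[i][j]));
            rowN += std::fabs(N[i][j]);
        }
        normE = std::fmax(normE, rowE);
        normN = std::fmax(normN, rowN);
    }
    if (normE >= 1.0) {
        std::fesetround(FE_TONEAREST);
        throw std::runtime_error("approximate inverse is too inaccurate");
    }
    double denom = -((-1.0) + normE);          // lower bound for 1 - ||E||
    double radius = (normN * normE) / denom;   // still rounded upwards
    std::fesetround(FE_TONEAREST);
    return radius;
}
```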
Let us now consider the problem of computing the eigenvectors of a ball matrix \n>>>, assuming for simplicity that the corresponding eigenvalues are non zero and pairwise distinct. We adopt a similar strategy as in the case of matrix inversion. Using astandard numerical method, we first compute a diagonal matrix \\n>> and an invertible transformation matrix \n>>, such that <\eqnarray*> 1>**T>|>|.>>>> The main challenge is to find reliable error bounds for this computation. Again, we will use the technique of small perturbations. The equation() being a bit more subtle than *\1>, this requires more work than in the case of matrix inversion. In fact, we start by giving a numerical method for the iterative improvement of an approximate solution. A variant of the same method will then provide the required bounds. Again, this idea can often be used for other problems. The results of this section are work in common with Mourrain> and Trebuchet>; in, an even more general method is given, which also deals with the case of multiple eigenvalues. Given > close to >, we have to find > close to and |~>> close to >, such that <\eqnarray*> 1>**>|||~>.>>>> Putting <\eqnarray*> >||>||~>>||*(1+\)>>|||1>**T>>|||,>>>> this yields the equation <\eqnarray*> 1>*N*(1+E)>||*(1+\).>>>> Expansion with respect to yields <\eqnarray*> *N*(1+E)>||*N*(1+E)>>||||1+E>*N-*N*E>>|||,E]+[H,E+|1+E>*N-*N*E>>|||,E]+O([H,E])+O(E).>>>> Forgetting about the non-linear terms, the equation <\eqnarray*> ]+\*\>||>>> admits a unique solution <\eqnarray*> >|||\-\>>|j\i>>||>>>>>>>|>|||\>>|i=j>>||>>>>>>>>> Setting <\eqnarray*> =\(\)>||max -\\|>:1\i\j\n,max>:1\i\n,>>>> it follows that <\eqnarray*> E\<\|\|\>,\<\|\|\>\\<\|\|\>>|>|**\<\|\|\>H\<\|\|\>.>>>> Setting =T*(1+E)>, =\*(1+\)> and =(T)1>**T-\>, the relation() also implies <\eqnarray*> >||H,E+|1+E>*(\+H)-*(\+H)*E.>>>> Under the additional condition E\<\|\|\>\>, it follows that <\eqnarray*> H\<\|\|\>>|>|H\<\|\|\>*\<\|\|\>E\<\|\|\>+4*\<\|\|\>E\<\|\|\>*\<\|\|\>\\<\|\|\>.>>>> For sufficiently small , we claim that iteration of the mapping :(T,\)\(T,\)> converges to a solution of(). Let us denote ,\)=\(T,\)>, =(T)1>**T-\> and let ,\)> be such that =T*(1+E)> and =\*(1+\)>. Assume that <\eqnarray*> H\<\|\|\>>|>|*\<\|\|\>\\<\|\|\>>>>>> and let us prove by induction over that <\eqnarray*> H\<\|\|\>>|>|k>*\<\|\|\>H\<\|\|\>>>|\\<\|\|\>>|>|\\<\|\|\>>>|E\<\|\|\>,\<\|\|\>\\<\|\|\>}>|>|*\*\<\|\|\>H\<\|\|\>>>||>|*\*\<\|\|\>\\<\|\|\>*2>>>>> This is clear for , so assume that 0>. In a similar way as (), we have <\eqnarray*> H\<\|\|\>>|>|H\<\|\|\>*\<\|\|\>E\<\|\|\>+4*\<\|\|\>E\<\|\|\>*\<\|\|\>\\<\|\|\>.>>>> Using the induction hypotheses and *\<\|\|\>\\<\|\|\>\1>, it follows that <\eqnarray*> H\<\|\|\>>|>|*\*\<\|\|\>\\<\|\|\>)*\<\|\|\>E\<\|\|\>*\<\|\|\>H\<\|\|\>>>||>|>*\<\|\|\>H\<\|\|\>,>>>> which proves(). Now let > be such that <\eqnarray*> >||*(1+\)*\*(1+\)>>|||*(1+\).>>>> From (), it follows that <\eqnarray*> \\<\|\|\>>|>|*\<\|\|\>\\<\|\|\>>.>>>> On the one hand, this implies(). On the other hand, it follows that <\eqnarray*> (\)>|>|,>>>> whence () generalizes to(). This completes the induction and the linear convergence of H\<\|\|\>> to zero. In fact, the combination of () and() show that we even have quadratic convergence. Let us now return to the original bound computation problem. We start with the computation of 1>*M*T-\> using ball arithmetic. 
If the condition() is met (using the most pessimistic rounding), the preceding discussion shows that for every \M> (in the sense that \M> for all ), the equation() admits a solution of the form <\eqnarray*> >|||~>)=T*(1+E)*(1+E)*\>>||~>>||*(1+|~>)=\*(1+\)*(1+\)*\,>>>> with <\eqnarray*> E\<\|\|\>,\<\|\|\>\\<\|\|\>}>|>|**\*\<\|\|\>H\<\|\|\>,>>>> for all . It follows that <\eqnarray*> |~>\<\|\|\>,\<\|\|\>|~>\<\|\|\>}>|>|\6**\*\<\|\|\>H\<\|\|\>.>>>> We conclude that <\eqnarray*> >|>|*\)>>||~>>|>|>*\.>>>> We may thus return *\),>*\)> as the solution to the original eigenproblem associated to the ball matrix . The reliable bound computation essentially reduces to the computation of three matrix products and one matrix inversion. At low precisions, the numerical computation of the eigenvectors is far more expensive in practice, so the overhead of the bound computation is essentially negligible. At higher precisions , the iteration )\(T,\)>> actually provides an efficient way to double the precision of a numerical solution to the eigenproblem at precision . In particular, even if the condition() is not met initially, then it usually can be enforced after a few iterations and modulo a slight increase of the precision. For fixed and \>, it also follows that the numerical eigenproblem essentially reduces to a few matrix products. The certification of the end-result requires a few more products, which induces a constant overhead. By performing a more refined error analysis, it is probably possible to make the cost certification negligible, although we did not investigate this issue in detail. In section, we have already seen that matricial balls in n>|\>> often provide higher quality error bounds than ball matrices in |\>n>> or essentially equivalent variants in n>|\n>>>. However, ball arithmetic in n>|\>>> relies on the possibility to quickly compute a sharp upper bound for the operator norm M\<\|\|\>> of amatrix \n>>. Unfortunately, we do not know of any really efficient algorithm for doing this. One expensive approach is to compute a reliable singular value decomposition of , since M\<\|\|\>> coincides with the largest singular value. Unfortunately, this usually boils down to the resolution of the eigenproblem associated to >*M>, with a few possible improvements (for instance, the dependency of the singular values on the coefficients of is less violent than in the case of a general eigenproblem). Since we only need the largest singular value, a faster approach is to reduce the computation of M\<\|\|\>> to the computation of M>*M\<\|\|\>>, using the formula <\eqnarray*> M\<\|\|\>>||M>*M\<\|\|\>|>.>>>> Applying this formula times and using a naive bound at the end, we obtain <\eqnarray*> M\<\|\|\>>|>|(M>*M)>\<\|\|\>>|2>.>>>> This bound has an accuracy of k-O(log n)> bits. Since >*M> is symmetric, the repeated squarings of >*M> only correspond to about > matrix multiplications. Notice also that it is wise to renormalize matrices before squaring them, so as to avoid overflows and underflows. The approach can be speeded up further by alternating steps of tridiagonalization and squaring. Indeed, for a symmetric tridiagonal matrix , the computation of > and its tridiagonalization only take steps instead of )> for a full matrix product. After a few )> steps of this kind, one obtains agood approximation > of D\<\|\|\>>. One may complete the algorithm by applying an algorithm with quadratic convergence for finding the smallest eigenvalues of >. 
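The following numerical sketch implements the repeated squaring bound (without the tridiagonalization refinement). Floating point rounding errors are not tracked, so a certified version would redo the same computation in ball arithmetic or with directed rounding.

```cpp
#include <cmath>
#include <vector>

using Mat = std::vector<std::vector<double>>;

static Mat mul(const Mat& A, const Mat& B) {
    int n = A.size();
    Mat C(n, std::vector<double>(n, 0.0));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

// Crude bound ||A|| <= n * max |A_ij| for the operator norm.
static double crude_bound(const Mat& A) {
    double m = 0.0;
    for (auto& row : A) for (double x : row) m = std::fmax(m, std::fabs(x));
    return A.size() * m;
}

// Upper bound for the spectral norm ||M|| (M nonzero), based on
// ||M||^2 = ||M^T M|| and ||A^2|| = ||A||^2 for symmetric A: square
// A = M^T M k times, renormalizing before each squaring to avoid overflow,
// and finish with the crude bound.  Roughly 2^k - O(log n) bits of the
// result are correct; rounding errors are not tracked in this sketch.
double spectral_norm_bound(const Mat& M, int k) {
    int n = M.size();
    Mat A(n, std::vector<double>(n, 0.0));
    for (int i = 0; i < n; i++)               // A = M^T M
        for (int j = 0; j < n; j++)
            for (int l = 0; l < n; l++)
                A[i][j] += M[l][i] * M[l][j];
    // Invariant: log ||M^T M|| <= logA + weight * log ||A||.
    double logA = 0.0, weight = 1.0;
    for (int s = 0; s < k; s++) {
        double b = crude_bound(A);            // renormalize before squaring
        for (auto& row : A) for (double& x : row) x /= b;
        logA += weight * std::log(b);
        weight /= 2;
        A = mul(A, A);
    }
    logA += weight * std::log(crude_bound(A));
    return std::exp(0.5 * logA);              // bound for ||M||, up to rounding
}
```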
In the lucky case when has an isolated maximal eigenvalue, a certification of this method will provide sharp upper bounds for D\<\|\|\>> in reasonable time. Even after the above improvements, the computation of sharp upper bounds forM\<\|\|\>> remains quite more expensive than ordinary matrix multiplication. For this reason, it is probably wise to avoid ball arithmetic in n>|\>>> except if there are good reasons to expect that the improved quality is really useful for the application in mind. Moreover, when using ball arithmetic in n>|\n>>>, it is often possible to improve algorithms in ways to reduce overestimation. When interpreting a complete computation as a dag, this can be achieved by minimizing the depth of the dag, by using an algorithm which is better suited for parallelization. Let us illustrate this idea for the computation of the -th power > of a matrix . When using Horner's method (multiply the identity matrix times by ), we typically observe an overestimation of bits (as for the example()). If we use binary powering, based on the rule <\eqnarray*> >||k/2\>*Mk/2\>,>>>> then the precision loss drops down to bits. We will encounter a less obvious application of the same idea in section below. There are two typical applications of power series \[[z]]> or \[[z]]> with certified error bounds. When occurs as a generating function in a counting problem or random object generator, then we are interested in the computation of the coefficients > for large, together with reliable error bounds. A natural solution is to systematically work with computable power series with ball coefficients in>[[z]]>. For many applications, we notice that is fixed, whereas p> may become very large. The second typical application is when \[[z]]> is the local expansion of an analytic function on a disk >> and we wish to evaluate at a point with \>. The geometric decrease of *z\|> implies that we will need only coefficients of the series. In order to bound the remaining error using Cauchy's formula, we do not only need bounds for the individual coefficients >, but also for the norm f\<\|\|\>>> defined in(). Hence, it is more natural to work with serial balls in [\][[z]]|\>>, while using the \\<\|\|\>>> norm. Modulo a rescaling f(\*z)>, it will be convenient to enforce =1>. In order to compute sharp upper bounds > for f\<\|\|\>=\<\|\|\>f\<\|\|\>>, it will also be convenient to have an algorithm which computes bounds >> for the tails <\eqnarray*> >||*z+f*z+\.>>>> Compared to the computation of the corresponding head <\eqnarray*> >||+\+f*z,>>>> we will show in section that the computation of such a tail bound is quite cheap. Again the question arises how to represent > in a reliable way. We may either store aglobal upper bound for the error, so that \[\][z]|\>>, or compute individual bounds for the errors, so that \\[\][z]>. If our aim is to evaluate at a point with 1>, then both representations \[\][z]> and =|\<\|\|\>\<\|\|\>>>\[\][z]|\>> give rise to evaluations (z)\\[\]> with equally precise error bounds. Since the manipulation of global error bounds is more efficient, the corresponding representation should therefore be preferred in this case. In the multivariate case, one has the additional benefit that ``small'' coefficients*z> ( \|\\*\<\|\|\>f\<\|\|\>>) can simply be replaced by a global error \|>>, thereby increasing the sparsity of . On the other hand, individual error bounds admit the advantage that rescaling f(\*z)> is cheap. 
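The conversion from individual coefficient bounds to a single global bound for the sup-norm on the closed unit disk, together with the absorption of small coefficients into the global error, may be sketched as follows. The threshold eps and the crude coefficient-sum bound for the norm are illustrative choices, not part of the method above.

```python
import numpy as np

def to_polynomial_ball(centers, radii, eps=1e-20):
    # Ball polynomial (one radius per coefficient) -> polynomial ball
    # (single radius) for the sup-norm on |z| <= 1: the individual errors
    # add up to at most sum(radii) on the disk.  Coefficients that are tiny
    # relative to a crude norm bound are absorbed into the global error,
    # which increases the sparsity of the center polynomial.
    out = np.array(centers, dtype=complex)
    glob = float(np.sum(radii))
    norm = float(np.sum(np.abs(out)))          # crude upper bound for ||f|| on the disk
    small = np.abs(out) < eps * norm
    glob += float(np.sum(np.abs(out[small])))  # dropping a coefficient c costs |c|
    out[small] = 0.0
    return out, glob
```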
If we suddenly find out that is actually convergent on a larger disk and want to evaluate at a point with 1>, then we will not have to recompute the necessary error bounds for > from scratch. Serial ball representations similar to what has been described above are frequently used in the area of Taylor models for the validated long term integration of dynamical systems. In the case of Taylor models, there is an additional twist: given adynamical system of dimension , we not only compute a series expansion with respect to the time, but also with respect to small perturbations ,\,\> of the initial conditions. In particular, we systematically work with power series in several variables. Although such computations are more expensive, the extra information may be used in order to increase the sharpness of the computed bounds. A possible alternative is to compute the expansions in ,\,\> only up to the first order and to use binary splitting for the multiplication of the resulting Jacobian matrices on the integration path. This approach will be detailed in section. In order to study the reliable multiplication of series and , let us start with the case when [[z]]|\>>, using the sup-norm on the unit disk. We may take <\eqnarray*> =f*g>||*|~>|[\<\|\|\>\<\|\|\>*\<\|\|\>\<\|\|\>+\<\|\|\>\<\|\|\>*(\<\|\|\>\<\|\|\>+\<\|\|\>\<\|\|\>)+\]>>,>>>> where =*|~>> stands for a >-approximation of *>. Since > is really anumeric algorithm for the computation of its coefficients, the difficulty resides in the fact that> has to be chosen once and for all, in such a way that the bound -*\<\|\|\>> will be respected at the limit. A reasonable choice is =\*\<\|\|\>\<\|\|\>*\<\|\|\>\<\|\|\>>. We next distribute this error over the infinity of coefficients: picking some \1>, each coefficient )> is taken to be an )*\*\]>-approximation of *>. Of course, these computation may require alarger working precision than . Nevertheless, and are actually convergent on aslightly larger disk >>. Picking =1/\>, the required increase of the working precision remainsmodest. Let us now turn our attention to the multiplication of two computable series \[[z]]> with ball coefficients. Except for naive power series multiplication, based on the formula =f*g>, most other multiplication algorithms (whether relaxed or not) use polynomial multiplication as a subalgorithm. We are thus left with the problem of finding an efficient and high quality algorithm for multiplying two polynomials \[z]> of degrees> n>. In order to simplify the reading, we will assume that ,P,Q,Q> are all non-zero. As in the case of matrix multiplication, there are various approaches with different qualities, efficiencies and aptitudes to profit from already available fast polynomial arithmetic in [z]>. Again, the naive )> approach provides almost optimal numerical stability and qualities for the error bounds. However, this approach is both slow from an asymptotic point of view and unable to rely on existant multiplication algorithms in [z]>. If the coefficients of and are all of the same order of magnitude, then we may simply convert and into polynomial balls in [z]|\>> for the norm \\<\|\|\>>> and use the following crude formula for their multiplication: <\eqnarray*> ||*|~>|[n*\<\|\|\>\<\|\|\>>*\<\|\|\>\<\|\|\>>+n*\<\|\|\>\<\|\|\>>*(\<\|\|\>\<\|\|\>>+\<\|\|\>\<\|\|\>>)+\]>>,>>>> where *|~>> stands for a >-approximation of *>. 
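The crude multiplication formula for polynomial balls can be implemented along the following lines, using the max-norm on coefficient vectors; eps is a caller-supplied bound for the rounding errors of the numeric center product (in a certified implementation it would be derived from the working precision and directed rounding).

```python
import numpy as np

def mul_polynomial_ball(Pc, Pr, Qc, Qr, eps=0.0):
    # Product of polynomial balls (center, global radius).  Every coefficient
    # of P*Q is a sum of at most n = min(len Pc, len Qc) terms, whence
    #   radius = n*||Pc||*Qr + n*Pr*(||Qc|| + Qr) + eps
    # in the coefficientwise max-norm.
    n = min(len(Pc), len(Qc))
    Hc = np.convolve(Pc, Qc)          # any fast polynomial multiplication may be used
    Hr = n * np.max(np.abs(Pc)) * Qr + n * Pr * (np.max(np.abs(Qc)) + Qr) + eps
    return Hc, Hr
```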
In other words, we may use any efficient multiplication algorithm in [z]> for the approximation of *>, provided that we have a means to compute a certified bound> for the error. In our application where and correspond to ranges of coefficients in the series and, we usually have \P*\i>> and \Q*\i>> for the convergence radii > and > of and . In order to use(), it is therefore important to scale P(z/\)> and Q(z/\)> for a suitable >. If we are really interested in the evaluation of at points in a disk >, then we may directly take =r>. In we are rather interested in the coefficients of , then =min(\,\)> is the natural choice. However, since > and> are usually unknown, we first have to compute suitable approximations for them, based on the available coefficients of and. A good heuristic approach is to determine indices n/2\j> such that <\eqnarray*> |P>>|>||P>,>>>> for all , and to take <\eqnarray*> >|>||P>>>>>> as the approximation for >. Recall that the numerical Newton polygon of > is the convex hull of all points \|-\)\\> with \0>. Consider the edge of > whose projection on the -axis contains . Then and are precisely the extremities of this projection, so they can be computed in linear time. For large precisions , the scaling algorithm is both very efficient and almost of an optimal quality. For small and large , there may be some precision loss which depends on the nature of the smallest singularities of and . Nevertheless, for many singularities, such as algebraic singularities, the precision loss is limited to bits. For a more detailed discussion, we refer to6.2>. In other applications, where and are not obtained from formal power series, it is usually insufficient to scale using a single factor >. This is already the case when multiplying and *z> for small \\>, since the error bound for =\> exceeds>. One possible remedy is to ``precondition'' and according to their numerical Newton polygons and use the fact that > is close to the Minkowski sum +N>. More precisely, for each , let )\0> denote the number such that ))>> lies on one of the edges of >. Then )> is of the same order of magnitude as >, except for indices for which > is small ``by accident''. Now consider the preconditioned relative error <\eqnarray*> >||)>|(N)>>>>> and similarly for . Then <\eqnarray*> ]>>|>|+\+\*\)*(N+N),>>>> if is computed using infinite precision ball arithmetic. Assuming a numerically stable multiplication algorithm for the centers, as proposed in, and incorporating the corresponding bound for the additional errors into the right hand side of(), we thus obtain an efficient way to compute an upper bound for >. Notice that the numerical Newton polygon > has a close relation to the orders of magnitudes of the roots of . Even though the error bounds for some ``accidentally small'' coefficients > may be bad for the above method, the error bounds have a good quality if we require them to remain valid for small perturbations of the roots of . The algorithms from the previous section can in particular be used for the relaxed multiplication of two ball power series \[[z]]>. In particular, using the implicit equation <\eqnarray*> ||>>>> this yields a way to compute 1>>. Unfortunately, the direct application of this method leads to massive overestimation. For instance, the computed error bounds for the inverses of <\eqnarray*> ||>>>|||>>>>> coincide. 
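Returning to the scaling heuristic described earlier in this section, the radius of convergence may be estimated from the numerical Newton polygon as sketched below. We take the edge of the upper convex hull of the points (k, log |P_k|) whose projection on the x-axis contains n/2; this is one way to realize the condition on the indices i ≤ n/2 ≤ j, and the whole procedure is heuristic, as explained above.

```python
import numpy as np

def estimate_radius(P):
    # Heuristic radius-of-convergence estimate (|P_i|/|P_j|)^(1/(j-i)),
    # where [i, j] is the projection of the upper-hull edge of the numerical
    # Newton polygon of P that contains n/2.
    pts = [(k, np.log(abs(c))) for k, c in enumerate(P) if c != 0]
    if len(pts) < 2:
        return float("inf")
    hull = []                                   # upper convex hull, left to right
    for x, y in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (x - x1) <= (y - y1) * (x2 - x1):
                hull.pop()                      # hull[-1] lies on or below the chord
            else:
                break
        hull.append((x, y))
    n2 = len(P) / 2
    for (i, yi), (j, yj) in zip(hull, hull[1:]):
        if i <= n2 <= j:
            return float(np.exp((yi - yj) / (j - i)))
    return float("inf")
```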
Indeed, even when using a naive relaxed multiplication, the coefficients of and are computed using the recurrences <\eqnarray*> >||+2*g>>|>||-2*h,>>>> but the error bound > for > and > is computed using the same recurrence <\eqnarray*> >||+2*\,>>>> starting with \\>. For \>, it follows that \\*\n>> and \\*\*\n>> for some >, where \0.281> is the smallest root of -2*\>. Hence the error bound for > is sharp. On the other hand, \\*\> for some >, where => is the smallest root of +2*\>. Hence, the error bound > is (\/\)> bits too pessimistic in the case of . The remedy is similar to what we did in the case of matrix inversion. We first introduce the series <\eqnarray*> >||>\\[[z]]>>|>||*(z*f-1)]|z>\\[[z]]>>>> and next compute using <\eqnarray*> |||1-z*\>,>>>> where > is inverted using the formula(). This approach has the advantage of being compatible with relaxed power series expansion and it yields high quality error bounds. Another typical operation on power series is exponentiation. Using relaxed multiplication, we may compute the exponential of an infinitesimal power series z*\[[z]]> using the implicit equation <\eqnarray*> ||f*g,>>>> where > stands for ``distinguished integration'' in the sense that h)=0> for all . Again, one might fear that this method leads to massive overestimation. As a matter of fact, it usually does not. Indeed, assume for simplicity that =0>, so that \[[z]]>. Recall that denotes the power series with =\|f\|>. Roughly speaking, the error bound for>, when computed using the formula() will be the coefficient > of the power series =exp(\*\|f\|)>. Since has the same radius of convergence as, it directly follows that the bit precision loss is sublinear . Actually, the dominant singularity of often has the same nature as the dominant singularity of . In that case, the computed error bounds usually become very sharp. The observation generalizes to the resolution of linear differential equations, by taking a square matrix of power series for . In the previous sections, we have seen various reliable algorithms for the computation of the expansion > of a power series \[[z]]> up till a given order . Such expansions are either regarded as ball polynomials in [\][z]> or as polynomial balls in [\][z]|\>>. Assuming that is convergent on the closed unit disk >, it remains to be shown how to compute tail bounds >>. We will follow. Given a ball polynomial, then we notice that reasonably sharp upper and lower bounds > and> for can be obtained efficiently by evaluating at primitive roots of unity using fast Fourier transforms6.2>. We will assume that the series is either an explicit series, the result of an operation on other power series or the solution of an implicit equation. Polynomials are the most important examples of explicit series. Assuming that is a polynomial of degree > d>, we may simply take <\eqnarray*> >>||n>*f>>|n\d>>||>>>>>,>>>> where <\eqnarray*> >||*z+\+f*z.>>>> For simple operations on power series, we may use the following bounds: <\eqnarray*> >>||>+>>>|>>||>+>>>|>>||>*(>+>)+>*>+*g)>>>|f)>>||>*>,>>>> where <\eqnarray*> *g)>>|>|\|f\|*\|g\|>>>> can be computed in time . Due to possible overestimation, division has to be treated with care. Given an infinitesimal power series > and <\eqnarray*> ||>,>>>> so that <\eqnarray*> >||*f-f|1-\>,>>>> we take <\eqnarray*> >>||*f)>*|+>>>>,>>>> where *f)>> is computed using(). 
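For illustration, the exponentiation recurrence derived from the implicit equation g = ∫ f′·g reads as follows in plain floating point; with ball coefficients, the same loop yields the bounds discussed above.

```python
import numpy as np

def series_exp(f, n):
    # exp of a power series with f[0] == 0, coefficient by coefficient:
    #   k * g_k = sum_{j=1..k} j * f_j * g_{k-j},   g_0 = 1.
    # With ball coefficients, the computed bounds are governed by
    # exp(|f_1| z + |f_2| z^2 + ...), hence stay reasonably sharp.
    g = np.zeros(n, dtype=complex)
    g[0] = 1.0
    for k in range(1, n):
        g[k] = sum(j * f[j] * g[k - j]
                   for j in range(1, min(k, len(f) - 1) + 1)) / k
    return g
```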
Let (f)> be an expression which is constructed from and polynomials, using the ring operations and distinguished integration (we exclude division in order to keep the discussion simple). Assume that each coefficient (f)> only depends on the previous coefficients ,\,f> of and not on ,f,\>. Then the sequence (0),\(0),\> tends to the unique solution of the implicit equation <\eqnarray*> ||(f),>>>> Moreover, (0)=f> for any n>. In() and(), we have already seen two examples of implicit equations of this kind. The following technique may be used for the computation of tail bounds >>. Given \>>> and assuming that >\c>, we may use the rules (--) in order to compute a ``conditional tail bound'' (f)\|c>> for \(f)\<\|\|\>>. Mathematically speaking, this bound has the property that for any power series with =f> and g\<\|\|\>\c>, we have \(g)\<\|\|\>\(g)\|c>>. If (g)\|c>\c>, then it follows that [\(f)]\<\|\|\>\c> for all . Given any disk >> with \1>, it follows that \(f)-f\<\|\|\>>\2*c*\/(1-\)> for any n>, since (f)-f=O(z)>. In other words, (f)> uniformly converges to on >>. Therefore, f\<\|\|\>>\c> and f\<\|\|\>\c>, by letting \1>. Let us now consider the mapping :c\(f)\|c>> and assume that > involves no divisions. When computing (f)\|c>> using infinite precision and the rules(--), we notice that > is a real analytic function, whose power series expansion contains only positive coefficients. Finding the smallest with (c)\c> thus reduces to finding the smallest fixed point > of >, if such a fixed point exists. We may use the secant method: <\eqnarray*> >|>|>|>|>|(c)>>|>|>|+(c)-c|c-\(c)+\(c)-c>*(c-c)>>>> If \c> for some or if exceeds a given threshold , then the method fails and we set >=\\>. Otherwise, > converges quadratically to >. As soon as /c-1\|\\>, for some given \0>, we check whether ()\> for =2*c-c>, in which case we stop. The resulting > is an approximation of > with relative accuracy \0>. Assuming that (f)> has been computed for every subexpression of >, we notice that the computation of (f)\|c>> only requires floating point operations, where is the size of> as an expression. More generally, the evaluation of (f)\|c>> for different hypotheses ,\,c> only requires operations, since the heads (f)> do not need to be recomputed. Our algorithm for the computation of > therefore requires at most floating point operations. Taking (n)/n)>, it follows that the cost of the tail bound computation remains negligible with respect to the series expansion itself. The approach generalizes to the case when is a vector or matrix of power series, modulo a more involved method for the fixed point computation6.3>. If is indeed convergent on >, then it can also be shown that > indeed admits a fixed point if is sufficiently large. Let us now consider a dynamical system <\eqnarray*> ||\(f),>>>> where is a vector of unknown complex power series, \[\]>, and > an expression built up from the entries of and polynomials in [\][z]> using the ring operations. Given \[\]\{0}>>, denote by t>> the scaled power series and by > the shifted power series , assuming that converges at . Also denote by t>> and > the expressions which are obtained when replacing each polynomial in > by t>> >. Then t>> and > satisfy <\eqnarray*> t>>||\t>(ft>)>>|>||\(f)>>>> Since() is aparticular case of an implicit equation of the form(), we have an algorithm for the computation of tail bounds >\(\>\{\})> for on >. Modulo rescaling(), we may also compute tail bounds >>> on more general disks(0,\)>>. 
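The secant iteration for the smallest fixed point of the conditional tail bound map φ may be sketched as follows. The failure conventions and parameter names are ours, but the update rule, the failure test on decreasing or exploding iterates, and the final check φ(c̃) ≤ c̃ for a slightly inflated candidate follow the description above.

```python
import math

def smallest_fixed_point(phi, c0, threshold=1e30, rel_tol=1e-4, max_iter=100):
    # Secant iteration on g(c) = phi(c) - c for an increasing map phi.
    # Returns an approximate smallest c with phi(c) <= c, or inf on failure.
    cp, fp = c0, phi(c0)
    if fp <= cp:
        return cp                         # c0 is already an unconditional bound
    c, f = fp, phi(fp)
    for _ in range(max_iter):
        denom = (c - f) + (fp - cp)
        if denom <= 0:
            return math.inf
        cn = c + (f - c) * (c - cp) / denom
        if cn < c or cn > threshold:
            return math.inf               # the method fails
        if abs(cn / c - 1) < rel_tol:
            cb = 2 * cn - c               # slightly inflated candidate
            if phi(cb) <= cb:
                return cb
        cp, fp = c, f
        c, f = cn, phi(cn)
    return math.inf
```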
Assume that we want to integrate() along the real axis, as far as possible, and performing all computations with an approximate precision of bits. Our first task is to find a suitable initial step size and evaluate at . Since we require a relative precision of approximately bits, we roughly want to take \/2>, where > is the radius of convergence of , and evaluate the power series up to order p>. We thus start by the numerical computation of > and the estimation |~>> of >, based on the numerical Newton polygon of >. Setting |~>/2>, we next compute a tail bound >>. This bound is considered to be of an acceptable quality if <\eqnarray*> >>|>|*>.>>>> In the contrary case, we keep setting t/2> and recomputing >>, until we find a bound of acceptable quality. It can be shown that this process ultimately terminates, when is sufficiently large. We thus obtain an appropriate initial step size which allows us to compute a >-approximation of with =2*\*>+>(t))>>. In principle, we may now perform the translation t+z> and repeat the analytic continuation process using the equation(). Unfortunately, this approach leads to massive overestimation. Indeed, if the initial condition is given with relative precision >, then the relative precisions of the computed coefficients > are typically a non trivial factor times larger than >, as well as the next ``initial condition'' at . Usually, we therefore lose bits of precision at every step. One remedy is the following. Let =\> be the analytic continuation operator which takes an initial condition \> on input and returns the evaluation \> of the unique solution to() with . Now instead of computing > using a ball initial condition \[\]>, we rather use its center > as our initial condition and compute ()> using ball arithmetic. In order to obtain a reliable error bound for , we also compute the Jacobian >> of > at > using ball arithmetic, and take <\eqnarray*> ||()+>()*]>>.>>>> The Jacobian can either be computed using the technique of automatic differentiation or using series with coefficients in a jet space of order one. With this approach ()> is computed with an almost optimal -bit accuracy, and >()> with an accuracy which is slightly worse than the accuracy of the initial condition. In the lucky case when >()> is almost diagonal, the accuracy of will therefore be approximately the same as the accuracy of . However, if >()> is not diagonal, such as in(), then the multiplication >()*> may lead to overestimation. This case may already occur for simple linear differential equations such as <\eqnarray*> >|>>>>>||>|>>>>+|1>>||>>>>*>|>>>>>>>> and the risk is again that we lose bits of precision at every step.\ Let us describe amethod for limiting the harm of this manifestation of the wrapping effect. Consider the analytic continuations of at successive points \t\\\t> and denote <\eqnarray*> >||,t>>()>),i=1,\,k.>>>> Instead of computing the error at > due to perturbation of the initial condition using the formula <\eqnarray*> ]>||*(J*(\J**\)),>>>> we may rather use the formula <\eqnarray*> ]>||*J*\*J)*,>>>> where matrix chain products are computed using a variant of binary powering(): <\eqnarray*> *\*J>||*\*J(i+j)/2\+1>)*(J(i+j)/2\>*\*J).>>>> In order to be complete, we also have to take into account the additional error>, which occurs during the computation of ,t>()>)>. Setting =>, we thus take <\eqnarray*> )>>||+J*\+\+(J*\*J)*\.>>>> When using this algorithm, at least in the case of simple systems such as(), the precision loss at step will be limited to bits. 
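The dichotomic combination of the local errors and Jacobians may be sketched as follows. For clarity, this offline version keeps all the J_i in memory, whereas the incremental variant described next only stores O(log k) of them; in a real implementation the matrices and vectors would be balls rather than plain arrays.

```python
import numpy as np

def combine_errors(J_list, eps_list):
    # Balanced combination of per-step Jacobians J_1..J_k and local errors
    # eps_1..eps_k.  Returns (eps, J) with
    #   J   = J_k * ... * J_1
    #   eps = eps_k + J_k*eps_{k-1} + ... + (J_k*...*J_2)*eps_1,
    # computed with a product tree of depth O(log k), so that only O(log k)
    # wrappings occur instead of one per step.
    if len(J_list) == 1:
        return eps_list[0], J_list[0]
    m = len(J_list) // 2
    eps_lo, J_lo = combine_errors(J_list[:m], eps_list[:m])
    eps_hi, J_hi = combine_errors(J_list[m:], eps_list[m:])
    return eps_hi + J_hi @ eps_lo, J_hi @ J_lo
```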
Notice that we benefit from the fact that the Jacobian matrices remain accurate as long as the initial conditions remain accurate. Although we have reduced the wrapping effect, the asymptotic complexity of the above algorithm is cubic or at least quadratic in when evaluating() using a binary splitting version of Horner's method. Let us now describe the final method which requires only matrix multiplications up till step . For i\j>, we define <\eqnarray*> >||+J*\+\+(J*\*J)*\.>>|>||*\*J,>>>> where =1>. At stage >+\+2>> with \\\e>, we assume that <\equation*> >|>|>|>|>+\+2>,2>+\+2>>>>|>|>|>|>|>+\+2>,2>+\+2>>>>>>> are stored in memory for all j\l>. From these values, we may compute <\eqnarray*> )>>||+J*(\+\J*(\+*J*\)\)>>>> using only matrix multiplications. Now assume that we go to the next stage . If is maximal such that > divides , then >+\+2>+2>. Consequently, =\> and =J> for l\l-m+1> and <\eqnarray*> ]>>||+J*(\+\J+2]>*(\+1]>+J+1]>*\]>)\)>>|]>>||*J*\*J]>.>>>> Hence, the updated lists > and > can be computed using matrix multiplications. Furthermore, we only need to store auxiliary matrices at step. There is a vast literature on validated integration of dynamical systems and reduction of the wrapping effect. We refer to for a nice review. Let us briefly discuss the different existing approaches. The wrapping effect was noticed in the early days of interval arithmetic and local coordinate transformations were proposed as a remedy. The idea is to work as much as possible with respect to coordinates in which all errors are parallel to axes. Hence, instead of considering blocks \>, we rather work with parallelepipeds >, with \>, \d>> and \>. A natural choice for after steps is *\*J>, but more elaborate choices may be preferred. Other geometric shapes for the enclosures have been advanced in the literature, such as ellipsoids, which are also invariant under linear transformations, and zonotopes. However, as long as we integrate() using a straightforward iterative method, and even if we achieve a small average loss \1> of the bit precision at a single step, the precision loss after steps will be of the form >. The idea of using dichotomic algorithms in order to reduce the wrapping effect was first described in the case of linear differential equations. The previous section shows how to adapt that technique to non linear equations. We notice that the method may very well be combined with other geometric shapes for the enclosures: this will help to reduce the precision loss to > instead of > in the above discussion. Notice that there are two common misunderstandings about the dichotomic method. Contrary to what we have shown above (see also in the linear case), it is sometimes believed that we need to keep all matrices ,\,J> in memory and that we have to reevaluate products *\*J> over and over again. Secondly, one should not confuse and of the wrapping effect: if is the 2> rotation matrix by a >> angle, then none of its repeated squarings >> will be the identity matrix, so every squaring >)> will involve a wrapping. Even though we have eliminated the wrapping effect, we achieve to reduce the number of wrappings to instead of >. Taylor models are another approach for the validated long term integration of dynamical systems. The idea is to rely on higher -th order expansions of the continuation operator >. This allows for real algebraic enclosures which are determined by polynomials of degree . Such enclosures are more precise for non linear problems. 
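Returning to the parallelepiped remedy mentioned above, one continuation step maps the enclosure {c + A·e : |e_i| ≤ r_i} through the (approximate) Jacobian of the step; wrapping only occurs when the result is forced back into an axis-aligned box, as the following sketch makes explicit. The function is a structural illustration only, with plain floating point standing in for ball arithmetic.

```python
import numpy as np

def step_parallelepiped(J, c, A, r, rewrap=False):
    # Enclosure { c + A e : |e_i| <= r_i } mapped through x -> J x.
    # Without rewrapping we keep the transported basis J @ A, so no wrapping
    # occurs; with rewrapping we fall back to an axis-aligned box of radius
    # |J A| r, which is what causes the classical wrapping effect when done
    # at every step.
    c2, A2 = J @ c, J @ A
    if rewrap:
        r2 = np.abs(A2) @ r
        return c2, np.eye(len(c)), r2
    return c2, A2, r
```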
However, the method requires us to work in order jet spaces in variables; the mere storage of such a jet involves > coefficients. From atheoretical point of view it is also not established that Taylor methods eliminate the wrapping effect by themselves. Nevertheless, Taylor models can be combined with any of the above methods and the non linear enclosures seem to be more adapted in certain cases. For a detailed critical analysis, we refer to. Let us finally investigate how sharp good error bounds actually may be. If =f(0)>\<\|\|\>/\<\|\|\>>> denotes the relative precision of the initial condition at the start and =f(t)>\<\|\|\>/\<\|\|\>>> the relative precision of the final result, then it is classical that <\eqnarray*> >|>|(J>(f(0)))*\,>>|(J>(f(0)))>||J>(f(0))\<\|\|\>*\<\|\|\>J>(f(0))1>\<\|\|\>,>>>> where (J>(f(0)))> is the condition number of the matrix >(f(0))>. We propose to define the condition number (\,f(0),0,t)> for the integration problem() on by <\eqnarray*> =\(\,f(0),0,t)>||t\t\t>\(J,t>>(f(t))).>>>> Indeed, without using any particular mathematical properties of > or , we somehow have to go through the whole interval . Of course, it may happen that > is indeed special. For instance, if =M*f> for a matrix with a special triangular or diagonal form, then special arguments may be used to improve error bounds and more dedicated condition numbers will have to be introduced. However, in general, we suspect that a precision loss of > cannot be avoided. If> gets large, the only real way to achieve long term stability is to take 2*log \> (say), whence the interest of efficient multiple precision and high order ball algorithms. It seems to us that the parallelepiped method should manage to keep the precision loss bounded by >, for 2*log \> and \\>. The algorithm from section (without coordinate transformations) only achieves a > in the worst case, although a > bound is probably obtained in many cases of interest. We plan to carry out a more detailed analysis once we will have finished a first efficient multiple precision implementation. <\bibliography|bib|alpha|all.bib> <\bib-list|DGG+02b> O.Aberth. . McGraw-Hill, New York, 1980. G.Alefeld and J.Herzberger. . Academic Press, New York, 1983. ANSI/IEEE. IEEE standard for binary floating-point arithmetic. Technical report, ANSI/IEEE, New York, 2008. ANSI-IEEE Standard 754-2008. Revision of IEEE 754-1985, approved on June 12, 2008 by IEEE Standards Board. ANSI/IEEE. IEEE interval standard working group - p1788., 2009. E.Bishop and D.Bridges. . Die Grundlehren der mathematische Wissenschaften. Springer, Berlin, 1985. Andrea Balluchi, Alberto Casagrande, Pieter Collins, Alberto Ferrari, Tiziano Villa, and AlbertoL. Sangiovanni-Vincentelli. Ariadne: a framework for reachability analysis of hybrid automata. In , pages 1269–--1267, Kyoto, July 2006. A.Bostan, F.Chyzak, F.Ollivier, B.Salvy, É. Schost, and A.Sedoglavic. Fast computation of power series solutions of systems of differential equation. preprint, april 2006. submitted, 13 pages. M.Berz. Cosy infinity version 8 reference manual. Technical Report MSUCL-1088, Michigan State University, East Lansing, 1998. D.A. Bini and G.Fiorentino. Design, analysis, and implementation of a multiprecision polynomial rootfinder. , 23:127--173, 2000. D.H. Bailey, Y.Hida, and X.S. Li. QD, a C/C++/Fortran library for double-double and quad-double arithmetic. , 2000. D.H. Bailey, Y.Hida, and X.S. Li. Algorithms for quad-double precision floating point arithmetic. In , pages 155--162, June 2001. R.P. 
Brent and H.T. Kung. Fast algorithms for manipulating formal power series. , 25:581--595, 1978. Walter Baur and Volker Strassen. The complexity of partial derivatives. , 22:317--330, 1983. D.G. Cantor and E.Kaltofen. On fast multiplication of polynomials over arbitrary algebras. , 28:693--701, 1991. J.W. Cooley and J.W. Tukey. An algorithm for the machine calculation of complex Fourier series. , 19:297--301, 1965. D.Coppersmith and S.Winograd. Matrix multiplication via arithmetic progressions. In > Annual Symposium on Theory of Computing>, pages 1--6, New York City, may 25--27 1987. J.-G. Dumas, T.Gautier, M.Giesbrecht, P.Giorgi, B.Hovinen, E.Kaltofen, B.D. Saunders, W.J. Turner, and G.Villard. Linbox: A generic library for exact linear algebra. In , pages 40--50, Beijing, China, 2002. J.-G. Dumas, T.Gautier, M.Giesbrecht, P.Giorgi, B.Hovinen, E.Kaltofen, B.D. Saunders, W.J. Turner, and G.Villard. Project linbox: Exact computational linear algebra. , 2002. U.W.Kulisch etal. Scientific computing with validation, arithmetic requirements, hardware solution and language support. , 1967. J.-P. Eckmann, H.Koch, and P.Wittwer. A computer-assisted proof of universality in area-preserving maps. , 47(289), 1984. J.-P. Eckmann, A.Malaspinas, and S.Oliffson-Kamphorst. A software tool for analysis in function spaces. In K.Meyer and D.Schmidt, editors, , pages 147--166. Springer, New York, 1991. M.Fürer. Faster integer multiplication. In , pages 57--66, San Diego, California, 2007. J.vonzur Gathen and J.Gerhard. . Cambridge University Press, 2-nd edition, 2002. T.Granlund et al. GMP, the GNU multiple precision arithmetic library., 1991. A.Grzegorczyk. On the definitions of computable real continuous functions. , 44:61--71, 1957. T.N. Gambill and R.D. Skeel. Logarithmic reduction of the wrapping effect with application to ordinary differential equations. , 25(1):153--162, 1988. B.Haible. CLN,a Class Library for Numbers. , 1995. G.Hanrot, V.Lefèvre, K.Ryde, and P.Zimmermann. MPFR, a C library for multiple-precision floating-point computations with exact rounding. , 2000. E.Hansen and R.Smith. Interval arithmetic in matrix computations, part ii. , 4:1--9, 1967. L.Jaulin, M.Kieffer, O.Didrit, and E.Walter. . Springer, London, 2001. W.Kühn. Rigourously computed orbits of dynamical systems without the wrapping effect. , 61:47--67, 1998. V.Kreinovich, A.V. Lakeyev, J.Rohn, and P.T. Kahl. . Springer, 1997. A.Karatsuba and J.Ofman. Multiplication of multidigit numbers on automata. , 7:595--596, 1963. V.Kreinovich and S.Rump. Towards optimal use of multi-precision arithmetic: a remark. Technical Report UTEP-CS-06-01, UTEP-CS, 2006. U.W. Kulisch. . Number33 in Studies in Mathematics. de Gruyter, 2008. B.Lambov. Reallib: An efficient implementation of exact real arithmetic. , 17(1):81--98, 2007. R.Lohner. . PhD thesis, Universität Karlsruhe, 1988. R.Lohner. On the ubiquity of the wrapping effect in the computation of error bounds. In U.Kulisch, R.Lohner, and A.Facius, editors, , pages 201--217, Wien, New York, 2001. Springer. N.Müller. iRRAM, exact arithmetic in C++. , 2000. K.Makino and M.Berz. Remainder differential algebras and their applications. In M.Berz, C.Bischof, G.Corliss, and A.Griewank, editors, , pages 63--74, SIAM, Philadelphia, 1996. K.Makino and M.Berz. Suppression of the wrapping effect by Taylor model-based validated integrators. Technical Report MSU Report MSUHEP 40910, Michigan State University, 2004. R.E. Moore. 
Automatic local coordinate transformations to reduce the growth of error bounds in interval computation of solutions to ordinary differential equation. In L.B. Rall, editor, , volume2, pages 103--140. John Wiley, 1965. R.E. Moore. . Prentice Hall, Englewood Cliffs, N.J., 1966. Jean-Michel Muller. . Birkhauser Boston, Inc., 2006. A.Neumaier. . Cambridge university press, Cambridge, 1990. A.Neumaier. The wrapping effect, ellipsoid arithmetic, stability and confedence regions. , 9:175--190, 1993. A.Neumaier. Taylor forms - use and limits. , 9:43--79, 2002. K.Nickel. How to fight the wrapping effect. In Springer-Verlag, editor, , pages 121--132, 1985. V.Pan. , volume 179 of Springer, 1984. N.Revol. MPFI, a multiple precision interval arithmetic library. , 2001. S.M. Rump. Fast and parallel inteval arithmetic. , 39(3):534--554, 1999. S.M. Rump. INTLAB - INTerval LABoratory. In Tibor Csendes, editor, inReliable Computing>, pages 77--104. Kluwer Academic Publishers, Dordrecht, 1999. . A.Schönhage and V.Strassen. Schnelle Multiplikation grosser Zahlen. , 7:281--292, 1971. V.Strassen. Gaussian elimination is not optimal. , 13:352--356, 1969. A.Turing. On computable numbers, with an application to the Entscheidungsproblem. , 2(42):230--265, 1936. J.vander Hoeven. Relax, but don't be too lazy. , 34:479--542, 2002. J.vander Hoeven etal. Mathemagix, 2002. . J.vander Hoeven. Computations with effective real numbers. , 351:52--60, 2006. J.vander Hoeven. Newton's method and FFT trading. Technical Report 2006-17, Univ. Paris-Sud, 2006. Submitted to JSC. J.vander Hoeven. New algorithms for relaxed multiplication. , 42(8):792--802, 2007. J.vander Hoeven. On effective analytic continuation. , 1(1):111--175, 2007. J.vander Hoeven. Making fast multiplication of polynomials numerically stable. Technical Report 2008-02, Université Paris-Sud, Orsay, France, 2008. J.vander Hoeven. Meta-expansion of transseries. In Y.Boudabbous and N.Zaguia, editors, , pages 390--398, Mahdia, Tunesia, May 2008. J.vander Hoeven, B.Mourrain, and Ph. Trébuchet. Efficient and reliable resolution of eigenproblems. In preparation. K.Weihrauch. . Springer-Verlag, Berlin/Heidelberg, 2000. 