Efficient certification ofnumeric solutions to eigenproblems

 Joris van der Hoeven

 Laboratoire d'informatique, UMR 7161 CNRS Campus de l'École polytechnique 1, rue Honoré d'Estienne d'Orves Bâtiment Alan Turing, CS35003 91120 Palaiseau France Email: vdhoeven@lix.polytechnique.fr

 Bernard Mourrain

 Inria Sophia Antipolis Méditerranée AROMATH 2004 route des Lucioles 06902 Sophia Antipolis France

 Email: Bernard.Mourrain@inria.fr

 October 24, 2017

 In this paper, we present an efficient algorithm for the certification of numeric solutions to eigenproblems. The algorithm relies on a mixture of ball arithmetic, a suitable Newton iteration, and clustering of eigenvalues that are close. Keywords: Ball arithmetic, interval arithmetic, reliable computing, computable analysis A.M.S. subject classification: 65G20, 03F60, 65F99

## 1.Introduction

Let be the set of floating point numbers for a fixed precision and a fixed exponent range. We will denote . Consider an matrix with complex floating entries. The numeric eigenproblem associated to is to compute a transformation matrix and a diagonal matrix such that

The entries of are the approximate eigenvalues and the columns of are the approximate eigenvectors of . In addition, we might require that is normalized. For instance, each of the columns might have unit norm. Alternatively, the norm of the -th column may be required to be the same as the norm of the -th row of , for each . There are several well-known algorithms for solving the numeric eigenproblem [6].

Unfortunately, (1) is only an approximate equality. It is sometimes important to have rigourous bounds for the distance between the approximate eigenvalues and/or eigenvectors and the genuine ones. More precisely, we may ask for a diagonal matrix and a matrix such that there exists a matrix for which

is diagonal and

for all . This task will be called the certification problem of the numeric solution to the eigenproblem for . The matrices and can be thought of as reliable error bounds for the numerical solution of the eigenproblem.

It will be convenient to rely on ball arithmetic [11, 14], which is a systematic technique for this kind of bound computations. When computing with complex numbers, ball arithmetic is more accurate than more classical interval arithmetic [17, 1, 18, 13, 15, 21], especially in multiple precision contexts. We will write for the set of balls with centers in and radii in . In a similar way, we may consider matricial balls : given a center matrix and a radius matrix , we have

Alternatively, we may regard as the set of matrices in with ball coefficients:

Standard arithmetic operations on balls are carried out in a reliable way. For instance, if , then the computation of the product using ball arithmetic has the property that for any and . Given a ball , it will finally be convenient to write and for certified lower and upper bounds of in .

In the language of ball arithmetic, it is natural to allow for small errors in the input and replace the numeric input by a ball input . Then we may still compute a numeric solution

for the eigenproblem associated to the center . Assume that the matrices in are all diagonalizable. The generalized certification problem now consists of the computation of a diagonal matrix and a matrix such that, for every , there exist and with

In absence of multiple eigenvalues, known algorithms for solving this problem such as [23, 20] proceed by the individual certification of each eigenvector, which results in an running time. From the more theoretical perspective of -theory [3], we also refer to [2] for numerically stable, strongly accurate, and theoretically efficient algorithms for solving eigenproblems.

Extensions to a cluster of eigenvalues and the corresponding eigenvectors have been considered in [4, 22], with similar complexity bounds. Fixed points theorem based on interval arithmetic are used to prove the existence of a matrix with a given Jordan block in the matrix interval domain. Such an approach has been exploited for the analysis of multiple roots in [7, 19]. A test that provides an enclosing of all the eigenvalues has been proposed in [16]. Its certification relies on interval and ball arithmetics. The complexity of the test is in but no iteration converging to the solution of the eigenproblem is described.

In this paper, we present a new algorithm of time complexity for certifying and enclosing clusters of eigenvectors and eigenvalues in a single step. We also provide an iterative procedure that converges geometrically to clusters of solutions. This convergence is quadratic in the case of single eigenvalues. Our algorithm extends a previous algorithm from [11] to the case of multiple eigenvalues. This yields an efficient test for approximate eigenvalues.

From a more theoretical bit complexity point of view, our algorithm essentially reduces the certification problem to a constant number of numeric matrix multiplications. When using a precision of bits for numerical computations, it has recently been shown [9] that two matrices can be multiplied in time . Here with is the cost of -bit integer multiplication [10, 8] and is the exponent of matrix multiplication [5]. If is large enough with respect to the log of the condition number, then yields an asymptotic bound for the bit complexity of our certification problem.

We recall that it is very unlikely that the numeric matrix with complex floating point coefficients has multiple eigenvalues. Indeed, small perturbations of matrices with multiple eigenvalues, as induced by rounding errors, generically only have simple eigenvalues. Consequently, we may assume without loss of generality that the numeric eigenproblem (2) has a reasonably accurate solution (if necessary, we may slightly perturb and increase accordingly). Using ball arithmetic, it is straightforward to compute the matricial ball

If our numerical algorithm is accurate, then the non diagonal entries of tend to be small, whence can be considered as a small perturbation of a diagonal matrix. If we can estimate how far eigenvalues and eigenvectors of diagonal matrices can drift away under small perturbations, we thus obtain a solution to the original certification problem.

Section 2 introduces notations. In Section 3, we perform a detailed study of the eigenproblem for small perturbations of diagonal matrices. We exhibit a Newton iteration for finding the solutions. This iteration has quadratic convergence in the absence of multiple eigenvalues and is also an efficient tool for doubling the precision of a solution. However, in the case of multiple eigenvalues, the eigenproblem is ill-posed. Indeed, by a well-known observation, any vector occurs as the eigenvector of a small perturbation of the identity matrix. The best we can hope for is to group eigenvectors with close eigenvalues together in “clusters” (see also [22]) and only require to be block diagonal. For this reason, we present our Newton iteration in a sufficiently general setting which encompasses block matrices. We will show that the iteration still admits geometric convergence for sufficiently small perturbations and that the blockwise certification is still sufficient for the computation of rigourous error bounds for the eigenvalues. In Section 4, we will present explicit algorithms for clustering and the overall certification problem.

In absence of multiple eigenvalues, the Analyziz library of the Mathemagix system [12] contains an efficient implementation of our algorithm. The new algorithm from this paper has still to be integrated.

## 2.Notations

### 2.1.Matrix norms

Throughout this paper, we will use the max norm for vectors and the corresponding matrix norm. More precisely, given a vector and an matrix , we set

For a second matrix , we clearly have

Explicit machine computation of the matrix norm is easy using the formula

In particular, when changing certain entries of a matrix to zero, its matrix norm can only decrease.

### 2.2.Clustering

Assume that we are given a partition

Such a partition will also be called a clustering and denoted by . Two indices are said to belong to the same cluster if there exists a with and we will write . Two entries and of a matrix are said to belong to the same block if and . We thus regard as a generalized block matrix, for which the rows and columns of the blocks are not necessarily contiguous inside .

A matrix is said to be block diagonal (relative to the clustering) if whenever . Similarly, we say that is off block diagonal if whenever . For a general , we define its block diagonal and off block diagonal projections and by

By our observation at the end of section 2.1, we have

For the trivial clustering , the matrices and are simply the diagonal and off diagonal projections of . In that case we will also write and .

### 2.3.Diagonal matrices

Below, we will study eigenproblems for perturbations of a given diagonal matrix

It follows from (3) that the matrix norm of a diagonal matrix is given by

It will also be useful to define the separation number by

More generally, given a clustering as in the previous subsection, we also define the block separation number by

This number remains high if the clustering is chosen in such a way that the indices of any two “close” eigenvalues and belong to the same cluster. In particular, if , then implies .

## 3.Eigenproblems for perturbed diagonal matrices

### 3.1.The linearized equation

Let be a diagonal matrix (5). Given a small perturbation

of , where is an off diagonal matrix, the aim of this section is to find a small matrix for which

is block diagonal. In other words, we need to solve the equation

When linearizing this equation in and , we obtain

If is strongly off diagonal, then so is , and the equation further reduces to

This equation can be solved using the following lemma:

Lemma 1. Given a matrix and a diagonal matrix with entries , let be the strongly off diagonal matrix with

Then and

Proof. The inequality follows from (3) and the definition of . One may check (6) using a straightforward computation.

### 3.2.The fundamental iteration

In view of the lemma, we now consider the iteration

where

In order to study the convergence of this iteration, we introduce the quantities

Lemma 2. For , assume that

Then and

Proof. We have

where

Setting , the remainder is bounded by

Consequently,

The inequalities and follow from .

### 3.3.Convergence of the fundamental iteration

Theorem 3. Assume that

Then the sequence

converges geometrically to a limit with and . The matrix is block diagonal and there exists a matrix with, such that

Proof. Let stand for the -th fundamental iterate of and . Denote , , and . Let us show by induction over that

This is clear for . Assume that the induction hypothesis holds for a given and let

Since , the induction hypothesis implies

Applying Lemma 2 for and , we thus find

This completes the induction.

Applying the induction to the sequence starting at , we have for every ,

This shows that is a Cauchy sequence that tends to a limit with . From this inequality, we also deduce that , so converges geometrically to .

Moreover, for each , we have . Hence, the matrix

is well defined, and

We deduce that

since .

We claim that converges geometrically to

For any matrix with , we have

Let By the same arguments as above, we have . Since , the inequality (7) implies

This shows that converges geometrically to . We deduce that the sequence also converges geometrically to a limit with . Since , we finally observe that is block diagonal.

Theorem 4. Assume for all . Then, under the assumptions of Theorem 3, the sequence converges quadratically to .

Proof. The extra assumption implies that for all . Let us show by induction over that we now have

This is clear for . Assume that the result holds for a given . Then we may apply Lemma 2 to for , and obtain

Since , this establishes the quadratic convergence.

## 4.Algorithms

### 4.1.Clustering

Let be the perturbation of a diagonal matrix (5) as in the previous section. In order to apply theorem 3, we first have to find a suitable clustering (4). For a given threshold separation , we will simply take the finest clustering (i.e. for which is maximal) with the property that . This clustering can be computed using the algorithm Cluster below.

Algorithm Cluster

 Input: eigenvalues and Output: the finest clustering (4) with
• Let be the graph with vertices and such that and are connected if and only if .

• Let be the transitive closure of .

• Let the connected components of .

• Let be the set of vertices of for each .

### 4.2.Certification in the case of perturbed diagonal matrices

In order to apply theorem 3, it now remains to find a suitable threshold for which the conditions of the theorem hold. Starting with , we will simply increase to whenever the conditions are not satisfied. This will force the number of clusters to decrease by at least one at every iteration, whence the algorithm terminates. Notice that the workload of one iteration is , so the total running time remains bounded by .

Algorithm DiagonalCertify

 Input: a diagonal ball matrix with entries and an off diagonal ball matrix Output: a clustering and such that, for any and , the conditions of theorem 3 hold and

Repeat

Compute the clustering for and using Cluster

Let , , and

Let

If and , then return

Set

### 4.3.Certification of approximate eigenvectors and eigenvalues

Let us now return to the original problem of certifying a numerical solution to an eigenproblem. We will denote by the matrix of which all entries are one.

Algorithm EigenvectorCertify

 Input: and such that is approximately diagonal Output: a clustering and such that for any , there exists a for which is block diagonal

Compute

Let

Let

Let for all

Return

Obviously, any eigenvalue of a matrix satisfies . We may thus use the following modification of EigenvectorCertify in order to compute enclosures for the eigenvalues of .

Algorithm EigenvalueCertify

 Input: and such that is approximately diagonal Output: ball enclosures for the eigenvalues of , with the appropriate multiplicities in cases of overlapping

Compute

Let

Let and

For each do

If for some , then let

Otherwise

Let be the barycenter of the with

Let be the maximum of for

Let for all

Return

## 5.Possible extensions

Let be a matrix with a (numerically) multiple eigenvalue . We have already stressed that it is generally impossible to provide non trivial certifications for the corresponding eigenvectors. Nevertheless, two observations should be made:

• If the eigenspace corresponding to has dimension , then small perturbations of the matrix only induce small perturbations of and .

• Let denote the full invariant subspace associated to the eigenvalue (or all eigenvalues in the cluster of ). Then small perturbations of only induce small perturbations of and .

More precisely, in these two cases, we may search for ball enclosures for orthonormal bases of the vector spaces resp. , which do not contain the zero vector.

When considering the numeric solution (1) of the eigenproblem for , the column vectors which generate are usually far from being orthogonal. Orthonormalization can only be done at the expense of making only upper triangular. Moreover, the orthogonalization implies a big loss of accuracy, which requires the application of a correction method for restoring the accuracy. It seems that the fundamental Newton iteration from Section 3.2 can actually be used as a correction method. For instance, for small perturbations of the matrix

it can be shown that the fundamental iteration still converges. However, for more general block diagonal matrices with triangular blocks, the details are quite technical and yet to be worked out.

Yet another direction for future investigations concerns the quadratic convergence. As a refinement of Lemma 1, we might replace by a block diagonal matrix with entries . Instead of taking , we then have to solve equations of the form

If the are sufficiently close to , it might then be possible to adapt the fundamental iteration accordingly so as to achieve quadratic convergence for the strongly off diagonal part.

## Bibliography

[1] G. Alefeld and J. Herzberger. Introduction to interval analysis. Academic Press, New York, 1983.

[2] D. Armentano, C. Beltrán, P. Bürgisser, F. Cucker, and M. Shub. A stable, polynomial-time algorithm for the eigenpair problem. Technical report, ArXiv, 2014. http://arxiv.org/abs/1505.03290.

[3] L. Blum, F. Cucker, M. Shub, and S. Smale. Complexity and real computation. Springer-Verlag, 1998.

[4] J. J. Dongarra, C. B. Moler, and J. H. Wilkinson. Improving the accuracy of computed eigenvalues and eigenvectors. SIAM Journal on Numerical Analysis, 20(1):23–45, 1983.

[5] F. Le Gall. Powers of tensors and fast matrix multiplication. In Proc. ISSAC 2014, pages 296–303, Kobe, Japan, July 23–25 2014.

[6] G. H. Golub and F. Van Loan. Matrix Computations. JHU Press, 1996.

[7] S. Graillat and Ph. Trébuchet. A new algorithm for computing certified numerical approximations of the roots of a zero-dimensional system. In Proc. ISSAC '09, pages 167–174. ACM Press, 2009.

[8] D. Harvey. Faster truncated integer multiplication. https://arxiv.org/abs/1703.00640, 2017.

[9] D. Harvey and J. van der Hoeven. On the complexity of integer matrix multiplication. Technical report, HAL, 2014. http://hal.archives-ouvertes.fr/hal-01071191, accepted for publication in JSC.

[10] D. Harvey, J. van der Hoeven, and G. Lecerf. Even faster integer multiplication. Journal of Complexity, 36:1–30, 2016.

[11] J. van der Hoeven. Ball arithmetic. Technical report, HAL, 2009. http://hal.archives-ouvertes.fr/hal-00432152.

[12] J. van der Hoeven, G. Lecerf, B. Mourrain, et al. Mathemagix, 2002. http://www.mathemagix.org.

[13] L. Jaulin, M. Kieffer, O. Didrit, and E. Walter. Applied interval analysis. Springer, London, 2001.

[14] F. Johansson. Arb: A c library for ball arithmetic. ACM Commun. Comput. Algebra, 47(3/4):166–169, 2014.

[15] U. W. Kulisch. Computer Arithmetic and Validity. Theory, Implementation, and Applications. Number 33 in Studies in Mathematics. de Gruyter, 2008.

[16] S. Miyajima. Fast enclosure for all eigenvalues in generalized eigenvalue problems. Journal of Computational and Applied Mathematics, 233(11):2994–3004, 2010.

[17] R. E. Moore. Interval Analysis. Prentice Hall, Englewood Cliffs, N.J., 1966.

[18] A. Neumaier. Interval methods for systems of equations. Cambridge university press, Cambridge, 1990.

[19] S. Rump and S. Graillat. Verified error bounds for multiple roots of systems of nonlinear equations. Num. Algs., 54:359–377, 2010.

[20] S. M. Rump. Guaranteed inclusions for the complex generalized eigenproblem. Computing, 42(2):225–238, 1989.

[21] S. M. Rump. INTLAB - INTerval LABoratory. In Tibor Csendes, editor, Developments in Reliable Computing, pages 77–104. Kluwer Academic Publishers, Dordrecht, 1999. http://www.ti3.tu-harburg.de/rump/.

[22] S. M. Rump. Computational error bounds for multiple or nearly multiple eigenvalues. Linear algebra and its applications, 324(1-3):209–226, 2001.

[23] T. Yamamoto. Error bounds for computed eigenvalues and eigenvectors. Numerische Mathematik, 34(2):189–199, 1980.