Certified Singular Value Decomposition

Joris van der Hoeven

CNRS, Laboratoire LIX

Campus de l'École Polytechnique

1 rue Honoré d'Estienne d'Orves

Bâtiment Alan Turing CS35003

91120 Palaiseau

France

Email: vdhoeven@lix.polytechnique.fr

Jean-Claude Yakoubsohn

Institut de Mathématiques de Toulouse

Université Paul Sabatier

118 route de Narbonne

31062 Toulouse Cedex 9

France

Email: yak@mip.ups-tlse.fr

November 4, 2023

In this paper, we present an efficient algorithm for the certification of numeric singular value decompositions (SVDs) in the regular case, i.e., in the case when all the singular values are pairwise distinct. Our algorithm is based on a Newton-like iteration that can also be used for doubling the precision of an approximate numerical solution.

1.Introduction

Let $\mathbb{F}$ be the set of floating point numbers for a fixed precision and a fixed exponent range. We denote $\mathbb{F}[\mathrm{i}] := \{x + y\,\mathrm{i} : x, y \in \mathbb{F}\}$. Consider an $m \times n$ matrix $A \in \mathbb{F}[\mathrm{i}]^{m \times n}$ with complex floating point entries, where $m \geqslant n$. The problem of computing the numeric singular value decomposition of $A$ is to compute unitary transformation matrices $U \in \mathbb{F}[\mathrm{i}]^{m \times m}$, $V \in \mathbb{F}[\mathrm{i}]^{n \times n}$, and a diagonal matrix $\Sigma \in \mathbb{F}^{m \times n}$ such that

$$A \approx U \Sigma V^*. \tag{1}$$

If $m > n$, then $\Sigma \in \mathbb{F}^{m \times n}$ is understood to be “diagonal” if it is of the form

$$\Sigma = \begin{pmatrix} \operatorname{diag}(\sigma_1, \ldots, \sigma_n) \\ 0 \end{pmatrix}.$$

The diagonal entries $\sigma_1, \ldots, \sigma_n$ of $\Sigma$ are the approximate singular values of the matrix $A$, and throughout this paper we will assume them to be pairwise distinct and ordered:

$$\sigma_1 > \sigma_2 > \cdots > \sigma_n > 0.$$

There are several well-known algorithms for the computation of numeric singular value decompositions [8, 4].
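For concreteness, here is a minimal Python sketch (Python with NumPy is used for all illustrations below; note that NumPy's svd returns the adjoint $V^*$ directly) of the object (1) that serves as input to our algorithm:

```python
import numpy as np

# A numeric SVD A ~ U Sigma V* in the shapes used in this paper (m >= n).
rng = np.random.default_rng(1)
m, n = 5, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
U, s, Vh = np.linalg.svd(A)                 # U: m x m, s: n values, Vh = V*
Sigma = np.zeros((m, n)); Sigma[:n, :n] = np.diag(s)
print(np.allclose(A, U @ Sigma @ Vh))       # True up to rounding errors
```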

Now (1) is only an approximate equality. It is sometimes important to have a rigorous bound for the distance between an approximate solution and some exact solution. More precisely, we may ask for a diagonal matrix $R_\Sigma$ and matrices $R_U$, $R_V$ with nonnegative entries, such that there exist unitary matrices $\hat U$, $\hat V$, and a diagonal matrix $\hat\Sigma$ for which

$$A = \hat U \hat\Sigma \hat V^*$$

and

$$|\hat U_{ij} - U_{ij}| \leqslant (R_U)_{ij}, \qquad |\hat\Sigma_{ij} - \Sigma_{ij}| \leqslant (R_\Sigma)_{ij}, \qquad |\hat V_{ij} - V_{ij}| \leqslant (R_V)_{ij}$$

for all indices $i$ and $j$. This task will be called the certification problem for the given numeric singular value decomposition (1). The matrices $R_U$, $R_\Sigma$ and $R_V$ can be thought of as reliable error bounds for the matrices $U$, $\Sigma$ and $V$ of the numerical solution.

It will be convenient to rely on ball arithmetic [13, 19], which is a systematic technique for this kind of bound computations. When computing with complex numbers, ball arithmetic is more accurate than the more classical interval arithmetic [22, 1, 23, 18, 21, 24], especially in multiple precision contexts. We will write $\mathcal{B}(\mathbb{F}[\mathrm{i}], \mathbb{F})$ for the set of balls $\mathcal{B}(c, r) := \{ z \in \mathbb{C} : |z - c| \leqslant r \}$ with centers $c$ in $\mathbb{F}[\mathrm{i}]$ and radii $r$ in $\mathbb{F}^{\geqslant} := \{ x \in \mathbb{F} : x \geqslant 0 \}$. In a similar way, we may consider matricial balls: given a center matrix $M \in \mathbb{F}[\mathrm{i}]^{m \times n}$ and a radius matrix $R \in (\mathbb{F}^{\geqslant})^{m \times n}$, we have

$$\mathcal{B}(M, R) := \{ \hat M \in \mathbb{C}^{m \times n} : |\hat M_{ij} - M_{ij}| \leqslant R_{ij} \text{ for all } i, j \}.$$

Alternatively, we may regard $\mathcal{B}(M, R)$ as the matrix in $\mathcal{B}(\mathbb{F}[\mathrm{i}], \mathbb{F})^{m \times n}$ with ball coefficients:

$$\mathcal{B}(M, R)_{ij} := \mathcal{B}(M_{ij}, R_{ij}).$$

Standard arithmetic operations on balls are carried out in a reliable way. For instance, if $\mathcal{M} \in \mathcal{B}(\mathbb{F}[\mathrm{i}], \mathbb{F})^{m \times n}$ and $\mathcal{N} \in \mathcal{B}(\mathbb{F}[\mathrm{i}], \mathbb{F})^{n \times p}$, then the computation of the product $\mathcal{M} \mathcal{N}$ using ball arithmetic has the property that $\hat M \hat N \in \mathcal{M} \mathcal{N}$ for any $\hat M \in \mathcal{M}$ and $\hat N \in \mathcal{N}$.
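This enclosure property is easy to realize for the midpoint-radius representation. The following toy Python class (names ours, not the authors' Mathemagix code) implements the product rule; it deliberately ignores the rounding errors committed on the midpoint operations, which a production library such as Arb [19] must bound outward as well:

```python
import numpy as np

class BallMatrix:
    """Toy matricial ball B(mid, rad): the set of all complex matrices X
    with |X - mid| <= rad entrywise. Midpoint rounding errors are
    neglected in this sketch."""
    def __init__(self, mid, rad=None):
        self.mid = np.asarray(mid, dtype=complex)
        self.rad = np.zeros(self.mid.shape) if rad is None else np.asarray(rad, dtype=float)

    def __matmul__(self, other):
        # Enclosure of all products X Y with X in self and Y in other:
        # |X Y - mid mid'| <= |mid| rad' + rad |mid'| + rad rad'.
        return BallMatrix(self.mid @ other.mid,
                          np.abs(self.mid) @ other.rad
                          + self.rad @ np.abs(other.mid)
                          + self.rad @ other.rad)

    def __sub__(self, other):
        return BallMatrix(self.mid - other.mid, self.rad + other.rad)

    def ct(self):
        """Conjugate transpose (adjoint) of the ball."""
        return BallMatrix(self.mid.conj().T, self.rad.T)
```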

In the language of ball arithmetic, it is natural to allow for small errors in the input and replace the numeric input $A \in \mathbb{F}[\mathrm{i}]^{m \times n}$ by a ball input $\mathcal{A} = \mathcal{B}(A, R_A) \in \mathcal{B}(\mathbb{F}[\mathrm{i}], \mathbb{F})^{m \times n}$. Then we may still compute a numeric singular value decomposition of the center matrix $A$:

$$A \approx U \Sigma V^*. \tag{2}$$

The generalized certification problem now consists of the computation of matrices $R_U$, $R_V$, and a diagonal matrix $R_\Sigma$ such that, for every $\hat A \in \mathcal{A}$, there exist unitary matrices $\hat U$, $\hat V$ and a diagonal matrix $\hat\Sigma$ with

$$\hat A = \hat U \hat\Sigma \hat V^*, \qquad \hat U \in \mathcal{B}(U, R_U), \qquad \hat\Sigma \in \mathcal{B}(\Sigma, R_\Sigma), \qquad \hat V \in \mathcal{B}(V, R_V).$$

In this paper we propose an efficient solution for this problem in the case when all singular values are simple. Our algorithm relies on an efficient Newton iteration that is also useful for doubling the precision of a given numeric singular value decomposition. The iteration admits quadratic convergence and only requires sums and products of matrices of size at most $m \times m$. In [13, 15], a similar approach was used for the certification of eigenvalues and eigenvectors.

We are not aware of similarly efficient and robust algorithms in the literature. Jacobi-like methods such as Kogbetliantz's SVD algorithm admit quadratic convergence in the presence of clusters in the Hermitian case [3], but only linear convergence is achieved in general [5]. Gauss–Newton type methods have also been proposed for the approximation of the regular real SVD [17, 16].

From the theoretical bit complexity point of view, our algorithm essentially reduces the certification problem to a constant number of numeric matrix multiplications. When using a precision of $p$ bits for numerical computations, it has recently been shown [10] that two $n \times n$ matrices can be multiplied in time $\mathrm{MM}(n, p)$. Here $\mathrm{I}(p) = O(p \log p\, 4^{\log^* p})$ is the cost of $p$-bit integer multiplication [11, 9] and $\omega < 2.373$ is the exponent of matrix multiplication [7]; the bound on $\mathrm{MM}(n, p)$ from [10] is expressed in terms of these two quantities. If $p$ is large enough with respect to the logarithm of the condition number, then $\mathrm{MM}(m, p)$ yields an asymptotic bound for the bit complexity of our certification problem.

We have implemented unoptimized versions of the new algorithms in Mathemagix [14] and Matlab. These toy implementations indeed confirmed the quadratic convergence of our Newton iteration and the efficiency of the new algorithms. We intend to report more extensively on implementation issues in a forthcoming paper.

2.Notations

2.1.Matrix norms

Throughout this paper, we will use the max-norm for vectors and the corresponding matrix norm. More precisely, given positive integers $m$ and $n$, a vector $v \in \mathbb{C}^n$, and an $m \times n$ matrix $M \in \mathbb{C}^{m \times n}$, we set

$$\|v\| := \max_{1 \leqslant j \leqslant n} |v_j|, \qquad \|M\| := \max_{v \neq 0} \frac{\|M v\|}{\|v\|}.$$

For a second matrix $N \in \mathbb{C}^{n \times p}$, we clearly have

$$\|M N\| \leqslant \|M\|\, \|N\|.$$

We also define the norm of a matricial ball $\mathcal{B}(M, R)$ by

$$\|\mathcal{B}(M, R)\| := \max_{\hat M \in \mathcal{B}(M, R)} \|\hat M\| = \big\| |M| + R \big\|, \tag{3}$$

where $|M| := (|M_{ij}|)_{i,j}$ denotes the entrywise absolute value of $M$.

Explicit machine computation of the matrix norm is easy using the formula

$$\|M\| = \max_{1 \leqslant i \leqslant m} \big( |M_{i,1}| + \cdots + |M_{i,n}| \big). \tag{4}$$

Given a second matrix $N \in \mathbb{C}^{m \times n}$, it follows that the coefficientwise (Hadamard) product

$$(M \circ N)_{ij} := M_{ij}\, N_{ij}$$

satisfies

$$\|M \circ N\| \leqslant \|M\|\, \|N\|. \tag{5}$$

In particular, when changing certain entries of a matrix to zero, its matrix norm can only decrease. We will write $\mathcal{E}$ for the ball matrix whose entries are all unit balls $\mathcal{B}(0, 1)$. This matrix has the property that $M \in \|M\|\, \mathcal{E}$ for all matrices $M$.
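In Python, (4) and the Hadamard bound (5) can be checked directly (function name ours):

```python
import numpy as np

def matnorm(M):
    """Matrix norm (4) for the max-norm on vectors: the maximal row sum
    of absolute values (the classical infinity norm)."""
    return float(np.max(np.sum(np.abs(M), axis=1)))

# Sanity checks for submultiplicativity and the Hadamard bound (5).
rng = np.random.default_rng(0)
M, N = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
assert matnorm(M @ N) <= matnorm(M) * matnorm(N)
assert matnorm(M * N) <= matnorm(M) * matnorm(N)   # coefficientwise product
```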

2.2.Miscellaneous notations

In the sequel we consider two integers $m \geqslant n \geqslant 1$ and we introduce the following sets of matrices:

$$\mathrm{U}(k) := \{ M \in \mathbb{C}^{k \times k} : M^* M = I_k \} \tag{6}$$

and

$$\mathcal{D} := \{ \Sigma \in \mathbb{R}^{m \times n} : \Sigma_{ij} = 0 \text{ whenever } i \neq j \}.$$

We also write $\pi : \mathbb{C}^{m \times n} \to \mathcal{D}$ for the natural projection that replaces all non-diagonal entries by zeros. For any integer $k \geqslant 1$ we finally define the map $\mathcal{N} : \mathbb{C}^{k \times k} \to \mathbb{C}^{k \times k}$ by

$$\mathcal{N}(M) := M + \tfrac{1}{2}\, M\, (I_k - M^* M),$$

where $I_k$ is the identity matrix of size $k$.

3.Overview of our method

Given $A \in \mathbb{C}^{m \times n}$, the triple $(U, \Sigma, V)$ with $U \in \mathbb{C}^{m \times m}$, $\Sigma \in \mathcal{D}$, and $V \in \mathbb{C}^{n \times n}$ forms an SVD for $A$ if and only if it satisfies the following system of equations:

$$U \Sigma V^* = A, \qquad U^* U = I_m, \qquad V^* V = I_n. \tag{7}$$

This is a system of equations in the unknowns $U$, $\Sigma$, and $V$. Our efficient numerical method for solving this system will rely on the following principles:

  1. For a well-chosen ansatz $\dot U$ close to the unitary group $\mathrm{U}(m)$, we prove that

    $$\mathcal{N}(\dot U) = \dot U + \tfrac{1}{2}\, \dot U\, (I_m - \dot U^* \dot U)$$

    is even closer to the unitary group than $\dot U$: see section 4. Similarly, for an ansatz $\dot V$ close to $\mathrm{U}(n)$, we take $\mathcal{N}(\dot V) = \dot V + \tfrac{1}{2}\, \dot V\, (I_n - \dot V^* \dot V)$.

  2. From $\Sigma \in \mathcal{D}$ and a small perturbation $E$, we prove that it is possible to explicitly compute a diagonal matrix $\dot\Sigma$ and two skew Hermitian matrices $P$ and $Q$ such that

    $$\dot\Sigma + P \Sigma - \Sigma Q = E,$$

    after which $(I_m + P)\,(\Sigma + \dot\Sigma)\,(I_n + Q)^*$ is a first-order approximation of $\Sigma + E$: see section 5.

  3. Let $U \in \mathrm{U}(m)$, $\Sigma \in \mathcal{D}$, and $V \in \mathrm{U}(n)$. If $U \Sigma V^*$ is sufficiently close to $A$, then we will prove that the triple updated by one step of the iteration below yields a better approximation of the matrix $A$ than $(U, \Sigma, V)$.

More precisely, given $U_0 \in \mathbb{C}^{m \times m}$, $\Sigma_0 \in \mathcal{D}$, and $V_0 \in \mathbb{C}^{n \times n}$, we define the following sequences of matrices:

$$B_k := U_k^*\, A\, V_k \tag{8}$$

$$E_k := B_k - \Sigma_k \tag{9}$$

$$\Sigma_{k+1} := \Sigma_k + \dot\Sigma_k \tag{10}$$

$$U_{k+1} := \mathcal{N}\big( U_k\, (I_m + P_k) \big) \tag{11}$$

$$V_{k+1} := \mathcal{N}\big( V_k\, (I_n + Q_k) \big), \tag{12}$$

where $\dot\Sigma_k$ is a diagonal matrix and $P_k$, $Q_k$ are two skew Hermitian matrices such that

$$\dot\Sigma_k + P_k \Sigma_k - \Sigma_k Q_k = E_k. \tag{13}$$
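Under this reading of (8)–(13), one step of the iteration can be sketched as follows; `solve_first_order`, which solves (13) through the explicit formulas of Proposition 8, is the routine given in the sketch of section 5.1 below. This is an illustration only, not a certified implementation:

```python
import numpy as np

def svd_newton_step(A, U, sigma, V):
    """One Newton step (8)-(13), schematically: form the residual
    E = U* A V - Sigma, solve (13) for a diagonal correction and two
    skew-Hermitian rotations, update, then re-unitarize with the map N."""
    m, n = A.shape
    S = np.zeros((m, n)); S[:n, :n] = np.diag(sigma)
    E = U.conj().T @ A @ V - S                        # (8)-(9)
    dS, P, Q = solve_first_order(sigma, E)            # (13), cf. section 5.1
    sigma = sigma + np.diag(dS)                       # (10)
    U = U @ (np.eye(m) + P)                           # (11), rotation part
    V = V @ (np.eye(n) + Q)                           # (12), rotation part
    U = U + 0.5 * U @ (np.eye(m) - U.conj().T @ U)    # map N, cf. (17)
    V = V + 0.5 * V @ (np.eye(n) - V.conj().T @ V)
    return U, sigma, V
```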

In order to measure the quality of the ansatz $(U_k, \Sigma_k, V_k)$, we define

$$\epsilon_k := \max\big( \|E_k\|,\ \|I_m - U_k^* U_k\|,\ \|I_n - V_k^* V_k\| \big).$$

The main result of the paper is the following theorem that gives explicit conditions for the quadratic convergence of the sequence $(U_k, \Sigma_k, V_k)$, together with explicit error bounds.

Theorem 1. Let $A$, $U_0$, $\Sigma_0$, and $V_0$ be as above. Denote

where $\sigma_1, \ldots, \sigma_n$ stand for the diagonal entries of $\Sigma_0$. If

then the sequence defined by (10–12) converges quadratically towards an exact SVD of the matrix $A$, i.e. $A = U_\infty \Sigma_\infty V_\infty^*$, where $U_\infty := \lim_k U_k$ and $V_\infty := \lim_k V_k$ are unitary and $\Sigma_\infty := \lim_k \Sigma_k$ is diagonal. More precisely, for each $k$, we have

The proof of this theorem will be postponed to section 6. Assuming that the theorem holds, it naturally gives rise to the following algorithm for certifying an approximate SVD:

Algorithm 1

Input: an approximate SVD $U \Sigma V^* \approx A$ for the center $A$ of a ball matrix $\mathcal{A}$

Output: ball enclosures $\mathcal{U}$, $\mathcal{S}$, and $\mathcal{V}$ of $U$, $\Sigma$, and $V$ such that, for any $\hat A \in \mathcal{A}$, there exist $\hat U \in \mathcal{U}$, $\hat\Sigma \in \mathcal{S}$, and $\hat V \in \mathcal{V}$ such that $(\hat U, \hat\Sigma, \hat V)$ is an exact singular value decomposition of $\hat A$

  1. Compute $\mathcal{E} := \mathcal{B}(U, 0)^*\, \mathcal{A}\, \mathcal{B}(V, 0) - \mathcal{B}(\Sigma, 0)$ using ball arithmetic

  2. Let $\epsilon$ be an upper bound for $\|\mathcal{E}\|$

  3. Let $\epsilon_U$ and $\epsilon_V$ be upper bounds for $\|I_m - U^* U\|$ and $\|I_n - V^* V\|$ (with $U$ and $V$ playing the roles of $U_0$ and $V_0$ in Theorem 1)

  4. If the hypothesis of Theorem 1 cannot be verified for these bounds, then set $R_U := \infty$, $R_\Sigma := \infty$, $R_V := \infty$

  5. Else set $R_U$, $R_\Sigma$, and $R_V$ to the error bounds provided by Theorem 1 (using upward rounding)

  6. Set $\mathcal{U} := \mathcal{B}(U, R_U)$

  7. Set $\mathcal{S} := \mathcal{B}(\Sigma, R_\Sigma)$

  8. Set $\mathcal{V} := \mathcal{B}(V, R_V)$

  9. Return $(\mathcal{U}, \mathcal{S}, \mathcal{V})$

Theorem 2. Algorithm 1 is correct.

Proof. If the hypothesis of Theorem 1 cannot be verified, then we return matrix balls with infinite radii, for which the result is trivially correct. Otherwise, for any $\hat A \in \mathcal{A}$, the actual values of the residual and of the defects of unitarity of $U$ and $V$ are bounded by $\epsilon$, $\epsilon_U$, and $\epsilon_V$, so Theorem 1 applies for the ansatz $(U, \Sigma, V)$. As a consequence, we obtain an exact SVD $(\hat U, \hat\Sigma, \hat V)$ for $\hat A$ with the property that $|\hat U - U| \leqslant R_U$, $|\hat\Sigma - \Sigma| \leqslant R_\Sigma$, and $|\hat V - V| \leqslant R_V$. We conclude that $\hat U \in \mathcal{U}$, $\hat\Sigma \in \mathcal{S}$, and $\hat V \in \mathcal{V}$, as desired.

Remark 3. Notice that the algorithm does not use our Newton iteration in order to improve the quality of the approximate input SVD (in particular, the output is worthless whenever the returned radii are infinite). The idea is that Algorithm 1 is only used for certification, not for numerical approximation. The user is free to use any preferred algorithm for computing the initial approximate SVD. Of course, our Newton iteration can be of great use for increasing the precision of a rough approximate SVD that was computed by other means.
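As a small illustration of steps 1 and 2, the following sketch (built on the toy BallMatrix and matnorm routines from sections 1 and 2.1; the same caveat about neglected midpoint rounding applies) bounds the residual over all matrices in the input ball:

```python
import numpy as np

def residual_bound(A_mid, A_rad, U, sigma, V):
    """Steps 1-2 of Algorithm 1, schematically: an upper bound (up to the
    midpoint rounding neglected by the toy BallMatrix) for the norm of
    U* \hat{A} V - Sigma over all \hat{A} in B(A_mid, A_rad)."""
    m, n = A_mid.shape
    S = np.zeros((m, n)); S[:n, :n] = np.diag(sigma)
    E = (BallMatrix(U).ct() @ BallMatrix(A_mid, A_rad) @ BallMatrix(V)
         - BallMatrix(S))
    # Norm (3) of a matricial ball: apply (4) to |mid| + rad.
    return matnorm(np.abs(E.mid) + E.rad)
```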

4.Polar projection

Since we are doing approximate computations, the unitary matrices in an SVD are not given exactly, so we may wish to estimate the distance between an approximately unitary matrix and the closest actual unitary matrix. This is related to the following problem: given an approximately unitary matrix $\dot U$, find a good approximation of its projection onto the group $\mathrm{U}(n)$ of unitary matrices. We recall a Newton iteration for this problem [20, 2, 12] and provide a detailed analysis of its (quadratic) convergence.

4.1.The Newton iteration

The tangent space to $\mathrm{U}(n)$ at a point $U \in \mathrm{U}(n)$ is

$$T_U\, \mathrm{U}(n) = \{ U S : S \in \mathbb{C}^{n \times n},\ S^* = -S \}. \tag{14}$$

Consider the Riemannian metric inherited from the embedding space $\mathbb{C}^{n \times n} \cong \mathbb{R}^{2 n^2}$:

$$\langle M, N \rangle := \Re\, \operatorname{tr}(M^* N).$$

Then the normal space at $U$ is

$$N_U\, \mathrm{U}(n) = \{ U H : H \in \mathbb{C}^{n \times n},\ H^* = H \}.$$

We wish to compute the polar projection of $\dot U$ onto $\mathrm{U}(n)$ using an appropriate Newton iteration. From the characterization of the normal space, it turns out that it is more convenient to search for the next iterate in the form $\dot U (I + H)$, where $H$ is Hermitian. With $E := I - \dot U^* \dot U$ and $U' := \dot U (I + H)$, we have

$$I - U'^*\, U' = E - 2 H + H E + E H - H^2 + H E H.$$

Taking

$$H := \tfrac{1}{2}\, E = \tfrac{1}{2}\, (I - \dot U^* \dot U), \tag{15}$$

it follows that

$$I - U'^*\, U' = \tfrac{3}{4}\, E^2 + \tfrac{1}{4}\, E^3. \tag{16}$$

We are thus led to the following Newton iteration that we will further study below:

$$\dot U_{k+1} := \mathcal{N}(\dot U_k) = \dot U_k + \tfrac{1}{2}\, \dot U_k\, (I - \dot U_k^* \dot U_k). \tag{17}$$
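In Python, (17) reads as follows (a sketch without certified rounding; names ours):

```python
import numpy as np

def polar_newton(U, tol=1e-14, max_iter=50):
    """Newton iteration (17): U <- U + (1/2) U (I - U* U), which converges
    quadratically to the polar projection of U when ||I - U* U|| < 1."""
    n = U.shape[0]
    for _ in range(max_iter):
        E = np.eye(n) - U.conj().T @ U
        if np.linalg.norm(E, np.inf) < tol:
            break
        U = U + 0.5 * U @ E
    return U
```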

Remark 4. Another way to construct the previous iteration is to remark that the derivative of the map $M \mapsto M^* M - I$ at $\dot U$ is onto when restricted to the subspace of matrices of the form $\dot U H$ with $H$ Hermitian. Then it is easy to see that for given $\dot U$ and $E := I - \dot U^* \dot U$, the matrix $H := \frac{1}{2} E$ satisfies the Newton equation up to higher order terms in $E$, i.e.,

$$H\, \dot U^* \dot U + \dot U^* \dot U\, H = E - E^2.$$

Consequently, $H = \frac{1}{2}\,(I - \dot U^* \dot U)$ is a first-order solution of the equation $H\, \dot U^* \dot U + \dot U^* \dot U\, H = E$. In this context the classical Newton operator thus becomes

$$\dot U \mapsto \dot U\, (I + H) = \mathcal{N}(\dot U).$$

4.2.Error analysis

Proposition 5. Let $\dot U$ be an $n \times n$ matrix with $\varepsilon := \|I - \dot U^* \dot U\| \leqslant 1$. Let $U' := \mathcal{N}(\dot U)$, where $\mathcal{N}$ is as in (17), and write $\varepsilon' := \|I - U'^* U'\|$. Then $\|U' - \dot U\| \leqslant \frac{1}{2}\, \|\dot U\|\, \varepsilon$ and

$$\varepsilon' \leqslant \tfrac{3}{4}\, \varepsilon^2 + \tfrac{1}{4}\, \varepsilon^3 \leqslant \varepsilon^2. \tag{18}$$

Proof. The conclusion follows from (16), since $\|E\| = \varepsilon$ and $\|U' - \dot U\| = \frac{1}{2}\, \|\dot U E\| \leqslant \frac{1}{2}\, \|\dot U\|\, \varepsilon$.

Lemma 6. Given , , and , we have

(19)

Proof. Modulo taking instead of , it suffices to consider the case when . Now

is an increasing function in and , since its power series expansion in and admits only positive coefficients. Consequently, .

We recall that any invertible matrix $M \in \mathbb{C}^{n \times n}$ admits a unique polar decomposition

$$M = U_M\, S_M,$$

where $U_M \in \mathrm{U}(n)$ and $S_M$ is a positive-definite Hermitian matrix. We call $U_M$ the polar projection of $M$ on $\mathrm{U}(n)$. The matrix $S_M$ can uniquely be written as the exponential of another Hermitian matrix. It is also well known that $U_M$ is indeed the closest element of $\mathrm{U}(n)$ to $M$ for the Riemannian metric [6, Theorem 1].

Theorem 7. Let $\dot U$ be such that $\varepsilon := \|I - \dot U^* \dot U\| < 1$. Then the Newton sequence (17) defined from $\dot U_0 := \dot U$ converges quadratically to the polar projection of $\dot U$. More precisely, for all $k$, we have

$$\|I - \dot U_k^* \dot U_k\| \leqslant \varepsilon^{2^k}.$$

Proof. The Newton sequence (17) defined from $\dot U_0 := \dot U$ gives

$$\dot U_{k+1} = \dot U_k\, (I + H_k),$$

with $H_k := \frac{1}{2}\, (I - \dot U_k^* \dot U_k)$. An obvious induction using Proposition 5 yields $\|I - \dot U_k^* \dot U_k\| \leqslant \varepsilon^{2^k}$ and $\|H_k\| \leqslant \frac{1}{2}\, \varepsilon^{2^k}$. Therefore this sequence converges to a limit $U_\infty$ that is given by

$$U_\infty = \dot U\, \prod_{k=0}^{\infty}\, (I + H_k).$$

Lemma 6 implies

(20)

More generally, we have

Since $U_\infty$ is unitary, we have $U_\infty^* U_\infty = I$. Neumann's lemma also implies that the product $S := \prod_{k=0}^{\infty} (I + H_k)$ is invertible with

By induction on $k$, it can also be checked that $H_k$ is a polynomial in $H_0$ with real coefficients, for all $k$. This means that the $H_k$ all commute, whence $S$ and $S^{-1}$ are actually Hermitian matrices. Since $S$ is invertible, the logarithm is well defined. We conclude that $S$ is the exponential of a Hermitian matrix, whence it is positive-definite. By uniqueness of the polar decomposition, it follows that $U_\infty$ is indeed the polar projection of $\dot U$.
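A quick empirical check of Theorem 7 (again a sketch; the reference polar factor is obtained from a library SVD, using that the polar projection of $M = W \Sigma Z^*$ is $W Z^*$):

```python
import numpy as np

# Empirical check of Theorem 7: the defect ||I - U_k* U_k|| decays
# quadratically and the iterates approach the polar factor of U0.
rng = np.random.default_rng(2)
n = 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
U0 = Q + 0.02 * rng.standard_normal((n, n))      # approximately unitary start
W, _, Zh = np.linalg.svd(U0)                     # reference polar factor W @ Zh
U = U0
for k in range(6):
    print(k, np.linalg.norm(np.eye(n) - U.conj().T @ U, np.inf))
    U = U + 0.5 * U @ (np.eye(n) - U.conj().T @ U)   # one step of (17)
print("distance to polar factor:", np.linalg.norm(U - W @ Zh, np.inf))
```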

5.SVDs for perturbed diagonal matrices

5.1.Approximate solutions at order one

Let $\Sigma \in \mathcal{D}$ be a diagonal matrix with diagonal entries $\sigma_1 > \cdots > \sigma_n > 0$. Consider a perturbation $\Sigma + E$ of $\Sigma$, where $E \in \mathbb{C}^{m \times n}$ is small. We wish to compute an approximate SVD

$$\Sigma + E \approx (I_m + P)\, (\Sigma + \dot\Sigma)\, (I_n + Q)^*,$$

where $P \in \mathbb{C}^{m \times m}$, $Q \in \mathbb{C}^{n \times n}$, and $\dot\Sigma \in \mathcal{D}$. Discarding higher order terms, this leads to the linear equation

$$\dot\Sigma + P \Sigma - \Sigma Q = E,$$

with $I_m + P$ and $I_n + Q$ approximately unitary. In view of (14), this means that $P$ and $Q$ are skew Hermitian. The following proposition shows how to solve the linear equation explicitly under these constraints.

Proposition 8. Let $\sigma_1 > \cdots > \sigma_n > 0$ and let $E \in \mathbb{C}^{m \times n}$. Consider the diagonal matrix $\dot\Sigma \in \mathcal{D}$ with diagonal entries $\dot\sigma_1, \ldots, \dot\sigma_n$ and the two skew Hermitian matrices $P$ and $Q$ that are defined by the following formulas:

$$\dot\sigma_i := \Re(E_{ii}) \qquad (1 \leqslant i \leqslant n) \tag{21}$$

$$P_{ii} := \frac{\Im(E_{ii})}{2\, \sigma_i}\, \mathrm{i}, \qquad Q_{ii} := -P_{ii} \qquad (1 \leqslant i \leqslant n) \tag{22}$$

$$P_{ij} := \frac{\sigma_j\, E_{ij} + \sigma_i\, \overline{E_{ji}}}{\sigma_j^2 - \sigma_i^2} \qquad (1 \leqslant i < j \leqslant n) \tag{23}$$

$$Q_{ij} := \frac{\sigma_i\, E_{ij} + \sigma_j\, \overline{E_{ji}}}{\sigma_j^2 - \sigma_i^2} \qquad (1 \leqslant i < j \leqslant n) \tag{24}$$

$$P_{ij} := -\overline{P_{ji}} \qquad (1 \leqslant j < i \leqslant n) \tag{25}$$

$$Q_{ij} := -\overline{Q_{ji}} \qquad (1 \leqslant j < i \leqslant n) \tag{26}$$

$$P_{ij} := \frac{E_{ij}}{\sigma_j}, \qquad P_{ji} := -\overline{P_{ij}} \qquad (n < i \leqslant m,\ 1 \leqslant j \leqslant n) \tag{27}$$

$$P_{ij} := 0 \qquad (n < i \leqslant m,\ n < j \leqslant m). \tag{28}$$

Then we have

$$\dot\Sigma + P \Sigma - \Sigma Q = E. \tag{29}$$

Proof. Since $P$ and $Q$ are skew Hermitian, their diagonal entries are purely imaginary. In view of (21), we thus get

$$\Re\big( \dot\sigma_i + P_{ii}\, \sigma_i - \sigma_i\, Q_{ii} \big) = \dot\sigma_i = \Re(E_{ii}) \qquad (1 \leqslant i \leqslant n).$$

By skew symmetry, for the equation

$$\dot\Sigma + P \Sigma - \Sigma Q = E$$

to hold, it is sufficient to have

$$\dot\sigma_i + (P_{ii} - Q_{ii})\, \sigma_i = E_{ii} \qquad (1 \leqslant i \leqslant n) \tag{30}$$

$$P_{ij}\, \sigma_j - \sigma_i\, Q_{ij} = E_{ij} \qquad (1 \leqslant i \neq j \leqslant n) \tag{31}$$

$$P_{ij}\, \sigma_j = E_{ij} \qquad (n < i \leqslant m,\ 1 \leqslant j \leqslant n). \tag{32}$$

The formulas (21) and (22) clearly imply (30). The $P_{ij}$ from (27) clearly satisfy (32) as well. For $1 \leqslant i < j \leqslant n$, the equations (31) for the index pairs $(i, j)$ and $(j, i)$ can be rewritten as

$$\begin{pmatrix} \sigma_j & -\sigma_i \\ -\sigma_i & \sigma_j \end{pmatrix} \begin{pmatrix} P_{ij} \\ Q_{ij} \end{pmatrix} = \begin{pmatrix} E_{ij} \\ \overline{E_{ji}} \end{pmatrix}.$$

Since $\sigma_i \neq \sigma_j$, this system is invertible, and the formulas (23–26) indeed provide us with a solution. The entries $P_{ij}$ with $i, j > n$ do not affect the product $P \Sigma$, so they can be chosen as in (28). In view of the skew symmetry constraints $P_{ji} = -\overline{P_{ij}}$ and $Q_{ji} = -\overline{Q_{ij}}$, we notice that the matrices $P$ and $Q$ are completely determined.
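The formulas (21)–(28) are straightforward to implement; the following Python sketch (names ours) constructs $\dot\Sigma$, $P$, $Q$ and verifies the identity (29) on a random example:

```python
import numpy as np

def solve_first_order(sigma, E):
    """Given pairwise distinct sigma[0] > ... > sigma[n-1] > 0 and an m x n
    perturbation E, return the real diagonal dS (m x n) and skew-Hermitian
    P (m x m), Q (n x n) of Proposition 8, so that dS + P S - S Q = E."""
    m, n = E.shape
    P = np.zeros((m, m), dtype=complex)
    Q = np.zeros((n, n), dtype=complex)
    dS = np.zeros((m, n))
    for i in range(n):
        dS[i, i] = E[i, i].real                          # (21)
        P[i, i] = 1j * E[i, i].imag / (2 * sigma[i])     # (22)
        Q[i, i] = -P[i, i]
    for i in range(n):
        for j in range(n):
            if i != j:                                   # (23)-(26)
                d = sigma[j] ** 2 - sigma[i] ** 2
                P[i, j] = (sigma[j] * E[i, j] + sigma[i] * np.conj(E[j, i])) / d
                Q[i, j] = (sigma[i] * E[i, j] + sigma[j] * np.conj(E[j, i])) / d
    for i in range(n, m):
        for j in range(n):
            P[i, j] = E[i, j] / sigma[j]                 # (27)
            P[j, i] = -np.conj(P[i, j])
    return dS, P, Q            # entries P[i, j] with i, j > n stay 0: (28)

# Verify (29) on a random example.
rng = np.random.default_rng(3)
m, n = 4, 3
sigma = np.array([3.0, 2.0, 1.0])
E = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
dS, P, Q = solve_first_order(sigma, E)
S = np.zeros((m, n)); S[:n, :n] = np.diag(sigma)
assert np.allclose(dS + P @ S - S @ Q, E)
assert np.allclose(P, -P.conj().T) and np.allclose(Q, -Q.conj().T)
```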

5.2.Error analysis

Proposition 9. Let $E \in \mathbb{C}^{m \times n}$. Assume that $\dot\Sigma$, $P$ and $Q$ are computed using (21–28). Denote

Given with , we have

(33)
(34)
(35)

Setting

and , we also have

Proof. From the formula (21) we clearly have . The formula (22) implies for all . For , the formulas (23–26) imply

whence

Similarly,

It follows that

From (27), and using (4), we also deduce that , for and . Combined with the fact that , we get

Since and , we now observe that

Plugging in the above norm bounds, we deduce that

In a similar way, one proves that .

6.Proof of Theorem 1

Let us denote

and, for each ,

where $\sigma_1^{(k)}, \ldots, \sigma_n^{(k)}$ denote the diagonal entries of $\Sigma_k$. Let us show by induction on $k$ that

(36)
(37)
(38)
(39)

These inequalities clearly hold for $k = 0$. Assuming that the induction hypothesis holds for a given $k$, let us prove it for $k + 1$.

By the definition of , we have and . Setting

we have

It follows that

Let . Applying Proposition 9 to , we get

Since , we have

Using (18), we obtain

Similarly,

Using and (13), we next have

It follows that

This completes the proof that . We also have

We deduce that and . Let us finally prove that . From , we get

so that

Similarly, using

we get

Hence , which completes the proof of the four induction hypotheses (36–39) at order $k + 1$.

From the continuity of the maps , , and , we deduce that the sequence $(U_k, \Sigma_k, V_k)$ converges. Let $(U_\infty, \Sigma_\infty, V_\infty)$ be the limit. By continuity, we have . The unitary matrix is of the form with

(40)

From above we know that

whence

Lemma 6 now implies

This shows that is invertible, with

Hence

From the definition of we also have

Using Lemma 6, this yields

Similar bounds can be computed for , , and . Altogether, this leads to

(41)
(42)
(43)
(44)

We finally have

since . This completes the proof.

Bibliography

[1]

Alefeld, G., and Herzberger, J. Introduction to interval analysis. Academic Press, New York, 1983.

[2]

Björck, Å., and Bowie, C. An iterative algorithm for computing the best estimate of an orthogonal matrix. SIAM J. on Num. Analysis 8, 2 (1971), 358–364.

[3]

Charlier, J.-P., and Van Dooren, P. On Kogbetliantz's SVD algorithm in the presence of clusters. Linear Algebra and its Applications 95 (1987), 135–160.

[4]

Demmel, J. W. Applied Numerical Linear Algebra, vol. 56. SIAM, 1997.

[5]

Drmač, Z. A global convergence proof for cyclic Jacobi methods with block rotations. SIAM Journal on Matrix Analysis and Applications 31, 3 (2009), 1329–1350.

[6]

Fan, K., and Hoffman, A. J. Some metric inequalities in the space of matrices. Proc. of the AMS 6, 1 (1955), 111–116.

[7]

Le Gall, F. Powers of tensors and fast matrix multiplication. In Proc. ISSAC 2014 (Kobe, Japan, July 2014), pp. 296–303.

[8]

Golub, G. H., and Van Loan, C. F. Matrix Computations. JHU Press, 1996.

[9]

Harvey, D., and Hoeven, J. v. d. Faster integer multiplication using short lattice vectors. Tech. rep., ArXiv, 2018. http://arxiv.org/abs/1802.07932, to appear in the Proceedings of ANTS 2018.

[10]

Harvey, D., and Hoeven, J. v. d. On the complexity of integer matrix multiplication. JSC 89 (2018), 1–8.

[11]

Harvey, D., Hoeven, J. v. d., and Lecerf, G. Even faster integer multiplication. Journal of Complexity 36 (2016), 1–30.

[12]

Higham, N. J. Matrix nearness problems and applications. In Applications of Matrix Theory (1989), M. J. C. Gover and S. Barnett, Eds., Oxford University Press, pp. 1–27.

[13]

Hoeven, J. v. d. Ball arithmetic. In Logical approaches to Barriers in Computing and Complexity (February 2010), A. Beckmann, C. Gaßner, and B. Löwe, Eds., no. 6 in Preprint-Reihe Mathematik, Ernst-Moritz-Arndt-Universität Greifswald, pp. 179–208. International Workshop.

[14]

Hoeven, J. v. d., Lecerf, G., Mourrain, B., et al. Mathemagix, 2002. http://www.mathemagix.org.

[15]

Hoeven, J. v. d., and Mourrain, B. Efficient certification of numeric solutions to eigenproblems. In Proc. MACIS 2017, Vienna, Austria (Cham, 2017), J. Blömer, I. S. Kotsireas, T. Kutsia, and D. E. Simos, Eds., Lect. Notes in Computer Science, Springer International Publishing, pp. 81–94.

[16]

Janovská, D., Janovský, V., and Tanabe, K. An algorithm for computing the analytic singular value decomposition. World Academy of Science, Engineering and Technology 47 (2008), 135–140.

[17]

Janovský, V., Janovská, D., and Tanabe, K. Computing the analytic singular value decomposition via a pathfollowing. In Numerical Mathematics and Advanced Applications. Springer, 2006, pp. 954–962.

[18]

Jaulin, L., Kieffer, M., Didrit, O., and Walter, E. Applied interval analysis. Springer, London, 2001.

[19]

Johansson, F. Arb: a C library for ball arithmetic. ACM Commun. Comput. Algebra 47, 3/4 (2014), 166–169.

[20]

Kovarik, Z. Some iterative methods for improving orthonormality. SIAM J. on Num. Analysis 7, 3 (1970), 386–389.

[21]

Kulisch, U. W. Computer Arithmetic and Validity. Theory, Implementation, and Applications. No. 33 in Studies in Mathematics. de Gruyter, 2008.

[22]

Moore, R. E. Interval Analysis. Prentice Hall, Englewood Cliffs, N.J., 1966.

[23]

Neumaier, A. Interval methods for systems of equations. Cambridge University Press, Cambridge, 1990.

[24]

Rump, S. M. INTLAB - INTerval LABoratory. In Developments in Reliable Computing, T. Csendes, Ed. Kluwer Academic Publishers, Dordrecht, 1999, pp. 77–104. http://www.ti3.tu-harburg.de/rump/