
Orthogonality of matrices and some distance problems



Linear Algebra and its Applications 287 (1999) 77-85


Rajendra Bhatia a,*, Peter Šemrl b,1

a Indian Statistical Institute, 7, S.J.S. Sansanwal Marg, New Delhi 110016, India

b Faculty of Mechanical Engineering, University of Maribor, Smetanova 17, 2000 Maribor, Slovenia

Received 18 February 1998; accepted 29 June 1998 Submitted by V. Mehrmann

Dedicated to Ludwig Elsner on the occasion of his 60th birthday


Abstract

If $A$ and $B$ are matrices such that $\|A + zB\| \ge \|A\|$ for all complex numbers $z$, then $A$ is said to be orthogonal to $B$. We find necessary and sufficient conditions for this to be the case. Some applications and generalisations are also discussed. © 1999 Elsevier Science Inc. All rights reserved.

Keywords: Birkhoff-James orthogonality; Derivative; Norms; Distance problems

Let $A$ and $B$ be two $n \times n$ matrices. The matrix $A$ will be identified with an operator acting on an $n$-dimensional Hilbert space $H$ in the usual way. The symbol $\|A\|$ stands for the norm of this operator. $A$ is said to be orthogonal to $B$ (in the Birkhoff-James sense [7]) if $\|A + zB\| \ge \|A\|$ for every complex number $z$. In Section 1 of this note we give a necessary and sufficient condition for $A$ to be orthogonal to $B$. The special case when $B = I$ can be applied to get some distance formulas for matrices as well as a simple proof of a well-known result of Stampfli on the norm of a derivation. In Section 2 we consider the analogous problem when the norm $\|\cdot\|$ is replaced by the Schatten $p$-norm. The special case $A = I$ of this problem has been studied by Kittaneh [8], and used to characterise matrices whose trace is zero. In Section 3 we make some remarks on how to extend some results from Section 1 to infinite-dimensional Hilbert spaces, and formulate a conjecture about orthogonality with respect to induced matrix norms.

* Corresponding author. E-mail: rbh@isid.ernet.in.

1 E-mail: peter.semrl@uni-mb.si.

0024-3795/99/$ see front matter © 1999 Elsevier Science Inc. All rights reserved.

PII: S0024-3795(98)10134-9

1. The operator norm

Theorem 1.1. A matrix $A$ is orthogonal to $B$ if and only if there exists a unit vector $x \in H$ such that $\|Ax\| = \|A\|$ and $\langle Ax, Bx\rangle = 0$.

Proof. If such a vector $x$ exists then

$$\|A + zB\|^2 \ge \|(A + zB)x\|^2 = \|Ax\|^2 + |z|^2\|Bx\|^2 \ge \|Ax\|^2 = \|A\|^2.$$

So, the sufficiency of the condition is obvious.

Before proving the converse in full generality we make a remark that serves three purposes. It gives a proof in a special case, indicates why the condition of the theorem is a natural one, and establishes a connection with the theorem in Section 2.

It is well known that the operator norm $\|\cdot\|$ is not Fréchet differentiable at all points. However, if $A$ is a point at which this norm is differentiable, then there exists a unit vector $x$, unique up to a scalar multiple, such that $\|Ax\| = \|A\|$, and such that for all $B$

$$\frac{d}{dt}\Big|_{t=0}\|A + tB\| = \operatorname{Re}\left\langle \frac{A}{\|A\|}x, Bx\right\rangle.$$

See Theorem 3.1 of [1]. Using this, one can easily see that the statement of the theorem is true for all matrices $A$ that are points of differentiability of the norm $\|\cdot\|$.

Now let $A$ be any matrix and suppose $A$ is orthogonal to $B$. Let $A = UP$ be a polar decomposition of $A$ with $U$ unitary and $P$ positive. Then we have

$$\|P + zU^*B\| \ge \|P\| = \|A\|$$

for all $z$. In other words, the distance of $P$ to the linear span of $U^*B$ is $\|P\|$.

Hence, by the Hahn-Banach theorem, there exists a linear functional $\phi$ on the space of matrices such that $\|\phi\| = 1$, $\phi(P) = \|P\|$, and $\phi(U^*B) = 0$. We can find a matrix $T$ such that $\phi(X) = \operatorname{tr}(XT)$ for all $X$. Since $\|\phi\| = 1$, the trace norm (the sum of singular values) of $T$ must be 1. So, $T$ has a polar decomposition

$$T = \Big(\sum_{j=1}^{n} s_j u_j u_j^*\Big) V,$$

where $s_j$ are the singular values of $T$ in decreasing order, $\sum_{j=1}^{n} s_j = 1$, the vectors $u_j$ form an orthonormal basis for $H$, and $V$ is unitary. We have


$$\|P\| = \operatorname{tr}(PT) = \sum_{j=1}^{n} s_j \operatorname{tr}\big[Pu_j(V^*u_j)^*\big] = \sum_{j=1}^{n} s_j \langle Pu_j, V^*u_j\rangle \le \sum_{j=1}^{n} s_j \|Pu_j\| \le \sum_{j=1}^{n} s_j \|P\| = \|P\|.$$

Hence, if $k$ is the rank of $T$ (i.e., $s_k \ne 0$, but $s_{k+1} = 0$), then $\|Pu_j\| = \|P\|$ for $j = 1, \ldots, k$; and hence $Pu_j = \|P\|u_j$. From the conditions for the Cauchy-Schwarz inequality to be an equality we conclude that $V^*u_j$ is a scalar multiple of $Pu_j$, $j = 1, \ldots, k$. Obviously, these scalars must be positive, and so, $V^*u_j = u_j$ for all $j = 1, \ldots, k$. It follows that $T$ is of the form

$$T = \sum_{j=1}^{k} s_j u_j u_j^*,$$

where the $u_j$ belong to the eigenspace $K$ of $P$ corresponding to its maximal eigenvalue $\|P\|$. Then $\phi(U^*B) = 0$ implies

$$\sum_{j=1}^{k} s_j \langle B^*Uu_j, u_j\rangle = 0.$$

If $Q$ is the orthoprojector on the linear span of the $u_j$, then this equality can be rewritten as

$$\sum_{j=1}^{k} s_j \langle QB^*UQu_j, u_j\rangle = 0.$$

Since the numerical range of any operator is a convex set, there exists a unit vector $x \in K$ such that

$$0 = \langle QB^*UQx, x\rangle = \langle B^*Ux, x\rangle = \langle Ux, Bx\rangle.$$

Hence

$$\langle Ax, Bx\rangle = \langle UPx, Bx\rangle = \|P\|\langle Ux, Bx\rangle = 0. \qquad \square$$

Notice that orthogonality is not a symmetric relation. The special cases when A or B is the identity are of particular interest [3,4,8,10].

Theorem 1.1 says that $I$ is orthogonal to $B$ if and only if $W(B)$, the numerical range of $B$, contains 0. For another proof of this see Remark 4 of [8].
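The lack of symmetry can be checked numerically on a small case. The sketch below is illustrative only: the matrix $A = \operatorname{diag}(1, 0)$, the helper names, and the grid of test values $z$ are assumptions made here, not taken from the paper. Since $W(A) = [0, 1]$ contains 0, $I$ should be orthogonal to $A$, whereas $\|A - \tfrac12 I\| = \tfrac12 < \|A\|$ shows that $A$ is not orthogonal to $I$. A grid test of this kind gives evidence, not a proof.

```python
import math

def op_norm_2x2(m):
    """Operator (spectral) norm of a 2x2 complex matrix ((a, b), (c, d))."""
    (a, b), (c, d) = [[complex(e) for e in row] for row in m]
    # Entries of the Hermitian matrix M*M.
    h11 = abs(a) ** 2 + abs(c) ** 2
    h22 = abs(b) ** 2 + abs(d) ** 2
    h12 = a.conjugate() * b + c.conjugate() * d
    # Largest eigenvalue of a 2x2 Hermitian matrix, in closed form.
    lam_max = (h11 + h22) / 2 + math.sqrt(((h11 - h22) / 2) ** 2 + abs(h12) ** 2)
    return math.sqrt(lam_max)

def add_scaled(a, z, b):
    """Return A + z*B for 2x2 matrices given as nested tuples."""
    return tuple(tuple(a[i][j] + z * b[i][j] for j in range(2)) for i in range(2))

def looks_orthogonal(a, b, half_width=2.0, steps=16, tol=1e-12):
    """Test ||A + zB|| >= ||A|| for z on a square grid centred at 0 (hypothetical helper)."""
    base = op_norm_2x2(a)
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            z = complex(i, j) * (half_width / steps)
            if op_norm_2x2(add_scaled(a, z, b)) < base - tol:
                return False
    return True

I2 = ((1, 0), (0, 1))
A = ((1, 0), (0, 0))

print(looks_orthogonal(I2, A))  # I against A: no grid point lowers the norm
print(looks_orthogonal(A, I2))  # A against I: z = -1/2 lowers the norm to 1/2
```

The closed-form eigenvalue keeps the sketch free of external libraries; for larger matrices a numerical linear algebra package would be the natural choice.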

The more complicated case when $B = I$ has been important in problems related to derivations and operator approximations. In this case the theorem (in infinite dimensions) was proved by Stampfli ([10], Theorem 2). A different proof attributed to Ando [3] can be found in [4] (p. 206). It is this proof that we have adopted for the general case.


Problems of approximating an operator by a simpler one have been of interest to operator theorists [4], numerical analysts [6], and statisticians [9]. The second special result gives a formula for the distance of an operator to the class of scalar operators. We have, by definition,

$$\operatorname{dist}(A, \mathbb{C}I) = \min_{z \in \mathbb{C}} \|A + zI\|. \tag{1.1}$$

If this minimum is attained at $A_0 = A + z_0 I$ then $A_0$ is orthogonal to the identity. Theorem 1.1 then says that

$$\operatorname{dist}(A, \mathbb{C}I) = \|A_0\| = \max\{|\langle A_0x, y\rangle| : \|x\| = \|y\| = 1 \text{ and } x \perp y\} = \max\{|\langle Ax, y\rangle| : \|x\| = \|y\| = 1 \text{ and } x \perp y\}. \tag{1.2}$$

This result is due to Ando [3]. We will use it to calculate the diameter of the unitary orbit of a matrix.

The unitary orbit of a matrix $A$ is the set of all matrices of the form $UAU^*$ where $U$ is unitary. The diameter of this set is

$$d_A = \max\{\|VAV^* - UAU^*\| : U, V \text{ unitary}\} = \max\{\|A - UAU^*\| : U \text{ unitary}\}. \tag{1.3}$$

Notice that this diameter is zero if and only if $A$ is a scalar matrix. The following theorem is, therefore, interesting.

Theorem 1.2. For every matrix $A$ we have

$$d_A = 2\operatorname{dist}(A, \mathbb{C}I). \tag{1.4}$$

Proof. For every unitary $U$ and scalar $z$ we have

$$\|A - UAU^*\| = \|(A - zI) - U(A - zI)U^*\| \le 2\|A - zI\|.$$

So

$$d_A \le 2\operatorname{dist}(A, \mathbb{C}I).$$

As before we choose $A_0 = A + z_0 I$ and an orthogonal pair of unit vectors $x$ and $y$ such that

$$\operatorname{dist}(A, \mathbb{C}I) = \|A_0\| = \langle A_0x, y\rangle.$$

By the condition for equality in the Cauchy-Schwarz inequality we must have $A_0x = \|A_0\|y$. We can find a unitary $U$ satisfying $Ux = x$ and $Uy = -y$. Then $UA_0U^*x = -\|A_0\|y$. We have

$$d_A = d_{A_0} \ge \|A_0x - UA_0U^*x\| = 2\|A_0\| = 2\operatorname{dist}(A, \mathbb{C}I). \qquad \square$$
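Theorem 1.2 can be sanity-checked on a $2 \times 2$ example. The sketch below uses the nilpotent matrix $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and the unitary $U = \operatorname{diag}(-1, 1)$; both are choices made here for illustration, as are the helper names and the grid used to approximate $\operatorname{dist}(A, \mathbb{C}I)$. For this $A$ one has $x = e_2$, $y = e_1$ in the notation of the proof, and $UAU^* = -A$.

```python
import math

def op_norm_2x2(m):
    """Operator norm of a 2x2 complex matrix via the top eigenvalue of M*M."""
    (a, b), (c, d) = [[complex(e) for e in row] for row in m]
    h11 = abs(a) ** 2 + abs(c) ** 2
    h22 = abs(b) ** 2 + abs(d) ** 2
    h12 = a.conjugate() * b + c.conjugate() * d
    lam_max = (h11 + h22) / 2 + math.sqrt(((h11 - h22) / 2) ** 2 + abs(h12) ** 2)
    return math.sqrt(lam_max)

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2))

def subtract(a, b):
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))

def shift(a, z):
    """Return A - z*I."""
    return ((a[0][0] - z, a[0][1]), (a[1][0], a[1][1] - z))

A = ((0, 1), (0, 0))
U = ((-1, 0), (0, 1))  # fixes e2 and flips e1

# Grid approximation of dist(A, CI) = min_z ||A - zI||; the grid contains z = 0.
grid = [complex(i, j) / 8 for i in range(-16, 17) for j in range(-16, 17)]
dist = min(op_norm_2x2(shift(A, z)) for z in grid)

# ||A - U A U*|| is a lower bound for the diameter d_A of the unitary orbit.
gap = op_norm_2x2(subtract(A, matmul(matmul(U, A), dagger(U))))

print(dist, gap)  # gap should equal 2 * dist for this example
```

Here the single unitary $U$ already attains $2\operatorname{dist}(A, \mathbb{C}I)$, which matches the construction in the proof; in general the grid search only approximates the distance from above.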

From (1.3) and (1.4) we have

$$\max\{\|AU - UA\| : U \text{ unitary}\} = 2\operatorname{dist}(A, \mathbb{C}I). \tag{1.5}$$

If $X$ is any operator with $\|X\| = 1$, then $X$ can be written as $X = \tfrac12(V + W)$ where $V$ and $W$ are unitary. (Use the singular value decomposition of $X$, and observe that every positive number between 0 and 1 can be expressed as $\tfrac12(e^{i\theta} + e^{-i\theta})$.) Hence we have

$$\max_{\|X\| = 1}\|AX - XA\| = 2\operatorname{dist}(A, \mathbb{C}I). \tag{1.6}$$
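The scalar observation behind the decomposition $X = \tfrac12(V + W)$ is that $t = \cos\theta = \tfrac12(e^{i\theta} + e^{-i\theta})$ with $\theta = \arccos t$. A minimal sketch for a diagonal matrix of singular values follows; the function names and the sample values are assumptions made here. For a general $X$ with singular value decomposition $X = U_1 \Sigma U_2$ one would conjugate the two diagonal unitaries by $U_1$ and $U_2$.

```python
import cmath
import math

def unit_circle_pair(t):
    """Return (v, w) on the unit circle with (v + w) / 2 = t, for 0 <= t <= 1."""
    theta = math.acos(t)
    return cmath.exp(1j * theta), cmath.exp(-1j * theta)

def split_diagonal_contraction(singular_values):
    """Split diag(s_1, ..., s_n), 0 <= s_j <= 1, into two diagonal unitaries."""
    pairs = [unit_circle_pair(s) for s in singular_values]
    v_diag = [p[0] for p in pairs]
    w_diag = [p[1] for p in pairs]
    return v_diag, w_diag

sigma = [1.0, 0.3, 0.0]
v_diag, w_diag = split_diagonal_contraction(sigma)

for s, v, w in zip(sigma, v_diag, w_diag):
    # Each entry has modulus one and the pair averages back to s.
    print(s, abs(v), abs(w), (v + w) / 2)
```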


Recall that the operator $\delta_A(X) = AX - XA$ on the space of matrices is called an inner derivation. The preceding remark shows that the norm of $\delta_A$ is $2\operatorname{dist}(A, \mathbb{C}I)$. This was proved (for operators in a Hilbert space) by Stampfli [10].

The proof we have given for matrices is simpler. In Section 3 we will show how to prove the result for infinite-dimensional Hilbert spaces.

A trivial upper bound for $d_A$ is $2\|A\|$. This bound can be attained. For example, any block diagonal matrix of the form

is unitarily similar to

[ oj0

A simple lower bound for $d_A$ is given in our next proposition.

Proposition 1.3. Let $A$ be any matrix with singular values $s_1(A) \ge \cdots \ge s_n(A)$. Then

$$d_A \ge s_1(A) - s_n(A). \tag{1.7}$$

Proof. Let $z$ be any complex number with polar form $z = re^{i\theta}$. Let $A = UP$ be a polar decomposition of $A$. Then

$$\|A - zI\| = \|P - zU^*\| \ge \inf\{\|P - zV\| : V \text{ unitary}\} = \inf\{\|P - rV\| : V \text{ unitary}\}.$$

By a theorem of Fan and Hoffman, the value of the last infimum is $\|P - rI\|$ (see [5], p. 276). So

$$\min_{z \in \mathbb{C}} \|A - zI\| \ge \min_{r \ge 0} \|P - rI\| = \min_{r \ge 0} \max_j |s_j - r| = \tfrac12\big(s_1(A) - s_n(A)\big).$$

The proposition now follows from Theorem 1.2. $\square$


If $A$ is a positive semidefinite Hermitian matrix then there is equality in (1.7).

2. The Schatten norms

For $1 \le p < \infty$, the Schatten $p$-norm of $A$ is defined as

$$\|A\|_p = \Big(\sum_{j=1}^{n} \big(s_j(A)\big)^p\Big)^{1/p},$$

where $s_1(A) \ge \cdots \ge s_n(A)$ are the singular values of $A$.

If $1 < p < \infty$, then the norm $\|\cdot\|_p$ is Fréchet differentiable at every $A$. In this case

$$\frac{d}{dt}\Big|_{t=0}\|A + tB\|_p^p = p \operatorname{Re} \operatorname{tr} |A|^{p-1}U^*B, \tag{2.1}$$

for every $B$, where $A = U|A|$ is a polar decomposition of $A$. Here $|A| = (A^*A)^{1/2}$. If $p = 1$ this is true if $A$ is invertible. See [2] (Theorem 2.1) and [1] (Theorems 2.2 and 2.3).

As before, we say that $A$ is orthogonal to $B$ in the Schatten $p$-norm (for a given $1 \le p < \infty$) if

$$\|A + zB\|_p \ge \|A\|_p \quad \text{for all } z. \tag{2.2}$$

The case $p = 2$ is special. The quantity $\langle A, B\rangle = \operatorname{tr} A^*B$ defines an inner product on the space of matrices, and the norm associated with this inner product is $\|\cdot\|_2$. The condition (2.2) for orthogonality is then equivalent to the usual Hilbert space condition $\langle A, B\rangle = 0$. Our next theorem includes this as a very special case.
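For $p = 2$ the equivalence is just the Pythagorean identity $\|A + zB\|_2^2 = \|A\|_2^2 + |z|^2\|B\|_2^2$ when $\operatorname{tr} A^*B = 0$. A minimal sketch, with the two matrices and the helper names chosen here purely for illustration:

```python
import math

def frob_inner(a, b):
    """Inner product <A, B> = tr(A* B) = sum of conj(a_ij) * b_ij."""
    return sum(complex(a[i][j]).conjugate() * b[i][j]
               for i in range(2) for j in range(2))

def frob_norm(a):
    """Schatten 2-norm (Frobenius norm) of a 2x2 matrix."""
    return math.sqrt(frob_inner(a, a).real)

def add_scaled(a, z, b):
    """Return A + z*B."""
    return tuple(tuple(a[i][j] + z * b[i][j] for j in range(2)) for i in range(2))

A = ((1, 1), (0, 0))
B = ((1, -1), (0, 0))  # tr(A* B) = 1 - 1 = 0

print(frob_inner(A, B))
for z in (0.5, -1.0, 2j, 1 - 3j):
    lhs = frob_norm(add_scaled(A, z, B)) ** 2
    rhs = frob_norm(A) ** 2 + abs(z) ** 2 * frob_norm(B) ** 2
    print(z, lhs, rhs)  # the two columns should agree up to rounding
```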

Theorem 2.1. Let $A$ have a polar decomposition $A = U|A|$. If for any $1 \le p < \infty$ we have

$$\operatorname{tr} |A|^{p-1}U^*B = 0, \tag{2.3}$$

then $A$ is orthogonal to $B$ in the Schatten $p$-norm. The converse is true for all $A$, if $1 < p < \infty$, and for all invertible $A$, if $p = 1$.

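Proof. If (2.3) is satisfied, then for all $z$

$$\operatorname{tr} |A|^p = \operatorname{tr} |A|^{p-1}\big(|A| + zU^*B\big).$$

Hence, by Hölder's inequality ([5], p. 88),

$$\operatorname{tr} |A|^p \le \big\||A|^{p-1}\big\|_q \big\||A| + zU^*B\big\|_p = \big\||A|^{p-1}\big\|_q \|A + zB\|_p = \big[\operatorname{tr} |A|^{(p-1)q}\big]^{1/q}\|A + zB\|_p = \big(\operatorname{tr} |A|^p\big)^{1/q}\|A + zB\|_p,$$

where $q$ is the index conjugate to $p$ (i.e., $1/p + 1/q = 1$). Since

$$\big(\operatorname{tr} |A|^p\big)^{1 - 1/q} = \big(\operatorname{tr} |A|^p\big)^{1/p} = \|A\|_p,$$

this shows that

$$\|A\|_p \le \|A + zB\|_p \quad \text{for all } z.$$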

Conversely, if (2.2) is true, then

$$\|e^{i\theta}A + tB\|_p \ge \|e^{i\theta}A\|_p$$

for all real $t$ and $\theta$. Using the expression (2.1) we see that this implies

$$\operatorname{Re} \operatorname{tr}\big(|A|^{p-1}e^{-i\theta}U^*B\big) = 0,$$

for all $A$ if $1 < p < \infty$, and for invertible $A$ if $p = 1$. Since this is true for all $\theta$, we get (2.3). $\square$
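Condition (2.3) is easy to test on diagonal matrices, where the Schatten $p$-norm is the $\ell^p$ norm of the diagonal and, for positive diagonal $A$, the polar factor is $U = I$. The sketch below takes $p = 3$ and $A = \operatorname{diag}(2, 1)$; these values, the two choices of $B$, and the helper names are assumptions made here. With $B = \operatorname{diag}(1, -4)$ the trace in (2.3) is $4 - 4 = 0$, while $B = \operatorname{diag}(1, -2)$ satisfies only the $p = 2$ condition $\operatorname{tr} A^*B = 0$.

```python
def schatten_diag(diag, p):
    """Schatten p-norm of a diagonal matrix: the l^p norm of its diagonal."""
    return sum(abs(d) ** p for d in diag) ** (1.0 / p)

def trace_condition(a_diag, b_diag, p):
    """tr |A|^{p-1} U* B for diagonal A with positive entries (so U = I)."""
    return sum(a ** (p - 1) * b for a, b in zip(a_diag, b_diag))

def min_over_grid(a_diag, b_diag, p, half_width=1.0, steps=40):
    """Smallest value of ||A + zB||_p over a square grid of complex z around 0."""
    best = float("inf")
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            z = complex(i, j) * (half_width / steps)
            shifted = [a + z * b for a, b in zip(a_diag, b_diag)]
            best = min(best, schatten_diag(shifted, p))
    return best

p = 3
A = (2.0, 1.0)
B_good = (1.0, -4.0)  # satisfies (2.3) for p = 3
B_bad = (1.0, -2.0)   # satisfies tr A*B = 0 but not (2.3) for p = 3

base = schatten_diag(A, p)
print(trace_condition(A, B_good, p), min_over_grid(A, B_good, p) - base)
print(trace_condition(A, B_bad, p), min_over_grid(A, B_bad, p) - base)
```

For `B_good` the grid minimum should not fall below $\|A\|_3$, while for `B_bad` a small negative real $z$ already reduces the norm. The grid search is evidence only; the theorem is what guarantees the inequality for every $z$.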

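The following example shows that the case $p = 1$ is exceptional. If

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},$$

then

$$\|A + zB\|_1 \ge \|A\|_1 \quad \text{for all } z.$$

However,

$$\operatorname{tr} |A|^{p-1}U^*B = \operatorname{tr} B \ne 0.$$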
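A quick numerical illustration of how the trace condition can fail for $p = 1$ when $A$ is singular. The sketch takes $A = \operatorname{diag}(1, 0)$ and $B = \operatorname{diag}(0, 1)$, reads $|A|^0$ as the identity and takes $U = I$; the helper name is hypothetical. It uses the identity $(s_1 + s_2)^2 = \|M\|_2^2 + 2|\det M|$, valid for $2 \times 2$ matrices, to compute the trace norm without a singular value routine.

```python
import math

def trace_norm_2x2(m):
    """Trace norm s_1 + s_2 of a 2x2 matrix, via (s_1 + s_2)^2 = ||M||_F^2 + 2|det M|."""
    (a, b), (c, d) = m
    frob_sq = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det = a * d - b * c
    return math.sqrt(frob_sq + 2 * abs(det))

A = ((1, 0), (0, 0))
B = ((0, 0), (0, 1))

trace_B = B[0][0] + B[1][1]  # nonzero, so the trace condition fails
print(trace_B)

for z in (0, 0.5, -2, 1j, 3 - 4j):
    m = tuple(tuple(A[i][j] + z * B[i][j] for j in range(2)) for i in range(2))
    # ||A + zB||_1 should equal 1 + |z|, which is never below ||A||_1 = 1.
    print(z, trace_norm_2x2(m), 1 + abs(z))
```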

The ideas used in our proof of Theorem 2.1 are adopted from Kittaneh [8], who restricted himself to the special case $A = I$.

3. Remarks

Remark 3.1. Theorem 1.1 can be extended to the infinite-dimensional case with a small modification. Let $A$ and $B$ be bounded operators on an infinite-dimensional Hilbert space $H$. Then $A$ is orthogonal to $B$ if and only if there exists a sequence $\{x_n\}$ of unit vectors such that

$$\|Ax_n\| \to \|A\|, \qquad \langle Ax_n, Bx_n\rangle \to 0.$$

Indeed, if such a sequence $\{x_n\}$ exists then

$$\|A + zB\|^2 \ge \|(A + zB)x_n\|^2 = \|Ax_n\|^2 + |z|^2\|Bx_n\|^2 + 2\operatorname{Re}\big(\bar{z}\langle Ax_n, Bx_n\rangle\big).$$

So

$$\|A + zB\|^2 \ge \limsup_{n \to \infty} \|(A + zB)x_n\|^2 \ge \|A\|^2.$$

To prove the converse we first note that Theorem 1.1 can be reformulated in the following way: if $A$ and $B$ are operators acting on a finite-dimensional Hilbert space $H$ then

$$\min_{z \in \mathbb{C}} \|A + zB\| = \max\{|\langle Ax, y\rangle| : \|x\| = \|y\| = 1 \text{ and } y \perp Bx\}.$$

It follows that for operators $A$ and $B$ acting on an infinite-dimensional Hilbert space $H$ we have

$$\min_{z \in \mathbb{C}} \|A + zB\| = \sup\{|\langle Ax, y\rangle| : \|x\| = \|y\| = 1 \text{ and } y \perp Bx\}.$$

This implication was proved in the special case when $B = I$ in [4] (p. 207). A slight modification of the proof yields the general case. Assume now that $A$ is orthogonal to $B$. Then $\min_z \|A + zB\| = \|A\|$. Therefore we can find sequences of unit vectors $\{x_n\}, \{y_n\} \subset H$ such that $\langle Ax_n, y_n\rangle \to \|A\|$ and $y_n \perp Bx_n$. It follows that $\|Ax_n\| \to \|A\|$, and consequently

$$y_n - \frac{Ax_n}{\|Ax_n\|} \to 0$$

and

$$\lim_{n \to \infty} \langle Ax_n, Bx_n\rangle = \lim_{n \to \infty} \|Ax_n\|\langle y_n, Bx_n\rangle = 0.$$

This completes the proof.

Remark 3.2. The statement following (1.6) about norms of derivations can also be proved for infinite-dimensional Hilbert spaces by a limiting argument.

Let $H$ be an infinite-dimensional separable Hilbert space, and let $A$ be a bounded operator on $H$. Let $\{P_n\}$ be a sequence of finite rank projections increasing to the identity. Denote by $A_n$ the finite rank operator $P_nA$ restricted to the range of $P_n$. Let $\min_{z \in \mathbb{C}} \|A_n - zI\| = \|A_n - z_nI\|$. For each $n$ we have

$$\sup_{\|X\| \le 1} \|AX - XA\| \ge \sup_{\|X\| \le 1} \|AP_nXP_n - P_nXP_nA\| \ge \sup_{\|X\| \le 1} \|P_n(AP_nXP_n - P_nXP_nA)P_n\| = \sup_{\|X\| \le 1} \|(P_nAP_n)(P_nXP_n) - (P_nXP_n)(P_nAP_n)\| = 2\|A_n - z_nI\|.$$

Passing to a subsequence, if necessary, assume that $z_n \to z_0$. Then

$$\lim_{n \to \infty} \|A_n - z_nI\| = \|A - z_0I\| \ge \operatorname{dist}(A, \mathbb{C}I).$$

Hence,

$$\sup_{\|X\| \le 1} \|AX - XA\| \ge 2\operatorname{dist}(A, \mathbb{C}I).$$

Thus the norm of the derivation $\delta_A$ is equal to $2\operatorname{dist}(A, \mathbb{C}I)$.

Remark 3.3. In view of Theorem 1.1 we are tempted to make the following conjecture. Let $\|\cdot\|$ now represent any norm on the vector space $\mathbb{C}^n$, and also the norm it induces on the space of $n \times n$ matrices acting as linear operators on $\mathbb{C}^n$. We conjecture that

$$\|A + zB\| \ge \|A\| \quad \text{for all } z$$

if and only if there exists a unit vector $x$ such that

$$\|Ax\| = \|A\| \quad \text{and} \quad \|Ax + zBx\| \ge \|Ax\| \quad \text{for all } z.$$


Acknowledgements

This work was begun during the first author's visit to Slovenia in September 1997. Both authors are thankful to the Slovene Ministry of Science and Technology for its support.


References

[1] T.J. Abatzoglou, Norm derivatives on spaces of operators, Math. Ann. 239 (1979) 129-135.

[2] J.G. Aiken, J.A. Erdos, J.A. Goldstein, Unitary approximation of positive operators, Illinois J. Math. 24 (1980) 61-72.

[3] T. Ando, Distance to the set of thin operators, unpublished report, 1972.

[4] C. Apostol, L.A. Fialkow, D.A. Herrero, D. Voiculescu, Approximation of Hilbert Space Operators II, Pitman, Boston, 1984.

[5] R. Bhatia, Matrix Analysis, Springer, New York, 1997.

[6] N.J. Higham, Matrix nearness problems and applications, in: Applications of Matrix Theory, Oxford University Press, Oxford, 1989.

[7] R.C. James, Orthogonality and linear functionals in normed linear spaces, Trans. Amer. Math. Soc. 61 (1947) 265-292.

[8] F. Kittaneh, On zero-trace matrices, Linear Algebra Appl. 151 (1991) 119-124.

[9] C.R. Rao, Matrix approximations and reduction of dimensionality in multivariate statistical analysis, in: Multivariate Analysis - V, North-Holland, Amsterdam, 1980.

[10] J.G. Stampfli, The norm of a derivation, Pacific J. Math. 33 (1970) 737-747.

