
Sankhyā: The Indian Journal of Statistics, 1992, Volume 54, Series A, Pt. 1, pp. 123-127.

MAXIMUM LIKELIHOOD CHARACTERIZATION OF THE VON MISES-FISHER MATRIX DISTRIBUTION

By SUMITRA PURKAYASTHA
Indian Statistical Institute

and

RAHUL MUKERJEE*

Indian Institute of Management

SUMMARY. A characterization of the von Mises-Fisher matrix distribution, extending a result of Bingham and Mardia (1975) for distributions on the sphere to distributions on the Stiefel manifold, is obtained.

1. Introduction and main result

Bingham and Mardia (1975), hereafter abbreviated to BM, proved that under mild conditions a rotationally symmetric family of distributions on the sphere must be the von Mises-Fisher family if the mean direction is a maximum likelihood estimator (MLE) of the location parameter. In view of Downs' (1972) extension of the von Mises-Fisher distribution to a Stiefel manifold (for further references, see Jupp and Mardia (1979)), an attempt is made here to extend the result in BM in the direction of Downs' work.

Let $S_{np}$ be the class of $n \times p$ ($n \le p$) matrices $M$ satisfying $MM' = I_n$. For $X_1, \ldots, X_N \in S_{np}$ with $\bar{X} = \sum_{i=1}^{N} X_i$ having full row rank, define the polar component of $\bar{X}$ as the matrix $(\bar{X}\bar{X}')^{-1/2}\bar{X}$ (cf. Downs, 1972). Then the following result, proved in the next section, holds.

Theorem. Let $\mathcal{F} = \{p(X; A) = f[\operatorname{tr}(AX')] \mid A \in S_{np}\}$ be a class of non-uniform densities on $S_{np}$. Assume that $f$ is lower semi-continuous at the point $n$. Furthermore, suppose that for every positive integral $N$ and for all random samples $X_1, \ldots, X_N$, with $\bar{X} = \sum_{i=1}^{N} X_i$ of full row rank, the polar component of $\bar{X}$ is a MLE of $A$. Then

$$p(X; A) = K \exp\{\lambda \operatorname{tr}(AX')\}, \quad X \in S_{np}, \tag{1.1}$$

for some constants $\lambda$ and $K$, both positive.

Paper received. June 1989; revised May 1990.

AMS (1980) subject classification. 62E10, 62H05.

Key words and phrases. Maximum likelihood characterization, orientation statistics, von Mises-Fisher matrix distribution.

*On leave from Indian Statistical Institute, Calcutta, India.


Remark 1. The class $\mathcal{F}$ considered above has the following property: $p(X; A) = p(XB; A)$ for every $p \times p$ orthogonal matrix $B$ with $\det(B) = 1$ that satisfies $AB = A$. Because of this geometric consideration the matrix $A$ can be thought of as a location parameter for the class $\mathcal{F}$. Thus $\mathcal{F}$ is a natural extension of the class considered in BM.

Remark 2. The converse of the theorem is also true, i.e., if $X$ has the density (1.1), then for i.i.d. observations $X_1, \ldots, X_N$ from $p(X; A)$ the polar component of $\bar{X} = \sum_{i=1}^{N} X_i$ is the MLE of $A$ (cf. Downs (1972)).
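The converse in Remark 2 can be illustrated numerically. The sketch below assumes the density has the exponential form $K \exp\{\lambda \operatorname{tr}(AX')\}$ of (1.1) with $\lambda > 0$, so that maximising the likelihood over $A$ amounts to maximising $\operatorname{tr}(A\bar{X}')$. The helper names `polar_component` and `random_stiefel`, the dimensions, and the use of arbitrary Stiefel matrices as the sample are illustrative assumptions, not part of the paper.

```python
import numpy as np

def polar_component(xbar):
    """Return (xbar xbar')^{-1/2} xbar for an n x p matrix of full row rank."""
    # Symmetric eigendecomposition of the n x n Gram matrix xbar xbar'.
    w, v = np.linalg.eigh(xbar @ xbar.T)
    inv_sqrt = v @ np.diag(w ** -0.5) @ v.T
    return inv_sqrt @ xbar

def random_stiefel(n, p, rng):
    """Draw an n x p matrix M with M M' = I_n (hypothetical helper, via QR)."""
    q, _ = np.linalg.qr(rng.standard_normal((p, n)))  # q: p x n, orthonormal columns
    return q.T

rng = np.random.default_rng(0)
n, p, N = 2, 4, 5
sample = [random_stiefel(n, p, rng) for _ in range(N)]
xbar = sum(sample)
a_hat = polar_component(xbar)

# For lambda > 0 the log-likelihood is N log K + lambda * tr(A xbar'),
# so it is enough to compare tr(A xbar') across candidate matrices A.
best = np.trace(a_hat @ xbar.T)
others = [np.trace(random_stiefel(n, p, rng) @ xbar.T) for _ in range(1000)]
```

In this sketch `a_hat` lies in $S_{np}$ and `best` is never exceeded by any entry of `others`, because $\operatorname{tr}(A\bar{X}')$ is bounded above by the sum of the singular values of $\bar{X}$, which the polar component attains.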

2. Proof of the theorem

For $n = 1$, our theorem follows from Theorem 2 in BM. Throughout this section, we therefore consider the case $n \ge 2$, and it appears that this generalization is non-trivial especially for odd $n$. Observe that the condition regarding the MLE of $A$ is equivalent to the following: for every positive integral $N$ and every choice of matrices $X_1, \ldots, X_N, A \in S_{np}$ with $\bar{X} = \sum_{i=1}^{N} X_i$ of full row rank, the relation

$$\prod_{i=1}^{N} f[\operatorname{tr}(\hat{A}X_i')] \ge \prod_{i=1}^{N} f[\operatorname{tr}(AX_i')] \tag{2.1}$$

holds, where $\hat{A} = (\bar{X}\bar{X}')^{-1/2}\bar{X}$. The following lemmas will be helpful.

Lemma 1. For every positive integral $N$ and every choice of matrices $C_1, \ldots, C_N, U \in S_{nn}$ with $C = \sum_{i=1}^{N} C_i$ positive definite, the relation

$$\prod_{i=1}^{N} f[\operatorname{tr}(C_i)] \ge \prod_{i=1}^{N} f[\operatorname{tr}(UC_i)] \tag{2.2}$$

holds.

Proof. Let $L = (I_n, 0) \in S_{np}$. Then the lemma follows from (2.1) taking $X_i = C_i'L$, $1 \le i \le N$, and $A = (U, 0) \in S_{np}$.
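The substitution in this proof can be unpacked as follows. This is a sketch of the omitted algebra, writing $\bar{X}$ for the sum of the $X_i$ and $\hat{A}$ for its polar component, and using that $C = \sum_i C_i$ is symmetric positive definite and that $LL' = I_n$.

```latex
\begin{align*}
\bar{X} &= \sum_{i=1}^{N} C_i' L = C'L = CL, \\
\bar{X}\bar{X}' &= C L L' C = C^2, \qquad (\bar{X}\bar{X}')^{-1/2} = C^{-1}, \\
\hat{A} &= C^{-1} C L = L, \\
\operatorname{tr}(\hat{A} X_i') &= \operatorname{tr}(L L' C_i) = \operatorname{tr}(C_i), \\
\operatorname{tr}(A X_i') &= \operatorname{tr}\bigl((U, 0)\, L' C_i\bigr) = \operatorname{tr}(U C_i).
\end{align*}
```

Substituting the last two lines into (2.1) gives (2.2); $\bar{X} = CL$ has full row rank because $C$ is non-singular.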

Lemma 2. For each $x \in [-n, n]$, $f(n) \ge f(x)$.

Proof. Follows taking $N = 1$, $C_1 = I_n$ in (2.2) and observing that for each $u \in [-n, n]$, there exists $U \in S_{nn}$ satisfying $\operatorname{tr}(U) = u$.

Lemma 3. For each $x \in [-n, n]$, $f(x) < \infty$.

Proof. In consideration of Lemma 2, it is enough to show that

$$f(n) < \infty. \tag{2.3}$$


Taking $N = 2$, $U = C_1'$ in (2.2), we get

$$f[\operatorname{tr}(C_1)]\, f[\operatorname{tr}(C_2)] \ge f(n)\, f[\operatorname{tr}(C_1'C_2)],$$

for every $C_1, C_2 \in S_{nn}$ such that $C_1 + C_2$ is positive definite. Hence if (2.3) does not hold then $f(n) = \infty$, and for every $C_1, C_2 \in S_{nn}$ such that $C_1 + C_2$ is positive definite, one must have either (a) $f[\operatorname{tr}(C_1'C_2)] = 0$, or (b) $f[\operatorname{tr}(C_1)]\, f[\operatorname{tr}(C_2)] = \infty$.

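For real $\alpha$, $u$ and positive integral $m$, define the matrices

$$Q_\alpha = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}, \quad Q_{m\alpha} = I_m \otimes Q_\alpha, \quad Q^*_{m\alpha}(u) = \begin{pmatrix} Q_{m\alpha} & 0 \\ 0' & u \end{pmatrix}.$$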
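A small numerical sketch may help fix these matrices. It assumes $Q_\alpha$ is the $2 \times 2$ rotation block, $Q_{m\alpha} = I_m \otimes Q_\alpha$, and $Q^*_{m\alpha}(u)$ is $Q_{m\alpha}$ bordered by the scalar $u$; the function names `q2`, `q_m`, `q_star` and the values of `m` and `alpha` are illustrative choices only.

```python
import numpy as np

def q2(alpha):
    """2 x 2 rotation block Q_alpha."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, s], [-s, c]])

def q_m(m, alpha):
    """Q_{m alpha} = I_m (Kronecker product) Q_alpha, a 2m x 2m block-diagonal matrix."""
    return np.kron(np.eye(m), q2(alpha))

def q_star(m, alpha, u):
    """Q*_{m alpha}(u): Q_{m alpha} bordered by the scalar u, size (2m+1) x (2m+1)."""
    out = np.zeros((2 * m + 1, 2 * m + 1))
    out[:2 * m, :2 * m] = q_m(m, alpha)
    out[-1, -1] = u
    return out

m, alpha = 2, 0.7
c1, c2 = q_star(m, alpha, 1.0), q_star(m, -alpha, 1.0)

is_orthogonal = np.allclose(c1 @ c1.T, np.eye(2 * m + 1))
min_eig_of_sum = np.linalg.eigvalsh(c1 + c2).min()  # c1 + c2 = diag(2 cos(alpha) I, 2)
trace_c1 = np.trace(c1)                             # 1 + 2m cos(alpha)
trace_c1t_c2 = np.trace(c1.T @ c2)                  # 1 + 2m cos(2 alpha)
```

With $u = \pm 1$ the matrix $Q^*_{m\alpha}(u)$ is orthogonal with trace $2m\cos\alpha + u$, which is how the arguments of $f$ of the form $1 + 2m\cos(\cdot)$ arise in the odd case.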

Consider first the case of odd $n$. If $n = 2m+1$ ($m \ge 1$) and (2.3) does not hold, then taking $C_1 = Q^*_{m\alpha}(1)$, $C_2 = Q^*_{m(-\alpha)}(1)$, $-\pi/2 < \alpha < \pi/2$ (note that then $C_1, C_2 \in S_{nn}$ and $C_1 + C_2$ is positive definite), it follows from the discussion in the last paragraph that for each $\alpha \in (-\pi/2, \pi/2)$, either (a) $f(1 + 2m\cos 2\alpha) = 0$, or (b) $f(1 + 2m\cos\alpha) = \infty$. The condition (b) cannot hold over a set of positive Lebesgue measure. Hence (a) must hold almost everywhere (a.e.) over $\alpha \in (-\pi/2, \pi/2)$, i.e., $f(x) = 0$ a.e. over $x \in (-(2m-1), (2m+1))$, and a contradiction is reached in consideration of lower semi-continuity of $f$ at the point $n$ ($= 2m+1$) (cf. (2.4) below). Similarly, for even $n$ ($= 2m$, $m \ge 1$), if (2.3) does not hold, then taking $C_1 = Q_{m\alpha}$, $C_2 = Q_{m(-\alpha)}$, $-\pi/2 < \alpha < \pi/2$, it follows as before that for each $\alpha \in (-\pi/2, \pi/2)$, either (a) $f(n\cos 2\alpha) = 0$, or (b) $f(n\cos\alpha) = \infty$, and a contradiction is reached again by the lower semi-continuity of $f$ at $n$.

Lemma 4. For each $x \in [-n, n]$, $f(x) > 0$.

Proof. First note that

$$f(n) > 0, \tag{2.4}$$

for otherwise by Lemma 2, $f(x) = 0$ for each $x \in [-n, n]$, which is impossible as $f$ is a density. Also, observe that for any given $\theta \in [0, \pi]$, there exists $\eta$ satisfying (cf. BM)

$$\text{(i)}\ -\tfrac{1}{2}\theta \le \eta \le 0, \quad \text{(ii)}\ \cos\theta + 2\cos\eta > 0, \quad \text{(iii)}\ \sin\theta + 2\sin\eta = 0. \tag{2.5}$$

Consider first the case of odd $n$. For $n = 2m+1$ ($m \ge 1$), define

$$\mathcal{E} = \{\theta : \theta \in [0, \pi],\ f(1 + 2m\cos\theta) = 0\}.$$

If $\mathcal{E}$ is non-empty, then for each $\theta \in \mathcal{E}$, one can choose $\eta$ satisfying (2.5) and then employ (2.2) with $N = 3$, $C_1 = Q^*_{m\theta}(1)$, $C_2 = C_3 = Q^*_{m\eta}(1)$, $U = Q^*_{m\alpha}(1)$, where $\alpha = -(\theta + \eta)/2$, to obtain $f[1 + 2m\cos(\tfrac{1}{2}(\theta - \eta))] = 0$; but as in Lemma 2 in BM, because of (2.4) and lower semi-continuity of $f$ at $n$, this leads to a contradiction. Hence $\mathcal{E}$ is empty and

$$f(x) > 0 \text{ for all } x \in [-(2m-1), (2m+1)]. \tag{2.6}$$

We shall now show that $f(x) > 0$ also for $x \in [-(2m+1), -(2m-1))$. If possible, let there exist $x_0 \in [-(2m+1), -(2m-1))$ such that $f(x_0) = 0$. Let $\theta$ ($\in [0, \pi]$) be such that $\cos\theta = (x_0 + 1)/(2m)$, and corresponding to this $\theta$, find $\eta$ satisfying (2.5). Taking $N = 3$, $C_1 = Q^*_{m\theta}(-1)$, $C_2 = C_3 = Q^*_{m\eta}(1)$, $U = Q^*_{m(-\theta)}(1)$ in (2.2), and using Lemma 3, one then gets $f(2m-1)\,\{f[1 + 2m\cos(\eta - \theta)]\}^2 = 0$, which is impossible by (2.6). This proves the lemma for odd $n$. The proof for even $n$ is similar.
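The existence of $\eta$ in (2.5) can be checked numerically. The sketch below uses one explicit choice, $\eta = -\arcsin(\tfrac{1}{2}\sin\theta)$, obtained by solving (iii); this choice, the reading of condition (i) as $-\tfrac{1}{2}\theta \le \eta \le 0$, and the function names are assumptions made for illustration rather than statements taken from the paper or from BM.

```python
import math

def eta_for(theta):
    """One explicit eta for theta in [0, pi] (an assumed choice):
    solve (iii) sin(theta) + 2 sin(eta) = 0 on the branch nearest zero."""
    return -math.asin(math.sin(theta) / 2.0)

def check_conditions(theta, tol=1e-12):
    """Check (i)-(iii) of (2.5), with (i) read as -theta/2 <= eta <= 0."""
    eta = eta_for(theta)
    cond_i = -theta / 2.0 - tol <= eta <= tol
    cond_ii = math.cos(theta) + 2.0 * math.cos(eta) > 0
    cond_iii = abs(math.sin(theta) + 2.0 * math.sin(eta)) < tol
    return cond_i and cond_ii and cond_iii

# Evaluate the three conditions on a grid covering [0, pi].
grid = [math.pi * k / 1000 for k in range(1001)]
all_ok = all(check_conditions(t) for t in grid)
```

Conditions (ii) and (iii) are what make $Q_\theta + 2Q_\eta$ a positive multiple of the identity, so that $C_1 + C_2 + C_3$ in the proof is positive definite and (2.2) applies.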

Lemma 5. For every positive integral $N'$ and every choice of matrices $C_1, \ldots, C_{N'}, U \in S_{nn}$ with $\sum_{i=1}^{N'} C_i$ non-negative definite, the relation

$$\prod_{i=1}^{N'} f[\operatorname{tr}(C_i)] \ge \prod_{i=1}^{N'} f[\operatorname{tr}(UC_i)]$$

holds.

Proof. In view of Lemma 1, it is enough to consider the case when $C = \sum_{i=1}^{N'} C_i$ is positive semidefinite. Obviously, then $I + \nu C$ is positive definite for every positive integral $\nu$. In Lemma 1, now take $N = 1 + \nu N'$, and choose the $C_i$'s such that one of them equals $I$ and the rest are given by $\nu$ copies of each of $C_1, \ldots, C_{N'}$. The rest of the proof follows using arguments similar to those in Lemma 3 in BM.

We now proceed to the final step of our proof. For $n = 2m+1$ ($m \ge 1$), in Lemma 5 taking $N' = N$, $C_i = Q^*_{m\theta_i}(1)$ ($1 \le i \le N$), $U = Q^*_{m(-\alpha)}(1)$, where

$$\sum_{i=1}^{N} \cos\theta_i \ge 0, \quad \sum_{i=1}^{N} \sin\theta_i = 0, \tag{2.7}$$

it follows that for every positive integral $N$ and for every $\alpha$,

$$\prod_{i=1}^{N} f(1 + 2m\cos\theta_i) \ge \prod_{i=1}^{N} f(1 + 2m\cos(\theta_i - \alpha)),$$

whenever the $\theta_i$'s satisfy (2.7). Writing $h(\theta) = \log f(1 + 2m\cos\theta)$, which is well-defined by Lemmas 3 and 4, it follows that for each positive integral $N$ and each $\alpha$,

$$\sum_{i=1}^{N} h(\theta_i) \ge \sum_{i=1}^{N} h(\theta_i - \alpha), \tag{2.8}$$


whenever the $\theta_i$'s satisfy (2.7). The relation (2.8) is equivalent to the relation (4) in BM and hence as in BM, $h(\theta) = a\cos\theta + b$, for every $\theta$, where $a$ ($\ge 0$) and $b$ are some constants. By the definition of $h(\theta)$, one obtains

$$f(x) = K\exp(\lambda x), \quad \text{for } x \in [-(2m-1), (2m+1)], \tag{2.9}$$

where $K$ ($> 0$) and $\lambda$ ($\ge 0$) are constants. By Lemma 5, for every $C, U \in S_{nn}$, $f[\operatorname{tr}(C)]\, f[-\operatorname{tr}(C)] \ge f[\operatorname{tr}(UC)]\, f[-\operatorname{tr}(UC)]$, so that $f(x) f(-x)$ remains constant over $x \in [-n, n]$. This, together with (2.9), implies that $f(x) = K\exp(\lambda x)$, for each $x \in [-n, n]$, where $K$, $\lambda$ are constants, both positive, the positiveness of $\lambda$ being a consequence of the stipulated non-uniformity of $f$. This proves the theorem for odd $n$. The proof for even $n$ is similar.
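The extension of (2.9) from $[-(2m-1), (2m+1)]$ to all of $[-n, n]$ can be written out as follows. This is a sketch of the step, writing $\lambda$ for the exponent constant in (2.9) and $c$ for the constant value of $f(x) f(-x)$; the symbol $c$ is introduced here only for illustration.

```latex
\begin{align*}
c &= f(0)^2 = K^2
  && \text{(taking } x = 0 \text{ in (2.9))}, \\
f(x) &= \frac{c}{f(-x)} = \frac{K^2}{K\exp(-\lambda x)} = K\exp(\lambda x),
  && x \in [-(2m+1),\, -(2m-1)).
\end{align*}
```

The second line uses that $-x \in (2m-1,\, 2m+1]$ lies in the range covered by (2.9), where $f(-x) > 0$.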

Acknowledgement. The authors are thankful to a referee for very constructive suggestions.

References

Bingham, M. S. and Mardia, K. V. (1975). Maximum likelihood characterization of the von Mises distribution. In: Statistical Distributions in Scientific Work, vol. 3 (G. P. Patil et al. eds.), Reidel, Dordrecht-Holland, 387-398.

Downs, T. D. (1972). Orientation statistics. Biometrika, 59, 665-676.

Jupp, P. E. and Mardia, K. V. (1979). Maximum likelihood estimators for the matrix von Mises-Fisher and Bingham distributions. Ann. Statist., 7, 599-606.

Stat-Math Division
Indian Statistical Institute
203, B. T. Road
Calcutta 700 035
India.

Indian Institute of Management, Calcutta
Joka, Diamond Harbour Road
Post Box No 16757
Calcutta 700027
India.
