Schur–Horn theorem

Characterizes the diagonal of a Hermitian matrix with given eigenvalues

In mathematics, particularly linear algebra, the Schur–Horn theorem, named after Issai Schur and Alfred Horn, characterizes the diagonal of a Hermitian matrix with given eigenvalues. It has inspired investigations and substantial generalizations in the setting of symplectic geometry. A few important generalizations are Kostant's convexity theorem, Atiyah–Guillemin–Sternberg convexity theorem, Kirwan convexity theorem.

Statement

Schur–Horn theorem — Theorem. Let d 1 , , d N {\displaystyle d_{1},\dots ,d_{N}} and λ 1 , , λ N {\displaystyle \lambda _{1},\dots ,\lambda _{N}} be two sequences of real numbers arranged in a non-increasing order. There is a Hermitian matrix with diagonal values d 1 , , d N {\displaystyle d_{1},\dots ,d_{N}} (in this order, starting with d 1 {\displaystyle d_{1}} at the top-left) and eigenvalues λ 1 , , λ N {\displaystyle \lambda _{1},\dots ,\lambda _{N}} if and only if

i = 1 n d i i = 1 n λ i n = 1 , , N 1 {\displaystyle \sum _{i=1}^{n}d_{i}\leq \sum _{i=1}^{n}\lambda _{i}\qquad n=1,\dots ,N-1}
and
i = 1 N d i = i = 1 N λ i . {\displaystyle \sum _{i=1}^{N}d_{i}=\sum _{i=1}^{N}\lambda _{i}.}

The inequalities above may alternatively be written:

d 1 λ 1 d 2 + d 1 λ 1 + λ 2 d N 1 + + d 2 + d 1 λ 1 + λ 2 + + λ N 1 d N + d N 1 + + d 2 + d 1 = λ 1 + λ 2 + + λ N 1 + λ N . {\displaystyle {\begin{alignedat}{7}d_{1}&\;\leq \;&&\lambda _{1}\\[0.3ex]d_{2}+d_{1}&\;\leq &&\lambda _{1}+\lambda _{2}\\[0.3ex]\vdots &\;\leq &&\vdots \\[0.3ex]d_{N-1}+\cdots +d_{2}+d_{1}&\;\leq &&\lambda _{1}+\lambda _{2}+\cdots +\lambda _{N-1}\\[0.3ex]d_{N}+d_{N-1}+\cdots +d_{2}+d_{1}&\;=&&\lambda _{1}+\lambda _{2}+\cdots +\lambda _{N-1}+\lambda _{N}.\\[0.3ex]\end{alignedat}}}

The Schur–Horn theorem may thus be restated more succinctly and in plain English:

Schur–Horn theorem: Given any non-increasing real sequences of desired diagonal elements d 1 d N {\displaystyle d_{1}\geq \cdots \geq d_{N}} and desired eigenvalues λ 1 λ N , {\displaystyle \lambda _{1}\geq \cdots \geq \lambda _{N},} there exists a Hermitian matrix with these eigenvalues and diagonal elements if and only if these two sequences have the same sum and for every possible integer n , {\displaystyle n,} the sum of the first n {\displaystyle n} desired diagonal elements never exceeds the sum of the first n {\displaystyle n} desired eigenvalues.

Reformation allowing unordered diagonals and eigenvalues

Although this theorem requires that d 1 d N {\displaystyle d_{1}\geq \cdots \geq d_{N}} and λ 1 λ N {\displaystyle \lambda _{1}\geq \cdots \geq \lambda _{N}} be non-increasing, it is possible to reformulate this theorem without these assumptions.

We start with the assumption λ 1 λ N . {\displaystyle \lambda _{1}\geq \cdots \geq \lambda _{N}.} The left hand side of the theorem's characterization (that is, "there exists a Hermitian matrix with these eigenvalues and diagonal elements") depends on the order of the desired diagonal elements d 1 , , d N {\displaystyle d_{1},\dots ,d_{N}} (because changing their order would change the Hermitian matrix whose existence is in question) but it does not depend on the order of the desired eigenvalues λ 1 , , λ N . {\displaystyle \lambda _{1},\dots ,\lambda _{N}.}

On the right hand right hand side of the characterization, only the values of λ 1 + + λ n {\displaystyle \lambda _{1}+\cdots +\lambda _{n}} depend on the assumption λ 1 λ N . {\displaystyle \lambda _{1}\geq \cdots \geq \lambda _{N}.} Notice that this assumption means that the expression λ 1 + + λ n {\displaystyle \lambda _{1}+\cdots +\lambda _{n}} is just notation for the sum of the n {\displaystyle n} largest desired eigenvalues. Replacing the expression λ 1 + + λ n {\displaystyle \lambda _{1}+\cdots +\lambda _{n}} with this written equivalent makes the assumption λ 1 λ N {\displaystyle \lambda _{1}\geq \cdots \geq \lambda _{N}} completely unnecessary:

Schur–Horn theorem: Given any N {\displaystyle N} desired real eigenvalues and a non-increasing real sequence of desired diagonal elements d 1 d N , {\displaystyle d_{1}\geq \cdots \geq d_{N},} there exists a Hermitian matrix with these eigenvalues and diagonal elements if and only if these two sequences have the same sum and for every possible integer n , {\displaystyle n,} the sum of the first n {\displaystyle n} desired diagonal elements never exceeds the sum of the n {\displaystyle n} largest desired eigenvalues.

Permutation polytope generated by a vector

The permutation polytope generated by x ~ = ( x 1 , x 2 , , x n ) R n {\displaystyle {\tilde {x}}=(x_{1},x_{2},\ldots ,x_{n})\in \mathbb {R} ^{n}} denoted by K x ~ {\displaystyle {\mathcal {K}}_{\tilde {x}}} is defined as the convex hull of the set { ( x π ( 1 ) , x π ( 2 ) , , x π ( n ) ) R n : π S n } . {\displaystyle \{(x_{\pi (1)},x_{\pi (2)},\ldots ,x_{\pi (n)})\in \mathbb {R} ^{n}:\pi \in S_{n}\}.} Here S n {\displaystyle S_{n}} denotes the symmetric group on { 1 , 2 , , n } . {\displaystyle \{1,2,\ldots ,n\}.} In other words, the permutation polytope generated by ( x 1 , , x n ) {\displaystyle (x_{1},\dots ,x_{n})} is the convex hull of the set of all points in R n {\displaystyle \mathbb {R} ^{n}} that can be obtained by rearranging the coordinates of ( x 1 , , x n ) . {\displaystyle (x_{1},\dots ,x_{n}).} The permutation polytope of ( 1 , 1 , 2 ) , {\displaystyle (1,1,2),} for instance, is the convex hull of the set { ( 1 , 1 , 2 ) , ( 1 , 2 , 1 ) , ( 2 , 1 , 1 ) } , {\displaystyle \{(1,1,2),(1,2,1),(2,1,1)\},} which in this case is the solid (filled) triangle whose vertices are the three points in this set. Notice, in particular, that rearranging the coordinates of ( x 1 , , x n ) {\displaystyle (x_{1},\dots ,x_{n})} does not change the resulting permutation polytope; in other words, if a point y ~ {\displaystyle {\tilde {y}}} can be obtained from x ~ = ( x 1 , , x n ) {\displaystyle {\tilde {x}}=(x_{1},\dots ,x_{n})} by rearranging its coordinates, then K y ~ = K x ~ . {\displaystyle {\mathcal {K}}_{\tilde {y}}={\mathcal {K}}_{\tilde {x}}.}

The following lemma characterizes the permutation polytope of a vector in R n . {\displaystyle \mathbb {R} ^{n}.}

Lemma[1][2] — If x 1 x n , {\displaystyle x_{1}\geq \cdots \geq x_{n},} and y 1 y n , {\displaystyle y_{1}\geq \cdots \geq y_{n},} have the same sum x 1 + + x n = y 1 + + y n , {\displaystyle x_{1}+\cdots +x_{n}=y_{1}+\cdots +y_{n},} then the following statements are equivalent:

  1. y ~ := ( y 1 , , y n ) K x ~ . {\displaystyle {\tilde {y}}:=(y_{1},\cdots ,y_{n})\in {\mathcal {K}}_{\tilde {x}}.}
  2. y 1 x 1 , {\displaystyle y_{1}\leq x_{1},} and y 1 + y 2 x 1 + x 2 , {\displaystyle y_{1}+y_{2}\leq x_{1}+x_{2},} and , {\displaystyle \ldots ,} and y 1 + y 2 + + y n 1 x 1 + x 2 + + x n 1 {\displaystyle y_{1}+y_{2}+\cdots +y_{n-1}\leq x_{1}+x_{2}+\cdots +x_{n-1}}
  3. There exist a sequence of points x ~ 1 , , x ~ n {\displaystyle {\tilde {x}}_{1},\dots ,{\tilde {x}}_{n}} in K x ~ , {\displaystyle {\mathcal {K}}_{\tilde {x}},} starting with x ~ 1 = x ~ {\displaystyle {\tilde {x}}_{1}={\tilde {x}}} and ending with x ~ n = y ~ {\displaystyle {\tilde {x}}_{n}={\tilde {y}}} such that x ~ k + 1 = t x ~ k + ( 1 t ) τ ( x k ~ ) {\displaystyle {\tilde {x}}_{k+1}=t{\tilde {x}}_{k}+(1-t)\tau ({\tilde {x_{k}}})} for each k {\displaystyle k} in { 1 , 2 , , n 1 } , {\displaystyle \{1,2,\ldots ,n-1\},} some transposition τ {\displaystyle \tau } in S n , {\displaystyle S_{n},} and some t {\displaystyle t} in [ 0 , 1 ] , {\displaystyle [0,1],} depending on k . {\displaystyle k.}

Reformulation of Schur–Horn theorem

In view of the equivalence of (i) and (ii) in the lemma mentioned above, one may reformulate the theorem in the following manner.

Theorem. Let d 1 , , d N {\displaystyle d_{1},\dots ,d_{N}} and λ 1 , , λ N {\displaystyle \lambda _{1},\dots ,\lambda _{N}} be real numbers. There is a Hermitian matrix with diagonal entries d 1 , , d N {\displaystyle d_{1},\dots ,d_{N}} and eigenvalues λ 1 , , λ N {\displaystyle \lambda _{1},\dots ,\lambda _{N}} if and only if the vector ( d 1 , , d n ) {\displaystyle (d_{1},\ldots ,d_{n})} is in the permutation polytope generated by ( λ 1 , , λ n ) . {\displaystyle (\lambda _{1},\ldots ,\lambda _{n}).}

Note that in this formulation, one does not need to impose any ordering on the entries of the vectors d 1 , , d N {\displaystyle d_{1},\dots ,d_{N}} and λ 1 , , λ N . {\displaystyle \lambda _{1},\dots ,\lambda _{N}.}

Proof of the Schur–Horn theorem

Let A = ( a j k ) {\displaystyle A=(a_{jk})} be a n × n {\displaystyle n\times n} Hermitian matrix with eigenvalues { λ i } i = 1 n , {\displaystyle \{\lambda _{i}\}_{i=1}^{n},} counted with multiplicity. Denote the diagonal of A {\displaystyle A} by a ~ , {\displaystyle {\tilde {a}},} thought of as a vector in R n , {\displaystyle \mathbb {R} ^{n},} and the vector ( λ 1 , λ 2 , , λ n ) {\displaystyle (\lambda _{1},\lambda _{2},\ldots ,\lambda _{n})} by λ ~ . {\displaystyle {\tilde {\lambda }}.} Let Λ {\displaystyle \Lambda } be the diagonal matrix having λ 1 , λ 2 , , λ n {\displaystyle \lambda _{1},\lambda _{2},\ldots ,\lambda _{n}} on its diagonal.

( {\displaystyle \Rightarrow } ) A {\displaystyle A} may be written in the form U Λ U 1 , {\displaystyle U\Lambda U^{-1},} where U {\displaystyle U} is a unitary matrix. Then

a i i = j = 1 n λ j | u i j | 2 , i = 1 , 2 , , n . {\displaystyle a_{ii}=\sum _{j=1}^{n}\lambda _{j}|u_{ij}|^{2},\;i=1,2,\ldots ,n.}

Let S = ( s i j ) {\displaystyle S=(s_{ij})} be the matrix defined by s i j = | u i j | 2 . {\displaystyle s_{ij}=|u_{ij}|^{2}.} Since U {\displaystyle U} is a unitary matrix, S {\displaystyle S} is a doubly stochastic matrix and we have a ~ = S λ ~ . {\displaystyle {\tilde {a}}=S{\tilde {\lambda }}.} By the Birkhoff–von Neumann theorem, S {\displaystyle S} can be written as a convex combination of permutation matrices. Thus a ~ {\displaystyle {\tilde {a}}} is in the permutation polytope generated by λ ~ . {\displaystyle {\tilde {\lambda }}.} This proves Schur's theorem.

( {\displaystyle \Leftarrow } ) If a ~ {\displaystyle {\tilde {a}}} occurs as the diagonal of a Hermitian matrix with eigenvalues { λ i } i = 1 n , {\displaystyle \{\lambda _{i}\}_{i=1}^{n},} then t a ~ + ( 1 t ) τ ( a ~ ) {\displaystyle t{\tilde {a}}+(1-t)\tau ({\tilde {a}})} also occurs as the diagonal of some Hermitian matrix with the same set of eigenvalues, for any transposition τ {\displaystyle \tau } in S n . {\displaystyle S_{n}.} One may prove that in the following manner.

Let ξ {\displaystyle \xi } be a complex number of modulus 1 {\displaystyle 1} such that ξ a j k ¯ = ξ a j k {\displaystyle {\overline {\xi a_{jk}}}=-\xi a_{jk}} and U {\displaystyle U} be a unitary matrix with ξ t , t {\displaystyle \xi {\sqrt {t}},{\sqrt {t}}} in the j , j {\displaystyle j,j} and k , k {\displaystyle k,k} entries, respectively, 1 t , ξ 1 t {\displaystyle -{\sqrt {1-t}},\xi {\sqrt {1-t}}} at the j , k {\displaystyle j,k} and k , j {\displaystyle k,j} entries, respectively, 1 {\displaystyle 1} at all diagonal entries other than j , j {\displaystyle j,j} and k , k , {\displaystyle k,k,} and 0 {\displaystyle 0} at all other entries. Then U A U 1 {\displaystyle UAU^{-1}} has t a j j + ( 1 t ) a k k {\displaystyle ta_{jj}+(1-t)a_{kk}} at the j , j {\displaystyle j,j} entry, ( 1 t ) a j j + t a k k {\displaystyle (1-t)a_{jj}+ta_{kk}} at the k , k {\displaystyle k,k} entry, and a l l {\displaystyle a_{ll}} at the l , l {\displaystyle l,l} entry where l j , k . {\displaystyle l\neq j,k.} Let τ {\displaystyle \tau } be the transposition of { 1 , 2 , , n } {\displaystyle \{1,2,\dots ,n\}} that interchanges j {\displaystyle j} and k . {\displaystyle k.}

Then the diagonal of U A U 1 {\displaystyle UAU^{-1}} is t a ~ + ( 1 t ) τ ( a ~ ) . {\displaystyle t{\tilde {a}}+(1-t)\tau ({\tilde {a}}).}

Λ {\displaystyle \Lambda } is a Hermitian matrix with eigenvalues { λ i } i = 1 n . {\displaystyle \{\lambda _{i}\}_{i=1}^{n}.} Using the equivalence of (i) and (iii) in the lemma mentioned above, we see that any vector in the permutation polytope generated by ( λ 1 , λ 2 , , λ n ) , {\displaystyle (\lambda _{1},\lambda _{2},\ldots ,\lambda _{n}),} occurs as the diagonal of a Hermitian matrix with the prescribed eigenvalues. This proves Horn's theorem.

Symplectic geometry perspective

The Schur–Horn theorem may be viewed as a corollary of the Atiyah–Guillemin–Sternberg convexity theorem in the following manner. Let U ( n ) {\displaystyle {\mathcal {U}}(n)} denote the group of n × n {\displaystyle n\times n} unitary matrices. Its Lie algebra, denoted by u ( n ) , {\displaystyle {\mathfrak {u}}(n),} is the set of skew-Hermitian matrices. One may identify the dual space u ( n ) {\displaystyle {\mathfrak {u}}(n)^{*}} with the set of Hermitian matrices H ( n ) {\displaystyle {\mathcal {H}}(n)} via the linear isomorphism Ψ : H ( n ) u ( n ) {\displaystyle \Psi :{\mathcal {H}}(n)\rightarrow {\mathfrak {u}}(n)^{*}} defined by Ψ ( A ) ( B ) = t r ( i A B ) {\displaystyle \Psi (A)(B)=\mathrm {tr} (iAB)} for A H ( n ) , B u ( n ) . {\displaystyle A\in {\mathcal {H}}(n),B\in {\mathfrak {u}}(n).} The unitary group U ( n ) {\displaystyle {\mathcal {U}}(n)} acts on H ( n ) {\displaystyle {\mathcal {H}}(n)} by conjugation and acts on u ( n ) {\displaystyle {\mathfrak {u}}(n)^{*}} by the coadjoint action. Under these actions, Ψ {\displaystyle \Psi } is an U ( n ) {\displaystyle {\mathcal {U}}(n)} -equivariant map i.e. for every U U ( n ) {\displaystyle U\in {\mathcal {U}}(n)} the following diagram commutes,

Let λ ~ = ( λ 1 , λ 2 , , λ n ) R n {\displaystyle {\tilde {\lambda }}=(\lambda _{1},\lambda _{2},\ldots ,\lambda _{n})\in \mathbb {R} ^{n}} and Λ H ( n ) {\displaystyle \Lambda \in {\mathcal {H}}(n)} denote the diagonal matrix with entries given by λ ~ . {\displaystyle {\tilde {\lambda }}.} Let O λ ~ {\displaystyle {\mathcal {O}}_{\tilde {\lambda }}} denote the orbit of Λ {\displaystyle \Lambda } under the U ( n ) {\displaystyle {\mathcal {U}}(n)} -action i.e. conjugation. Under the U ( n ) {\displaystyle {\mathcal {U}}(n)} -equivariant isomorphism Ψ , {\displaystyle \Psi ,} the symplectic structure on the corresponding coadjoint orbit may be brought onto O λ ~ . {\displaystyle {\mathcal {O}}_{\tilde {\lambda }}.} Thus O λ ~ {\displaystyle {\mathcal {O}}_{\tilde {\lambda }}} is a Hamiltonian U ( n ) {\displaystyle {\mathcal {U}}(n)} -manifold.

Let T {\displaystyle \mathbb {T} } denote the Cartan subgroup of U ( n ) {\displaystyle {\mathcal {U}}(n)} which consists of diagonal complex matrices with diagonal entries of modulus 1. {\displaystyle 1.} The Lie algebra t {\displaystyle {\mathfrak {t}}} of T {\displaystyle \mathbb {T} } consists of diagonal skew-Hermitian matrices and the dual space t {\displaystyle {\mathfrak {t}}^{*}} consists of diagonal Hermitian matrices, under the isomorphism Ψ . {\displaystyle \Psi .} In other words, t {\displaystyle {\mathfrak {t}}} consists of diagonal matrices with purely imaginary entries and t {\displaystyle {\mathfrak {t}}^{*}} consists of diagonal matrices with real entries. The inclusion map t u ( n ) {\displaystyle {\mathfrak {t}}\hookrightarrow {\mathfrak {u}}(n)} induces a map Φ : H ( n ) u ( n ) t , {\displaystyle \Phi :{\mathcal {H}}(n)\cong {\mathfrak {u}}(n)^{*}\rightarrow {\mathfrak {t}}^{*},} which projects a matrix A {\displaystyle A} to the diagonal matrix with the same diagonal entries as A . {\displaystyle A.} The set O λ ~ {\displaystyle {\mathcal {O}}_{\tilde {\lambda }}} is a Hamiltonian T {\displaystyle \mathbb {T} } -manifold, and the restriction of Φ {\displaystyle \Phi } to this set is a moment map for this action.

By the Atiyah–Guillemin–Sternberg theorem, Φ ( O λ ~ ) {\displaystyle \Phi ({\mathcal {O}}_{\tilde {\lambda }})} is a convex polytope. A matrix A H ( n ) {\displaystyle A\in {\mathcal {H}}(n)} is fixed under conjugation by every element of T {\displaystyle \mathbb {T} } if and only if A {\displaystyle A} is diagonal. The only diagonal matrices in O λ ~ {\displaystyle {\mathcal {O}}_{\tilde {\lambda }}} are the ones with diagonal entries λ 1 , λ 2 , , λ n {\displaystyle \lambda _{1},\lambda _{2},\ldots ,\lambda _{n}} in some order. Thus, these matrices generate the convex polytope Φ ( O λ ~ ) . {\displaystyle \Phi ({\mathcal {O}}_{\tilde {\lambda }}).} This is exactly the statement of the Schur–Horn theorem.

Notes

  1. ^ Kadison, R. V., Lemma 5, The Pythagorean Theorem: I. The finite case, Proc. Natl. Acad. Sci. USA, vol. 99 no. 7 (2002):4178–4184 (electronic)
  2. ^ Kadison, R. V.; Pedersen, G. K., Lemma 13, Means and Convex Combinations of Unitary Operators, Math. Scand. 57 (1985),249–266

References

  • Schur, Issai, Über eine Klasse von Mittelbildungen mit Anwendungen auf die Determinantentheorie, Sitzungsber. Berl. Math. Ges. 22 (1923), 9–20.
  • Horn, Alfred, Doubly stochastic matrices and the diagonal of a rotation matrix, American Journal of Mathematics 76 (1954), 620–630.
  • Kadison, R. V.; Pedersen, G. K., Means and Convex Combinations of Unitary Operators, Math. Scand. 57 (1985),249–266.
  • Kadison, R. V., The Pythagorean Theorem: I. The finite case, Proc. Natl. Acad. Sci. USA, vol. 99 no. 7 (2002):4178–4184 (electronic)

External links

  • MathWorld
  • Terry Tao: 254A, Notes 3a: Eigenvalues and sums of Hermitian matrices
  • Sheela Devadas, Peter J. Haine, Keaton Stubis: The Schur-Horn Theorem
  • v
  • t
  • e
Spaces
Properties
TheoremsOperatorsAlgebrasOpen problemsApplicationsAdvanced topics
  • Category