Preliminaries
This chapter begins with a review of fundamental concepts related to real matrices. While the content is self-contained, readers are expected to have a basic understanding of matrix operations and elementary properties of determinants.
An m×n matrix is composed of mn real numbers organized into m rows and n columns. The element located at the intersection of row i and column j of a matrix A is denoted by a_ij. An m×1 matrix is called a column vector of order m, while a 1×n matrix is called a row vector of order n. When m equals n, the m×n matrix is a square matrix.
If A and B are m×n matrices, then A + B is defined as the m×n matrix with (i, j)-entry a_ij + b_ij. If A is a matrix and c is a real number, then cA is obtained by multiplying each element of A by c.
If A is m×p and B is p×n, then their product C = AB is an m×n matrix with (i, j)-entry given by \(c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}\). Matrix multiplication is associative and distributes over addition:
(AB)C = A(BC), A(B + C) = AB + AC, (A + B)C = AC + BC.
The transpose of the m×n matrix A, denoted by A′, is the n×m matrix whose (i, j)-entry is a_ji. It can be verified that (A′)′ = A, (A + B)′ = A′ + B′ and (AB)′ = B′A′.
A good understanding of the definition of matrix multiplication is quite useful.
We now note some essential facts regarding matrix multiplication; throughout, it is assumed that all products mentioned are defined, i.e., that the dimensions of the matrices are compatible for multiplication.
(i) The j-th column of AB is the same as A multiplied by the j-th column of B.
(ii) The i-th row of AB is the same as the i-th row of A multiplied by B.
(iii) The (i, j)-entry of ABC is obtained as \(\sum_{r=1}^{p} \sum_{s=1}^{q} x_r b_{rs} y_s\), where (x₁, …, x_p) is the i-th row of A and (y₁, …, y_q)′ is the j-th column of C.
(iv) If \(A = [a_1, \ldots, a_p]\), where the aᵢ denote the columns of A, and \(b_1', \ldots, b_p'\) denote the rows of B, then \(AB = a_1 b_1' + \cdots + a_p b_p'\).
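As a quick illustration of these facts, here is a small numpy sketch (ours, not from the text) that checks (i), (ii) and (iv) on random integer matrices; the names A and B are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(3, 4)).astype(float)
B = rng.integers(-3, 4, size=(4, 2)).astype(float)

# (i) the j-th column of AB equals A times the j-th column of B
j = 1
assert np.allclose((A @ B)[:, j], A @ B[:, j])

# (ii) the i-th row of AB equals the i-th row of A times B
i = 2
assert np.allclose((A @ B)[i, :], A[i, :] @ B)

# (iv) AB as a sum of outer products of columns of A with rows of B
outer_sum = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))
assert np.allclose(A @ B, outer_sum)
print("all three identities verified")
```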
A diagonal matrix is a square matrix A such that a_ij = 0 whenever i ≠ j. We denote the diagonal matrix with diagonal entries λ₁, …, λₙ by diag(λ₁, …, λₙ). When each λᵢ equals 1, this matrix is the identity matrix of order n, denoted Iₙ, or simply I when the order is clear from context. Note that AI = IA = A for any square matrix A.
The entries a₁₁, …, aₙₙ are said to constitute the (main) diagonal entries of A. The trace of A is defined as trace A = a₁₁ + ··· + aₙₙ. It follows from this definition that if A, B are matrices such that both AB and BA are defined, then trace AB = trace BA; indeed, \(\operatorname{trace} AB = \sum_i \sum_k a_{ik} b_{ki} = \operatorname{trace} BA\).
The determinant of an n×n matrix A, denoted by |A|, is defined as
\(|A| = \sum_{\sigma} \varepsilon(\sigma)\, a_{1\sigma(1)} \cdots a_{n\sigma(n)},\)
where the summation is over all permutations σ of {1, …, n} and ε(σ) is 1 or −1 according as σ is even or odd.
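The definition can be turned into code directly. The following Python sketch (illustrative only; it enumerates all n! permutations, so it is hopelessly slow beyond small n) computes |A| from the permutation expansion and compares with numpy.

```python
import numpy as np
from itertools import permutations

def det_by_permutations(A):
    """Determinant via the definition: sum over permutations sigma of
    sign(sigma) * a[0, sigma(0)] * ... * a[n-1, sigma(n-1)]."""
    n = A.shape[0]
    total = 0.0
    for sigma in permutations(range(n)):
        # parity of the permutation: count inversions
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if sigma[i] > sigma[j])
        sign = -1.0 if inversions % 2 else 1.0
        prod = 1.0
        for i in range(n):
            prod *= A[i, sigma[i]]
        total += sign * prod
    return total

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
print(det_by_permutations(A), np.linalg.det(A))  # both print 18.0
```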
We state some basic properties of the determinant without proof:
(i) The determinant can be evaluated by expansion along a row or a column. Thus, expanding along the first row,
\(|A| = \sum_{j=1}^{n} (-1)^{1+j} a_{1j} |A_{1j}|,\)
where A_{1j} is the submatrix obtained by deleting the first row and the j-th column of A. The analogous expansion is valid along any row or column.
(ii) The sign of the determinant reverses when two rows (or two columns) are interchanged.
(iii) Adding a constant multiple of one row to another row does not affect the determinant's value; the same applies to columns.
(iv) The determinant is a linear function of any column (row) when all the other columns (rows) are held fixed.
The matrix A is upper triangular if a_ij = 0 for i > j. The transpose of an upper triangular matrix is lower triangular.
It will often be necessary to work with matrices in partitioned form. For example, let
\(A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}\)
be two matrices where each A_ij, B_ij is itself a matrix. If compatibility for matrix multiplication is assumed throughout, then we can write
\(AB = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}.\)
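A small numpy check of the partitioned product formula, with an arbitrary 2 + 3 split of 5×5 matrices (the split sizes are our choice):

```python
import numpy as np

rng = np.random.default_rng(1)
# partition conformally: A and B are 5x5, split as 2+3 rows and columns
A = rng.standard_normal((5, 5))
B = rng.standard_normal((5, 5))
A11, A12, A21, A22 = A[:2, :2], A[:2, 2:], A[2:, :2], A[2:, 2:]
B11, B12, B21, B22 = B[:2, :2], B[:2, 2:], B[2:, :2], B[2:, 2:]

blockwise = np.block([
    [A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
    [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22],
])
assert np.allclose(blockwise, A @ B)
```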
Vector Spaces

A nonempty set S is called a vector space if it satisfies the following conditions:
(i) For any x, y in S, x + y is defined and is in S. Further, x + y = y + x (commutativity) and x + (y + z) = (x + y) + z (associativity).
(ii) There exists an element in S, denoted by 0, such that x + 0 = x for all x.
(iii) For any x in S there exists an element y in S such that x + y = 0.
(iv) For any x in S and any real number c, cx is defined and is in S; moreover, 1x = x for all x.
(v) For any x₁, x₂ in S and reals c₁, c₂: c₁(x₁ + x₂) = c₁x₁ + c₁x₂, (c₁ + c₂)x₁ = c₁x₁ + c₂x₁ and c₁(c₂x₁) = (c₁c₂)x₁.
The elements of a vector space are called vectors, and x + y is the (vector) sum of x and y. The element 0 of condition (ii) is the zero vector, and cx is the scalar multiple of x by c. Although vector spaces can be defined over any field, we work only over the field of real numbers, which is adequate for our purposes.
The set of column vectors of order n (i.e., n×1 matrices) is a vector space, as is the set of row vectors of order n. These are the vector spaces we deal with most often.
Let \( \mathbb{R}^n \) denote the set of n-tuples of real numbers, where \( \mathbb{R} \) is the set of real numbers. Elements of \( \mathbb{R}^n \) will be written as column vectors or as row vectors, depending on convenience.
If S, T are vector spaces and S ⊂ T, then S is called a subspace of T.
As an example, \( \mathbb{R}^3 \) is a vector space, and the set consisting of the zero vector alone is a (trivial) subspace of it. For fixed real numbers c₁, c₂, c₃, not all zero, the set of all vectors x ∈ \( \mathbb{R}^3 \) satisfying c₁x₁ + c₂x₂ + c₃x₃ = 0 is a subspace; geometrically it is a plane through the origin. The intersection of two distinct planes through the origin is a line through the origin, again a subspace. In fact, the only subspaces of \( \mathbb{R}^3 \) are the zero space, lines through the origin, planes through the origin, and \( \mathbb{R}^3 \) itself.
Basis and Dimension
The linear span of the vectors x₁, …, xₘ is the set of all linear combinations c₁x₁ + c₂x₂ + ··· + cₘxₘ, where c₁, …, cₘ are real numbers. It is evident from the definition that the linear span is a subspace.
A set of vectors x₁, …, xₘ is said to be linearly dependent if there exist real numbers c₁, …, cₘ such that at least one cᵢ is nonzero and c₁x₁ + ··· + cₘxₘ = 0.
A collection of vectors is linearly independent if it is not linearly dependent. Note that in discussing linear independence or dependence we refer to a multiset of vectors rather than a set, so that the vectors need not be distinct.
The following statements are easily proved.
(i) The set consisting of the zero vector alone is linearly dependent.
(ii) If X ⊂ Y and X is linearly dependent, then so is Y.
(iii) If X ⊂ Y and Y is linearly independent, then so is X.
A set of vectors is said to form a basis for the vector space S if it is linearly independent and its linear span equals S.
Let eᵢ be the i-th column of the n×n identity matrix. The set e₁, …, eₙ forms a basis for \( \mathbb{R}^n \), called the standard basis.
If x₁, …, xₘ is a basis for S, then any vector x in S admits a unique representation as a linear combination c₁x₁ + ··· + cₘxₘ. For, if x = c₁x₁ + ··· + cₘxₘ = d₁x₁ + ··· + dₘxₘ, then (c₁ − d₁)x₁ + ··· + (cₘ − dₘ)xₘ = 0, and since x₁, …, xₘ are linearly independent, cᵢ = dᵢ for each i.
A vector space is finite dimensional if it has a basis consisting of finitely many vectors; this includes the special case of the space containing only the zero vector. We consider only finite dimensional vector spaces, and we generally assume the space to be nontrivial, i.e., to contain vectors other than the zero vector.
1.1 Let S be a vector space. Then any two bases of S have the same cardinality.
Proof. Suppose x₁, …, x_p and y₁, …, y_q are bases for S and let, if possible, p > q.
Every xᵢ can be expressed as a linear combination of y₁, …, y_q, so there is a p×q matrix A such that \(x_i = \sum_{j=1}^{q} a_{ij} y_j\), i = 1, …, p. Similarly, there is a q×p matrix B such that \(y_j = \sum_{k=1}^{p} b_{jk} x_k\), j = 1, …, q. Combining these equations, \(x_i = \sum_{k=1}^{p} c_{ik} x_k\), where C = AB. Since x₁, …, x_p is a basis, each xᵢ has a unique representation in terms of the basis, and hence AB = I, the identity matrix of order p.
Augment A with p − q zero columns to form a p×p matrix U, and augment B with p − q zero rows to form a p×p matrix V. Then UV = AB = I, so |UV| = 1. However, |U| = |V| = 0, since U has a zero column and V has a zero row. This contradiction shows that p ≤ q.
We can similarly prove that q ≤ p, and it follows that p = q. □
In the process of proving 1.1 we have proved the following statement, which will be useful. Let S be a vector space. Suppose x₁, …, x_p is a basis for S, and suppose the set y₁, …, y_q spans S. Then p ≤ q.
The dimension of a vector space S, denoted dim(S), is defined as the number of vectors in a basis for S. By convention, the dimension of the space containing only the zero vector is zero.
Two vector spaces S and T are said to be isomorphic if there exists a map f from S onto T that is one-to-one and linear, i.e., for all vectors x, y in S and any real number c, f(x + y) = f(x) + f(y) and f(cx) = c f(x).
1.2 Let S and T be vector spaces. Then S, T are isomorphic if and only if dim(S) = dim(T).
Proof. We first prove the "only if" part. Suppose f : S → T is an isomorphism.
If x₁, …, x_k is a basis for S, then f(x₁), …, f(x_k) forms a basis for T. To see this, suppose c₁f(x₁) + ··· + c_k f(x_k) = 0. By the definition of isomorphism, f(c₁x₁ + ··· + c_k x_k) = 0, and hence c₁x₁ + ··· + c_k x_k = 0. Since x₁, …, x_k are linearly independent, c₁ = ··· = c_k = 0, so f(x₁), …, f(x_k) are linearly independent. Furthermore, for any v ∈ T there exists u ∈ S such that f(u) = v. Writing u = d₁x₁ + ··· + d_k x_k, we get v = f(u) = d₁f(x₁) + ··· + d_k f(x_k), so f(x₁), …, f(x_k) span T and hence form a basis for T. Consequently dim(T) = k = dim(S).
To prove the converse, let x₁, …, x_k and y₁, …, y_k be bases for S and T respectively. (Since dim(S) = dim(T), the bases have the same cardinality.) Any x in S admits a unique representation x = c₁x₁ + ··· + c_k x_k.
Define f(x) = y, where y = c₁y₁ + ··· + c_k y_k. It can be verified that f satisfies the definition of isomorphism. □
1.3 Let S be a vector space and suppose S is the linear span of the vectors x₁, …, xₘ. If some xᵢ is a linear combination of x₁, …, x_{i−1}, x_{i+1}, …, xₘ, then these latter vectors also span S.
1.4 Let S be a vector space of dimension n, and let x₁, …, xₘ be linearly independent vectors in S. Then there exists a basis for S containing x₁, …, xₘ.
Proof. Let y₁, …, yₙ be a basis for S. Then the combined set x₁, …, xₘ, y₁, …, yₙ spans S and is linearly dependent. Thus there is a linear combination c₁x₁ + ··· + cₘxₘ + d₁y₁ + ··· + dₙyₙ = 0 in which at least one coefficient is nonzero. Since x₁, …, xₘ are linearly independent, at least one dᵢ must be nonzero, and the corresponding yᵢ is then a linear combination of the remaining vectors. By 1.3, the set x₁, …, xₘ, y₁, …, y_{i−1}, y_{i+1}, …, yₙ
also spans S. If this set is linearly independent then we have a basis as required. Otherwise we continue the process until we get a basis containing x₁, …, xₘ. □
1.5 Any set of n + 1 vectors in \( \mathbb{R}^n \) is linearly dependent.
Proof. If the set were linearly independent, then by 1.4 we could find a basis for \( \mathbb{R}^n \) containing the set. This is a contradiction, since every basis for \( \mathbb{R}^n \) must contain precisely n vectors. □
A basis for a subspace S of \( \mathbb{R}^n \) can be constructed by selecting vectors x₁, x₂, … in S successively, so that the chosen set remains linearly independent at each stage. If at some stage the chosen vectors span S, we have a basis. Otherwise there is a vector x_{m+1} in S that is not in the linear span of x₁, …, xₘ, and the set x₁, …, xₘ, x_{m+1} is again linearly independent. The process must terminate, since by 1.5 any n + 1 vectors in \( \mathbb{R}^n \) are linearly dependent.
1.7 If S is a subspace of T, then dim(S) ≤ dim(T). Furthermore, equality holds if and only if S = T.
Proof. Let dim(S) = p and dim(T) = q, and let x₁, …, x_p be a basis for S. Any set of r vectors in T with r > q is linearly dependent. Since x₁, …, x_p are linearly independent vectors in T, it follows that p ≤ q. Now suppose p = q but S ≠ T. Then there exists a vector z ∈ T that is not in the span of x₁, …, x_p, so x₁, …, x_p, z is a linearly independent set of p + 1 = q + 1 vectors in T, a contradiction. Hence if dim(S) = dim(T), then S = T. Conversely, if S = T, their dimensions are trivially equal. □
Exercises

1. Construct a 3×3 matrix A such that both A and A² are nonzero but A³ = 0.
2. Decide whether the determinant of the following matrix A is even or odd, without evaluating it explicitly:
3. Can you find 3×3 matrices X, Y such that XY − YX = A?
4. If A, B are n×n matrices, show that
5. Evaluate the determinant of the n×n matrix A, where a_ij = ij if i ≠ j and a_ij = 1 + ij if i = j.
6. Let A be an n×n matrix and suppose A has a zero submatrix of order r×s, where r + s = n + 1. Show that |A| = 0.
7. Let A be an n×n matrix such that trace AB = 0 for every n×n matrix B. Can we conclude that A must be the zero matrix?
8. Which of the following sets are vector spaces (with the natural operations of addition and scalar multiplication)? (i) Vectors (a, b, c, d) such that a + 2b = c − d. (ii) n×n matrices A such that A² = I. (iii) 3×3 matrices A such that a₁₁ + a₁₃ = a₂₂ + a₃₁.
9. If S and T are vector spaces, then are S ∪ T and S ∩ T vector spaces as well?
10. For any matrix A, show that A = 0 if and only if trace A′A = 0.
11. Let A be a square matrix. Prove that the following conditions are equivalent: (i) A = A′; (ii) A² = AA′; (iii) trace A² = trace AA′; (iv) A² = A′A.
12. Let A be a square matrix with all row sums equal to 1. If AA′ = A′A, then show that the column sums of A are also equal to 1.
13. Verify that each of the following sets is a vector space and find its dimension: (i) vectors (a, b, c, d) such that a + b = c + d; (ii) the set of solutions (x, y, z) to the system 2x − y = 0, 2y + 3z = 0.
14. If x, y, z is a basis for \( \mathbb{R}^3 \), which of the following are also bases for \( \mathbb{R}^3 \)? (i) x + 2y, y + 3z, x + 2z; (ii) x + y − 2z, x − 2y + z, −2x + y + z; (iii) x, y, x + y + z.
15. If {x₁, x₂} and {y₁, y₂} are both bases of \( \mathbb{R}^2 \), show that at least one of the following statements is true: (i) {x₁, y₂}, {x₂, y₁} are both bases of \( \mathbb{R}^2 \); (ii) {x₁, y₁}, {x₂, y₂} are both bases of \( \mathbb{R}^2 \).
16. Consider the set of all vectors x in \( \mathbb{R}^n \) such that \( \sum_{i=1}^{n} x_i = 0 \). Show that the set is a vector space and find a basis for the space.
17. Determine the dimension of the vector space of all n×n matrices A such that trace A = 0.
18. Let S, T be subspaces of \( \mathbb{R}^n \). Define S + T as the set of vectors of the form x + y, where x ∈ S, y ∈ T. Prove that S + T is a subspace and that dim(S + T) = dim(S) + dim(T) − dim(S ∩ T).
19. Let S, T be subspaces of \( \mathbb{R}^n \) such that dim(S) + dim(T) > n. Show that dim(S ∩ T) ≥ 1.
Rank, Inner Product and Nonsingularity
Rank
Let A be an m×n matrix. The subspace of \( \mathbb{R}^m \) spanned by the column vectors of A is called the column space or the column span of A and is denoted by C(A). Similarly, the subspace of \( \mathbb{R}^n \) spanned by the row vectors of A is called the row space of A, denoted R(A). The dimension of C(A) is called the column rank of A, and the dimension of R(A) is the row rank. We now show that these two ranks are always equal.
2.1 The column rank of a matrix equals its row rank.
Proof. Let A be an m×n matrix with column rank r, and let b₁, …, b_r be a basis for C(A). Form the m×r matrix B = [b₁, …, b_r]. Each column of A is a linear combination of b₁, …, b_r, so A = BC for some r×n matrix C. Then every row of A is a linear combination of the rows of C.
Hence R(A) ⊂ R(C), and therefore the row rank of A is at most dim R(C) ≤ r, the column rank of A. A similar argument shows that the column rank does not exceed the row rank, so the two are equal. □
The common value of the column rank and the row rank of A is called the rank of A, written rank A. Clearly rank A ≥ 0, and rank A = 0 if and only if A is the zero matrix.
2.2 Let A, B be matrices such that AB is defined. Then rank(AB) ≤ min{rank A, rank B}.
Proof. A vector in C(AB) is of the form ABx for some vector x, and therefore it belongs to C(A). Thus C(AB) ⊂ C(A), and hence by 1.7,
rank(AB) = dim C(AB) ≤ dim C(A) = rank A.
Now using this fact we have rank(AB) = rank((AB)′) = rank(B′A′) ≤ rank B′ = rank B. □
2.3 Let A be an m×n matrix of rank r, r ≠ 0. Then there exist matrices B, C of order m×r, r×n respectively such that rank B = rank C = r and A = BC. This decomposition is called a rank factorization of A.
Proof. The proof proceeds along the same lines as that of 2.1, so that we can write A = BC, where B is m×r and C is r×n. Since the columns of B are linearly independent, rank B = r. Since C has r rows, rank C ≤ r. However, by 2.2, r = rank A ≤ rank C, and hence rank C = r. □
Throughout this monograph, whenever we talk of rank factorization of a matrix it is implicitly assumed that the matrix is nonzero.
2.4 Let A, B be m×n matrices. Then rank(A + B) ≤ rank A + rank B.
Proof. Let A = XY, B = UV be rank factorizations of A, B. Then
\(A + B = XY + UV = [X, U] \begin{bmatrix} Y \\ V \end{bmatrix}.\)
Let x₁, …, x_p and u₁, …, u_q be bases for C(X) and C(U) respectively. Any vector in the column space of [X, U] is a linear combination of these p + q vectors, and therefore rank[X, U] ≤ rank X + rank U. Hence, by 2.2, rank(A + B) ≤ rank[X, U] ≤ rank X + rank U = rank A + rank B. □
The following operations performed on a matrix A are called elementary column operations:
(i) Interchange two columns of A.
(ii) Multiply a column of A by a nonzero scalar.
(iii) Add a scalar multiple of one column to another column.
Elementary column operations leave C(A), and hence the rank, unchanged; elementary row operations are defined analogously and likewise do not change the rank. These operations are useful in computations, enabling us to simplify a matrix by introducing zeros. To determine the rank of a matrix, we first reduce it by elementary operations and then compute the rank of the resulting simpler matrix.
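The following numpy sketch illustrates this invariance: applying one operation of each type to a rank-2 matrix leaves the computed rank unchanged. (numpy.linalg.matrix_rank uses the singular value decomposition internally rather than elementary operations, but the conclusion is the same; the matrix below is our example.)

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
print(np.linalg.matrix_rank(A))   # 2: the second row is twice the first

B = A.copy()
B[:, [0, 1]] = B[:, [1, 0]]       # (i) interchange two columns
B[:, 2] *= 5.0                    # (ii) scale a column by a nonzero scalar
B[:, 0] += 3.0 * B[:, 1]          # (iii) add a multiple of one column to another
print(np.linalg.matrix_rank(B))   # still 2
```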
Inner Product
Let S be a vector space. A function which assigns a real number ⟨x, y⟩ to every pair of vectors x, y in S is said to be an inner product if it satisfies the following conditions:
(i) ⟨x, y⟩ = ⟨y, x⟩;
(ii) ⟨x, x⟩ ≥ 0, and equality holds if and only if x = 0;
(iii) ⟨x, y + z⟩ = ⟨x, y⟩ + ⟨x, z⟩;
(iv) ⟨cx, y⟩ = c⟨x, y⟩ for any real number c.
In \( \mathbb{R}^n \), ⟨x, y⟩ = x′y = x₁y₁ + ··· + xₙyₙ is easily seen to be an inner product. We will work with this inner product while dealing with \( \mathbb{R}^n \) and its subspaces, unless indicated otherwise.
The norm of a vector x is defined as ‖x‖ = ⟨x, x⟩^{1/2}, the positive square root. Vectors x and y are said to be orthogonal or perpendicular if ⟨x, y⟩ = 0, in which case we write x ⊥ y.
2.5 If x₁, …, xₘ are pairwise orthogonal nonzero vectors, then they are linearly independent.
Proof. Suppose c₁x₁ + ··· + cₘxₘ = 0. Then ⟨c₁x₁ + ··· + cₘxₘ, x₁⟩ = 0, and hence \(\sum_{i=1}^{m} c_i \langle x_i, x_1 \rangle = 0\).
Since the vectors x₁, …, xₘ are pairwise orthogonal, it follows that c₁⟨x₁, x₁⟩ = 0, and since x₁ is nonzero, c₁ = 0. Similarly we can show that each cᵢ is zero. Therefore the vectors are linearly independent. □
A set of vectors x₁, …, xₘ is said to form an orthonormal basis for the vector space S if the set is a basis for S and furthermore ⟨xᵢ, xⱼ⟩ is 0 if i ≠ j and 1 if i = j.
We now describe the Gram–Schmidt procedure, which produces an orthonormal basis starting with a given basis x₁, …, xₙ.
Set y₁ = x₁. Having defined y₁, …, y_{i−1}, define yᵢ = xᵢ − a_{i,i−1} y_{i−1} − ··· − a_{i1} y₁, where a_{i,i−1}, …, a_{i1} are chosen so that yᵢ is orthogonal to y₁, …, y_{i−1}. Thus we must solve ⟨yᵢ, yⱼ⟩ = 0, j = 1, …, i−1. This leads to ⟨xᵢ − a_{i,i−1} y_{i−1} − ··· − a_{i1} y₁, yⱼ⟩ = 0, j = 1, …, i−1, which gives \(\langle x_i, y_j \rangle - \sum_{k=1}^{i-1} a_{ik} \langle y_k, y_j \rangle = 0\), j = 1, …, i−1.
Now since y₁, …, y_{i−1} is an orthogonal set, we get ⟨xᵢ, yⱼ⟩ − a_{ij}⟨yⱼ, yⱼ⟩ = 0, and hence a_{ij} = ⟨xᵢ, yⱼ⟩ / ⟨yⱼ, yⱼ⟩, j = 1, …, i−1.
The process is continued to obtain the pairwise orthogonal vectors y₁, …, yₙ. Since x₁, …, xₙ are linearly independent, each yᵢ is nonzero. Setting zᵢ = yᵢ/‖yᵢ‖, the vectors z₁, …, zₙ form an orthonormal basis. Note that the linear span of z₁, …, zᵢ equals the linear span of x₁, …, xᵢ for each i.
More generally, the Gram–Schmidt procedure transforms any set of linearly independent vectors x₁, …, xₘ into a set of pairwise orthogonal vectors y₁, …, yₘ, where each yᵢ is a linear combination of x₁, …, xᵢ, i = 1, …, m. This property is essential for proving the subsequent results.
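Here is a direct transcription of the procedure into Python (an illustrative sketch, not the text's own; it has no safeguards against nearly dependent inputs). It uses the coefficients a_ij = ⟨xᵢ, yⱼ⟩/⟨yⱼ, yⱼ⟩ derived above and normalizes at the end.

```python
import numpy as np

def gram_schmidt(X):
    """Columns of X are linearly independent vectors x_1, ..., x_n.
    Returns Z whose columns z_1, ..., z_n are orthonormal, with
    span(z_1, ..., z_i) = span(x_1, ..., x_i) for each i."""
    n = X.shape[1]
    Y = np.zeros_like(X, dtype=float)
    for i in range(n):
        y = X[:, i].astype(float).copy()
        for j in range(i):
            a_ij = (X[:, i] @ Y[:, j]) / (Y[:, j] @ Y[:, j])
            y -= a_ij * Y[:, j]          # subtract the component along y_j
        Y[:, i] = y
    return Y / np.linalg.norm(Y, axis=0)  # normalize each column

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Z = gram_schmidt(X)
assert np.allclose(Z.T @ Z, np.eye(3))   # the columns are orthonormal
```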
Let W be a set (not necessarily a subspace) of vectors in a vector space S. We define
W^⊥ = {x ∈ S : ⟨x, y⟩ = 0 for all y ∈ W}.
It follows from the definitions that W^⊥ is a subspace of S.
2.6 Let S be a subspace of the vector space T and let x ∈ T. Then there exists a unique decomposition x = u + v such that u ∈ S and v ∈ S^⊥. The vector u is called the orthogonal projection of x on the vector space S.
Proof. If x ∈ S, then x = x + 0 is the required decomposition. Otherwise, let x₁, …, xₘ be a basis for S and apply the Gram–Schmidt process to x₁, …, xₘ, x to obtain the pairwise orthogonal vectors y₁, …, yₘ, v. Since v is orthogonal to each yᵢ, and the linear span of y₁, …, yₘ equals that of x₁, …, xₘ, we have v ∈ S^⊥. Moreover, by the Gram–Schmidt construction, x − v is a linear combination of y₁, …, yₘ, so x − v ∈ S. Thus x = (x − v) + v is the required decomposition. It remains to show uniqueness.
If x = u₁ + v₁ = u₂ + v₂ are two decompositions satisfying u₁, u₂ ∈ S and v₁, v₂ ∈ S^⊥, then u₁ − u₂ = v₂ − v₁.
Since ⟨u₁ − u₂, v₁ − v₂⟩ = 0, it follows from the preceding equation that ⟨u₁ − u₂, u₁ − u₂⟩ = 0. Then u₁ − u₂ = 0, and hence u₁ = u₂. It easily follows that v₁ = v₂.
Thus the decomposition is unique. □
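Computationally, the projection u can be obtained by solving a least squares problem, as in the following numpy sketch (our example; the matrix X and vector x are arbitrary choices):

```python
import numpy as np

# S = column space of X; decompose x = u + v with u in S, v in S-perp
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
x = np.array([2.0, 1.0, 0.0])

# least squares gives the coefficients of the projection u onto C(X)
c, *_ = np.linalg.lstsq(X, x, rcond=None)
u = X @ c
v = x - u
assert np.allclose(X.T @ v, 0.0)   # v is orthogonal to every column of X
assert np.allclose(u + v, x)       # the decomposition recovers x
```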
2.7 Let W be a subset of the vector space T and let S be the linear span of W. Then dim(S) + dim(W^⊥) = dim(T).
Proof. Suppose dim(S) = m, dim(W^⊥) = n and dim(T) = p. Let x₁, …, xₘ and y₁, …, yₙ be bases for S, W^⊥ respectively. Suppose c₁x₁ + ··· + cₘxₘ + d₁y₁ + ··· + dₙyₙ = 0.
Let u = c₁x₁ + ··· + cₘxₘ and v = d₁y₁ + ··· + dₙyₙ. Since ⟨xᵢ, yⱼ⟩ = 0 for each i, j, the vectors u and v are orthogonal. However u + v = 0, and hence u = v = 0.
It follows that cᵢ = 0, dⱼ = 0 for each i, j, and hence x₁, …, xₘ, y₁, …, yₙ is a linearly independent set. Therefore m + n ≤ p. If m + n < p, then there exists a vector z ∈ T such that x₁, …, xₘ, y₁, …, yₙ, z is a linearly independent set. Let M be the linear span of x₁, …, xₘ, y₁, …, yₙ. By 2.6, there is a decomposition z = u + v with u ∈ M and v ∈ M^⊥. Since v is orthogonal to each xᵢ, we have v ∈ W^⊥. But v is also orthogonal to each yᵢ, and since y₁, …, yₙ is a basis for W^⊥, it follows that ⟨v, v⟩ = 0 and v = 0. Then z = u ∈ M, contradicting the linear independence of x₁, …, xₘ, y₁, …, yₙ, z. Therefore m + n = p. □
The proof of the next result is left as an exercise.
2.8 If S₁ ⊂ S₂ ⊂ T are vector spaces, then: (i) (S₂)^⊥ ⊂ (S₁)^⊥; (ii) ((S₁)^⊥)^⊥ = S₁.
Let A be an m×n matrix. The set of all vectors x ∈ \( \mathbb{R}^n \) satisfying Ax = 0 is easily seen to be a subspace of \( \mathbb{R}^n \); it is called the null space of A and is denoted N(A).
2.9 N(A) = C(A′)^⊥.
Proof. If x ∈ N(A), then Ax = 0 and hence y′Ax = 0 for all y ∈ \( \mathbb{R}^m \). Thus x is orthogonal to every vector in C(A′). Conversely, if x ∈ C(A′)^⊥, then x is orthogonal to every column of A′, and therefore Ax = 0. □
2.10 Let A be an m×n matrix of rank r. Then dim(N(A)) = n − r.
Proof. By 2.9 and 2.7, dim N(A) = dim C(A′)^⊥ = n − dim C(A′) = n − r. □
The dimension of the null space of A is called the nullity of A. Thus 2.10 says that the rank plus the nullity equals the number of columns; this is the "rank plus nullity" theorem.
Nonsingularity
A system of m linear equations in n unknowns can be written as the matrix equation Ax = b, where A is the m×n matrix of coefficients. The equation is said to be consistent if it has at least one solution; otherwise it is inconsistent. The equation is homogeneous if b = 0. The set of solutions of the homogeneous equation Ax = 0 is precisely the null space N(A).
If Ax = b is consistent, then b = x₁a₁ + ··· + xₙaₙ for some x₁, …, xₙ, where a₁, …, aₙ are the columns of A; thus b ∈ C(A). Conversely, if b ∈ C(A), then Ax = b is consistent. If the equation is consistent and x₀ is a solution, then the set of all solutions is given by {x₀ + x : x ∈ N(A)}.
Clearly, the equation Ax = b has either no solution, a unique solution, or infinitely many solutions.
A matrix A of order n×n is said to be nonsingular if rank A = n; otherwise the matrix is singular.
2.11 Let A be an n×n matrix. Then the following conditions are equivalent:
(i) A is nonsingular, i.e., rank A = n.
(ii) For any b ∈ \( \mathbb{R}^n \), Ax = b has a unique solution.
(iii) There exists a unique matrix B such that AB = BA = I.
Proof. (i) ⇒ (ii): Since rank A = n, we have C(A) = \( \mathbb{R}^n \), and therefore Ax = b has a solution. If Ax = b and Ay = b, then A(x − y) = 0. By 2.10, dim(N(A)) = 0, and therefore x = y. This proves the uniqueness.
(ii) ⇒ (iii): By (ii), Ax = eᵢ has a unique solution, say bᵢ, where eᵢ is the i-th column of the identity matrix. Then B = (b₁, …, bₙ) is a unique matrix satisfying AB = I. Applying the same argument to A′, we conclude the existence of a unique matrix C such that CA = I. Now B = (CA)B = C(AB) = C.
(iii) ⇒ (i): Suppose (iii) holds. Then any x ∈ \( \mathbb{R}^n \) can be expressed as x = A(Bx), and hence C(A) = \( \mathbb{R}^n \). Thus rank A, which by definition is dim(C(A)), must be n. □
The matrix B of (iii) of 2.11 is called the inverse of A and is denoted by A^{−1}.
If A, B are nonsingular n×n matrices, then (AB)(B^{−1}A^{−1}) = I, and it follows that (AB)^{−1} = B^{−1}A^{−1}; in particular, the product of nonsingular matrices is nonsingular. For an n×n matrix A, let A_{ij} denote the submatrix obtained by deleting row i and column j of A. The cofactor of a_ij is defined as (−1)^{i+j}|A_{ij}|. The adjoint of A, denoted adj A, is the n×n matrix whose (i, j)-entry is the cofactor of a_ji.
Frobenius Inequality
From the theory of determinants we have \(\sum_{j=1}^{n} a_{ij} (-1)^{i+j} |A_{ij}| = |A|\), and for i ≠ k, \(\sum_{j=1}^{n} a_{ij} (-1)^{k+j} |A_{kj}| = 0\).
These equations can be interpreted as
A(adj A) = (adj A)A = |A| I.
Thus if |A| ≠ 0, then A^{−1} exists and
\(A^{-1} = \frac{1}{|A|} \operatorname{adj} A.\)
2.12 A square matrix A is nonsingular if and only if |A| ≠ 0. Indeed, if |A| ≠ 0, then A^{−1} exists by the formula above, while if A is nonsingular, then |A||A^{−1}| = |AA^{−1}| = |I| = 1, and hence |A| ≠ 0.
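The cofactor formula translates directly into code. The sketch below (ours; inefficient but direct, since it calls a determinant routine for every minor) builds adj A and checks both identities:

```python
import numpy as np

def adjugate(A):
    """adj A: the (i, j)-entry is the cofactor of a_{ji}."""
    n = A.shape[0]
    adj = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            # minor A_{ji}: delete row j and column i
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

A = np.array([[2.0, 1.0], [5.0, 3.0]])
assert np.allclose(A @ adjugate(A), np.linalg.det(A) * np.eye(2))
assert np.allclose(np.linalg.inv(A), adjugate(A) / np.linalg.det(A))
```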
An r×r minor of a matrix A is defined to be the determinant of an r×r submatrix of A.
Let A be an m×n matrix of rank r, and let s > r. Consider any s×s minor of A, formed by the entries in rows i₁, …, i_s and columns j₁, …, j_s, say. Since s > r, these s columns of A are linearly dependent, and hence the minor is zero.
Since rank A = r, A has r linearly independent rows, say rows i₁, …, i_r. The submatrix B formed by these rows has rank r, and hence column rank r, so there is an r×r submatrix C of B (and of A) of rank r. By 2.12, C has a nonzero determinant.
Thus the rank of A equals r precisely when some r×r minor of A is nonzero and every s×s minor with s > r is zero. As before, rank A = 0 if and only if A is the zero matrix.
2.13 Let B be an m×r matrix of rank r. Then there exists a matrix X (called a left inverse of B) such that XB = I.
Proof. If m = r, then B is nonsingular and X = B^{−1} works, so suppose m > r. The columns of B are linearly independent, so we can find a set of m − r vectors which, together with the columns of B, form a basis for \( \mathbb{R}^m \). In other words, we can find a matrix U of order m×(m−r) such that [B, U] is nonsingular. Let the inverse of [B, U] be partitioned as \(\begin{bmatrix} X \\ V \end{bmatrix}\), where X is r×m. Since \(\begin{bmatrix} X \\ V \end{bmatrix} [B, U] = I\), we have XB = I. □
Similarly, if C is an r×n matrix of rank r, then C has a right inverse, i.e., a matrix Y such that CY = I. Note that a left inverse or a right inverse is not unique, unless the matrix is square and nonsingular.
2.14 Let B be an m×r matrix of rank r. Then there exists a nonsingular matrix P such that \(PB = \begin{bmatrix} I \\ 0 \end{bmatrix}\).
Proof. The proof is the same as that of 2.13. If we set \(P = \begin{bmatrix} X \\ V \end{bmatrix}\), then, since XB = I and VB = 0, P satisfies the required condition. □
Similarly, if C is r×n of rank r, then there exists a nonsingular matrix Q such that CQ = [I, 0]. These two results and the rank factorization (see 2.3) immediately lead to the following.
2.15 Let A be an m×n matrix of rank r. Then there exist nonsingular matrices P, Q such that
\(A = P \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} Q.\)
Rank is not affected upon multiplying by a nonsingular matrix. For, if A is m×n and P is nonsingular of order m, then
rank A = rank(P^{−1}(PA)) ≤ rank(PA) ≤ rank A.
Hence rank(PA) = rank A. A similar result holds for post-multiplication by a nonsingular matrix.
2.16 If A is an n×n matrix of rank r, then there exists an n×n matrix Z of rank n − r such that A + Z is nonsingular.
Proof. By 2.15 there exist nonsingular matrices P, Q such that \(A = P \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} Q\). Set \(Z = P \begin{bmatrix} 0 & 0 \\ 0 & I_{n-r} \end{bmatrix} Q\). Then rank Z = n − r, and A + Z = PQ, which is nonsingular. □
Observe that 2.16 may also be proved using rank factorization; we leave this as an exercise.
2.17 The Frobenius Inequality. Let A, B be n×n matrices. Then rank(AB) ≥ rank A + rank B − n.
Proof. By 2.16 there exists a matrix Z of rank n − rank A such that A + Z is nonsingular. We have
rank B = rank((A + Z)B) = rank(AB + ZB)
≤ rank(AB) + rank(ZB) (by 2.4)
≤ rank(AB) + rank Z = rank(AB) + n − rank A.
Hence rank(AB) ≥ rank A + rank B − n. □
Exercises

1. Find the rank of the following matrix for each real number α:
2. Let {x₁, …, x_p}, {y₁, …, y_q} be linearly independent sets in \( \mathbb{R}^n \), where p < q ≤ n. Show that there exists i ∈ {1, …, q} such that {x₁, …, x_p, yᵢ} is linearly independent.
3. Let X = {x₁, …, xₙ}, Y = {y₁, …, yₙ} be bases for \( \mathbb{R}^n \), and let S ⊂ X be a set of cardinality r, 1 ≤ r ≤ n. Show that there exists T ⊂ Y of cardinality r such that (X\S) ∪ T is a basis for \( \mathbb{R}^n \).
4. Let A be an m×n matrix and let B be obtained by changing any k entries of A. Show that rank A − k ≤ rank B ≤ rank A + k.
5. Let A, B, C be n×n matrices. Is it always true that rank(ABC) ≤ rank(AC)?
6. Find two different rank factorizations of the matrix
7. Let A, B, C, D be n×n matrices such that the matrix \(\begin{bmatrix} A & B \\ C & D \end{bmatrix}\) has rank n. Show that |AD| = |BC|.
8. Which of the following functions define an inner product on \( \mathbb{R}^3 \)?
9. Find the orthogonal projection of [2, 1, 0]′ on the space spanned by [1, −1, 1]′,
10. The following vectors form a basis for \( \mathbb{R}^3 \). Use the Gram–Schmidt procedure to convert it into an orthonormal basis: x = (2, 3, −1)′,
11. Let A be an n×n matrix. Show that A is nonsingular if and only if Ax = 0 has no nonzero solution.
12. Let A be a nonsingular matrix, let B = A^{−1}, and suppose A, B are conformally partitioned. Then, assuming the inverses exist, show that
13. Let A be an n×n matrix and let b ∈ \( \mathbb{R}^n \). Show that A is nonsingular if and only if Ax = b has a unique solution.
14. Let A be an n×n matrix with only integer entries. Show that A^{−1} exists and has only integer entries if and only if |A| = ±1.
15. Compute the inverses of the following matrices:
16. Let A, B be matrices of order 9×7 and 4×3 respectively. Show that there exists a nonzero 7×4 matrix X such that AXB = 0.
17. Let A, X, B be n×n matrices. Prove the following generalization of the Frobenius Inequality: rank(AXB) ≥ rank(AX) + rank(XB) − rank(X).
18. Let A, B, C, D be n×n matrices such that A is nonsingular, and suppose AC = CA. Then show that \(\begin{vmatrix} A & B \\ C & D \end{vmatrix} = |AD - CB|\).
19. Let P be an orthogonal matrix and let Q be obtained by deleting the first row and column of P. Show that p₁₁ and |Q| are equal in absolute value.
Eigenvalues and Positive Definite Matrices
Preliminaries
Let A be an n×n matrix. The determinant |A − λI| is a polynomial in the (complex) variable λ of degree n and is called the characteristic polynomial of A. The equation |A − λI| = 0 is called the characteristic equation of A. By the Fundamental Theorem of Algebra the equation has n roots, and these roots are called the eigenvalues of A.
The eigenvalues may not all be distinct. The number of times an eigenvalue occurs as a root of the characteristic equation is called the algebraic multiplicity of the eigenvalue.
We factor the characteristic polynomial as
|A − λI| = (λ₁ − λ)(λ₂ − λ) ··· (λₙ − λ).
Setting λ = 0 shows that |A| is the product of the eigenvalues of A. Comparing the coefficients of λ^{n−1} on both sides shows that trace A equals the sum of the eigenvalues.
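A quick numerical confirmation of these two identities, using an arbitrary random matrix (the eigenvalues are complex in general, which is why np.isclose is used):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
eigvals = np.linalg.eigvals(A)  # complex in general

assert np.isclose(np.prod(eigvals), np.linalg.det(A))   # |A| = product of eigenvalues
assert np.isclose(np.sum(eigvals), np.trace(A))         # trace A = sum of eigenvalues
```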
A principal submatrix of a square matrix is a submatrix formed by a set of rows and the corresponding set of columns. A principal minor of A is the determinant of a principal submatrix.
A square matrix A is called symmetric if A = A′. An n×n matrix A is said to be positive definite if it is symmetric and if for any nonzero vector x, x′Ax > 0.
A symmetric n×n matrix A is said to be positive semidefinite if x′Ax ≥ 0 for all x ∈ \( \mathbb{R}^n \).
The identity matrix is clearly positive definite and so is a diagonal matrix with only positive entries along the diagonal.
3.1 If A is positive definite, then it is nonsingular.
Proof. If Ax = 0, then x′Ax = 0, and since A is positive definite, x = 0. Therefore Ax = 0 has no nonzero solution, and A is nonsingular. □
The next result is obvious from the definition.
3.2 If A, B are positive definite and if α ≥ 0, β ≥ 0 with α + β > 0, then αA + βB is positive definite.
3.3 If A is positive definite, then |A| > 0.
Proof. For 0 ≤ α ≤ 1, let f(α) = |αA + (1 − α)I|. By 3.2, αA + (1 − α)I is positive definite, and therefore by 3.1, f(α) ≠ 0, 0 ≤ α ≤ 1. Clearly f(0) = 1, and since f is continuous, f(1) = |A| > 0. □
3.4 If A is positive definite, then any principal submatrix of A is positive definite.
Proof. Since A is positive definite, x′Ax > 0 for every nonzero x. Applying this to nonzero vectors x whose coordinates j₁, …, j_s are zero, the expression x′Ax reduces to y′By, where B is the principal submatrix of A obtained by deleting rows and columns j₁, …, j_s, and y is x with those coordinates removed. Hence B, and thus any principal submatrix of A, is positive definite. □
The Spectral Theorem
3.5 If A is a symmetric matrix, then the eigenvalues of A are all real.
Proof. Suppose μ is an eigenvalue of A and let μ = α + iβ, where α, β are real and i = √−1. Then
|(A − αI) − iβI| = 0. (3.1)
Taking the complex conjugate of (3.1) and multiplying the two, we get
|(A − αI) − iβI| |(A − αI) + iβI| = |(A − αI)² + β²I| = 0. (3.2)
Since A is symmetric, (A − αI)² is positive semidefinite. Consequently, if β ≠ 0, then (A − αI)² + β²I is positive definite, and by 3.3 its determinant is positive, contradicting (3.2). Therefore β = 0 and μ is real. □
If A is a symmetric n×n matrix, we will denote the eigenvalues of A by λ₁(A) ≥ ··· ≥ λₙ(A), and occasionally by λ₁ ≥ ··· ≥ λₙ if there is no possibility of confusion.
Let A be a symmetric n×n matrix. Then for any i, |A − λᵢI| = 0, and therefore A − λᵢI is singular; its null space has dimension at least one. This null space is called the eigenspace of A corresponding to λᵢ, and any nonzero vector in the eigenspace is called an eigenvector of A corresponding to λᵢ. The dimension of the null space is called the geometric multiplicity of λᵢ.
3.6 Let A be a symmetric n×n matrix, let λ ≠ μ be eigenvalues of A with x, y as corresponding eigenvectors respectively. Then x′y = 0.
Proof. We have Ax = λx and Ay = μy. Therefore y′Ax = y′(Ax) = λ y′x. Also, y′Ax = (y′A)x = μ y′x. Thus λ y′x = μ y′x. Since λ ≠ μ, it follows that x′y = 0. □
A square matrix P is said to be orthogonal if P is nonsingular and P^{−1} = P′, i.e., if PP′ = P′P = I. Thus an n×n matrix is orthogonal precisely when its rows (equivalently, its columns) form an orthonormal basis for \( \mathbb{R}^n \).
The identity matrix is an example of an orthogonal matrix. A matrix obtained from the identity matrix by permuting its rows (and/or columns) is called a permutation matrix and is orthogonal as well. The product of orthogonal matrices is also orthogonal.
3.7 The Spectral Theorem. Let A be a symmetric n×n matrix. Then there exists an orthogonal matrix P such that
P′AP = diag(λ₁, …, λₙ). (3.3)
Proof. The result is evident for n = 1. Assume the result for matrices of order n − 1; we proceed by induction. Let x be an eigenvector of A corresponding to λ₁, normalized so that ‖x‖ = 1. Construct an orthogonal matrix Q with x as its first column; this can be done by extending x to a basis for \( \mathbb{R}^n \) and applying the Gram–Schmidt process. Then
\(Q'AQ = \begin{bmatrix} \lambda_1 & 0' \\ 0 & B \end{bmatrix}\)
for some matrix B of order n − 1. The eigenvalues of Q′AQ are also λ₁, …, λₙ, and hence the eigenvalues of B are λ₂, …, λₙ. Clearly B is symmetric, since Q′AQ is. By the induction assumption there exists an orthogonal matrix R such that R′BR = diag(λ₂, …, λₙ). Setting
\(P = Q \begin{bmatrix} 1 & 0' \\ 0 & R \end{bmatrix},\)
P is orthogonal and P′AP = diag(λ₁, …, λₙ). □
Suppose the matrix P in 3.7 has columns x₁, …, xₙ. Then, since AP = P diag(λ₁, …, λₙ), we have Axᵢ = λᵢxᵢ. In other words, xᵢ is an eigenvector of A corresponding to λᵢ. Another way of writing (3.3) is
\(A = \lambda_1 x_1 x_1' + \cdots + \lambda_n x_n x_n'.\)
This is known as the spectral decomposition ofA.
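In numpy, numpy.linalg.eigh computes exactly the ingredients of (3.3) for a symmetric matrix; the sketch below (our example) reconstructs A from its spectral decomposition:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2.0                       # a symmetric matrix

lam, P = np.linalg.eigh(A)                # eigenvalues and orthonormal eigenvectors
assert np.allclose(P.T @ A @ P, np.diag(lam))    # P'AP = diag(lambda), as in (3.3)

# spectral decomposition: A = sum_i lambda_i x_i x_i'
recon = sum(lam[i] * np.outer(P[:, i], P[:, i]) for i in range(4))
assert np.allclose(A, recon)
```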
3.8 Let A be a symmetric n×n matrix. Then A is positive definite if and only if the eigenvalues of A are all positive.
Proof. By the Spectral Theorem, P′AP = diag(λ₁, …, λₙ) for an orthogonal matrix P. The result follows from the fact that A is positive definite if and only if P′AP is positive definite. □
Similarly, a symmetric matrix is positive semidefinite if and only if its eigenvalues are all nonnegative.
3.9 If A is positive semidefinite, then there exists a unique positive semidefinite matrix B such that B² = A. The matrix B is called the square root of A and is denoted by A^{1/2}.
Proof. There exists an orthogonal matrix P such that (3.3) holds. Since A is positive semidefinite, λᵢ ≥ 0, i = 1, …, n. Set
\(B = P \operatorname{diag}(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n}) P'.\)
Then B is positive semidefinite and B² = A.
To prove the uniqueness we must show that if B, C are positive semidefinite matrices satisfying A = B² = C², then B = C. Let D = B − C. By the Spectral Theorem, there exists an orthogonal matrix Q such that Z = QDQ′ is a diagonal matrix. Let E = QBQ′, F = QCQ′; it will be sufficient to show that E = F. Since Z = E − F is a diagonal matrix, e_ij = f_ij for all i ≠ j. Also, E² = QB²Q′ = QAQ′ = QC²Q′ = F², and hence
EZ + ZF = E(E − F) + (E − F)F = E² − F² = 0.
Looking at the (i, i)-entry, z_ii(e_ii + f_ii) = 0. If z_ii = 0, then e_ii = f_ii. If z_ii ≠ 0, then e_ii + f_ii = 0, and since E and F are positive semidefinite, e_ii ≥ 0 and f_ii ≥ 0; consequently e_ii = f_ii = 0. Thus e_ii = f_ii for i = 1, …, n, thereby completing the proof. □
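The construction in the existence part of the proof is easy to implement; the following sketch (ours, with a clip to absorb round-off in the computed eigenvalues) returns A^{1/2} for a positive semidefinite A:

```python
import numpy as np

def psd_sqrt(A):
    """Square root of a positive semidefinite symmetric matrix via the
    Spectral Theorem: A = P diag(lam) P'  =>  A^{1/2} = P diag(sqrt(lam)) P'."""
    lam, P = np.linalg.eigh(A)
    lam = np.clip(lam, 0.0, None)        # clear tiny negative round-off
    return P @ np.diag(np.sqrt(lam)) @ P.T

M = np.array([[2.0, 1.0], [1.0, 2.0]])
B = psd_sqrt(M)
assert np.allclose(B @ B, M)             # B squares back to M
assert np.allclose(B, B.T)               # B is symmetric
```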
A square matrix A is said to be idempotent if A² = A.
3.10 If A is idempotent, then each eigenvalue of A is either 0 or 1.
Proof. Let λ₁, …, λₙ be the eigenvalues of A. Then λ₁², …, λₙ² are the eigenvalues of A² (see Exercise 7). Since A = A², {λ₁², …, λₙ²} = {λ₁, …, λₙ}, and it follows that λᵢ = 0 or 1 for each i. □
Conversely, if A is symmetric and if each eigenvalue of A is 0 or 1, then A is idempotent. This follows by an application of the Spectral Theorem.
We say that a matrix has full row (or column) rank if its rank equals the number of rows (or columns).
3.11 If A is idempotent, then rank A = trace A.
Proof. Let A = BC be a rank factorization. Since B has full column rank, it admits a left inverse by 2.13; similarly, C admits a right inverse. Let B_ℓ^−, C_r^− be a left inverse and a right inverse of B, C respectively. Then A² = A implies BCBC = BC, and hence
B_ℓ^− BCBC C_r^− = B_ℓ^− BC C_r^−, i.e., CB = I,
where the order of the identity matrix is the same as rank A. Thus trace A = trace BC = trace CB = trace I = rank A. □
Schur Complement

If A is positive definite then, since every principal submatrix of A is positive definite by 3.4, all principal minors of A are positive by 3.3. We now prove the converse. We first state a result which can be proved by repeatedly expanding the determinant along a column; we omit the proof.
3.12 Let A be an n×n matrix. Then for any μ,
\(|A + \mu I| = \sum_{i=0}^{n} \mu^{n-i} s_i,\) (3.4)
where sᵢ is the sum of all i×i principal minors of A. We set s₀ = 1. Note that sₙ = |A|.
Now suppose A is a symmetric n×n matrix all of whose principal minors are positive. Then sᵢ > 0 for each i, and by (3.4), |A + μI| > 0 for any μ ≥ 0. Hence A cannot have an eigenvalue λ ≤ 0, for otherwise μ = −λ ≥ 0 would give |A + μI| = |A − λI| = 0. By 3.8, A is positive definite. Combined with the observations above, this proves the following result.
3.13 Let A be a symmetric n×n matrix. Then A is positive definite if and only if all the principal minors of A are positive.
Similarly, a symmetric matrix is positive semidefinite if and only if all its principal minors are nonnegative.
Let A be a symmetric matrix which is partitioned as
\(A = \begin{bmatrix} B & C \\ C' & D \end{bmatrix},\) (3.5)
where B, D are square matrices. If B is nonsingular, then the Schur complement of B in A is defined to be the matrix D − C′B^{−1}C. Similarly, if D is nonsingular, then the Schur complement of D in A is B − CD^{−1}C′.
Let B in (3.5) be nonsingular and let X = −C′B^{−1}. The following identity can be verified by simple matrix multiplication:
\(\begin{bmatrix} I & 0 \\ X & I \end{bmatrix} \begin{bmatrix} B & C \\ C' & D \end{bmatrix} \begin{bmatrix} I & X' \\ 0 & I \end{bmatrix} = \begin{bmatrix} B & 0 \\ 0 & D - C'B^{-1}C \end{bmatrix}.\) (3.6)
Several useful facts can be proved using (3.6).
3.14 The following assertions are true:
(i) If A is positive definite, then D − C′B^{−1}C is positive definite.
(ii) Let A be symmetric. If a principal submatrix of A and its Schur complement in A are positive definite, then A is positive definite.
(iii) If B is nonsingular, then |A| = |B| |D − C′B^{−1}C|.
Proof. (i) Clearly, if A is positive definite then SAS′ is positive definite for any nonsingular S. If
\(S = \begin{bmatrix} I & 0 \\ X & I \end{bmatrix},\)
where, as before, X = −C′B^{−1}, then |S| = 1 and hence S is nonsingular. Thus SAS′ is positive definite (see (3.6)), and since D − C′B^{−1}C is a principal submatrix of SAS′, it is positive definite.
(ii) If B and D − C′B^{−1}C are positive definite, then the right-hand side of (3.6) is positive definite, and hence so is A, since S, as defined previously, is nonsingular.
(iii) This is immediate by taking the determinant of both sides in (3.6). □
In (3.5), suppose A is n×n and B is (n−1)×(n−1). Then C is a column vector, say c, and D is a scalar, say d. Let us rewrite (3.5) as
\(A = \begin{bmatrix} B & c \\ c' & d \end{bmatrix}.\)
The Schur complement of B in A is d − c′B^{−1}c, which is a scalar. By (iii) of 3.14, d − c′B^{−1}c = |A| / |B|.
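A short numpy check of 3.14 (i) and (iii) on a randomly generated positive definite matrix (the 3 + 2 partition is our choice):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5.0 * np.eye(5)              # a positive definite matrix
B, C, D = A[:3, :3], A[:3, 3:], A[3:, 3:]  # A = [[B, C], [C', D]]

S = D - C.T @ np.linalg.inv(B) @ C         # Schur complement of B in A
assert np.all(np.linalg.eigvalsh(S) > 0)   # positive definite, as in 3.14(i)
assert np.isclose(np.linalg.det(A),
                  np.linalg.det(B) * np.linalg.det(S))  # 3.14(iii)
```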
A leading principal submatrix of an n×n matrix is a submatrix formed by the first k rows and the first k columns, 1 ≤ k ≤ n; its determinant is called a leading principal minor. There is a characterization of positive definite matrices in terms of leading principal minors, which we now prove.
3.15 Let A be a symmetric n×n matrix. Then A is positive definite if and only if all the leading principal minors of A are positive.
Proof. If A is positive definite, then all its principal minors, and in particular the leading ones, are positive by 3.13. We prove the converse by induction on n; the case n = 1 is obvious. Assume the result for (n−1)×(n−1) matrices and partition A as above, so that B is (n−1)×(n−1). Every leading principal minor of B is a leading principal minor of A and hence positive, so by the induction hypothesis B is positive definite. Since |A| > 0 and |B| > 0, the Schur complement d − c′B^{−1}c = |A|/|B| > 0. Hence, by (ii) of 3.14, A is positive definite, thereby completing the proof. □
Let A, B be n×n matrices. We write A ≥ B to denote the fact that A, B and A − B are all positive semidefinite matrices. We write A ≤ B if B ≥ A.
3.16 Let A, B be positive definite matrices such that A ≥ B. Then A^{−1} ≤ B^{−1}.
Proof. First consider the case B = I. If A ≥ I, then A − I is positive semidefinite, so all the eigenvalues of A are at least 1. Consequently all the eigenvalues of A^{−1} are at most 1, and therefore I − A^{−1} is positive semidefinite, i.e., A^{−1} ≤ I. In the general case, A ≥ B implies B^{−1/2}AB^{−1/2} ≥ I, and now the first part can be used to complete the proof: I ≥ (B^{−1/2}AB^{−1/2})^{−1} = B^{1/2}A^{−1}B^{1/2}, and hence B^{−1} ≥ A^{−1}. □
Exercises

1. Let A be a symmetric n×n matrix such that the sum of the entries in each row equals α. Show that α is an eigenvalue of A. If the eigenvalues of A are α, α₂, …, αₙ, what are the eigenvalues of A + βJ, where β is a real number and J is the n×n matrix of all ones?
2. Find the eigenvalues of the n×n matrix with all diagonal entries equal to a and all off-diagonal entries equal to b.
3. If A is a symmetric matrix, show that the algebraic multiplicity of any eigenvalue of A equals its geometric multiplicity.
4. Let A be a symmetric, nonsingular matrix. Show that A is positive definite if and only if A^{−1} is positive definite.
5. Let A be an n×n positive definite matrix and let x ∈ \( \mathbb{R}^n \) with ‖x‖ = 1. Show that (x′Ax)(x′A^{−1}x) ≥ 1.
6. Let A be a symmetric matrix. If every leading principal minor of A is nonnegative, can we conclude that A is positive semidefinite?
7. If A has eigenvalues λ₁, …, λₙ, then show that A² has eigenvalues λ₁², …, λₙ².
8. If A, B are n×n matrices, then show that AB and BA have the same eigenvalues.
9. Let A, B be matrices of order m×n, n×m respectively. Consider the identity
Now obtain a relationship between the characteristic polynomials of AB and BA. Conclude that the nonzero eigenvalues of AB and BA are the same.
10. If S is a nonsingular matrix, then show that A and S^{−1}AS have the same eigenvalues.
11. Suppose A is an n×n matrix and let
|A − λI| = c₀ − c₁λ + c₂λ² − ··· + cₙ(−1)ⁿλⁿ
be the characteristic polynomial of A. The Cayley–Hamilton Theorem asserts that A satisfies its characteristic equation, i.e.,
c₀I − c₁A + c₂A² − ··· + cₙ(−1)ⁿAⁿ = 0.
Prove the theorem for a diagonal matrix. Then prove the theorem for any symmetric matrix.
12. Prove the following: If A = B′B for some matrix B, then A is positive semidefinite. Further, A is positive definite if B has full column rank.
13. If A is positive semidefinite, prove the following: (i) |A| ≥ 0. (ii) All principal minors of A are nonnegative. (iii) A is positive definite if and only if it is nonsingular.
14. Let A be a square matrix such that A + A′ is positive definite. Then prove that A is nonsingular.
15. If A is symmetric, then show that rank A equals the number of nonzero eigenvalues of A, counting multiplicity.
16. Let A have eigenvalues λ₁, …, λₙ and let 1 ≤ k ≤ n. Show that \(\sum_{i_1 < \cdots < i_k} \lambda_{i_1} \cdots \lambda_{i_k}\) equals s_k, the sum of the k×k principal minors of A.
22. Let A be an n×n matrix with n > 1, and suppose a_ij ≤ 0 for all i ≠ j. Let B be the Schur complement of a₁₁ in A. Show that b_ij ≤ 0 for all i ≠ j.
23. Let A be an n×n matrix, not necessarily symmetric, and suppose all principal minors of A are positive. Show that any real eigenvalue of A must be positive.
24. Let A, B be n×n positive semidefinite matrices and let C be the matrix with its (i, j)-entry given by c_ij = a_ij b_ij, i, j = 1, …, n. Show that C is positive semidefinite.
25. Let x₁, …, xₙ be positive numbers. Show that the n×n matrix with its (i, j)-entry equal to 1/(xᵢ + xⱼ), i, j = 1, …, n, is positive semidefinite.
26. Let X, Y be n×n symmetric matrices such that X is positive definite and XY + YX is positive semidefinite. Show that Y must be positive semidefinite.
27. Let A, B be n×n matrices such that A ≥ B. Show that A^{1/2} ≥ B^{1/2}.
Generalized Inverses

Preliminaries
Let A be an m×n matrix. A matrix G of order n×m is said to be a generalized inverse (or a g-inverse) of A if AGA = A.
If A is square and nonsingular, then A^{−1} is the unique g-inverse of A. Otherwise A has infinitely many g-inverses, as we will see shortly.
4.1 Let A, G be matrices of order m×n and n×m respectively. Then the following conditions are equivalent:
(i) G is a g-inverse of A.
(ii) For any y ∈ C(A), x = Gy is a solution of Ax = y.
Proof. (i) ⇒ (ii): Any y ∈ C(A) is of the form y = Az for some z. Then A(Gy) = AGAz = Az = y.
(ii) ⇒ (i): For any y ∈ C(A) we have AGy = y, and hence AGAz = Az for all z. Taking z to be the i-th column of the identity matrix shows that the i-th columns of AGA and A coincide; hence AGA = A. □
Let A = BC be a rank factorization. We have seen that B admits a left inverse B_ℓ^− and C admits a right inverse C_r^−. Then G = C_r^− B_ℓ^− is a g-inverse of A, since
AGA = BC C_r^− B_ℓ^− BC = BC = A.
Alternatively, if A has rank r, then by 2.15 there exist nonsingular matrices P, Q such that
\(A = P \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} Q.\)
It can be verified that for any U, V, W of appropriate dimensions,
\(\begin{bmatrix} I_r & U \\ V & W \end{bmatrix}\)
is a g-inverse of \(\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}\), and hence \(Q^{-1} \begin{bmatrix} I_r & U \\ V & W \end{bmatrix} P^{-1}\) is a g-inverse of A. This also shows that any matrix which is not a square, nonsingular matrix admits infinitely many g-inverses.
Another method, which is particularly suitable for computing a g-inverse, is as follows. Let A be of rank r, and choose any r×r nonsingular submatrix of A. For convenience, let us assume
\(A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},\)
where A₁₁ is r×r and nonsingular. Since A has rank r, there exists a matrix X such that A₁₂ = A₁₁X and A₂₂ = A₂₁X. Now it can be verified that the n×m matrix
\(G = \begin{bmatrix} A_{11}^{-1} & 0 \\ 0 & 0 \end{bmatrix}\)
is a g-inverse of A. (Just multiply AGA out.) We will often use the notation A^− to denote a g-inverse of A.
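The following numpy sketch (our construction) builds a 3×4 matrix of rank 2 in exactly this form and verifies that the resulting G satisfies AGA = A:

```python
import numpy as np

# a 3x4 matrix of rank 2 whose leading 2x2 block A11 is nonsingular
A11 = np.array([[1.0, 2.0], [3.0, 5.0]])
X = np.array([[1.0, 1.0], [0.0, 2.0]])     # so that A12 = A11 X, A22 = A21 X
A21 = np.array([[2.0, 4.0]])
A = np.block([[A11, A11 @ X], [A21, A21 @ X]])

# G has A11^{-1} in the top-left corner and zeros elsewhere (4x3)
G = np.zeros((4, 3))
G[:2, :2] = np.linalg.inv(A11)
assert np.allclose(A @ G @ A, A)           # G is a g-inverse of A
```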
4.2 If G is a g-inverse of A, then rank A = rank(AG) = rank(GA).
Proof. We have rank A = rank(AGA) ≤ rank(AG) ≤ rank A, and hence rank A = rank(AG). The second part follows similarly. □
A g-inverse G of A is called a reflexive g-inverse if it also satisfies GAG = G. Observe that if G is any g-inverse of A, then GAG is a reflexive g-inverse of A.
4.3 Let G be a g-inverse of A. Then rank A ≤ rank G. Furthermore, equality holds if and only if G is reflexive.
Proof. For any g-inverse G we have rank A = rank(AGA) ≤ rank G. If G is reflexive, then rank G = rank(GAG) ≤ rank A, and hence rank A = rank G.
Conversely, suppose rank A = rank G. First observe that C(GA) ⊂ C(G). By 4.2, rank G = rank(GA), and hence C(G) = C(GA). Therefore G = GAX for some X. Now
GAG = GAGAX = GAX = G,
and G is reflexive. □
Minimum Norm and Least Squares g-Inverse
4.4 Let A be an m×n matrix, let G be a g-inverse of A, and let y ∈ C(A). Then the class of solutions of Ax = y is given by Gy + (I − GA)z, where z is arbitrary.
Proof. Since y ∈ C(A), we have y = Av for some v. Now for any z,
A(Gy + (I − GA)z) = AGy + (A − AGA)z = AGy = AGAv = Av = y,
and hence Gy + (I − GA)z is a solution. Conversely, if u is a solution of Ax = y, then set z = u − Gy and verify that u = Gy + (I − GA)z. □
A g-inverse G of A is said to be a minimum norm g-inverse if, in addition to AGA = A, it satisfies (GA)′ = GA. The reason for this terminology will be clear from the next result.
4.5 Let A be an m×n matrix. Then the following conditions are equivalent:
(i) G is a minimum norm g-inverse of A.
(ii) For any y ∈ C(A), x = Gy is a solution of Ax = y with minimum norm.
Proof. (i) ⇒ (ii): In view of 4.4 we must show that
‖Gy‖ ≤ ‖Gy + (I − GA)z‖ (4.1)
for any y ∈ C(A) and for any z. We have
‖Gy + (I − GA)z‖² = ‖Gy‖² + ‖(I − GA)z‖² + 2y′G′(I − GA)z. (4.2)
Since y ∈ C(A), we have y = Au for some u. Hence
y′G′(I − GA)z = u′A′G′(I − GA)z = u′(GA)′(I − GA)z = u′(GA − GAGA)z = 0,
since (GA)′ = GA and GAGA = GA. Inserting this in (4.2) we get (4.1).
(ii) ⇒ (i): Since for any y ∈ C(A), x = Gy is a solution of Ax = y, by 4.1, G is a g-inverse of A. Now we have (4.1) for all z, and therefore for all u, z,
2u′A′G′(I − GA)z + ‖(I − GA)z‖² ≥ 0. (4.3)
Replace u by αu in (4.3). If u′A′G′(I − GA)z ≠ 0, then choosing α large in absolute value and of appropriate sign we get a contradiction. We therefore conclude that u′A′G′(I − GA)z = 0 for all u, z, and hence A′G′(I − GA) = 0. Thus (GA)′ = A′G′ = A′G′GA = (GA)′GA, which is symmetric; hence GA = (GA)′. □
A g-inverse G of A is said to be a least squares g-inverse of A if, in addition to AGA = A, it satisfies (AG)′ = AG.
4.6 Let A be an m×n matrix. Then the following conditions are equivalent:
(i) G is a least squares g-inverse of A.
(ii) For any x, y, ‖AGy − y‖ ≤ ‖Ax − y‖.
Proof. (i) ⇒ (ii): Let x − Gy = w. Then we must show
‖AGy − y‖ ≤ ‖AGy − y + Aw‖. (4.4)
We have
‖AGy − y + Aw‖² = ‖(AG − I)y‖² + ‖Aw‖² + 2w′A′(AG − I)y. (4.5)
But
w′A′(AG − I)y = w′(A′G′A′ − A′)y = w′((AGA)′ − A′)y = 0, (4.6)
since (AG)′ = AG and AGA = A. Inserting (4.6) in (4.5) we get (4.4).
(ii) ⇒ (i): For any vector x, set y = Ax in (ii). Then we see that
‖AGAx − Ax‖ ≤ ‖Ax − Ax‖ = 0,
which implies AGAx = Ax. Since x is arbitrary, AGA = A, and G is a g-inverse of A. The rest of the proof is similar to that of (ii) ⇒ (i) in 4.5 and is left as an exercise. □
Thus if the equation Ax = y is not necessarily consistent and we wish to choose x so that ‖Ax − y‖ is minimized, then we may take x = Gy, where G is any least squares g-inverse of A.
Moore–Penrose Inverse
If G is a reflexive g-inverse of A which is both a minimum norm and a least squares g-inverse, then it is called a Moore–Penrose inverse of A. In other words, G is a Moore–Penrose inverse of A if it satisfies
AGA = A, GAG = G, (AG)′ = AG, (GA)′ = GA. (4.7)
We will show that such a G exists and is unique. To establish uniqueness, suppose G₁, G₂ both satisfy (4.7). Then, using the equations in (4.7) repeatedly,
G₁ = G₁AG₁ = G₁(AG₁)′ = G₁(AG₂AG₁)′ = G₁(AG₁)′(AG₂)′ = G₁AG₁AG₂ = G₁AG₂ = G₁AG₂AG₂ = (G₁A)′(G₂A)′G₂ = (G₂AG₁A)′G₂ = (G₂A)′G₂ = G₂AG₂ = G₂.
We will denote the Moore–Penrose inverse of A by A⁺. We now show the existence. Let A = BC be a rank factorization. Then it can be easily verified that
\(A^{+} = C'(CC')^{-1}(B'B)^{-1}B'\)
satisfies the four conditions in (4.7).
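The formula is easy to verify numerically. In the sketch below (our rank factorization; B has full column rank and C full row rank), the computed G satisfies all four conditions in (4.7) and agrees with numpy.linalg.pinv:

```python
import numpy as np

B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])          # full column rank
C = np.array([[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 1.0, 1.0]])  # full row rank
A = B @ C                                                   # a rank factorization

# A+ = C'(CC')^{-1}(B'B)^{-1}B'
G = C.T @ np.linalg.inv(C @ C.T) @ np.linalg.inv(B.T @ B) @ B.T
assert np.allclose(A @ G @ A, A)          # AGA = A
assert np.allclose(G @ A @ G, G)          # GAG = G
assert np.allclose((A @ G).T, A @ G)      # AG symmetric
assert np.allclose((G @ A).T, G @ A)      # GA symmetric
assert np.allclose(G, np.linalg.pinv(A))  # agrees with numpy's pinv
```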
Exercises

1. Find two different g-inverses of
2. Find a g-inverse of the following matrix such that it does not contain any zero entry:
3. Let A be a matrix and let G be a g-inverse of A. Show that the class of all g-inverses of A is given by G + (I − GA)U + V(I − AG), where U, V are arbitrary.
4. Show that the class of g-inverses of
5. Let A be an m×n matrix, let rank A = r, and let r ≤ k ≤ min{m, n}. Show that A has a g-inverse of rank k. In particular, show that any square matrix has a nonsingular g-inverse.
6. Find the minimum norm solution of the system of equations:
7. Find the Moore–Penrose inverse of
8. Let x be an n×1 vector. Find the g-inverse of x which is closest to the origin.
9. Let X be an n×m matrix and let y ∈ \( \mathbb{R}^n \). Show that the orthogonal projection of y onto C(X) is given by X(X′X)^−X′y for any choice of the g-inverse.
10. For any matrix X, show that X⁺ = (X′X)⁺X′ and X(X′X)^−X′ = XX⁺.
11. Let A be an m×n matrix and let P, Q be matrices of order r×m. Show that PA = QA if and only if PAA′ = QAA′.
12. Let A, G be matrices of order m×n, n×m respectively. Show that G is a minimum norm g-inverse of A if and only if GAA′ = A′.
13. Let A be an m×n matrix of rank r, and suppose A is partitioned as
14. Let A be an m×n matrix of rank r with only integer entries. If there exists an integer linear combination of the r×r minors of A which equals 1, then show that A admits a g-inverse with only integer entries.
Inequalities for Eigenvalues and Singular Values
Eigenvalues of a Symmetric Matrix
We begin by providing a representation for the largest eigenvalue of a symmetric matrix.
5.1 Let A be a symmetric n×n matrix. Then
\(\max_{x \neq 0} \frac{x'Ax}{x'x} = \lambda_1,\)
and the maximum is attained at any eigenvector of A corresponding to λ₁.
Proof. Write A = P diag(λ₁, …, λₙ)P′, where P is orthogonal and where, as usual, λ₁ ≥ ··· ≥ λₙ are the eigenvalues of A. Then for x ≠ 0, setting y = P′x (so that y′y = x′x),
\(\frac{x'Ax}{x'x} = \frac{y' \operatorname{diag}(\lambda_1, \ldots, \lambda_n) y}{y'y} = \frac{\sum_i \lambda_i y_i^2}{\sum_i y_i^2} \le \lambda_1.\)
Clearly, if x is an eigenvector corresponding to λ₁, then x′Ax / x′x = λ₁, and the result is proved. □
5.2 Let $A$, $B$ be $n\times n$ matrices where $A$ is symmetric and $B$ is positive definite. Then
$$\max_{x\neq 0}\frac{x'Ax}{x'Bx} \le \mu,$$
where $\mu$ is the largest eigenvalue of $AB^{-1}$.
Proof. We have
$$\max_{x\neq 0}\frac{x'Ax}{x'Bx} = \max_{x\neq 0}\frac{x'Ax}{(x'B^{1/2})(B^{1/2}x)} = \max_{y\neq 0}\frac{y'B^{-1/2}AB^{-1/2}y}{y'y},$$
where $y = B^{1/2}x$. By 5.1 this last expression equals the maximum eigenvalue of $B^{-1/2}AB^{-1/2}$. However, $B^{-1/2}AB^{-1/2}$ and $AB^{-1}$ have the same eigenvalues and the proof is complete.
In 5.2 the maximum is attained at $x = B^{-1/2}y$, where $y$ is any eigenvector of $B^{-1/2}AB^{-1/2}$ corresponding to the eigenvalue $\mu$.
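The following sketch checks 5.2 and the attainment remark numerically (all test matrices invented for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                      # symmetric
R = rng.standard_normal((4, 4))
B = R @ R.T + 4 * np.eye(4)            # positive definite

# Symmetric inverse square root of B via its spectral decomposition.
w, V = np.linalg.eigh(B)
B_half_inv = V @ np.diag(w ** -0.5) @ V.T

w2, V2 = np.linalg.eigh(B_half_inv @ A @ B_half_inv)
mu, y = w2[-1], V2[:, -1]
x = B_half_inv @ y                     # claimed maximizer

print(np.isclose(x @ A @ x / (x @ B @ x), mu))                             # True
print(np.isclose(mu, np.linalg.eigvals(A @ np.linalg.inv(B)).real.max()))  # True
```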
5.3 Let $B$ be a positive definite $n\times n$ matrix and let $y\in\mathbb{R}^n$. Then
$$\max_{x\neq 0}\frac{(x'y)^2}{x'Bx} = y'B^{-1}y. \qquad(5.1)$$

Proof. By 5.2, applied with $A = yy'$, the maximum of $\frac{(x'y)^2}{x'Bx} = \frac{x'(yy')x}{x'Bx}$ equals the largest eigenvalue of $yy'B^{-1}$. Note that the eigenvalues of $yy'B^{-1}$ and $B^{-1/2}yy'B^{-1/2}$ are the same. The latter matrix is symmetric and has rank one, so it has only one nonzero eigenvalue (counting multiplicity), which must equal its trace, $\mathrm{trace}(B^{-1/2}yy'B^{-1/2}) = y'B^{-1}y$. Thus (5.1) holds for any positive definite matrix $B$. An alternative proof of (5.1) can be given using the Cauchy–Schwarz inequality $(\sum_iu_iv_i)^2 \le (\sum_iu_i^2)(\sum_iv_i^2)$: setting $u = B^{1/2}x$ and $v = B^{-1/2}y$ gives $(x'y)^2 = (u'v)^2 \le (u'u)(v'v) = (x'Bx)(y'B^{-1}y)$.
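A numerical check of (5.1) (a sketch; $B$ and $y$ are invented), with the maximum attained at $x = B^{-1}y$:

```python
import numpy as np

rng = np.random.default_rng(2)
R = rng.standard_normal((4, 4))
B = R @ R.T + 4 * np.eye(4)            # positive definite
y = rng.standard_normal(4)

x_star = np.linalg.solve(B, y)          # x = B^{-1} y
lhs = (x_star @ y) ** 2 / (x_star @ B @ x_star)
print(np.isclose(lhs, y @ np.linalg.solve(B, y)))          # True: equals y'B^{-1}y

x = rng.standard_normal(4)              # any other x does not beat it
print((x @ y) ** 2 / (x @ B @ x) <= lhs + 1e-12)           # True
```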
Singular Values
Let $A$ be an $n\times n$ matrix. The singular values of $A$ are defined to be the eigenvalues of $(AA')^{1/2}$. Since $AA'$ is positive semidefinite, the singular values are nonnegative and we denote them by $\sigma_1(A)\ge\cdots\ge\sigma_n(A)$. If there is no possibility of confusion then we will denote the singular values of $A$ simply by $\sigma_1\ge\cdots\ge\sigma_n$.
The singular values of a rectangular matrix are defined by first augmenting the matrix with zero rows or columns to make it square. Thus if $A$ is an $m\times n$ matrix with $m<n$, we append $n-m$ zero rows to obtain a square matrix $B$ and define the singular values of $A$ to be the singular values of $B$. If $m>n$, we append $m-n$ zero columns instead. We will generally restrict our discussion of singular values to square matrices; the general case needs only minor modifications.
The following assertions are easily verified; we omit the proof.
(i) The singular values of $A$ and $PAQ$ are identical for any orthogonal matrices $P$, $Q$.
(ii) The rank of a matrix equals the number of nonzero singular values of the matrix.
(iii) If $A$ is symmetric then the singular values of $A$ are the absolute values of its eigenvalues. If $A$ is positive semidefinite then the singular values are the same as the eigenvalues.
5.4 The Singular Value Decomposition. Let $A$ be an $n\times n$ matrix. Then there exist orthogonal matrices $P$, $Q$ such that
$$P'AQ = \mathrm{diag}(\sigma_1,\dots,\sigma_n), \qquad(5.2)$$
where $\sigma_1\ge\cdots\ge\sigma_n$ are the singular values of $A$.
Proof. If $A = 0$, then the result is trivial, so we assume $A\neq 0$. Let $x$ be an eigenvector of $AA'$ corresponding to the eigenvalue $\sigma_1^2$, such that $\|x\| = 1$. Thus $AA'x = \sigma_1^2x$ and hence $x'AA'x = \sigma_1^2$, i.e., $\|A'x\| = \sigma_1$. If we set
$$y = \frac{1}{\|A'x\|}A'x,$$
then it follows that $x'Ay = \sigma_1$. We can construct orthogonal matrices $U$, $V$ such that the first row of $U$ is $x'$ and the first column of $V$ is $y$. Then, since $A'x = \sigma_1y$ and $Ay = \sigma_1x$, the first row and the first column of $UAV$ vanish except for the $(1,1)$-entry $x'Ay = \sigma_1$, so that
$$UAV = \begin{bmatrix}\sigma_1 & 0\\ 0 & B\end{bmatrix}$$
for some matrix $B$ of order $(n-1)\times(n-1)$.
Now we use induction, as in the proof of the Spectral Theorem (3.7), and the result follows.
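For computations, `np.linalg.svd` produces exactly the factorization (5.2); a brief sketch (the test matrix is invented):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))

U, s, Vt = np.linalg.svd(A)            # A = U diag(s) Vt
P, Q = U, Vt.T                         # so that P'AQ = diag(s)
print(np.allclose(P.T @ A @ Q, np.diag(s)))                              # True
# The sigma_i^2 are the eigenvalues of AA'.
print(np.allclose(np.sort(s**2), np.sort(np.linalg.eigvalsh(A @ A.T))))  # True
```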
Let $y_1,\dots,y_n$ denote the columns of $Q$ and $x_1,\dots,x_n$ the columns of $P$ in (5.2). Then each $x_i$ is an eigenvector of $AA'$ and each $y_i$ an eigenvector of $A'A$, corresponding to the same eigenvalue $\sigma_i^2$. Collectively, these vectors are referred to as the singular vectors of $A$.
5.5 Let $A$ be an $n\times n$ matrix. Then
$$\max_{\|u\|=1,\ \|v\|=1}u'Av = \sigma_1.$$
Proof. We make use of (5.2). For any $u$, $v$ of norm 1,
$$u'Av = u'P\,\mathrm{diag}(\sigma_1,\dots,\sigma_n)Q'v = \sum_{i=1}^n\sigma_iw_iz_i, \qquad(5.3)$$
where $w = P'u$, $z = Q'v$ are also of norm 1. Hence
$$u'Av \le \sigma_1\sum_i|w_iz_i| \le \sigma_1\|w\|\|z\| = \sigma_1,$$
by the Cauchy–Schwarz inequality. Also, with $x_1$, $y_1$ as in (5.2), $|x_1'Ay_1| = \sigma_1$, and hence equality is attained in 5.5.
5.6 Let $A$ be an $n\times n$ matrix and let $P$, $Q$ be orthogonal matrices such that (5.2) holds. Let $y_1,\dots,y_n$ denote the columns of $Q$ and let $x_1,\dots,x_n$ denote the columns of $P$. Then for $2\le k\le n$,
$$\max u'Av = \sigma_k,$$
where the maximum is taken over the set
$$\{u,v : \|u\| = \|v\| = 1,\ u\perp x_i,\ v\perp y_i,\ i = 1,\dots,k-1\}.$$
Proof. The proof proceeds along lines similar to that of 5.5; only observe that in (5.3), the first $k-1$ terms reduce to zero.
Minimax Principle and Interlacing
If 5.5, 5.6 are applied to a positive semidefinite matrix then we get a representation for the eigenvalues. This is known as a Rayleigh quotient expression and is given in the next result.
5.7 Let $A$ be a positive semidefinite $n\times n$ matrix. Then:
(i) $\max_{\|u\|=1}u'Au = \lambda_1$.
(ii) Let $\{x_1,\dots,x_n\}$ be a set of orthonormal eigenvectors of $A$ corresponding to $\lambda_1,\dots,\lambda_n$ respectively. Then
$$\max u'Au = \lambda_k, \qquad k = 2,\dots,n,$$
where the maximum is taken over the set $\{u : \|u\| = 1,\ u\perp x_1,\dots,x_{k-1}\}$.
Proof. Since $A$ is positive semidefinite, $u'Av \le (u'Au)^{1/2}(v'Av)^{1/2}$, so the maximum of $u'Av$ over unit vectors is attained with $v = u$. This fact and 5.5, 5.6 give the result.
Note that 5.7 holds for any symmetric matrix as well. This can be seen as follows. If $A$ is symmetric then we may choose $\delta>0$ sufficiently large so that $A+\delta I$ is positive semidefinite and then apply 5.7 to $A+\delta I$; the eigenvalues are merely shifted by $\delta$.
The next result is known as the Courant–Fischer Minimax Theorem. It is an important result with several interesting consequences.
5.8 Let $A$ be a symmetric $n\times n$ matrix. Then for $k = 2,\dots,n$,
$$\lambda_k(A) = \min_{w_1,\dots,w_{k-1}}\ \max_{u\perp w_1,\dots,w_{k-1},\ \|u\|=1}u'Au.$$
Proof. It is sufficient to show that
$$\lambda_k(A) \le \min_{w_1,\dots,w_{k-1}}\ \max_{u\perp w_1,\dots,w_{k-1},\ \|u\|=1}u'Au,$$
since by 5.7, equality is attained when $w_i = x_i$. Let $P$ be orthogonal with $P'AP = \mathrm{diag}(\lambda_1,\dots,\lambda_n)$. Then
$$u'Au = \sum_{i=1}^n\lambda_iz_i^2, \qquad z = P'u,$$
and, given $w_1,\dots,w_{k-1}$, let
$$T_1 = \{z : z\perp P'w_1,\dots,P'w_{k-1}\}, \qquad T_2 = \{z : z_{k+1} = \cdots = z_n = 0\}.$$
Then $\dim(T_1)\ge n-k+1$ and $\dim(T_2) = k$. Thus (see Exercise 19 in Chap. 1) there exists a vector $z$ of norm 1 in $T_1\cap T_2$. For this $z$,
$$\sum_{i=1}^n\lambda_iz_i^2 = \sum_{i=1}^k\lambda_iz_i^2 \ge \lambda_k.$$
Thus for any $w_1,\dots,w_{k-1}$,
$$\max_{u\perp w_1,\dots,w_{k-1},\ \|u\|=1}u'Au = \max_{z\perp P'w_1,\dots,P'w_{k-1},\ \|z\|=1}\sum_{i=1}^n\lambda_iz_i^2 \ge \lambda_k,$$
and the proof is complete.
5.9 Cauchy Interlacing Principle. Let $A$ be a symmetric $n\times n$ matrix, and let $B$ be a principal submatrix of $A$ of order $(n-1)\times(n-1)$. Then
$$\lambda_k(A) \ge \lambda_k(B) \ge \lambda_{k+1}(A), \qquad k = 1,\dots,n-1.$$
Proof. We assume, without loss of generality, that $B$ is obtained by deleting the first row and column of $A$. By 5.7,
$$\lambda_1(A) = \max_{\|u\|=1}u'Au \ge \max_{\|u\|=1,\ u_1=0}u'Au = \lambda_1(B).$$
Similarly, for $k = 2,\dots,n-1$, by 5.8,
$$\lambda_k(A) = \min_{w_1,\dots,w_{k-1}}\ \max_{u\perp w_1,\dots,w_{k-1},\ \|u\|=1}u'Au,$$
and restricting the maximum to vectors $u$ with first coordinate zero gives the corresponding expression for $\lambda_k(B)$, whence $\lambda_k(A)\ge\lambda_k(B)$. The inequality $\lambda_k(B)\ge\lambda_{k+1}(A)$ follows by a similar argument.
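A numerical check of the interlacing inequalities (a sketch; the test matrix is invented):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2
B = A[1:, 1:]                              # principal submatrix: first row/col deleted

lam_A = np.linalg.eigvalsh(A)[::-1]        # eigenvalues in decreasing order
lam_B = np.linalg.eigvalsh(B)[::-1]
print(all(lam_A[k] >= lam_B[k] >= lam_A[k + 1] for k in range(5)))   # True
```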
5.10 Let $A$, $B$ be $n\times n$ matrices such that $A\ge B$. Then
$$\lambda_k(A) \ge \lambda_k(B), \qquad k = 1,\dots,n.$$
Here $A\ge B$ means that $A$ and $B$ are symmetric and $A-B$ is positive semidefinite. The result can be derived using 5.8, but we give an alternative proof.

Proof. Let $u_1,\dots,u_n$ be orthonormal eigenvectors of $A$ corresponding to $\lambda_1(A),\dots,\lambda_n(A)$, respectively. Similarly, let $v_1,\dots,v_n$ be orthonormal eigenvectors of $B$ corresponding to $\lambda_1(B),\dots,\lambda_n(B)$, respectively. Fix $k\in\{1,\dots,n\}$ and let
$$T_1 = \mathrm{span}\{u_k,\dots,u_n\}, \qquad T_2 = \mathrm{span}\{v_1,\dots,v_k\}.$$
Then $\dim(T_1) = n-k+1$, $\dim(T_2) = k$, and therefore there exists a unit vector $z\in T_1\cap T_2$. Consider
$$z'Az = z'\bigl(\lambda_1(A)u_1u_1' + \cdots + \lambda_n(A)u_nu_n'\bigr)z.$$
Since $z\in T_1$, it is orthogonal to $u_1,\dots,u_{k-1}$ and hence we have
$$z'Az = z'\bigl(\lambda_k(A)u_ku_k' + \cdots + \lambda_n(A)u_nu_n'\bigr)z.$$
Now, using the fact that $\lambda_k(A)\ge\lambda_i(A)$, $i\ge k$, and that
$$z'\bigl(u_ku_k' + \cdots + u_nu_n'\bigr)z \le z'\bigl(u_1u_1' + \cdots + u_nu_n'\bigr)z = 1,$$
we get $z'Az\le\lambda_k(A)$. A similar argument, using $z\in T_2$, gives
$$z'Bz = z'\bigl(\lambda_1(B)v_1v_1' + \cdots + \lambda_n(B)v_nv_n'\bigr)z \ge \lambda_k(B).$$
The result now follows, since $z'Az\ge z'Bz$ for any $z$.
Majorization

If $x\in\mathbb{R}^n$, then we denote by $x_{[1]}\ge\cdots\ge x_{[n]}$ the components of $x$ arranged in nonincreasing order. If $x,y\in\mathbb{R}^n$, then $x$ is said to be majorized by $y$, denoted $x\prec y$, if
$$\sum_{i=1}^kx_{[i]} \le \sum_{i=1}^ky_{[i]}, \quad k = 1,\dots,n-1, \qquad\text{and}\qquad \sum_{i=1}^nx_i = \sum_{i=1}^ny_i.$$
Intuitively, $x\prec y$ if the components of $y$ are more “spread out” than the components of $x$. The concept finds applications in several areas of mathematics, statistics and economics.
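A small utility for testing majorization numerically (a sketch, not from the text; the function name `majorizes` is our own), together with a check of result 5.11 below:

```python
import numpy as np

def majorizes(y, x, tol=1e-10):
    """Return True if x is majorized by y (x ≺ y)."""
    xs, ys = np.sort(x)[::-1], np.sort(y)[::-1]    # nonincreasing order
    partial = np.cumsum(xs) <= np.cumsum(ys) + tol  # partial-sum inequalities
    return partial[:-1].all() and abs(xs.sum() - ys.sum()) < tol

rng = np.random.default_rng(5)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2
# 5.11: the diagonal of a symmetric matrix is majorized by its spectrum.
print(majorizes(np.linalg.eigvalsh(A), np.diag(A)))   # True
```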
5.11 Let $A = (a_{ij})$ be a symmetric $n\times n$ matrix with eigenvalues $\lambda_1,\dots,\lambda_n$. Then
$$(a_{11},\dots,a_{nn}) \prec (\lambda_1,\dots,\lambda_n).$$
Proof. The result is obvious for $n = 1$; assume it to be true for symmetric matrices of order $n-1$ and proceed by induction. A simultaneous permutation of the rows and columns of $A$ changes neither the eigenvalues nor the multiset of diagonal entries, so we may assume, without loss of generality, that $a_{11}\ge a_{22}\ge\cdots\ge a_{nn}$.
Let $B$ be the submatrix of $A$ obtained by deleting its last row and column. For $k = 1,\dots,n-1$,
$$\sum_{i=1}^k\lambda_i(A) \ge \sum_{i=1}^k\lambda_i(B) \ge \sum_{i=1}^ka_{ii},$$
where the first inequality follows from 5.9 and the second from the induction assumption. Also,
$$\sum_{i=1}^n\lambda_i(A) = \mathrm{trace}\,A = \sum_{i=1}^na_{ii},$$
and the proof is complete.
5.12 Let $A = (a_{ij})$ be a symmetric $n\times n$ matrix. Then for $k = 1,\dots,n$,
$$\sum_{i=1}^k\lambda_i = \max\ \mathrm{trace}\,R'AR,$$
where the maximum is taken over all $n\times k$ matrices $R$ with orthonormal columns.
Proof. Let $R$ be an $n\times k$ matrix with orthonormal columns and extend $R$ to an $n\times n$ orthogonal matrix $P$. Then $\mathrm{trace}\,R'AR$ is the sum of the first $k$ diagonal entries of $P'AP$. Since $P'AP$ has the same eigenvalues as $A$, it follows from 5.11 that
$$\sum_{i=1}^k\lambda_i \ge \mathrm{trace}\,R'AR. \qquad(5.6)$$
If the columns of $R$ are chosen to form an orthonormal set of eigenvectors corresponding to $\lambda_1,\dots,\lambda_k$, then equality holds in (5.6) and the result is proved.
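A numerical check of 5.12 (a sketch; the matrices are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(10)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2
k = 2

w, V = np.linalg.eigh(A)                    # ascending eigenvalues
best = w[-k:].sum()                         # lambda_1 + lambda_2
R = V[:, -k:]                               # top-k orthonormal eigenvectors
print(np.isclose(np.trace(R.T @ A @ R), best))              # True: equality

# A random R with orthonormal columns never beats it.
Rr, _ = np.linalg.qr(rng.standard_normal((5, k)))
print(np.trace(Rr.T @ A @ Rr) <= best + 1e-12)              # True
```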
5.13 Let $A$, $B$ be symmetric $n\times n$ matrices. Then
$$\bigl(\lambda_1(A+B),\dots,\lambda_n(A+B)\bigr)$$
is majorized by
$$\bigl(\lambda_1(A)+\lambda_1(B),\dots,\lambda_n(A)+\lambda_n(B)\bigr).$$
Proof. We use 5.12. The maximum in the following argument is over $n\times k$ matrices $R$ with orthonormal columns. We have, for $k = 1,\dots,n-1$,
$$\sum_{i=1}^k\lambda_i(A+B) = \max\ \mathrm{trace}\,R'(A+B)R \le \max\ \mathrm{trace}\,R'AR + \max\ \mathrm{trace}\,R'BR = \sum_{i=1}^k\lambda_i(A) + \sum_{i=1}^k\lambda_i(B),$$
while for $k = n$ both sides equal $\mathrm{trace}(A+B)$, and the proof is complete.
5.14 Let $A$ be a symmetric $n\times n$ matrix and let $P_1,\dots,P_m$ be $n\times n$ orthogonal matrices. Then the eigenvalues of
$$\frac{1}{m}\sum_{i=1}^mP_i'AP_i$$
are majorized by the eigenvalues of $A$.

Proof. The result follows by a simple application of 5.13.
Volume of a Matrix

Let $A$ be an $m\times n$ matrix and let $1\le k\le\min\{m,n\}$. Let $Q_{k,n}$ denote the set of increasing sequences of $k$ elements from $\{1,\dots,n\}$. For $I\subset\{1,\dots,m\}$, $J\subset\{1,\dots,n\}$, let $A_{IJ}$ denote the submatrix of $A$ determined by the rows in $I$ and the columns in $J$. The $k$-th compound of $A$, denoted $C_k(A)$, is the $\binom{m}{k}\times\binom{n}{k}$ matrix whose rows are indexed by $Q_{k,m}$ and whose columns are indexed by $Q_{k,n}$ (in an arbitrary but fixed order), and whose $(I,J)$-entry is the determinant $|A_{IJ}|$, for $I\in Q_{k,m}$, $J\in Q_{k,n}$.
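A direct implementation of the $k$-th compound (a sketch; the helper `compound` is our own, with index sets in lexicographic order, an arbitrary but fixed choice as above):

```python
import numpy as np
from itertools import combinations

def compound(A, k):
    """k-th compound: the matrix of all k x k minors of A."""
    rows = list(combinations(range(A.shape[0]), k))
    cols = list(combinations(range(A.shape[1]), k))
    return np.array([[np.linalg.det(A[np.ix_(I, J)]) for J in cols] for I in rows])

A = np.arange(1.0, 10.0).reshape(3, 3) + np.eye(3)      # any 3x3 test matrix
print(compound(A, 2).shape)                              # (3, 3): C(3,2) = 3
print(np.isclose(compound(A, 3)[0, 0], np.linalg.det(A)))  # C_n(A) = |A|
```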
The next result generalizes the familiar fact that for $n\times n$ matrices $A$, $B$, $|AB| = |A||B|$.
5.15 Cauchy–Binet Formula. Let $A$, $B$ be matrices of order $m\times n$, $n\times m$ respectively, where $m\le n$. Let $S = \{1,\dots,m\}$. Then
$$|AB| = \sum_{I\in Q_{m,n}}|A_{SI}||B_{IS}|.$$
Proof. We only sketch a proof. By elementary row operations it can be verified that
$$\begin{vmatrix}-I_n & B\\ A & 0\end{vmatrix} = \begin{vmatrix}-I_n & B\\ 0 & AB\end{vmatrix}.$$
The determinant on the right-hand side in the equation above is clearly $(-1)^n|AB|$. It can be seen, by expanding along the first $n$ columns, that the determinant on the left equals
$$(-1)^n\sum_{I\in Q_{m,n}}|A_{SI}||B_{IS}|,$$
and the result is proved.
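A numerical check of the Cauchy–Binet formula (a sketch; the dimensions and matrices are invented):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(6)
m, n = 2, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, m))

total = sum(np.linalg.det(A[:, list(I)]) * np.linalg.det(B[list(I), :])
            for I in combinations(range(n), m))
print(np.isclose(total, np.linalg.det(A @ B)))   # True
```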
The next result follows immediately from 5.15.

5.16 Let $A$, $B$ be matrices of order $m\times n$, $n\times p$ respectively and let $1\le k\le\min\{m,n,p\}$. Then $C_k(AB) = C_k(A)C_k(B)$.
If $A$ is an $n\times n$ matrix, then as usual, $\sigma_1(A)\ge\cdots\ge\sigma_n(A)$ will denote the singular values of $A$. We will often write $\sigma_i$ instead of $\sigma_i(A)$.
5.17 Let $A$ be an $n\times n$ matrix and let $1\le k\le n$. Then the singular values of $C_k(A)$ are given by
$$\Bigl\{\prod_{i\in I}\sigma_i(A) : I\in Q_{k,n}\Bigr\}. \qquad(5.8)$$
Proof. By 5.4, there exist orthogonal matrices $P$, $Q$ such that $A = P\,\mathrm{diag}(\sigma_1,\dots,\sigma_n)Q'$. By 5.16,
$$C_k(A) = C_k(P)\,C_k\bigl(\mathrm{diag}(\sigma_1,\dots,\sigma_n)\bigr)\,C_k(Q').$$
Since $P$ and $Q$ are orthogonal, $C_k(P)$ and $C_k(Q')$ are also orthogonal. Consequently the singular values of $C_k(A)$ are the diagonal entries of $C_k(\mathrm{diag}(\sigma_1,\dots,\sigma_n))$, which are precisely the products $\prod_{i\in I}\sigma_i$, $I\in Q_{k,n}$.

For an $n\times n$ matrix $A$ and $1\le k\le n$, the $k$-volume of $A$, denoted $\mathrm{vol}_k(A)$, is defined as
$$\mathrm{vol}_k(A) = \Bigl(\sum_{I,J\in Q_{k,n}}|A_{IJ}|^2\Bigr)^{1/2}.$$
Note that $\mathrm{vol}_k(A) = 0$ if $k>\mathrm{rank}\,A$. Also, it follows from 5.17 that
$$\mathrm{vol}_k(A) = \Bigl(\sum_{I\in Q_{k,n}}\prod_{i\in I}\sigma_i^2\Bigr)^{1/2}.$$
In particular, if $k = r = \mathrm{rank}\,A$, then $\mathrm{vol}_r(A) = \sigma_1\cdots\sigma_r$.
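A numerical sketch of $\mathrm{vol}_k$ (the helper `vol_k` is our own), checking that for $k = \mathrm{rank}\,A$ the volume is the product of the nonzero singular values:

```python
import numpy as np
from itertools import combinations

def vol_k(A, k):
    """Square root of the sum of squares of all k x k minors of A."""
    idx = lambda d: combinations(range(d), k)
    s = sum(np.linalg.det(A[np.ix_(I, J)]) ** 2
            for I in idx(A.shape[0]) for J in idx(A.shape[1]))
    return np.sqrt(s)

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 4))   # rank 2
sigma = np.linalg.svd(A, compute_uv=False)                      # decreasing
print(np.isclose(vol_k(A, 2), sigma[0] * sigma[1]))             # True
```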
The volume of a matrix is closely related to the geometric notion of volume. Let $C$ be the unit cube in $\mathbb{R}^n$ and consider the linear transformation $x\mapsto Ax$ from $\mathbb{R}^n$ to $\mathbb{R}^n$ given by an $n\times n$ matrix $A$. Then the volume of the image $A(C)$ equals the absolute value of the determinant $|A|$. More generally, if $A$ is an $n\times n$ matrix of rank $r$ and $\tilde{C}$ is an $r$-dimensional unit cube in the row space of $A$, then the image of $\tilde{C}$ lies in the column space of $A$ and its $r$-dimensional volume equals $\mathrm{vol}_r(A)$.
We now obtain a characterization of the Moore–Penrose inverse in terms of volume: among all g-inverses of an $n\times n$ matrix $A$, the Moore–Penrose inverse $A^+$ has minimum volume. We first describe the classes of g-inverses of $A$ in terms of its singular value decomposition, from which more general results follow.
5.18 Let $A$ be an $n\times n$ matrix of rank $r$ and let
$$A = P\begin{bmatrix}\Sigma & 0\\ 0 & 0\end{bmatrix}Q'$$
be the singular value decomposition of $A$, where $P$, $Q$ are orthogonal and $\Sigma = \mathrm{diag}(\sigma_1,\dots,\sigma_r)$. Then the class of g-inverses of $A$ is given by
$$G = Q\begin{bmatrix}\Sigma^{-1} & X\\ Y & Z\end{bmatrix}P', \qquad(5.9)$$
where $X$, $Y$, $Z$ are arbitrary. The reflexive g-inverses $G$ of $A$ are given by (5.9) with $Z = Y\Sigma X$; the least squares g-inverses of $A$ are given by (5.9) with $X = 0$; and the minimum norm g-inverses of $A$ are given by (5.9) with $Y = 0$. The Moore–Penrose inverse of $A$ is given by (5.9) with $X$, $Y$, $Z$ all being zero.
Proof. Let $B = \begin{bmatrix}\Sigma & 0\\ 0 & 0\end{bmatrix}$ and suppose
$$H = \begin{bmatrix}U & X\\ Y & Z\end{bmatrix}$$
is a g-inverse of $B$. Then $BHB = B$ leads to
$$\begin{bmatrix}\Sigma U\Sigma & 0\\ 0 & 0\end{bmatrix} = \begin{bmatrix}\Sigma & 0\\ 0 & 0\end{bmatrix},$$
so that $\Sigma U\Sigma = \Sigma$ and hence $U = \Sigma^{-1}$. Thus with $U = \Sigma^{-1}$ and with arbitrary $X$, $Y$, $Z$, $H$ is a g-inverse of $B$. Since the class of g-inverses of $A$ is given by
$$\{QHP' : H \text{ is a g-inverse of } B\},$$
we have proved the first assertion. The remaining assertions are proved easily. For example, imposing both the conditions $BHB = B$ and $HBH = H$ we see that $U = \Sigma^{-1}$ and $Z = Y\Sigma X$. That completes the proof.
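A numerical sketch of the parametrization (5.9): random blocks $X$, $Y$, $Z$ always yield a g-inverse, and $X = Y = Z = 0$ recovers $A^+$ (all test matrices invented):

```python
import numpy as np

rng = np.random.default_rng(8)
n, r = 4, 2
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank r

U, s, Vt = np.linalg.svd(A)
P, Q, Sigma = U, Vt.T, np.diag(s[:r])

X, Y, Z = (rng.standard_normal((r, n - r)), rng.standard_normal((n - r, r)),
           rng.standard_normal((n - r, n - r)))
H = np.block([[np.linalg.inv(Sigma), X], [Y, Z]])
G = Q @ H @ P.T
print(np.allclose(A @ G @ A, A))                                # G is a g-inverse

H0 = np.block([[np.linalg.inv(Sigma), np.zeros((r, n - r))],
               [np.zeros((n - r, r)), np.zeros((n - r, n - r))]])
print(np.allclose(Q @ H0 @ P.T, np.linalg.pinv(A)))             # the M-P inverse
```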
We now show that the Moore–Penrose inverse enjoys a certain minimality property with respect to the singular values in the class of all g-inverses of a given matrix.
5.19 Let $A$ be an $n\times n$ matrix of rank $r$ with singular values
$$\sigma_1\ge\cdots\ge\sigma_r > \sigma_{r+1} = \cdots = \sigma_n = 0,$$
and let $G$ be a g-inverse of $A$ with rank $s$ and with singular values
$$\sigma_1(G)\ge\cdots\ge\sigma_s(G) > \sigma_{s+1}(G) = \cdots = \sigma_n(G) = 0.$$
Then $\sigma_i(G)\ge\sigma_i(A^+)$, $i = 1,\dots,n$; that is, $\sigma_i(G)\ge 1/\sigma_{r-i+1}$ for $i = 1,\dots,r$.
Proof. We assume, without loss of generality, that $A = \begin{bmatrix}\Sigma & 0\\ 0 & 0\end{bmatrix}$, so that by 5.18, $G = \begin{bmatrix}\Sigma^{-1} & X\\ Y & Z\end{bmatrix}$, where $X$, $Y$, $Z$ denote matrices not needed explicitly in the proof. Using 5.9, 5.10, we get
$$\sigma_i^2(G) = \lambda_i(GG') \ge \lambda_i\bigl(\Sigma^{-2} + XX'\bigr) \ge \lambda_i\bigl(\Sigma^{-2}\bigr) = 1/\sigma_{r-i+1}^2, \qquad i = 1,\dots,r.$$
If $i>r$, then $\sigma_i(A^+) = 0$ and the inequality is trivial, and the proof is complete.
5.20 Let $A$ be an $m\times n$ matrix of rank $r$ and let $1\le k\le r$. Then for any g-inverse $G$ of $A$,
$$\mathrm{vol}_k(G) \ge \mathrm{vol}_k(A^+).$$

Proof. Recall that the square of $\mathrm{vol}_k(G)$ equals the sum of squares of the singular values of $C_k(G)$. Now the result follows using (5.8) and 5.19.
Exercises
1. Let $A$ be an $n\times n$ symmetric matrix. Show that the largest eigenvalue of $A$ cannot be less than $\frac{1}{n}\sum_{i,j=1}^na_{ij}$.
3. Find the singular values and the singular vectors of the following matrices:
4. Let $A$ be an $n\times n$ matrix with $\frac{n^2}{2}$ entries equal to 2 and the remaining entries equal to 4. Show that the sum of the squares of the singular values of $A$ is $10n^2$.
5. Let $A$ be the $n\times\binom{n}{2}$ matrix whose columns are the vectors $e_{ij}$, $1\le i<j\le n$, where $e_{ij}$ has 1 at the $i$-th position, $-1$ at the $j$-th position, and zeros elsewhere. Show that the only nonzero singular value of $A$ is $\sqrt{n}$, with multiplicity $n-1$.
6. Let $A$ be an $n\times n$ matrix with singular values $\sigma_1\ge\cdots\ge\sigma_n$. Show that $\max_{i,j}|a_{ij}|\le\sigma_1$.
7. Let $A$, $B$ be $n\times n$ positive semidefinite matrices such that $\lambda_k(A)\ge\lambda_k(B)$, $k = 1,\dots,n$. Does it follow that $A\ge B$?
8. Let $A$ be a symmetric $n\times n$ matrix with eigenvalues $\lambda_1\ge\cdots\ge\lambda_n$ such that only $\lambda_n$ is negative. Suppose all the diagonal entries of $A$ are negative. Show that any principal submatrix of $A$ also has exactly one negative eigenvalue.
9. Show that the vector $(x_1,\dots,x_p)$ majorizes the vector $(\bar{x},\dots,\bar{x})$, where the mean $\bar{x} = \frac{1}{p}\sum_{i=1}^px_i$ appears $p$ times.
10. In a chess tournament with $n$ players, each player plays every other player exactly once; the winner of a game gets 1 point, the loser 0 points, and a draw gives each player $\frac{1}{2}$ point. Let $s_1\le\cdots\le s_n$ be the scores of the players. Show that $(s_1,\dots,s_n)$ is majorized by $(0,1,\dots,n-1)$.
11. Let $A$ be a symmetric $n\times n$ matrix with eigenvalues $\lambda_1,\dots,\lambda_n$. By the Spectral Theorem there exists an orthogonal matrix $P$ such that $A = PDP'$, where $D = \mathrm{diag}(\lambda_1,\dots,\lambda_n)$. Let $S = (s_{ij})$ be the $n\times n$ matrix where $s_{ij} = p_{ij}^2$, $i,j = 1,\dots,n$. Show that
$$(a_{11},\dots,a_{nn})' = S(\lambda_1,\dots,\lambda_n)'.$$
Also show that $S$ has row and column sums equal to one. (Thus $S$ is doubly stochastic, and it follows by a well-known result of Hardy, Littlewood and Pólya that $(a_{11},\dots,a_{nn})\prec(\lambda_1,\dots,\lambda_n)$, leading to another proof of 5.11.)
12. The Frobenius norm of a matrix $A$, denoted by $\|A\|_F$, is defined as $(\sum_{i,j}a_{ij}^2)^{1/2}$. Show that if $A$ is an $n\times n$ matrix, then for any g-inverse $G$ of $A$, $\|G\|_F\ge\|A^+\|_F$, and that equality holds if and only if $G = A^+$.
13. Let $C$ be the unit square with vertices $(0,0)$, $(1,0)$, $(0,1)$ and $(1,1)$. Find the area of the image of $C$ under the transformation $x\mapsto Ax$. Verify that the area equals $\mathrm{vol}(A)$.
14. Let $A$ be an $n\times n$ matrix of rank $r$ with singular value decomposition
$$P'AQ = \mathrm{diag}(\sigma_1,\dots,\sigma_r,0,\dots,0),$$
where $P$ and $Q$ are orthogonal. Show that $A^+ = Q\,\mathrm{diag}(\sigma_1^{-1},\dots,\sigma_r^{-1},0,\dots,0)P'$.
15. Let $A$, $B$ be positive definite matrices of order $m\times m$, $n\times n$ respectively, and let $C$ be $m\times n$. Prove that the matrix
$$\begin{bmatrix}A & C\\ C' & B\end{bmatrix}$$
is positive definite if and only if the largest singular value of $A^{-1/2}CB^{-1/2}$ is less than 1.
16. Let $A$ be a symmetric $n\times n$ matrix. Show that
$$\lambda_n = \min_{\|x\|=1}x'Ax.$$
Obtain a similar expression for $\lambda_1$. Also obtain a max–min version of the Courant–Fischer Theorem.
17. Let $A$ be a symmetric $n\times n$ matrix with $|A| = 0$. Show that the $(n-1)\times(n-1)$ principal minors of $A$ are either all nonnegative or all nonpositive.
Rank Additivity and Matrix Partial Orders
Characterizations of Rank Additivity
If $A$, $B$ are $m\times n$ matrices then, since $B = A + (B-A)$, we have
$$\mathrm{rank}\,B \le \mathrm{rank}\,A + \mathrm{rank}(B-A). \qquad(6.1)$$
In this chapter we obtain several conditions, each necessary and sufficient for equality to hold in (6.1), and study their interrelations. An application to generalized linear models will be given in a later chapter.
6.1 Let $A$, $B$, $C$ be nonzero matrices. Then $AC^-B$ is invariant under the choice of the g-inverse $C^-$ if and only if $\mathcal{C}(B)\subset\mathcal{C}(C)$ and $\mathcal{R}(A)\subset\mathcal{R}(C)$.
Proof. If $\mathcal{C}(B)\subset\mathcal{C}(C)$ and $\mathcal{R}(A)\subset\mathcal{R}(C)$, then $B = CX$ and $A = YC$ for some matrices $X$, $Y$. Then $AC^-B = YCC^-CX = YCX$ is clearly invariant under the choice of the g-inverse.
Conversely, suppose $AC^-B$ is invariant under the choice of the g-inverse, and suppose that $\mathcal{C}(B)$ is not contained in $\mathcal{C}(C)$. Then
$$D = (I - CC^-)B$$
is a nonzero matrix. Let $A = XY$, $D = PQ$ be rank factorizations. Let
$$C^= = C^- + Y^-P^-(I - CC^-),$$
where $Y^-$, $P^-$ are, respectively, a right inverse of $Y$ and a left inverse of $P$. Clearly $C^=$ is also a g-inverse of $C$, and
$$AC^=B = AC^-B + AY^-P^-D = AC^-B + XQ.$$
Since $X$ admits a left inverse and $Q$ admits a right inverse, $XQ$ is a nonzero matrix and we get a contradiction. Thus $\mathcal{C}(B)\subset\mathcal{C}(C)$. Similarly we can show that $\mathcal{R}(A)\subset\mathcal{R}(C)$.
Suppose $A$, $B$ are $n\times n$ positive definite matrices. In the context of parallel connections of two electrical networks, the following sum, called the parallel sum of $A$, $B$, arises:
$$(A^{-1} + B^{-1})^{-1} = A(A+B)^{-1}B.$$
This suggests the following definition. Call matrices $A$, $B$ (not necessarily positive definite) parallel summable if
$$A(A+B)^-B \qquad(6.2)$$
is invariant under the choice of the g-inverse $(A+B)^-$, in which case the expression (6.2) is called the parallel sum of $A$ and $B$, denoted $P(A,B)$. The notion of parallel sum is closely related to rank additivity.
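For positive definite matrices the parallel sum reduces to $(A^{-1}+B^{-1})^{-1}$, the familiar formula for resistances combined in parallel; a quick numerical sketch (matrices invented):

```python
import numpy as np

rng = np.random.default_rng(9)
R1, R2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
A = R1 @ R1.T + np.eye(3)              # positive definite
B = R2 @ R2.T + np.eye(3)              # positive definite

P_AB = A @ np.linalg.inv(A + B) @ B    # parallel sum P(A, B)
print(np.allclose(P_AB, np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))))  # True
```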
As usual, $A^-$ will denote an arbitrary g-inverse of $A$. We say that two vector spaces are virtually disjoint if they have only the zero vector in common.
6.2 Let $A$, $B$ be $m\times n$ matrices. Then the following conditions are equivalent:
(i) $\mathrm{rank}\,B = \mathrm{rank}\,A + \mathrm{rank}(B-A)$
(ii) $\mathcal{C}(A)$, $\mathcal{C}(B-A)$ are virtually disjoint, and $\mathcal{R}(A)$, $\mathcal{R}(B-A)$ are virtually disjoint
(iii) every $B^-$ is a g-inverse of $A$
(iv) there exists $A^-$ such that $A^-A = A^-B$, $AA^- = BA^-$
(v) there exists $A^-$ such that $AA^-B = BA^-A = A$
(vi) $A$, $B-A$ are parallel summable and $P(A,B-A) = 0$
(vii) $\mathcal{C}(A)\subset\mathcal{C}(B)$ and there exists $A^-$ such that $A^-B = A^-A$
(viii) $\mathcal{R}(A)\subset\mathcal{R}(B)$ and there exists $A^-$ such that $BA^- = AA^-$
(ix) there exist g-inverses $B^-$, $\hat{B}^-$, $\tilde{B}^-$ such that $A = BB^-A = A\hat{B}^-B = A\tilde{B}^-A$
(x) for any $B^-$, $A = BB^-A = AB^-B = AB^-A$
(xi) there exist $K$, $L$, at least one of which is idempotent, such that $A = KB = BL$
(xii) there exists a $B^-$ which is a g-inverse of both $A$ and $B-A$
(xiii) every $B^-$ is a g-inverse of both $A$ and $B-A$
(xiv) there exist nonsingular matrices $P$, $Q$ such that
$$A = P\begin{bmatrix}I_r & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0\end{bmatrix}Q, \qquad B = P\begin{bmatrix}I_r & 0 & 0\\ 0 & I_s & 0\\ 0 & 0 & 0\end{bmatrix}Q,$$
where $r = \mathrm{rank}\,A$, $s = \mathrm{rank}(B-A)$
(xv) there exist g-inverses $A^-$, $A^=$ such that $A^-A = A^-B$, $AA^= = BA^=$
(xvi) there exist g-inverses $A^-$, $A^=$ such that $AA^-B = BA^=A = A$.
Proof. We will assume that both $A$ and $B-A$ are nonzero matrices, for otherwise the result is trivial. Let $A = XY$, $B-A = UV$ be rank factorizations, so that
$$B = XY + UV = [X\ \ U]\begin{bmatrix}Y\\ V\end{bmatrix}. \qquad(6.3)$$

(i)⇒(iii): Since (i) holds, (6.3) is a rank factorization of $B$. Now for any $B^-$,
$$\begin{bmatrix}Y\\ V\end{bmatrix}B^-[X\ \ U] = I,$$
so that $YB^-X = I$, $YB^-U = 0$, $VB^-X = 0$ and $VB^-U = I$. Since $A = XY$, it follows that $AB^-A = XYB^-XY = XY = A$. We also conclude that
$$(B-A)B^-A = 0 \quad\text{and}\quad AB^-(B-A) = 0,$$
and thus (i)⇒(ix),(x). Furthermore,
$$(B-A)B^-(B-A) = (B-A)B^-B = B-A,$$
and therefore (i)⇒(xii),(xiii) as well.
(iii)⇒(ii): For any $B^-$, $AB^-A = A$, and thus $AB^-A$ is invariant under the choice of g-inverse. Thus by 6.1, $\mathcal{C}(A)\subset\mathcal{C}(B)$ and $\mathcal{R}(A)\subset\mathcal{R}(B)$. In particular, $A = UB$ for some $U$ and therefore $AB^-B = A$. Suppose
$$x\in\mathcal{C}(A)\cap\mathcal{C}(B-A). \qquad(6.4)$$
Then
$$x = Ay = (B-A)z \qquad(6.5)$$
for some $y$, $z$. Then
$$x = Ay = AB^-Ay = AB^-(B-A)z = AB^-Bz - AB^-Az = Az - Az = 0.$$
Thus $\mathcal{C}(A)\cap\mathcal{C}(B-A) = \{0\}$. Similarly it can be shown that the row spaces of $A$ and $B-A$ are also virtually disjoint.
(ii)⇒(i): We make use of (6.3). Since (ii) holds, $[X\ \ U]$ must have full column rank and $\begin{bmatrix}Y\\ V\end{bmatrix}$ must have full row rank. It follows that the rank of $B$ must equal the number of columns in $[X\ \ U]$ and hence (i) is true.
(i)⇒(iv): Since (i)⇒(iii),(x), for any $B^-$ we have $AB^-B = AB^-A = A$. Since $B^-AB^-$ is a g-inverse of $A$ as well, setting $A^- = B^-AB^-$ we see that $A^-B = A^-A$. Similarly, using $(B-A)B^-A = 0$, it follows that $AA^- = BA^-$. The proof of (i)⇒(v) is similar.
(iv)⇒(ii): Suppose (6.4) holds. Then (6.5) is true for some $y$, $z$. Now, using $A^-B = A^-A$,
$$x = Ay = AA^-Ay = AA^-(B-A)z = AA^-Bz - Az = AA^-Az - Az = 0.$$
Thus $\mathcal{C}(A)\cap\mathcal{C}(B-A) = \{0\}$. Similarly it can be shown that the row spaces of $A$ and $B-A$ are also virtually disjoint.
The proof of (v)⇒(ii) is similar.
(vi)⇒(iii): By 6.1, $\mathcal{R}(A)\subset\mathcal{R}(B)$. Thus for any $B^-$, $AB^-B = A$. Now $AB^-(B-A) = 0$ implies that $AB^-A = A$, and thus (iii) holds.
(iv)⇒(vii): We must only show that $\mathcal{C}(A)\subset\mathcal{C}(B)$. Let $y\in\mathcal{C}(A)$, so that $y = Ax$ for some $x$. Then $y = AA^-Ax = BA^-Ax\in\mathcal{C}(B)$.
(vii)⇒(iii): Since $\mathcal{C}(A)\subset\mathcal{C}(B)$, we have $BB^-A = A$. Now for any $B^-$,
$$AB^-A = AA^-AB^-A = AA^-BB^-A = AA^-A = A.$$
The proofs of (iv)⇒(viii) and of (viii)⇒(iii) are similar. We have already seen that (i)⇒(ix),(x).
(ix)⇒(ii): Suppose (6.4) holds. Then (6.5) is true for some $y$, $z$. Now
$$x = Ay = A\tilde{B}^-Ay = A\tilde{B}^-(B-A)z = A\tilde{B}^-Bz - A\tilde{B}^-Az = A\hat{B}^-B\tilde{B}^-Bz - Az = A\hat{B}^-Bz - Az = Az - Az = 0.$$
Thus $\mathcal{C}(A)\cap\mathcal{C}(B-A) = \{0\}$. Similarly it can be shown that the row spaces of $A$ and $B-A$ are also virtually disjoint.
(x)⇒(xi): Set $K = AB^-$, $L = B^-A$ for some $B^-$. Then $A = KB = BL$ by (x), and $L^2 = B^-AB^-A = B^-A = L$, so $L$ is idempotent.
(xi)⇒(x): Suppose $L$ is idempotent. For any $B^-$ we have
$$AB^-B = KBB^-B = KB = A, \qquad BB^-A = BB^-BL = BL = A,$$
and
$$AB^-A = KBB^-BL = KBL = AL = BL\cdot L = BL = A.$$
If $K$ is idempotent then the proof is similar.
We have already seen that (i)⇒(xii),(xiii).
(xii)⇒(i): Since $BB^-$, $AB^-$, $(B-A)B^-$ are idempotent,
$$\mathrm{rank}\,B = \mathrm{trace}\,BB^- = \mathrm{trace}\,AB^- + \mathrm{trace}(B-A)B^- = \mathrm{rank}\,AB^- + \mathrm{rank}(B-A)B^- = \mathrm{rank}\,A + \mathrm{rank}(B-A).$$
The proof of (xiv)⇒(i) is easy. We now prove (ii)⇒(xiv). We may assume, without loss of generality, that
$$A = \begin{bmatrix}I_r & 0\\ 0 & 0\end{bmatrix},$$
where $r$ is the rank of $A$. Let $\mathrm{rank}(B-A) = s$ and let
$$B - A = \begin{bmatrix}U_1\\ U_2\end{bmatrix}[V_1\ \ V_2]$$
be a rank factorization of $B-A$, where $U_1$ is $r\times s$ and $V_1$ is $s\times r$. We first claim that $\mathrm{rank}\,U_2 = s$. Suppose $\mathrm{rank}\,U_2 < s$. Then the null space of $U_2$ contains a nonzero vector, say $x$. If $U_1x\neq 0$, then
$$\begin{bmatrix}U_1x\\ 0\end{bmatrix}$$
is a nonzero vector in $\mathcal{C}(A)\cap\mathcal{C}(B-A)$, which is not possible. Thus $U_1x = 0$. Then $Ux = 0$, where $U = \begin{bmatrix}U_1\\ U_2\end{bmatrix}$, which contradicts the fact that $U$ has full column rank. Thus the claim is proved. It follows that $U_1 = MU_2$ for some $M$. Similarly, $V_1 = V_2N$ for some $N$.
The treatment of cases (xv), (xvi) is similar to that of (iv), (v) respectively.
We have thus proved the following implications (or a hint is provided towards the proof). It is left to the reader to verify that all the other implications then follow.
• (i)⇒(iii)–(vi),(ix),(x),(xii),(xiii),(xv),(xvi)
• (iv)⇒(ii),(vii),(viii); (v)⇒(ii); (vi)⇒(iii)
• (vii)⇒(iii); (viii)⇒(iii); (ix)⇒(ii)
• (xiii)⇒(xii); (xiv)⇒(i); (xv)⇒(vii); (xvi)⇒(ii).