
(1)

Numerical Linear Algebra

CS 740

Instructor: Ajit Rajwade

(2)

What is numerical linear algebra?

• Branch of mathematics dealing with operations on vectors and matrices.

• Vast applications in: signal and image processing, computer graphics, computer vision, statistics, machine learning and data mining, economics and finance, and many other fields.

(3)

What is a matrix?

• A matrix is a 2D array of size m x n containing values defined over a “field”.

• A “field” is a set of numbers – the most common being the real numbers or the complex numbers.

Examples:

Square matrix: m = n

(4)

Row and Column Vectors

• A matrix having only one row is called a row vector.

• A matrix having only one column is called a column vector.

(5)

Some notation

• The space of n-dimensional real vectors is denoted R^n.

• The space of n-dimensional complex vectors is denoted C^n.

• If A is a real/complex matrix of size m x n, we write A ∈ R^(m x n) or A ∈ C^(m x n).

(6)

Some notation

• By convention, scalars are denoted in non-bold italic font.

• Vectors and matrices are denoted in bold font, with matrices in upper case and vectors in lower case.

(7)

Common types of matrices

Diagonal matrix: a matrix whose entries are all zero except (possibly) along the diagonal, i.e. a_ij = 0 for every entry where i ≠ j.

Identity matrix: square diagonal matrix with value 1 along the diagonal – commonly denoted I_(n x n).

Symmetric matrix: square matrix which is equal to its transpose.

We will see many more types.
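A minimal MATLAB sketch of these matrix types (the values are chosen here for illustration, not taken from the slides):

D = diag([4 5 6]); % diagonal matrix: all zeros except along the diagonal
I3 = eye(3); % 3 x 3 identity matrix
S = [1 2 3; 2 5 4; 3 4 6]; % symmetric: equal to its transpose
isequal(S, S') % returns true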

(8)

Common matrix operations

• Addition/subtraction

• Scalar multiplication

• Multiplication of a row vector with a matrix.

• Multiplication of a matrix with a column vector.

[The slide shows small numerical examples of each operation: a row vector times a matrix, and a matrix times a column vector.]
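A small MATLAB sketch of these operations, with values of my own choosing:

A = [1 3; 5 2; 6 4]; B = [0 1; 2 2; 1 0]; % two 3 x 2 matrices
C = A + B; % addition (same size required); subtraction is A - B
D = 3 * A; % scalar multiplication
r = [1 2 3]; % 1 x 3 row vector
rA = r * A; % (1 x 3)(3 x 2) gives a 1 x 2 row vector
v = [2; 1]; % 2 x 1 column vector
Av = A * v; % (3 x 2)(2 x 1) gives a 3 x 1 column vector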

(9)

Common matrix operations:

multiplication

• Multiplication of matrix A having size m1 x n1 with matrix B having size m2 x n2 (possible only when n1 = m2) produces a matrix of size m1 x n2.

Note: AB ≠ BA in general.

Matrix multiplication is non-commutative (it does not commute).

http://en.wikipedia.org/wiki/Matrix_%28mathematics%29

(10)

Common matrix operations:

multiplication

Matrix multiplication, while not commutative, is associative, i.e. A(BC) = (AB)C for any matrices A, B, C of appropriate dimension.

Multiplication of a row vector (1 x n matrix) and a column vector (n x 1 matrix) produces a scalar (a 1 x 1 matrix!). Such a multiplication is called the inner product (dot product) of two vectors.

Multiplication of a column vector (n x 1 matrix) and a row vector (1 x n matrix) produces an n x n matrix. Such a multiplication is called the outer product of two vectors!
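In MATLAB both products are just * applied to vectors of the right shapes (illustrative values):

u = [1; 2; 3]; v = [4; 5; 6]; % two column vectors
inner = u' * v; % (1 x 3)(3 x 1) = scalar: 1*4 + 2*5 + 3*6 = 32
outer = u * v'; % (3 x 1)(1 x 3) = 3 x 3 matrix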

(11)

Common matrix operations

Transpose of an m x n real matrix A is the matrix of size n x m – denoted A^T or A' – such that the rows of A^T are the columns of A, and the columns of A^T are the rows of A.

(12)

Common matrix operations

• For a complex matrix A, the transpose is defined slightly differently – it is the conjugate transpose, denoted A* or A^H, whose (i,j)-th entry is the complex conjugate of the (j,i)-th entry of A.

(13)

Common matrix operations

Trace of a square matrix = sum of values along the diagonal.

• Matrix multiplication is not commutative, but the trace of the product of two matrices does not depend upon the order of the matrices in the product: trace(AB) = trace(BA).

• Trace of a matrix equals that of its transpose.
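A quick MATLAB check of both trace properties (values chosen here for illustration):

A = [1 2; 3 4]; B = [5 6; 7 8];
trace(A*B) - trace(B*A) % 0, even though A*B and B*A differ
trace(A) - trace(A') % 0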

(14)

Common matrix operations

• Determinant of a square matrix is a scalar quantity defined recursively as follows:

det [a b; c d] = ad − bc

det(A) = Σ_{j=1}^{n} (−1)^(i+j) a_ij det(A_ij)   (cofactor expansion along any row i)

Here A is a matrix of size n x n; A_ij is the matrix of size (n−1) x (n−1) created by dropping the i-th row and j-th column of A; a_ij is the element from the i-th row and j-th column of A.
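A minimal MATLAB sketch of this recursive definition, expanding along the first row (a teaching sketch only – the built-in det() is far more efficient):

function d = mydet(A)
% Determinant via cofactor expansion along row 1.
n = size(A, 1);
if n == 1
    d = A(1,1);
else
    d = 0;
    for j = 1:n
        Aij = A(2:n, [1:j-1, j+1:n]); % drop row 1 and column j of A
        d = d + (-1)^(1+j) * A(1,j) * mydet(Aij);
    end
end
end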

(15)

Matrix Inverse

A square matrix A is called invertible or non-singular if there exists a matrix B such that AB = I, where I is the identity matrix.

Matrix B is called the inverse of A. Also, A is the inverse of B.

If a matrix has an inverse, it is unique (just as the reciprocal of a real number is unique).

If A has no inverse, it is said to be non-invertible or singular. Such a matrix has determinant 0.

An invertible matrix (and its inverse) have strictly non-zero determinants.

(16)

Rank of a square matrix

• Consider a matrix A of size n x n. Let ai be the i-th column vector of A.

• The set of n vectors {a1, a2,…, an} is said to be linearly dependent if:

Otherwise the vectors are said to be linearly independent.

α_1 a_1 + α_2 a_2 + … + α_n a_n = 0, where the scalars {α_i} are not all zero.

(17)

Rank of a square matrix

If the vectors are linearly dependent, then at least one of them can be expressed as a linear combination of the other vectors:

If the column vectors of a matrix are linearly dependent, then it can be shown that the row vectors will also be linearly dependent (and conversely), and the matrix will have determinant 0.

Σ_{i=1}^{n} α_i a_i = 0, α_k ≠ 0  ⟹  a_k = Σ_{i=1, i≠k}^{n} c_i a_i, where c_i = −α_i / α_k.

(18)

Rank of a square matrix

• The rank of a matrix is equal to the number of linearly independent column vectors (also equal to the number of linearly independent row vectors).

• If all n rows (equivalently all n columns) of a matrix of size n x n are linearly independent, then it is said to have full rank. Its determinant will be strictly non-zero, and hence it will be invertible.

(19)

Rank of a square matrix

• If the rank of a matrix of size n x n is less than n, it is said to be a rank-deficient (or low-rank) matrix. Such a matrix has determinant 0, and hence is non-invertible.

• Examples (rows separated by semicolons):

Non-invertible (rank-deficient): [1 1; 1 1], [1 2; 2 4], [1 4 5; 2 2 2; 3 3 3], [1 2 4; 2 1 5; 0 0 0]

Invertible (full rank): [1 0; 0 1], [1 2; 2 5], [1 2 4; 0 1 5; 0 0 1], [1 0 1; 2 2 2; 3 1 4]
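Assuming the examples as reconstructed above, MATLAB's rank and det confirm the claim for the two 3 x 3 cases:

A1 = [1 4 5; 2 2 2; 3 3 3]; % rows 2 and 3 are proportional
rank(A1) % 2: rank-deficient, det(A1) = 0, non-invertible
A2 = [1 0 1; 2 2 2; 3 1 4];
rank(A2) % 3: full rank, det(A2) = 2, so inv(A2) exists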

(20)

What about rank of a non-square matrix?

In this case, we define row-rank and column-rank.

Row-rank = number of linearly independent row vectors.

Column-rank = number of linearly independent column vectors.

It turns out that for any matrix, row-rank = column-rank.

An m x n matrix has full row-rank if its row-rank = m, and full column-rank if its column-rank = n.

(21)

Multiplying a matrix with a vector – Geometric meaning

• A point in 2D is represented as a column vector v with 2 elements.

• A 2 x 2 matrix M multiplied with a 2 x 1 column vector v induces a geometric transformation on the point underlying the vector. The transformed point is Mv.

• A shape (polygon) can be represented by an ordered sequence of k points – which can be represented as a 2 x k matrix Vk.

(22)

Multiplying a matrix with a vector – Geometric meaning

• Each point of the transformed shape is represented as a column of the matrix MVk.

• Example 1:

[−1 0; 0 1] [x; y] = [−x; y]   (reflection across the Y axis)

[1 0; 0 −1] [x; y] = [x; −y]   (reflection across the X axis)

These matrices are called reflection matrices.

(23)

Multiplying a matrix with a vector – Geometric meaning

• Example 2:

[a 0; 0 b] [x; y] = [ax; by],   a > 0, b > 0

This matrix is called a scaling matrix: scaling by a factor of ‘a’ along the X axis, and a factor of ‘b’ along the Y axis. If a > 1, it induces expansion along the X axis; if 0 < a < 1, it causes shrinking along the X axis. If b > 1, it induces expansion along the Y axis; if 0 < b < 1, it causes shrinking along the Y axis.

(24)

Multiplying a matrix with a vector – Geometric meaning

• Example 3:

[cos θ  −sin θ; sin θ  cos θ] [x; y] = [x cos θ − y sin θ; x sin θ + y cos θ]

This matrix is called a rotation matrix – more specifically, it is the rotation matrix for anti-clockwise rotation through angle θ about the origin. The rotation matrix for clockwise rotation through angle θ about the origin is given as follows:

[cos θ  sin θ; −sin θ  cos θ] [x; y] = [x cos θ + y sin θ; −x sin θ + y cos θ]
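A MATLAB sketch applying all three transformation types to the corners of the unit square, stored as columns of a 2 x 4 matrix (values chosen here for illustration):

V = [0 1 1 0; 0 0 1 1]; % corners of the unit square, one per column
theta = pi/6;
R = [cos(theta) -sin(theta); sin(theta) cos(theta)]; % anti-clockwise rotation
F = [-1 0; 0 1]; % reflection across the Y axis
S = [2 0; 0 0.5]; % scaling: expand along X, shrink along Y
rotated = R * V; reflected = F * V; scaled = S * V;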

(25)

Multiplying a matrix with a vector – Geometric meaning

• Any 2 x 2 matrix induces a geometric transformation – called a linear transformation.

• Rotation, scaling and reflection transformations are all special cases of a linear transformation.

• Why “linear” transformation?

(26)

Multiplying a matrix with a vector – Geometric meaning

• Any transformation of vectors – denoted T(.) – is said to be linear if: T(ax + by) = aT(x) + bT(y) for any vectors x and y, and any scalars a, b (for now, we assume all these are defined over the real numbers).

• Notice that multiplication by a matrix A satisfies this property:

A(ax + by) = aAx + bAy

(27)

Multiplying a matrix with a vector – Geometric meaning

• Consider the matrix M = [a b; c d].

• It can be viewed as a matrix that transforms the corners (0,0), (0,1), (1,1), (1,0) of the unit square into (0,0), (c,d), (a+c,b+d), (a,b) respectively (here points are treated as row vectors multiplying M from the left).

The vectors represented by a 2-by-2 matrix correspond to the sides of a unit square transformed into a parallelogram.

http://en.wikipedia.org/wiki/Matrix_%28mathematics%29

(28)

Multiplying a matrix with a vector – Geometric meaning

• What happens if the transformation matrix M is singular?

• It will shrink shapes into a single line, or even a single point. In other words, it will transform the original shape into one with area 0.

• There is an interesting geometric property of the determinant:

area(M V_k) = |det(M)| × area(V_k)

(29)

Multiplying a matrix with a vector – Geometric meaning

• Look at the rotation matrix. Rotation is an operation that does not change the area of a shape (it is called a rigid transformation).

• Notice that the determinant of the rotation matrix is 1.

• For a reflection matrix, the determinant is −1: notice that reflection flips the directions of the vectors!

(30)

Multiple geometric transformations

Multiple geometric transformations can be performed by sequential multiplication of their matrices with the original vector(s).

This is called composition of transformations.

Example:

[x'; y'] = [a 0; 0 b] [−1 0; 0 1] [cos θ  −sin θ; sin θ  cos θ] [x; y]

(first rotate through θ, then reflect across the Y axis, then scale by a along X and b along Y)

Matrix multiplication generally does not commute – hence changing the order of the matrices will change the effective geometric transformation!
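A MATLAB illustration that the order of composition matters (angle and scale factors chosen here):

theta = pi/4;
R = [cos(theta) -sin(theta); sin(theta) cos(theta)];
S = [2 0; 0 1];
v = [1; 0];
S * (R * v) % rotate first, then scale: approx [1.414; 0.707]
R * (S * v) % scale first, then rotate: approx [1.414; 1.414]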

(31)

Inverse geometric transformations

• Some geometric transformations are inverses of one another. Their corresponding transformation matrices also turn out to be inverses of each other!

• Example: anti-clockwise and clockwise rotation about the origin through angle θ:

[cos θ  −sin θ; sin θ  cos θ] [cos θ  sin θ; −sin θ  cos θ] = [1 0; 0 1]

• Any more examples?

(32)

More geometric transformations

• We have seen that 2 x 2 matrices represent geometric transformations in 2D space. Likewise, 3 x 3 matrices represent geometric transformations in 3D space.

• Bear in mind: this geometric interpretation is also valid for matrices of arbitrary dimension (> 3), but we do not pursue it further here! It is very useful in pattern recognition and statistics.

(33)

Systems of Linear Equations

• Equations of the following type arise in many areas of scientific computing:

a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2
...
a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = b_n

Unknowns are x_1, x_2, …, x_n – the coefficients {b_i} and {a_ij} are all known.

In matrix form: Ax = b, where

A = [a_11 a_12 ... a_1n; a_21 a_22 ... a_2n; ...; a_n1 a_n2 ... a_nn],  x = [x_1; x_2; ...; x_n],  b = [b_1; b_2; ...; b_n]

(34)

Systems of Linear Equations

• Equations of the following type arise in many areas of scientific computing:

Ax = b, in the matrix form shown on the previous slide.

Unknown: x (size n x 1).
Known: A (size n x n), b (size n x 1).

There are systems of linear equations where A has size m x n (m not equal to n), but we will take up those cases later!

(35)

Systems of Linear Equations

• Not all such systems of equations have a solution.

• A unique solution exists for every b if and only if A is invertible. In that case, the solution is given by x = A^(−1) b.

• If A is singular, then there may exist no solution or infinitely many solutions, depending on the vector b.


(37)

Systems of Linear Equations

• Example 1: Unique solution

[1 2; 3 4] [x; y] = [2; 4]

• Example 2: No solution

[1 2; 2 4] [x; y] = [2; 40]

• Example 3: Multiple solutions

[1 2; 2 4] [x; y] = [2; 4]   –   (x, (2 − x)/2) is a solution for all x ∈ R
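These three cases can be checked in MATLAB by comparing rank(A) with the rank of the augmented matrix [A b] (a standard consistency test, not stated on the slide):

A1 = [1 2; 3 4]; b1 = [2; 4];
x1 = A1 \ b1 % unique solution [0; 1], since det(A1) ~= 0
A2 = [1 2; 2 4]; b2 = [2; 40];
[rank(A2), rank([A2 b2])] % 1 vs 2: inconsistent, no solution
b3 = [2; 4];
[rank(A2), rank([A2 b3])] % both 1: infinitely many solutions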

(38)

Systems of Linear Equations:

Geometric Interpretation

Let’s stick to 2D (i.e. 2 unknowns) for now.

The system of equations has the form:

[a b; c d] [x; y] = [e; f]

Geometrically, we are dealing with two lines: ax + by = e and cx + dy = f.

If these lines intersect, they either do so in one and only one point, or the two lines are coincident.

First case: unique solution; second case: infinitely many solutions, all lying on a line.

(39)

Systems of Linear Equations:

Geometric Interpretation

• If these lines do not intersect, they will be parallel. In that case, there is no solution.

• Try to interpret the earlier three examples in this geometric context.

(40)

Systems of Linear Equations:

Geometric Interpretation

The same geometric treatment extends to 3D (i.e. 3 unknowns).

In this case, each of the 3 equations has the form ax+by+cz=d, which is the equation of a plane.

If the three planes intersect, they either do so in a point (unique solution), or all three planes are coincident (infinitely many solutions all lying on a plane), or they intersect in a line (infinitely many solutions all lying on a line).

If the 3 planes have no common point of intersection (for example, if two or more of them are parallel), there is no solution.

(41)

Systems of Linear Equations:

Geometric Interpretation

The same geometric treatment extends to n-D (i.e. n unknowns, n = 4 or more).

In this case, each of the n equations describes a hyper-plane.

If the n hyper-planes intersect, they may do so in a point (unique solution), or in an r-dimensional “flat” where 1 <= r <= n − 1 (infinitely many solutions).

If the n hyper-planes have no common point of intersection, there is no solution.

(42)

Homogeneous system of equations

• The system Ax = 0 is called a homogeneous system – a system in which the constant terms are all zero.

• Such a system always has at least one solution, namely x = 0 – often called the trivial solution.

• If A has full rank, the only solution is the trivial solution.

(43)

Homogeneous system of equations

• If A is singular, this system has infinitely many solutions.

• If u and v are two solutions to this system, then u + v and ku are also solutions (where k is a scalar).

• All such solutions constitute what is called the null-space or kernel of matrix A.

(44)

Homogeneous system of equations

Example: A = [1 0 0; 2 2 2; 3 3 3]. The null-space of this matrix consists of all points lying on the line x = 0, y + z = 0 (i.e. all multiples of [0; 1; −1]).

For a full-rank matrix, the null-space contains the vector 0 only. For a low-rank matrix, the null-space will contain other vectors as well.

How do we solve Ax = 0 for the non-zero vectors? We will study this later!
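In MATLAB, null() returns an orthonormal basis for the null-space; assuming the example matrix as reconstructed above:

A = [1 0 0; 2 2 2; 3 3 3];
z = null(A) % approx [0; 0.7071; -0.7071], spanning the line x = 0, y + z = 0
norm(A * z) % effectively zero (up to round-off)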

(45)

Range of a matrix

• All those vectors which can be expressed as linear combinations of the columns of A constitute the range (also called the column-space) of A.

• The column vectors are said to span the range of A.

• For a full-rank matrix A of size n x n, the range is the whole of R^n.

(46)

Example: Range

• Any vector with 3 elements can be expressed as a linear combination of the columns of:

• But some vectors cannot be expressed as linear combinations of the columns of:

[1 0 0; 0 1 0; 0 0 1]   (the identity matrix – first bullet)

[1 0 0; 0 1 0; 0 0 0]  or  [1 0 0; 2 2 2; 3 3 3]   (rank-deficient matrices – second bullet)

(47)

Recap: singular matrices

• An n x n matrix A is singular if any one of the following equivalent conditions is true:

1. Det(A) = 0
2. Rank(A) < n

3. Inverse of A does not exist

4. There exists some non-zero vector y such that Ay = 0.

(48)

System of linear equations with m≠n

• Generally, if m (number of equations) < n (number of unknowns), the system has infinitely many solutions. It is called an under-determined system.

• Generally, if m > n, the system has no solution. It is called an over-determined system.

(49)

System of linear equations with m≠n

• Example of under-determined system (2 equations, 3 unknowns):

[1 2 3; 1 3 2] [x; y; z] = [1; 2]

• Example of over-determined system (3 equations, 2 unknowns):

[1 1; 2 1; 3 2] [x; y] = [1; 2; 5]

(50)

Solving an over-determined system

Ax = b;  A is of size m x n (m > n), x is of size n x 1, b is of size m x 1.

Multiplying both sides by A^T:  A^T A x = A^T b   (A^T A is of size n x n)

x = (A^T A)^(−1) A^T b   (x is of size n x 1)

The inverse of A is not defined, as A is not a square matrix. But A^T A is a square matrix of size n x n, and it will have a well-defined inverse if A has full column-rank.

(A^T A)^(−1) A^T is called the pseudo-inverse of A and is often denoted A^+.

The x thus obtained is not a true solution – it will not exactly satisfy all (or any of!) the equations. The solution derivation is on the next slide.

(51)

Solving an over-determined system

Consider the residual vector r = Ax − b. Find x such that the magnitude of r, i.e. ||Ax − b||^2, is as small as possible.

||Ax − b||^2 = (Ax − b)^T (Ax − b)
            = x^T A^T A x − 2 x^T A^T b + b^T b

Taking the derivative w.r.t. x and setting it to 0 gives us:

2 A^T A x − 2 A^T b = 0
⟹ x = (A^T A)^(−1) A^T b, provided A has full column-rank.

This is called a least squares solution.

(52)

Solving an over-determined system

Over-determined systems can be solved in MATLAB using the following commands:

x = inv(A'*A)*A'*b; % slower, less accurate

x = A\b; % this is the back-slash operator and it implements the
% pseudo-inverse of A efficiently and accurately

% In case of either method, MATLAB will produce a warning if the
% matrix is low-rank or ill-conditioned (i.e. close to being a
% low-rank matrix)
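A small end-to-end example, reusing the over-determined system from slide 49:

A = [1 1; 2 1; 3 2]; b = [1; 2; 5]; % 3 equations, 2 unknowns
x1 = inv(A'*A) * A' * b; % normal equations
x2 = A \ b; % back-slash: same answer up to round-off
norm(A*x2 - b) % generally non-zero: no x satisfies all 3 equations exactly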

(53)

Application of over-determined systems

• Image alignment is a popular application in image processing.

• Consider two images – where one image is a linearly transformed version of the other.

http://www.d-type.com/images/main/s_dt_14.gif

(54)

Application of over-determined systems

• Let’s say I mark out some N salient points {(x1i, y1i)} in the left image, and their corresponding points {(x2i, y2i)} in the right image – see figure on the next slide.

• I can write the equation P2 = A P1, where P1 is a 2 x N matrix whose i-th column contains the x and y coordinates of the i-th point in the first set (likewise for P2). A is the unknown linear transformation.

• The solution for A:  A = P2 P1^T (P1 P1^T)^(−1)
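A hedged MATLAB sketch of this estimation on synthetic data (the point set and the ground-truth transform below are invented for illustration, not taken from the slides’ images):

N = 6;
P1 = rand(2, N); % reference points, one per column
A_true = [1.2 -0.3; 0.4 0.9]; % made-up ground-truth transform
P2 = A_true * P1 + 0.01 * randn(2, N); % observed points with small noise E
A_est = (P2 * P1') / (P1 * P1'); % least squares: A = P2 P1' inv(P1 P1')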

(55)

[Figure: the two images with the corresponding points marked.]

(56)

Application of over-determined systems

This is a least squares solution.

Note: In practice, the equation P2 = A P1 will never be satisfied exactly. We are humans after all, and there will always be some (usually small) error in marking the point positions in P2 (we will assume P1 contains no errors – it provides a reference set of points, after all!).

Hence P2 = A P1 + E is a more appropriate form of the equation, where E is a 2 x N matrix containing small values representing errors – often called noise.

The least squares solution is also applicable in such a situation, with some very mild restrictions on E that are usually satisfied in practice.

(57)

Application of over-determined systems

• What will happen if the points you marked out in any one of the images were coincident or collinear? Answer: if the points are coincident, or they lie on a line passing through the origin, P1 P1^T is singular and the system is under-determined. If the points lie on any other line, the system is not under-determined.

• If the points are non-collinear, what is the minimum number of points N so that the system of equations has a solution? Answer: N = 2, as these provide 4 equations (2 X coordinates, 2 Y coordinates) for the 4 unknowns in A.

(58)

Orthogonal Matrices

• An orthogonal matrix is a square matrix (over the real field) whose inverse is equal to its transpose:

Q Q^T = Q^T Q = I;  Q^T = Q^(−1)

• For square matrices over the complex field, the transpose is replaced by the conjugate transpose. Such a matrix is called unitary:

Q Q* = Q* Q = I;  Q* = Q^(−1)

(59)

Properties of orthogonal matrices

Orthogonal matrices have determinant +1 or −1 (rotation or reflection matrices respectively).

The magnitude of any row or column vector is 1.

The dot-product of any two different row vectors, or any two different column vectors, is 0.

Orthogonal transformations always preserve the magnitude of a vector, and the dot product of (hence also the angle between) two vectors:

||Qu||^2 = (Qu)^T (Qu) = u^T Q^T Q u = u^T u = ||u||^2

Qu · Qv = (Qu)^T (Qv) = u^T Q^T Q v = u^T v = u · v

The identity matrix is a simple example of an orthogonal matrix.

(60)

Applications of orthogonal matrices

• Discrete signals (represented as vectors with n elements) are often expressed as linear combinations of the columns of some orthogonal matrix Q:

x = Σ_{i=1}^{n} θ_i q_i = Q θ;   θ = Q^(−1) x = Q^T x,  i.e. θ_i = q_i^T x

where q_i is the i-th column of Q.

• Common examples of such orthogonal matrices are the Discrete Cosine Transform (DCT) matrix, the Discrete Fourier Transform (DFT) matrix and the Haar Wavelet matrix.
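A sketch of this representation in MATLAB, with an orthonormal DCT matrix built directly from the standard DCT-II formula (so no toolbox function is assumed):

n = 8;
Q = zeros(n);
for k = 0:n-1
    for t = 0:n-1
        Q(t+1, k+1) = cos(pi * (t + 0.5) * k / n); % k-th basis vector
    end
end
Q(:,1) = Q(:,1) / sqrt(n);
Q(:,2:end) = Q(:,2:end) * sqrt(2/n); % now Q' * Q = I
x = randn(n, 1); % an arbitrary test signal
theta = Q' * x; % coefficients theta_i = q_i' * x
norm(x - Q * theta) % effectively zero: x = Q * theta reconstructs x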

(61)

Applications of orthogonal matrices

The aforementioned representation is widely used in signal compression.

The DCT matrix is used for compressing audio (in MP3) and images (in JPEG).

Why? Because it is observed that only a few of the coefficients in the vector θ (for audio or images) have large magnitudes.

The rest have magnitudes equal to or close to 0 – and need not be stored.

This is a property of audio signals and natural images – and NOT of general vectors.

(62)

Applications of orthogonal matrices

• The audio signals or images reconstructed from just these large coefficients (setting all the rest to 0) are often indistinguishable from the original signals!

• So you need to store fewer coefficients!

• There is of course an error incurred in this process – but that error is not (or barely) noticeable to the eye/ear.

(63)

Original image: called Barbara – size 512 x 512

(64)

Reconstructed Barbara with the 40,000 largest DCT coefficients (out of 262,144 coefficients), i.e. just 15% of the coefficients. By DCT coefficients, we mean that the matrix Q is the DCT matrix.

(65)

Reconstructed Barbara with the 50,000 largest DCT coefficients (out of 262,144 coefficients), i.e. just 19% of the coefficients. By DCT coefficients, we mean that the matrix Q is the DCT matrix.

(66)

DCT coefficients of images/audio signals decay very rapidly. The top plot shows the magnitudes of the DCT coefficients in descending order of absolute value. The bottom plot shows the magnitudes of all except the 20 largest coefficients (in terms of absolute value).
