02-15: Vector Spaces: Null Spaces and Columnspaces #

Slides
Notes 7 8

Important Jargon #

Rank a matrix .$A$ is the number of linearly independent columns
Nullspace of a matrix is the set of solutions to .$A \vec x = 0$
A vector space is a set of vectors connected by two operators: .$+, \times$ — page 48
A vector subspace is a subset of vectors that have “nice properties” — page 50
A basis for a vector space is a minimum set of vectors needed to represent all vectors in the space
Dimension of a vector space is the number of basis vectors
Column space is the span (range) of the columns of a matrix
Row space is the span of the rows of a matrix

Vector Spaces #

A vector space .$\mathbb{V}$ is a set of vectors and two operators .$+, \cdot$ that satisfy:

Vector Addition

Associative: .$\vec u + (\vec v + \vec w) = (\vec u + \vec v) + \vec w$
Commutative: .$\vec u + \vec v = \vec v + \vec u$
Additive Identity: There exists an additive identity .$\vec 0 \in \mathbb{V}$ such that .$\vec v + \vec 0 = \vec v$
Additive Inverse: There exists .$- \vec v \in \mathbb{V}$ such that .$\vec v + (-\vec v) = \vec 0$. We call .$-\vec v$ the additive inverse of .$\vec v$.
Closure under vector addition: The sum .$\vec v + \vec u$ must also be in .$\mathbb{V}$

Scalar Multiplication

Associative: .$\vec \alpha(\beta \vec v) = (\alpha \beta) \vec v$
Multiplicative Identity: There exists .$1 \in \mathbb{R}$ where .$1 \cdot \vec v = \vec v$
Distributive in vector addition: .$\alpha (\vec u + \vec v) = \alpha \vec u + \alpha \vec v$
Distributive in scalar addition: .$(\alpha + \beta)\vec v = \alpha \vec v + \beta \vec v$
Closure under scalar multiplication: The product .$\alpha \vec v$ must also be in .$\mathbb{V}$.

... for any .$\vec v, \vec u, \vec w \in \mathbb{V}; \alpha, \beta \in \mathbb{R}$

These can be grouped by axioms of closure, addition, and scaling shown on slide 10
For example .$ \mathbb{R}^{n}$ is the vector space of all .$n$-dimensional vectors.
- In fact, the set of all matrices the same size is also a vector space .$ \mathbb{R}^{n \times o}$ since it fulfills all of the properties above as well
- In this class we will generally only deal with vector spaces containing vectors in .$\mathbb{R}^{n}$.

Subspaces #

A subspace .$\mathbb{U}$ consists of a subset of .$\mathbb{V}$ in vector space (.$\mathbb{V}, \mathbb{F}, +, \cdot$). .$\mathbb{U} \subset \mathbb{V}$ and have 3 properties
1. Contains .$\vec 0 $, i.e., .$\vec 0 \in \mathbb{U}$
2. Closed under vector addition: .$\vec v_1, \vec v_2 \in \mathbb{U} \Longrightarrow \vec v_1 + \vec v_2 \in \mathbb{U}$
3. Closed under scalar multiplication: .$\vec v \in \mathbb{U}, \alpha \in \mathbb{F} \Longrightarrow \alpha \vec v \in \mathbb{U}$
Examples on slide 13
Intuitively, a subspace is a closed subset of all the vectors in .$ \mathbb{V}$.
- Any linear combination of vectors in the subspace must also lie in that subspace.
Just as basis and dimension are defined for vector spaces, they have equivalent definitions for subspaces.
- Basis for a Subspace: set of linearly independent vectors that span the subspace (minimal set of subspace-spanning vectors)
- Subspace Dimension: number of vectors in subspace-basis

Basis #

Basis: Given a vector space .$\mathbb{V}$, a set of vectors .$\{\vec v_1, \dots \vec v_n\}$ is a basis of the vector space if it satisfies the following properties:
1. .$\vec v_1, \dots, \vec v_n$ are linearly independent vectors
2. .$\text{span}(\{\vec v_1, \dots, \vec v_n\}) = \mathbb{V} \Longrightarrow \forall \vec v \in \mathbb{V}, \exists \alpha_1, \dots, \alpha_{n-1} \in \mathbb{R}$ such that .$\vec v_1 = \alpha_1 \vec v_2 + \dots \alpha_{n-1} \vec v_n$
Minimum set of vectors that spans a vector space
A basis of a vector space is the minimum set of vectors needed to represent all vectors in the vector space.
- If a set of vectors is linearly dependent and “spans” the vector space, it is still not a basis – we can remove at least one vector from the set and the resulting set will still span the vector space

Basis is not unique #

Intuitively, think about multiplying one of the vectors in a given basis by a nonzero scalar will not affect the linear independence or span of the vectors.
We could alternatively construct another basis by replacing one of the vectors with the sum of itself and any other vector in the set.
Mathematically, suppose that .$\{\vec v_1, \dots, \vec v_n \}$ is a basis for the vector space we are considering.
- Thus .$\{\alpha \vec v_1, \dots, \vec v_n \}$ where .$\alpha \neq 0$ is also a basis because, just as we’ve seen in Gaussian elimination row operations, multiplying a row by a nonzero constant does not change the linear independence or dependence of the rows.
  - We can generalize this to say that multiplying a vector by a nonzero scalar also does not change the linear independence of the set of vectors.
- In addition, we know that .$\text{span}(\{ \vec v_1, \dots, \vec v_n \}) = \text{span}( \{\alpha \vec v_1, \dots, \vec v_n \} )$
  - Any vector in .$\text{span}(\{ \vec v_1, \dots, \vec v_n \})$ can be created as a linear combination of the set .$\text{span}(\{ \alpha \vec v_1, \dots, \vec v_n \})$ by dividing the scale factor on .$\vec v_1$ by .$\alpha$.
  - We can use a similar argument to show that .$\{\alpha \vec v_1, \dots, \vec v_n \}$ is also a basis for the same vector space.
To generalize, for .$\mathbb{R}^{N}$, any .$N$ (and only .$N$) linearly independent vectors form a basis

Dimension #

Dimension: The dimension of a vector space is the number of basis vectors.
Since each basis vector can be scaled by one coefficient, the dimension of a space as the fewest number of parameters needed to describe an element or member of that space.
The dimension can also be thought of as the degrees of freedom of your space – that is, the number of parameters that can be varied when describing a member of that space.
A vector space can have many bases, but each basis must have the same number of vectors:
- Suppose a basis for the vector space we’re considering has .$n$ vectors. This means that the minimum number of vectors we can use to represent all vectors in the vector space is .$n$, because the vectors in the basis would not be linearly independent if the vector space could be represented with fewer vectors.
- Then we can show that any set with less than .$n$ vectors cannot be a basis because it does not have enough vectors to span the vector space – there would be some vectors in the vector space that cannot be expressed as a linear combination of the vectors in the set.
- In addition, we can show that any set with more than .$n$ vectors must be linearly dependent and therefore cannot be a basis.
- Combining the two arguments, we have that any other set of vectors that forms a basis for the vector space must have exactly .$n$ vectors!

Column Space #

The range/span/column space of matrix .$A \in \mathbb{R}^{m \times n}$ – which we can represent as a set of vectors .$\{ \vec a_1, \dots \vec a_n \}$ – is a set of all possible linear combinations: $$\text{span}\big(\{\vec a_1, \dots, \vec a_n\}\big) = \Bigg\{\sum_{i=1}^N \alpha_i \vec a_i\ |\ \alpha_1, \dots, \alpha_n \in \mathbb{R} \Bigg\} = \big\{A \vec x =\ \vec x \in \mathbb{R}^{n}\big\}$$
- That is, the column space of a matrix .$A \in \mathbb{R}^{m \times n}$ is the span of the .$n$ columns in .$A$
- It’s the space of all outputs that the operator can map to.
Thinking about .$A$ as a linear transformation from .$ \mathbb{R}^{n} \to \mathbb{R}^{m}$, the column space is effectively the set of all outputs that this matrix can transform input vectors to
Note that in the general case, input vectors and output vectors can be different lengths
- The column space describes all possible output vectors .$\vec b = \mathbb{R}^{m \times 1}$
- It can be shown that .$\text{span}(A)$ forms a subspace of .$ \mathbb{R}^{m}$
  - Note that .$\text{span}(A)$ is not necessarily .$ \mathbb{R}^{m}$

Row Space #

Similarly, the row space is the span of the .$n$ rows

Rank #

The rank of .$A$ is defined as the dimension of the column space of .$A \in \mathbb{R}^{m \times n}$
- .$\text{rank}(A) = \text{dim(span(A))}$
  - .$\text{dim(span}(A\text{)) ≡ dim(col(}A))$
  - It’s all too easy to confuse an actual space consisting of vectors, like a matrix range describing the output (column) space, with the dimension of that space, which is just a single scalar number. Keep them straight!
- .$ \text{rank}(A) = \text{dim(span}(A)) \leq \text{min}(m, n)$
  - This is at most .$m$, but certainly can be less, since an arbitrary .$A \in \mathbb{R}^{m \times n}$ is not guaranteed to have columns whose span is all of .$ \mathbb{R}^{m}$
  - Consider the simple counterexample of the zero matrix .$\vec 0 \in \mathbb{R}^{m \times n}$, which maps all .$n$-dimensional input vectors to the .$m$-dimensional all-zero vector.
In general, using the column-wise representation of matrix-vector multiplication we can show that .$\text{rank(}A)$ is the number of linearly independent columns in .$A$.
- Any output vector can be represented as a linear combination of the columns of .$A$.
- But some of these columns might themselves be linear combinations of other columns, which means we can replace any redundant column with a weighted sum of the other columns.
- By removing all redundancies, we find that a matrix with .$k \leq \text{min}(n, m)$ linearly independent column vectors can “unlock” exactly .$k$ dimensions in the output.
Thus, we find that .$\text{rank}(A)$ also equals the number of pivots in the RREF of .$A$.
- Since each pivot must belong to a row and a column, the number of pivots in .$A \in \mathbb{R}^{m \times n}$ is limited by the smaller dimension.
- For a tall matrix .$m > n$, the columns are the limiting dimension; for a wide matrix .$n > m$ the rows are.

Null Space #

The null-space of .$ \mathbb{R}^{m \times n}$ is the set of all vectors .$\vec x \in \mathbb{R}^{m}$ such that .$A\vec x = 0$ $$\text{null}(A) = \big\{\vec x\ |\ A \vec x = \vec 0, \vec x \in \mathbb{R}^{m}\big\}$$
- That is, the set of all inputs that get mapped to .$\vec 0$ by .$A$
$\text{dim(null}(A))$ can be interpreted as the number of input directions for which the output is “compressed” down to zero.
- We know that it can be at most .$m$, since all of the input vectors have .$m$ components.
- It’s the set of vectors not in columns space, that is, the number of linearly dependent columns: $$m - \text{dim(span}(A)) = \text{dim(null}(A))$$
  - The loss of dimensionality from the input space to the output space shows up in the nullspace.
.$\vec 0$ is always in the null space — trivial Null space
- This wouldn’t hold if we had affine (instead of linear) functions
Null space DNE when the determinant is not zero – see last week

Procedure to Compute a Null-Space #

Computing the nullspace of .$A$ requires us to solve .$A \vec x = \vec 0$ – the procedure is as follows:
1. Put .$A$ in RREF. Initialize the set .$\mathbb{S} = \{ \vec 0 \}$.
2. Check each column for leading entries and find the number of .$F$ree and .$B$asic variables.
3. if .$F = 0$, stop and skip to the last step.
4. if .$F \neq 0$, repeat the following for each free variable:
  - Set that free variable to .$1$, and all others to zero.
  - Solve .$A \vec x$ under these conditions; add the solution vector to .$\mathbb{S}$.
5. Conclude that .$\text{null}(A) = \text{span}(\mathbb{S})$.
Example is given on page 37-38

Rank-Nullity Theorem #

How is the number of free variables related to the total number of columns in a matrix .$A \in \mathbb{R}^{m \times n}$? Well, each column of a matrix either contributes a “new direction” to the output or it is redundant with other columns and their already-discovered directions. In other words, each of .$n$ columns adds a dimension to .$\text{span}(A)$ or to .$\text{null}(A)$. Therefore, the following holds: $$\text{dim(span}(A)) + \text{dim(null}(A)) = n$$ $$\text{rank}(A) + \text{dim(null}(A)) = n$$

02-17: Eigenvectors, values #

Eigenvectors and Eigenvalues #

Consider a square matrix .$ A \in \mathbb{R}^{n \times n}$. An eigenvector of .$A$ is a nonzero vector .$\vec x \in \mathbb{R}^{n}$ such that $$A \vec x = \lambda \vec x$$ where .$\lambda$ is a scalar value, called the eigenvalue of .$\vec x$.

That is, an eigenvector represents a sort of stability point: vectors aligned with an eigenvector will not change direction under a linear transformation .$A$
- Rather, they will simply be scaled by some factor.
- Note that eigenvectors are a property of the matrix itself and do not depend on the specific vector being transformed (input)
The eigenvalue describes this stretching or compressing factor for vectors aligned with an eigenvector
- This means any vector that’s ‘some’ multiple of the eigenvector, when it’s transformed by .$A$, will become a scaled version of itself that’s a ‘some’ multiple of the eigenvalue
- Example on page 45
Geometrically, an eigenvector, corresponding to a real nonzero eigenvalue, points in a direction in which it is stretched by the transformation and the eigenvalue is the factor by which it is stretched
Because these two terms are so commonly used in conjunction, we often refer to an eigen(value/vector) pair
- Note that scaling a given eigenvector for an eigenvalue will still produce a valid eigenvector, since the vector’s direction will not be changed
Given non-invertible .$A \in \mathbb{R}^{n \times n}$ there are at least .$1$ and at most .$n$ eigenvalues
- Given non-invertibility, some .$\lambda_i = 0$, so we have at least 1 eigenvalue.
  - Indeed, all eigenvalues can be .$0$ (such as for .$\vec 0$)!
- However, non-invertibility does not place any other restrictions on our set of eigenvalues, so all other .$n-1$ eigenvalues can be distinct from this .$\lambda_i$
Properties to know
1. A matrix is uninvertible iff .$0$ is an eigenvalue (because there exists a vector .$\vec v$ such that .$A \vec v = \vec 0$.
2. A scalar times an eigenvector is still an eigenvector: .$(A(c \vec v) = c A \vec v = c \vec v = (c \vec v))$
3. Eigenvectors with distinct eigenvalues are linearly independent (eigenvectors in the same span have the same eigenvalue)
4. .$A^{-1} \vec v = \lambda^{-1} \vec v$
  - .$A \vec v = \lambda \vec v \Longrightarrow A^{-1} A \vec v = A^{-1} \lambda \vec v \Longrightarrow \vec v = \lambda A^{-1} \vec v \Longrightarrow \lambda^{-1} \vec v = A^{-1} \vec v$

Determinants #

The determinant is a quantity we can define for any square matrix
- The determinant is nonzero if and only if the matrix is invertible and the linear map represented by the matrix is an isomorphism
For a .$2 \times 2$ matrix, the formula is:
- The absolute value of .$ad − bc$ is the area of the parallelogram, and thus represents the scale factor by which areas are transformed by .$A$.
$$\text{det}\Bigg( \begin{bmatrix} a & b \\ c & d \\ \end{bmatrix}\Bigg) = ad - bc$$
- If linearly dependent, some vectors will lie on top of each other so the area will be zero
- Similarly in 3D, if any column vectors are multiples of each other, we “squash” a volume into a plane or a line.
- That is, determinant of any square matrix is zero if the columns are linearly dependent.
For a .$3 \times 3$ matrix, the shape is a parallelipiped, and we form hyper-volumes in higher dimensions.
- We can calculate determinants for higher dimension matrices as well, as described here
The determinant of the transpose of .$A$ equals the determinant of .$A$: .$\text{det}(A) = \text{det}(A^\text{T})$
- ‘Proving’ this geometrically on a whiteboard is fun, try out the .$2 \times 2$ case
- Therefore, if .$A^\text{T}$ has an eigenvalue .$\lambda$, then .$A$ also has the eigenvalue .$\lambda$ because .$\text{det}(A - \lambda I) = \text{det}(A^\text{T} - \lambda I)$
- This implies that in all the properties mentioned above, the word “column” can be replaced by “row” throughout
  - For example, viewing an .$n \times n$ matrix as being composed of .$n$ rows, the determinant is an .$n$-linear function.
- See this article which relies on the idea of elementary matrices (not covered) or this more advanced, out-of-scope stackoverflow post that deals with 16B-level topics (and beyond)

Computing Eigenvalues and Eigenvectors #

Solving this equation for nonzero (nontrivial) solutions .$\vec x$ will yield our eigenvectors:
$$A \vec x = \lambda \vec x \label{a}\tag{1}$$
$$\Longrightarrow (A - \lambda I_n) \vec x = \vec 0_n$$
$$\Longrightarrow A’ \vec x = \vec 0_n$$
- Note that .$A$ and .$I_n$ are fixed and only 1 parameter here that can vary in this equation: .$\lambda$

We want to find values of .$\lambda$ such that .$A’ = (A − \lambda I_n)$ has linearly dependent columns
That is, we want to find values of .$\lambda$ that cause the determinant of .$A’$ to become zero
- Linear dependence of the columns creates a nontrivial null-space for .$A'$
- .$N$th order characteristic polynomial with .$N$ solutions
Work-through on page 6

$$A’ = \begin{bmatrix} a_{11} - \lambda & a_{12} & … & a_{1n}\\ a_{21} & a_{22} - \lambda & … & \vdots \\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & \dots & … & a_{nn} - \lambda \end{bmatrix}$$

For the .$2 \times 2$ case determinant is $$\lambda^2 - (a+d)\lambda + (ad - bc) = 0$$
- Solving this quadratic equation, we can find the .$\lambda$ values.
- Then, for each .$\lambda_i$ we find, we revisit .$\eqref{a}$ and solve for the corresponding .$\vec x_i$
- The polynomial on the left hand side of the above equation is known as the characteristic polynomial for the matrix .$A$.
If a matrix has repeated eigenvalues, they may (or may not) have distinct eigenvectors.
- In general, multiplicity of eigenvalues will result in the same multidimensional eigenspace
- (Aside) …except if the matrix is defective
  - An .$n \times n$ matrix is defective iff it does not have .$n$ linearly independent eigenvectors
  - That is, a defective matrix always has fewer than .$n$ distinct eigenvalues, since distinct eigenvalues always have linearly independent eigenvectors
  - In which case the rank is decreased

Eigenspace #

If the eigenvectors are distinct, they form an eigenspace $$E_\lambda = \text{null}(A - \lambda I_n) = \text{null}(A’)$$

That is, null space of some matrix is the set of all vectors that satisfy .$\eqref{a}$ OR all of the eigenvectors that correspond to some eigenvalue OR the eigenspace that corresponds to the eigenvalue
Exactly like the concept of span; the eigenspace is a subspace, the span of all eigenvectors for that eigenvalue (including .$\vec 0$)
Any input vectors that lie in this space (for 2 distinct eigenvectors, a plane; for 1, a line) will be scaled by the shared eigenvalue under a linear transformation.
Eigenspace is a subspace, which means it’s closed under scalar multiplication
- Thus, if two vectors are related by a scalar, they must both lie in the same eigenspace of .$\lambda_i$: .$E_{\lambda_i}$
  - That is, the eigenvectors that make up some eigenspace aren’t unique
- But, every eigenvector can only correspond to one eigenvalue
  - If for some .$A$, where .$A \vec x_1 = \lambda_1 \vec x_1$ and .$A \vec x_1 = \lambda_2 \vec x_1$ then .$\lambda_1 = \lambda_2$
  - Thinking about a physical diagram may help clarify; .$\lambda_1 \neq \lambda_2$ would require some single initial state or vector aligned with .$\vec x_1$ to be scaled by two different values upon being transformed by .$A$
    - This cannot happen, so the two eigenvalues cannot be distinct
Aside: Complex eigenvalues can exist as well; they are much harder to visually understand, but mathematically, we find them using the exact same process as before.

Mean-product formula #

Mean-product formula is a nicer way of solving this, versus finding the roots of the polynomial $$m \pm \sqrt{m^2 - p}$$
- $m$ is the mean of the trace, which is the same as the mean of the eigenvalues
- $p$ is the product of the eigenvalues, which is the determinant

Theorems #

Theorem 9.1 #

Given two eigenvectors .$\vec v_1$ and .$\vec v_2$ corresponding to two different eigenvalues .$\lambda_1$ and .$\lambda_2$ of a matrix .$A$, it is always the case that .$\vec v_1$ and .$\vec v_2$ are linearly independent.

Proof on page 9

Theorem 9.2 #

Let .$\vec v_1,\vec v_2, \dots , \vec v_m$ be eigenvectors of an .$n \times n$ matrix with distinct eigenvalues. It is the case that all the .$\vec v_i$ are linearly independent from one another.

Proof on page 10

Proposition 1 #

If an square .$n \times n$ matrix .$A$ isn’t invertible, then it has some eigenvalue .$\lambda_i = 0$

If a matrix is not invertible, then the dimension of its null space isn’t necessarily greater than 0 because there must be a linearly dependent row or column.
If this is true, then there’s a non-zero vector .$\vec x$ such that .$A \vec x = 0 \vec x$

If the matrix is not invertible, it has a nontrivial null-space.
Then, by definition, there is some nonzero .$\vec x$ for which .$A \vec x = 0 \vec x = \vec 0_n$.
We pattern match to .$\eqref{a}$ and notice the equation is exactly the same if we multiply the right by .$\vec x: A \vec x = 0 \vec x$.
- This is kind of like pulling a scalar .$0$ out of .$\vec 0$, leaving .$0 \vec x$.
Now, we clearly see .$\lambda = 0$

Proposition 2 #

For an invertible .$A$ with some eigenvalue .$\lambda$, .$A^{−1}$ has eigenvalue .$ \frac{1}{\lambda}$

We can start at .$\eqref{a}$: left-multiply both sides by .$A^{−1}$ to get .$\vec v = A^{-1} \lambda \vec v \Longrightarrow A^{-1} \vec v = \frac{1}{\lambda} \vec v$.
Pattern match to .$\eqref{a}$ again and we’ve shown the statement is true
Note: Given invertibility, all .$\lambda_i \neq 0$ so this is always true

States #

Steady States #

We know that a steady state .$\vec x^*$ of a transformation matrix .$A$ is defined to be such $$A \vec x^* = \vec x^* $$
- In other words, it is an element of the eigenspace of .$A$ corresponding to the eigenvalue .$\lambda = 1$.
- The above equation tells us that if we start at a steady state, then we will remain unaffected by the transformation matrix over time.
- Therefore, to solve for the steady-state of a system represented by .$P$, we solve .$\eqref{a}$, substituting .$\lambda = 1$
  - Note that this amounts to solving for the null-space of .$(P − I_n)$.
Great walk-through on page 53

Predicting Behavior for General Initial States #

Given a system and an initial state, can we predict how it’ll dynamically change over time? We saw in the Page Rank example that we seem to approach a sort of steady-state after many timesteps, but under what conditions does this happen?

Simpler Case: .$\vec x (0) = \alpha \vec v$ #

Suppose our initial state is actually a perfect multiple of an eigenvector of the system.
Over time, upon repeated applications of .$A,$ we accumulate factors of .$\lambda$; ultimately, .$A^n \vec x = \alpha (\lambda^n \vec x)$ – as we can see derived to the right

$$\vec x (0) = \alpha \vec v$$ $$\vec x (1) = A\vec x (0) = \alpha \lambda \vec v$$ $$\vec x (2) = A\vec x (1) = \alpha \lambda^2 \vec v$$ $$\vdots$$ $$\vec x (n) = A\vec x (n-1) = \alpha \lambda^n \vec v$$

Based on this pattern, we notice the following behaviors based on the value .$\lambda$ as .$n \to \infty$:
- .$\lambda > 1: \vec x (n) \to \infty$ – Diverge, exponential growth.
- .$\lambda = 1: \vec x (n) \to k\vec v$ – Converge, steady-state.
- .$0 < \lambda < 1: \vec x (n) \to \vec 0$ – Converge, exponential decay.
- .$\lambda = 0: \vec x (n) = 0 \vec v = \vec 0$ – Converge(?), instantaneous disappearance.
- .$\lambda < 0$: Take .$|\lambda|$ and refer to the appropriate case above. But recognize the sign switches at each timestep

General Case: .$x(0) = \alpha_1 \vec v_1 + \alpha_2 \vec v_2 + . . . + \alpha_n \vec v_n$ #

This form says that the initial state is now not a scalar multiple of just one eigenvectors, it’s a linear combination of all of them
- Note that this is still not fully general; we assume here that all initial states are in the span of the eigenvectors of .$A$, which isn’t guaranteed.
But this case devolves into the previous one; we can simply treat each element individually, apply the techniques from the Simpler Case, and put them back together. The final form is as follows: $$\vec x (n) = \alpha_1 (\lambda_1^n \vec v_1) + \alpha_2 (\lambda_2^n \vec v_2) + . . . + \alpha_n (\lambda_n^n \vec v_n) \label{b}\tag{2}$$
Given a matrix .$A$ and some initial state .$\vec x$, how can we actually get to this equation format?
- First, we solve for the .$(\vec v_i, \lambda_i)$ pairs of .$A$
- Then, we use Gaussian elimination to find the .$\alpha_i$’s; Putting eq. .$ \eqref{b}$ in matrix form yields:
- …and we compute the inverse of the matrix of eigenvectors, arriving at:
$$\vec x(0) = \begin{bmatrix} | & | & & | \\ \vec v_1 & \vec v_2 & \dots & \vec v_n\\ | & | & & | \\ \end{bmatrix}\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \\ \end{bmatrix}$$
$$\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \\ \end{bmatrix} = \begin{bmatrix} | & | & & | \\ \vec v_1 & \vec v_2 & \dots & \vec v_n\\ | & | & & | \\ \end{bmatrix}^{-1} \vec x (0)$$
- Let’s assume that we drop any terms corresponding to .$\alpha_i = 0$ since those terms are, well, zero. Then, with what remains, we can make some intuitive observations:
  - .$|\lambda_i | > 1: \vec x (n) \to \infty$ – Diverge, even if other components in the linear combination decay, the state itself “blows up” to .$\infty$ as this component overshadows all others.
  - .$|\lambda_i| = −1: \vec x (n) \to\ ?$ – Diverge because that component oscillates forever.
  - .$−1 < \lambda_i \leq 1: \vec x (n) \to \vec x^*$ – Converge, that is, steady-state! Each .$i$th term either decays to zero (.$|\lambda_i| \leq 1$) or stays the same (.$|\lambda_i| = 1$), such that .$\vec x^* = \sum_{i,\lambda_i=1} \alpha_i \vec v_i$.
    - We can normalize this steady-state if we want proportions (column values sum to 1) rather than absolute numbers

Some Useful Information #

If a matrix has .$n$ distinct real eigenvalues, their .$n$ associated eigenvectors are all linearly independent.
An eigenspace for a given eigenvalue is the span of all eigenvectors, including .$\vec 0$, and is also a subspace by definition.
Say we calculate an eigenvector for an eigenvalue; we can pick any scalar multiple of the result and this will still be an eigenvector, since scaling a vector does not change its direction. This follows from the scalar multiplication property of subspaces.
A given eigenvector can only be associated with one eigenvalue, since a vector can only be scaled by some single value upon being transformed by a matrix. But, a eigenvalue can be associated with multiple eigenvectors, the span of which form an eigenspace.
If a matrix has some .$\lambda = 0$, then for some .$\vec x, A\vec x = \lambda\vec x =\vec 0$, so .$A$ has a nontrivial null-space. Therefore, it is not invertible.
If a matrix has some .$\lambda = 1$, then any initial state that is aligned with the corresponding eigenvector is a steady-state. More generally, any initial state for which .$\lambda = 1$ comprises part of the linear combination potentially has a nonzero steady-state, so long as all other .$|\lambda_i| < 1$.

The rotation matrix (that rotates any vector by .$\theta$ degrees counterclockwise) is:

$$A(\theta) = A_R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \\ \end{bmatrix}$$

Notes can be found as interactive webpage at