02-15: Vector Spaces: Null Spaces and Columnspaces #
Important Jargon #
- Rank a matrix .$A$ is the number of linearly independent columns
- Nullspace of a matrix is the set of solutions to .$A \vec x = 0$
- A vector space is a set of vectors connected by two operators: .$+, \times$ — page 48
- A vector subspace is a subset of vectors that have “nice properties” — page 50
- A basis for a vector space is a minimum set of vectors needed to represent all vectors in the space
- Dimension of a vector space is the number of basis vectors
- Column space is the span (range) of the columns of a matrix
- Row space is the span of the rows of a matrix
Vector Spaces #
- A vector space .$\mathbb{V}$ is a set of vectors and two operators .$+, \cdot$ that satisfy:
Vector Addition
- Associative: .$\vec u + (\vec v + \vec w) = (\vec u + \vec v) + \vec w$
- Commutative: .$\vec u + \vec v = \vec v + \vec u$
- Additive Identity: There exists an additive identity .$\vec 0 \in \mathbb{V}$ such that .$\vec v + \vec 0 = \vec v$
- Additive Inverse: There exists .$- \vec v \in \mathbb{V}$ such that .$\vec v + (-\vec v) = \vec 0$. We call .$-\vec v$ the additive inverse of .$\vec v$.
- Closure under vector addition: The sum .$\vec v + \vec u$ must also be in .$\mathbb{V}$
Scalar Multiplication
- Associative: .$\vec \alpha(\beta \vec v) = (\alpha \beta) \vec v$
- Multiplicative Identity: There exists .$1 \in \mathbb{R}$ where .$1 \cdot \vec v = \vec v$
- Distributive in vector addition: .$\alpha (\vec u + \vec v) = \alpha \vec u + \alpha \vec v$
- Distributive in scalar addition: .$(\alpha + \beta)\vec v = \alpha \vec v + \beta \vec v$
- Closure under scalar multiplication: The product .$\alpha \vec v$ must also be in .$\mathbb{V}$.
- These can be grouped by axioms of closure, addition, and scaling shown on slide 10
- For example .$ \mathbb{R}^{n}$ is the vector space of all .$n$-dimensional vectors.
- In fact, the set of all matrices the same size is also a vector space .$ \mathbb{R}^{n \times o}$ since it fulfills all of the properties above as well
- In this class we will generally only deal with vector spaces containing vectors in .$\mathbb{R}^{n}$.
Subspaces #
- A subspace .$\mathbb{U}$ consists of a subset of .$\mathbb{V}$ in vector space (.$\mathbb{V}, \mathbb{F}, +, \cdot$). .$\mathbb{U} \subset \mathbb{V}$ and have 3 properties
- Contains .$\vec 0 $, i.e., .$\vec 0 \in \mathbb{U}$
- Closed under vector addition: .$\vec v_1, \vec v_2 \in \mathbb{U} \Longrightarrow \vec v_1 + \vec v_2 \in \mathbb{U}$
- Closed under scalar multiplication: .$\vec v \in \mathbb{U}, \alpha \in \mathbb{F} \Longrightarrow \alpha \vec v \in \mathbb{U}$
- Examples on slide 13
- Intuitively, a subspace is a closed subset of all the vectors in .$ \mathbb{V}$.
- Any linear combination of vectors in the subspace must also lie in that subspace.
- Just as basis and dimension are defined for vector spaces, they have equivalent definitions for subspaces.
- Basis for a Subspace: set of linearly independent vectors that span the subspace (minimal set of subspace-spanning vectors)
- Subspace Dimension: number of vectors in subspace-basis
Basis #
- Basis: Given a vector space .$\mathbb{V}$, a set of vectors .$\{\vec v_1, \dots \vec v_n\}$ is a basis of the vector space if it satisfies the following properties:
- .$\vec v_1, \dots, \vec v_n$ are linearly independent vectors
- .$\text{span}(\{\vec v_1, \dots, \vec v_n\}) = \mathbb{V} \Longrightarrow \forall \vec v \in \mathbb{V}, \exists \alpha_1, \dots, \alpha_{n-1} \in \mathbb{R}$ such that .$\vec v_1 = \alpha_1 \vec v_2 + \dots \alpha_{n-1} \vec v_n$
Minimum set of vectors that spans a vector space
- A basis of a vector space is the minimum set of vectors needed to represent all vectors in the vector space.
- If a set of vectors is linearly dependent and “spans” the vector space, it is still not a basis – we can remove at least one vector from the set and the resulting set will still span the vector space
Basis is not unique #
- Intuitively, think about multiplying one of the vectors in a given basis by a nonzero scalar will not affect the linear independence or span of the vectors.
- We could alternatively construct another basis by replacing one of the vectors with the sum of itself and any other vector in the set.
- Mathematically, suppose that .$\{\vec v_1, \dots, \vec v_n \}$ is a basis for the vector space we are considering.
- Thus .$\{\alpha \vec v_1, \dots, \vec v_n \}$ where .$\alpha \neq 0$ is also a basis because, just as we’ve seen in Gaussian elimination row operations, multiplying a row by a nonzero constant does not change the linear independence or dependence of the rows.
- We can generalize this to say that multiplying a vector by a nonzero scalar also does not change the linear independence of the set of vectors.
- In addition, we know that .$\text{span}(\{ \vec v_1, \dots, \vec v_n \}) = \text{span}( \{\alpha \vec v_1, \dots, \vec v_n \} )$
- Any vector in .$\text{span}(\{ \vec v_1, \dots, \vec v_n \})$ can be created as a linear combination of the set .$\text{span}(\{ \alpha \vec v_1, \dots, \vec v_n \})$ by dividing the scale factor on .$\vec v_1$ by .$\alpha$.
- We can use a similar argument to show that .$\{\alpha \vec v_1, \dots, \vec v_n \}$ is also a basis for the same vector space.
To generalize, for .$\mathbb{R}^{N}$, any .$N$ (and only .$N$) linearly independent vectors form a basis
- Thus .$\{\alpha \vec v_1, \dots, \vec v_n \}$ where .$\alpha \neq 0$ is also a basis because, just as we’ve seen in Gaussian elimination row operations, multiplying a row by a nonzero constant does not change the linear independence or dependence of the rows.
Dimension #
- Dimension: The dimension of a vector space is the number of basis vectors.
- Since each basis vector can be scaled by one coefficient, the dimension of a space as the fewest number of parameters needed to describe an element or member of that space.
- The dimension can also be thought of as the degrees of freedom of your space – that is, the number of parameters that can be varied when describing a member of that space.
A vector space can have many bases, but each basis must have the same number of vectors:
- Suppose a basis for the vector space we’re considering has .$n$ vectors. This means that the minimum number of vectors we can use to represent all vectors in the vector space is .$n$, because the vectors in the basis would not be linearly independent if the vector space could be represented with fewer vectors.
- Then we can show that any set with less than .$n$ vectors cannot be a basis because it does not have enough vectors to span the vector space – there would be some vectors in the vector space that cannot be expressed as a linear combination of the vectors in the set.
- In addition, we can show that any set with more than .$n$ vectors must be linearly dependent and therefore cannot be a basis.
- Combining the two arguments, we have that any other set of vectors that forms a basis for the vector space must have exactly .$n$ vectors!
Column Space #
- The range/span/column space of matrix .$A \in \mathbb{R}^{m \times n}$ – which we can represent as a set of vectors .$\{ \vec a_1, \dots \vec a_n \}$ – is a set of all possible linear combinations:
$$\text{span}\big(\{\vec a_1, \dots, \vec a_n\}\big) = \Bigg\{\sum_{i=1}^N \alpha_i \vec a_i\ |\ \alpha_1, \dots, \alpha_n \in \mathbb{R} \Bigg\} = \big\{A \vec x =\ \vec x \in \mathbb{R}^{n}\big\}$$
- That is, the column space of a matrix .$A \in \mathbb{R}^{m \times n}$ is the span of the .$n$ columns in .$A$
- It’s the space of all outputs that the operator can map to.
- Thinking about .$A$ as a linear transformation from .$ \mathbb{R}^{n} \to \mathbb{R}^{m}$, the column space is effectively the set of all outputs that this matrix can transform input vectors to
- Note that in the general case, input vectors and output vectors can be different lengths
- The column space describes all possible output vectors .$\vec b = \mathbb{R}^{m \times 1}$
- It can be shown that .$\text{span}(A)$ forms a subspace of .$ \mathbb{R}^{m}$
- Note that .$\text{span}(A)$ is not necessarily .$ \mathbb{R}^{m}$
Row Space #
- Similarly, the row space is the span of the .$n$ rows
Rank #
- The rank of .$A$ is defined as the dimension of the column space of .$A \in \mathbb{R}^{m \times n}$
- .$\text{rank}(A) = \text{dim(span(A))}$
- .$\text{dim(span}(A\text{)) ≡ dim(col(}A))$
- It’s all too easy to confuse an actual space consisting of vectors, like a matrix range describing the output (column) space, with the dimension of that space, which is just a single scalar number. Keep them straight!
- .$ \text{rank}(A) = \text{dim(span}(A)) \leq \text{min}(m, n)$
- This is at most .$m$, but certainly can be less, since an arbitrary .$A \in \mathbb{R}^{m \times n}$ is not guaranteed to have columns whose span is all of .$ \mathbb{R}^{m}$
- Consider the simple counterexample of the zero matrix .$\vec 0 \in \mathbb{R}^{m \times n}$, which maps all .$n$-dimensional input vectors to the .$m$-dimensional all-zero vector.
- .$\text{rank}(A) = \text{dim(span(A))}$
- In general, using the column-wise representation of matrix-vector multiplication we can show that .$\text{rank(}A)$ is the number of linearly independent columns in .$A$.
- Any output vector can be represented as a linear combination of the columns of .$A$.
- But some of these columns might themselves be linear combinations of other columns, which means we can replace any redundant column with a weighted sum of the other columns.
- By removing all redundancies, we find that a matrix with .$k \leq \text{min}(n, m)$ linearly independent column vectors can “unlock” exactly .$k$ dimensions in the output.
- Thus, we find that .$\text{rank}(A)$ also equals the number of pivots in the
RREF
of .$A$.- Since each pivot must belong to a row and a column, the number of pivots in .$A \in \mathbb{R}^{m \times n}$ is limited by the smaller dimension.
- For a tall matrix .$m > n$, the columns are the limiting dimension; for a wide matrix .$n > m$ the rows are.
Null Space #
- The null-space of .$ \mathbb{R}^{m \times n}$ is the set of all vectors .$\vec x \in \mathbb{R}^{m}$ such that .$A\vec x = 0$
$$\text{null}(A) = \big\{\vec x\ |\ A \vec x = \vec 0, \vec x \in \mathbb{R}^{m}\big\}$$
- That is, the set of all inputs that get mapped to .$\vec 0$ by .$A$
- $\text{dim(null}(A))$ can be interpreted as the number of input directions for which the output is “compressed” down to zero.
- We know that it can be at most .$m$, since all of the input vectors have .$m$ components.
- It’s the set of vectors not in columns space, that is, the number of linearly dependent columns:
$$m - \text{dim(span}(A)) = \text{dim(null}(A))$$
- The loss of dimensionality from the input space to the output space shows up in the nullspace.
- .$\vec 0$ is always in the null space — trivial Null space
- This wouldn’t hold if we had affine (instead of linear) functions
- Null space DNE when the determinant is not zero – see last week
Procedure to Compute a Null-Space #
- Computing the nullspace of .$A$ requires us to solve .$A \vec x = \vec 0$ – the procedure is as follows:
- Put .$A$ in
RREF
. Initialize the set .$\mathbb{S} = \{ \vec 0 \}$. - Check each column for leading entries and find the number of .$F$ree and .$B$asic variables.
- if .$F = 0$, stop and skip to the last step.
- if .$F \neq 0$, repeat the following for each free variable:
- Set that free variable to .$1$, and all others to zero.
- Solve .$A \vec x$ under these conditions; add the solution vector to .$\mathbb{S}$.
- Conclude that .$\text{null}(A) = \text{span}(\mathbb{S})$.
- Put .$A$ in
- Example is given on page 37-38
Rank-Nullity Theorem #
How is the number of free variables related to the total number of columns in a matrix .$A \in \mathbb{R}^{m \times n}$? Well, each column of a matrix either contributes a “new direction” to the output or it is redundant with other columns and their already-discovered directions. In other words, each of .$n$ columns adds a dimension to .$\text{span}(A)$ or to .$\text{null}(A)$. Therefore, the following holds: $$\text{dim(span}(A)) + \text{dim(null}(A)) = n$$ $$\text{rank}(A) + \text{dim(null}(A)) = n$$
02-17: Eigenvectors, values #
Eigenvectors and Eigenvalues #
Consider a square matrix .$ A \in \mathbb{R}^{n \times n}$. An eigenvector of .$A$ is a nonzero vector .$\vec x \in \mathbb{R}^{n}$ such that $$A \vec x = \lambda \vec x$$ where .$\lambda$ is a scalar value, called the eigenvalue of .$\vec x$.
- That is, an eigenvector represents a sort of stability point: vectors aligned with an eigenvector will not change direction under a linear transformation .$A$
- Rather, they will simply be scaled by some factor.
- Note that eigenvectors are a property of the matrix itself and do not depend on the specific vector being transformed (input)
- The eigenvalue describes this stretching or compressing factor for vectors aligned with an eigenvector
- This means any vector that’s ‘some’ multiple of the eigenvector, when it’s transformed by .$A$, will become a scaled version of itself that’s a ‘some’ multiple of the eigenvalue
- Example on page 45
- Geometrically, an eigenvector, corresponding to a real nonzero eigenvalue, points in a direction in which it is stretched by the transformation and the eigenvalue is the factor by which it is stretched
- Because these two terms are so commonly used in conjunction, we often refer to an eigen(value/vector) pair
- Note that scaling a given eigenvector for an eigenvalue will still produce a valid eigenvector, since the vector’s direction will not be changed
- Given non-invertible .$A \in \mathbb{R}^{n \times n}$ there are at least .$1$ and at most .$n$ eigenvalues
- Given non-invertibility, some .$\lambda_i = 0$, so we have at least 1 eigenvalue.
- Indeed, all eigenvalues can be .$0$ (such as for .$\vec 0$)!
- However, non-invertibility does not place any other restrictions on our set of eigenvalues, so all other .$n-1$ eigenvalues can be distinct from this .$\lambda_i$
- Given non-invertibility, some .$\lambda_i = 0$, so we have at least 1 eigenvalue.
- Properties to know
- A matrix is uninvertible iff .$0$ is an eigenvalue (because there exists a vector .$\vec v$ such that .$A \vec v = \vec 0$.
- A scalar times an eigenvector is still an eigenvector: .$(A(c \vec v) = c A \vec v = c \vec v = (c \vec v))$
- Eigenvectors with distinct eigenvalues are linearly independent (eigenvectors in the same span have the same eigenvalue)
- .$A^{-1} \vec v = \lambda^{-1} \vec v$
- .$A \vec v = \lambda \vec v \Longrightarrow A^{-1} A \vec v = A^{-1} \lambda \vec v \Longrightarrow \vec v = \lambda A^{-1} \vec v \Longrightarrow \lambda^{-1} \vec v = A^{-1} \vec v$
Determinants #
- The determinant is a quantity we can define for any square matrix
- The determinant is nonzero if and only if the matrix is invertible and the linear map represented by the matrix is an isomorphism
- For a .$2 \times 2$ matrix, the formula is:
- The absolute value of .$ad − bc$ is the area of the parallelogram, and thus represents the scale factor by which areas are transformed by .$A$.
$$\text{det}\Bigg( \begin{bmatrix} a & b \\ c & d \\ \end{bmatrix}\Bigg) = ad - bc$$
- If linearly dependent, some vectors will lie on top of each other so the area will be zero
- Similarly in 3D, if any column vectors are multiples of each other, we “squash” a volume into a plane or a line.
- That is, determinant of any square matrix is zero if the columns are linearly dependent.
- For a .$3 \times 3$ matrix, the shape is a parallelipiped, and we form hyper-volumes in higher dimensions.
- We can calculate determinants for higher dimension matrices as well, as described here
- The determinant of the transpose of .$A$ equals the determinant of .$A$: .$\text{det}(A) = \text{det}(A^\text{T})$
- ‘Proving’ this geometrically on a whiteboard is fun, try out the .$2 \times 2$ case
- Therefore, if .$A^\text{T}$ has an eigenvalue .$\lambda$, then .$A$ also has the eigenvalue .$\lambda$ because .$\text{det}(A - \lambda I) = \text{det}(A^\text{T} - \lambda I)$
- This implies that in all the properties mentioned above, the word “column” can be replaced by “row” throughout
- For example, viewing an .$n \times n$ matrix as being composed of .$n$ rows, the determinant is an .$n$-linear function.
- See this article which relies on the idea of elementary matrices (not covered) or this more advanced, out-of-scope stackoverflow post that deals with 16B-level topics (and beyond)
Computing Eigenvalues and Eigenvectors #
- Solving this equation for nonzero (nontrivial) solutions .$\vec x$ will yield our eigenvectors:
$$A \vec x = \lambda \vec x \label{a}\tag{1}$$
$$\Longrightarrow (A - \lambda I_n) \vec x = \vec 0_n$$
$$\Longrightarrow A’ \vec x = \vec 0_n$$
- Note that .$A$ and .$I_n$ are fixed and only 1 parameter here that can vary in this equation: .$\lambda$
- We want to find values of .$\lambda$ such that .$A’ = (A − \lambda I_n)$ has linearly dependent columns
- That is, we want to find values of .$\lambda$ that cause the determinant of .$A’$ to become zero
- Linear dependence of the columns creates a nontrivial null-space for .$A'$
- .$N$th order characteristic polynomial with .$N$ solutions
- Work-through on page 6
$$A’ = \begin{bmatrix} a_{11} - \lambda & a_{12} & … & a_{1n}\\ a_{21} & a_{22} - \lambda & … & \vdots \\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & \dots & … & a_{nn} - \lambda \end{bmatrix}$$
- For the .$2 \times 2$ case determinant is
$$\lambda^2 - (a+d)\lambda + (ad - bc) = 0$$
- Solving this quadratic equation, we can find the .$\lambda$ values.
- Then, for each .$\lambda_i$ we find, we revisit .$\eqref{a}$ and solve for the corresponding .$\vec x_i$
- The polynomial on the left hand side of the above equation is known as the characteristic polynomial for the matrix .$A$.
- If a matrix has repeated eigenvalues, they may (or may not) have distinct eigenvectors.
- In general, multiplicity of eigenvalues will result in the same multidimensional eigenspace
- (Aside) …except if the matrix is
defective
- An .$n \times n$ matrix is defective iff it does not have .$n$ linearly independent eigenvectors
- That is, a defective matrix always has fewer than .$n$ distinct eigenvalues, since distinct eigenvalues always have linearly independent eigenvectors
- In which case the rank is decreased
Eigenspace #
If the eigenvectors are distinct, they form an eigenspace $$E_\lambda = \text{null}(A - \lambda I_n) = \text{null}(A’)$$
- That is, null space of some matrix is the set of all vectors that satisfy .$\eqref{a}$ OR all of the eigenvectors that correspond to some eigenvalue OR the eigenspace that corresponds to the eigenvalue
- Exactly like the concept of span; the eigenspace is a subspace, the span of all eigenvectors for that eigenvalue (including .$\vec 0$)
- Any input vectors that lie in this space (for 2 distinct eigenvectors, a plane; for 1, a line) will be scaled by the shared eigenvalue under a linear transformation.
- Eigenspace is a subspace, which means it’s closed under scalar multiplication
- Thus, if two vectors are related by a scalar, they must both lie in the same eigenspace of .$\lambda_i$: .$E_{\lambda_i}$
- That is, the eigenvectors that make up some eigenspace aren’t unique
- But, every eigenvector can only correspond to one eigenvalue
- If for some .$A$, where .$A \vec x_1 = \lambda_1 \vec x_1$ and .$A \vec x_1 = \lambda_2 \vec x_1$ then .$\lambda_1 = \lambda_2$
- Thinking about a physical diagram may help clarify; .$\lambda_1 \neq \lambda_2$ would require some single initial state or vector aligned with .$\vec x_1$ to be scaled by two different values upon being transformed by .$A$
- This cannot happen, so the two eigenvalues cannot be distinct
- Thus, if two vectors are related by a scalar, they must both lie in the same eigenspace of .$\lambda_i$: .$E_{\lambda_i}$
- Aside: Complex eigenvalues can exist as well; they are much harder to visually understand, but mathematically, we find them using the exact same process as before.
Mean-product formula #
- Mean-product formula is a nicer way of solving this, versus finding the roots of the polynomial
$$m \pm \sqrt{m^2 - p}$$
- $m$ is the mean of the trace, which is the same as the mean of the eigenvalues
- $p$ is the product of the eigenvalues, which is the determinant
Theorems #
Theorem 9.1 #
Given two eigenvectors .$\vec v_1$ and .$\vec v_2$ corresponding to two different eigenvalues .$\lambda_1$ and .$\lambda_2$ of a matrix .$A$, it is always the case that .$\vec v_1$ and .$\vec v_2$ are linearly independent.
- Proof on page 9
Theorem 9.2 #
Let .$\vec v_1,\vec v_2, \dots , \vec v_m$ be eigenvectors of an .$n \times n$ matrix with distinct eigenvalues. It is the case that all the .$\vec v_i$ are linearly independent from one another.
- Proof on page 10
Proposition 1 #
If an square .$n \times n$ matrix .$A$ isn’t invertible, then it has some eigenvalue .$\lambda_i = 0$
- If a matrix is not invertible, then the dimension of its null space isn’t necessarily greater than 0 because there must be a linearly dependent row or column.
- If this is true, then there’s a non-zero vector .$\vec x$ such that .$A \vec x = 0 \vec x$
- If the matrix is not invertible, it has a nontrivial null-space.
- Then, by definition, there is some nonzero .$\vec x$ for which .$A \vec x = 0 \vec x = \vec 0_n$.
- We pattern match to .$\eqref{a}$ and notice the equation is exactly the same if we multiply the right by .$\vec x: A \vec x = 0 \vec x$.
- This is kind of like pulling a scalar .$0$ out of .$\vec 0$, leaving .$0 \vec x$.
- Now, we clearly see .$\lambda = 0$
Proposition 2 #
For an invertible .$A$ with some eigenvalue .$\lambda$, .$A^{−1}$ has eigenvalue .$ \frac{1}{\lambda}$
- We can start at .$\eqref{a}$: left-multiply both sides by .$A^{−1}$ to get .$\vec v = A^{-1} \lambda \vec v \Longrightarrow A^{-1} \vec v = \frac{1}{\lambda} \vec v$.
- Pattern match to .$\eqref{a}$ again and we’ve shown the statement is true
- Note: Given invertibility, all .$\lambda_i \neq 0$ so this is always true
States #
Steady States #
- We know that a steady state .$\vec x^*$ of a transformation matrix .$A$ is defined to be such
$$A \vec x^* = \vec x^* $$
- In other words, it is an element of the eigenspace of .$A$ corresponding to the eigenvalue .$\lambda = 1$.
- The above equation tells us that if we start at a steady state, then we will remain unaffected by the transformation matrix over time.
- Therefore, to solve for the steady-state of a system represented by .$P$, we solve .$\eqref{a}$, substituting .$\lambda = 1$
- Note that this amounts to solving for the null-space of .$(P − I_n)$.
- Great walk-through on page 53
Predicting Behavior for General Initial States #
Given a system and an initial state, can we predict how it’ll dynamically change over time? We saw in the Page Rank example that we seem to approach a sort of steady-state after many timesteps, but under what conditions does this happen?
Simpler Case: .$\vec x (0) = \alpha \vec v$ #
- Suppose our initial state is actually a perfect multiple of an eigenvector of the system.
- Over time, upon repeated applications of .$A,$ we accumulate factors of .$\lambda$; ultimately, .$A^n \vec x = \alpha (\lambda^n \vec x)$ – as we can see derived to the right
$$\vec x (0) = \alpha \vec v$$ $$\vec x (1) = A\vec x (0) = \alpha \lambda \vec v$$ $$\vec x (2) = A\vec x (1) = \alpha \lambda^2 \vec v$$ $$\vdots$$ $$\vec x (n) = A\vec x (n-1) = \alpha \lambda^n \vec v$$
- Based on this pattern, we notice the following behaviors based on the value .$\lambda$ as .$n \to \infty$:
- .$\lambda > 1: \vec x (n) \to \infty$ – Diverge, exponential growth.
- .$\lambda = 1: \vec x (n) \to k\vec v$ – Converge, steady-state.
- .$0 < \lambda < 1: \vec x (n) \to \vec 0$ – Converge, exponential decay.
- .$\lambda = 0: \vec x (n) = 0 \vec v = \vec 0$ – Converge(?), instantaneous disappearance.
- .$\lambda < 0$: Take .$|\lambda|$ and refer to the appropriate case above. But recognize the sign switches at each timestep
General Case: .$x(0) = \alpha_1 \vec v_1 + \alpha_2 \vec v_2 + . . . + \alpha_n \vec v_n$ #
- This form says that the initial state is now not a scalar multiple of just one eigenvectors, it’s a linear combination of all of them
- Note that this is still not fully general; we assume here that all initial states are in the span of the eigenvectors of .$A$, which isn’t guaranteed.
- But this case devolves into the previous one; we can simply treat each element individually, apply the techniques from the Simpler Case, and put them back together. The final form is as follows: $$\vec x (n) = \alpha_1 (\lambda_1^n \vec v_1) + \alpha_2 (\lambda_2^n \vec v_2) + . . . + \alpha_n (\lambda_n^n \vec v_n) \label{b}\tag{2}$$
- Given a matrix .$A$ and some initial state .$\vec x$, how can we actually get to this equation format?
First, we solve for the .$(\vec v_i, \lambda_i)$ pairs of .$A$
Then, we use Gaussian elimination to find the .$\alpha_i$’s; Putting eq. .$ \eqref{b}$ in matrix form yields:
…and we compute the inverse of the matrix of eigenvectors, arriving at:
$$\vec x(0) = \begin{bmatrix} | & | & & | \\ \vec v_1 & \vec v_2 & \dots & \vec v_n\\ | & | & & | \\ \end{bmatrix}\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \\ \end{bmatrix}$$
$$\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \\ \end{bmatrix} = \begin{bmatrix} | & | & & | \\ \vec v_1 & \vec v_2 & \dots & \vec v_n\\ | & | & & | \\ \end{bmatrix}^{-1} \vec x (0)$$
- Let’s assume that we drop any terms corresponding to .$\alpha_i = 0$ since those terms are, well, zero. Then, with what remains, we can make some intuitive observations:
- .$|\lambda_i | > 1: \vec x (n) \to \infty$ – Diverge, even if other components in the linear combination decay, the state itself “blows up” to .$\infty$ as this component overshadows all others.
- .$|\lambda_i| = −1: \vec x (n) \to\ ?$ – Diverge because that component oscillates forever.
- .$−1 < \lambda_i \leq 1: \vec x (n) \to \vec x^*$ – Converge, that is, steady-state! Each .$i$th term either decays to zero (.$|\lambda_i| \leq 1$) or stays the same (.$|\lambda_i| = 1$), such that .$\vec x^* = \sum_{i,\lambda_i=1} \alpha_i \vec v_i$.
- We can normalize this steady-state if we want proportions (column values sum to 1) rather than absolute numbers
Some Useful Information #
- If a matrix has .$n$ distinct real eigenvalues, their .$n$ associated eigenvectors are all linearly independent.
- An eigenspace for a given eigenvalue is the span of all eigenvectors, including .$\vec 0$, and is also a subspace by definition.
- Say we calculate an eigenvector for an eigenvalue; we can pick any scalar multiple of the result and this will still be an eigenvector, since scaling a vector does not change its direction. This follows from the scalar multiplication property of subspaces.
- A given eigenvector can only be associated with one eigenvalue, since a vector can only be scaled by some single value upon being transformed by a matrix. But, a eigenvalue can be associated with multiple eigenvectors, the span of which form an eigenspace.
- If a matrix has some .$\lambda = 0$, then for some .$\vec x, A\vec x = \lambda\vec x =\vec 0$, so .$A$ has a nontrivial null-space. Therefore, it is not invertible.
- If a matrix has some .$\lambda = 1$, then any initial state that is aligned with the corresponding eigenvector is a steady-state. More generally, any initial state for which .$\lambda = 1$ comprises part of the linear combination potentially has a nonzero steady-state, so long as all other .$|\lambda_i| < 1$.
- The rotation matrix (that rotates any vector by .$\theta$ degrees counterclockwise) is:
$$A(\theta) = A_R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \\ \end{bmatrix}$$