Notes can be found as interactive webpage at

4: Vector Spaces & Eigenstuff

02-15: Vector Spaces: Null Spaces and Columnspaces #

Important Jargon #

  • Rank a matrix .$A$ is the number of linearly independent columns
  • Nullspace of a matrix is the set of solutions to .$A \vec x = 0$
  • A vector space is a set of vectors connected by two operators: .$+, \times$ — page 48
  • A vector subspace is a subset of vectors that have “nice properties” — page 50
  • A basis for a vector space is a minimum set of vectors needed to represent all vectors in the space
  • Dimension of a vector space is the number of basis vectors
  • Column space is the span (range) of the columns of a matrix
  • Row space is the span of the rows of a matrix

Vector Spaces #

  • A vector space .$\mathbb{V}$ is a set of vectors and two operators .$+, \cdot$ that satisfy:

Vector Addition

  • Associative: .$\vec u + (\vec v + \vec w) = (\vec u + \vec v) + \vec w$
  • Commutative: .$\vec u + \vec v = \vec v + \vec u$
  • Additive Identity: There exists an additive identity .$\vec 0 \in \mathbb{V}$ such that .$\vec v + \vec 0 = \vec v$
  • Additive Inverse: There exists .$- \vec v \in \mathbb{V}$ such that .$\vec v + (-\vec v) = \vec 0$. We call .$-\vec v$ the additive inverse of .$\vec v$.
  • Closure under vector addition: The sum .$\vec v + \vec u$ must also be in .$\mathbb{V}$

Scalar Multiplication

  • Associative: .$\vec \alpha(\beta \vec v) = (\alpha \beta) \vec v$
  • Multiplicative Identity: There exists .$1 \in \mathbb{R}$ where .$1 \cdot \vec v = \vec v$
  • Distributive in vector addition: .$\alpha (\vec u + \vec v) = \alpha \vec u + \alpha \vec v$
  • Distributive in scalar addition: .$(\alpha + \beta)\vec v = \alpha \vec v + \beta \vec v$
  • Closure under scalar multiplication: The product .$\alpha \vec v$ must also be in .$\mathbb{V}$.
... for any .$\vec v, \vec u, \vec w \in \mathbb{V}; \alpha, \beta \in \mathbb{R}$

  • These can be grouped by axioms of closure, addition, and scaling shown on slide 10
  • For example .$ \mathbb{R}^{n}$ is the vector space of all .$n$-dimensional vectors.
    • In fact, the set of all matrices the same size is also a vector space .$ \mathbb{R}^{n \times o}$ since it fulfills all of the properties above as well
    • In this class we will generally only deal with vector spaces containing vectors in .$\mathbb{R}^{n}$.

Subspaces #

  • A subspace .$\mathbb{U}$ consists of a subset of .$\mathbb{V}$ in vector space (.$\mathbb{V}, \mathbb{F}, +, \cdot$). .$\mathbb{U} \subset \mathbb{V}$ and have 3 properties
    1. Contains .$\vec 0 $, i.e., .$\vec 0 \in \mathbb{U}$
    2. Closed under vector addition: .$\vec v_1, \vec v_2 \in \mathbb{U} \Longrightarrow \vec v_1 + \vec v_2 \in \mathbb{U}$
    3. Closed under scalar multiplication: .$\vec v \in \mathbb{U}, \alpha \in \mathbb{F} \Longrightarrow \alpha \vec v \in \mathbb{U}$
  • Examples on slide 13
  • Intuitively, a subspace is a closed subset of all the vectors in .$ \mathbb{V}$.
    • Any linear combination of vectors in the subspace must also lie in that subspace.
  • Just as basis and dimension are defined for vector spaces, they have equivalent definitions for subspaces.
    • Basis for a Subspace: set of linearly independent vectors that span the subspace (minimal set of subspace-spanning vectors)
    • Subspace Dimension: number of vectors in subspace-basis

Basis #

  • Basis: Given a vector space .$\mathbb{V}$, a set of vectors .$\{\vec v_1, \dots \vec v_n\}$ is a basis of the vector space if it satisfies the following properties:
    1. .$\vec v_1, \dots, \vec v_n$ are linearly independent vectors
    2. .$\text{span}(\{\vec v_1, \dots, \vec v_n\}) = \mathbb{V} \Longrightarrow \forall \vec v \in \mathbb{V}, \exists \alpha_1, \dots, \alpha_{n-1} \in \mathbb{R}$ such that .$\vec v_1 = \alpha_1 \vec v_2 + \dots \alpha_{n-1} \vec v_n$

    Minimum set of vectors that spans a vector space

  • A basis of a vector space is the minimum set of vectors needed to represent all vectors in the vector space.
    • If a set of vectors is linearly dependent and “spans” the vector space, it is still not a basis – we can remove at least one vector from the set and the resulting set will still span the vector space

Basis is not unique #

  • Intuitively, think about multiplying one of the vectors in a given basis by a nonzero scalar will not affect the linear independence or span of the vectors.
  • We could alternatively construct another basis by replacing one of the vectors with the sum of itself and any other vector in the set.
  • Mathematically, suppose that .$\{\vec v_1, \dots, \vec v_n \}$ is a basis for the vector space we are considering.
    • Thus .$\{\alpha \vec v_1, \dots, \vec v_n \}$ where .$\alpha \neq 0$ is also a basis because, just as we’ve seen in Gaussian elimination row operations, multiplying a row by a nonzero constant does not change the linear independence or dependence of the rows.
      • We can generalize this to say that multiplying a vector by a nonzero scalar also does not change the linear independence of the set of vectors.
    • In addition, we know that .$\text{span}(\{ \vec v_1, \dots, \vec v_n \}) = \text{span}( \{\alpha \vec v_1, \dots, \vec v_n \} )$
      • Any vector in .$\text{span}(\{ \vec v_1, \dots, \vec v_n \})$ can be created as a linear combination of the set .$\text{span}(\{ \alpha \vec v_1, \dots, \vec v_n \})$ by dividing the scale factor on .$\vec v_1$ by .$\alpha$.
      • We can use a similar argument to show that .$\{\alpha \vec v_1, \dots, \vec v_n \}$ is also a basis for the same vector space.

    To generalize, for .$\mathbb{R}^{N}$, any .$N$ (and only .$N$) linearly independent vectors form a basis

Dimension #

  • Dimension: The dimension of a vector space is the number of basis vectors.
  • Since each basis vector can be scaled by one coefficient, the dimension of a space as the fewest number of parameters needed to describe an element or member of that space.
  • The dimension can also be thought of as the degrees of freedom of your space – that is, the number of parameters that can be varied when describing a member of that space.

    A vector space can have many bases, but each basis must have the same number of vectors:

    • Suppose a basis for the vector space we’re considering has .$n$ vectors. This means that the minimum number of vectors we can use to represent all vectors in the vector space is .$n$, because the vectors in the basis would not be linearly independent if the vector space could be represented with fewer vectors.
    • Then we can show that any set with less than .$n$ vectors cannot be a basis because it does not have enough vectors to span the vector space – there would be some vectors in the vector space that cannot be expressed as a linear combination of the vectors in the set.
    • In addition, we can show that any set with more than .$n$ vectors must be linearly dependent and therefore cannot be a basis.
    • Combining the two arguments, we have that any other set of vectors that forms a basis for the vector space must have exactly .$n$ vectors!

Column Space #

  • The range/span/column space of matrix .$A \in \mathbb{R}^{m \times n}$ – which we can represent as a set of vectors .$\{ \vec a_1, \dots \vec a_n \}$ – is a set of all possible linear combinations: $$\text{span}\big(\{\vec a_1, \dots, \vec a_n\}\big) = \Bigg\{\sum_{i=1}^N \alpha_i \vec a_i\ |\ \alpha_1, \dots, \alpha_n \in \mathbb{R} \Bigg\} = \big\{A \vec x =\ \vec x \in \mathbb{R}^{n}\big\}$$
    • That is, the column space of a matrix .$A \in \mathbb{R}^{m \times n}$ is the span of the .$n$ columns in .$A$
    • It’s the space of all outputs that the operator can map to.
  • Thinking about .$A$ as a linear transformation from .$ \mathbb{R}^{n} \to \mathbb{R}^{m}$, the column space is effectively the set of all outputs that this matrix can transform input vectors to
  • Note that in the general case, input vectors and output vectors can be different lengths
    • The column space describes all possible output vectors .$\vec b = \mathbb{R}^{m \times 1}$
    • It can be shown that .$\text{span}(A)$ forms a subspace of .$ \mathbb{R}^{m}$
      • Note that .$\text{span}(A)$ is not necessarily .$ \mathbb{R}^{m}$

Row Space #

  • Similarly, the row space is the span of the .$n$ rows

Rank #

  • The rank of .$A$ is defined as the dimension of the column space of .$A \in \mathbb{R}^{m \times n}$
    • .$\text{rank}(A) = \text{dim(span(A))}$
      • .$\text{dim(span}(A\text{)) ≡ dim(col(}A))$
      • It’s all too easy to confuse an actual space consisting of vectors, like a matrix range describing the output (column) space, with the dimension of that space, which is just a single scalar number. Keep them straight!
    • .$ \text{rank}(A) = \text{dim(span}(A)) \leq \text{min}(m, n)$
      • This is at most .$m$, but certainly can be less, since an arbitrary .$A \in \mathbb{R}^{m \times n}$ is not guaranteed to have columns whose span is all of .$ \mathbb{R}^{m}$
      • Consider the simple counterexample of the zero matrix .$\vec 0 \in \mathbb{R}^{m \times n}$, which maps all .$n$-dimensional input vectors to the .$m$-dimensional all-zero vector.
  • In general, using the column-wise representation of matrix-vector multiplication we can show that .$\text{rank(}A)$ is the number of linearly independent columns in .$A$.
    • Any output vector can be represented as a linear combination of the columns of .$A$.
    • But some of these columns might themselves be linear combinations of other columns, which means we can replace any redundant column with a weighted sum of the other columns.
    • By removing all redundancies, we find that a matrix with .$k \leq \text{min}(n, m)$ linearly independent column vectors can “unlock” exactly .$k$ dimensions in the output.
  • Thus, we find that .$\text{rank}(A)$ also equals the number of pivots in the RREF of .$A$.
    • Since each pivot must belong to a row and a column, the number of pivots in .$A \in \mathbb{R}^{m \times n}$ is limited by the smaller dimension.
    • For a tall matrix .$m > n$, the columns are the limiting dimension; for a wide matrix .$n > m$ the rows are.

Null Space #

  • The null-space of .$ \mathbb{R}^{m \times n}$ is the set of all vectors .$\vec x \in \mathbb{R}^{m}$ such that .$A\vec x = 0$ $$\text{null}(A) = \big\{\vec x\ |\ A \vec x = \vec 0, \vec x \in \mathbb{R}^{m}\big\}$$
    • That is, the set of all inputs that get mapped to .$\vec 0$ by .$A$
  • $\text{dim(null}(A))$ can be interpreted as the number of input directions for which the output is “compressed” down to zero.
    • We know that it can be at most .$m$, since all of the input vectors have .$m$ components.
    • It’s the set of vectors not in columns space, that is, the number of linearly dependent columns: $$m - \text{dim(span}(A)) = \text{dim(null}(A))$$
      • The loss of dimensionality from the input space to the output space shows up in the nullspace.
  • .$\vec 0$ is always in the null space — trivial Null space
    • This wouldn’t hold if we had affine (instead of linear) functions
  • Null space DNE when the determinant is not zero – see last week

Procedure to Compute a Null-Space #

  • Computing the nullspace of .$A$ requires us to solve .$A \vec x = \vec 0$ – the procedure is as follows:
    1. Put .$A$ in RREF. Initialize the set .$\mathbb{S} = \{ \vec 0 \}$.
    2. Check each column for leading entries and find the number of .$F$ree and .$B$asic variables.
    3. if .$F = 0$, stop and skip to the last step.
    4. if .$F \neq 0$, repeat the following for each free variable:
      • Set that free variable to .$1$, and all others to zero.
      • Solve .$A \vec x$ under these conditions; add the solution vector to .$\mathbb{S}$.
    5. Conclude that .$\text{null}(A) = \text{span}(\mathbb{S})$.
  • Example is given on page 37-38

Rank-Nullity Theorem #

How is the number of free variables related to the total number of columns in a matrix .$A \in \mathbb{R}^{m \times n}$? Well, each column of a matrix either contributes a “new direction” to the output or it is redundant with other columns and their already-discovered directions. In other words, each of .$n$ columns adds a dimension to .$\text{span}(A)$ or to .$\text{null}(A)$. Therefore, the following holds: $$\text{dim(span}(A)) + \text{dim(null}(A)) = n$$ $$\text{rank}(A) + \text{dim(null}(A)) = n$$

02-17: Eigenvectors, values #

Eigenvectors and Eigenvalues #

Consider a square matrix .$ A \in \mathbb{R}^{n \times n}$. An eigenvector of .$A$ is a nonzero vector .$\vec x \in \mathbb{R}^{n}$ such that $$A \vec x = \lambda \vec x$$ where .$\lambda$ is a scalar value, called the eigenvalue of .$\vec x$.

  • That is, an eigenvector represents a sort of stability point: vectors aligned with an eigenvector will not change direction under a linear transformation .$A$
    • Rather, they will simply be scaled by some factor.
    • Note that eigenvectors are a property of the matrix itself and do not depend on the specific vector being transformed (input)
  • The eigenvalue describes this stretching or compressing factor for vectors aligned with an eigenvector
    • This means any vector that’s ‘some’ multiple of the eigenvector, when it’s transformed by .$A$, will become a scaled version of itself that’s a ‘some’ multiple of the eigenvalue
    • Example on page 45
  • Geometrically, an eigenvector, corresponding to a real nonzero eigenvalue, points in a direction in which it is stretched by the transformation and the eigenvalue is the factor by which it is stretched
  • Because these two terms are so commonly used in conjunction, we often refer to an eigen(value/vector) pair
    • Note that scaling a given eigenvector for an eigenvalue will still produce a valid eigenvector, since the vector’s direction will not be changed
  • Given non-invertible .$A \in \mathbb{R}^{n \times n}$ there are at least .$1$ and at most .$n$ eigenvalues
    • Given non-invertibility, some .$\lambda_i = 0$, so we have at least 1 eigenvalue.
      • Indeed, all eigenvalues can be .$0$ (such as for .$\vec 0$)!
    • However, non-invertibility does not place any other restrictions on our set of eigenvalues, so all other .$n-1$ eigenvalues can be distinct from this .$\lambda_i$
  • Properties to know
    1. A matrix is uninvertible iff .$0$ is an eigenvalue (because there exists a vector .$\vec v$ such that .$A \vec v = \vec 0$.
    2. A scalar times an eigenvector is still an eigenvector: .$(A(c \vec v) = c A \vec v = c \vec v = (c \vec v))$
    3. Eigenvectors with distinct eigenvalues are linearly independent (eigenvectors in the same span have the same eigenvalue)
    4. .$A^{-1} \vec v = \lambda^{-1} \vec v$
      • .$A \vec v = \lambda \vec v \Longrightarrow A^{-1} A \vec v = A^{-1} \lambda \vec v \Longrightarrow \vec v = \lambda A^{-1} \vec v \Longrightarrow \lambda^{-1} \vec v = A^{-1} \vec v$

Determinants #

  • The determinant is a quantity we can define for any square matrix
    • The determinant is nonzero if and only if the matrix is invertible and the linear map represented by the matrix is an isomorphism
  • For a .$2 \times 2$ matrix, the formula is:
    • The absolute value of .$ad − bc$ is the area of the parallelogram, and thus represents the scale factor by which areas are transformed by .$A$.

    $$\text{det}\Bigg( \begin{bmatrix} a & b \\ c & d \\ \end{bmatrix}\Bigg) = ad - bc$$

    • If linearly dependent, some vectors will lie on top of each other so the area will be zero
    • Similarly in 3D, if any column vectors are multiples of each other, we “squash” a volume into a plane or a line.
    • That is, determinant of any square matrix is zero if the columns are linearly dependent.
  • For a .$3 \times 3$ matrix, the shape is a parallelipiped, and we form hyper-volumes in higher dimensions.
    • We can calculate determinants for higher dimension matrices as well, as described here
  • The determinant of the transpose of .$A$ equals the determinant of .$A$: .$\text{det}(A) = \text{det}(A^\text{T})$
    • ‘Proving’ this geometrically on a whiteboard is fun, try out the .$2 \times 2$ case
    • Therefore, if .$A^\text{T}$ has an eigenvalue .$\lambda$, then .$A$ also has the eigenvalue .$\lambda$ because .$\text{det}(A - \lambda I) = \text{det}(A^\text{T} - \lambda I)$
    • This implies that in all the properties mentioned above, the word “column” can be replaced by “row” throughout
      • For example, viewing an .$n \times n$ matrix as being composed of .$n$ rows, the determinant is an .$n$-linear function.
    • See this article which relies on the idea of elementary matrices (not covered) or this more advanced, out-of-scope stackoverflow post that deals with 16B-level topics (and beyond)

Computing Eigenvalues and Eigenvectors #

  • Solving this equation for nonzero (nontrivial) solutions .$\vec x$ will yield our eigenvectors:

    $$A \vec x = \lambda \vec x \label{a}\tag{1}$$

    $$\Longrightarrow (A - \lambda I_n) \vec x = \vec 0_n$$

    $$\Longrightarrow A’ \vec x = \vec 0_n$$

    • Note that .$A$ and .$I_n$ are fixed and only 1 parameter here that can vary in this equation: .$\lambda$
  • We want to find values of .$\lambda$ such that .$A’ = (A − \lambda I_n)$ has linearly dependent columns
  • That is, we want to find values of .$\lambda$ that cause the determinant of .$A’$ to become zero
    • Linear dependence of the columns creates a nontrivial null-space for .$A'$
    • .$N$th order characteristic polynomial with .$N$ solutions
  • Work-through on page 6

$$A’ = \begin{bmatrix} a_{11} - \lambda & a_{12} & … & a_{1n}\\ a_{21} & a_{22} - \lambda & … & \vdots \\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & \dots & … & a_{nn} - \lambda \end{bmatrix}$$

  • For the .$2 \times 2$ case determinant is $$\lambda^2 - (a+d)\lambda + (ad - bc) = 0$$
    • Solving this quadratic equation, we can find the .$\lambda$ values.
    • Then, for each .$\lambda_i$ we find, we revisit .$\eqref{a}$ and solve for the corresponding .$\vec x_i$
    • The polynomial on the left hand side of the above equation is known as the characteristic polynomial for the matrix .$A$.
  • If a matrix has repeated eigenvalues, they may (or may not) have distinct eigenvectors.
    • In general, multiplicity of eigenvalues will result in the same multidimensional eigenspace
    • (Aside) …except if the matrix is defective
      • An .$n \times  n$ matrix is defective iff it does not have .$n$ linearly independent eigenvectors
      • That is, a defective matrix always has fewer than .$n$ distinct eigenvalues, since distinct eigenvalues always have linearly independent eigenvectors
      • In which case the rank is decreased

Eigenspace #

If the eigenvectors are distinct, they form an eigenspace $$E_\lambda = \text{null}(A - \lambda I_n) = \text{null}(A’)$$

  • That is, null space of some matrix is the set of all vectors that satisfy .$\eqref{a}$ OR all of the eigenvectors that correspond to some eigenvalue OR the eigenspace that corresponds to the eigenvalue
  • Exactly like the concept of span; the eigenspace is a subspace, the span of all eigenvectors for that eigenvalue (including .$\vec 0$)
  • Any input vectors that lie in this space (for 2 distinct eigenvectors, a plane; for 1, a line) will be scaled by the shared eigenvalue under a linear transformation.
  • Eigenspace is a subspace, which means it’s closed under scalar multiplication
    • Thus, if two vectors are related by a scalar, they must both lie in the same eigenspace of .$\lambda_i$: .$E_{\lambda_i}$
      • That is, the eigenvectors that make up some eigenspace aren’t unique
    • But, every eigenvector can only correspond to one eigenvalue
      • If for some .$A$, where .$A \vec x_1 = \lambda_1 \vec x_1$ and .$A \vec x_1 = \lambda_2 \vec x_1$ then .$\lambda_1 = \lambda_2$
      • Thinking about a physical diagram may help clarify; .$\lambda_1 \neq \lambda_2$ would require some single initial state or vector aligned with .$\vec x_1$ to be scaled by two different values upon being transformed by .$A$
        • This cannot happen, so the two eigenvalues cannot be distinct
  • Aside: Complex eigenvalues can exist as well; they are much harder to visually understand, but mathematically, we find them using the exact same process as before.

Mean-product formula #

  • Mean-product formula is a nicer way of solving this, versus finding the roots of the polynomial $$m \pm \sqrt{m^2 - p}$$
    • $m$ is the mean of the trace, which is the same as the mean of the eigenvalues
    • $p$ is the product of the eigenvalues, which is the determinant

Theorems #

Theorem 9.1 #

Given two eigenvectors .$\vec v_1$ and .$\vec v_2$ corresponding to two different eigenvalues .$\lambda_1$ and .$\lambda_2$ of a matrix .$A$, it is always the case that .$\vec v_1$ and .$\vec v_2$ are linearly independent.

Theorem 9.2 #

Let .$\vec v_1,\vec v_2, \dots , \vec v_m$ be eigenvectors of an .$n \times n$ matrix with distinct eigenvalues. It is the case that all the .$\vec v_i$ are linearly independent from one another.

Proposition 1 #

If an square .$n \times n$ matrix .$A$ isn’t invertible, then it has some eigenvalue .$\lambda_i = 0$

  • If a matrix is not invertible, then the dimension of its null space isn’t necessarily greater than 0 because there must be a linearly dependent row or column.
  • If this is true, then there’s a non-zero vector .$\vec x$ such that .$A \vec x = 0 \vec x$

  • If the matrix is not invertible, it has a nontrivial null-space.
  • Then, by definition, there is some nonzero .$\vec x$ for which .$A \vec x = 0 \vec x = \vec 0_n$.
  • We pattern match to .$\eqref{a}$ and notice the equation is exactly the same if we multiply the right by .$\vec x: A \vec x = 0 \vec x$.
    • This is kind of like pulling a scalar .$0$ out of .$\vec 0$, leaving .$0 \vec x$.
  • Now, we clearly see .$\lambda = 0$

Proposition 2 #

For an invertible .$A$ with some eigenvalue .$\lambda$, .$A^{−1}$ has eigenvalue .$ \frac{1}{\lambda}$

  • We can start at .$\eqref{a}$: left-multiply both sides by .$A^{−1}$ to get .$\vec v = A^{-1} \lambda \vec v \Longrightarrow A^{-1} \vec v = \frac{1}{\lambda} \vec v$.
  • Pattern match to .$\eqref{a}$ again and we’ve shown the statement is true
  • Note: Given invertibility, all .$\lambda_i \neq 0$ so this is always true

States #

Steady States #

  • We know that a steady state .$\vec x^*$ of a transformation matrix .$A$ is defined to be such $$A \vec x^* = \vec x^* $$
    • In other words, it is an element of the eigenspace of .$A$ corresponding to the eigenvalue .$\lambda = 1$.
    • The above equation tells us that if we start at a steady state, then we will remain unaffected by the transformation matrix over time.
    • Therefore, to solve for the steady-state of a system represented by .$P$, we solve .$\eqref{a}$, substituting .$\lambda = 1$
      • Note that this amounts to solving for the null-space of .$(P − I_n)$.
  • Great walk-through on page 53

Predicting Behavior for General Initial States #

Given a system and an initial state, can we predict how it’ll dynamically change over time? We saw in the Page Rank example that we seem to approach a sort of steady-state after many timesteps, but under what conditions does this happen?

Simpler Case: .$\vec x (0) = \alpha \vec v$ #

  • Suppose our initial state is actually a perfect multiple of an eigenvector of the system.
  • Over time, upon repeated applications of .$A,$ we accumulate factors of .$\lambda$; ultimately, .$A^n \vec x = \alpha (\lambda^n \vec x)$ – as we can see derived to the right

$$\vec x (0) = \alpha \vec v$$ $$\vec x (1) = A\vec x (0) = \alpha \lambda \vec v$$ $$\vec x (2) = A\vec x (1) = \alpha \lambda^2 \vec v$$ $$\vdots$$ $$\vec x (n) = A\vec x (n-1) = \alpha \lambda^n \vec v$$

  • Based on this pattern, we notice the following behaviors based on the value .$\lambda$ as .$n \to \infty$:
    • .$\lambda > 1: \vec x (n) \to \infty$ – Diverge, exponential growth.
    • .$\lambda = 1: \vec x (n) \to k\vec v$ – Converge, steady-state.
    • .$0 < \lambda < 1: \vec x (n) \to \vec 0$ – Converge, exponential decay.
    • .$\lambda = 0: \vec x (n) = 0 \vec v = \vec 0$ – Converge(?), instantaneous disappearance.
    • .$\lambda < 0$: Take .$|\lambda|$ and refer to the appropriate case above. But recognize the sign switches at each timestep

General Case: .$x(0) = \alpha_1 \vec v_1 + \alpha_2 \vec v_2 + . . . + \alpha_n \vec v_n$ #

  • This form says that the initial state is now not a scalar multiple of just one eigenvectors, it’s a linear combination of all of them
    • Note that this is still not fully general; we assume here that all initial states are in the span of the eigenvectors of .$A$, which isn’t guaranteed.
  • But this case devolves into the previous one; we can simply treat each element individually, apply the techniques from the Simpler Case, and put them back together. The final form is as follows: $$\vec x (n) = \alpha_1 (\lambda_1^n \vec v_1) + \alpha_2 (\lambda_2^n \vec v_2) + . . . + \alpha_n (\lambda_n^n \vec v_n) \label{b}\tag{2}$$
  • Given a matrix .$A$ and some initial state .$\vec x$, how can we actually get to this equation format?
    • First, we solve for the .$(\vec v_i, \lambda_i)$ pairs of .$A$

    • Then, we use Gaussian elimination to find the .$\alpha_i$’s; Putting eq. .$ \eqref{b}$ in matrix form yields:

    • …and we compute the inverse of the matrix of eigenvectors, arriving at:

    $$\vec x(0) = \begin{bmatrix} | & | & & | \\ \vec v_1 & \vec v_2 & \dots & \vec v_n\\ | & | & & | \\ \end{bmatrix}\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \\ \end{bmatrix}$$

    $$\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \\ \end{bmatrix} = \begin{bmatrix} | & | & & | \\ \vec v_1 & \vec v_2 & \dots & \vec v_n\\ | & | & & | \\ \end{bmatrix}^{-1} \vec x (0)$$

    • Let’s assume that we drop any terms corresponding to .$\alpha_i = 0$ since those terms are, well, zero. Then, with what remains, we can make some intuitive observations:
      • .$|\lambda_i | > 1: \vec x (n) \to \infty$ – Diverge, even if other components in the linear combination decay, the state itself “blows up” to .$\infty$ as this component overshadows all others.
      • .$|\lambda_i| = −1: \vec x (n) \to\ ?$ – Diverge because that component oscillates forever.
      • .$−1 < \lambda_i \leq 1: \vec x (n) \to \vec x^*$ – Converge, that is, steady-state! Each .$i$th term either decays to zero (.$|\lambda_i| \leq 1$) or stays the same (.$|\lambda_i| = 1$), such that .$\vec x^* = \sum_{i,\lambda_i=1} \alpha_i \vec v_i$.
        • We can normalize this steady-state if we want proportions (column values sum to 1) rather than absolute numbers

Some Useful Information #

  • If a matrix has .$n$ distinct real eigenvalues, their .$n$ associated eigenvectors are all linearly independent.
  • An eigenspace for a given eigenvalue is the span of all eigenvectors, including .$\vec 0$, and is also a subspace by definition.
  • Say we calculate an eigenvector for an eigenvalue; we can pick any scalar multiple of the result and this will still be an eigenvector, since scaling a vector does not change its direction. This follows from the scalar multiplication property of subspaces.
  • A given eigenvector can only be associated with one eigenvalue, since a vector can only be scaled by some single value upon being transformed by a matrix. But, a eigenvalue can be associated with multiple eigenvectors, the span of which form an eigenspace.
  • If a matrix has some .$\lambda = 0$, then for some .$\vec x, A\vec x = \lambda\vec x =\vec 0$, so .$A$ has a nontrivial null-space. Therefore, it is not invertible.
  • If a matrix has some .$\lambda = 1$, then any initial state that is aligned with the corresponding eigenvector is a steady-state. More generally, any initial state for which .$\lambda = 1$ comprises part of the linear combination potentially has a nonzero steady-state, so long as all other .$|\lambda_i| < 1$.
  • The rotation matrix (that rotates any vector by .$\theta$ degrees counterclockwise) is:

$$A(\theta) = A_R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \\ \end{bmatrix}$$