Problem Set 8: Power Iteration

Problem Set 8: Power Iteration#

Theoretical Questions#

Problem 1: Convergence Conditions#

Let $A \in \mathbb{R}^{n \times n}$ be a diagonalizable matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ ordered such that $|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_n|$.

(a) Prove that if $v^{(0)}$ has a nonzero component in the direction of the eigenvector corresponding to $\lambda_1$, then the power iteration sequence

\[ v^{(k+1)} = \frac{Av^{(k)}}{\|Av^{(k)}\|} \]

converges to an eigenvector corresponding to $\lambda_1$.

(b) Show that the rate of convergence is governed by the ratio $\left|\frac{\lambda_2}{\lambda_1}\right|$. What happens when $|\lambda_1| = |\lambda_2|$?

(c) Express the error in the eigenvalue estimate after $k$ iterations as a function of $k$ and $\left|\frac{\lambda_2}{\lambda_1}\right|$.

Problem 2: Rayleigh Quotient Analysis#

The Rayleigh quotient for a vector $v \neq 0$ and matrix $A$ is defined as:

\[ R(v) = \frac{v^T A v}{v^T v} \]

(a) Prove that if $v$ is an eigenvector of $A$ with eigenvalue $\lambda$, then $R(v) = \lambda$.

(b) Show that for a symmetric matrix $A$, the Rayleigh quotient satisfies $\lambda_{\min} \leq R(v) \leq \lambda_{\max}$ for any nonzero vector $v$.

(c) Derive the gradient $\nabla R(v)$ and show that critical points occur at eigenvectors of $A$.

Problem 3: Shifted Power Iteration#

Consider the shifted power iteration method applied to $(A - \mu I)^{-1}$ where $\mu$ is a scalar.

(a) If $A$ has eigenvalues $\lambda_1, \ldots, \lambda_n$, what are the eigenvalues of $(A - \mu I)^{-1}$?

(b) Explain how choosing $\mu$ close to an eigenvalue $\lambda_i$ can accelerate convergence to the corresponding eigenvector.

(c) What is the relationship between inverse iteration (power iteration on $A^{-1}$) and shifted power iteration with $\mu = 0$?

Problem 4: Deflation Technique#

After finding the dominant eigenpair $(\lambda_1, v_1)$, we can construct a deflated matrix to find the second eigenvalue.

(a) For a symmetric matrix $A$, show that the matrix $A_2 = A - \lambda_1 v_1 v_1^T$ has eigenvalues $0, \lambda_2, \lambda_3, \ldots, \lambda_n$ (assuming $\|v_1\| = 1$).

(b) Why is deflation numerically unstable for non-symmetric matrices?

(c) Propose an alternative approach to find multiple eigenpairs without deflation.

Computational Problems#

Problem 5: Basic Implementation#

Implement the power iteration algorithm in Python with the following specifications:

(a) Write a function power_iteration(A, num_iterations, v0=None) that:

Takes a matrix A (NumPy array)
Performs num_iterations iterations
Returns the dominant eigenvalue estimate and corresponding eigenvector
Uses random initialization if v0 is not provided
Normalizes the vector at each iteration

(b) Test your implementation on the matrix:

\[\begin{split} A = \begin{bmatrix} 4 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{bmatrix} \end{split}\]

Compare your results with numpy.linalg.eig().

(c) Plot the convergence of the eigenvalue estimate as a function of iteration number. Use matplotlib.pyplot to create a semi-log plot showing the absolute error $|\lambda^{(k)} - \lambda_1|$.

Problem 6: Convergence Rate Analysis#

Consider the matrices:

\[\begin{split} A_1 = \begin{bmatrix} 5 & 0 \\ 0 & 4.5 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 5 & 0 \\ 0 & 0.1 \end{bmatrix} \end{split}\]

(a) For each matrix, compute the ratio $\left|\frac{\lambda_2}{\lambda_1}\right|$ and predict the convergence rate.

(b) Run power iteration for 50 iterations on each matrix starting from $v^{(0)} = [1, 1]^T$. Plot all three convergence curves on the same semi-log plot.

(c) Verify empirically that the number of iterations required to achieve error $< 10^{-6}$ is approximately proportional to $-\log_{10}|\lambda_2/\lambda_1|$.

Problem 7: Failure Modes#

Explore cases where power iteration fails or behaves unexpectedly.

(a) Apply power iteration to the matrix:

\[\begin{split} B = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \end{split}\]

What happens? Explain why in terms of the eigenvalues of $B$.

(b) Create a $3 \times 3$ matrix with two dominant eigenvalues $\lambda_1 = 5$, $\lambda_2 = -5$, and $\lambda_3 = 1$. Run power iteration from several random starting vectors and observe the behavior.

(c) Implement a check in your power iteration function to detect oscillating behavior and issue a warning.

Problem 8: PageRank Application#

The Google PageRank algorithm is essentially power iteration on a modified adjacency matrix.

Consider a simple web graph with 4 pages and link structure represented by:

\[\begin{split} L = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \end{split}\]

where $L_{ij} = 1$ if page $i$ links to page $j$.

(a) Construct the Markov matrix $M$ where $M_{ij} = L_{ij}/d_i$ and $d_i$ is the out-degree of page $i$ (number of outgoing links). Handle dangling nodes (pages with no outgoing links).

(b) Construct the Google matrix $G = \alpha M + (1-\alpha)\frac{1}{n}E$ where $\alpha = 0.85$ and $E$ is the matrix of all ones.

(c) Use power iteration to find the PageRank vector (the dominant eigenvector of $G$). What are the relative rankings of the four pages?

Problem 9: Inverse Iteration#

Research and implement inverse iteration to find the eigenvalue closest to a given shift $\mu$.

(a) Write a function inverse_iteration(A, mu, num_iterations, v0=None) that performs power iteration on $(A - \mu I)^{-1}$ using scipy.linalg.solve() instead of explicitly computing the inverse.

(b) Apply your function to find all eigenvalues of: $$C = \begin{bmatrix} 1 & 2 & 0 \\ 2 & 2 & 2 \\ 0 & 2 & 3 \end{bmatrix}$$ by choosing appropriate shifts near 0, 2, and 4.

(c) Compare the convergence speed of inverse iteration versus standard power iteration for finding the smallest eigenvalue (in absolute value).

Problem 10: Symmetric vs. Non-Symmetric Matrices#

Compare power iteration performance on symmetric and non-symmetric matrices.

(a) Generate a random $10 \times 10$ symmetric positive definite matrix using:

A_sym = Q @ np.diag(eigenvalues) @ Q.T

where Q is a random orthogonal matrix and eigenvalues are chosen with a clear dominant eigenvalue.

(b) Generate a non-symmetric matrix with the same eigenvalues using:

A_nonsym = Q1 @ np.diag(eigenvalues) @ Q2.T

where Q1 and Q2 are different random orthogonal matrices.

(c) Run power iteration on both matrices and compare:

Convergence speed
Stability (variation in results from different random initializations)
Behavior of the Rayleigh quotient over iterations

Create visualizations showing these comparisons.

Challenge Problems#

Problem 11: QR Algorithm Connection#

The QR algorithm for computing all eigenvalues can be viewed as simultaneous power iteration on multiple vectors.

(a) Research and explain the relationship between the QR iteration and power iteration.

(b) Implement one step of the QR algorithm using numpy.linalg.qr() and show that the diagonal entries of the resulting matrix converge to eigenvalues.

(c) For a $4 \times 4$ symmetric tridiagonal matrix of your choice, compare the convergence of:

Power iteration (for the dominant eigenvalue only)
QR algorithm (for all eigenvalues)

Problem 12: Arnoldi/Lanczos Preview#

For large sparse matrices, the Arnoldi iteration (or Lanczos for symmetric matrices) is more efficient than power iteration.

(a) Research Arnoldi iteration and explain conceptually how Arnoldi iteration builds a Krylov subspace $\mathcal{K}_m(A, v) = \text{span}\{v, Av, A^2v, \ldots, A^{m-1}v\}$.

(b) Research Lanczos iteration and implement a basic Lanczos iteration for a symmetric matrix that builds an orthonormal basis of the Krylov subspace using Gram-Schmidt orthogonalization.

(c) Apply your implementation to a $100 \times 100$ symmetric tridiagonal matrix and compare the efficiency (iterations needed) with standard power iteration.