CoCalc -- section-matrix-groups.ipynb

License: OTHER

Section 12.1 Matrix Groups

Subsection Some Facts from Linear Algebra

Before we study matrix groups, we must recall some basic facts from linear algebra. One of the most fundamental ideas of linear algebra is that of a linear transformation. A linear transformation or linear map $T : {\mathbb R}^n \rightarrow {\mathbb R}^m$ is a map that preserves vector addition and scalar multiplication; that is, for vectors ${\mathbf x}$ and ${\mathbf y}$ in ${\mathbb R}^n$ and a scalar $\alpha \in {\mathbb R}\text{,}$

\begin{align*} T({\mathbf x}+{\mathbf y}) & = T({\mathbf x}) + T({\mathbf y})\\ T(\alpha {\mathbf y}) & = \alpha T({\mathbf y}). \end{align*}

An $m \times n$ matrix with entries in ${\mathbb R}$ represents a linear transformation from ${\mathbb R}^n$ to ${\mathbb R}^m\text{.}$ If we write vectors ${\mathbf x} = (x_1, \ldots, x_n)^{\rm t}$ and ${\mathbf y} = (y_1, \ldots, y_n)^{\rm t}$ in ${\mathbb R}^n$ as column matrices, then an $m \times n$ matrix

\begin{equation*} A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \end{equation*}

maps the vectors to ${\mathbb R}^m$ linearly by matrix multiplication. Observe that if $\alpha$ is a real number,

\begin{equation*} A({\mathbf x} + {\mathbf y} ) = A {\mathbf x }+ A {\mathbf y} \qquad \text{and} \qquad \alpha A {\mathbf x} = A ( \alpha {\mathbf x}), \end{equation*}

where

\begin{equation*} {\mathbf x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}. \end{equation*}

We will often abbreviate the matrix $A$ by writing $(a_{ij})\text{.}$

Conversely, if $T : {\mathbb R}^n \rightarrow {\mathbb R}^m$ is a linear map, we can associate a matrix $A$ with $T$ by considering what $T$ does to the vectors

\begin{align*} {\mathbf e}_1 & = (1, 0, \ldots, 0)^{\rm t}\\ {\mathbf e}_2 & = (0, 1, \ldots, 0)^{\rm t}\\ & \vdots & \\ {\mathbf e}_n & = (0, 0, \ldots, 1)^{\rm t}. \end{align*}

We can write any vector ${\mathbf x} = (x_1, \ldots, x_n)^{\rm t}$ as

\begin{equation*} x_1 {\mathbf e}_1 + x_2 {\mathbf e}_2 + \cdots + x_n {\mathbf e}_n. \end{equation*}

Consequently, if

\begin{align*} T({\mathbf e}_1) & = (a_{11}, a_{21}, \ldots, a_{m1})^{\rm t},\\ T({\mathbf e}_2) & = (a_{12}, a_{22}, \ldots, a_{m2})^{\rm t},\\ & \vdots & \\ T({\mathbf e}_n) & = (a_{1n}, a_{2n}, \ldots, a_{mn})^{\rm t}, \end{align*}

then

\begin{align*} T({\mathbf x} ) & = T(x_1 {\mathbf e}_1 + x_2 {\mathbf e}_2 + \cdots + x_n {\mathbf e}_n)\\ & = x_1 T({\mathbf e}_1) + x_2 T({\mathbf e}_2) + \cdots + x_n T({\mathbf e}_n)\\ & = \left( \sum_{k=1}^{n} a_{1k} x_k, \ldots, \sum_{k=1}^{n} a_{mk} x_k \right)^{\rm t}\\ & = A {\mathbf x}. \end{align*}

Example 12.1

If we let $T : {\mathbb R}^2 \rightarrow {\mathbb R}^2$ be the map given by

\begin{equation*} T(x_1, x_2) = (2 x_1 + 5 x_2, - 4 x_1 + 3 x_2), \end{equation*}

the axioms that $T$ must satisfy to be a linear transformation are easily verified. The column vectors $T {\mathbf e}_1 = (2, -4)^{\rm t}$ and $T {\mathbf e}_2 = (5,3)^{\rm t}$ tell us that $T$ is given by the matrix

\begin{equation*} A = \begin{pmatrix} 2 & 5 \\ -4 & 3 \end{pmatrix}. \end{equation*}
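The recipe of reading off the columns of $A$ from $T({\mathbf e}_1), \ldots, T({\mathbf e}_n)$ can be sketched in a few lines of Python. This is only an illustration for the map of Example 12.1; the helper names `matrix_of` and `apply` are invented for this sketch and are not part of any library.

```python
from fractions import Fraction

def T(x1, x2):
    """The linear map of Example 12.1."""
    return (2 * x1 + 5 * x2, -4 * x1 + 3 * x2)

def matrix_of(linear_map, n):
    """Build the matrix whose j-th column is linear_map(e_j) (hypothetical helper)."""
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]
        columns.append(linear_map(*e_j))
    m = len(columns[0])
    # Turn the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply(A, x):
    """Matrix-vector product A x, with x given as a tuple."""
    return tuple(sum(a * xk for a, xk in zip(row, x)) for row in A)

A = matrix_of(T, 2)
x = (Fraction(7, 3), -2)

print(A)                      # rows of the matrix recovered from T(e_1), T(e_2)
print(apply(A, x) == T(*x))   # A x agrees with T(x) on a sample vector
```

Exact rational arithmetic from `fractions` is used so that the comparison between $A{\mathbf x}$ and $T({\mathbf x})$ is not affected by rounding.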

Since we are interested in groups of matrices, we need to know which matrices have multiplicative inverses. Recall that an $n \times n$ matrix $A$ is invertible exactly when there exists another matrix $A^{-1}$ such that $A A^{-1} = A^{-1} A = I\text{,}$ where

\begin{equation*} I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \end{equation*}

is the $n \times n$ identity matrix. From linear algebra we know that $A$ is invertible if and only if the determinant of $A$ is nonzero. Sometimes an invertible matrix is said to be nonsingular.

Example 12.2

If $A$ is the matrix

\begin{equation*} \begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}, \end{equation*}

then the inverse of $A$ is

\begin{equation*} A^{-1} = \begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix}. \end{equation*}

We are guaranteed that $A^{-1}$ exists, since $\det(A) = 2 \cdot 3 - 5 \cdot 1 = 1$ is nonzero.

Some other facts about determinants will also prove useful in the course of this chapter. Let $A$ and $B$ be $n \times n$ matrices. From linear algebra we have the following properties of determinants.

The determinant is a homomorphism into the multiplicative group of real numbers; that is, $\det( A B) = (\det A )(\det B)\text{.}$
If $A$ is an invertible matrix, then $\det(A^{-1}) = 1 / \det A\text{.}$
If we define the transpose of a matrix $A = (a_{ij})$ to be $A^{\rm t} = (a_{ji})\text{,}$ then $\det(A^{\rm t}) = \det A\text{.}$
Let $T$ be the linear transformation associated with an $n \times n$ matrix $A\text{.}$ Then $T$ multiplies volumes by a factor of $|\det A|\text{.}$ In the case of ${\mathbb R}^2\text{,}$ this means that $T$ multiplies areas by $|\det A|\text{.}$

Linear maps, matrices, and determinants are covered in any elementary linear algebra text; however, if you have not had a course in linear algebra, it is a straightforward process to verify these properties directly for $2 \times 2$ matrices, the case with which we are most concerned.
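For readers who prefer to experiment, the first three determinant properties can be spot-checked for $2 \times 2$ matrices with a short Python sketch. The matrices $A$ and $B$ below are arbitrary choices made for this illustration, and the inverse of $A$ is written out by hand and confirmed by multiplication; a check on one pair of matrices is of course not a proof.

```python
from fractions import Fraction

def det2(M):
    """Determinant ad - bc of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = M
    return a * d - b * c

def matmul2(M, N):
    """Product of two 2 x 2 matrices."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose2(M):
    """Transpose of a 2 x 2 matrix."""
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

I2 = [[1, 0], [0, 1]]
A = [[2, 5], [-4, 3]]   # sample matrix, det A = 26
B = [[1, -2], [3, 7]]   # sample matrix, det B = 13

# Inverse of A written out by hand as exact fractions.
A_inv = [[Fraction(3, 26), Fraction(-5, 26)],
         [Fraction(4, 26), Fraction(2, 26)]]

print(matmul2(A, A_inv) == I2)                       # A_inv really is the inverse
print(det2(matmul2(A, B)) == det2(A) * det2(B))      # det(AB) = det(A) det(B)
print(det2(A_inv) == Fraction(1, det2(A)))           # det(A^{-1}) = 1 / det(A)
print(det2(transpose2(A)) == det2(A))                # det(A^t) = det(A)
```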

SubsectionThe General and Special Linear Groups

The set of all $n \times n$ invertible matrices forms a group called the general linear group. We will denote this group by $GL_n({\mathbb R})\text{.}$ The general linear group has several important subgroups. The multiplicative properties of the determinant imply that the set of matrices with determinant one is a subgroup of the general linear group. Stated another way, suppose that $\det(A) =1$ and $\det(B) = 1\text{.}$ Then $\det(AB) = \det(A) \det (B) = 1$ and $\det(A^{-1}) = 1 / \det A = 1\text{.}$ This subgroup is called the special linear group and is denoted by $SL_n({\mathbb R})\text{.}$

Example 12.3

Given a $2 \times 2$ matrix

\begin{equation*} A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \end{equation*}

the determinant of $A$ is $ad-bc\text{.}$ The group $GL_2({\mathbb R})$ consists of those matrices in which $ad-bc \neq 0\text{.}$ The inverse of $A$ is

\begin{equation*} A^{-1} = \frac{1}{ad-bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \end{equation*}

If $A$ is in $SL_2({\mathbb R})\text{,}$ then

\begin{equation*} A^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \end{equation*}
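The inverse formula and the two membership conditions translate directly into code. The following Python sketch is one possible rendering; the function names `in_GL2`, `in_SL2`, and `inverse2` are made up for this illustration, and the sample matrix is the one from Example 12.2.

```python
from fractions import Fraction

def det2(M):
    """Determinant ad - bc."""
    (a, b), (c, d) = M
    return a * d - b * c

def in_GL2(M):
    """Membership test for GL_2(R): nonzero determinant."""
    return det2(M) != 0

def in_SL2(M):
    """Membership test for SL_2(R): determinant exactly one."""
    return det2(M) == 1

def inverse2(M):
    """Inverse via (1 / (ad - bc)) * [[d, -b], [-c, a]], computed exactly."""
    if not in_GL2(M):
        raise ValueError("matrix is not invertible")
    (a, b), (c, d) = M
    delta = Fraction(det2(M))
    return [[d / delta, -b / delta], [-c / delta, a / delta]]

A = [[2, 1], [5, 3]]        # the matrix of Example 12.2
print(in_SL2(A))            # determinant is 2*3 - 5*1
print(inverse2(A))          # no denominators appear when det = 1
print(in_GL2([[1, 2], [2, 4]]))   # a singular matrix fails the test
```

When the determinant is one, the division by `delta` changes nothing, which is exactly the simplification stated for $SL_2({\mathbb R})$.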

Geometrically, $SL_2({\mathbb R})$ is the group that preserves the areas of parallelograms. Let

\begin{equation*} A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \end{equation*}

be in $SL_2({\mathbb R})\text{.}$ In Figure 12.4, the unit square corresponding to the vectors ${\mathbf x} = (1,0)^{\rm t}$ and ${\mathbf y} = (0,1)^{\rm t}$ is taken by $A$ to the parallelogram with sides $(1,0)^{\rm t}$ and $(1, 1)^{\rm t}\text{;}$ that is, $A {\mathbf x} = (1,0)^{\rm t}$ and $A {\mathbf y} = (1, 1)^{\rm t}\text{.}$ Notice that these two parallelograms have the same area.

Figure 12.4 $SL_2(\mathbb R)$ acting on the unit square
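The claim that the shear preserves area can be checked without appealing to the determinant. The Python sketch below pushes the four corners of the unit square through $A$ and measures both polygons with the shoelace formula, which is a standard area formula for polygons that the text itself does not use; the helper names are invented for this illustration.

```python
def apply(A, v):
    """Image of the point v = (x, y) under the 2 x 2 matrix A."""
    return (A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1])

def shoelace_area(vertices):
    """Area of a simple polygon whose vertices are listed in order."""
    total = 0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

A = [[1, 1], [0, 1]]                            # the shear from the example
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # corners in counterclockwise order
image = [apply(A, v) for v in unit_square]

print(image)                         # corners of the parallelogram
print(shoelace_area(unit_square))    # area before applying A
print(shoelace_area(image))          # area after applying A
```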

Subsection The Orthogonal Group $O(n)$

Another subgroup of $GL_n({\mathbb R})$ is the orthogonal group. A matrix $A$ is orthogonal if $A^{-1} = A^{\rm t}\text{.}$ The orthogonal group consists of the set of all orthogonal matrices. We write $O(n)$ for the $n \times n$ orthogonal group. We leave as an exercise the proof that $O(n)$ is a subgroup of $GL_n( {\mathbb R})\text{.}$

Example 12.5

The following matrices are orthogonal:

\begin{equation*} \begin{pmatrix} 3/5 & -4/5 \\ 4/5 & 3/5 \end{pmatrix}, \quad \begin{pmatrix} 1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{pmatrix}, \quad \begin{pmatrix} -1/\sqrt{2} & 0 & 1/ \sqrt{2} \\ 1/\sqrt{6} & -2/\sqrt{6} & 1/\sqrt{6} \\ 1/ \sqrt{3} & 1/ \sqrt{3} & 1/ \sqrt{3} \end{pmatrix}. \end{equation*}
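A numerical check of these three matrices is straightforward. The Python sketch below tests whether $A^{\rm t} A$ equals the identity up to a small floating-point tolerance; the function name `is_orthogonal` and the tolerance `1e-12` are choices made for this illustration only.

```python
import math

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(M, N):
    """Matrix product of M and N."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)]
            for row in M]

def is_orthogonal(M, tol=1e-12):
    """True when M^t M is the identity up to the tolerance tol."""
    n = len(M)
    P = matmul(transpose(M), M)
    return all(math.isclose(P[i][j], 1.0 if i == j else 0.0, abs_tol=tol)
               for i in range(n) for j in range(n))

r2, r3, r6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
examples = [
    [[3/5, -4/5], [4/5, 3/5]],
    [[1/2, -r3/2], [r3/2, 1/2]],
    [[-1/r2, 0, 1/r2], [1/r6, -2/r6, 1/r6], [1/r3, 1/r3, 1/r3]],
]
shear = [[1, 1], [0, 1]]   # invertible, but not orthogonal

print([is_orthogonal(M) for M in examples])
print(is_orthogonal(shear))
```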

There is a more geometric way of viewing the group $O(n)\text{.}$ The orthogonal matrices are exactly those matrices that preserve the length of vectors. We can define the length of a vector using the Euclidean inner product, or dot product, of two vectors. The Euclidean inner product of two vectors ${\mathbf x}=(x_1, \ldots, x_n)^{\rm t}$ and ${\mathbf y}=(y_1, \ldots, y_n)^{\rm t}$ is

\begin{equation*} \langle {\mathbf x}, {\mathbf y} \rangle = {\mathbf x}^{\rm t} {\mathbf y} = (x_1, x_2, \ldots, x_n) \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = x_1 y_1 + \cdots + x_n y_n. \end{equation*}

We define the length of a vector ${\mathbf x}=(x_1, \ldots, x_n)^{\rm t}$ to be

\begin{equation*} \| {\mathbf x} \| = \sqrt{\langle {\mathbf x}, {\mathbf x} \rangle} = \sqrt{x_1^2 + \cdots + x_n^2}. \end{equation*}

Associated with the notion of the length of a vector is the idea of the distance between two vectors. We define the distance between two vectors ${\mathbf x}$ and ${\mathbf y}$ to be $\| {\mathbf x}-{\mathbf y} \|\text{.}$ We leave as an exercise the proof of the following proposition about the properties of Euclidean inner products.

Proposition 12.6

Let ${\mathbf x}\text{,}$ ${\mathbf y}\text{,}$ and ${\mathbf w}$ be vectors in ${\mathbb R}^n$ and $\alpha \in {\mathbb R}\text{.}$ Then

$\langle {\mathbf x}, {\mathbf y} \rangle = \langle {\mathbf y}, {\mathbf x} \rangle\text{.}$
$\langle {\mathbf x}, {\mathbf y} + {\mathbf w} \rangle = \langle {\mathbf x}, {\mathbf y} \rangle + \langle {\mathbf x}, {\mathbf w} \rangle\text{.}$
$\langle \alpha {\mathbf x}, {\mathbf y} \rangle = \langle {\mathbf x}, \alpha {\mathbf y} \rangle = \alpha \langle {\mathbf x}, {\mathbf y} \rangle\text{.}$
$\langle {\mathbf x}, {\mathbf x} \rangle \geq 0$ with equality exactly when ${\mathbf x} = 0\text{.}$
If $\langle {\mathbf x}, {\mathbf y} \rangle = 0$ for all ${\mathbf x}$ in ${\mathbb R}^n\text{,}$ then ${\mathbf y} = 0\text{.}$

Example 12.7

The vector ${\mathbf x} =(3,4)^{\rm t}$ has length $\sqrt{3^2 + 4^2} = 5\text{.}$ We can also see that the orthogonal matrix

\begin{equation*} A= \begin{pmatrix} 3/5 & -4/5 \\ 4/5 & 3/5 \end{pmatrix} \end{equation*}

preserves the length of this vector. The vector $A{\mathbf x} = (-7/5,24/5)^{\rm t}$ also has length 5.
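The arithmetic in this example can be reproduced exactly with rational numbers. In the Python sketch below, `inner` and `length` are illustrative helper names implementing the inner product and length defined above.

```python
import math
from fractions import Fraction

def inner(x, y):
    """Euclidean inner product x_1 y_1 + ... + x_n y_n."""
    return sum(a * b for a, b in zip(x, y))

def length(x):
    """Length of x, the square root of <x, x>."""
    return math.sqrt(inner(x, x))

def apply(A, x):
    """Matrix-vector product A x."""
    return tuple(sum(a * xk for a, xk in zip(row, x)) for row in A)

A = [[Fraction(3, 5), Fraction(-4, 5)],
     [Fraction(4, 5), Fraction(3, 5)]]
x = (3, 4)
Ax = apply(A, x)

print(Ax)                      # the image vector, as exact fractions
print(length(x), length(Ax))   # the two lengths agree
```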

Since $\det(A A^{\rm t}) = \det(I) = 1$ and $\det(A) = \det( A^{\rm t} )\text{,}$ the determinant of any orthogonal matrix is either 1 or $-1\text{.}$ Consider the column vectors

\begin{equation*} {\mathbf a}_j = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{nj} \end{pmatrix} \end{equation*}

of the orthogonal matrix $A= (a_{ij})\text{.}$ Since $A^{\rm t} A = I$ and the $(r,s)$ entry of $A^{\rm t} A$ is $\langle {\mathbf a}_r, {\mathbf a}_s \rangle\text{,}$ we have $\langle {\mathbf a}_r, {\mathbf a}_s \rangle = \delta_{rs}\text{,}$ where

\begin{equation*} \delta_{rs} = \left\{ \begin{array}{cc} 1 & r = s \\ 0 & r \neq s \end{array} \right. \end{equation*}

is the Kronecker delta. Accordingly, column vectors of an orthogonal matrix all have length 1; and the Euclidean inner product of distinct column vectors is zero. Any set of vectors satisfying these properties is called an orthonormal set. Conversely, given an $n \times n$ matrix $A$ whose columns form an orthonormal set, it follows that $A^{-1} = A^{\rm t}\text{.}$

We say that a matrix $A$ is distance-preserving, length-preserving, or inner product-preserving when $\| A{\mathbf x}- A{\mathbf y} \| =\| {\mathbf x}- {\mathbf y} \|\text{,}$ $\| A{\mathbf x} \| =\| {\mathbf x} \|\text{,}$ or $\langle A{\mathbf x}, A{\mathbf y} \rangle = \langle {\mathbf x},{\mathbf y} \rangle\text{,}$ respectively. The following theorem, which characterizes the orthogonal group, says that these notions are the same.

Theorem 12.8

Let $A$ be an $n \times n$ matrix. The following statements are equivalent.

(1) The columns of the matrix $A$ form an orthonormal set.
(2) $A^{-1} = A^{\rm t}\text{.}$
(3) For vectors ${\mathbf x}$ and ${\mathbf y}\text{,}$ $\langle A{\mathbf x}, A {\mathbf y} \rangle = \langle {\mathbf x}, {\mathbf y} \rangle\text{.}$
(4) For vectors ${\mathbf x}$ and ${\mathbf y}\text{,}$ $\| A{\mathbf x}- A{\mathbf y} \| = \| {\mathbf x}- {\mathbf y} \|\text{.}$
(5) For any vector ${\mathbf x}\text{,}$ $\| A{\mathbf x} \| = \| {\mathbf x}\|\text{.}$

Proof

We have already shown (1) and (2) to be equivalent.

$(2) \Rightarrow (3)\text{.}$

\begin{align*} \langle A{\mathbf x}, A{\mathbf y} \rangle & = (A {\mathbf x})^{\rm t} A {\mathbf y}\\ & = {\mathbf x}^{\rm t} A^{\rm t} A {\mathbf y}\\ & = {\mathbf x}^{\rm t} {\mathbf y}\\ & = \langle {\mathbf x}, {\mathbf y} \rangle. \end{align*}

$(3) \Rightarrow (2)\text{.}$ Since

\begin{align*} \langle {\mathbf x}, {\mathbf x} \rangle & = \langle A{\mathbf x}, A{\mathbf x} \rangle\\ & = {\mathbf x}^{\rm t} A^{\rm t} A {\mathbf x}\\ & = \langle {\mathbf x}, A^{\rm t} A{\mathbf x} \rangle, \end{align*}

we know that $\langle {\mathbf x}, (A^{\rm t} A - I){\mathbf x} \rangle = 0$ for all ${\mathbf x}\text{.}$ Because $A^{\rm t} A - I$ is a symmetric matrix, this forces $A^{\rm t} A -I = 0\text{,}$ or $A^{-1} = A^{\rm t}\text{.}$

$(3) \Rightarrow (4)\text{.}$ If $A$ is inner product-preserving, then $A$ is distance-preserving, since

\begin{align*} \| A{\mathbf x} - A{\mathbf y} \|^2 & = \| A({\mathbf x} - {\mathbf y}) \|^2\\ & = \langle A({\mathbf x} - {\mathbf y}), A({\mathbf x} - {\mathbf y}) \rangle\\ & = \langle {\mathbf x} - {\mathbf y}, {\mathbf x} - {\mathbf y} \rangle\\ & = \| {\mathbf x} - {\mathbf y} \|^2. \end{align*}

$(4) \Rightarrow (5)\text{.}$ If $A$ is distance-preserving, then $A$ is length-preserving. Letting ${\mathbf y} = 0\text{,}$ we have

\begin{equation*} \| A{\mathbf x}\| = \| A{\mathbf x}- A{\mathbf y} \| = \| {\mathbf x}- {\mathbf y} \| = \| {\mathbf x} \|. \end{equation*}

$(5) \Rightarrow (3)\text{.}$ We use the following identity to show that length-preserving implies inner product-preserving:

\begin{equation*} \langle {\mathbf x}, {\mathbf y} \rangle = \frac{1}{2} \left[ \|{\mathbf x} +{\mathbf y}\|^2 - \|{\mathbf x}\|^2 - \|{\mathbf y}\|^2 \right]. \end{equation*}

Observe that

\begin{align*} \langle A {\mathbf x}, A {\mathbf y} \rangle & = \frac{1}{2} \left[ \|A {\mathbf x} + A {\mathbf y} \|^2 - \|A {\mathbf x} \|^2 - \|A {\mathbf y} \|^2 \right]\\ & = \frac{1}{2} \left[ \|A ( {\mathbf x} + {\mathbf y} ) \|^2 - \|A {\mathbf x} \|^2 - \|A {\mathbf y} \|^2 \right]\\ & = \frac{1}{2} \left[ \|{\mathbf x} + {\mathbf y}\|^2 - \|{\mathbf x}\|^2 - \|{\mathbf y}\|^2 \right]\\ & = \langle {\mathbf x}, {\mathbf y} \rangle. \end{align*}
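The identity used in the last step, together with conditions (3), (4), and (5), can be spot-checked on a concrete orthogonal matrix. The Python sketch below works with squared lengths so that all arithmetic stays exact; the vectors ${\mathbf x}$ and ${\mathbf y}$ are arbitrary sample values chosen for this illustration, and a check on sample vectors is not a substitute for the proof.

```python
from fractions import Fraction

def inner(x, y):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))

def norm_sq(x):
    """Squared length <x, x>."""
    return inner(x, x)

def apply(A, x):
    """Matrix-vector product A x."""
    return tuple(sum(a * xk for a, xk in zip(row, x)) for row in A)

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))

def polarization(x, y):
    """Right-hand side of the identity: (|x + y|^2 - |x|^2 - |y|^2) / 2."""
    return Fraction(norm_sq(add(x, y)) - norm_sq(x) - norm_sq(y), 2)

A = [[Fraction(3, 5), Fraction(-4, 5)],
     [Fraction(4, 5), Fraction(3, 5)]]
x, y = (1, 2), (-3, 5)          # sample vectors
Ax, Ay = apply(A, x), apply(A, y)

print(polarization(x, y) == inner(x, y))            # the identity itself
print(inner(Ax, Ay) == inner(x, y))                 # inner products preserved
print(norm_sq(sub(Ax, Ay)) == norm_sq(sub(x, y)))   # distances preserved
print(norm_sq(Ax) == norm_sq(x))                    # lengths preserved
```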

Figure 12.9 $O(2)$ acting on $\mathbb R^2$

Example 12.10

Let us examine the orthogonal group on ${\mathbb R}^2$ a bit more closely. An element $T \in O(2)$ is determined by its action on ${\mathbf e}_1 = (1, 0)^{\rm t}$ and ${\mathbf e}_2 = (0, 1)^{\rm t}\text{.}$ If $T({\mathbf e}_1) = (a,b)^{\rm t}\text{,}$ then $a^2 + b^2 = 1\text{.}$ Since $T({\mathbf e}_2)$ is a unit vector orthogonal to $T({\mathbf e}_1)\text{,}$ it must be $\pm(-b, a)^{\rm t}\text{.}$ When $T({\mathbf e}_2) = (-b, a)^{\rm t}\text{,}$ $T$ can be represented by

\begin{equation*} A = \begin{pmatrix} a & -b \\ b & a \end{pmatrix} = \begin{pmatrix} \cos \theta & - \sin \theta \\ \sin \theta & \cos \theta \end{pmatrix}, \end{equation*}

where $0 \leq \theta \lt 2 \pi\text{.}$ A matrix $T$ in $O(2)$ either reflects or rotates a vector in ${\mathbb R}^2$ (Figure 12.9). A reflection about the horizontal axis is given by the matrix

\begin{equation*} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \end{equation*}

whereas a rotation by an angle $\theta$ in a counterclockwise direction must come from a matrix of the form

\begin{equation*} \begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix}. \end{equation*}

A reflection about a line $\ell$ is simply a reflection about the horizontal axis followed by a rotation, and so has the form

\begin{equation*} \begin{pmatrix} \cos \theta & \sin \theta \\ \sin \theta & -\cos \theta \end{pmatrix}. \end{equation*}

If $\det A =-1\text{,}$ then $A$ gives a reflection.
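To see how the determinant separates the two kinds of elements of $O(2)\text{,}$ one can compose the reflection about the horizontal axis with a rotation numerically. The Python sketch below uses the angle $\pi/3$ as an arbitrary sample value; the names `rotation` and `FLIP` are invented for this illustration.

```python
import math

def rotation(theta):
    """Counterclockwise rotation by theta: [[cos, -sin], [sin, cos]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

FLIP = [[1, 0], [0, -1]]   # reflection about the horizontal axis

def matmul2(M, N):
    """Product of two 2 x 2 matrices."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(M):
    (a, b), (c, d) = M
    return a * d - b * c

theta = math.pi / 3            # sample angle
R = rotation(theta)
S = matmul2(R, FLIP)           # flip first, then rotate

print(det2(R))                 # close to 1: a rotation
print(det2(S))                 # close to -1: a reflection
print(S)                       # entries cos, sin / sin, -cos
print(matmul2(S, S))           # reflecting twice returns (approximately) the identity
```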

Two of the other matrix or matrix-related groups that we will consider are the special orthogonal group and the group of Euclidean motions. The special orthogonal group, $SO(n)\text{,}$ is just the intersection of $O(n)$ and $SL_n({\mathbb R})\text{;}$ that is, those elements in $O(n)$ with determinant one. The Euclidean group, $E(n)\text{,}$ can be written as ordered pairs $(A, {\mathbf x})\text{,}$ where $A$ is in $O(n)$ and ${\mathbf x}$ is in ${\mathbb R}^n\text{.}$ We define multiplication by

\begin{equation*} (A, {\mathbf x}) (B, {\mathbf y}) = (AB, A {\mathbf y} +{\mathbf x}). \end{equation*}

The identity of the group is $(I,{\mathbf 0})\text{;}$ the inverse of $(A, {\mathbf x})$ is $(A^{-1}, -A^{-1} {\mathbf x})\text{.}$ In Exercise 12.3.6, you are asked to check that $E(n)$ is indeed a group under this operation.
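The multiplication rule, identity, and inverse of $E(2)$ can be modeled with pairs in Python. The sketch below rests on one assumption that the text does not spell out: the pair $(A, {\mathbf x})$ is taken to act on a vector ${\mathbf v}$ by ${\mathbf v} \mapsto A{\mathbf v} + {\mathbf x}\text{,}$ which is the action that makes the product rule correspond to composition. All function names and sample elements are made up for this illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Matrix product, matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def apply(A, v):
    """Matrix-vector product A v."""
    return tuple(sum(a * vk for a, vk in zip(row, v)) for row in A)

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def transpose(A):
    return [list(col) for col in zip(*A)]

def multiply(g, h):
    """(A, x)(B, y) = (AB, Ay + x)."""
    (A, x), (B, y) = g, h
    return (matmul(A, B), add(apply(A, y), x))

def inverse(g):
    """(A, x)^{-1} = (A^{-1}, -A^{-1} x), using A^{-1} = A^t for orthogonal A."""
    A, x = g
    A_inv = transpose(A)
    return (A_inv, tuple(-c for c in apply(A_inv, x)))

def act(g, v):
    """Assumed action of (A, x) on a vector: v -> A v + x."""
    A, x = g
    return add(apply(A, v), x)

identity = ([[1, 0], [0, 1]], (0, 0))
g = ([[Fraction(3, 5), Fraction(-4, 5)], [Fraction(4, 5), Fraction(3, 5)]], (1, 2))
h = ([[0, -1], [1, 0]], (0, -3))    # quarter turn followed by a translation
v = (2, 7)                          # sample vector

print(multiply(g, inverse(g)) == identity)
print(multiply(inverse(g), g) == identity)
print(act(multiply(g, h), v) == act(g, act(h, v)))   # product = composition
```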

Figure 12.11 Translations in $\mathbb R^2$