# Solving Linear Equations

## 18.1 Introduction

This lecture is about solving linear equations of the form Ax = b, where A is a symmetric positive semi-definite matrix. In fact, we will usually assume that A is diagonally dominant, and will sometimes assume that it is in fact positive definite.

We will briefly describe the Chebyshev method for solving such systems, and the related Conjugate Gradient. We will then explain how these methods can be accelerated by preconditioning. In the next few lectures, we will see how material from the class can be applied to construct good preconditioners.

## 18.2 Approximate Inverses

In undergraduate linear algebra classes, we are taught to solve a system like Ax = b by first computing the inverse of A, call it C, and then setting x = Cb. However, this can take way too long. In this lecture, we will see some ways of quickly computing approximate inverses of A, when A is symmetric and positive semi-definite. By "quickly", we mean that we might not actually compute the entries of the approximate inverse, but rather just provide a fast procedure for multiplying it by a vector. I'll talk more about that part later, so let's first consider a reasonable notion of approximate.

The fact that the inverse C satisfies AC = I is equivalent to saying that all eigenvalues of AC are 1. We will call C an $\epsilon$-approximate inverse of A if all eigenvalues of AC lie in $[1 - \epsilon, 1 + \epsilon]$. In this case, all eigenvalues of $I - AC$ lie in $[-\epsilon, \epsilon]$, from which we can see that C can be used to approximately solve linear systems in A. If we set x = Cb, then

$$\|Ax - b\| = \|ACb - b\| = \|(AC - I)b\|.$$

In our construction, we will guarantee that AC is symmetric, and so

$$\|(AC - I)b\| \leq \|AC - I\| \, \|b\| \leq \epsilon \|b\|.$$
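To make the definition concrete, here is a small numerical check (my illustration, not part of the lecture): we construct a C for which AC is symmetric with eigenvalues in $[1 - \epsilon, 1 + \epsilon]$, and confirm the residual bound above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 50, 0.1
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)            # symmetric positive definite
lam, V = np.linalg.eigh(A)
d = rng.uniform(1 - eps, 1 + eps, n)   # desired eigenvalues of AC
C = V @ np.diag(d / lam) @ V.T         # then AC = V diag(d) V^T
b = rng.standard_normal(n)
x = C @ b
print(np.linalg.norm(A @ x - b) <= eps * np.linalg.norm(b))   # True
```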

## 18.3 Chebyshev Polynomials

We will exploit matrices C that are polynomials in A, of the form

$$C = \sum_{i=0}^{t-1} c_i A^{i}.$$

Note that one can quickly compute the product Cb: we will iteratively compute $A^{i} b$, and then take the sum. So, if A has m non-zero entries, then this will take O(mt) steps. I should warn you that I'm ignoring some numerical issues here.
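As a concrete illustration (a minimal sketch of mine, not part of the lecture), here is how one might apply such a C to a vector using only matrix-vector products, never forming C itself:

```python
import numpy as np

def apply_poly(A, b, coeffs):
    """Compute C b where C = sum_i coeffs[i] * A^i, using one
    matrix-vector product per term rather than ever forming C.
    A can be anything supporting A @ v (e.g. a scipy.sparse matrix
    with m non-zeros), so t terms cost O(mt) operations."""
    result = coeffs[0] * b
    power = b
    for c in coeffs[1:]:
        power = A @ power              # power is now A^i b
        result = result + c * power
    return result
```

For instance, `apply_poly(A, b, [1.0, 0.5, 0.25])` computes $(I + \tfrac{1}{2}A + \tfrac{1}{4}A^{2})b$ with just two multiplications by A.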

Now, let's find a good choice of polynomials. Recall that we want

$$I - AC = I - A \sum_{i=0}^{t-1} c_i A^{i}$$

to have all its eigenvalues between $-\epsilon$ and $\epsilon$. Let

$$p(x) = 1 - x \sum_{i=0}^{t-1} c_i x^{i}.$$

Then, for each eigenvalue $\lambda$ of A, $p(\lambda)$ is an eigenvalue of $I - AC$. Assuming that A is positive definite, its eigenvalues all lie between $\lambda_{\min}$ and $\lambda_{\max}$, where $0 < \lambda_{\min} \leq \lambda_{\max}$. So, it suffices to find a polynomial p with constant term 1 such that $|p(x)| \leq \epsilon$ for all $x \in [\lambda_{\min}, \lambda_{\max}]$. As in Lecture 8, we will use Chebyshev polynomials to construct p. We recall that the k-th Chebyshev polynomial, $T_k$, has the following properties (which can be verified from the closed form below):

1. $T_k$ has degree k.

2. $T_k(x) \in [-1, 1]$ for $x \in [-1, 1]$.

3. $T_k(x)$ is monotonically increasing for $x \geq 1$.

4. $T_k(1 + \delta) \geq \tfrac{1}{2}\left(1 + \sqrt{2\delta}\right)^{k}$ for $\delta > 0$.
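These properties all follow from the standard closed form for the Chebyshev polynomials (not proved in the lecture):

$$T_k(x) = \begin{cases} \cos(k \arccos x) & \text{for } |x| \leq 1, \\ \cosh(k \,\mathrm{arccosh}\, x) & \text{for } x \geq 1. \end{cases}$$

For instance, property 4 follows because $\cosh(s) \geq e^{s}/2$ and $e^{\mathrm{arccosh}(1+\delta)} = 1 + \delta + \sqrt{2\delta + \delta^2} \geq 1 + \sqrt{2\delta}$.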

To express p(x) in terms of a Chebyshev polynomial, we should map the range on which we want p to be small, $[\lambda_{\min}, \lambda_{\max}]$, to $[-1, 1]$. We will accomplish this with the linear map:

$$l(x) = \frac{\lambda_{\max} + \lambda_{\min} - 2x}{\lambda_{\max} - \lambda_{\min}}.$$

Note that $l(\lambda_{\min}) = 1$, $l(\lambda_{\max}) = -1$, and $l(0) = \frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}} > 1$.

So, to guarantee that the constant term in p(x) is one, we should set

$$p(x) = \frac{T_t(l(x))}{T_t(l(0))}.$$

We know that $|T_t(l(x))| \leq 1$ for $x \in [\lambda_{\min}, \lambda_{\max}]$. To bound p(x) for x in this range, we must compute $T_t(l(0))$.

We will express the result of our computation in terms of the condition number of A, which for a symmetric positive definite matrix is $\kappa = \lambda_{\max} / \lambda_{\min}$. We have

$$l(0) = \frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}} = \frac{\kappa + 1}{\kappa - 1} = 1 + \frac{2}{\kappa - 1},$$

and so by properties 3 and 4 of Chebyshev polynomials,

$$T_t(l(0)) \geq \frac{1}{2}\left(1 + \frac{2}{\sqrt{\kappa - 1}}\right)^{t}.$$

Thus,

$$|p(x)| \leq 2\left(1 + \frac{2}{\sqrt{\kappa - 1}}\right)^{-t}$$

for $x \in [\lambda_{\min}, \lambda_{\max}]$, and so all eigenvalues of $I - AC$ will have absolute value at most this quantity. So, to get accuracy $\epsilon$ in our solution to Ax = b, we need only take $t = O(\sqrt{\kappa} \ln(1/\epsilon))$.
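Here is a quick numerical sanity check of this construction (my sketch; `numpy`'s `chebval` evaluates a Chebyshev series by the Clenshaw recurrence, which is valid outside $[-1, 1]$ as well):

```python
import numpy as np
from numpy.polynomial import chebyshev as Cheb

def chebyshev_p(t, lmin, lmax):
    """p(x) = T_t(l(x)) / T_t(l(0)), where l maps [lmin, lmax]
    onto [-1, 1] with l(lmin) = 1 and l(lmax) = -1."""
    e_t = np.zeros(t + 1)
    e_t[t] = 1.0                       # coefficient vector selecting T_t
    l = lambda x: (lmax + lmin - 2.0 * x) / (lmax - lmin)
    denom = Cheb.chebval(l(0.0), e_t)  # T_t(l(0)), with l(0) > 1
    return lambda x: Cheb.chebval(l(x), e_t) / denom

lmin, lmax, t = 1.0, 100.0, 25         # condition number kappa = 100
p = chebyshev_p(t, lmin, lmax)
xs = np.linspace(lmin, lmax, 1000)
print(p(0.0))                          # exactly 1, as required
print(np.max(np.abs(p(xs))))           # small, within the bound above
```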

## 18.4 Computational properties, and the Conjugate Gradient

The Chebyshev polynomials have some useful properties that I should mention, but which I will not prove. The most useful property is that their coefficients have a clean recursive definition. As a result, we can construct an algorithm, called the Chebyshev method, that just takes as input a lower bound on $\lambda_{\min}$ and an upper bound on $\lambda_{\max}$ and, after t iterations, produces the estimate of the solution obtained from the degree-t polynomial constructed above. In particular, the algorithm is iterative, and at each iteration it obtains the correct estimate for the corresponding degree.
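For concreteness, here is a sketch of one standard form of this iteration (the Chebyshev semi-iterative method as usually presented, e.g. in Golub and Van Loan; the function and variable names are mine):

```python
import numpy as np

def chebyshev_method(A, b, lmin, lmax, t):
    """Chebyshev iteration for Ax = b, given bounds
    lmin <= lambda(A) <= lmax. After t iterations the error is
    damped like the degree-t Chebyshev polynomial bound above."""
    d = (lmax + lmin) / 2.0            # center of the spectral interval
    c = (lmax - lmin) / 2.0            # half-width of the interval
    x = np.zeros_like(b)
    r = b.copy()                       # residual b - A x for x = 0
    for k in range(t):
        if k == 0:
            p = r.copy()
            alpha = 1.0 / d
        else:
            beta = 0.5 * (c * alpha) ** 2 if k == 1 else (c * alpha / 2.0) ** 2
            alpha = 1.0 / (d - beta / alpha)
            p = r + beta * p
        x = x + alpha * p
        r = r - alpha * (A @ p)
    return x
```

Note that each iteration costs one multiplication by A plus a few vector operations, matching the O(mt) count from Section 18.3.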

The Chebyshev method is not used much in practice because there is a better method: the Conjugate Gradient. The CG method adaptively computes the coefficients of the polynomial, and is guaranteed to compute the optimal coefficients at each iteration. I will not dwell on how it works now, but I will mention that it is quite efficient, and always at least as good as the Chebyshev method.
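For completeness, a minimal usage sketch (mine, not the lecture's): scipy exposes CG as `scipy.sparse.linalg.cg`, and like the Chebyshev method it only ever touches A through matrix-vector products.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# A model problem: a sparse, symmetric, diagonally dominant,
# positive definite matrix (a 1-D Laplacian with a shifted diagonal).
n = 1000
A = diags([-1.0, 2.1, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = cg(A, b)                       # info == 0 signals convergence
print(info, np.linalg.norm(A @ x - b))   # residual norm of the solution
```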

## 18.5 Preconditioning

So, we know that we can obtain an approximate solution to a system of the form Ax = b in time depending on $\sqrt{\kappa(A)}$. But, what if $\kappa(A)$ is large? In this case, we can try to precondition A. That means finding a matrix B that is similar to A, but for which it is easy to solve equations like Bz = c. We then apply the Chebyshev method, or the Conjugate Gradient, to solve

$$B^{-1} A x = B^{-1} b.$$

The running time of the preconditioned method is affected in two ways:

1. At each iteration, in addition to multiplying a vector by A, we must solve a linear system Bz = c, which we can think of as multiplying by $B^{-1}$, and

2. The number of iterations now depends on the eigenvalues of $B^{-1}A$. If A and B are positive definite, then all the eigenvalues of $B^{-1}A$ will be real and non-negative, so the number of iterations just depends upon $\kappa(B^{-1}A) = \lambda_{\max}(B^{-1}A) / \lambda_{\min}(B^{-1}A)$.

Note that I am ignoring one detail: $B^{-1}A$ is not symmetric. Please just trust me for now that this doesn't make much difference, so I can get through the rest of the lecture.
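As a concrete illustration (again mine), scipy's CG accepts the preconditioner through its `M` argument, which plays exactly the role of multiplication by $B^{-1}$. Here B is the trivial Jacobi (diagonal) preconditioner, just to show the mechanics; the point of the coming lectures is to construct much better choices of B:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg, LinearOperator

# Same model problem as before.
n = 1000
A = diags([-1.0, 2.1, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# Jacobi preconditioner: B = diag(A), so solving Bz = c is trivial.
dinv = 1.0 / A.diagonal()
M = LinearOperator((n, n), matvec=lambda c: dinv * c)  # acts as B^{-1}

x, info = cg(A, b, M=M)
print(info, np.linalg.norm(A @ x - b))                 # info == 0: converged
```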

Our goal in preconditioning is to find a matrix B that guarantees that the eigenvalues of $B^{-1}A$ are not too spread out, so that $\kappa(B^{-1}A)$ is small, and for which it is easy to solve a linear system in B. It will help if I give a different characterization of the eigenvalues of $B^{-1}A$.
