# Maximization of Quadratic Functions

I. Introduction.

A. Quadratic functions are among the simplest, and generally most useful, functions that we might hope to encounter.

B. An understanding of their maximization is certainly a good first step in the direction of understanding maximization of more general functions.

C. It turns out that the theory of quadratic functions is prototypical, in that the main features are preserved or generalized in the general theory.

II. Quadratic Functions.

A. Polynomials.

1. A polynomial in the variables x_1, ..., x_n is an expression of the form

q(x) = Σ_{i∈I} a_i x_1^{d_{i1}} ··· x_n^{d_{in}}

where I is a finite index set and, for each i, a_i is a real number and d_{i1}, ..., d_{in} are nonnegative integers.

2. Of course polynomials can be added and multiplied. In fact, with these operations the set of polynomials is a commutative ring with unit.

B. Degree

1. The degree of a monomial a_i x_1^{d_{i1}} ··· x_n^{d_{in}} is the sum d_{i1} + ··· + d_{in} of its exponents, and the degree of the polynomial q above is, by definition, the maximal degree of any of its monomials.

2. A polynomial is said to be homogeneous of degree d if all its monomials have degree d.

3. Any polynomial p(x) of degree d can be written as a sum

p(x) = p_0(x) + p_1(x) + ··· + p_d(x)

where each p_i is homogeneous of degree i.

4. A quadratic function is a polynomial of degree 2.

C. A Matrix Representation

1. The general form of a quadratic function is

q(x) = x'Mx + b'x + c

where M is an n × n matrix, b ∈ IR^{n}, and c ∈ IR.

2. An n × n matrix M with entries m_{ij} is symmetric if it is equal to its own transpose: m_{ij} = m_{ji} for all 1 ≤ i, j ≤ n. For any quadratic function as above, there is a unique symmetric matrix representing the homogeneous part of degree 2. There are other ways of singling out one particular matrix from the infinite collection that all represent the same homogeneous polynomial of degree 2. The advantages of the symmetric representation will emerge as we go along.
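As a quick numerical sketch (using NumPy; the matrix A below is an arbitrary illustrative choice), any square matrix A and its symmetrization (A + A')/2 represent the same homogeneous degree-2 polynomial, and the symmetrization is the unique symmetric representative:

```python
import numpy as np

# Any square matrix A and its symmetrization represent the same
# homogeneous degree-2 polynomial, since x'Ax = x'A'x for every x.
A = np.array([[1.0, 4.0],
              [0.0, 3.0]])        # an arbitrary non-symmetric example
M = (A + A.T) / 2                 # the unique symmetric representative

x = np.array([2.0, -1.0])
assert np.isclose(x @ A @ x, x @ M @ x)   # same quadratic form
assert np.allclose(M, M.T)                # M is symmetric
```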

D. Differentiation.

1. We define the derivative Dq(x) of q at x to be the function that assigns to each v ∈ IR^{n} the number

Dq(x)v = lim_{t→0} (q(x + tv) - q(x)) / t.

2. We evaluate the difference quotient by computing:

(q(x + tv) - q(x)) / t = x'Mv + v'Mx + t v'Mv + b'v.

a. The transpose of a 1 × 1 matrix is itself, so v'Mx = (v'Mx)' = x'M'v, and since M is symmetric we have v'Mx = x'Mv. We now see that

Dq(x)v = 2x'Mv + b'v.

b. Note that Dq(x) is a linear function from IR^{n} to IR.
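With the representation q(x) = x'Mx + b'x + c and M symmetric, the derivative works out to Dq(x)v = 2x'Mv + b'v, and this can be checked numerically against the difference quotient (M, b, c, x, and v below are arbitrary illustrative choices):

```python
import numpy as np

# q(x) = x'Mx + b'x + c with M symmetric; then Dq(x)v = 2x'Mv + b'v.
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, -2.0])
c = 5.0

def q(x):
    return x @ M @ x + b @ x + c

x = np.array([0.5, -1.0])
v = np.array([1.0, 2.0])

exact = 2 * x @ M @ v + b @ v         # the derivative formula
t = 1e-6
approx = (q(x + t * v) - q(x)) / t    # the difference quotient

assert abs(exact - approx) < 1e-4
```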

III. Maximization of Quadratic Functions.

A. The first order condition for maximization states that if x* is a maximizer of q(x), then x* is a critical point of q, which means that Dq(x*)v = 0 for all v ∈ IR^{n}.

1. If Dq(x*)v ≠ 0 for some v, then we could easily show that either q(x* + εv) > q(x*) or q(x* - εv) > q(x*) for sufficiently small ε > 0, contradicting the assumption that x* is a maximum.

2. If M is invertible, then the only critical point is x* = -(1/2)M^{-1}b, since Dq(x)v = (2Mx + b)'v vanishes for all v exactly when 2Mx + b = 0.

3. If M is not invertible, then there are either no critical points or infinitely many critical points.
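Under the same assumed form q(x) = x'Mx + b'x + c with M symmetric and invertible, the critical point -(1/2)M^{-1}b can be computed and verified directly (the particular M and b are illustrative):

```python
import numpy as np

# With q(x) = x'Mx + b'x + c, the unique critical point solves
# 2Mx + b = 0, i.e. x = -(1/2) M^{-1} b.
M = np.array([[-2.0, 0.0],
              [0.0, -1.0]])
b = np.array([4.0, 2.0])

x_crit = np.linalg.solve(M, -b / 2)   # -(1/2) M^{-1} b, without forming the inverse

grad = 2 * M @ x_crit + b             # gradient of q at the critical point
assert np.allclose(grad, 0.0)
```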

B. Assuming that the origin is a critical point, we now want to investigate conditions under which it is a maximum, and when it is the only maximum.

1. The matrix M is positive definite if x'Mx > 0 for all x ≠ 0.

2. The matrix M is positive semidefinite if x'Mx ≥ 0 for all x.

3. The matrix M is negative definite if x'Mx < 0 for all x ≠ 0.

4. The matrix M is negative semidefinite if x'Mx ≤ 0 for all x.

C. We now want to characterize symmetric definite matrices. We will show that, by changing coordinate systems, we can display such a matrix as a diagonal matrix, with the signs of the entries on the diagonal corresponding to the version of definiteness being characterized.

IV. Orthogonal Transformations

A. Let P be an n × n matrix, which we think of as representing a change of coordinates.

1. We ask for the conditions under which P preserves the inner product.

a. The inner product can be represented as a product of matrices: ⟨x, y⟩ = x'y.

b. For P to preserve the inner product means that for all x, y ∈ IR^{n}, (Px)'(Py) = x'y.

c. This is the case if and only if P'P = I, i.e., P^{-1} = P'.

2. Such a matrix P is said to be an orthogonal matrix. The set of such matrices is closed under multiplication and inversion, and is therefore a group, called the orthogonal group, and typically denoted by O(n).
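A plane rotation is the classic example of an orthogonal matrix; a quick sketch checking the defining property, the preservation of inner products, and closure under multiplication (the angle and vectors are arbitrary):

```python
import numpy as np

# Rotation by an angle theta: the standard example of an orthogonal matrix.
theta = 0.7
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

assert np.allclose(P.T @ P, np.eye(2))        # P'P = I

# P preserves the inner product: (Px)'(Py) = x'y.
x, y = np.array([1.0, 2.0]), np.array([-3.0, 0.5])
assert np.isclose((P @ x) @ (P @ y), x @ y)

# Closure: the product of orthogonal matrices is orthogonal.
Q = P @ P
assert np.allclose(Q.T @ Q, np.eye(2))
```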

B. Diagonalization of Symmetric Matrices.

1. If x'Mx is a quadratic form with M a symmetric matrix, it is intuitive that we would not say that the "geometry" of the function had changed if we change the coordinates used to represent the point x in a way that preserves the inner product. Let y denote the same point in the new system of coordinates, and let the two systems of coordinates be related by the formula x = Py, where P is an n × n orthogonal matrix. We have

x'Mx = (Py)'M(Py) = y'(P'MP)y,

so in the new coordinate system the quadratic form is represented by the matrix P'MP.

**Theorem:** If M is a symmetric matrix, there is an orthogonal matrix P such that P'MP is a diagonal matrix.

**Corollary:** A symmetric n × n matrix is positive definite (positive semidefinite, negative definite, negative semidefinite) if and only if, for any orthogonal matrix P such that P'MP is diagonal, all the diagonal entries of P'MP are positive (nonnegative, negative, nonpositive).
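The theorem and corollary can be illustrated with NumPy's `eigh` routine, which for a symmetric matrix returns an orthogonal matrix of eigenvectors (a sketch on one example matrix, which happens to be positive definite):

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])                   # symmetric example

eigenvalues, P = np.linalg.eigh(M)           # for symmetric M, P is orthogonal

assert np.allclose(P.T @ P, np.eye(2))       # P is orthogonal
D = P.T @ M @ P
assert np.allclose(D, np.diag(eigenvalues))  # P'MP is diagonal

# Corollary: all diagonal entries of P'MP positive <=> M positive definite.
assert (np.diag(D) > 0).all()
```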

V. The Characteristic Polynomial.

A. For an n × n matrix M, if Mv = λv for some v ∈ IR^{n} with v ≠ 0 and some λ ∈ IR, we say that v is an eigenvector of M while λ is an eigenvalue of M.

1. If λ is an eigenvalue, then M - λI is a singular matrix, and its determinant is 0.

2. Conversely, if M - λI is singular, then (M - λI)v = 0 for some v ≠ 0, and λ is an eigenvalue.

B. The characteristic polynomial of M is the expression

det(M -λ I)

which is a polynomial of degree n.

C. The characteristic polynomial is invariant under coordinate changes: if Q is any nonsingular matrix, then, since det(AB) = det(A) · det(B) for any n × n matrices A and B,

det(Q^{-1}MQ - λI) = det(Q^{-1}(M - λI)Q) = det(Q^{-1}) det(M - λI) det(Q) = det(M - λI),

so that M and Q^{-1}MQ have the same characteristic polynomial.
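Since the characteristic polynomials agree, M and Q^{-1}MQ have the same eigenvalues, which is easy to verify numerically (the particular M and Q below are an arbitrary symmetric matrix and an arbitrary invertible matrix):

```python
import numpy as np

# M and Q^{-1} M Q have the same characteristic polynomial,
# hence the same eigenvalues.
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
Q = np.array([[1.0, 2.0],
              [0.0, 1.0]])                   # invertible: det(Q) = 1

M_conj = np.linalg.inv(Q) @ M @ Q

ev_M = np.sort(np.linalg.eigvals(M).real)
ev_conj = np.sort(np.linalg.eigvals(M_conj).real)
assert np.allclose(ev_M, ev_conj)
```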

D. From this we can conclude that, if M is symmetric, all its eigenvalues are real: by the theorem above there is an orthogonal P with P'MP diagonal, and since M and P'MP = P^{-1}MP have the same characteristic polynomial, the eigenvalues of M are the real diagonal entries of P'MP.
