A comparison of the convergence of gradient descent with optimal step size (in green) and conjugate gradient (in red) for minimizing a quadratic function associated with a given linear system. Conjugate gradient, assuming exact arithmetic, converges in at most n steps, where n is the size of the matrix of the system (here n = 2).
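The comparison described in the caption can be reproduced numerically. The following is a minimal sketch, not the code behind the figure: the 2×2 symmetric positive definite matrix, the right-hand side, and the starting point are illustrative assumptions, and all function names are hypothetical.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual_norm(A, b, x):
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
    return dot(r, r) ** 0.5

def steepest_descent(A, b, x, steps):
    """Gradient descent with the exact line-search step (r.r)/(r.Ar)."""
    for _ in range(steps):
        r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        rr = dot(r, r)
        if rr == 0.0:
            break
        alpha = rr / dot(r, matvec(A, r))
        x = [xi + alpha * ri for xi, ri in zip(x, r)]
    return x

def conjugate_gradient(A, b, x, steps):
    """Plain conjugate gradient for a symmetric positive definite A."""
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
    p = r[:]
    rr = dot(r, r)
    for _ in range(steps):
        if rr == 0.0:
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

# Illustrative data (assumed, not taken from the figure).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x0 = [2.0, 1.0]

# After n = 2 steps, conjugate gradient has reached the solution up to
# rounding, while steepest descent still has a visible residual.
print("steepest descent residual:", residual_norm(A, b, steepest_descent(A, b, x0, 2)))
print("conjugate gradient residual:", residual_norm(A, b, conjugate_gradient(A, b, x0, 2)))
```

In floating-point arithmetic the conjugate gradient residual is only zero up to rounding, which is why the caption qualifies the n-step claim with exact arithmetic.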
In numerical linear algebra, the method of successive over-relaxation (SOR) is a variant of the Gauss–Seidel method for solving a linear system of equations, resulting in faster convergence. A similar method can be used for any slowly converging iterative process. It was devised simultaneously by David M. Young and by H. Frankel in 1950 for the purpose of automatically solving linear systems on digital computers. Over-relaxation methods had been used before the work of Young and Frankel, for instance the method of Lewis Fry Richardson and the methods developed by R. V. Southwell. However, these methods were designed for computation by human calculators, and they required some expertise to ensure convergence to the solution, which made them inapplicable for programming on digital computers. These aspects are discussed in the thesis of David M. Young.[1]
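As an illustration of how over-relaxation modifies a Gauss–Seidel sweep, here is a minimal sketch. The function name `sor`, the stopping rule, the test matrix (a small tridiagonal system), and the value ω = 1.5 are all assumptions chosen for the example; setting ω = 1 recovers Gauss–Seidel.

```python
def sor(A, b, omega, x0=None, tol=1e-10, max_sweeps=10_000):
    """Successive over-relaxation; returns (solution, number of sweeps)."""
    n = len(b)
    x = list(x0) if x0 is not None else [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for i in range(n):
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel = (b[i] - sigma) / A[i][i]
            # Blend the old value with the Gauss-Seidel update.
            new = (1.0 - omega) * x[i] + omega * gauss_seidel
            change = max(change, abs(new - x[i]))
            x[i] = new
        if change < tol:
            return x, sweep
    return x, max_sweeps

# Assumed test problem: tridiagonal matrix with 2 on the diagonal, -1 beside it.
n = 10
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n

x_gs, sweeps_gs = sor(A, b, omega=1.0)    # plain Gauss-Seidel
x_sor, sweeps_sor = sor(A, b, omega=1.5)  # over-relaxed
print("Gauss-Seidel sweeps:", sweeps_gs)
print("SOR sweeps:", sweeps_sor)
```

For this particular system the over-relaxed run needs noticeably fewer sweeps. The best ω depends on the matrix, so 1.5 should be read as an illustrative value rather than a recommendation.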
The soroban (算盤, そろばん, counting tray) is an abacus developed in Japan. It is derived from the suanpan, imported from China to Japan around 1600.[1] Like the suanpan, the soroban is still used today, despite the proliferation of practical and affordable pocket electronic calculators.
that takes account of information about the inclusion of the spectrum of the operator in a certain set, and uses the properties and parameters of those polynomials that deviate least from zero on this set and are equal to 1 at 0.
The most well-developed Chebyshev iteration method is obtained when the operator in (1) is linear and self-adjoint and its spectrum is contained in an interval whose end points are the boundary points of the spectrum; then the Chebyshev iteration method uses the properties of the Chebyshev polynomials of the first kind. For this case one considers two types of Chebyshev iteration methods:
in which for a given one obtains a sequence as . In (2) and (3) and are the numerical parameters of the method. If , then the initial error and the error at the -th iteration are related by the formula
The methods (2) and (3) can be optimized on the class of problems for which by choosing the parameters such that in (4) is the polynomial least deviating from zero on . It was proved in 1881 by P.L. Chebyshev that this is the polynomial
An important problem for small is the question of the stability of the method (2), (5), (11). An imprudent choice of may lead to a catastrophic increase in for some , to the loss of significant figures, or to an increase in the rounding-off errors allowed on intermediate iteration. There exist algorithms that mix the parameters in (11) and guarantee the stability of the calculations: for see Iteration algorithm; and for one of the algorithms for constructing is as follows. Let , and suppose that has been constructed, then
There exists a class of methods (2) — the stable infinitely repeated optimal Chebyshev iteration methods — that allows one to repeat the method (2), (5), (11) after iterations in such a way that it is stable and such that it becomes optimal again for some sequence . For the case , it is clear from the formula
then once again one obtains a Chebyshev iteration method after iterations. To ensure stability, the set (14) is decomposed into two sets: in the -th set, , one puts the for which is a root of the -th bracket in (13); within each of the subsets the are permuted according to the permutation . For one substitutes elements of the first set in (5), (11), and for one uses the second subset; the permutation is defined in the same way. Continuing in an analogous way the process of forming parameters, one obtains an infinite sequence , uniformly distributed on , called a -sequence, for which the method (2) becomes optimal with and
The theory of the Chebyshev iteration methods (2), (3) can be extended to partial eigenvalue problems. Generalizations also exist to a certain class of non-self-adjoint operators, when the spectrum lies in a certain interval or within a certain domain of special shape (in particular, an ellipse); when information is known about the distribution of the initial error; or when the Chebyshev iteration method is combined with the method of conjugate gradients.
One of the effective methods of speeding up the convergence of the iterations (2), (3) is a preliminary transformation of equation (1) to an equivalent equation of the form
and the application of the Chebyshev iteration method to this equation. The transformation operator is chosen by taking account of two facts: 1) the algorithm for applying it to a vector should not be laborious; and 2) the spectrum of the transformed operator should lie in a set that ensures the fast convergence of the Chebyshev iteration method.
V.I. Lebedev, S.A. Finogenov, “The order of choices of the iteration parameters in the cyclic Chebyshev iteration method” Zh. Vychisl. Mat. i Mat. Fiz. , 11 : 2 (1971) pp. 425–438 (In Russian)
V.I. Lebedev, S.A. Finogenov, “Solution of the problem of parameter ordering in Chebyshev iteration methods” Zh. Vychisl. Mat. i Mat. Fiz. , 13 : 1 (1973) pp. 18–33 (In Russian)
V.I. Lebedev, S.A. Finogenov, “The use of ordered Chebyshev parameters in iteration methods” Zh. Vychisl. Mat. i Mat. Fiz. , 16 : 4 (1976) pp. 895–907 (In Russian)
V.I. Lebedev, “Iterative methods for solving operator equations with spectrum located on several segments” Zh. Vychisl. Mat. i Mat. Fiz. , 9 : 6 (1969) pp. 1247–1252 (In Russian)
V.I. Lebedev, “Iteration methods for solving linear operator equations, and polynomials deviating least from zero” , Mathematical analysis and related problems in mathematics , Novosibirsk (1978) pp. 89–108 (In Russian)
In the Western literature the method (2), (5), (11) is known as the Richardson method of first degree [a2] or, more widely used, the Chebyshev semi-iterative method of first degree. The method goes back to an early paper of L.F. Richardson, where the method (2), (5) was already proposed. However, Richardson did not identify the zeros of the iteration polynomial with the zeros of (shifted) Chebyshev polynomials as done in (11), but (less sophisticatedly) sprinkled them uniformly over the interval containing the spectrum. The use of Chebyshev polynomials seems to have been proposed for the first time in [a1] and [a3].
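The difference between Chebyshev-placed and uniformly placed parameters can be made concrete with a small experiment. The sketch below rests on several assumptions that are not stated in the text: the first-degree iteration is taken in the form u ← u − a(Au − f) with one step size a per iteration, the spectrum is assumed to lie in an interval [m, M] with 0 < m < M, the reciprocals of the Chebyshev step sizes are the roots of the degree-N Chebyshev polynomial shifted to [m, M], and "uniform" is read as the midpoints of N equal subintervals. All function names are hypothetical, and the operator is a diagonal matrix for brevity.

```python
import math

def chebyshev_steps(m, M, N):
    """Step sizes whose reciprocals are the N Chebyshev roots shifted to [m, M]."""
    mid, half = (M + m) / 2.0, (M - m) / 2.0
    return [1.0 / (mid + half * math.cos((2 * i - 1) * math.pi / (2 * N)))
            for i in range(1, N + 1)]

def uniform_steps(m, M, N):
    """Step sizes whose reciprocals are midpoints of N equal subintervals of [m, M]."""
    h = (M - m) / N
    return [1.0 / (m + (i - 0.5) * h) for i in range(1, N + 1)]

def polynomial_sup(steps, m, M, samples=2001):
    """Largest |prod(1 - a*t)| over an equispaced sample of [m, M]."""
    worst = 0.0
    for k in range(samples):
        t = m + (M - m) * k / (samples - 1)
        value = 1.0
        for a in steps:
            value *= 1.0 - a * t
        worst = max(worst, abs(value))
    return worst

def first_degree_iteration(diag, f, u, steps):
    """Apply u <- u - a*(A u - f) once per step size; A = diag(diag)."""
    for a in steps:
        u = [ui - a * (d * ui - fi) for d, ui, fi in zip(diag, u, f)]
    return u

m, M, N = 1.0, 100.0, 8          # assumed spectral bounds and cycle length
cheb = chebyshev_steps(m, M, N)
unif = uniform_steps(m, M, N)
cheb_sup = polynomial_sup(cheb, m, M)
unif_sup = polynomial_sup(unif, m, M)
# Known minimax value 1 / T_N((M + m) / (M - m)) for the Chebyshev choice.
bound = 1.0 / math.cosh(N * math.acosh((M + m) / (M - m)))

diag = [1.0, 25.0, 50.0, 100.0]  # eigenvalues inside [m, M]
f = diag[:]                      # exact solution is the vector of ones
u = first_degree_iteration(diag, f, [0.0] * len(diag), cheb)
print("sup |P_N| with Chebyshev steps:", cheb_sup)
print("sup |P_N| with uniform steps:  ", unif_sup)
print("largest error after N steps:   ", max(abs(ui - 1.0) for ui in u))
```

The sampled supremum of the error polynomial is smaller for the Chebyshev parameters, which is the minimax property the method exploits. The steps are applied here in their natural order, which is adequate for a short cycle such as N = 8 but is not a substitute for the stable orderings the text discusses.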
The “stable infinitely repeated optimal Chebyshev iteration methods” outlined above are based on the identity , which immediately leads to the factorization
This formula has already been used in [a1] in the numerical determination of fundamental modes.
The method (3), (9) is known as Richardson’s method or Chebyshev’s semi-iterative method of second degree. It was suggested in [a9] and turns out to be completely stable; thus, at the cost of an extra storage array the instability problems associated with the first-degree process are avoided.
As to the choice of the transformation operator (called “preconditioning”), an often used “preconditioner” is the so-called SSOR matrix (Symmetric Successive Over-Relaxation matrix) proposed in [a8].
Introductions to the theory of Chebyshev semi-iterative methods are provided by [a2] and [a3]. An extensive analysis can be found in [a10], Chapt. 5 and in [a4]. In this work the spectrum of the operator is assumed to be real. An analysis of the case where the spectrum is not real can be found in [a5].
Instead of using minimax polynomials, one may consider integral measures for “minimizing” the iteration polynomial on the set containing the spectrum. This leads to the theory of kernel polynomials introduced in [a9] and extended in [a11], Chapt. 5.
Iterative methods as opposed to direct methods (cf. Direct method) only make sense when the matrix is sparse (cf. Sparse matrix). Moreover, their versatility depends on how large an error is tolerated; often other errors, e.g., truncation errors in discretized systems of partial differential equations, are more dominant.
When no information about the eigenstructure of the operator is available, or in the non-self-adjoint case, it is often preferable to use the method of conjugate gradients (cf. Conjugate gradients, method of). Numerical algorithms based on the latter method combined with incomplete factorization have proven to be one of the most efficient ways to solve linear problems up to now (1987).
L.F. Richardson, “The approximate arithmetical solution by finite differences of physical problems involving differential equations, with an application to the stresses in a masonry dam” Philos. Trans. Roy. Soc. London Ser. A , 210 (1910) pp. 307–357
L.F. Richardson, “The approximate arithmetical solution by finite differences of physical problems involving differential equations, with an application to the stresses in a masonry dam” Proc. Roy. Soc. London Ser. A , 83 (1910) pp. 335–336
E.L. Wachspress, “Iterative solution of elliptic systems, and applications to the neutron diffusion equations of nuclear physics” , Prentice-Hall (1966)
We seek the solution to a set of linear equations, expressed in matrix terms as
Ax = b.
The Richardson iteration is
x(k + 1) = x(k) + ω(b − Ax(k)),
where ω is a scalar parameter that has to be chosen such that the sequence x(k) converges.
It is easy to see that the method is correct, because if it converges, then x(k + 1) ≈ x(k) and x(k) has to approximate a solution of Ax = b.
Convergence
Subtracting the exact solution x, and introducing the notation e(k) = x(k) − x for the error, we get the equality for the errors
e(k + 1) = e(k) − ωAe(k) = (I − ωA)e(k).
Thus,
‖e(k + 1)‖ = ‖(I − ωA)e(k)‖ ≤ ‖I − ωA‖ ‖e(k)‖,
for any vector norm and the corresponding induced matrix norm. Thus, if ‖I − ωA‖ < 1, the method converges.
Suppose that A is diagonalizable and that (λj, vj) are the eigenvalues and eigenvectors of A. The error converges to 0 if | 1 − ωλj | < 1 for all eigenvalues λj. If, e.g., all eigenvalues are positive, this can be guaranteed if ω is chosen such that 0 < ω < 2 / λmax(A). The optimal choice, minimizing the largest of the | 1 − ωλj |, is ω = 2 / (λmin(A) + λmax(A)), which gives the simplest Chebyshev iteration.
If there are both positive and negative eigenvalues, the method will diverge for any ω if the initial error e(0) has nonzero components in the corresponding eigenvectors.
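The convergence condition can be checked on a small example. The following sketch assumes the update x(k + 1) = x(k) + ω(b − Ax(k)), which is the form consistent with the error recursion e(k + 1) = (I − ωA)e(k); the matrix, right-hand side, and starting vector are illustrative choices, and the function names are hypothetical. The matrix has eigenvalues 1 and 3, so the admissible range is 0 < ω < 2/3 and the optimal value is ω = 2/(1 + 3) = 0.5.

```python
def richardson(A, b, omega, x0, steps):
    """Richardson iteration x <- x + omega * (b - A x)."""
    x = list(x0)
    for _ in range(steps):
        Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        x = [xi + omega * (bi - axi) for xi, bi, axi in zip(x, b, Ax)]
    return x

def max_error(x, exact):
    return max(abs(xi - ei) for xi, ei in zip(x, exact))

A = [[2.0, 1.0], [1.0, 2.0]]   # eigenvalues 1 and 3
b = [3.0, 3.0]                 # exact solution is [1, 1]
exact = [1.0, 1.0]
x0 = [0.0, 3.0]

good = richardson(A, b, 0.5, x0, 40)   # omega inside (0, 2/3): error halves each step
bad = richardson(A, b, 0.7, x0, 50)    # omega above 2/3: |1 - 0.7*3| = 1.1 > 1
print("error with omega = 0.5:", max_error(good, exact))
print("error with omega = 0.7:", max_error(bad, exact))
```

With ω = 0.5 both factors | 1 − ωλj | equal 0.5, so the error shrinks by half per step; with ω = 0.7 the component along the eigenvector for λ = 3 grows by a factor 1.1 per step.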
References
Richardson, L.F. (1910). “The approximate arithmetical solution by finite differences of physical problems involving differential equations, with an application to the stresses in a masonry dam”. Philos. Trans. Roy. Soc. London Ser. A 210: 307–357.
The Fréchet derivative has applications throughout mathematical analysis, and in particular to the calculus of variations and much of nonlinear analysis and nonlinear functional analysis. It has applications to nonlinear problems throughout the sciences.
A Metzler matrix is a matrix in which all the off-diagonal components are nonnegative (equal to or greater than zero).
Metzler matrices appear in stability analysis of time-delayed differential equations and positive linear dynamical systems. Their properties can be derived by applying the properties of nonnegative matrices to matrices of the form M + aI where M is a Metzler matrix.
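The definition and the M + aI device translate directly into code. This is a minimal sketch with hypothetical function names; the choice of the smallest shift a that makes the diagonal nonnegative is an assumption made for the example, and any larger a works equally well.

```python
def is_metzler(M):
    """True if every off-diagonal entry of the square matrix M is nonnegative."""
    n = len(M)
    return all(M[i][j] >= 0 for i in range(n) for j in range(n) if i != j)

def nonnegative_shift(M):
    """Return (a, M + a*I) with the smallest a >= 0 that lifts the diagonal to >= 0.

    For a Metzler matrix the shifted matrix is entrywise nonnegative.
    """
    n = len(M)
    a = max(0.0, -min(M[i][i] for i in range(n)))
    shifted = [[M[i][j] + (a if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    return a, shifted

M = [[-3.0, 1.0, 0.0],
     [2.0, -1.0, 4.0],
     [0.0, 5.0, -2.0]]

print(is_metzler(M))
a, shifted = nonnegative_shift(M)
print(a, shifted)
```

Since M and M + aI share eigenvectors and their eigenvalues differ by exactly a, results for nonnegative matrices carry over to M after subtracting the shift.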
A P-matrix is a complex square matrix with every principal minor > 0. A closely related class is that of P0-matrices, which are the closure of the class of P-matrices, with every principal minor ≥ 0.
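For small matrices the definition can be tested by brute force. The sketch below is restricted to real matrices with exactly representable entries (integers here), which is narrower than the definition above; the function names are hypothetical. It enumerates all 2^n − 1 principal minors, so it is only practical for small n.

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def principal_minors(M):
    """Yield the determinant of every principal submatrix of M."""
    n = len(M)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            yield det([[M[i][j] for j in idx] for i in idx])

def is_p_matrix(M):
    return all(minor > 0 for minor in principal_minors(M))

def is_p0_matrix(M):
    return all(minor >= 0 for minor in principal_minors(M))

print(is_p_matrix([[2, -1], [-1, 2]]))   # minors 2, 2, 3
print(is_p_matrix([[1, -3], [0, 1]]))    # nonsymmetric, minors 1, 1, 1
print(is_p_matrix([[0, 0], [0, 1]]), is_p0_matrix([[0, 0], [0, 1]]))
print(is_p0_matrix([[1, 2], [2, 1]]))    # determinant is -3
```

With floating-point entries the strict comparison against zero becomes sensitive to rounding, so a tolerance or exact rational arithmetic would be needed there.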
Spectra of P-matrices
By a theorem of Kellogg, the eigenvalues of P- and P0-matrices are bounded away from a wedge about the negative real axis as follows:
If {u1,…,un} are the eigenvalues of an n-dimensional P-matrix, then | arg(ui) | < π − π/n, i = 1,…,n.
If {u1,…,un}, ui ≠ 0, i = 1,…,n are the eigenvalues of an n-dimensional P0-matrix, then | arg(ui) | ≤ π − π/n, i = 1,…,n.
Notes
The class of nonsingular M-matrices is a subset of the class of P-matrices. More precisely, all matrices that are both P-matrices and Z-matrices are nonsingular M-matrices.
If the Jacobian of a function is a P-matrix, then the function is injective on any rectangular region of ℝⁿ.
A related class of interest, particularly with reference to stability, is that of P(−)-matrices, sometimes also referred to as N-P-matrices. A matrix A is a P(−)-matrix if and only if (−A) is a P-matrix (similarly for P0-matrices). Since σ(A) = −σ(−A), the eigenvalues of these matrices are bounded away from the positive real axis.
References
R. B. Kellogg, On complex eigenvalues of M and P matrices, Numer. Math. 19:170-175 (1972)
Li Fang, On the Spectra of P– and P0-Matrices, Linear Algebra and its Applications 119:1-25 (1989)
D. Gale and H. Nikaido, The Jacobian matrix and global univalence of mappings, Math. Ann. 159:81-93 (1965)