Iterative solver
Instead of a Direct Solver, as has been used so far, there is the possibility
of implementing an Iterative Solver. Due to lack of experience, I hesitated
to do it for a long time. But now I've finished a few programs, and the result
has been surprising to me! Iterative methods, as compared to a direct method,
seem to be FAR better for the bigger problems! Our method is called "Gauss-
Seidel with successive over-relaxation". It's nothing new, actually ...
Crucial part of the code is the Gauss pre-processor, which in fact is nothing
but a large ("Topology ...") Sorting \& Searching program. The "Gauss" program
is surprisingly fast, nevertheless, because I've made an explicit use of
the problem's structure. This has been a very dedicated programming exercise.
It also means that you can't easily change the existing topology of the
of the calculation domain, being a half-sphere.
The following serves as background info. The merging algorithm of "inbreng.f"
was developed because I was in the need of sorting out a huge collection of
Midi(music) files. The accompanying BASIC program was published on the Net.
Now I've known for a long time that using iterative methods for finite element
problems involves kind of a sorting procedure, which seemed prohibiting
at first sight.
But then suddenly it occurred to me that I could use ... the same algorithm as
with the Midi files! Which turned out to be very efficient, for quite another
purpose. The original BASIC program, called 'midiot.bas' is included herewith.
If you are not a MIDIot, then it's useless. If you are, then you'd better stop
collecting everything from the net, so that you don't have to use it anyway :-)
As has been stated previously,
the iterative method needs some pre-processing first. This is the "Gauss" code.
All of the Geometry is contained in this part. The PreProcessor writes (quite
a lot of) data, called "Gauss.dat" to disk. After "Gauss", guess what, "Seidel"
must be invoked several times. This program also contains the Boundary
Conditions, and a couple of iteration parameters, which can be adjusted.
"Seidel" leaves it's "Seidel.dat" data on disk for restart. Hence you must do
"Gauss Seidel Seidel Seidel ..." until convergence occurs.
MOST = number of iterations within 1 run
EPS = criterion for convergence
OO = over-relaxation factor
I have already made a choice for these numbers. Especially the "OO" factor,
usually denoted as $\omega$, is rather critical. A good choice will accelerate
the iterations a great deal. Quoted without permission: "In general, it is not
possible to compute in advance the value of $\omega$ that is optimal with
respect to the rate of convergence [ ... ] Frequently, some heuristic estimate
is used, such as $\omega = 2 - O(h)$ where $h$ is the mesh spacing of the
discretization of the underlying physical domain." Be aware of the possibility
that the value of $\omega$ probably must be changed as soon as the physics is
altered in subroutine COEFFS. The $\omega$ in the program was determined by
experiment: simply by looking at the convergence with different values of it.
We think that's the best strategy anyway. The $\omega$ number must be in
the interval [0,2], thus: $ 0 \leq \omega < 2 $, but people say it's better
to over-estimate than to under-estimate it. (Thus, instead of "1.82" you might
try "1.83" but not "1.81" ;-)
An advantage of iterative methods is also that you can see what the solution
is going to look like before obtaining full "accurate" results. For example,
first choose EPS = 1.E-4 and then refine the solution with EPS=1.E-5.
It's a slow mental process, but it gradually dawns to me that I have made the
wrong choice by implementing a direct solver in the first version of the CR
program. Following below is a couple of measurements. No use has been made of
Convex 3840 vector-processing; hence pure scalar performance, as will be the
case with a moderate workstation (such as a RS-6000 machine).
Grid = 18 x 18 x 12 :
Direct method: 404.5 real 359.1 user + 15.8 sys = 374.9
Iterative method:
Gauss: 14.6 real 13.8 user 0.2 sys
Seidel: 19.9 real 16.9 user 0.6 sys
---- ---
30.7 + 0.8 = 31.1 total
Comparison = 374.9 / 31.1 = 12 times faster !!!
Grid = 35 x 35 x 24 :
Direct method:
Pivot ............................*** CPU time limit exceeded
because the CPU time limit of our long queue is 5 hours.
By counting dots we see that the direct solver has done only
28/72 'th part of the work to be done. By extrapolation we
get an estimate of > 13 hours. This number is also obtained
by multiplying the result of the 18x18x12 grid with 2^7 :
128 x 374.9 /3600 = 13.33 hours.
Iterative method:
Gauss: 130.7 real 121.1 user 2.1 sys
Seidel: 559.2 real 515.5 user 8.4 sys
----- ----
636.6 + 10.5 = 647.1 total
Hence the improvement in performance comes close to a factor
of ... 75 !!!
So, iterative methods are really sensational! I get the unbelievable result
that computations will be completed in 10 minutes instead of 10 hours! Also
the amount of core memory needed should be hardly a problem for a moderate
workstation: approximately 5 MB for a 35x35x24 grid.