Iterative solver

Instead of a Direct Solver, as has been used so far, there is the possibility of implementing an Iterative Solver. Due to lack of experience, I hesitated to do it for a long time. But now I've finished a few programs, and the result has been surprising to me! Iterative methods, as compared to a direct method, seem to be FAR better for the bigger problems! Our method is called "Gauss- Seidel with successive over-relaxation". It's nothing new, actually ...
Crucial part of the code is the Gauss pre-processor, which in fact is nothing but a large ("Topology ...") Sorting \& Searching program. The "Gauss" program is surprisingly fast, nevertheless, because I've made an explicit use of the problem's structure. This has been a very dedicated programming exercise. It also means that you can't easily change the existing topology of the of the calculation domain, being a half-sphere.
The following serves as background info. The merging algorithm of "inbreng.f" was developed because I was in the need of sorting out a huge collection of Midi(music) files. The accompanying BASIC program was published on the Net. Now I've known for a long time that using iterative methods for finite element problems involves kind of a sorting procedure, which seemed prohibiting at first sight. But then suddenly it occurred to me that I could use ... the same algorithm as with the Midi files! Which turned out to be very efficient, for quite another purpose. The original BASIC program, called 'midiot.bas' is included herewith. If you are not a MIDIot, then it's useless. If you are, then you'd better stop collecting everything from the net, so that you don't have to use it anyway :-)
As has been stated previously, the iterative method needs some pre-processing first. This is the "Gauss" code. All of the Geometry is contained in this part. The PreProcessor writes (quite a lot of) data, called "Gauss.dat" to disk. After "Gauss", guess what, "Seidel" must be invoked several times. This program also contains the Boundary Conditions, and a couple of iteration parameters, which can be adjusted. "Seidel" leaves it's "Seidel.dat" data on disk for restart. Hence you must do "Gauss Seidel Seidel Seidel ..." until convergence occurs.
  MOST = number of iterations within 1 run
  EPS  = criterion for convergence
  OO   = over-relaxation factor
I have already made a choice for these numbers. Especially the "OO" factor, usually denoted as $\omega$, is rather critical. A good choice will accelerate the iterations a great deal. Quoted without permission: "In general, it is not possible to compute in advance the value of $\omega$ that is optimal with respect to the rate of convergence [ ... ] Frequently, some heuristic estimate is used, such as $\omega = 2 - O(h)$ where $h$ is the mesh spacing of the discretization of the underlying physical domain." Be aware of the possibility that the value of $\omega$ probably must be changed as soon as the physics is altered in subroutine COEFFS. The $\omega$ in the program was determined by experiment: simply by looking at the convergence with different values of it. We think that's the best strategy anyway. The $\omega$ number must be in the interval [0,2], thus: $ 0 \leq \omega < 2 $, but people say it's better to over-estimate than to under-estimate it. (Thus, instead of "1.82" you might try "1.83" but not "1.81" ;-)
An advantage of iterative methods is also that you can see what the solution is going to look like before obtaining full "accurate" results. For example, first choose EPS = 1.E-4 and then refine the solution with EPS=1.E-5.
It's a slow mental process, but it gradually dawns to me that I have made the wrong choice by implementing a direct solver in the first version of the CR program. Following below is a couple of measurements. No use has been made of Convex 3840 vector-processing; hence pure scalar performance, as will be the case with a moderate workstation (such as a RS-6000 machine).
Grid = 18 x 18 x 12 :

Direct method: 404.5 real   359.1 user + 15.8 sys =  374.9

Iterative method:

 Gauss:       14.6 real     13.8 user     0.2 sys 
Seidel:       19.9 real     16.9 user     0.6 sys 
                            ----             ---
                            30.7    +     0.8     =  31.1 total

Comparison = 374.9 / 31.1 =  12  times faster !!!


Grid = 35 x 35 x 24 :

Direct method:

 Pivot ............................*** CPU time limit exceeded
 because the CPU time limit of our long queue is 5 hours.
 By counting dots we see that the direct solver has done only
 28/72 'th part of the work to be done. By extrapolation we
 get an estimate of  > 13  hours. This number is also obtained
 by multiplying the result of the 18x18x12 grid with 2^7 :
 128 x 374.9 /3600 = 13.33 hours.

Iterative method:

 Gauss:      130.7 real   121.1 user  2.1 sys 
Seidel:      559.2 real   515.5 user  8.4 sys 
                          -----       ----
                          636.6   +  10.5  =  647.1  total

Hence the improvement in performance comes close to a factor
of ... 75 !!!
So, iterative methods are really sensational! I get the unbelievable result that computations will be completed in 10 minutes instead of 10 hours! Also the amount of core memory needed should be hardly a problem for a moderate workstation: approximately 5 MB for a 35x35x24 grid.