Direct outcore solver

After the asymmetrical linear system of equations has been assembled, it must be solved. A direct solution method has been implemented in the first place. It's still contained in a main program called "Direct.f". I have devised an out-core version of the LU decomposition technique for asymmetric matrices: subroutines VEGEN and TERUG. The bookkeeping, which is inevitably associated with storing coefficients in a band matrix, is taken care of by the Fortran statement function lok(i,j). It's used in all main programs, as well as in the two subroutines mentioned.
With such a direct solution method, Dirichlet boundary conditions are usually implemented by adding a BIG number to the diagonal of the global matrix and adding the same BIG number, multiplied by the actual (pressure) value, to the right hand side (load vector). A certain inaccuracy is introduced herewith, which can be diminished by making BIG still bigger. At the expence, however, of making worse the so-called condition number of the global matrix. A value of $10^6$ seems to be reasonable compromise, and is contained as such in the 'Direct' program.
With direct methods, it turns out that huge amounts of memory and computation time are required for solving even a moderate problem. Let NN be the number of unknowns and NB be the bandwidth. Then memory requirements as a whole will increase as NN*(2*NB+1). This is taken care of a great deal by putting most of the equation coefficients on temporary disk: only (2*NB+1)**2 coefficients need to be kept in core memory at a time. Computing times, however, increase even more with increasing dimensions. It can easily be inferred from the loop structure in 'vegen' that they are roughly proportional to NN*NB**2. Let's compare the following two cases. The first one was actually run, and its computing time measured: it turned out to be 6 minutes.
 Unknowns :        2535      : case 1
 Bandwidth:         192      : case 1
 Unknowns :       20335      : case 2
 Bandwidth:         620      : case 2
The computing time can be estimated for the second case, at hand of the first one: NN*NB**2 $= (20335/2535)*(620/192)^2 = 83.646 \times \mbox{slower}$. First run = 6 minutes, second one = 83.646 x 6 = 502 minutes = 8.4 hours for the solver (vegen.f) alone.
The overall structure of the asymmetrical band matrix can be visualized with a separate Fortran program called 'PlotIt.f'. In order to use it, you have to change the 'Direct' MAIN program in at least two places:
c    , access='direct',recl=4*(2*NB+1),disp='delete')  BECOMES:
     , access='direct',recl=4*(2*NB+1)) ! File will be on disk
INSERT:
      stop 'effe'
JUST BEFORE:
*
*     SOLVATION :-)
The MAIN program then will leave on disk the big matrix just after Assembly. Now run "PlotIt" and send "fort.7" to a PostScript printer. And ... don't forget to restore the original MAIN program!
It is easily conceived from the picture that the observed inefficiency must be associated with fill-in of the very many zeros within the banded storage. The PostScript-file (making up the figure) represents the matrix $S(i,j)$ of the equations to be solved in our program. The x-direction corresponds to the index $(i)$, the y-direction (with its origin at the top) corresponds to the index $(j)$. The non-zero entries in $S(i,j)$ are respresented by a black dot. As can be seen, most of the matrix is blank, that is: filled with zeros. (It's a so called SPARSE matrix.) During pivoting (in "vegen") the whole area between the two "boundary lines" will suffer from "fill-in", which means that it will be "painted black" with additional non-zero numbers. All these will take part in the computational process. Crunch, crunch ...
 Unknowns :        2535
 Bandwidth:         192
 Incore ..:      633460
 Outcore .:     3903900

   0.7083941    % of Full
   4.664361     % of Banded
   39 = maximum number coefficients / row
The frustrating thing is that less than 5 percent of the MegaBytes needed is filled with non-zero coefficients, just after Assembly. Then pivoting comes in (with "vegen") and fills this space for more than 85 percent!

The above is a 1996 story. It should be noticed that, with the advent of big memories, outcore solvers have become obsolete.