PERMAS-XPU - GPU Accelerator
PERMAS supports NVIDIA Tesla cards. Since 1996, PERMAS has used a unique parallelization concept: all matrix operations are parallelized at run time, based on a dynamically generated task graph of hierarchical block operations. This concept gives excellent speedups, especially on shared-memory machines, and ensures bit-identical results independent of the number of cores or the amount of memory used.
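The idea of block operations scheduled as independent tasks, with results that do not depend on the number of workers, can be illustrated with a minimal sketch. This is not PERMAS code: the function names (`block_product`, `add_blocks`, `blocked_matmul`, `random_grid`) are hypothetical, the task graph is a flat one rather than a hierarchical one, and a Python thread pool stands in for the real scheduler. The point shown is only that block products may run in any order, while the accumulation order is fixed, so the floating-point result is bit-identical for any worker count.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def block_product(a_blk, b_blk):
    """Multiply two small dense blocks stored as lists of rows."""
    rows, inner, cols = len(a_blk), len(b_blk), len(b_blk[0])
    return [[sum(a_blk[i][p] * b_blk[p][j] for p in range(inner))
             for j in range(cols)] for i in range(rows)]


def add_blocks(x, y):
    """Element-wise sum of two blocks of equal shape."""
    return [[xv + yv for xv, yv in zip(xr, yr)] for xr, yr in zip(x, y)]


def blocked_matmul(A, B, workers):
    """Multiply two square grids of blocks (hypothetical illustration).

    Every block product is an independent task; the pool may execute
    them in any order. The partial results are then accumulated in a
    fixed order (k = 0, 1, 2, ...), which keeps rounding identical.
    """
    nb = len(A)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = {(i, j, k): pool.submit(block_product, A[i][k], B[k][j])
                   for i in range(nb) for j in range(nb) for k in range(nb)}
        C = []
        for i in range(nb):
            row = []
            for j in range(nb):
                acc = partial[(i, j, 0)].result()
                for k in range(1, nb):
                    acc = add_blocks(acc, partial[(i, j, k)].result())
                row.append(acc)
            C.append(row)
    return C


def random_grid(nb, bs, seed):
    """Build an nb x nb grid of bs x bs blocks with reproducible values."""
    rng = random.Random(seed)
    return [[[[rng.uniform(-1.0, 1.0) for _ in range(bs)] for _ in range(bs)]
             for _ in range(nb)] for _ in range(nb)]
```

Calling `blocked_matmul(A, B, 1)` and `blocked_matmul(A, B, 4)` on the same input grids returns lists that compare equal element by element, because only the execution order of the tasks changes, never the order of the additions.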
During the German MCSimVis project and the European H4H project (2009 to 2015), this concept was extended by a seamless integration of NVIDIA cards. An NVIDIA card can be used as an additional floating-point accelerator, much like plugging in an extra socket with additional CPU cores.
The collaborative work of all CPUs plus the GPU acceleration is available for any PERMAS analysis and is not restricted by any hardware resource. PERMAS is known for solving huge FEM simulation problems even on limited hardware: for example, working efficiently with terabyte-sized matrices on a system with only a few gigabytes of memory is not a problem. This is supported by asynchronous handling of I/O and computations. See Parallelization.
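How asynchronous I/O lets a computation work through data that is larger than memory can be sketched as follows. This is an illustration under stated assumptions, not the PERMAS implementation: the helper names (`write_blocks`, `prefetch`, `out_of_core_sum_of_squares`) and the pickle-based block files are hypothetical, and a sum of squares stands in for the real matrix operations. A reader thread loads blocks from disk into a small bounded queue while the main thread computes on blocks that have already arrived, so only a few blocks are in memory at any time.

```python
import os
import pickle
import queue
import threading


def write_blocks(directory, blocks):
    """Store each block in its own file and return the file paths."""
    paths = []
    for idx, blk in enumerate(blocks):
        path = os.path.join(directory, f"block_{idx}.bin")
        with open(path, "wb") as fh:
            pickle.dump(blk, fh)
        paths.append(path)
    return paths


def prefetch(paths, out_queue):
    """Reader thread: load blocks in order, pausing while the queue is full."""
    for path in paths:
        with open(path, "rb") as fh:
            out_queue.put(pickle.load(fh))
    out_queue.put(None)  # sentinel: no more blocks


def out_of_core_sum_of_squares(paths, buffered_blocks=2):
    """Process blocks one at a time while the next ones are being read.

    At most `buffered_blocks` blocks wait in the queue, so memory use is
    bounded no matter how many block files exist on disk.
    """
    pending = queue.Queue(maxsize=buffered_blocks)
    reader = threading.Thread(target=prefetch, args=(paths, pending))
    reader.start()
    total = 0.0
    while True:
        blk = pending.get()
        if blk is None:
            break
        total += sum(x * x for x in blk)  # compute overlaps with the next read
    reader.join()
    return total
```

The bounded queue is the design choice that matters here: it caps memory use and, at the same time, keeps the reader one or two blocks ahead so the computation rarely waits for the disk.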
Thus the extra speedup from NVIDIA Tesla cards can be seen even for out-of-core simulations involving PBytes of local I/O. Typically, on standard single- or multi-socket compute servers, an extra Tesla card boosts PERMAS performance by another factor of 2 to 4, as shown for a large contact analysis with an overall speedup of 1.8 for the whole job.
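The relation between a speedup of the accelerated operations and the speedup of a whole job can be sketched with Amdahl's law. This reading is an assumption and is not stated in the text above: it supposes that the factor of 2 to 4 applies to the accelerated portion of the run time, while the remainder of the job runs at unchanged speed. The function names are hypothetical and the accelerated fraction is a free parameter, not a measured PERMAS value.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: whole-job speedup when only part of the run time is accelerated.

    accelerated_fraction: share of the original run time that benefits (0..1).
    kernel_speedup: speedup of that share alone (must be positive).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must lie in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / kernel_speedup)


def accelerated_fraction_for(overall, kernel_speedup):
    """Invert Amdahl's law: run-time share that must be accelerated
    to reach a given whole-job speedup (assumes kernel_speedup > 1)."""
    if kernel_speedup <= 1.0:
        raise ValueError("kernel_speedup must exceed 1")
    if not 1.0 <= overall <= kernel_speedup:
        raise ValueError("overall must lie between 1 and kernel_speedup")
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / kernel_speedup)
```

Under this assumption, `accelerated_fraction_for(1.8, 4.0)` gives the share of run time that would have to be accelerated by a factor of 4 for the whole job to finish 1.8 times faster; the whole-job figure is always smaller than the kernel figure unless the entire run time is accelerated.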