THE CRUNCH GROUP

Division of Applied Mathematics

Brown University

Providence, RI 02912, USA

Two-level Domain Decomposition Method

The emerging generation of petaflop supercomputers permits deeper insight into physiological phenomena. From the computational standpoint, it makes it feasible to solve a problem with billions of unknowns in reasonable time. However, a straightforward approach to simulating 3D flow in the Macrovascular Network is computationally prohibitive even on petaflop computers, owing to the extremely large size of the tightly coupled problem and the high cost of communication among thousands of processes. To exploit the available computational resources efficiently and to make progress in understanding the physiology of the arterial system, we develop a new ultra-parallel paradigm.

The new two-level method we develop for the Navier-Stokes equations combines the best features of the discontinuous and continuous Galerkin formulations. The large computational domain is first subdivided into overlapping patches (coarse-level partitioning); within each patch, a spectral element discretization (fine level) is employed. An example of a large Macrovascular Network reconstructed from MR images of the brain is presented in the figure below.

The computational domain, consisting of 65 arteries, is decomposed into four sub-domains, as indicated by different colors. The domain is discretized by 470,000 tetrahedral spectral elements. A 3D geometrical model is also available as an AVI movie (21 MB).
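The coarse-level partitioning into overlapping patches can be illustrated with a toy one-dimensional analogue. This is only a sketch under stated assumptions: the function name `overlapping_patches` and its parameters are hypothetical, elements are simply numbered along a line, and the partitioning of a real 3D arterial network is not described on this page.

```python
def overlapping_patches(n_elements, n_patches, overlap):
    """Toy 1D analogue of coarse-level partitioning (hypothetical helper).

    Splits element indices 0..n_elements-1 into contiguous patches and
    extends each patch to the right by `overlap` elements, so that
    neighbouring patches share `overlap` elements.
    """
    if n_patches < 1 or n_patches > n_elements:
        raise ValueError("need 1 <= n_patches <= n_elements")
    if overlap < 0:
        raise ValueError("overlap must be non-negative")

    base, extra = divmod(n_elements, n_patches)
    patches = []
    start = 0
    for p in range(n_patches):
        # Distribute the remainder over the first `extra` patches.
        size = base + (1 if p < extra else 0)
        stop = start + size
        # Extend into the next patch, but never past the end of the domain.
        patches.append(range(start, min(n_elements, stop + overlap)))
        start = stop
    return patches


# Example: 12 elements, 4 patches, one shared element per interface.
patches = overlapping_patches(12, 4, 1)
for i, patch in enumerate(patches):
    print(f"patch {i}: elements {list(patch)}")
```

Within each such patch, the fine-level discretization would then be applied independently, with the shared elements providing the region where neighbouring patch solutions are coupled.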

The overall scalability of the method depends on the strong scaling within a patch and the weak scaling in terms of the number of patches. This dual path to scalability provides great flexibility in balancing accuracy and parallel efficiency.

Performance of NektarG on the CRAY XT5 (Kraken) at NICS, University of Tennessee.

| # of patches | cores/patch | cores (total) | CPU time for 1000 steps | weak scaling |
|---|---|---|---|---|
| 3 | 2,048 | 6,144 | 462.3 s | 100% |
| 8 | 2,048 | 16,384 | 477.2 s | 96.9% |
| 16 | 2,048 | 32,768 | 505.1 s | 91.5% |

Performance of NektarG on the BlueGene/P (Intrepid) at ALCF, ANL.

| # of patches | cores/patch | cores (total) | CPU time for 1000 steps | weak scaling |
|---|---|---|---|---|
| 3 | 2,048 | 6,144 | 650.27 s | 100% |
| 8 | 2,048 | 16,384 | 685.23 s | 95% |
| 16 | 2,048 | 32,768 | 703.4 s | 92% |
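The weak-scaling percentages in the two tables are consistent with the usual definition of weak-scaling efficiency: the CPU time of the smallest run (3 patches) divided by the CPU time of the larger run, with the number of cores per patch held fixed. The sketch below reproduces them from the reported timings; the function name is illustrative, and the definition is inferred from the numbers rather than stated on this page.

```python
def weak_scaling_efficiency(times):
    """Return efficiency (in percent) of each run relative to the first one.

    `times` are CPU times for the same work per patch; with perfect weak
    scaling every entry would equal the first and the efficiency is 100%.
    """
    reference = times[0]
    return [100.0 * reference / t for t in times]


# CPU time in seconds for 1000 steps with 3, 8 and 16 patches.
kraken_times = [462.3, 477.2, 505.1]      # CRAY XT5 (Kraken)
intrepid_times = [650.27, 685.23, 703.4]  # BlueGene/P (Intrepid)

kraken_eff = weak_scaling_efficiency(kraken_times)
intrepid_eff = weak_scaling_efficiency(intrepid_times)

for n_patches, k, i in zip([3, 8, 16], kraken_eff, intrepid_eff):
    print(f"{n_patches:>2} patches: Kraken {k:.1f}%, Intrepid {i:.1f}%")
```

Rounded to the precision used in the tables, the computed values match the reported ones on both machines.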

The method has been applied to unsteady flow simulation in the major arteries of the brain. The figure below presents the computational domain constructed from four overlapping patches. The XY plots show the velocity profile extracted across the overlapping regions (along the red-blue lines marked 1, 2 and 3).

The computational domain, consisting of brain arteries with an aneurysm, is decomposed into four sub-domains, as indicated by different colors. The domain is discretized by $N_{el} = 425{,}113$ tetrahedral spectral elements. The simulation was performed at high spatial resolution: inside each element the solution was approximated by a sixth-order polynomial expansion ($P = 6$). The corresponding number of quadrature points inside each element was $N_q = (P + 3)(P + 2)^2 = 576$, and the number of degrees of freedom was $\mathrm{DOF} = 4\,N_{el}\,N_q = 979{,}460{,}352$.
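The problem-size figures in the caption can be checked with a few lines of arithmetic. The sketch below evaluates the quadrature-point count, reading the trailing 2 in the caption's formula as a square on (P + 2), and the degree-of-freedom count, with the factor of 4 taken directly from the caption. The function names are illustrative, and reading that factor as the number of solution fields (for example three velocity components plus pressure) is an assumption, not something stated here.

```python
def quadrature_points_per_element(p):
    """Quadrature points in one tetrahedral element: (P + 3)(P + 2)^2."""
    return (p + 3) * (p + 2) ** 2


def total_degrees_of_freedom(n_elements, p, n_fields=4):
    """DOF = n_fields * Nel * Nq.

    n_fields=4 is the factor quoted in the caption; treating it as the
    number of solution fields is an assumption.
    """
    return n_fields * n_elements * quadrature_points_per_element(p)


n_q = quadrature_points_per_element(6)
dof = total_degrees_of_freedom(425_113, 6)
print(f"Nq = {n_q}, DOF = {dof:,}")
```

Both results agree with the values quoted in the caption: 576 quadrature points per element and 979,460,352 degrees of freedom.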

Publications

Leopold Grinberg and George Em Karniadakis, A Scalable Domain Decomposition Method for Ultra-Parallel Arterial Flow Simulations, Communications in Computational Physics, 4(5), 1151-1169 (2008).