I'm solving the Helmholtz equation using PETSc. I found that with the PETSc configure option --download-f-blas-lapack my program runs about twice as fast as it does with MKL. Is this a common trend, or are there other factors at play? I'm using gcc and Open MPI with PETSc.
Hui Zhang
1 Answer
This is usually caused by using a threaded MKL together with MPI, which over-subscribes the cores. Either explicitly configure PETSc to use a non-threaded (sequential) MKL, or set MKL_NUM_THREADS=1 in your environment.
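For concreteness, here is a minimal sketch of both fixes; the MKL path, library list, executable name, and process count below are assumptions and need to be adapted to your own MKL installation and solver:

```sh
# Fix 1: keep the current PETSc/MKL build, but pin MKL to one thread per
# MPI rank so the ranks stop over-subscribing the cores.
export MKL_NUM_THREADS=1
mpiexec -n 4 ./helmholtz    # hypothetical executable name and rank count

# Fix 2: rebuild PETSc against the sequential (non-threaded) MKL layer.
# The directory and library names are illustrative; take the exact link
# line for your MKL version from Intel's link-line advisor.
./configure --with-blaslapack-lib="-L/opt/intel/mkl/lib/intel64 \
  -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm"
```

Since MPI already runs one process per core, any MKL threads on top of that only compete for the same cores, which is why the non-threaded reference BLAS/LAPACK from --download-f-blas-lapack can come out faster.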
Jed Brown