tl;dr
- For multiprocessing (MPI, message passing) use `--ntasks`.
- For multithreading (OpenMP, pthreads) use `--cpus-per-task`.
- For hybrid codes you need both options and probably also want to tune `--ntasks-per-node`.
[Link to the sbatch manual](https://slurm.schedmd.com/sbatch.html)
This is somewhat complicated. It depends on whether your program needs tasks or cores. For example, an MPI-based program is launched several times and its instances communicate via message passing, while an OpenMP-based program is launched only once and then spawns several threads which communicate via shared memory.
In the case of message passing it doesn't matter on which node the tasks are launched, as long as they can communicate (InfiniBand, Ethernet, etc.). In the case of shared memory, all threads must run on the same node, because they share the same address space.
The `--ntasks` option of SLURM specifies how many tasks your program will launch; conceptually these could be threads or independent instances of an MPI program. However, SLURM assumes that when you say `--ntasks` you mean tasks which communicate by message passing, so if your nodes have 12 cores each but you requested 13 tasks, it will happily launch 12 tasks on one node and 1 on another node. (This behaviour is not guaranteed: SLURM could also put all 13 tasks on one node with 12 CPUs and let the operating system schedule them. You can get more fine-grained control using `--ntasks-per-core` and `--ntasks-per-node`.)
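For illustration, a minimal batch script for a pure MPI job might look like the following sketch (the program name `./my_mpi_app` and the numbers are placeholders):

```bash
#!/bin/bash
#SBATCH --job-name=mpi-example
#SBATCH --ntasks=13          # 13 MPI ranks; SLURM may spread them over several nodes
#SBATCH --time=00:10:00

# srun starts one instance of the program per task
srun ./my_mpi_app
```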
If you have a multithreaded program, then you want to use `--cpus-per-task` instead and set `--ntasks` to 1 (or leave it unspecified, as it defaults to 1). This way, if you request 13 CPUs but the maximum available on a node is 12, your job will just be rejected.
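Again as an illustrative sketch, assuming an OpenMP program called `./my_openmp_app`:

```bash
#!/bin/bash
#SBATCH --job-name=openmp-example
#SBATCH --ntasks=1            # one task ...
#SBATCH --cpus-per-task=12    # ... with 12 CPUs, all on the same node
#SBATCH --time=00:10:00

# tell OpenMP how many threads it may spawn
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_openmp_app
```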
With `--ntasks` you do not get any guarantee. If resources are scarce it could squeeze all 13 tasks onto a single CPU core. If you want a whole node, use `--nodes=1` (caveat: your colleagues will hate you if you do not keep the whole node under full load, because more jobs could have fit there). – Henri Menke Jul 14 '17 at 02:03
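For the hybrid MPI+OpenMP case mentioned in the tl;dr, a sketch combining all three options might look like this (program name and numbers are again placeholders):

```bash
#!/bin/bash
#SBATCH --job-name=hybrid-example
#SBATCH --ntasks=4             # 4 MPI ranks in total
#SBATCH --ntasks-per-node=2    # 2 ranks per node, so 2 nodes are needed
#SBATCH --cpus-per-task=6      # 6 OpenMP threads per rank
#SBATCH --time=00:10:00

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./my_hybrid_app
```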