Is there a way to achieve "one cancels the other" semantics (http://www.investopedia.com/terms/o/oco.asp) for jobs submitted with qsub to Sun Grid Engine? That is, I submit two (or more) jobs, and when one of them starts running, it cancels the others.
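
To make the desired behavior concrete, here is a minimal sketch of what each job in the group would have to do as its first step. It only builds the `qdel` command line and does not run it. The function name `cancel_siblings_command` and the idea of passing every job the full list of IDs in its group are assumptions made for illustration, not something SGE provides.

```python
def cancel_siblings_command(my_job_id, group_job_ids):
    """Build the qdel command that the first job to start would issue.

    my_job_id     -- this job's ID (inside a running SGE job, the JOB_ID
                     environment variable holds it)
    group_job_ids -- IDs of all jobs in the "one cancels the other" group
                     (hypothetical: the jobs must be told these somehow)
    Returns an argument list such as ["qdel", "101", "103"], or an empty
    list when there is nothing to cancel.
    """
    siblings = [str(j) for j in group_job_ids if str(j) != str(my_job_id)]
    if not siblings:
        return []
    return ["qdel", *siblings]


if __name__ == "__main__":
    # Job 102 starts first and wants to cancel the rest of its group.
    print(cancel_siblings_command("102", ["101", "102", "103"]))
```

A manual scheme like this is racy: if two jobs in the group start at nearly the same moment, each may delete the other, which is one reason scheduler-level support would be preferable.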

Two examples...

Say I can run for 4 hours on N ranks or 2 hours on 2N ranks. If both jobs can start soon, I'd rather get the answer 2 hours earlier by running the bigger job. If I'm going to wait more than 2 additional hours to run the 2N-rank job, however, it would make sense to run the smaller job first and cancel the bigger one.
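
The trade-off in this first example comes down to comparing completion times, that is, queue wait plus run time. A minimal sketch of that arithmetic follows. The wait times are made-up inputs, and the function name `earliest_finish` is hypothetical; nothing here assumes the scheduler can actually predict queue waits.

```python
def earliest_finish(options):
    """Pick the option that completes soonest.

    options -- list of (label, wait_hours, run_hours) tuples, where
               wait_hours is an assumed queue wait and run_hours is the
               wall time once the job starts.
    Ties go to whichever option is listed first.
    """
    return min(options, key=lambda opt: opt[1] + opt[2])


if __name__ == "__main__":
    # Both jobs can start soon: the 2N-rank job finishes first.
    print(earliest_finish([("N ranks", 0.5, 4.0), ("2N ranks", 1.0, 2.0)]))
    # The 2N-rank job waits more than 2 extra hours: the N-rank job wins.
    print(earliest_finish([("N ranks", 0.5, 4.0), ("2N ranks", 3.0, 2.0)]))
```

Since the wait times are not known in advance, submitting both jobs and letting whichever starts first cancel the other would approximate this choice without needing any prediction.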

Say I can run part of a long-running problem for 8 or 16 hours. Longer batch jobs amortize some startup cost, so 16 hours is a mildly better use of service units (SUs). But if the 8-hour job can start sooner, I'd rather make progress now than wait for a 16-hour window.

Possibly another way to accomplish these things would be for me to submit a single job with a range of acceptable node counts (hypothetically, "-pe 12way N,2N") or wall times ("-l h_rt 16-8:00:00"). I haven't been able to tease this out of the SGE man pages, however.

Geoff Oxberry
Rhys Ulerich
  • At first I thought "you're just trying to game the system". Then I realized, this is just a more efficient allocation of resources. Either way, upvote for you! – Victor Liu May 16 '14 at 04:40

0 Answers