MantisBT - Rosetta
View Issue Details
0000164 | Rosetta | [All Projects] Bad Coding | public | 2012-12-12 16:45 | 2012-12-12 16:45
rmoretti 
 
normal | minor | always
new | open
All platforms | Any | Any
Trunk 
 
any jd2
any jd2 with FileSystemJobDistributor and -jd2:ntrials > 10
Confirmed As Bug
0000164: -ntrials and -max_retry_job option confusion
The options -jd2:ntrials and -run:max_retry_job appear to do almost the same thing, and appear to do it simultaneously.

In jd2::JobDistributor::go_main, if you get a FAIL_RETRY status from the mover, you'll terminate execution if you're above ntrials. If not, you'll call mark_current_job_id_for_repetition(), which for FileSystemJobDistributor then looks at max_retry_job, with the same cutoff logic. The upshot is that if you want to increase the number of trials when using the FileSystemJobDistributor, you'd have to set *both* -jd2:ntrials and -run:max_retry_job.
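The double cutoff described above can be sketched as a small standalone model. This is not the Rosetta source: the names RetryOptions and count_attempts are hypothetical, and the exact comparison operators (and any off-by-one behavior) in the real go_main and mark_current_job_id_for_repetition are not reproduced here. The sketch only assumes that both checks count the same FAIL_RETRY failures and that either one can stop the retries.

```cpp
// Hypothetical stand-ins for the two command-line options.
struct RetryOptions {
    int ntrials;        // models -jd2:ntrials
    int max_retry_job;  // models -run:max_retry_job
};

// Counts how many times a mover that always returns FAIL_RETRY is run
// before the job is abandoned, under the assumed two-stage cutoff.
int count_attempts(const RetryOptions& opts) {
    int failures = 0;
    while (true) {
        ++failures;  // the mover ran and reported FAIL_RETRY

        // Stage 1 (assumed go_main-style check): stop once ntrials is reached.
        if (failures >= opts.ntrials) break;

        // Stage 2 (assumed FileSystemJobDistributor-style check inside
        // mark_current_job_id_for_repetition): stop once max_retry_job is reached.
        if (failures >= opts.max_retry_job) break;
    }
    return failures;
}
```

In this model, count_attempts({100, 10}) stops after 10 attempts even though ntrials is 100, and the count only reaches 100 when both fields are raised. In other words, the smaller of the two limits wins.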

Run an always-fails-with-FAIL_RETRY mover under jd2 and FileSystemJobDistributor with something like -jd2:ntrials 100. Note how you only ever get 10 repeats because of -run:max_retry_job.
The -run:max_retry_job logic seems to have been added with a note about proper restart behavior on BOINC: https://svn.rosettacommons.org/trac/changeset/28711
It looks like this was before the -jd2:ntrials logic was added to JobDistributor.
No tags attached.
Issue History
2012-12-12 16:45rmorettiNew Issue

There are no notes attached to this issue.