| View Issue Details |
| ID | Project | Category | View Status | Date Submitted | Last Update |
| 0000164 | Rosetta | [All Projects] Bad Coding | public | 2012-12-12 16:45 | 2012-12-12 16:45 |
|
| Reporter | rmoretti | |
| Assigned To | | |
| Priority | normal | Severity | minor | Reproducibility | always |
| Status | new | Resolution | open | |
| Platform | All platforms | OS | Any | OS Version | Any |
| Product Version | Trunk | |
| Fixed in Version | | |
|
| Summary | 0000164: -ntrials and -max_retry_job option confusion |
| Description | The -jd2:ntrials and -run:max_retry_job options appear to do almost the same thing, and appear to do it simultaneously.
In jd2::JobDistributor::go_main, if you get a FAIL_RETRY status from the mover, execution terminates once you're above ntrials. If not, go_main calls mark_current_job_id_for_repetition(), which for FileSystemJobDistributor then checks max_retry_job with the same cutoff logic. The upshot is that to increase the number of trials when using the FileSystemJobDistributor, you have to set *both* -jd2:ntrials and -run:max_retry_job.
|
| Steps To Reproduce | Run an always-fails-with-FAIL_RETRY mover under jd2 and FileSystemJobDistributor with something like -jd2:ntrials 100. Note how you only ever get 10 repeats because of -run:max_retry_job. |
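The interaction described above can be illustrated with a toy model. This is only a sketch, not Rosetta code: the class `RetryModel`, its method `count_attempts`, and the two counters are hypothetical names, and the exact comparisons (`>` rather than `>=`) are assumptions, since the report only says both checks use "the same cutoff logic". The point it shows is that two independent cutoffs behave like a single limit equal to the smaller of the two values.

```python
from dataclasses import dataclass


@dataclass
class RetryModel:
    """Toy model of the two retry limits (hypothetical, not Rosetta code)."""

    ntrials: int        # stands in for -jd2:ntrials
    max_retry_job: int  # stands in for -run:max_retry_job

    def count_attempts(self) -> int:
        """Count how often a mover that always returns FAIL_RETRY is run."""
        attempts = 0
        outer_retries = 0  # checked in the go_main-like loop
        inner_retries = 0  # checked in the repetition hook
        while True:
            attempts += 1  # the mover runs and fails with FAIL_RETRY
            outer_retries += 1
            if outer_retries > self.ntrials:  # first cutoff (assumed comparison)
                break
            inner_retries += 1
            if inner_retries > self.max_retry_job:  # second cutoff (assumed comparison)
                break
        return attempts


# Raising only ntrials has no effect while max_retry_job stays at 10.
only_ntrials = RetryModel(ntrials=100, max_retry_job=10).count_attempts()
baseline = RetryModel(ntrials=10, max_retry_job=10).count_attempts()
both_raised = RetryModel(ntrials=100, max_retry_job=100).count_attempts()
print(only_ntrials == baseline, both_raised > only_ntrials)
```

In this model the number of attempts only grows when both limits are raised together, which matches the behavior reported for FileSystemJobDistributor. The value 10 used for max_retry_job is taken from the reproduction step above.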
| Additional Information | The -run:max_retry_job logic seems to have been added with a note about proper restart behavior on Boinc. https://svn.rosettacommons.org/trac/changeset/28711
It looks like this was before the -jd2:ntrials logic was added to JobDistributor. |
| Tags | No tags attached. |
|
| Application(s) Affected | any jd2 |
| Command Line Used | any jd2 with FileSystemJobDistributor and -jd2:ntrials > 10 |
| Developer Options | Confirmed As Bug |
| Fixed in SVN Version | |
|
| Attached Files | |
|