Saturday, December 10, 2016

Work done at U Texas for ensemble management

Work done at U Texas on managing large ensembles of large scientific calculations, with robustness to job failure and resource allocation expiry, is available here:

https://bitbucket.org/mtobis/tex-mecs

The code is structured as a framework: the end user plugs in the computation code and the analysis code. It is designed to be minimally intrusive. If the varied parameters of interest are already read from a file and the code already runs on the target machine, the user need only learn a fairly simple method for declaring the structure of the computation in ensemble parameter files.
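To make the idea of declaring an ensemble concrete, here is a minimal sketch of how a declaration of varied parameters might be expanded into one parameter set per ensemble member. It is not the tex-mecs file format, which is not described in this post. The function names, the `name = value` layout, and the parameter names `diffusivity` and `albedo` are all hypothetical, and the sketch assumes a full Cartesian product of the varied values.

```python
import itertools


def expand_ensemble(varied):
    """Return one parameter dict per ensemble member.

    `varied` maps a parameter name to the list of values to try.
    Assumption: the ensemble is the full Cartesian product of those lists.
    """
    names = list(varied)
    value_lists = [varied[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


def render_parameter_file(params):
    """Render one member's parameters as 'name = value' lines.

    Hypothetical layout; a real model would need its own input format.
    """
    return "".join(f"{name} = {value}\n" for name, value in params.items())


# Hypothetical declaration: two varied parameters, giving 2 x 3 = 6 members.
members = expand_ensemble({
    "diffusivity": [0.1, 0.2],
    "albedo": [0.3, 0.35, 0.4],
})
```

Each rendered string could then be written to the input file that the unmodified model already reads, which is what keeps this kind of approach minimally intrusive.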

Because the models are presumed coarse-grained (running for hours, not milliseconds), the overhead of operating system calls is negligible. Consequently, it is possible to wrap the executables in Python objects, which makes the underlying code quite clean.
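As an illustration of wrapping an executable in a Python object, here is a minimal sketch using the standard library's `subprocess` module. It is not taken from tex-mecs: the class name `ModelRun`, the `DONE` marker file, and the one-directory-per-member layout are assumptions made for this example. The marker file is one simple way to let a restarted ensemble skip members that already finished, in the spirit of the robustness to job failure mentioned above.

```python
import subprocess
import sys
from pathlib import Path


class ModelRun:
    """Hypothetical wrapper around one ensemble member's executable."""

    def __init__(self, command, workdir):
        self.command = list(command)   # e.g. ["./model", "input.txt"]
        self.workdir = Path(workdir)   # assumed: one directory per member
        self.returncode = None

    @property
    def done_marker(self):
        # Assumed convention: an empty file records successful completion.
        return self.workdir / "DONE"

    def is_complete(self):
        return self.done_marker.exists()

    def run(self):
        """Run the executable unless a previous attempt already succeeded."""
        if self.is_complete():
            return 0
        self.workdir.mkdir(parents=True, exist_ok=True)
        # One process launch per member; cheap next to an hours-long run.
        result = subprocess.run(self.command, cwd=self.workdir)
        self.returncode = result.returncode
        if result.returncode == 0:
            self.done_marker.touch()
        return result.returncode
```

A member could then be launched with something like `ModelRun([sys.executable, "-c", "print('model step')"], "member_000").run()`. Calling `run()` again after a crash or an expired allocation would rerun only the members that lack a `DONE` file.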
