Automation and Management of Scientific Workflows in Distributed Network Environments
Large-scale, computation-intensive applications in many scientific fields feature complex workflows structured as directed acyclic graphs (DAGs) and composed of distributed computing modules with intricate inter-module dependencies. Supporting such workflows in heterogeneous network environments and optimizing their end-to-end performance are crucial to the success of large-scale collaborative scientific applications. We design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which provides a set of easy-to-use computing and networking toolkits that let application scientists assemble, execute, monitor, and control complex computing workflows in distributed network environments. The current version of SWAMP integrates the graphical user interface of Kepler for composing abstract workflows and employs Condor DAGMan for workflow dispatch and execution. SWAMP provides a web-based user interface to automate and manage workflow executions and uses a dedicated workflow mapper to optimize end-to-end workflow performance. A case study of a workflow for Spallation Neutron Source datasets, run in real networks, demonstrates the efficacy of the proposed platform.
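To make the notion of end-to-end performance of a DAG-structured workflow concrete, the following is a minimal sketch and not SWAMP's actual mapper. It assumes a hypothetical five-module workflow with made-up run times, ignores data transfer time between modules, and computes the end-to-end delay as the length of the longest dependency chain (the critical path). The module names, costs, and the function `end_to_end_delay` are illustrative assumptions.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each module maps to the set of modules it depends on.
deps = {
    "load": set(),
    "reduce": {"load"},
    "fit": {"reduce"},
    "plot": {"reduce"},
    "report": {"fit", "plot"},
}

# Hypothetical per-module run times in seconds (transfer times are ignored).
cost = {"load": 4.0, "reduce": 10.0, "fit": 6.0, "plot": 2.0, "report": 1.0}


def end_to_end_delay(deps, cost):
    """Return (total delay, per-module finish times) for a DAG workflow.

    A module starts once all of its dependencies have finished, so the
    total delay equals the length of the critical path through the DAG.
    """
    finish = {}
    # static_order() yields every module after all of its dependencies.
    for module in TopologicalSorter(deps).static_order():
        start = max((finish[d] for d in deps.get(module, ())), default=0.0)
        finish[module] = start + cost[module]
    return max(finish.values()), finish


if __name__ == "__main__":
    total, finish = end_to_end_delay(deps, cost)
    print(f"end-to-end delay: {total} s")
    for module, t in finish.items():
        print(f"  {module} finishes at {t} s")
```

In this toy instance the critical path is load, reduce, fit, report. A workflow mapper would additionally choose which network node runs each module, which changes both the run times and the transfer times that this sketch leaves out.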
MSU Digital Commons Citation
Wu, Qishi; Zhu, Michelle; Lu, Xukang; Brown, Patrick; Lin, Yunyue; Gu, Yi; Cao, Fei; and Reuter, Michael A., "Automation and Management of Scientific Workflows in Distributed Network Environments" (2010). Department of Computer Science Faculty Scholarship and Creative Works. 138.