Date of Award
Master of Science (MS)
College of Science and Mathematics
Thesis Sponsor/Dissertation Chair/Project Chair
The ability to reconstruct optimal phylogenies (evolutionary trees) based on objective criteria impacts directly on our understanding the relationships among organisms, including human evolution, as well as the spread of infectious disease. Numerous tree construction methods have been implemented for execution on single processors, however inferring large phylogenies using computationally intense algorithms can be beyond the practical capacity of a single processor. Distributed and parallel processing provides a means for overcoming this hurdle. FITCH is a freely available, single-processor implementation of a distance-based, tree-building algorithm commonly used by the biological community. Through an alternating least squares approach to branch length optimization and tree comparison, FITCH iteratively builds up evolutionary trees through species addition and branch rearrangement. To extend the utility of this program, I describe the design, implementation, and performance of mpiFITCH, a parallel processing version of FITCH developed using the Message Passing Interface for message exchange. Balanced load distribution required the conversion of tree generation from recursive linked list traversal to iterative, array-based traversal. Execution of mpiFITCH on a Beowulf cluster running 64 processors revealed maximum performance enhancement of up to ~28 fold with an efficiency of ~ 40%.
Dubin, Robert A., "Adapting the Phylogenetic Program FITCH for Distributed Processing" (2007). Theses, Dissertations and Culminating Projects. 1149.