ENHANCED THROUGHPUT FOR WORKFLOW SCHEDULING USING PARALLELISM COMPUTATION AND INFORMED SEARCH

Date
2017-07-16
Authors
Tang, Kaite
Publisher
Middle Tennessee State University
Abstract
Next-generation e-science produces huge amounts of data that must be processed in multiple steps by geographically distributed scientists and users; this processing can be modeled as a computing workflow structured as a Directed Acyclic Graph (DAG). Many big data science applications, especially streaming applications with complex DAG-structured workflows, require a smooth dataflow to guarantee Quality of Service (QoS). Even with the ever-increasing computing power available in High Performance Computing (HPC) environments, such as parallel processing on a PC cluster, these high-demand streaming applications may still take hours or even days to execute. Therefore, supporting and optimizing the performance of such scientific workflows in wide-area networks, especially in Grid and Cloud environments, is crucial to the success of collaborative scientific discovery.
In this thesis, we focus on improving the performance of an existing workflow mapping algorithm, Layer-oriented Dynamic Programming (LDP), by (i) parallelizing the workflow executions on a PC cluster using MPI and OpenMP, and (ii) removing unnecessary search steps, and thereby reducing the algorithm runtime, using informed search techniques inspired by depth-first search (DFS) and breadth-first search (BFS). An extensive set of simulations shows that the modified algorithm outperforms the original LDP algorithm in both throughput and runtime.
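To make the terminology concrete, the following is a minimal illustrative sketch, not the LDP algorithm or the thesis implementation. It assumes a workflow given as a list of module names and dependency edges, groups the modules into topological layers (the unit a layer-oriented method would iterate over), and computes a frame rate under the common assumption that a pipeline's throughput is the reciprocal of its slowest stage. The function names `topological_layers` and `frame_rate` and the example workflow are hypothetical.

```python
from collections import defaultdict, deque


def topological_layers(nodes, edges):
    """Group DAG nodes into layers.

    A node with no predecessors is in layer 0; any other node is in
    layer 1 + (the highest layer among its predecessors).
    """
    successors = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    layer = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:  # Kahn's algorithm, tracking the layer of each node
        u = queue.popleft()
        visited += 1
        for v in successors[u]:
            layer[v] = max(layer[v], layer[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    if visited != len(nodes):
        raise ValueError("graph contains a cycle, so it is not a DAG")

    grouped = defaultdict(list)
    for n in nodes:  # preserve the input order inside each layer
        grouped[layer[n]].append(n)
    return [grouped[i] for i in range(len(grouped))]


def frame_rate(stage_times):
    """Frames per time unit for a pipeline limited by its slowest stage."""
    return 1.0 / max(stage_times)


# Hypothetical four-module workflow: a feeds b and c, which both feed d.
layers = topological_layers(
    ["a", "b", "c", "d"],
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")],
)
print(layers)
print(frame_rate([0.5, 2.0, 1.0]))
```

In this sketch the bottleneck stage alone determines the frame rate, which is why a mapping that shortens the slowest stage raises throughput even if the end-to-end delay is unchanged.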
Keywords
High Performance Computing, Maximum Frame Rate, Workflow Mapping