COLLOQUIUM Department of Computer Science and Engineering University of South Carolina Adaptive Allocation of Independent Tasks in Distributed Computing Systems Bo Hong Electrical Engineering Department University of Southern California Date: February 7, 2005 Time: 1:00-2:00PM Place: Swearingen 1A03 (Faculty Lounge) Abstract Distributed heterogeneous computing systems (e.g. the Grid and peer-to-peer systems) have emerged as attractive platforms for high performance computing. By utilizing their distributed resources in a coordinated manner, such systems can meet the computational demands of many complex applications. This talk will focus on the computation of a large set of independent tasks on such platforms. Such a computation paradigm models a variety of research and commercial activities such as SETI@home, Folding@home, and many parameter sweep applications. Previous studies have considered the scenario where the system infrastructure is a tree topology. However, other studies have shown that even the best tree extracted from a general graph structured system may have arbitrarily bad performance (compared with what can be achieved by the original graph). This talk will address the scheduling problems for systems with general graph topologies. By modeling the computation at the nodes as a special type of data flow, the throughput maximization problem is reduced to a flow optimization problem, which is then utilized to develop a distributed algorithm to coordinate the nodes. In this algorithm, every node determines its own activities (send, receive, and compute) based on the information about itself and its neighbors. Such a distributed cooperation among the nodes is proven to maximize the system throughput. Additionally, the algorithm is able to adapt to changes in the computation and communication capabilities of the nodes. The optimality and adaptivity of the algorithm are verified through extensive simulations on both uniformly distributed and power-law distributed systems. The results show that the proposed algorithm, when approximated as a distributed protocol, achieves close to optimal system throughput and outperforms the first-come-first-serve greedy heuristic. Start-up time and adaptation time of the distributed protocol is experimentally shown to be of the same order as the system diameter. Bo Hong received the B.Eng. and M.Eng. degree from Tsinghua University, Beijing China in 1997 and 2000, respectively. He is currently a PhD candidate in the Electrical Engineering Department of the University of Southern California. He is expected to complete his PhD degree in summer 2005. His current research interests are in the broad area of parallel and distributed computing, particularly in algorithm design and scheduling techniques for distributed computing systems, distributed computing in sensor networks, and modeling and optimization of high performance computing systems/applications.