COLLOQUIUM
Department of Computer Science and Engineering
University of South Carolina

High-performance, Power-aware Distributed Computing

Kirk W. Cameron
Scalable Performance Laboratory
Department of Computer Science and Engineering
University of South Carolina

Date: October 29, 2004 (Friday)
Time: 3:30-4:30PM
Place: Swearingen 1C01 (Amoco Hall)

Abstract

Large-scale parallel and distributed systems will be used to meet the computational demands of distributed simulations. These computer models enable scientific experiments prohibited by cost, capability, treaty, etc. (e.g., nuclear weapons). Application performance efficiencies (often 5-10% of peak) must improve to realize the computational power of these systems and decrease simulation time-to-solution significantly. As system size and complexity increase, maintaining such efficiencies will prove challenging. Additionally, the fundamental drive to increase peak performance using thousands of power-hungry components will lead to intolerable operating costs and failure rates. Without innovation, emergent petaflop systems will potentially require 100 megawatts of power, the lighting requirements of a small city, and use a shrinking percentage of peak performance.

We present our ongoing efforts to reduce the power-performance efficiency gap of distributed applications in high-end systems. To improve performance efficiency, we propose an analytical model for analysis and prediction. Our model is fast, accurate, and capable of quantifying the performance impact of memory and middleware on distributed communication. We discuss the collaborative use of our model by Argonne National Laboratory researchers to improve MPI performance. To improve power efficiency, we use emergent power-aware technologies (e.g., DVS) to reduce energy consumption by up to 25% for scientific codes on our 16-node Centrino-based Beowulf without impacting performance significantly (<5%).
Our innovative approaches compress the power-performance gap from below, by increasing achieved middleware performance, and from above, by decreasing theoretical peak system speed during periods of application inefficiency to conserve energy.

Kirk W. Cameron is an assistant professor in the Department of Computer Science and Engineering at the University of South Carolina, where he directs the SCAPE (SCAlable Performance) laboratory. He received his Ph.D. from Louisiana State University in August 2000. His practical experience includes work in memory simulation and validation at Intel Corporation and large-scale performance analysis at Los Alamos National Laboratory in New Mexico. He has published in many international conferences on microarchitecture and high-performance computing. Dr. Cameron is a recipient of an NSF CAREER Award (2004) and a DOE Early Career Principal Investigator Award (2004). His research interests include high-performance and grid computing, parallel and distributed systems, computer architecture, power-aware systems, and performance evaluation and prediction.