COLLOQUIUM
Department of Computer Science and Engineering
University of South Carolina

Exploiting Structure and Parallelism to Accelerate Microarchitectural Simulation

David Penry
Department of Computer Science
Princeton University

Date: February 24, 2006
Time: 1430-1530
Place: Swearingen 1C01 (Amoco Hall)

Abstract

Microprocessors have continually increased in size and complexity over time, placing an ever-increasing demand for performance upon the simulators we use to design them. Conveniently, the systems we run these simulators on have become faster and faster, allowing simulators to keep up in this "simulation arms race." However, this state of affairs is about to end. Microprocessor manufacturers are turning increasingly to multi-core processors instead of more complex cores as a way to continue to scale processor performance. Yet an individual application can increase its performance from generation to generation on these multi-core chips only if the application is parallel. Thus, microarchitectural simulators must become parallel applications or simulation speed will no longer keep pace with design complexity.

It is clear that a simulator should have plenty of parallelism to exploit: hardware is naturally parallel. But how are we to exploit it? The traditional method of writing simulators as sequential C/C++ programs is already difficult, tedious, and time-consuming. Parallelizing these simulators is even worse: the structure of a sequential program obscures the natural parallelism and requires the simulator writer to use manual, ad hoc methods to parallelize the simulator.

A technique called structural simulation has been proposed recently, which greatly reduces the difficulty of writing simulators. This technique allows designers to develop a structural model of the processor which mirrors the natural parallelism of the hardware; a model compiler then generates a simulator from the model.

In this talk I explore the question: "Can the structure of the model be exploited to automatically parallelize the generated simulator?" By implementing automatic parallelization within a structural simulation framework, I show that the structure can be exploited. I further show that significant simulation speedups can be obtained without requiring the user to change the model.

David Penry is a Ph.D. candidate in Computer Science at Princeton University. His research interests include microarchitecture, parallel systems architecture, and tools for computer architecture research and development. Prior to arriving at Princeton, David worked for six years in industry, five of them at Sun Microsystems, where he worked on the Advanced PCI Bridge and the MAJC 5200 processor. Among his responsibilities on the latter project were logic design, system architecture, manufacturing test methodology, and leadership of the initial bring-up effort. David holds B.S.E. and M.S. degrees in Computer Engineering from Case Western Reserve University, an M.B.A. from The Ohio State University, and an M.A. in Computer Science from Princeton University.