COLLOQUIUM
Department of Computer Science and Engineering
University of South Carolina

Computational Methods in Structural Genomics: High Throughput Protein Structure Determination from NMR Data

Homayoun Valafar
Departments of Computer Science and Biochemistry
University of Georgia

Date: March 26, 2004 (Friday)
Time: 3:30-4:30PM
Place: Swearingen 1A03 (Faculty Lounge)

Abstract

One of the primary aims of the structural genomics initiative is the determination of representative structures from each protein fold family. However, analysis of the current rate of discovery of new fold families suggests that 20-30 years of intense structure determination will be needed before a full library is complete. This is mainly due to inefficient target selection within the structural genomics community. Consequently, it is important to rapidly identify proteins that belong to a family that is already well populated (so they can be eliminated from further study) or, more importantly, to identify proteins that represent new fold families. Currently, the most prominent method of target selection is based on sequence homology. While sequence homology is the most efficient way to cover sequence space, it is not necessarily the most efficient way to cover protein fold space.

This presentation will first introduce the main objective of structural genomics and then discuss some of the current bottlenecks. Finally, a new method for rapid classification of a protein to a fold family, using signal processing and optimization tools, will be presented. Our statistical analysis takes advantage of a new source of data, namely residual dipolar couplings, which can be obtained by nuclear magnetic resonance (NMR) spectroscopy. The required NMR data can be quickly acquired and analyzed.
Using this method, structure determination efforts can be focused on more unique and interesting structures, and the overall efficiency of constructing an information-rich structural library can be increased.

Homayoun Valafar is the project coordinator for bioinformatics and data analysis at the Southeastern Collaboratory for Structural Genomics (SECSG) at the University of Georgia.