COLLOQUIUM
Department of Computer Science and Engineering
University of South Carolina

Simple Math Is Enough: Two Examples of Inferring Functional Association from Genomic Data

Shoudan Liang
NASA Ames Research Center

Date: April 5, 2004 (Monday)
Time: 3:30-4:30PM
Place: Swearingen 1A03 (Faculty Lounge)

Abstract

High-throughput experiments, such as genome-wide monitoring of mRNA expression and protein-protein interactions, are expected to be fertile ground for deriving protein functions. These data usually contain a high rate of false positives. This talk discusses how to extract reliable biological associations from genome-wide data. Using two examples, we emphasize the importance of p-values, which are often derivable using simple mathematics.

The first example is based on our recently published work (PNAS 100, 12579) on the reliable prediction of protein functions from certain non-random features in large-scale 2-hybrid data. Our method assumes that if two proteins share a significantly larger number of common interaction partners than expected at random, they have a close functional association. Based on an analysis of yeast protein-protein interaction data, we have derived more than 2800 functional associations. We derived tentative functions for 81 non-annotated proteins. Since the completion of the work, 23 of them have been annotated in the Saccharomyces Genome Database, and all but one of our predictions proved to be correct.

In the second example, we will discuss our improvement to the REDUCE algorithm of Bussemaker et al. (Nat. Genet. 27, 167), which extracts protein-binding motifs from microarray experiments. In this method, statistical significance is derived from linear regression of the expression values on the copy number of motifs in genes' promoters.

Shoudan Liang works at the NASA Advanced Supercomputing Division of the NASA Ames Research Center. Trained in theoretical physics at the University of Chicago, where he obtained a Ph.D. in 1986, Dr. Liang taught physics at Penn State University before moving to NASA Ames six years ago. His main interests include metabolic network modeling and cis-regulatory element analysis in development.
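
As an illustration of the kind of simple p-value the abstract refers to, the shared-partner idea from the first example can be sketched as follows. This is a minimal sketch that assumes a hypergeometric null model, in which one protein's partners are drawn at random from all proteins; the published method may differ in detail. The function name and its parameters are hypothetical and chosen only for this example.

```python
from math import comb


def shared_partner_pvalue(total, n1, n2, shared):
    """Chance that two proteins share at least `shared` partners at random.

    Assumed null model (an illustration, not necessarily the talk's method):
    protein A has n1 partners among `total` proteins, and protein B's n2
    partners are drawn uniformly at random without replacement, so the
    number of common partners is hypergeometric.
    """
    if not (0 <= n1 <= total and 0 <= n2 <= total):
        raise ValueError("partner counts must lie between 0 and total")
    if shared < 0:
        raise ValueError("shared must be non-negative")
    upper = min(n1, n2)  # cannot share more partners than either protein has
    # Count the ways to pick B's partners with exactly k of them among A's.
    favorable = sum(
        comb(n1, k) * comb(total - n1, n2 - k)
        for k in range(shared, upper + 1)
    )
    return favorable / comb(total, n2)
```

A small p-value means the observed overlap is unlikely under the random model, which is the sense in which a pair would be called significantly associated. For example, `shared_partner_pvalue(10, 3, 3, 3)` gives 1/120, since only one of the 120 possible partner sets matches completely.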