Monday, February 26, 2018 - 1:30 p.m.
Meeting room 2265, Innovation Center
DISSERTATION DEFENSE

Xiao Lin
Advisor: Dr. Gabriel Terejanu

Abstract

Computational models play an important role in scientific discovery and engineering design. However, developing computational models is challenging, since the process follows a path contaminated with errors and uncertainties. These uncertainties and errors arise from many factors, including experimental uncertainties, model structure inadequacies, uncertainties in model parameters and initial conditions, and errors due to numerical discretizations. To realize the full potential of computational models in applications, it is critical to reduce these uncertainties systematically and economically.

The development and updating of computational models is a recursive process between data assimilation and data selection. In data assimilation, measurements are incorporated into computational simulations to reduce the model's uncertainties; in turn, the simulations help determine where to acquire data so that the most information can be obtained. Current data assimilation techniques are overwhelmed by data volume and velocity and by the increasing complexity of computational models. In this work, we develop a novel data assimilation approach, EnLLVM, based on a linear latent variable model. This approach has several advantages. First, it works well with high-dimensional dynamic systems while requiring only a small number of samples. Second, it can absorb model structure error and reflect that error in the uncertainty of the data assimilation results. In addition, data assimilation is performed without calculating the likelihood of the observations, so the approach can be applied to data assimilation problems in which the likelihood is intractable.

Obtaining informative data is also crucial, as data collection is an expensive endeavor in many science and engineering fields. Mutual information, which naturally measures the information one quantity provides about another, has become a major design metric and has fueled a large body of work on experimental design. However, estimating mutual information is challenging, and the results are unreliable in high dimensions. In this work, we derive a lower bound on mutual information that can be computed in much lower dimensions. This lower bound can be applied to experimental design as well as to other problems that require comparing mutual information.

Finally, we develop a general framework for building computational models. In this framework, hypotheses about unknown model structure are generated by using EnLLVM for data assimilation and the mutual information lower bound for finding relations between state variables and the unknown structure function. The competing hypotheses are then ranked with model selection techniques. This framework not only provides a way to infer model discrepancy but could also contribute to further scientific discoveries.
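To make the idea of assimilating observations through a linear latent variable model concrete, the sketch below performs an ensemble Kalman-style update in a low-dimensional latent space obtained by PCA of the forecast ensemble. This is not the dissertation's EnLLVM algorithm (which, among other things, avoids computing the observation likelihood); the dimensions, observation model, and noise levels are assumptions chosen purely for illustration.

import numpy as np

# Illustrative sketch only (not EnLLVM): likelihood-based ensemble Kalman update
# carried out in a latent space defined by a linear latent variable model (PCA).
rng = np.random.default_rng(0)

n_state, n_ens, n_obs = 1000, 50, 20       # high-dimensional state, small ensemble
X = rng.normal(size=(n_state, n_ens))      # forecast ensemble (placeholder dynamics)
H = rng.normal(size=(n_obs, n_state)) / np.sqrt(n_state)   # assumed linear observation operator
y = H @ rng.normal(size=n_state) + 0.1 * rng.normal(size=n_obs)  # synthetic observation

# Linear latent variable model: project ensemble anomalies onto leading PCA modes.
x_mean = X.mean(axis=1, keepdims=True)
A = X - x_mean                              # ensemble anomalies
U, s, _ = np.linalg.svd(A, full_matrices=False)
k = 10                                      # latent dimension << n_state
B = U[:, :k]                                # latent basis
Z = B.T @ A                                 # latent ensemble (k x n_ens)

# Kalman-style update in the latent space.
Hz = H @ B                                  # observation operator in latent coordinates
R = 0.01 * np.eye(n_obs)                    # assumed observation error covariance
Pz = Z @ Z.T / (n_ens - 1)                  # latent sample covariance
K = Pz @ Hz.T @ np.linalg.inv(Hz @ Pz @ Hz.T + R)   # latent Kalman gain
innov = y[:, None] - H @ X                  # innovation for each ensemble member
Z_post = Z + K @ innov                      # updated latent ensemble
X_post = x_mean + B @ Z_post                # map back to the full state space
print(X_post.shape)                         # (1000, 50)

The point of the latent-space formulation is that the update involves only k x k and n_obs x n_obs matrices, so a small ensemble suffices even when n_state is large.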
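The abstract does not reproduce the specific lower bound derived in the dissertation. For reference, a standard route to a mutual information lower bound that can be computed in fewer dimensions is the data processing inequality applied to low-dimensional summaries; the sketch below states that bound with the usual definitions and should be read only as an illustration of the general idea.

% Mutual information between X and Y, in terms of densities and entropies:
\[
  I(X;Y) \;=\; \iint p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy
  \;=\; H(X) - H(X \mid Y).
\]
% For any measurable low-dimensional summaries g(X) and h(Y), the data
% processing inequality gives a bound computable in the summary dimensions:
\[
  I\bigl(g(X);\,h(Y)\bigr) \;\le\; I(X;Y),
\]
% so an estimate of the left-hand side is a conservative (lower-bound)
% surrogate for the full mutual information when comparing designs.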