Statistical modeling of oscillating biological networks for structure inference and experimental design
Trejo Baños, Daniel
Oscillations lie at the core of many biological processes, from the cell cycle to circadian rhythms and developmental processes. They enable organisms to adapt to cyclic environmental conditions, from day/night to seasonal changes. Transcriptional regulatory networks are one of the mechanisms behind these biological oscillations. One of the main problems of computational systems biology is elucidating the interactions between biological components. A common mathematical abstraction is to represent these interactions as networks whose nodes are the reactive species and whose edges are the interactions. There is abundant literature on reconstructing network structure from steady-state gene expression measurements; still, much progress remains to be made because of the complex nature of biological systems. Experimental design is another obstacle: we wish to perform the experiments that, given our current knowledge of the system, best help us define the network structure. In the first chapters of this thesis we focus on reconstructing the network structure of biological oscillators by explicitly leveraging the cyclical nature of the transcriptional signals. We present a method for reconstructing network interactions tailored to this special but important class of genetic circuits. The method is based on projecting the signal onto a set of oscillatory basis functions. We build a Bayesian hierarchical model within a frequency-domain linear model in order to enforce sparsity and incorporate prior knowledge about the network structure. Experiments on real and simulated data show that the method can lead to substantial improvements over competing approaches when the oscillatory assumption is met, and remains competitive even when it is not.
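To make the projection step concrete, the following is a minimal sketch of fitting a sampled expression signal with a small set of oscillatory basis functions by ordinary least squares. It is an illustration only, not the thesis's implementation: the function names (`fourier_design`, `project`), the choice of a cosine/sine basis at harmonics of a known period, and the synthetic signal are all assumptions, and the Bayesian hierarchical model with sparsity-inducing priors is not shown.

```python
import numpy as np

def fourier_design(t, period, n_harmonics):
    """Design matrix with a constant column followed by a cos/sin pair
    for each harmonic of the assumed period (hypothetical helper)."""
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period
        cols.append(np.cos(w * t))
        cols.append(np.sin(w * t))
    return np.column_stack(cols)

def project(t, y, period, n_harmonics):
    """Least-squares coefficients of y on the oscillatory basis."""
    basis = fourier_design(t, period, n_harmonics)
    coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
    return coef

# Synthetic example (assumed data): samples every 2 h over 48 h, 24 h period.
t = np.arange(0.0, 48.0, 2.0)
y = 1.0 + 2.0 * np.cos(2 * np.pi * t / 24.0) + 0.5 * np.sin(2 * np.pi * 2 * t / 24.0)

# Coefficient order: [constant, cos1, sin1, cos2, sin2]
coef = project(t, y, period=24.0, n_harmonics=2)
```

In a network setting, coefficient vectors like `coef` for each gene would be the quantities related to one another by the frequency-domain linear model; how that regression and its priors are specified is left to the thesis itself.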
Having defined a model for gene expression in oscillatory systems, we also consider the problem of designing informative experiments for elucidating the dynamics and better identifying the model. We demonstrate our approach on a benchmark scenario in plant biology, the circadian clock network of Arabidopsis thaliana, and discuss the differing value of three commonly used types of experiment in aiding the reconstruction of the network. Finally, we provide the architecture and design of a software implementation for plugging statistical methods for gene expression inference and network reconstruction into a biological data integration platform.