|Title: ||Inference dynamics in transcriptional regulation|
|Authors: ||Asif, Hafiz Muhammad Shahzad|
|Supervisor(s): ||Sanguinetti, Guido|
|Issue Date: ||25-Jun-2012|
|Publisher: ||The University of Edinburgh|
|Abstract: ||Computational systems biology is an emerging area of research that focuses on understanding
the holistic view of complex biological systems with the help of statistical, mathematical and
computational techniques. The regulation of gene expression in gene regulatory networks is
a fundamental task performed by all known forms of life. In this subsystem, modelling the
behaviour of the components and their interactions can provide useful biological insights. Statistical
approaches for understanding biological phenomena such as gene regulation are proving
to be useful for understanding biological processes that are otherwise not comprehensible
due to the multitude of information and experimental difficulties. A combination of experimental
and computational biology can potentially lead to system-level understanding of
This thesis focuses on the problem of inferring the dynamics of gene regulation from the
observed output of gene expression. Understanding the dynamics of regulatory proteins in
regulating gene expression is a fundamental task in elucidating the hidden regulatory mechanisms.
For this task, an initial fixed structure of the network is obtained using experimental
biology techniques. Given this network structure, the proposed inference algorithms make use
of the expression data to predict the latent dynamics of transcription factor proteins.
The thesis starts with an introductory chapter that familiarises the reader with the physical
entities in biological systems; then we present the basic framework for inference in transcriptional
regulation and highlight the main features of our approach. Then we introduce the
methods and techniques that we use for inference in biological networks in chapter 2; it sets
the foundation for the remaining chapters of the thesis. Chapter 3 describes four well-known
methods for inference in transcriptional regulation, with the pros and cons of each method.
The main contributions of the thesis are presented in the following three chapters. Chapter 4 describes
a model for inference in transcriptional regulation using state space models. We extend
this method to cope with the expression data obtained from multiple independent experiments
where time dynamics are not present. We believe that the time has arrived to package methods
like these into customised software packages tailored for biologists for analysing the expression
data. We therefore developed an open-source, platform-independent implementation of this method
(TFInfer) that can process expression measurements with biological replicates to predict the
activities of proteins and their influence on gene expression in a gene regulatory network.
The proteins in the regulatory network are known to interact with one another in regulating
the expression of their downstream target genes. To take this into account, we propose a novel
method to infer the combinatorial effect of the proteins on gene expression using a variant of
the factorial hidden Markov model. We describe the inference mechanism in the combinatorial factorial
hidden Markov model (cFHMM) using an efficient variational Bayesian expectation maximisation algorithm.
We study the performance of the proposed model using simulated data analysis and
identify its limitations under different noise conditions; then we use three real expression datasets
to find the extent of combinatorial transcriptional regulation present in these datasets. This
constitutes chapter 5 of the thesis.
In chapter 6, we focus on the problem of inferring the groups of proteins that are under the
influence of the same external signals and thus have similar effects on their downstream targets.
The objectives of this work are twofold: firstly, identifying clusters of proteins with
similar dynamics indicates their role in specific biological mechanisms and is therefore potentially
useful for novel biological insights; secondly, clustering naturally leads to better estimation of
the transition rates of activity profiles of the regulatory proteins. The method we propose uses
Dirichlet process mixtures to cluster the latent activity profiles of regulatory proteins that are
modelled as the latent Markov chains of a factorial hidden Markov model; we refer to this method
as DPM-FHMM. We extensively test our method using simulated and real datasets and show
that it gives better results for inference in transcriptional regulation than a
standard factorial hidden Markov model.
In the last chapter, we present conclusions about the work presented in this thesis and
propose future directions for extending this work.|
|Keywords: ||computational systems biology|
transcription factor proteins.
|Appears in Collections:||Informatics thesis and dissertation collection|
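The setting described in the abstract (a fixed network structure, latent transcription factor activities, and observed gene expression) can be illustrated with a small simulation. The sketch below is not the thesis's model: it assumes binary activity chains that flip with a fixed probability, a purely additive effect of regulators on expression, and Gaussian noise. The function name `simulate_fhmm` and all of its parameters are hypothetical, chosen only for this illustration.

```python
import random


def simulate_fhmm(connectivity, weights, n_steps, p_switch=0.1, noise_sd=0.1, seed=0):
    """Simulate binary TF activity chains and the expression they produce.

    connectivity[g][t] is 1 if TF t is assumed to regulate gene g, else 0.
    weights[g][t] is an assumed (hypothetical) strength of that regulation.
    Returns (activities, expression), each a list with one row per time step.
    """
    rng = random.Random(seed)
    n_genes = len(connectivity)
    n_tfs = len(connectivity[0])

    # Each TF is an independent two-state Markov chain (0 = inactive, 1 = active).
    state = [rng.randint(0, 1) for _ in range(n_tfs)]
    activities, expression = [], []
    for _ in range(n_steps):
        # Flip each TF's state independently with probability p_switch.
        state = [1 - s if rng.random() < p_switch else s for s in state]
        activities.append(list(state))

        # Assumption: regulators act additively; combinatorial effects are not modelled here.
        row = []
        for g in range(n_genes):
            mean = sum(connectivity[g][t] * weights[g][t] * state[t] for t in range(n_tfs))
            row.append(mean + rng.gauss(0.0, noise_sd))
        expression.append(row)
    return activities, expression


# Example: three genes, two TFs; gene 1 is regulated by both TFs.
connectivity = [[1, 0], [1, 1], [0, 1]]
weights = [[1.0, 0.0], [0.5, -0.8], [0.0, 1.2]]
activities, expression = simulate_fhmm(connectivity, weights, n_steps=5)
```

Inference in this setting runs in the opposite direction: given `connectivity` and the observed `expression`, one would estimate the hidden `activities`. The simulation only makes the assumed generative direction concrete.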
Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.