Informatics thesis and dissertation collection
http://hdl.handle.net/1842/3389

Methodology to sustain common information spaces for research collaborations
Trani, Luca
http://hdl.handle.net/1842/36139
Issued: 2019-11-23
Information and knowledge sharing collaborations are essential for scientific research
and innovation. They provide opportunities to pool expertise and resources. They are
required to draw on today’s wealth of data to address pressing societal challenges.
Establishing effective collaborations depends on the alignment of intellectual and
technical capital.
In this thesis we investigate the implications and influence of socio-technical aspects
of research collaborations to identify methods of facilitating their formation and
sustained success. We draw on our experience acquired in an international federated
seismological context, and in a large research infrastructure for solid-Earth sciences.
We recognise the centrality of the users and propose a strategy to sustain their
engagement as actors participating in the collaboration. Our approach promotes and
enables their active contribution in the construction and maintenance of Common
Information Spaces (CISs). These are shaped by conceptual agreements that are
captured and maintained to facilitate mutual understanding and to underpin their
collaborative work.
A user-driven approach shapes the evolution of a CIS based on the requirements of
the communities involved in the collaboration. Active user engagement is pursued by
partitioning concerns and by targeting their interests. For instance, application domain
experts focus on scientific and conceptual aspects; data and information experts address
knowledge representation issues; and architects and engineers build the infrastructure
that populates the common space.
We introduce a methodology to sustain CISs and a conceptual framework founded
on a set of agreed Core Concepts forming a Canonical Core (CC). We also introduce a
representation of such a CC that leverages and promotes reuse of
existing standards: EPOS-DCAT-AP.
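As a purely illustrative sketch of what a record in the DCAT family of standards looks like, the following Python serialises a minimal, hypothetical dataset description to Turtle using only the standard library; the dataset, its URI and its properties are invented, and a real EPOS-DCAT-AP record carries many more mandatory fields.

```python
# Hypothetical DCAT-style catalogue record expressed as RDF triples.
DCT = "http://purl.org/dc/terms/"
DCAT = "http://www.w3.org/ns/dcat#"

triples = [
    ("<https://example.org/dataset/seismic-waveforms>",
     "a", "dcat:Dataset"),
    ("<https://example.org/dataset/seismic-waveforms>",
     "dct:title", '"Seismic waveform archive (example)"'),
    ("<https://example.org/dataset/seismic-waveforms>",
     "dct:publisher", "<https://example.org/org/example-observatory>"),
]

def to_turtle(triples):
    # Emit prefix declarations followed by one statement per triple.
    prefixes = [f"@prefix dct: <{DCT}> .", f"@prefix dcat: <{DCAT}> ."]
    body = [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(prefixes + body)

print(to_turtle(triples))
```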
The application of our methodology shows promising results with a good uptake
and adoption by the targeted communities. This encourages us to continue applying
and evaluating such a strategy in the future.
Learning to make decisions with unforeseen possibilities
Innes, Craig
http://hdl.handle.net/1842/36129
Issued: 2019-11-23
Methods for learning optimal policies often assume that the way the domain is conceptualised—
the possible states and relevant actions that are needed to solve one’s
decision problem—is known in advance and does not change during learning. This is
an unrealistic assumption in many scenarios. Often, new evidence can reveal important
information about what is possible, not just what is likely or unlikely. A learner may
have been completely unaware that such possibilities even existed prior to learning.
This thesis presents a model of an agent which discovers and exploits unforeseen
possibilities from two sources of evidence: domain exploration and communication
with an expert. The model combines probabilistic and symbolic reasoning to estimate
all components of the decision problem, including the set of belief variables, the
possible actions, and the probabilistic dependencies between variables. Unlike prior
work on solving decision problems by discovering and learning to exploit unforeseen
possibilities (e.g., Rong (2016); McCallum and Ballard (1996)), our model supports
discovering and learning to exploit unforeseen factors, as opposed to an additional
atomic state. Becoming aware of an unforeseen factor presents computational challenges
when compared with becoming aware of an additional atomic state, because even a
boolean factor doubles the size of the decision problem’s hypothesis space as opposed
to increasing it by just one more state. We show via experiments that one can meet
those challenges by adopting (defeasible) reasoning principles that are familiar from
the literature on belief revision: roughly, default to simple models over more complex
ones and default to conserving what you’ve learned from prior evidence.
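The contrast between the two kinds of growth can be made concrete with a few lines of Python (an illustrative count, not the thesis's implementation):

```python
# State-space size with k Boolean factors versus a flat enumeration.
def factored_states(k: int) -> int:
    return 2 ** k  # each Boolean factor doubles the joint state-space

k = 10
flat_after_one_more_state = factored_states(k) + 1  # 1024 -> 1025
after_one_more_factor = factored_states(k + 1)      # 1024 -> 2048

assert flat_after_one_more_state == 1025
assert after_one_more_factor == 2 * factored_states(k)
```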
For one-step decision problems, our agent learns the components of a Decision Network;
for sequential problems, it learns a Factored Markov Decision Process. We prove
convergence theorems for our models, given the learner’s and expert’s strategies for
gathering evidence. Furthermore, our experiments show that the agent converges on
optimal behaviour even when it starts out completely unaware of factors that are critical
to success.
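A factored transition model of the kind referred to here can be sketched as one slice of a dynamic Bayesian network; the variables, parents and probabilities below are invented for illustration, and the thesis's actual learning machinery is not shown:

```python
import random

random.seed(0)

# Each next-step variable has its own small conditional table given a
# few parent variables, instead of one table over all joint atomic states.
transition = {
    "rain": lambda s: 0.7 if s["rain"] else 0.2,  # P(rain' = 1 | rain)
    "wet":  lambda s: 0.9 if s["rain"] else 0.3,  # P(wet'  = 1 | rain)
}

def step(state):
    # Sample each factor independently given its parents in `state`.
    return {var: random.random() < prob(state) for var, prob in transition.items()}

state = {"rain": True, "wet": False}
next_state = step(state)
```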
Deep generative modelling for amortised variational inference
Srivastava, Akash
http://hdl.handle.net/1842/36114
Issued: 2019-11-23
Probabilistic and statistical modelling are the fundamental frameworks that underlie a
large proportion of modern machine learning (ML) techniques. These frameworks
allow practitioners to develop tailor-made models for their problems that may
incorporate their expert knowledge and can learn from data. Learning from data in the
Bayesian framework is referred to as inference. In general, model-specific inference
methods are hard to derive, as they require a high level of mathematical and statistical
dexterity on the practitioner's part. As a result, there is a large community of researchers
in ML and statistics that work towards developing automatic methods of inference
(Carpenter et al., 2017; Tran et al., 2016; Kucukelbir et al., 2016; Ge et al., 2018;
Salvatier et al., 2016; Uber, 2017; Lintusaari et al., 2018). These methods are generally
model agnostic and are therefore called black-box inference methods. Recent work has shown
that the use of deep learning techniques (Rezende and Mohamed, 2015b; Kingma et al.,
2016; Srivastava and Sutton, 2017; Mescheder et al., 2017a) within the framework of
variational inference (Jordan et al., 1999) not only allows for automatic and accurate
inference but does so in a dramatically more efficient way. The added efficiency comes from
the amortisation of the learning cost by using deep neural networks to leverage the
smoothness between data points and their posterior parameters.
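The amortisation idea can be illustrated on a toy conjugate model, where a single set of encoder weights maps any observation straight to its approximate-posterior parameters (the model, encoder form and parameters below are illustrative, not the thesis's):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conjugate model: z ~ N(0, 1), x | z ~ N(z, 0.5**2).  The amortised
# "encoder" has three parameters shared across all data points, so no
# per-data-point optimisation is needed.
def encoder(x, w):
    a, b, c = w
    return a * x + b, c              # posterior mean, log-variance

def elbo(x, w, n_samples=2000):
    mu, logvar = encoder(x, w)
    std = np.exp(0.5 * logvar)
    z = mu + std * rng.standard_normal(n_samples)   # reparameterisation
    log_lik = -0.5 * ((x - z) / 0.5) ** 2 - np.log(0.5) - 0.5 * np.log(2 * np.pi)
    log_prior = -0.5 * z ** 2 - 0.5 * np.log(2 * np.pi)
    log_q = -0.5 * ((z - mu) / std) ** 2 - 0.5 * logvar - 0.5 * np.log(2 * np.pi)
    return float(np.mean(log_lik + log_prior - log_q))

# For this model the exact posterior is N(0.8 * x, 0.2), so the encoder
# weights (0.8, 0.0, log 0.2) achieve a higher ELBO than a poor setting.
good = elbo(2.0, (0.8, 0.0, np.log(0.2)))
bad = elbo(2.0, (0.0, 0.0, 0.0))
```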
The field of deep learning based amortised variational inference is relatively new
and therefore has numerous challenges and issues to be tackled before it can be established
as a standard method of inference. To this end, this thesis presents four pieces of
original work in the domain of automatic amortised variational inference in statistical
models. We first introduce two sets of techniques for amortising variational inference in
Bayesian generative models such as the Latent Dirichlet Allocation (Blei et al., 2003)
and Pachinko Allocation Machine (Li and McCallum, 2006). These techniques use
deep neural networks and stochastic gradient based first order optimisers for inference
and can be generically applied for inference in a large number of Bayesian generative
models. We also introduce a novel variational framework for implicit generative
models of data, called VEEGAN. This framework allows for inference in
statistical models where, unlike in Bayesian generative models, a prescribed likelihood
function is not available. It makes use of a discriminator-based density-ratio estimator
(Sugiyama et al., 2012) to deal with the intractability of the likelihood function. Implicit
generative models such as generative adversarial networks (Goodfellow et al., 2014)
suffer from learning issues like mode collapse (Srivastava et al., 2017) and training
instability (Arjovsky et al., 2017). We tackle mode collapse in GANs using VEEGAN and propose a new training method for implicit generative models, RB-MMDnet,
based on an alternative density-ratio estimator that provides stable training and
optimisation in implicit models.
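The discriminator-based density-ratio idea rests on a simple identity, illustrated here with exact densities plugged in for the discriminator in place of a learned classifier (the two densities are purely illustrative):

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# The optimal discriminator between densities p and q is
# D*(x) = p(x) / (p(x) + q(x)); inverting it recovers the density ratio
# r(x) = D*(x) / (1 - D*(x)) = p(x) / q(x), with no likelihood needed
# once D is learned from samples.
def ratio_via_discriminator(x, p, q):
    d = p(x) / (p(x) + q(x))
    return d / (1.0 - d)

p = lambda x: normal_pdf(x, 0.0, 1.0)   # "data" density (illustrative)
q = lambda x: normal_pdf(x, 1.0, 1.5)   # "model" density (illustrative)

r = ratio_via_discriminator(0.3, p, q)
```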
Our results and analysis clearly show that the application of deep generative modelling
in variational inference is a promising direction for improving the state of
black-box inference methods. Not only do these methods perform better than traditional
inference methods for the models in question, but they do so in a fraction of the
time by exploiting modern GPU hardware.
Phenomenological modelling: statistical abstraction methods for Markov chains
Michaelides, Michalis
http://hdl.handle.net/1842/36109
Issued: 2019-11-23
Continuous-time Markov chains have long served as exemplary low-level models for an
array of systems, be they natural processes like chemical reactions and population fluctuations
in ecosystems, or artificial processes like server queuing systems or communication
networks. Our interest in such systems is often an emergent macro-scale behaviour, or
phenomenon, which can be well characterised by the satisfaction of a set of properties.
Although theoretically elegant, the fundamental low-level nature of Markov chain models
makes macro-scale analysis of the phenomenon of interest difficult. Particularly, it is not
easy to determine the driving mechanisms for the emergent phenomenon, or to predict
how changes at the Markov chain level will influence the macro-scale behaviour.
The difficulties arise primarily from two aspects of such models. Firstly, as the number
of components in the modelled system grows, so does the state-space of the Markov
chain, often making behaviour characterisation untenable under both simulation-based
and analytical methods. Secondly, the behaviour of interest in such systems is usually
dependent on the inherent stochasticity of the model, and may not be aligned to the
underlying state interpretation. In a model where states represent a low-level, primitive
aspect of system components, the phenomenon of interest often varies significantly with
respect to this low-level aspect that states represent.
This work focuses on providing methodological frameworks that circumvent these
issues by developing abstraction strategies, which preserve the phenomena of interest. In
the first part of this thesis, we express behavioural characteristics of the system in terms
of a temporal logic with Markov chain trajectories as semantic objects. This allows us
to group regions of the state-space by how well they satisfy the logical properties that
characterise macro-scale behaviour, in order to produce an abstracted Markov chain.
States of the abstracted chain correspond to certain satisfaction probabilities of the logical
properties, and inferred dynamics match the behaviour of the original chain in terms of
the properties. The resulting model has a smaller state-space which is interpretable in
terms of an emergent behaviour of the original system, and is therefore valuable to a
researcher despite the accuracy sacrifices.

Coarsening based on logical properties is particularly useful in multi-scale modelling,
where a layer of the model is a (continuous-time) Markov chain. In such models, the layer
is relevant to other layers only in terms of its output: some logical property evaluated
on the trajectory drawn from the Markov chain. We develop here a framework for
constructing a surrogate (discrete-time) Markov chain, with states corresponding to layer
output. The expensive simulation of a large Markov chain is therefore replaced by an
interpretable abstracted model. We can further use this framework to test whether a
posited mechanism could be the driver for a specific macro-scale behaviour exhibited by
the model.
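The low-level setting can be sketched in a few lines: a Gillespie simulation of a hypothetical birth-death chain, a simple reachability property evaluated on each trajectory, and its satisfaction probability estimated by Monte Carlo (the rates and threshold are illustrative, not taken from the thesis):

```python
import random

random.seed(1)

# Birth-death CTMC: births at constant rate lam, deaths at rate mu * n.
def simulate(lam, mu, t_max):
    t, n, traj = 0.0, 0, [(0.0, 0)]
    while t < t_max:
        total = lam + mu * n
        t += random.expovariate(total)   # time to next event
        if t >= t_max:
            break
        n += 1 if random.random() < lam / total else -1
        traj.append((t, n))
    return traj

# A simple "logical property" evaluated on a trajectory:
# does the population reach the threshold before t_max?
def reaches(traj, threshold):
    return any(n >= threshold for _, n in traj)

# Monte Carlo estimate of the satisfaction probability.
runs = [simulate(2.0, 0.1, 10.0) for _ in range(200)]
p_hat = sum(reaches(tr, 5) for tr in runs) / len(runs)
```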
We use a powerful Bayesian non-parametric regression technique based on Gaussian
process theory to produce the necessary elements of the abstractions above. In particular,
we observe trajectories of the original system from which we infer the satisfaction of
logical properties for varying model parametrisation, and the dynamics for the abstracted
system that match the original in behaviour.
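The regression step can be sketched with a plain RBF-kernel Gaussian process posterior mean (the function, inputs and lengthscale below are illustrative, and the thesis's full inference pipeline is not shown):

```python
import numpy as np

# GP regression: the posterior mean at test inputs is k(X*, X) @ K^{-1} y.
def rbf(a, b, ell=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

X = np.linspace(0.0, 4.0, 9)           # training inputs
y = np.sin(X)                          # noise-free observations of sin

K = rbf(X, X) + 1e-6 * np.eye(len(X))  # jitter for numerical stability
X_test = np.array([1.0, 2.5])
post_mean = rbf(X_test, X) @ np.linalg.solve(K, y)
```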
The final part of the thesis presents a novel continuous-state process approximation
to the macro-scale behaviour of discrete-state Markov chains with large state-spaces.
The method is based on spectral analysis of the transition matrix of the chain, where we
use the popular manifold learning method of diffusion maps to analyse the transition
matrix as the operator of a hidden continuous process. An embedding of states in
a continuous space is recovered, and the space is endowed with a drift vector field
inferred via Gaussian process regression. In this manner, we form an ODE whose
solution approximates the evolution of the CTMC mean, mapped onto the continuous
space (known as the fluid limit). Our method is general and differs significantly from
other continuous approximation methods; the latter rely on the Markov chain having
a particular population structure, suggestive of a natural continuous state-space and
associated dynamics.
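The spectral idea behind the embedding can be illustrated on a small example: for a random walk on a path, the eigenvector of the transition matrix with the second-largest eigenvalue already orders the states along a one-dimensional coordinate (the chain below is illustrative, and the full diffusion-maps normalisation is not shown):

```python
import numpy as np

# Transition matrix of a simple random walk on {0, ..., 9} with
# reflecting (lazy) boundaries; this P happens to be symmetric.
n = 10
P = np.zeros((n, n))
for i in range(n):
    P[i, max(i - 1, 0)] += 0.5
    P[i, min(i + 1, n - 1)] += 0.5

vals, vecs = np.linalg.eigh(P)   # eigenvalues in ascending order
phi = vecs[:, -2]                # eigenvector of the second-largest eigenvalue
diffs = np.diff(phi)             # monotone: embeds states along one axis
```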
Overall, this thesis contributes novel methodologies that emphasize the importance
of macro-scale behaviour in modelling complex systems. Part of the work focuses on
abstracting large systems into more concise systems that retain behavioural characteristics
and are interpretable to the modeller. The final part examines the relationship between
continuous and discrete state-spaces and seeks a transition path between the two that
does not rely on exogenous semantics of the system states. Beyond the computational
and theoretical benefits of these methodologies, they push at the boundaries of various
prevalent approaches to stochastic modelling.