Edinburgh Research Archive, The University of Edinburgh


Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/6520


Files in This Item:

Golding2012.pdf (2.29 MB, Adobe PDF)
Golding2012.doc (4.38 MB, Microsoft Word)
Title: Development of a statistical method for the identification of gene-environment interactions
Authors: Golding, Pauline Lindsay
Supervisor(s): Anderson, Niall; Campbell, Harry; Wild, Sarah
Issue Date: 30-Jun-2012
Publisher: The University of Edinburgh
Abstract: To understand common, complex disease, it is necessary to consider not only genetic and environmental risks but also the interplay between them. This thesis aims to develop methodology specifically for the detection of gene-environment interactions, both by examining the strengths and weaknesses of traditional approaches and by developing and testing a novel statistical method. Developments in genotyping technology enable researchers to collect data on large numbers of polymorphisms in human genes, yet very few statistical methods can handle the volume, variation and complexity of these data, especially in combination with environmental risk factors. Analyses of interactions between genes and the environment are often subject to the curse of dimensionality: each new variable increases the potential number of interactions exponentially, leading to low power and a high false positive rate. The Mixed Tree Method (MTM) exploits the differences between environmental and genetic variables by selecting the most appropriate features from conventional methods (including recursive partitioning, random forests and logistic regression) and combining them with new comparison algorithms, which rank the genetic variables by the likelihood that they interact with the environmental variable under study. Results show the MTM to be as effective as the most successful current method for identifying interactions, while maintaining a much lower false positive rate and computational burden. As the number of SNPs in the dataset increases, the advantage of the MTM over other methods grows, while the comparator approaches exhibit computational problems and rapidly increasing processing times. The MTM is also applied to a colorectal cancer dataset to demonstrate its use in a practical setting. Together, the results suggest that the MTM could be a useful strategy for identifying gene-environment interactions in future studies of complex disease.
Sponsor(s): Chief Scientist Office
Keywords: statistics; genetics
URI: http://hdl.handle.net/1842/6520
Appears in Collections: School of Clinical Sciences thesis and dissertation collection

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 
