
dc.contributor.advisor: Tyers, Mike
dc.contributor.author: Mungall, Christopher
dc.date.accessioned: 2011-08-01T10:13:04Z
dc.date.available: 2011-08-01T10:13:04Z
dc.date.issued: 2011-06-27
dc.identifier.uri: http://hdl.handle.net/1842/5020
dc.description: NIH Grant no. HG00739
dc.description.abstract: The advent of next-generation sequencing technologies is transforming biology by enabling individual researchers to sequence the genomes of individual organisms or cells on a massive scale. To realize the translational potential of this technology, we will need advanced information systems to integrate and interpret this deluge of data. These systems must be capable of extracting the location and function of genes and biological features from genomic data, which requires the coordinated parallel execution of multiple bioinformatics analyses and intelligent synthesis of the results. The resulting databases must be structured to allow complex biological knowledge to be recorded in a computable way, which requires the development of logic-based knowledge structures called ontologies. To visualise and manipulate the results, new graphical interfaces and knowledge acquisition tools are required. Finally, to help understand complex disease processes, these information systems must be able to integrate and make inferences over multiple data sets derived from numerous sources. RESULTS: Here I describe the research, design and implementation of some of the components of such a next-generation information system. I first describe the automated pipeline system used for the annotation of the Drosophila genome, and the application of this system in genomic research. This was followed by the development of a flexible graph-oriented database system called Chado, which relies on ontologies for structuring data and knowledge. I also describe research to develop, restructure and enhance a number of biological ontologies, adding a layer of logical semantics that increases the computability of these key knowledge sources. The resulting database and ontology collection can be accessed through a suite of tools. Finally, I describe how genome analysis, ontology-based database representation and powerful tools can be combined to make inferences about genotype-phenotype relationships within and across species. CONCLUSION: The large volumes of complex data generated by high-throughput genomic and systems biology technologies threaten to overwhelm us unless we can devise better computing tools to assist with their analysis. Ontologies are key technologies, but many existing ontologies are not interoperable or lack features that make them computable. Here I have shown how concerted ontology, tool and database development can be applied to make inferences of value to translational research.
dc.contributor.sponsor: Howard Hughes Medical Institute
dc.contributor.sponsor: National Institutes of Health
dc.contributor.sponsor: Biotechnology and Biological Sciences Research Council (BBSRC)
dc.language.iso: en
dc.publisher: The University of Edinburgh
dc.subject: genomic data
dc.subject: bioinformatics
dc.subject: genome analysis
dc.subject: ontology
dc.title: Next-generation information systems for genomics
dc.type: Thesis or Dissertation
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: PhD(P) Doctor of Philosophy by Research Publications

