Abstract
The objective of this thesis is to develop new methods to reconstruct haplotypes from phase-unknown
genotypes. The need for new methodologies is motivated by the increasing availability
of high-resolution marker data for many species. Such markers typically exhibit
correlations, a phenomenon known as Linkage Disequilibrium (LD). It is believed that reconstructed
haplotypes for markers in high LD can be valuable for a variety of application
areas in population genetics, including reconstructing population history and identifying
genetic disease variants.
Traditionally, haplotype reconstruction methods can be categorized according to whether
they operate on a single pedigree or a collection of unrelated individuals. The thesis begins
with a critical assessment of the limitations of existing methods, and then presents a unified
statistical framework that can accommodate pedigree data, unrelated individuals and
tightly linked markers. The framework makes use of graphical models, where inference
entails representing the relevant joint probability distribution as a graph and then using
associated algorithms to facilitate computation. The graphical model formalism provides
invaluable tools to facilitate model specification, visualization, and inference.
Once the unified framework is developed, a broad range of simulation studies is conducted
using previously published haplotype data. Important contributions include demonstrating
the different ways in which the haplotype frequency distribution can impact the accuracy of
both the phase assignments and haplotype frequency estimates; evaluating the effectiveness
of using family data to improve accuracy for different frequency profiles; and assessing the
dangers of treating related individuals as unrelated in an association study.