Abstract: This talk addresses three related topics in ancestry inference from genetic data. Topic one concerns the estimation of admixture proportions for people of mixed ethnicity. Population stratification has long been recognized as a potential confounding factor in genetic association studies. Estimated ancestries, derived from multi-locus genotype data, can be used as covariates to correct for population stratification. Topic two summarizes how one can locate the geographic origin of individuals based on their genetic backgrounds. SNPs (single nucleotide polymorphisms) vary widely in informativeness, allele frequencies change nonlinearly with geography, and reliable localization requires integrating evidence across a multitude of SNPs. These problems become more acute for individuals of mixed ancestry. Topic three summarizes a new convex clustering method that delivers an entire clustering path. When applied to genetic data, these paths recapitulate the divergence of human populations.
