Both groups were part of the Human Genome Diversity Project (HGDP)-Centre d'Étude du Polymorphisme Humain (CEPH) Panel, a collection of lymphoblastoid cell lines from 52 geographically diverse human populations. In addition to the two populations residing in African tropical forests, we also examined, for comparative purposes, three other human populations within the HGDP from Africa south of the Sahara. These populations, like the Pygmies, exhibit high levels of genetic diversity and low levels of linkage disequilibrium relative to the non-African populations, which have been affected by an ancestral founder effect during migration out of Africa. The three other sub-Saharan African populations examined were Bantu in Kenya, Mandenka in Senegal, and Yoruba in Nigeria.
Data from the HGDP-CEPH panel were not examined for Bantu outside of Kenya or for the San from Namibia, since sample sizes for these groups were small. Individuals identified as relatives were removed from the dataset; the final dataset contained 91 individuals, including Biaka, Mbuti, Bantu from Kenya, Mandenka and Yoruba.

SNP genotypes

We used the SNP data for the HGDP-CEPH Panel, a dataset containing 938 individuals genotyped on the Illumina 650K platform. Using the standardized subset of the HGDP data, genotypes for 644,258 autosomal SNPs were available. Chromosomal positions for the SNPs were provided by the HGDP release for NCBI Human Genome build 36.1, and map distances in centimorgans were calculated using those positions and recombination estimates provided by the HapMap project phase I+II.
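The conversion from physical positions to map distances can be illustrated with a minimal sketch. It assumes, hypothetically, that the recombination estimates are given as a rate in cM/Mb for each interval between consecutive map markers, and that SNP positions falling between markers are linearly interpolated. The function names `cumulative_cm` and `interpolate_cm` are illustrative only and are not taken from the original analysis.

```python
from bisect import bisect_right


def cumulative_cm(map_pos_bp, rate_cm_per_mb):
    """Cumulative genetic position (cM) at each map marker.

    Assumption: rate_cm_per_mb[i] applies to the interval from
    map_pos_bp[i] to map_pos_bp[i + 1]; the last rate is unused.
    """
    cm = [0.0]
    for i in range(len(map_pos_bp) - 1):
        span_mb = (map_pos_bp[i + 1] - map_pos_bp[i]) / 1e6
        cm.append(cm[-1] + rate_cm_per_mb[i] * span_mb)
    return cm


def interpolate_cm(snp_bp, map_pos_bp, map_cm):
    """Linearly interpolate the cM position of one SNP.

    Positions outside the map are clamped to the nearest map end.
    """
    if snp_bp <= map_pos_bp[0]:
        return map_cm[0]
    if snp_bp >= map_pos_bp[-1]:
        return map_cm[-1]
    # j is the first marker strictly to the right of the SNP.
    j = bisect_right(map_pos_bp, snp_bp)
    left_bp, right_bp = map_pos_bp[j - 1], map_pos_bp[j]
    frac = (snp_bp - left_bp) / (right_bp - left_bp)
    return map_cm[j - 1] + frac * (map_cm[j] - map_cm[j - 1])


# Toy map: three markers with made-up rates, not real HapMap values.
toy_pos = [1_000_000, 2_000_000, 4_000_000]
toy_rate = [1.0, 2.0, 0.0]
toy_cm = cumulative_cm(toy_pos, toy_rate)
snp_cm = interpolate_cm(3_000_000, toy_pos, toy_cm)
```

The map distance between two SNPs on the same chromosome is then the difference of their interpolated cM positions.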
Multi-locus test of selection

To examine the genomes for signatures of selection, we applied a previously validated method that examined regions displaying low heterozygosity within populations and/or high variance in FST between populations. By favoring one or a few haplotypes at the expense of others, selection reduces the overall level of heterozygosity around a beneficial allele. Thus, low heterozygosity in the SNPs surrounding an allele may be a signature of selection. Furthermore, within a population, as haplotype frequencies shift at a genomic region, some alleles will increase and others will decrease in frequency. In the population undergoing selection, some allele frequencies will become more similar, and others less similar, to the allele frequencies present in a second population not undergoing selection. Thus, between two populations, relatively high variance of FST for alleles at a genomic region may represent a signature of selection. An algorithm that scanned the genome for regions of low heterozygosity within populations and high variance in FST between populations was run for each possible pair of African populations.
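The two window statistics described above can be sketched as follows. This is a simplified illustration, not the published algorithm: it assumes biallelic SNPs summarized by one allele frequency per population, uses expected heterozygosity 2p(1 - p), a basic two-population FST of (HT - HS) / HT with equal population weights, and a fixed window measured in SNP counts. The function names and the window parameters are hypothetical.

```python
from statistics import mean, pvariance


def expected_het(p):
    """Expected heterozygosity of a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_pops(p1, p2):
    """Simple per-SNP FST = (HT - HS) / HT, weighting both populations equally."""
    p_bar = (p1 + p2) / 2.0
    h_t = expected_het(p_bar)
    if h_t == 0.0:
        # SNP is monomorphic in both populations; define FST as 0.
        return 0.0
    h_s = (expected_het(p1) + expected_het(p2)) / 2.0
    return (h_t - h_s) / h_t


def scan_windows(freqs_a, freqs_b, window=5, step=1):
    """Slide a window of `window` SNPs along one chromosome.

    Returns (start_index, mean heterozygosity in population A,
    variance of FST between A and B) for each window.
    """
    results = []
    for start in range(0, len(freqs_a) - window + 1, step):
        a = freqs_a[start:start + window]
        b = freqs_b[start:start + window]
        het_a = mean([expected_het(p) for p in a])
        fst = [fst_two_pops(x, y) for x, y in zip(a, b)]
        results.append((start, het_a, pvariance(fst)))
    return results


# Toy allele frequencies for four SNPs in two populations (made-up values).
toy_a = [0.5, 0.5, 0.0, 0.0]
toy_b = [0.5, 0.5, 0.5, 1.0]
toy_scan = scan_windows(toy_a, toy_b, window=2)
```

Under these assumptions, candidate regions would be windows whose mean heterozygosity falls in the low tail and whose FST variance falls in the high tail of the genome-wide distributions, with the scan repeated for each pair of populations.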