Aim: The Hardy-Weinberg Equilibrium (HWE) assumption is fundamental to population genetics. Current methods to evaluate deviation from HWE fall into two categories: exact tests, which are limited to small populations and few alleles, and approximate goodness-of-fit tests, which cannot handle ambiguous typing. Real-world HLA data are often characterized by high polymorphism, large sample sizes, and typing ambiguity. To evaluate HWE using existing methods, HLA types must be converted to low resolution, grouped, or, worse, the posterior distribution of imputation must be collapsed to the most likely genotype; all of these options are suboptimal. Furthermore, current HWE methods provide only a measure of statistical significance and do not quantify the degree of deviation.
Method: We have developed two new methods that, in combination, overcome these limitations. The first, the Unambiguous Multi Allelic Test (UMAT), tests the deviation of the exact distribution from the one expected under HWE; the second, the Asymptotic Statistical Test with Ambiguity (ASTA), is a goodness-of-fit test that can handle ambiguous typing. UMAT is based on a perturbation approach in which the effect of randomly swapping pairs of alleles is tested in terms of its impact on the likelihood. ASTA is based on a correction of the chi-square test that reflects the variance under ambiguous typing.
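For orientation, the classical Pearson chi-square test for HWE on unambiguous multi-allelic genotypes can be sketched as follows. This is the standard textbook baseline, not ASTA or UMAT: it applies no correction for ambiguous typing and does no allele swapping. The function name `hwe_chi_square` and the input format (one pair of allele labels per individual) are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotypes):
    """Classical Pearson chi-square statistic for HWE (hypothetical helper).

    genotypes: list of (allele_a, allele_b) pairs, one per individual,
    with no typing ambiguity. Returns (statistic, degrees_of_freedom).
    """
    n = len(genotypes)
    # Observed counts of unordered genotypes.
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    # Allele frequencies estimated from the 2n sampled alleles.
    allele_counts = Counter(a for g in genotypes for a in g)
    freq = {a: c / (2 * n) for a, c in allele_counts.items()}
    alleles = sorted(freq)

    stat = 0.0
    for a, b in combinations_with_replacement(alleles, 2):
        # HWE expectation: n*p_a^2 for homozygotes, 2*n*p_a*p_b for heterozygotes.
        expected = n * freq[a] * freq[b] * (1 if a == b else 2)
        stat += (observed.get((a, b), 0) - expected) ** 2 / expected

    k = len(alleles)
    # k(k+1)/2 genotype classes, minus 1, minus (k-1) estimated frequencies.
    dof = k * (k - 1) // 2
    return stat, dof


if __name__ == "__main__":
    # Toy sample: four individuals, all heterozygous.
    print(hwe_chi_square([("A", "B")] * 4))
```

With highly polymorphic loci, many expected genotype counts in this baseline become very small, which is one reason the plain chi-square approximation is a poor fit for HLA data.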
Results: When these new methods are applied to HLA data from the US population, we observe significant deviation from HWE in all populations and at all loci. Using these methods, we can now quantify the degree of deviation (1) across populations and loci and (2) at the allele level. This has allowed the identification of a set of alleles that are likely to lead to spurious results in HLA association studies.
Conclusion: Examples of these deviating alleles can be identified in the disease association literature. We therefore recommend considering this approach as a general strategy for interpreting disease association findings.