Aim: Leukocyte immunoglobulin-like receptors (LILRs), encoded on human chromosome 19q13.4, comprise a set of 11 immunoglobulin superfamily receptors known for their genetic heterogeneity. Notably, LILRB3 and LILRA6 within this cluster exhibit pronounced sequence homology and variable copy number (CN) states. However, their precise roles remain difficult to determine.
Method: To address this difficulty, we developed an algorithm and tool named JoGo-LILR Caller, which jointly calls CNs of LILRB3 and LILRA6 from a population-scale whole-genome short-read sequencing dataset. This tool was applied to 2,504 International HapMap samples and yielded a global CN profile.
Results: This profile was corroborated by 100% concordance with CN data obtained from 24 samples subjected to long-read sequencing. The frequencies of different LILRB3-LILRA6 CN haplotypes in five populations were also estimated from the global CN profile. The established frequency profile allowed our tool to estimate LILRB3-LILRA6 CN haplotype combinations. For trio datasets, JoGo-LILR-trio improved the reliability of haplotype-pair prediction, and its estimates for offspring were consistent with those verified through long-read sequencing. JoGo-LILR Caller can accurately estimate LILRB3-LILRA6 CN types.
Conclusion: The utility of JoGo-LILR Caller extends to supporting the development of software that imputes LILRB3-LILRA6 CN types from SNP array genotyping data, enabling subsequent association analyses of these CN types with diverse diseases and phenotypes.
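
The haplotype-combination estimate mentioned in the Results can be illustrated with a minimal sketch. It is not the JoGo-LILR Caller implementation: it assumes Hardy-Weinberg proportions, a haplotype is written as a hypothetical (LILRB3 copies, LILRA6 copies) tuple, and the frequencies in `HAP_FREQ` are placeholder values chosen for illustration, not estimates from the study. The function name `rank_haplotype_pairs` is likewise an assumption.

```python
from itertools import combinations_with_replacement

# Hypothetical CN haplotypes: (LILRB3 copies, LILRA6 copies) -> frequency.
# Placeholder values for illustration only, NOT estimates from the study.
HAP_FREQ = {
    (1, 1): 0.70,
    (1, 0): 0.20,
    (1, 2): 0.10,
}


def rank_haplotype_pairs(cn_b3, cn_a6, hap_freq):
    """Rank haplotype pairs compatible with the observed diploid CNs.

    Assumes Hardy-Weinberg proportions: a pair (h1, h2) has prior
    f(h1) * f(h2), doubled when h1 != h2 because the pair is unordered.
    Returns [(pair, posterior), ...] sorted by decreasing posterior.
    """
    candidates = []
    for h1, h2 in combinations_with_replacement(sorted(hap_freq), 2):
        # Keep only pairs whose copy numbers sum to the observed CNs.
        if h1[0] + h2[0] == cn_b3 and h1[1] + h2[1] == cn_a6:
            prior = hap_freq[h1] * hap_freq[h2]
            if h1 != h2:
                prior *= 2
            candidates.append(((h1, h2), prior))

    total = sum(prior for _, prior in candidates)
    if total == 0:
        return []  # no known haplotype pair explains the observed CNs

    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    return [(pair, prior / total) for pair, prior in ranked]


if __name__ == "__main__":
    # Two copies of each gene can arise from more than one haplotype pair.
    for pair, posterior in rank_haplotype_pairs(2, 2, HAP_FREQ):
        print(pair, round(posterior, 3))
```

With these placeholder frequencies, a sample carrying two copies of each gene is compatible with two haplotype pairs, and the frequencies determine which is more probable. A trio-aware variant could additionally discard pairs that violate Mendelian inheritance, which is one plausible reason trio data would make the prediction more reliable.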