A hypervariable STR polymorphism in the CFI gene: Southern origin of East Asian-specific group H alleles
Introduction
Population-specific or regionally private alleles are powerful markers for understanding human migration and expansion in anthropology and for identifying the ancestry of an individual in forensic science. Short tandem repeat (STR) loci are highly polymorphic and are useful genetic markers in these fields. However, no population-specific or regionally private STR alleles with a frequency above 0.13 have been found, except for allele 9RA at the D9S1120 locus, which is ubiquitous at an average frequency of 0.329 in North and South American populations and 0.207 in Western Beringian populations. This geographically limited distribution of allele 9RA implies that Western Beringians contributed to all modern Native American populations [1]. Allele 9RA at the D9S1120 locus is useful as an ancestry-informative marker for Native Americans [2], [3].
Complement factor I (CFI) is a serine protease with an important role in regulating the alternative complement pathway. The human CFI gene spans 63 kb on chromosome 4q25 and consists of 13 exons, encoding a 583-amino acid polypeptide [4]. A short tandem repeat (iSTR, refSNP ID rs56875356) in intron 7 is hypervariable, with expected heterozygosity ranging from 0.889 in a German population to 0.927 in a Thai population. iSTR consists of 42 normal and intermediate alleles, of which 25 (alleles 16–34), belonging to group L, are common in the four investigated populations. The remaining 17 (alleles 37.3–55.3), belonging to group H, were observed in mainland Japanese (J-Tottori) and Thai populations, but not in sub-Saharan African and German populations, suggesting that group H alleles are population-specific. The repeat structure differs between the two allele groups: group L alleles have a consensus structure of (TTTC)nC(TTTC)4, where n is 12–30, and group H alleles have a consensus structure of (TTTC)p(TTCC)qTTC(TTTC)rC(TTTC)3, where p + q + r is 34–52. An STR in the fibrinogen alpha chain gene (FGA) is also located on chromosome 4q, about 45 Mb from iSTR, and no linkage disequilibrium (LD) was observed between the two loci. Thus, iSTR was suggested to be a useful supplementary genetic marker in forensic science [5], [6]. We also elucidated the molecular basis of the CFI protein polymorphism and investigated five single-nucleotide polymorphisms in exons of the CFI gene (iSNPs) in 20 worldwide populations. There was a significant correlation between the frequency of the c.1217A (p.406H) allele in exon 11 and latitude in 15 East Asian populations, showing a south–north downward geographic cline [7]. An association was observed between c.1217A and group H alleles in the J-Tottori and Thai populations [5]. To evaluate the geographic patterns of the iSNPs and iSTR at the CFI gene in more detail, in this study we investigated 1161 unrelated subjects from 11 Asian populations.
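The two consensus repeat structures quoted in the introduction can be written down as patterns, which may make the difference between the groups easier to see. The following is a minimal sketch, not the typing method used in the study: the function name `classify_istr` is hypothetical, the input is assumed to be the repeat region alone with flanking sequence already removed, and p, q and r are each assumed to be at least 1 (the text gives only their sum).

```python
import re

# Consensus structures as quoted in the text:
#   group L: (TTTC)n C (TTTC)4                      with n = 12-30
#   group H: (TTTC)p (TTCC)q TTC (TTTC)r C (TTTC)3  with p + q + r = 34-52
# Assumption: p, q and r are each >= 1; the text states only their sum.
GROUP_L = re.compile(r"^((?:TTTC)+)C(?:TTTC){4}$")
GROUP_H = re.compile(r"^((?:TTTC)+)((?:TTCC)+)TTC((?:TTTC)+)C(?:TTTC){3}$")


def classify_istr(seq):
    """Return (group, repeat_counts, in_reported_range) for a repeat region.

    `seq` is assumed to contain only the repeat region (no flanking bases).
    `group` is "L", "H", or None when neither consensus structure matches.
    """
    seq = seq.upper()

    match = GROUP_L.match(seq)
    if match:
        n = len(match.group(1)) // 4  # number of leading TTTC units
        return ("L", {"n": n}, 12 <= n <= 30)

    match = GROUP_H.match(seq)
    if match:
        p, q, r = (len(block) // 4 for block in match.groups())
        return ("H", {"p": p, "q": q, "r": r}, 34 <= p + q + r <= 52)

    return (None, {}, False)


# Illustrative inputs built from the consensus formulas. The split of
# p + q + r = 34 into 20, 10 and 4 is hypothetical, not taken from the study.
short_l = "TTTC" * 12 + "C" + "TTTC" * 4
short_h = "TTTC" * 20 + "TTCC" * 10 + "TTC" + "TTTC" * 4 + "C" + "TTTC" * 3
print(classify_istr(short_l))
print(classify_istr(short_h))
```

The third element of the returned tuple only reports whether the repeat counts fall inside the ranges stated above; a sequence that matches a consensus structure but lies outside those ranges is still assigned to its group, so that unusual alleles are flagged rather than discarded.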
Materials and methods
A total of 1161 unrelated individuals from 11 Asian populations were investigated (Tables S1–S5). Of these, eight populations were selected from the 20 populations investigated in a previous study of the five iSNPs: Turk, Indian, Mongolian (M-Khalha), Korean (K-Seoul), Okinawan Japanese (J-Okinawa), and Han Chinese (H-Wuxi, H-Changsha and H-Huizhou) [7]. The remaining three populations were Tibetans in Kathmandu, Nepal, Evenkis in the Inner Mongolia Autonomous Region, China, and Oroqens in
Typing of five iSNPs
Table S1 shows the frequencies of the five iSNPs. The frequency of the c.1217A allele in the three additional populations ranged from 0.095 in Tibetans to 0.012 in Oroqens. The two populations in the northern part of East Asia showed markedly low frequencies. These findings confirmed a south–north downward geographic cline of the c.1217A allele frequency in 18 East Asian populations (r = −0.8842, p < 0.001). H-Changsha showed the highest frequency, 0.150, among the 23 investigated populations. The
Discussion
The human iSTR underwent two important events during human evolution that generated the group L and group H alleles. Chimpanzee, gorilla and Japanese monkey have the sequence (TTTC)3(T/CTTC)T13–14 in the corresponding region [5]. RepeatMasker analysis (http://www.repeatmasker.org/) showed an AluYc element in the short sequence data from the three primates and humans [5] and an AluSg7 element in the long human sequence data from a database (http://www.ncbi.nlm.nih.gov/gene/3426). An Alu element
Acknowledgements
This study was supported in part by Grants-in-Aid for Scientific Research (20590678 and 23590849 to IY) from the Japan Society for the Promotion of Science.
References (23)
- et al. D9S1120, a simple STR with a common Native American-specific allele: forensic optimization, locus characterization and allele frequency studies. Forensic Sci Int Genet (2008)
- et al. A hypervariable STR polymorphism in the CFI gene: mutation rate and no linkage disequilibrium with FGA. Legal Med (2013)
- et al. Distribution of OCA2∗481Thr and OCA2∗615Arg, associated with hypopigmentation, in several additional populations. Legal Med (2011)
- et al. Alu repeats: a source for the genesis of primate microsatellites. Genomics (1995)
- et al. Y-chromosome evidence of southern origin of the East Asian-specific haplogroup O3-M122. Am J Hum Genet (2005)
- et al. Mutation rate in human microsatellites: influence of the structure and length of the tandem repeat. Am J Hum Genet (1998)
- et al. A private allele ubiquitous in the Americas. Biol Lett (2007)
- et al. Evaluation of forensic and anthropological potential of D9S1120 in Mestizos and Amerindian populations from Mexico. Croat Med J (2012)
- et al. Complement factor I deficiency: a not so rare immune defect. Characterization of new mutations and the first large gene deletion. Orphanet J Rare Dis (2012)
- et al. A hypervariable STR polymorphism in the complement factor I (CFI) gene: Asian-specific alleles. Int J Legal Med (2011)
- Molecular basis of complement factor I (CFI) polymorphism: one of two polymorphic suballeles responsible for CFI A is Japanese-specific. J Hum Genet
Cited by (3)
- Exploring the Risks of Genetic Similarity Between Donor and Recipient in Human Leukocyte Antigen–Matched Transplantation. Transplantation Proceedings (2020)
- Genotyping of the c.1423C>T (p.P475S) polymorphism in the ADAMTS13 gene by APLP and HRM assays: Northeastern Asian origin of the mutant. Legal Medicine (2016)
Citation excerpt: The northern and southern regions of East Asia are hotspots of mutations and reservoirs of modern Asians in the process of evolution [21–23]. To date, we have also identified such alleles: OCA2∗481Thr, characteristic of northern populations and OCA2∗615Arg, CFI∗406His, and SCN5A∗1193Gln, characteristic of southern populations [24–27]. In this study, the average frequencies between two population groups divided by a northern latitude of 32° were compared: three northern and six southern Han Chinese populations had an average ADAMTS13∗T frequency of 0.0287 and 0.0136, respectively, and 14 northern and 10 southern East Asian populations, of 0.0450 and 0.0124, respectively.
- The global distribution of the p.R1193Q polymorphism in the SCN5A gene. Legal Medicine (2016)
Citation excerpt: The highest frequency of allele A was observed in Changsha in the Eurasian continent. According to distribution studies on SNPs and hypervariable short tandem repeats (iSTR) at the complement factor I (CFI) gene, the frequencies of the c.1217A allele and Group H alleles of iSTR were the highest in Han Chinese from Changsha and showed a south–north downward geographical cline [17–19]. In addition, the highest frequency of OCA2*615Arg was previously reported in Han Chinese in Changsha [20].