Tuesday, June 25, 2019

Comparative genomics and the complex population history of Papio baboons


We produced a reference genome assembly (Panu_3.0; GenBank accession GCA_000264685.2) for the olive baboon (Papio anubis), the baboon species most commonly used in biomedical research (Figure S1 and Tables S1 and S2) (7). To investigate genetic differentiation within the genus, we analyzed whole-genome sequences from 16 additional individuals: two to four individuals representing each of the six species within Papio, plus a gelada (Theropithecus gelada), a member of a closely related genus that serves as an outgroup (Figure S2 and Table S3). This diversity panel yielded more than 54.6 million single-nucleotide variants (SNVs), of which more than 42.4 million are variable within Papio (Figure S3 and Table S4). To obtain a second, independent perspective on genome differentiation, we identified novel Alu insertions, a type of genetic variation that arises through a substantially different mutational mechanism. We unexpectedly found a dramatically increased rate of recent Alu insertion in baboons (and rhesus macaques) relative to the genomes of humans and other primates (Figure 2 and Table S5). There are 192,889 full-length AluY elements in the P. anubis genome. The lineage-specific rate of Alu insertion in baboons and rhesus macaques was more than four times higher (Figure 2) than in hominoids (humans, chimpanzees, or orangutans) and approximately three times higher than in African green monkeys (Chlorocebus), another Old World monkey (OWM) (22).

Figure 2 Comparison of Alu mobilization rates in selected primate genomes.

Only Alu elements specific to each lineage are included. Circle size corresponds to the number of lineage-specific, nearly full-length AluY elements in that species. The bars on the right show the estimated number of insertions per million years for each lineage. For baboon (Panu_3.0), rhesus macaque (Mmul_8.0.1), African green monkey (chlSab2), chimpanzee (Pan_tro3), and human (GRCh38/hg38), AluY counts were computed by cross-comparison of the latest available assemblies. Orangutan estimates are from P_pygmaeus2.0.2 (60). The numbers of lineage-specific AluY elements are similar in rhesus macaque and baboon, and more than twice the number in African green monkey, despite a longer period of independent evolution for the African green monkey.

Our phylogenetic analyses provide several new findings about baboon population and genome history. Maximum likelihood (ML) analyses of concatenated SNVs show that individual baboons cluster correctly with their conspecifics and separate the six extant species into distinct northern and southern clades (Figure 3 and Figure S4A). By contrast, Bayesian analysis of the same SNV data suggests that P. kindae is sister to the northern clade rather than to P. cynocephalus and P. ursinus (Figure S4B). The existence of multiple hybrid zones and the documented discordance between mtDNA-based relationships and phenotypes (Figure 1) (12, 15) suggest that ancient admixture and/or incomplete lineage sorting (ILS) has influenced the genetic relationships among these species. When we used the polymorphism-aware phylogenetic approach, PoMo (23, 24), we again obtained the basal north-south divergence, with P. kindae placed in the southern clade. However, the relationships among the three southern species differ from the ML results (Figure 3). PoMo also infers much longer terminal branch lengths for P. ursinus and P. papio than for the other lineages. Simulations (Figure S5 and Table S6) show that admixture between divergent lineages can affect inferred branch lengths, and that lineages that have undergone admixture will have artificially shorter branches because alleles are shared between lineages. This suggests that the other four lineages may have been more affected by admixture than P. ursinus and P. papio, which is consistent with the fact that these two species occupy the extreme southern and western parts of the baboon range (Figure 1).

Figure 3 Phylogenetic relationships between baboon species.

(A) Phylogeny produced by the polymorphism-aware phylogenetic method (PoMo) (23, 24). The topology shown for the three northern species is also supported by ML analysis of concatenated SNVs and by 43.9% of informative gene trees filtered to exclude all gene coding sequences [scaled concordance factor (CF) of 0.439, greater than that of the other two alternatives]. The topology shown for the three species of the southern clade is supported by the PoMo analysis and has a CF of 0.322. (B) An alternative topology for the northern species, supported by a CF of 0.241. (C) An alternative topology for the southern species, supported by ML analysis of concatenated SNVs and a CF of 0.513, i.e., a larger proportion of gene trees lacking coding sequences than either of the other two alternative trees.

To test explicitly for admixture among the six extant baboon species, we analyzed F-statistics, followed by modeling with coalescent hidden Markov methods (Figure 4A, Table S7, and Figure S6). The best-fitting model (see Materials and Methods) indicates that the history of P. kindae includes an ancient admixture event involving a lineage related to extant P. ursinus (52% contribution to extant P. kindae) and an unsampled, probably extinct lineage related to the northern clade (48% contribution). The F-statistics also suggest that extant P. papio is closely related to P. anubis but received ~10% of its genetic input from an ancient northern lineage that has not yet been sampled and is perhaps extinct.
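The logic behind an admixture test of this kind can be sketched in a few lines. The example below assumes that the F-statistics referred to here are the usual allele-frequency-based f-statistics, and it shows only the naive f3 form, without the finite-sample bias correction that real analyses apply. The function names and the allele frequencies are hypothetical; only the 52% share is taken from the text.

```python
from statistics import mean

def f3(target, source_a, source_b):
    """Naive f3(target; A, B): mean over SNVs of (c - a) * (c - b).

    A clearly negative value suggests the target is a mixture of
    populations related to A and B. No bias correction is applied.
    """
    return mean((c - a) * (c - b) for c, a, b in zip(target, source_a, source_b))

def mix(freqs_a, freqs_b, share_a):
    """Allele frequencies of a population drawing share_a of its ancestry from A."""
    return [share_a * a + (1 - share_a) * b for a, b in zip(freqs_a, freqs_b)]

# Hypothetical allele frequencies at five SNVs in two source lineages.
south = [0.10, 0.80, 0.35, 0.60, 0.05]
north = [0.70, 0.20, 0.90, 0.15, 0.50]

# A population with 52% southern and 48% northern ancestry (shares from the text).
admixed = mix(south, north, share_a=0.52)

f3_admixed = f3(admixed, south, north)   # negative: admixture signal
f3_control = f3(south, admixed, north)   # positive: no such signal
print(f3_admixed, f3_control)
```

With drift-free toy frequencies like these, the statistic for the admixed target equals minus the product of the two ancestry shares times the mean squared frequency difference between the sources, which is why it is negative.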

Figure 4 Evolutionary and demographic history of Papio baboons.

(A) Analyses using F-statistics suggest that P. kindae was formed by input from both a southern clade lineage and a northern clade lineage, with contributions estimated at 52 and 48%, respectively. P. papio is inferred to have been formed by ~10% introgression from an unidentified ancient northern lineage into a population related to P. anubis. Divergence and admixture times were inferred using CoalHMM, and internal nodes representing these divergence or admixture events are labeled A through K. Our analysis of asymmetric haplotype sharing also inferred admixture from P. cynocephalus into P. anubis approximately 21 generations ago. (B) Reconstruction of baboon demographic history using PSMC methods. A long-term bottleneck is observed in the ancestry of P. papio beginning ~400 thousand years (ka) ago, while the population ancestral to P. hamadryas and P. anubis grew between ~280 and ~160 ka ago. After the split, P. anubis continued its upward trend while P. hamadryas declined. Before about 400 ka ago, the effective population size (Ne) of P. ursinus diverges from estimates for the population ancestral to P. cynocephalus and P. kindae, and P. ursinus then underwent a long-lasting, species-specific bottleneck. From approximately 300 ka ago, the Ne reconstructed for both P. cynocephalus and P. kindae increased to a maximum ~150 ka ago before a subsequent decline. PSMC methods are not always reliable for the most recent time range.

Our results also shed new light on the historical dynamics of hybridization between P. anubis (a northern clade species) and P. cynocephalus (a southern clade species), previously reported in southern Kenya near Amboseli National Park (17). Behavioral observations and microsatellite-based analyses support recent introgression of P. anubis into P. cynocephalus since the 1980s (25, 26). Our genome-wide analysis of haplotype block sharing suggests that a P. anubis individual from the Aberdare region of Kenya, more than 200 km north of Amboseli, is also admixed with P. cynocephalus, carrying ~546 Mb of nuclear DNA derived from P. cynocephalus (Figure S7). If we assume that this resulted from a single admixture event, the event is estimated to have occurred about 21 generations (~220 years) ago. However, more complex explanations are possible. A second individual from the Aberdare P. anubis population also carries P. cynocephalus haplotypes, but these shared genomic segments are fewer and shorter and are probably the result of older introgression. In line with other studies (27), our findings suggest that there have been multiple episodes of gene flow involving these two species over a considerable time, and that the effects of past hybridization extend far beyond the current hybrid zone. This complexity may well be representative of other known baboon hybrid zones (10, 12, 15, 18, 19, 28).

Motivated by the results from F-statistics and haplotype sharing, we performed two additional tests on the Papio diversity panel to explore the hypothesis of ancient admixture using independent methods. Alu insertion polymorphisms are valuable phylogenetic characters because the polarity of specific mutational changes can be clearly established for any given genomic segment (29). A haplotype that carries a novel Alu insertion is derived relative to the orthologous haplotype that lacks the Alu repeat, and reversals are rare. Majority-rule Dollo parsimony analysis of baboons using novel Alu insertions again recovered the north-south split. However, the descendant lineages are poorly resolved and show apparent homoplasy (Figure S8). In a phylogeny built from characters with well-defined polarity, such homoplasy would not be expected unless the radiating species experienced significant ILS and/or gene flow between lineages (30).

We also investigated differences in the evolutionary history of various segments of the baboon genome. We divided the reference genome into 808 discrete regions without genes (putatively neutral). Using BUCKy (31) and SNV genotypes from the diversity panel, we performed a Bayesian concordance analysis (BCA). Individual animals again group by species, as expected. The basal north-south divergence is well supported, but the concordance factors (CFs) for relationships within each of these two geographic clades are low (Figure 3). P. hamadryas is most often sister to an anubis-papio clade, but the two other possible topologies [(papio-(ham-anubis)) and (anubis-(ham-papio))] are not found in equal proportions (Figure S9), as would be expected under ILS alone. Likewise, P. kindae is most often sister to a cynocephalus-ursinus clade, matching the ML results but not the F-statistics or PoMo results. Again, the two minor BCA topologies are not found in equal proportions (Figure S9). Together, the Alu insertion and BCA results support the conclusion that reticulation, rather than ILS without reticulation, has influenced the genomic divergence of baboons (Table 1).
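The reasoning about minor topologies rests on a symmetry argument: for three lineages, ILS alone is expected to produce the two minor topologies at equal frequency, so a clear imbalance points toward gene flow. As a rough illustration only, the sketch below applies an exact two-sided binomial test to counts of regions supporting each minor topology. This is a simple frequency check, not the Bayesian procedure that BUCKy implements, and the function name and counts are hypothetical.

```python
from math import comb

def minor_topology_symmetry_p(count_1, count_2):
    """Exact two-sided binomial test of count_1 vs count_2 under p = 0.5.

    Sums the probabilities of all outcomes that are no more likely than
    the observed split. With p = 0.5 every outcome k has probability
    comb(n, k) / 2**n, so the comparison can be done on comb values.
    """
    n = count_1 + count_2
    observed = comb(n, count_1)
    tail = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return tail / 2 ** n

# Hypothetical numbers of regions supporting each of two minor topologies.
p_value = minor_topology_symmetry_p(60, 30)
print(p_value)
```

A small p-value here would indicate that the two minor topologies are not equally frequent, which is the pattern the text describes as inconsistent with ILS alone.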

Table 1 Summary of the different types of data and analytical approaches used to investigate the phylogeny of baboon species.

The dating of lineage divergence and admixture events was performed using the coalescent hidden Markov model (CoalHMM; Figures S10 to S15 and Tables S8 and S9) (32, 33). Using an estimated mutation rate of 0.9 × 10^-8 per base pair per generation and a generation time of 11 years [see Materials and Methods and (11, 34)], we obtained the results shown in Figure 4A. To reconstruct demographic history, we generated pairwise sequentially Markovian coalescent (PSMC) plots (35), assuming the generation time and mutation rate given above (Figure 4B). Except for P. papio, which has a truncated plot, the remaining five species are very similar in effective population size (Ne) before ~1.4 Ma ago, supporting the conclusion that all baboon species shared the same demographic history before that time. All Ne plots show an upward trend beginning ~1.5 Ma ago, but species-specific increases occur at different rates, probably reflecting population growth and dispersal once ecological conditions allowed demographic expansion (14). Given the paleontological evidence for a southern origin of the genus (36), we suggest that the more pronounced decline in Ne for the northern clade species relative to the southern lineages ~700,000 to 800,000 years ago may reflect barriers to dispersal as the geographic range of baboons expanded northward. Likewise, CoalHMM suggests that the north-south admixture that gave rise to P. kindae occurred about 100,000 years ago, and the PSMC results indicate an increase in Ne for P. kindae at about this time.
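The mutation rate and generation time matter because PSMC output is in scaled units that must be converted to years and to Ne. The sketch below assumes the commonly used PSMC scaling convention (N0 = theta0 / (4 × mu × bin size), time in years = 2 × N0 × t × generation time, Ne = lambda × N0) and a bin size of 100 bp. The function name and the input values are hypothetical; only the mutation rate and generation time come from the text.

```python
MU = 0.9e-8      # mutations per base pair per generation (from the text)
GEN_TIME = 11    # years per generation (from the text)

def scale_psmc(theta0, t_k, lambda_k, mu=MU, gen_time=GEN_TIME, bin_size=100):
    """Convert one scaled PSMC interval to (years before present, Ne).

    theta0   : scaled mutation rate per bin reported by the PSMC fit
    t_k      : scaled time of the interval, in units of 2 * N0 generations
    lambda_k : population size of the interval relative to N0
    bin_size : base pairs per PSMC bin (assumed to be 100 here)
    """
    n0 = theta0 / (4 * mu * bin_size)
    years = 2 * n0 * t_k * gen_time
    ne = lambda_k * n0
    return years, ne

# Hypothetical fit values, chosen so that N0 works out to 10,000.
years, ne = scale_psmc(theta0=0.036, t_k=1.0, lambda_k=2.0)
print(round(years), round(ne))
```

Because both conversions are linear in 1/mu and the time axis is also linear in the generation time, changing either assumption rescales every date and population size in the plot without changing its shape.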

To examine the possible functional implications of baboon admixture, we examined 2201 suitable gene regions (local genomic segments that each contain an annotated protein-coding gene and show sufficient phylogenetic signal to support one particular phylogenetic tree over all alternative trees). We identified individual loci exhibiting phylogenetic relationships (gene trees) that are concordant or discordant with the consensus species-level phylogeny, which separates the three northern species from the three southern species. Cluster 1 contains 1143 gene regions with phylogenies that closely match the consensus (Figure S16). Cluster 2 consists of 629 gene regions for which P. cynocephalus carries haplotypes that are not closely related to other southern clade haplotypes (Figure S17). Genes in these regions are enriched for the gene ontology (GO) terms "learning and memory" (P = 0.012), "cognition" (P = 0.012), "head development" (P = 0.014), and "brain development" (P = 0.017), as well as several GO categories related to reproduction (Table S10). Cluster 3 comprises 429 gene regions that exhibit phylogenetic relationships among the southern clade species consistent with the phylogeny in Figure 3A. However, the cluster 3 haplotypes from the northern clade species P. anubis are closely related to haplotypes from the southern clade, while the haplotypes in the northern clade species P. papio generally form a sister group to all other baboon haplotypes (Figure S18). The genes found in cluster 3 regions are enriched for GO terms related to the ontogenetic development of several organ systems (kidney, heart, circulatory, and endocrine systems, all significantly enriched at P < 0.03) (Table S10). Note that the two species showing the clearest gene tree discordance relative to the species-level phylogeny (i.e., species carrying haplotypes that have apparently crossed species boundaries) are P. anubis and P. cynocephalus, a northern clade and a southern clade species that today actively hybridize in southern Kenya (17) and show evidence of nuclear DNA swamping (12).
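The text does not say which test produced the GO enrichment P values. One common approach, shown here purely as an assumed illustration, is a one-sided hypergeometric test that asks how likely it is to see at least the observed number of annotated genes in a cluster when drawing from the full set of regions. The function name and the annotation counts are hypothetical; the totals of 2201 regions and 629 cluster 2 regions are from the text.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, annotated_genes, cluster_size, annotated_in_cluster):
    """One-sided hypergeometric P value: P(X >= annotated_in_cluster).

    X is the number of annotated genes in a cluster of cluster_size genes
    drawn without replacement from total_genes, of which annotated_genes
    carry the GO term.
    """
    upper = min(annotated_genes, cluster_size)
    tail = sum(
        comb(annotated_genes, k) * comb(total_genes - annotated_genes, cluster_size - k)
        for k in range(annotated_in_cluster, upper + 1)
    )
    return tail / comb(total_genes, cluster_size)

# Hypothetical: 40 of the 2201 regions carry a GO term, 20 of them in cluster 2.
p_value = hypergeom_enrichment_p(2201, 40, 629, 20)
print(p_value)
```

In practice such P values are usually corrected for the number of GO terms tested, a step this sketch leaves out.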

Acknowledgments: We acknowledge the contributions of the sequencing production staff of the Human Genome Sequencing Center: KA Abraham, HA Akbar, SA Ali, UA Anosike, PA Aqrawi, FA Arias, TA Attaway, RA Awwad, CB Babu, DB Bandaranaike, PB Battles, AB Bell, BB Beltran, DB Berhane-Mersha, CB Bess, CB Bickham, TB Bolden, K. Cardenas, KC Carter, M. Cavazos, A. Chandrabose, S. Chao, DC Chau, AC Chavez, R. Chu, KC Clerc-Blankenburg, A. Cockrell, MC Coyle, A. Cree, MD Dao, ML Davila, LD Davy-Carroll, SD Denson, S. Dugan, V. Ebong, S. Elkadiri, SF Fernandez, LF Forbes, G. Fowler, CF Francis, LF Francisco, QF Fu, R. Gabisi, RG Garcia, T. Garner, TG Garrett, SG Gross, SG Gubbala, K. Hawkins, BJ Hollins, LJ Jackson, MJ Javaid, JC Jayaseelan, AJ Johnson, BJ Johnson, JJ Jones, VJ Joshi, D. Kalra, JK Kalu, NK Khan, L. Kisamo, LL Lago, Y. Lai, FL Lara, T.-K. Le, F. L. Legall-Iii, S. L. Lemon, L. Lewis, J. L. Liu, Y.-S. Liu, DL Liyanage, P. London, JL Lopez, LL Lorensuhewa, E. Martinez, RM Mata, TM Mathew, T. Matskevitch, CM Mercado, IM Mercado, MM Morgan, MM Munidasa, DN Ngo, LN Nguyen, P. Nguyen, TN Nguyen, NN Nguyen, M. Nwaokelemeh, MO Obregon, GO Okwuonu, FO Ongeri, CO Onwere, IO Osifeso, AP Parra, SP Patil, AP Perez, E. Primus, L.-L. Pu, M. P. Puazo, J. Q. Quiroz, S. Richards, J. R. Rouhana, M. R. Ruiz, S.-J. Ruiz, N. S. Saada, J. S. Santibanez, M. S. Scheel, S. Scherer, B. S. Schneider, D. S. Simmons, I. S. Sisson, E. S. Skinner, N. Tabassum, L.-Y. Tang, A. Taylor, RT Thornton, JT Tisius, GT Toledanes, ZT Trejos, KU Usmani, RV Varghese, SV Vattathil, VV Vee, DW Walker, GW Weissenberger, CW White, K. Wilczek-Boney, AW Williams, K. Wilson, I. Woghiren, JW Woodworth, RW Wright, Y.-Q. Wu, Y. Xin, Y. Zhang, Y. Z. Zhu, and X. Zou. Biomaterials for DNA sequencing of the P. anubis reference baboon and several baboons of the diversity panel were provided by the Southwest National Primate Research Center in San Antonio, TX, which is supported by a grant from the NIH Office of Research Infrastructure Programs (P51-OD011133).
The research presented here complied with government regulations and guidelines and with those of the IACUC. J.R. is also affiliated with the Wisconsin National Primate Research Center, Madison, WI. C.K. is also affiliated with the Institute for Population Genetics, Vetmeduni Vienna, Austria, and D.S. is now affiliated with Eötvös Loránd University, Budapest, and Max Perutz Laboratories, Vienna.

Funding: Sequencing and analysis activity at the Human Genome Sequencing Center, Baylor College of Medicine, was supported by NIH (NHGRI) grants U54-HG003273 and U54-HG006484 to R.A.G. and grant GAC 1 S10 RR026605 to J. G. Reid. This research was also supported by NIH grant R01-GM59290 to M.A.B.; grants from the Austrian Science Fund (FWF-P24551 and FWF-W1225) and the Vienna Science and Technology Fund (WWTF-MA16-061) to C.K.; Wellcome Trust grant WT108749/Z/15/Z and EMBL to B.A., F.J.M., and M.M.; VEGA 1/0719/14 and APVV-14-0253 to T. Vinar (consortium member); a MINECO/FEDER grant, NIH grant U01-MH106874, a Howard Hughes International Early Career Award, and Obra Social "La Caixa" to T.M.-B.; NSF grant BNS83-03506 to J.P.-C.; NSF1029302 to J.P.-C., J.R., and C.J.J.; BNS96-15150 to J.P.-C., C.J.J., and T.D.; and National Geographic Society and Leakey Foundation grants to J.P.-C. and C.J.J. E.E.E. is an investigator of the Howard Hughes Medical Institute. This work was supported in part by US NIH grant HG002385 to E.E.E.

Competing interests: The authors declare that they have no competing interests.

Data and materials availability: Raw sequence read data, sample metadata, and other information about this genome assembly project are available from BioProject PRJNA260523 at www.ncbi.nlm.nih.gov. Further information on the RNA sequencing data is available from the Nonhuman Primate Reference Transcriptome (http://nhprtr.org/).
More information about SNV and indel variants is available as a track in the UCSC browser (https://hgsc.bcm.edu/non-human-primates/baboon-genome-project). Additional information regarding this paper may be requested from the authors.

Full membership of the Baboon Genome Analysis Consortium: Bronwen Aken1, Nicoletta Archidiacono2, Georgios Athanasiadis3, Mark A. Batzer4, Thomas O. Beckstrom4, Christina Bergey5,6, Konstantinos Billis1, Andrew Burrell5, Oronzo Capozzi2, Claudia R. Catacchio2, Jade Cheng3, Laura A. Cox7,8, Huyen H. Dinh9, Todd Disotell5, HarshaVardhan Doddapaneni9, Evan E. Eichler10,11, James Else12, Richard A. Gibbs9,13, Matthew W. Hahn14, Yi Han9, R. Alan Harris9,13, John Huddleston10, Shalini N. Jhangiani9, Clifford J. Jolly5, Vallmer E. Jordan4, Anis Karimpour-Fard15, Miriam K. Konkel32, Gisela H. Kopp16,17, Viktoriya Korchina9, Carolin Kosiol18, Maximillian Kothe19, Christie L. Kovar9, Lukas Kuderna20, Sandra L. Lee9, Kalle Leppälä3, Xiaoming Liu21, Yue Liu9, Thomas Mailund3, Tomas Marques-Bonet20,22,23,33, Alessia Marra-Campanale2, Fergal J. Martin1, Christopher E. Mason24, Marc de Manuel Montero20, Matthieu Muffato1, Kasper Munch3, Shwetha Murali9, Donna M. Muzny9,13, Angela Noll19, Kymberleigh A. Pagel25, Antonio Palazzo2, Jera Pecotte7, Vikas Pejaver25, Jane Phillips-Conroy26, Lenore Pipes24, Veronica Searles Quick15, Predrag Radivojac25, Archana Raja10, Brian J. Raney27, Muthuswamy Raveendran9, Karen Rice7, Mariano Rocchi2, Jeffrey Rogers9,13, Christian Roos19, Mikkel Heide Schierup3, Dominik Schrempf28, James M. Sikela15, Roscoe Stanyon29, Cody J. Steely4, Gregg W. C. Thomas14, Jenny Tung30, Mario Ventura2, Tauras P. Vilgalys30, Tomáš Vinar31, Jerilyn A. Walker4, Lutz Walter19, Kim C. Worley9,13, and Dietmar Zinner16.

1 European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK.
2 Department of Biology, University of Bari, Bari, Italy.
3 Bioinformatics Research Center, Aarhus University, Aarhus, Denmark.
4 Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA.
5 Department of Anthropology, New York University, New York, NY, USA.
6 Department of Biological Sciences, University of Notre Dame, South Bend, IN, USA.
7 Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA.
8 Department of Genetics, Texas Biomedical Research Institute, San Antonio, TX, USA.
9 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
10 Department of Genome Sciences, University of Washington, Seattle, WA, USA.
11 Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
12 Department of Pathology and Laboratory Medicine and Yerkes Primate Research Center, Emory University, Atlanta, GA, USA.
13 Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
14 Department of Biology, Indiana University, Bloomington, IN, USA.
15 Institute of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Denver, CO, USA.
16 Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany.
17 Department of Biology, University of Konstanz, Konstanz, Germany.
18 Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews, UK.
19 Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany.
20 Institute of Evolutionary Biology (UPF-CSIC), PRBB, Barcelona, Spain.
21 School of Public Health, University of Texas Health Science Center, Houston, TX, USA.
22 Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain.
23 CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain.
24 Department of Physiology and Biophysics, Weill Cornell Medical College, New York, NY, USA, and The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA.
25 Department of Informatics and Computer Science, Indiana University, Bloomington, IN, USA.
26 Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, USA, and Department of Anthropology, Washington University, St. Louis, MO, USA.
27 Genomics Institute, University of California, Santa Cruz, CA, USA.
28 Institute for Population Genetics, Veterinary Medical University of Vienna, Vienna, Austria.
29 Department of Biology, University of Florence, Florence, Italy.
30 Department of Evolutionary Anthropology, Duke University, Durham, NC, USA.
31 Faculty of Mathematics, Physics and Informatics, Comenius University, Bratislava, Slovakia.
32 Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA.
33 Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain.

Contributions of consortium members: Conceived the study and supervised the analysis: J. Rogers*, K. C. Worley, and R. A. Gibbs. Managed or supervised sequence production: D. M. Muzny*, C. L. Kovar, H. H. Dinh, and Y. Han. Managed or supervised the preparation of sequencing libraries: H. Doddapaneni*, S. Lee, and D. M. Muzny. Produced by: K. C. Worley*, Y. Liu, S. Murali, and R. A. Harris. Project and data reporting: D. M. Muzny*, M. Raveendran, R. A. Harris, K. C. Worley, S. N. Jhangiani, V. Korchina, and C. Kovar. Genome annotation: B. Aken*, F. J. Martin, M. Muffato, K. Billis, and X. Liu. Alu repeat analysis: M. A. Batzer*, J. A. Walker, M. K. Konkel, V. E. Jordan, C. J. Steely, and T. O. Beckstrom. SNV and indel analysis: R. A. Harris and M. Raveendran. Direct and phylogenetic analysis: T. Mailund, M. H. Schierup, K. Leppälä, J. Cheng, K. Munch, and G. Athanasiadis. Phylogenetic and population analysis: C. Bergey, A. Burrell, A. Noll, D. Schrempf, C. Kosiol, G. H. Kopp, G. Athanasiadis, K. Munch, J. Phillips-Conroy, M. Kothe, J. Tung, J. Rogers, C. J. Jolly, D. Zinner, and C. Roos. Cytogenetics and assembly validation: M. Rocchi*, R. Stanyon, E. Eichler, N. Archidiacono, A. Palazzo, and O. Capozzi. Gene family analyses: M. W. Hahn*, J. Sikela*, G. W. C. Thomas, V. Searles Quick, A. Karimpour-Fard, and L. Walter. Methylation analysis: J. Tung* and T. P. Vilgalys. Positive selection analysis: C. Kosiol*, T. Vinar*, and B. J. Raney. Post-translational modifications: P. Radivojac*, K. A. Pagel, and V. Pejaver. Segmental duplication analysis: E. Eichler*, M. Ventura, A. Raja, C. Catacchio, A. Marra-Campanale, and J. Huddleston. Copy number variants: T. Marques-Bonet*, L. Kuderna, and M. de Manuel Montero. Transcriptomic analysis: C. E. Mason* and L. Pipes. Provided essential biomaterials: K. Rice, J. Pecotte, J. Phillips-Conroy, C. J. Jolly, J. Rogers, J. Else, and L. A. Cox. Provided text and/or data: D. Zinner, C. Roos, T. Mailund, K. Leppälä, E. Eichler, G. Athanasiadis, J. Cheng, K. Munch, C. Kosiol, C. Bergey, J. A. Walker, M. A. Batzer, and J. Tung. Wrote the paper: J. Rogers*, C. J. Jolly, J. Tung, M. Hahn, D. Zinner, C. Roos, T. Marques-Bonet, and K. C. Worley. *Group leader.
