273 research outputs found

    Breed Relationships Facilitate Fine-Mapping Studies: A 7.8-kb Deletion Cosegregates With Collie Eye Anomaly Across Multiple Dog Breeds

    The features of modern dog breeds that increase the ease of mapping common diseases, such as reduced heterogeneity and extensive linkage disequilibrium, may also increase the difficulty associated with fine mapping and identifying causative mutations. One way to address this problem is by combining data from multiple breeds segregating the same trait after initial linkage has been determined. The multibreed approach increases the number of potentially informative recombination events and reduces the size of the critical haplotype by taking advantage of shortened linkage disequilibrium distances found across breeds. In order to identify breeds that likely share a trait inherited from the same ancestral source, we have used cluster analysis to divide 132 breeds of dog into five primary breed groups. We then use the multibreed approach to fine-map Collie eye anomaly (cea), a complex disorder of ocular development that was initially mapped to a 3.9-cM region on canine chromosome 37. Combined genotypes from affected individuals from four breeds of a single breed group significantly narrowed the candidate gene region to a 103-kb interval spanning only four genes. Sequence analysis revealed that all affected dogs share a homozygous deletion of 7.8 kb in the NHEJ1 gene. This intronic deletion spans a highly conserved binding domain to which several developmentally important proteins bind. This work both establishes that the primary cea mutation arose as a single disease allele in a common ancestor of herding breeds and highlights the value of comparative population analysis for refining regions of linkage.

    Survey sequencing and radiation hybrid mapping to construct comparative maps.

    In Murphy WJ (ed.), Phylogenomics, Humana Press (Methods in Molecular Biology, 422). Radiation hybrid (RH) mapping has become one of the most well-established techniques for economically and efficiently navigating genomes of interest. The success of the technique relies on random chromosome breakage of a target genome, which is then captured by recipient cells missing a preselected marker. Selection for hybrid cells that have DNA fragments bearing the marker of choice, plus a random set of DNA fragments from the initial irradiation, generates a set of cell lines that recapitulates the genome of the target organism several-fold. Markers or genes of interest are analyzed by PCR using DNA isolated from each cell line. Statistical tools are applied to determine both the linear order of markers on each chromosome and the confidence of each placement. The resolution of the resulting map relies on many factors, most notably the degree of breakage from the initial radiation as well as the number of hybrid clones and mean retention value. A high-resolution RH map of a genome derived from low-pass or survey sequencing (coverage from 1 to 2 times) can provide essentially the same comparative data on gene order that is derived from high-coverage (greater than 7x) genome sequencing. When combined with fluorescence in situ hybridization, RH maps are complete and ordered blueprints for each chromosome. They give information about the relative order and spacing of genes and markers, and allow investigators to move between target and reference genomes, such as those of mouse or human, with ease, although the approach is not limited to mammalian genomes.
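
The marker-ordering idea in this abstract can be illustrated with a toy sketch. It uses the minimum obligate breaks criterion, one common way to order RH markers, and is not claimed to be the statistical procedure used in the chapter. The marker names and retention vectors are hypothetical, and the brute-force search is only practical for a handful of markers.

```python
from itertools import permutations

# Hypothetical retention vectors (not data from the chapter): one entry per
# hybrid cell line, 1 = marker detected by PCR in that line, 0 = not detected.
retention = {
    "A": [1, 1, 0, 0, 1, 0, 1, 0],
    "B": [1, 1, 0, 0, 1, 0, 0, 0],
    "C": [1, 0, 0, 1, 1, 0, 0, 0],
    "D": [0, 0, 0, 1, 1, 1, 0, 0],
}


def retention_frequency(vector):
    """Fraction of hybrid cell lines that retained the marker."""
    return sum(vector) / len(vector)


def obligate_breaks(order, data):
    """Count cell lines in which adjacent markers differ, summed along the order."""
    breaks = 0
    for left, right in zip(order, order[1:]):
        breaks += sum(a != b for a, b in zip(data[left], data[right]))
    return breaks


def best_order(data):
    """Brute-force the marker order with the fewest obligate breaks."""
    # An order and its reverse describe the same map, so keep one orientation.
    candidates = (p for p in permutations(data) if p[0] < p[-1])
    return min(candidates, key=lambda p: obligate_breaks(p, data))


if __name__ == "__main__":
    for name, vector in retention.items():
        print(name, retention_frequency(vector))
    order = best_order(retention)
    print(order, obligate_breaks(order, retention))
```

Markers that are close together on a chromosome are rarely separated by a radiation-induced break, so their retention patterns agree in most cell lines. That is why the order with the fewest disagreements between neighbours is a plausible map.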

    Transcriptomic evidence for modulation of host inflammatory responses during febrile Plasmodium falciparum malaria

    Identifying molecular predictors and mechanisms of malaria disease is important for understanding how Plasmodium falciparum malaria is controlled. Transcriptomic studies in humans have so far been limited to retrospective analysis of blood samples from clinical cases. In this prospective, proof-of-principle study, we compared whole-blood RNA-seq profiles at pre- and post-infection time points from Malian adults who were either asymptomatic (n = 5) or febrile (n = 3) during their first seasonal PCR-positive P. falciparum infection with those from malaria-naïve Dutch adults after a single controlled human malaria infection (n = 5). Our data show a graded activation of pathways downstream of pro-inflammatory cytokines, with the highest activation in malaria-naïve Dutch individuals and significantly reduced activation in malaria-experienced Malians. Newly febrile and asymptomatic infections in Malians were statistically indistinguishable except for genes activated by pro-inflammatory cytokines. The combined data provide a molecular basis for the development of a pyrogenic threshold as individuals acquire immunity to clinical malaria.

    Origins of domestic dog in Southern East Asia is supported by analysis of Y-chromosome DNA

    Global mitochondrial DNA (mtDNA) data indicate that the dog originates from domestication of wolf in Asia South of Yangtze River (ASY), with minor genetic contributions from dog–wolf hybridisation elsewhere. Archaeological data and autosomal single nucleotide polymorphism data have instead suggested that dogs originate from Europe and/or South West Asia but, because these datasets lack data from ASY, evidence pointing to ASY may have been overlooked. Analyses of additional markers for global datasets, including ASY, are therefore necessary to test if mtDNA phylogeography reflects the actual dog history and not merely stochastic events or selection. Here, we analyse 14 437 bp of Y-chromosome DNA sequence in 151 dogs sampled worldwide. We found 28 haplotypes distributed in five haplogroups. Two haplogroups were universally shared and included three haplotypes carried by 46% of all dogs, but two other haplogroups were primarily restricted to East Asia. The highest genetic diversity and virtually complete phylogenetic coverage were found within ASY. The 151 dogs were estimated to originate from 13–24 wolf founders, but there was no indication of post-domestication dog–wolf hybridisations. Thus, Y-chromosome and mtDNA data give strikingly similar pictures of dog phylogeography, most importantly that roughly 50% of the gene pools are shared universally but only ASY has nearly the full range of genetic diversity, such that the gene pools in all other regions may derive from ASY. This corroborates that ASY was the principal, and possibly sole, region of wolf domestication, that a large number of wolves were domesticated, and that subsequent dog–wolf hybridisation contributed modestly to the dog gene pool.

    Ethological principles predict the neuropeptides co-opted to influence parenting

    Ethologists predicted that parental care evolves by modifying behavioural precursors in the asocial ancestor. As a corollary, we predict that the evolved mechanistic changes reside in genetic pathways underlying these traits. Here we test our hypothesis in female burying beetles, Nicrophorus vespilloides, an insect where caring adults regurgitate food to begging, dependent offspring. We quantify neuropeptide abundance in brains collected from three behavioural states (solitary virgins, individuals actively parenting, and post-parenting solitary adults), measuring 133 peptides belonging to 18 neuropeptides. Eight neuropeptides differ in abundance in one or more states, with increased abundance during parenting in seven. None of these eight neuropeptides have been associated with parental care previously, but all have roles in predicted behavioural precursors for parenting. Our study supports the hypothesis that predictable traits and pathways are targets of selection during the evolution of parenting and suggests additional candidate neuropeptides to study in the context of parenting.

    Computational Biology Methods and Their Application to the Comparative Genomics of Endocellular Symbiotic Bacteria of Insects

    Comparative genomics has become a truly tantalizing challenge in the postgenomic era. This challenge has been magnified by the plethora of new genomes becoming available on a daily basis. The overwhelming list of new genomes to compare has pushed the field of bioinformatics and computational biology toward the design and development of methods capable of identifying patterns amid overwhelming data noise. Despite many advances in this endeavor, persistent exceptions to the general patterns continue to pose difficulties in generalizing methods for comparative genomics. In this review, we discuss the different tools devised to undertake the challenge of comparative genomics and some of the exceptions that compromise the generality of such methods. We focus on endosymbiotic bacteria of insects because of the peculiarities of their genome dynamics when compared to free-living organisms.

    Metatranscriptomics Reveals the Diversity of Genes Expressed by Eukaryotes in Forest Soils

    Eukaryotic organisms play essential roles in the biology and fertility of soils. For example, the micro- and mesofauna contribute to the fragmentation and homogenization of plant organic matter, while its hydrolysis is primarily performed by the fungi. To get a global picture of the activities carried out by soil eukaryotes, we sequenced 2×10,000 cDNAs synthesized from polyadenylated mRNA directly extracted from soils sampled in beech (Fagus sylvatica) and spruce (Picea abies) forests. Taxonomic affiliation of both cDNAs and 18S rRNA sequences showed a dominance of sequences from fungi (up to 60%) and metazoans, while protists represented less than 12% of the 18S rRNA sequences. Sixty percent of cDNA sequences from beech forest soil and 52% from spruce forest soil had no homologs in the GenBank/EMBL/DDBJ protein database. A Gene Ontology term was attributed to 39% and 31.5% of the spruce and beech soil sequences, respectively. Altogether, 2076 sequences were putative homologs to different enzyme classes participating in 129 KEGG pathways, among which several were implicated in the utilisation of soil nutrients such as nitrogen (ammonium, amino acids, oligopeptides), sugars, phosphates and sulfate. Specific annotation of plant cell wall degrading enzymes identified enzymes active on major polymers (cellulose, hemicelluloses, pectin, lignin), and glycoside hydrolases represented 0.5% (beech soil)–0.8% (spruce soil) of the cDNAs. Other sequences encoding enzymes active on organic matter (extracellular proteases, lipases, a phytase, P450 monooxygenases) were identified, thus underlining the biotechnological potential of eukaryotic metatranscriptomes. The phylogenetic affiliation of 12 full-length carbohydrate active enzymes showed that most of them were distantly related to sequences from known fungi. For example, a putative GH45 endocellulase was closely associated with molluscan sequences, while a GH7 cellobiohydrolase was closest to crustacean sequences, thus suggesting a potentially significant contribution of non-fungal eukaryotes to the actual hydrolysis of soil organic matter.

    The Diploid Genome Sequence of an Individual Human

    Presented here is a genome sequence of an individual human. It was produced from ∼32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels) (1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
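
The "more than 4.1 million" and "22%" figures can be cross-checked from the itemized counts quoted in this abstract. The short sketch below does that arithmetic. It assumes the five enumerated categories make up the event total. The segmental duplications and copy number variation regions are described only as "numerous", so they are left out. The variable and function names are illustrative.

```python
# Variant counts as quoted in the abstract; categories given only as
# "numerous" (segmental duplications, CNV regions) are not included.
variant_counts = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}


def non_snp_share(counts):
    """Return (total events, non-SNP events, non-SNP fraction of events)."""
    total = sum(counts.values())
    non_snp = total - counts["SNPs"]
    return total, non_snp, non_snp / total


total, non_snp, share = non_snp_share(variant_counts)
print(f"total events:   {total:,}")
print(f"non-SNP events: {non_snp:,} ({share:.1%} of events)")
```

Under that assumption the enumerated events sum to 4,118,889, of which 905,488 are not SNPs. This is consistent with the abstract's statement that non-SNP variation is about 22% of events.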

    Repetitive Elements May Comprise Over Two-Thirds of the Human Genome

    Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo “clouds”). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%–69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (∼25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed “element-specific” P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ∼100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed