51 research outputs found

    Molecular Evolution of Drosophila Cuticular Protein Genes

    Several multigene families have been described that together encode scores of structural cuticular proteins in Drosophila, although the functional significance of this diversity remains to be explored. Here I investigate the evolutionary histories of several multigene families (CPR, Tweedle, CPLCG, and CPF/CPFL) that vary in age, size, and sequence complexity, using sequenced Drosophila genomes and mosquito outgroups. My objective is to describe the rates and mechanisms of ‘cuticle-ome’ divergence, in order to identify conserved and rapidly evolving elements. I also investigate potential examples of interlocus gene conversion and concerted evolution within these families during Drosophila evolution. The absolute rate of change in gene number (per million years) is an order of magnitude lower for cuticular protein families within Drosophila than it is among Drosophila and the two mosquito taxa, implying that major transitions in the cuticle proteome have occurred at higher taxonomic levels. Several hotspots of intergenic conversion and/or gene turnover were identified, e.g. some gene pairs have independently undergone intergenic conversion within different lineages. Some gene conversion hotspots were characterized by conversion tracts initiating near nucleotide repeats within coding regions, and similar repeats were found within concertedly evolving cuticular protein genes in Anopheles gambiae. Rates of amino-acid substitution were generally severalfold higher along the branch connecting the Sophophora and Drosophila species groups, and 13 genes have Ka/Ks significantly greater than one along this branch, indicating adaptive divergence. Insect cuticular proteins appear to be a source of adaptive evolution within genera and, at higher taxonomic levels, subject to periods of gene-family expansion and contraction followed by quiescence. 
    However, this relative stasis is belied by hotspots of molecular evolution, particularly concerted evolution, during the diversification of Drosophila. The prominent association between interlocus gene conversion and repeats within the coding sequence of interacting genes suggests that the latter promote strand exchange.

    Annotation and analysis of a large cuticular protein family with the R&R Consensus in Anopheles gambiae

    Background: The most abundant family of insect cuticular proteins, the CPR family, is recognized by the R&R Consensus, a domain of about 64 amino acids that binds to chitin and is present throughout arthropods. Several species have now been shown to have more than 100 CPR genes, inviting speculation as to the functional importance of this large number and diversity. Results: We have identified 156 genes in Anopheles gambiae that code for putative cuticular proteins in this CPR family, over 1% of the total number of predicted genes in this species. Annotation was verified using several criteria including identification of TATA boxes, INRs, and DPEs plus support from proteomic and gene expression analyses. Two previously recognized CPR classes, RR-1 and RR-2, form separate, well-supported clades with the exception of a small set of genes with long branches whose relationships are poorly resolved. Several of these outliers have clear orthologs in other species. Although both clades are under purifying selection, the RR-1 variant of the R&R Consensus is evolving at twice the rate of the RR-2 variant and is structurally more labile. In contrast, the regions flanking the R&R Consensus have diversified in amino-acid composition to a much greater extent in RR-2 genes compared with RR-1 genes. Many genes are found in compact tandem arrays that may include similar or dissimilar genes but always include just one of the two classes. Tandem arrays of RR-2 genes frequently contain subsets of genes coding for highly similar proteins (sequence clusters). Properties of the proteins indicated that each cluster may serve a distinct function in the cuticle. Conclusion: The complete annotation of this large gene family provides insight on the mechanisms of gene family evolution and clues about the need for so many CPR genes. These data also should assist annotation of other Anopheles genes.

    The Distribution of GYR- and YLP-Like Motifs in Drosophila Suggests a General Role in Cuticle Assembly and Other Protein-Protein Interactions

    Background: Arthropod cuticle is composed predominantly of a self-assembling matrix of chitin and protein. Genes encoding structural cuticular proteins are remarkably abundant in arthropod genomes, yet there has been no systematic survey of conserved motifs across cuticular protein families. Methodology/Principal Findings: Two short sequence motifs with conserved tyrosines were identified in Drosophila cuticular proteins that were similar to the GYR and YLP InterPro domains. These motifs were found in members of the CPR, Tweedle, CPF/CPFL, and (in Anopheles gambiae) CPLCG cuticular protein families, and the Dusky/Miniature family of cuticle-associated proteins. Tweedle proteins have a characteristic motif architecture that is shared with the Drosophila protein GCR1 and its orthologs in other species, suggesting that GCR1 is also cuticular. A resilin repeat, which has been shown to confer elasticity, matched one of the motifs; a number of other Drosophila proteins of unknown function exhibit a motif architecture similar to that of resilin. The motifs were also present in some proteins of the peritrophic matrix and the eggshell, suggesting molecular convergence among distinct extracellular matrices. More surprisingly, gene regulation, development, and proteolysis were statistically over-represented ontology terms for all non-cuticular matches in Drosophila. Searches against other arthropod genomes indicate that the motifs are taxonomically widespread. Conclusions: This survey suggests a more general definition for GYR and YLP motifs and reveals their contribution to severa

    Taxonomic Characterization of Honey Bee (Apis mellifera) Pollen Foraging Based on Non-Overlapping Paired-End Sequencing of Nuclear Ribosomal Loci.

    Identifying plant taxa that honey bees (Apis mellifera) forage upon is of great apicultural interest, but traditional methods are labor intensive and may lack resolution. Here we evaluate a high-throughput genetic barcoding approach to characterize trap-collected pollen from multiple North Dakota apiaries across multiple years. We used the Illumina MiSeq platform to generate sequence scaffolds from non-overlapping 300-bp paired-end sequencing reads of the ribosomal internal transcribed spacers (ITS). Full-length sequence scaffolds represented ~530 bp of ITS sequence after adapter trimming, drawn from the 5' of ITS1 and the 3' of ITS2, while skipping the uninformative 5.8S region. Operational taxonomic units (OTUs) were picked from scaffolds clustered at 97% identity, searched by BLAST against the nt database, and given taxonomic assignments using the paired-read lowest common ancestor approach. Taxonomic assignments and quantitative patterns were consistent with known plant distributions, phenology, and observational reports of pollen foraging, but revealed an unexpected contribution from non-crop graminoids and wetland plants. The mean number of plant species assignments per sample was 23.0 (+/- 5.5) and the mean species diversity (effective number of equally abundant species) was 3.3 (+/- 1.2). Bray-Curtis similarities showed good agreement among samples from the same apiary and sampling date. Rarefaction plots indicated that fewer than 50,000 reads are typically needed to characterize pollen samples of this complexity. Our results show that a pre-compiled, curated reference database is not essential for genus-level assignments, but species-level assignments are hindered by database gaps, reference length variation, and probable errors in the taxonomic assignment, requiring post-hoc evaluation. 
    Although the effective per-sample yield achieved using custom MiSeq amplicon primers was less than the machine maximum, primarily due to lower "read2" quality, further protocol optimization and/or a modest reduction in multiplex scale should offset this difficulty. As small quantities of pollen are sufficient for amplification, our approach might be extendable to other questions or species for which large pollen samples are not available.
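
The "effective number of equally abundant species" reported in this abstract can be illustrated with a small sketch. The abstract does not say which diversity order was used, so the version below assumes the Hill number of order 1 (the exponential of Shannon entropy); the function name `effective_species` and the example counts are hypothetical and are not taken from the study.

```python
import math


def effective_species(counts):
    """Effective number of equally abundant species.

    Assumption: Hill number of order 1, i.e. exp(Shannon entropy)
    of the relative abundances. The abstract does not state the order.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    # Zero counts contribute nothing to the entropy, so skip them.
    props = [c / total for c in counts if c > 0]
    shannon = -sum(p * math.log(p) for p in props)
    return math.exp(shannon)


if __name__ == "__main__":
    # Hypothetical read counts per plant species in one pollen sample.
    sample = [5000, 3000, 1500, 400, 100]
    print(round(effective_species(sample), 2))
```

Under this definition a sample dominated by a few taxa has an effective number well below its raw species count, which is the kind of gap the abstract describes between about 23 species assignments and a diversity of about 3.3.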

    Contrasting genetic structure of adults and progeny in a Louisiana iris hybrid population

    Studies of natural hybridization have suggested that it may be a creative stimulus for adaptive evolution and speciation. An important step in this process is the establishment of fit recombinant genotypes that are buffered from subsequent recombination with unlike genotypes. We used molecular markers and a two-generation sampling strategy to infer the extent of recombination in a Louisiana iris hybrid zone consisting predominantly of Iris fulva-type floral phenotypes. Genotypic diversity was fairly high, indicating that sexual reproduction is frequent relative to clonal reproduction. However, we observed strong spatial genetic structure even after controlling for clonality, which implies a low level of pollen and seed dispersal. We therefore used cluster analysis to explore the hypothesis that the fulva-type hybrids are an admixture of groups between which there has been limited recombination. Our results indicate that several such groups are present in the population and are strongly localized spatially. This spatial pattern is not attributable strictly to a lack of mating opportunities between dissimilar genotypes for two reasons: (1) relatedness of flowering pairs was uncorrelated with the degree of overlap in flowering, and (2) paternity analysis shows that pollen movement among the outcross fraction occurred over large distances, with roughly half of all paternity attributed to pollen flow from outside the population. We also found evidence of strong inbreeding depression, indicated by contrasting estimates of the rate of self-fertilization and the average inbreeding coefficient of fulva-type hybrids. We conclude that groups of similar hybrid genotypes can be buffered from recombination at small spatial scales relative to pollen flow, and selection against certain recombinant genotypes may be as important as or more important than clonal reproduction and inbreeding.

    Phylogeographic Genetic Diversity in the White Sucker Hepatitis B Virus across the Great Lakes Region and Alberta, Canada

    Hepatitis B viruses belong to a family of circular, double-stranded DNA viruses that infect a range of organisms, with host responses that vary from mild infection to chronic infection and cancer. The white sucker hepatitis B virus (WSHBV) was first described in the white sucker (Catostomus commersonii), a freshwater teleost, and belongs to the genus Parahepadnavirus. At present, the host range of WSHBV and its impact on fish health are unknown, and neither genetic diversity nor association with fish health have been studied in any parahepadnavirus. Given the relevance of genomic diversity to disease outcome for the orthohepadnaviruses, we sought to characterize genomic variation in WSHBV and determine how it is structured among watersheds. We identified WSHBV-positive white sucker inhabiting tributaries of Lake Michigan, Lake Superior, Lake Erie (USA), and Lake Athabasca (Canada). Copy number in plasma and in liver tissue was estimated via qPCR. Templates from 27 virus-positive fish were amplified and sequenced using a primer-specific, circular long-range amplification method coupled with amplicon sequencing on the Illumina MiSeq. Phylogenetic analysis of the WSHBV genome identified phylogeographical clustering reminiscent of that observed with human hepatitis B virus genotypes. Notably, most non-synonymous substitutions were found to cluster in the pre-S/spacer overlap region, which is relevant for both viral entry and replication. The observed predominance of p1/s3 mutations in this region is indicative of adaptive change in the polymerase open reading frame (ORF), while, at the same time, the surface ORF is under purifying selection. Although the levels of variation we observed do not meet the criteria used to define sub/genotypes of human and avian hepadnaviruses, we identified geographically associated genome variation in the pre-S and spacer domain sufficient to define five WSHBV haplotypes. 
    This study of WSHBV genetic diversity should facilitate the development of molecular markers for future identification of genotypes and provide evidence in future investigations of possible differential disease outcomes.

    Risk of Institutionalization Among Community Long-Term Care Clients With Dementia

    South Carolina Community Long-Term Care (CLTC) data were used to identify factors increasing the risk of institutionalization in people with dementia. Clients diagnosed with dementia and observed at least twice between June 1993 and December 1994 (N = 786) were studied. Logistic regression determined that clients with a decline in ADL function who were white, had a nonrelative or child as a caregiver, and were diagnosed with Alzheimer's disease were at increased risk of institutionalization. Identifying CLTC clients at increased risk of institutionalization could be useful in designing additional interventions to prevent institutionalization or in planning the transition to institutional care.

    Box plots of pairwise Bray-Curtis dissimilarities computed with Megan v. 5.10.2 [52].

    The range of possible values is 0–1. Only 2009 samples were compared as relatively few 2010 biological replicates were available and a year-effect on phenology is possible. Samples collected from different hives but from the same date and apiary were more similar than samples collected from different dates and apiaries. This indicates that variation among biological replicates is lower than spatial or temporal variation in foraging.
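
As a rough illustration of the statistic named in this caption, the standard Bray-Curtis dissimilarity between two count vectors can be sketched as follows. The function name `bray_curtis` and the example vectors are hypothetical, and any normalization that Megan applies to counts before the comparison is not reproduced here.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two non-negative count vectors.

    Returns a value between 0 (identical composition) and 1 (no shared taxa).
    Both vectors must list the same taxa in the same order.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    if denominator == 0:
        raise ValueError("at least one count must be non-zero")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical per-taxon read counts for two hives at one apiary and date.
    hive_1 = [120, 30, 0, 50]
    hive_2 = [100, 40, 10, 50]
    print(round(bray_curtis(hive_1, hive_2), 3))
```

Because the denominator is the total count across both samples, the result stays within the 0–1 range stated in the caption.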

    Relative read counts summed across all OTUs for each plant taxonomic assignment.

    The counts are represented in phylogenetic context with the size of the circle at each node proportional to the (square-root transformed) counts per million (cpm) reads across all samples combined. Assignments were made at five taxonomic ranks as described in the methods. Counts for interior nodes reflect progressively greater taxonomic uncertainty in the underlying OTUs; they do not represent sums of counts attributed to the leaves of that node.
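
The scaling described in this caption (counts per million reads, then a square-root transform) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `sqrt_cpm`, the taxon labels, and the counts are hypothetical, and how the resulting value maps to circle radius or area in the original figure is not specified in the caption.

```python
import math


def sqrt_cpm(taxon_counts):
    """Map raw read counts per taxonomic assignment to sqrt(counts per million).

    taxon_counts: dict of assignment label -> raw read count, where each
    node (leaf or interior) carries its own count, not a sum of its leaves.
    """
    total = sum(taxon_counts.values())
    if total <= 0:
        raise ValueError("total read count must be positive")
    return {
        taxon: math.sqrt(count * 1_000_000 / total)
        for taxon, count in taxon_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical counts pooled across all samples.
    counts = {"Fabaceae": 600, "Melilotus": 300, "Poaceae": 100}
    for taxon, size in sqrt_cpm(counts).items():
        print(taxon, round(size, 1))
```

The square-root transform compresses the range, so abundant assignments do not visually swamp rare ones.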