637 research outputs found
SyntenyTracker: a tool for defining homologous synteny blocks using radiation hybrid maps and whole-genome sequence
Background: The recent availability of genomic sequences and BAC libraries for a large number of mammals provides an excellent opportunity for identifying comparatively-anchored markers that are useful for creating high-resolution radiation-hybrid (RH) and BAC-based comparative maps. To use these maps for multispecies genome comparison and evolutionary inference, robust bioinformatic tools are required to identify chromosomal regions shared between genomes and to localize the positions of evolutionary breakpoints that are the signatures of chromosomal rearrangements. Here we report an automated tool for the identification of homologous synteny blocks (HSBs) between genomes that tolerates errors common in RH comparative maps and can be used for automated whole-genome analysis of chromosome rearrangements that occur during evolution. Findings: We developed an algorithm and software tool (SyntenyTracker) that can be used for automated definition of HSBs using pair-wise RH or gene-based comparative maps as input. To verify correct implementation of the underlying algorithm, SyntenyTracker was used to identify HSBs in the cattle and human genomes. Results demonstrated 96% agreement with HSBs defined manually using the same set of rules. A comparison of SyntenyTracker with the AutoGRAPH synteny tool was performed using identical datasets containing 14,380 genes with 1:1 orthology in human and mouse. Discrepancies between the results using the two tools and advantages of SyntenyTracker are reported. Conclusion: SyntenyTracker was shown to be an efficient and accurate automated tool for defining HSBs using datasets that may contain minor errors resulting from limitations in map construction methodologies. The utility of SyntenyTracker will become more important for comparative genomics as the number of mapped and sequenced genomes increases.
Broad host range of SARS-CoV-2 predicted by comparative and structural analysis of ACE2 in vertebrates
The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of COVID-19. The main receptor of SARS-CoV-2, angiotensin I converting enzyme 2 (ACE2), is now undergoing extensive scrutiny to understand the routes of transmission and sensitivity in different species. Here, we utilized a unique dataset of ACE2 sequences from 410 vertebrate species, including 252 mammals, to study the conservation of ACE2 and its potential to be used as a receptor by SARS-CoV-2. We designed a five-category binding score based on the conservation properties of 25 amino acids important for the binding between ACE2 and the SARS-CoV-2 spike protein. Only mammals fell into the medium to very high categories and only catarrhine primates into the very high category, suggesting that they are at high risk for SARS-CoV-2 infection. We employed a protein structural analysis to qualitatively assess whether amino acid changes at variable residues would be likely to disrupt ACE2/SARS-CoV-2 spike protein binding and found the number of predicted unfavorable changes significantly correlated with the binding score. Extending this analysis to human population data, we found only rare (frequency < 0.001) variants in 10/25 binding sites. In addition, we found significant signals of selection and accelerated evolution in the ACE2 coding sequence across all mammals, and specific to the bat lineage. Our results, if confirmed by additional experimental data, may lead to the identification of intermediate host species for SARS-CoV-2, guide the selection of animal models of COVID-19, and assist the conservation of animals both in native habitats and in human care
Discovery and characterization of 91 novel transcripts expressed in cattle placenta
Background: Among the eutherian mammals, placental architecture varies to a greater extent than any other tissue. The diversity of placental types, even within a single mammalian order, suggests that genes expressed in placenta are under strong Darwinian selection. Thus, the ruminant placenta may be a rich source of genes to explore adaptive evolutionary responses in mammals. The aim of our study was to identify novel transcripts expressed in ruminant placenta, and to characterize them with respect to their expression patterns, organization of coding sequences in the genome, and potential functions. Results: A combination of bioinformatics, comparative genomics and transcript profiling was used to identify and characterize 91 novel transcripts (NTs) represented in a cattle placenta cDNA library. These NTs have no significant similarity to any non-ferungulate DNA or RNA sequence. Proteins longer than 100 aa were predicted for 29 NTs, and 21 are candidate non-coding RNAs. Eighty-six NTs were found to be expressed in one or more of 18 different tissues, with 39 (42%) showing tissue-preference, including six that were expressed exclusively in placentome. The authenticity of the NTs was confirmed by their alignment to cattle genome sequence, 42 of which showed evidence of mRNA splicing. Analysis of the genomic context where NT genes reside revealed 61 to be in intergenic regions, whereas 30 are within introns of known genes. The genes encoding the NTs were found to be significantly associated with subtelomeric regions. Conclusion: The 91 lineage-specific transcripts are a useful resource for studying adaptive evolutionary responses of the ruminant placenta. The presence of so many genes encoding NTs in cattle but not primates or rodents suggests that gene loss and gain are important mechanisms of genome evolution in mammals. Furthermore, the clustering of NT genes within subtelomeric regions suggests that such regions are highly dynamic and may foster the birth of novel genes. The sequencing of additional vertebrate genomes with defined phylogenetic relationships will permit the search for lineage-specific genes to take on a more evolutionary context that is required to understand their origins and functions.
Functional annotation of novel lineage-specific genes using co-expression and promoter analysis
Background: The diversity of placental architectures within and among mammalian orders is believed to be the result of adaptive evolution. Although the genetic basis for these differences is unknown, some may arise from rapidly diverging and lineage-specific genes. Previously, we identified 91 novel lineage-specific transcripts (LSTs) from a cow term-placenta cDNA library, which are excellent candidates for adaptive placental functions acquired by the ruminant lineage. The aim of the present study was to infer functions of previously uncharacterized lineage-specific genes (LSGs) using co-expression, promoter, pathway and network analysis. Results: Clusters of co-expressed genes preferentially expressed in liver, placenta and thymus were found using 49 previously uncharacterized LSTs as seeds. Over-represented composite transcription factor binding sites (TFBS) in promoters of clustered LSGs and known genes were then identified computationally. Functions were inferred for nine previously uncharacterized LSGs using co-expression analysis and pathway analysis tools. Our results predict that these LSGs may function in cell signaling, glycerophospholipid/fatty acid metabolism, protein trafficking, regulatory processes in the nucleus, and processes that initiate parturition and immune system development. Conclusions: The placenta is a rich source of lineage-specific genes that function in the adaptive evolution of placental architecture and functions. We have shown that co-expression, promoter, and gene network analyses are useful methods to infer functions of LSGs with heretofore unknown functions. Our results indicate that many LSGs are involved in cellular recognition and developmental processes. Furthermore, they provide guidance for experimental approaches to validate the functions of LSGs and to study their evolution.
Genomic organization and evolution of the ULBP genes in cattle
BACKGROUND: The cattle UL16-binding protein 1 (ULBP1) and ULBP2 genes encode members of the MHC Class I superfamily that have homology to the human ULBP genes. Human ULBP1 and ULBP2 interact with the NKG2D receptor to activate effector cells in the immune system. The human cytomegalovirus UL16 protein is known to disrupt the ULBP-NKG2D interaction, thereby subverting natural killer cell-mediated responses. Previous Southern blotting experiments identified evidence of increased ULBP copy number within the genomes of ruminant artiodactyls. On the basis of these observations we hypothesized that the cattle ULBPs evolved by duplication and sequence divergence to produce a sufficient number and diversity of ULBP molecules to deliver an immune activation signal in the presence of immunogenic peptides. Given the importance of the ULBPs in antiviral immunity in other species, our goal was to determine the copy number and genomic organization of the ULBP genes in the cattle genome. RESULTS: Sequencing of cattle bacterial artificial chromosome genomic inserts resulted in the identification of 30 cattle ULBP loci existing in two gene clusters. Evidence of extensive segmental duplication and approximately 14 Kbp of novel repetitive sequences were identified within the major cluster. Ten ULBPs are predicted to be expressed at the cell surface. Substitution analysis revealed 11 outwardly directed residues in the predicted extracellular domains that show evidence of positive Darwinian selection. These positively selected residues have only one residue that overlaps with those proposed to interact with NKG2D, thus suggesting the interaction with molecules other than NKG2D. CONCLUSION: The ULBP loci in the cattle genome apparently arose by gene duplication and subsequent sequence divergence. Substitution analysis of the ULBP proteins provided convincing evidence for positive selection on extracellular residues that may interact with peptide ligands. 
These results support our hypothesis that the cattle ULBPs evolved under adaptive diversifying selection to avoid interaction with a UL16-like molecule whilst preserving the NKG2D binding site. The large number of ULBPs in cattle, their extensive diversification, and the high prevalence of bovine herpesvirus infections make this gene family a compelling target for studies of antiviral immunity
Rates, costs, return to work and reoperation following spinal surgery in a workers’ compensation cohort in New South Wales, 2010–2018: a cohort study using administrative data
Background: Internationally, elective spinal surgery rates in workers’ compensation populations are high, as are reoperation rates, while return-to-work rates following spinal surgery are low. Little information is available from Australia. The aim of this study was to describe the rates, costs, return to work and reoperation following elective spinal surgery in the workers’ compensation population in New South Wales (NSW), Australia. Methods: This retrospective cohort study used administrative data from the State Insurance Regulatory Authority, the government organisation responsible for regulating and administering workers’ compensation insurance in NSW. These data cover all workers’ compensation-insured workers in New South Wales (over 3 million workers/year). We identified a cohort of insured workers who underwent elective spinal surgery (fusion or decompression) between January 1, 2010 and December 31, 2018. People who underwent surgery for spinal fracture or dislocation, or who had sustained a traumatic brain injury were excluded. The main outcome measures were annual spinal surgery rates, cost of the surgical episode, cumulative costs (surgical, hospital, medical and physical therapy) to 2 years post-surgery, and reoperation and return-to-work rates 2 years post-surgery. Results: There were 9343 eligible claims (39.1 % fusion; 59.9 % decompression); claimants were predominantly male (75 %) with a mean age of 43 (range 18 to 75) years. Spinal surgery rates ranged from 15 to 29 surgeries per 100,000 workers per year, fell from 2011-12 to 2014-15 and rose thereafter. The average cost in Australian dollars for a surgical episode was 20,000 for a decompression. Two years post-fusion, only 19 % of people had returned to work at full capacity; 39 % after decompression. Nineteen percent of patients underwent additional spinal surgery within 2 years of the index surgery, to a maximum of 5 additional surgeries. 
Conclusion: Rates of workers’ compensation-funded spinal surgery did not rise significantly during the study period, but reoperation rates are high and return-to-work rates are low in this population at 2 years post-surgery. In the context of the poor evidence base supporting lumbar fusion surgery, the high cost, increasing rates, and the increased likelihood of poor outcomes in the workers’ compensation population, we question the value of this procedure in this setting
ESTIMA, a tool for EST management in a multi-project environment
BACKGROUND: Single-pass, partial sequencing of complementary DNA (cDNA) libraries generates thousands of chromatograms that are processed into high quality expressed sequence tags (ESTs), and then assembled into contigs representative of putative genes. Usually, to be of value, ESTs and contigs must be associated with meaningful annotations, and made available to end-users. RESULTS: A web application, Expressed Sequence Tag Information Management and Annotation (ESTIMA), has been created to meet the EST annotation and data management requirements of multiple high-throughput EST sequencing projects. It is anchored on individual ESTs and organized around different properties of ESTs including chromatograms, base-calling quality scores, structure of assembled transcripts, and multiple sources of comparison to infer functional annotation, Gene Ontology associations, and cDNA library information. ESTIMA consists of a relational database schema and a set of interactive query interfaces. These are integrated with a suite of web-based tools that allow a user to query and retrieve information. Further, query results are interconnected among the various EST properties. ESTIMA has several unique features. Users may run their own EST processing pipeline, search against preferred reference genomes, and use any clustering and assembly algorithm. The ESTIMA database schema is very flexible and accepts output from any EST processing and assembly pipeline. ESTIMA has been used for the management of EST projects of many species, including honeybee (Apis mellifera), cattle (Bos taurus), songbird (Taeniopygia guttata), corn rootworm (Diabrotica virgifera), catfish (Ictalurus punctatus, Ictalurus furcatus), and apple (Malus x domestica). The entire resource may be downloaded and used as is, or readily adapted to fit the unique needs of other cDNA sequencing projects. 
CONCLUSIONS: The scripts used to create the ESTIMA interface are freely available to academic users in an archived format from . The entity-relationship (E-R) diagrams and the programs used to generate the Oracle database tables are also available. We have also provided detailed installation instructions and a tutorial at the same website. Presently the chromatograms, EST databases and their annotations have been made available for cattle and honeybee brain EST projects. Non-academic users need to contact the W.M. Keck Center for Functional and Comparative Genomics, University of Illinois at Urbana-Champaign, Urbana, IL, for licensing information
Precision nomenclature for the new genomics
The confluence of two scientific disciplines may lead to nomenclature conflicts that require new terms while respecting historical definitions. This is the situation with the current state of cytology and genomics, which offer examples of distinct nomenclature and vocabularies that require reconciliation. In this article, we propose the new terms C-scaffold (for chromosome-scale assemblies of sequenced DNA fragments, commonly named scaffolds) and scaffotype (the resulting collection of C-scaffolds that represent an organism's genome). This nomenclature avoids conflict with the historical definitions of the terms chromosome (a microscopic body made of DNA and protein) and karyotype (the collection of images of all chromosomes of an organism or species). As large-scale sequencing projects progress, adoption of this nomenclature will assist end users to properly classify genome assemblies, thus facilitating genomic analysis