32 research outputs found

    MLTrends: Graphing MEDLINE term usage over time

    The MEDLINE database of medical literature is routinely used by researchers and doctors to find articles pertaining to their area of interest. Insight into historical changes in research areas and use of scientific language may be gained by chronological analysis of the 18 million records currently in the database; however, such analysis is generally complex and time-consuming. The authors’ MLTrends web application graphs term usage in MEDLINE over time, allowing the determination of emergence dates for biomedical terms and historical variations in term usage intensity. Terms considered are individual words or quoted phrases, which may be combined using Boolean operators. For multiple terms simultaneously, MLTrends can plot the number of records in MEDLINE per year whose titles or abstracts match each queried term. The MEDLINE database is stored and indexed on the MLTrends server, allowing queries to be completed and graphs generated in less than one second. Queries may be performed on all titles and/or abstracts in MEDLINE and can include stop words. The resulting graphs may be normalized by total publications or words per year to facilitate term usage comparison between years. This makes MLTrends a powerful tool for rapid graphical evaluation of the evolution of biomedical research and language. MLTrends may be used at: http://www.ogic.ca/mltrend
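
The per-year counting and normalization described in this abstract can be illustrated with a minimal Python sketch. It is not the MLTrends implementation: the record layout `(year, title, abstract)`, the function names, and the sample records are hypothetical, and plain case-insensitive substring matching stands in for the word, phrase and Boolean queries the abstract mentions.

```python
from collections import Counter


def yearly_term_counts(records, term):
    """Count records per year whose title or abstract contains `term`.

    `records` is an iterable of (year, title, abstract) tuples (assumed layout).
    Matching is a case-insensitive substring test, a simplification.
    """
    needle = term.lower()
    counts = Counter()
    for year, title, abstract in records:
        if needle in title.lower() or needle in abstract.lower():
            counts[year] += 1
    return counts


def normalize_by_total(counts, totals):
    """Divide each year's match count by the total publications of that year."""
    return {
        year: counts.get(year, 0) / total
        for year, total in totals.items()
        if total > 0
    }


# Hypothetical sample data, for illustration only.
records = [
    (2000, "Stem cell biology", "A review of stem cell niches."),
    (2000, "Codon usage", "Mutational bias in bacteria."),
    (2001, "Stem cell therapy", "Clinical prospects."),
    (2001, "Embryonic development", "The role of stem cell lines."),
]
totals = Counter(year for year, _, _ in records)
counts = yearly_term_counts(records, "stem cell")
fractions = normalize_by_total(counts, totals)
```

Normalizing by the yearly total makes years with very different publication volumes comparable, which is the purpose the abstract gives for this option.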

    A General Model of Codon Bias Due to GC Mutational Bias

    Background - In spite of extensive research on the effect of mutation and selection on codon usage, a general model of codon usage bias due to mutational bias has been lacking. Because most amino acids allow synonymous GC-content-changing substitutions in the third codon position, the overall GC bias of a genome or genomic region is highly correlated with GC3, a measure of third-position GC content. For individual amino acids as well, G/C-ending codon usage generally increases with increasing GC bias and decreases with increasing AT bias. Arginine and leucine, amino acids that allow GC-changing synonymous substitutions in the first and third codon positions, have codons which may be expected to show different usage patterns. // Principal Findings - In analyzing codon usage bias in hundreds of prokaryotic and plant genomes and in human genes, we find that two G-ending codons, AGG (arginine) and TTG (leucine), unlike all other G/C-ending codons, show overall usage that decreases with increasing GC bias, contrary to the usual expectation that G/C-ending codon usage should increase with increasing genomic GC bias. Moreover, the usage of some codons appears nonlinear, even nonmonotone, as a function of GC bias. To explain these observations, we propose a continuous-time Markov chain model of GC-biased synonymous substitution. This model correctly predicts the qualitative usage patterns of all codons, including nonlinear codon usage in isoleucine, arginine and leucine. The model accounts for 72%, 64% and 52% of the observed variability of codon usage in prokaryotes, plants and human, respectively. When codons are grouped based on common GC content, 87%, 80% and 68% of the variation in usage is explained for prokaryotes, plants and human, respectively.
// Conclusions - The model clarifies the sometimes-counterintuitive effects that GC mutational bias can have on codon usage, quantifies the influence of GC mutational bias and provides a natural null model relative to which other influences on codon bias may be measured
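
GC3, the third-position GC content this abstract refers to, can be computed as in the sketch below. The function name `gc3` and the example sequences are hypothetical, and the sketch assumes an in-frame coding sequence. It illustrates only the GC3 measure, not the Markov chain model proposed in the paper.

```python
def gc3(coding_sequence):
    """Return the fraction of codons whose third base is G or C.

    Assumes `coding_sequence` is in frame; a trailing partial codon is ignored.
    """
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3      # drop any incomplete final codon
    third_positions = seq[2:usable:3]     # bases at indices 2, 5, 8, ...
    if not third_positions:
        raise ValueError("sequence contains no complete codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


# Hypothetical sequence: codons ATG, GCC, AAG, TTT have third bases G, C, G, T.
example_gc3 = gc3("ATGGCCAAGTTT")
```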

    Towards a typology of mediated anger: routine coverage of protest and political emotion

    This article establishes the importance of studying mediated anger. It first develops a typology of mediated anger, suggesting it is performative, discursively constructed, collective, and political. It applies this typology to routine coverage of anger in UK protest coverage during a two-month time period in 2015. The analysis demonstrates that anger serves as a cause of engagement and a barometer of public feeling. It sets out a spectrum of discursive constructions of mediated anger. At one end sits rational and legitimate anger, which forms the basis for social change. Along the spectrum sits aggressive and/or disruptive anger motivated by rational and legitimate concerns. At the other end of the spectrum lies illegitimate and irrational anger. The analysis shows that protesters can be simultaneously angry and rational, peaceful and legitimate. Discourses on protest construct a commonsense theory of political motivation, whereby anger explains the desire for political engagement, but only occasionally brings about other negative emotions or actions. As such, the article contributes a more nuanced understanding of anger as a political emotion

    Taxonomic colouring of phylogenetic trees of protein sequences

    BACKGROUND: Phylogenetic analyses of protein families are used to define the evolutionary relationships between homologous proteins. The interpretation of protein-sequence phylogenetic trees requires the examination of the taxonomic properties of the species associated to those sequences. However, there is no online tool to facilitate this interpretation, for example, by automatically attaching taxonomic information to the nodes of a tree, or by interactively colouring the branches of a tree according to any combination of taxonomic divisions. This is especially problematic if the tree contains on the order of hundreds of sequences, which, given the accelerated increase in the size of the protein sequence databases, is a situation that is becoming common. RESULTS: We have developed PhyloView, a web based tool for colouring phylogenetic trees upon arbitrary taxonomic properties of the species represented in a protein sequence phylogenetic tree. Provided that the tree contains SwissProt, SpTrembl, or GenBank protein identifiers, the tool retrieves the taxonomic information from the corresponding database. A colour picker displays a summary of the findings and allows the user to associate colours to the leaves of the tree according to any number of taxonomic partitions. Then, the colours are propagated to the branches of the tree. CONCLUSION: PhyloView can be used at . A tutorial, the software with documentation, and GPL licensed source code, can be accessed at the same web address
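
The abstract does not state how colours are propagated from leaves to branches. The sketch below assumes one plausible rule: an internal branch takes a colour only when every child beneath it carries that same colour. The nested-tuple tree layout, the function name `colour_tree`, and the identifiers in the example are hypothetical and are not taken from PhyloView.

```python
def colour_tree(node, leaf_colours):
    """Annotate a tree with colours, propagating upward from the leaves.

    `node` is a leaf name (str) or a tuple of subtrees (assumed layout).
    An internal node is coloured only if all its children share one colour;
    this rule is an assumption, not the documented PhyloView behaviour.
    """
    if isinstance(node, str):
        return {"name": node, "colour": leaf_colours.get(node), "children": []}
    children = [colour_tree(child, leaf_colours) for child in node]
    child_colours = {child["colour"] for child in children}
    colour = child_colours.pop() if len(child_colours) == 1 else None
    return {"name": None, "colour": colour, "children": children}


# Hypothetical identifiers and taxonomic colours, for illustration only.
tree = (("seqA", "seqB"), ("seqC", "seqD"))
leaf_colours = {"seqA": "red", "seqB": "red", "seqC": "blue", "seqD": "red"}
annotated = colour_tree(tree, leaf_colours)
```

Under this rule the clade containing `seqA` and `seqB` is coloured red, while the mixed clade and the root stay uncoloured.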

    ChIP on SNP-chip for genome-wide analysis of human histone H4 hyperacetylation

    BACKGROUND: SNP microarrays are designed to genotype Single Nucleotide Polymorphisms (SNPs). These microarrays report hybridization of DNA fragments and therefore can be used for the purpose of detecting genomic fragments. RESULTS: Here, we demonstrate that a SNP microarray can be effectively used in this way to perform chromatin immunoprecipitation (ChIP) on chip as an alternative to tiling microarrays. We illustrate this novel application by mapping whole genome histone H4 hyperacetylation in human myoblasts and myotubes. We detect clusters of hyperacetylated histone H4, often spanning across up to 300 kilobases of genomic sequence. Using complementary genome-wide analyses of gene expression by DNA microarray we demonstrate that these clusters of hyperacetylated histone H4 tend to be associated with expressed genes. CONCLUSION: The use of a SNP array for a ChIP-on-chip application (ChIP on SNP-chip) will be of great value to laboratories whose interest is the determination of general rules regarding the relationship of specific chromatin modifications to transcriptional status throughout the genome and to examine the asymmetric modification of chromatin at heterozygous loci.

    Gene function in early mouse embryonic stem cell differentiation

    BACKGROUND: Little is known about the genes that drive embryonic stem cell differentiation. However, such knowledge is necessary if we are to exploit the therapeutic potential of stem cells. To uncover the genetic determinants of mouse embryonic stem cell (mESC) differentiation, we have generated and analyzed 11-point time-series of DNA microarray data for three biologically equivalent but genetically distinct mESC lines (R1, J1, and V6.5) undergoing undirected differentiation into embryoid bodies (EBs) over a period of two weeks. RESULTS: We identified the initial 12 hour period as reflecting the early stages of mESC differentiation and studied probe sets showing consistent changes of gene expression in that period. Gene function analysis indicated significant up-regulation of genes related to regulation of transcription and mRNA splicing, and down-regulation of genes related to intracellular signaling. Phylogenetic analysis indicated that the genes showing the largest expression changes were more likely to have originated in metazoans. The probe sets with the most consistent gene changes in the three cell lines represented 24 down-regulated and 12 up-regulated genes, all with closely related human homologues. Whereas some of these genes are known to be involved in embryonic developmental processes (e.g. Klf4, Otx2, Smn1, Socs3, Tagln, Tdgf1), our analysis points to others (such as transcription factor Phf21a, extracellular matrix related Lama1 and Cyr61, or endoplasmic reticulum related Sc4mol and Scd2) that have not been previously related to mESC function. The majority of identified functions were related to transcriptional regulation, intracellular signaling, and cytoskeleton. Genes involved in other cellular functions important in ESC differentiation such as chromatin remodeling and transmembrane receptors were not observed in this set. 
CONCLUSION: Our analysis profiles for the first time gene expression at a very early stage of mESC differentiation, and identifies a functional and phylogenetic signature for the genes involved. The data generated constitute a valuable resource for further studies. All DNA microarray data used in this study are available in the StemBase database of stem cell gene expression data [1] and in the NCBI's GEO database

    Recent developments in StemBase: a tool to study gene expression in human and murine stem cells

    BACKGROUND: Currently one of the largest online repositories for human and mouse stem cell gene expression data, StemBase was first designed as a simple web-interface to DNA microarray data generated by the Canadian Stem Cell Network to facilitate the discovery of gene functions relevant to stem cell control and differentiation. FINDINGS: Since its creation, StemBase has grown in both size and scope into a system with analysis tools that examine either the whole database at once, or slices of data, based on tissue type, cell type or gene of interest. As of September 1, 2008, StemBase contains gene expression data (microarray and Serial Analysis of Gene Expression) from 210 stem cell samples in 60 different experiments. CONCLUSION: StemBase can be used to study gene expression in human and murine stem cells and is available at http://www.stembase.ca.

    Acknowledging contributions to online expert assistance

    We present a poster which contains a sequence of a question, answers to this question and comments regarding acknowledging content on BioStar. Biostar.stackexchange.com is a website where questions about Bioinformatics can be asked and answered. Users can also comment on both the questions and the answers. The site is modelled after www.stackoverflow.com (see description from Joel Spolsky), a comparable site for programmers. Users find the site valuable both for answers to questions they have and as a reference. Since the content can also be viewed without registration, the site likely reaches a larger audience. For instance, BioStar questions are often referenced on Twitter and FriendFeed. This leads to the question of how contributions to such a site can be measured and how they should be cited on other websites. The site itself has some mechanisms in place, which are mainly meant to encourage users; it uses reputation points and so-called badges to recognize the quality of contributions. Reputation points are given by the community, who can up- or down-vote questions and answers. Badges are automatically awarded based on predefined criteria. Users with higher reputation levels can also manage the site itself, for instance by adding tags, editing questions and answers or even closing and deleting them. The reputation mechanism is interesting since it is not automatically given based on input provided but actually decided on by fellow users based on their judgement of the quality. We have used the BioStar website itself to ask “How do you acknowledge Biostar and its contributors in your research output?” (http://biostar.stackexchange.com/questions/6062/). Currently (April 2011) this question is still active and in the top-10 of questions with most votes, indicating clear interest by the community for ways to acknowledge content from BioStar. The poster gives some interesting viewpoints on the matter.
Some examples indicate how useful BioStar was in practical cases, for instance by showing how multiple consequences from gene variations can be mined, results of which could immediately be applied to real research questions. Of course people wanted to acknowledge BioStar in such cases, and indicated how they did that in practice. Although a paper about BioStar itself was suggested as a useful reference and way to advertise the site, people seem to agree that this is not the best way to acknowledge individual contributions. As an alternative, an example of a citation standard for blogs developed by the National Library of Medicine is mentioned, which also keeps track of the date (and thus version) of the cited document. The use of the Digital Object Identifier (DOI) was discussed, as a way to get easy links to fixed versions of a question with answers. Although the answers provided are given in the context of the BioStar community, the presented content is applicable to other online resources as well and could provide valid input to other communities

    Detection of Alpha-Rod Protein Repeats Using a Neural Network and Application to Huntingtin

    A growing number of solved protein structures display an elongated structural domain, denoted here as alpha-rod, composed of stacked pairs of anti-parallel alpha-helices. Alpha-rods are flexible and expose a large surface, which makes them suitable for protein interaction. Although most likely originating by tandem duplication of a two-helix unit, their detection using sequence similarity between repeats is poor. Here, we show that alpha-rod repeats can be detected using a neural network. The network detects more repeats than are identified by domain databases using multiple profiles, with a low level of false positives (<10%). We identify alpha-rod repeats in approximately 0.4% of proteins in eukaryotic genomes. We then investigate the results for all human proteins, identifying alpha-rod repeats for the first time in six protein families, including proteins STAG1-3, SERAC1, and PSMD1-2 & 5. We also characterize a short version of these repeats in eight protein families of Archaeal, Bacterial, and Fungal species. Finally, we demonstrate the utility of these predictions in directing experimental work to demarcate three alpha-rods in huntingtin, a protein mutated in Huntington's disease. Using yeast two hybrid analysis and an immunoprecipitation technique, we show that the huntingtin fragments containing alpha-rods associate with each other. This is the first definition of domains in huntingtin and the first validation of predicted interactions between fragments of huntingtin, which sets up directions toward functional characterization of this protein. An implementation of the repeat detection algorithm is available as a Web server with a simple graphical output: http://www.ogic.ca/projects/ard. This can be further visualized using BiasViz, a graphic tool for representation of multiple sequence alignments