139 research outputs found

    Vacuolating cytotoxin (vacA) alleles of Helicobacter pylori comprise two geographically widespread types, m1 and m2, and have evolved through limited recombination

    Vacuolating cytotoxin (vacA) alleles of Helicobacter pylori vary, particularly in their mid region (which may be type m1 or m2) and their signal peptide coding region (type s1 or s2). We investigated nucleotide diversity among vacA alleles in strains from several locales in Asia, South America, and the USA. Phylogenetic analysis of vacA mid region sequences from 18 strains validated the division into two main groups (m1 and m2) and showed further significant divisions within these groups. Informative site analysis demonstrated one example of recombination between m1 and m2 alleles, and several examples of recombination among alleles within these groups. Recombination was not sufficiently extensive to destroy phylogenetic structure entirely. Synonymous nucleotide substitution rates were markedly different between regions of vacA, suggesting different evolutionary divergence times and implying horizontal transfer of genetic elements within vacA. Non-synonymous/synonymous rate ratios were greater between m1 and m2 sequences than among m1 sequences, consistent with m1 and m2 alleles encoding functions fitting strains for slightly different ecological niches

    A matter of words: NLP for quality evaluation of Wikipedia medical articles

    Automatic quality evaluation of Web information is a task with many fields of application and of great relevance, especially in critical domains like the medical one. We start from the intuition that the quality of the content of medical Web documents is affected by features related to the specific domain: first, the usage of a specific vocabulary (Domain Informativeness); then, the adoption of specific codes (like those used in the infoboxes of Wikipedia articles) and the type of document (e.g., historical and technical ones). In this paper, we propose to leverage specific domain features to improve the results of the evaluation of Wikipedia medical articles. In particular, we evaluate the articles adopting an "actionable" model, whose features are related to the content of the articles, so that the model can also directly suggest strategies for improving a given article's quality. We rely on Natural Language Processing (NLP) and dictionary-based techniques in order to extract the bio-medical concepts in a text. We demonstrate the effectiveness of our approach by classifying the medical articles of the Wikipedia Medicine Portal, which have been previously manually labeled by the Wiki Project team. The results of our experiments confirm that, by considering domain-oriented features, it is possible to obtain appreciable improvements with respect to existing solutions, mainly for those articles that other approaches have classified less correctly. Besides being interesting in their own right, the results call for further research in the area of domain-specific features suitable for Web data quality assessment

    Differential entropy and time

    We give a detailed analysis of the Gibbs-type entropy notion and its dynamical behavior in the case of time-dependent continuous probability distributions of varied origins, related to both classical and quantum systems. The purpose-dependent usage of conditional Kullback-Leibler and Gibbs (Shannon) entropies is explained in the case of non-equilibrium Smoluchowski processes. The very different temporal behaviors of the Gibbs and Kullback entropies are contrasted. A specific conceptual niche is addressed, where quantum von Neumann, classical Kullback-Leibler and Gibbs entropies can be consistently introduced as information measures for the same physical system. If the dynamics of probability densities is driven by the Schrödinger picture wave-packet evolution, Gibbs-type and related Fisher information functionals appear to quantify nontrivial power transfer processes in the mean. This observation is found to extend to classical dissipative processes and supports the view that the Shannon entropy dynamics provides an insight into physically relevant non-equilibrium phenomena, which are inaccessible in terms of the Kullback-Leibler entropy and typically ignored in the literature. Comment: Final, unabridged version; http://www.mdpi.org/entropy/ Dedicated to Professor Rafael Sorkin on his 60th birthday

    Inferring hidden Markov models from noisy time sequences: a method to alleviate degeneracy in molecular dynamics

    We present a new method for inferring hidden Markov models from noisy time sequences without the necessity of assuming a model architecture, thus allowing for the detection of degenerate states. This is based on the statistical prediction techniques developed by Crutchfield et al., and generates so-called causal state models, equivalent to hidden Markov models. This method is applicable to any continuous data which clusters around discrete values and exhibits multiple transitions between these values, such as tethered particle motion data or Fluorescence Resonance Energy Transfer (FRET) spectra. The algorithms developed have been shown to perform well on simulated data, demonstrating the ability to recover the model used to generate the data under high noise, sparse data conditions and the ability to infer the existence of degenerate states. They have also been applied to new experimental FRET data of Holliday Junction dynamics, extracting the expected two state model and providing values for the transition rates in good agreement with previous results and with results obtained using existing maximum likelihood based methods. Comment: 19 pages, 9 figures

    Positive words carry less information than negative words

    We show that the frequency of word use is not only determined by the word length (Zipf 1935) and the average information content (Piantadosi 2011), but also by its emotional content. We have analyzed three established lexica of affective word usage in English, German, and Spanish, to verify that these lexica have a neutral, unbiased, emotional content. Taking into account the frequency of word usage, we find that words with a positive emotional content are more frequently used. This lends support to the Pollyanna hypothesis (Boucher 1969) that there should be a positive bias in human expression. We also find that negative words contain more information than positive words, as the informativeness of a word increases uniformly as its valence decreases. Our findings support earlier conjectures about (i) the relation between word frequency and information content, and (ii) the impact of positive emotions on communication and social links. Comment: 16 pages, 3 figures, 3 tables

    Helicobacter pylori Counteracts the Apoptotic Action of Its VacA Toxin by Injecting the CagA Protein into Gastric Epithelial Cells

    Infection with Helicobacter pylori is responsible for gastritis and gastroduodenal ulcers but is also a high risk factor for the development of gastric adenocarcinoma and lymphoma. The most pathogenic H. pylori strains (i.e., the so-called type I strains) associate the CagA virulence protein with an active VacA cytotoxin but the rationale for this association is unknown. CagA, directly injected by the bacterium into colonized epithelium via a type IV secretion system, leads to cellular morphological, anti-apoptotic and proinflammatory effects responsible in the long term (years or decades) for ulcer and cancer. VacA, via pinocytosis and intracellular trafficking, induces epithelial cell apoptosis and vacuolation. Using human gastric epithelial cells in culture transfected with cDNA encoding either the wild-type 38 kDa C-terminal signaling domain of CagA or its non-tyrosine-phosphorylatable mutant form, we found that, depending on tyrosine-phosphorylation by host kinases, CagA inhibited VacA-induced apoptosis by two complementary mechanisms. Tyrosine-phosphorylated CagA prevented pinocytosed VacA from reaching its target intracellular compartments. Unphosphorylated CagA triggered an anti-apoptotic activity blocking VacA-induced apoptosis at the mitochondrial level without affecting the intracellular trafficking of the toxin. Assaying the level of apoptosis of gastric epithelial cells infected with wild-type CagA+/VacA+ H. pylori or isogenic mutants lacking either CagA or VacA, we confirmed the results obtained in cells transfected with the CagA C-terminal constructs, showing that CagA antagonizes VacA-induced apoptosis. VacA toxin plays a role during H. pylori stomach colonization. However, once bacteria have colonized the gastric niche, the apoptotic action of VacA might be detrimental for the survival of H. pylori adherent to the mucosa. CagA association with VacA is thus a novel, highly ingenious microbial strategy to locally protect its ecological niche against a bacterial virulence factor, albeit with detrimental consequences for the human host

    A Multi-Cancer Mesenchymal Transition Gene Expression Signature Is Associated with Prolonged Time to Recurrence in Glioblastoma

    A stage-associated gene expression signature of coordinately expressed genes, including the transcription factor Slug (SNAI2) and other epithelial-mesenchymal transition (EMT) markers has been found present in samples from publicly available gene expression datasets in multiple cancer types, including nonepithelial cancers. The expression levels of the co-expressed genes vary in a continuous and coordinate manner across the samples, ranging from absence of expression to strong co-expression of all genes. These data suggest that tumor cells may pass through an EMT-like process of mesenchymal transition to varying degrees. Here we show that, in glioblastoma multiforme (GBM), this signature is associated with time to recurrence following initial treatment. By analyzing data from The Cancer Genome Atlas (TCGA), we found that GBM patients who responded to therapy and had long time to recurrence had low levels of the signature in their tumor samples (P = 3×10⁻⁷). We also found that the signature is strongly correlated in gliomas with the putative stem cell marker CD44, and is highly enriched among the differentially expressed genes in glioblastomas vs. lower grade gliomas. Our results suggest that long delay before tumor recurrence is associated with absence of the mesenchymal transition signature, raising the possibility that inhibiting this transition might improve the durability of therapy in glioma patients

    On the Accessibility of Adaptive Phenotypes of a Bacterial Metabolic Network

    The mechanisms by which adaptive phenotypes spread within an evolving population after their emergence are understood fairly well. Much less is known about the factors that influence the evolutionary accessibility of such phenotypes, a prerequisite for their emergence in a population. Here, we investigate the influence of environmental quality on the accessibility of adaptive phenotypes of Escherichia coli's central metabolic network. We used an established flux-balance model of metabolism as the basis for a genotype-phenotype map (GPM). We quantified the effects of seven qualitatively different environments (corresponding to both carbohydrate and gluconeogenic metabolic substrates) on the structure of this GPM. We found that the GPM has a more rugged structure in qualitatively poorer environments, suggesting that adaptive phenotypes could be intrinsically less accessible in such environments. Nevertheless, on average ∼74% of the genotype can be altered by neutral drift in the environment where the GPM is most rugged; this could allow evolving populations to circumvent such ruggedness. Furthermore, we found that the normalized mutual information (NMI) of genotype differences relative to phenotype differences, which measures the GPM's capacity to transmit information about phenotype differences, is positively correlated with (simulation-based) estimates of the accessibility of adaptive phenotypes in different environments. These results are consistent with the predictions of a simple analytic theory that makes explicit the relationship between the NMI and the speed of adaptation. The results suggest an intuitive information-theoretic principle for evolutionary adaptation: adaptation could be faster in environments where the GPM has a greater capacity to transmit information about phenotype differences. More generally, our results provide insight into fundamental environment-specific differences in the accessibility of adaptive phenotypes, and they suggest opportunities for research at the interface between information theory and evolutionary biology

    Network adaptation improves temporal representation of naturalistic stimuli in Drosophila eye: II Mechanisms

    Retinal networks must adapt constantly to best present the ever changing visual world to the brain. Here we test the hypothesis that adaptation is a result of different mechanisms at several synaptic connections within the network. In a companion paper (Part I), we showed that adaptation in the photoreceptors (R1-R6) and large monopolar cells (LMC) of the Drosophila eye improves sensitivity to under-represented signals in seconds by enhancing both the amplitude and frequency distribution of LMCs' voltage responses to repeated naturalistic contrast series. In this paper, we show that such adaptation needs both the light-mediated conductance and feedback-mediated synaptic conductance. A faulty feedforward pathway in histamine receptor mutant flies speeds up the LMC output, mimicking extreme light adaptation. A faulty feedback pathway from L2 LMCs to photoreceptors slows down the LMC output, mimicking dark adaptation. These results underline the importance of network adaptation for efficient coding, and as a mechanism for selectively regulating the size and speed of signals in neurons. We suggest that the concerted action of many different mechanisms and neural connections is responsible for adaptation to visual stimuli. Further, our results demonstrate the need for detailed circuit reconstructions, like that of the Drosophila lamina, to understand how networks process information

    Time lagged information theoretic approaches to the reverse engineering of gene regulatory networks

    Background: A number of models and algorithms have been proposed in the past for gene regulatory network (GRN) inference; however, none of them address the effects of the size of time-series microarray expression data in terms of the number of time-points. In this paper, we study this problem by analyzing the behaviour of three algorithms based on information theory and dynamic Bayesian network (DBN) models. These algorithms were run on different sizes of data generated by synthetic networks. Experiments show that the inference accuracy of these algorithms reaches a saturation point after a specific data size, brought about by a saturation in the pair-wise mutual information (MI) metric; hence there is a theoretical limit on the inference accuracy of information theory based schemes that depends on the number of time-points of microarray data used to infer GRNs. This illustrates the fact that MI might not be the best metric to use for GRN inference algorithms. To circumvent the limitations of the MI metric, we introduce a new method of computing time lags between any pair of genes and present the pair-wise time lagged Mutual Information (TLMI) and time lagged Conditional Mutual Information (TLCMI) metrics. Next we use these new metrics to propose novel GRN inference schemes which provide higher inference accuracy based on the precision and recall parameters. Results: It was observed that beyond a certain number of time-points (i.e., a specific size) of microarray data, the performance of the algorithms measured in terms of the recall-to-precision ratio saturated due to the saturation in the calculated pair-wise MI metric with increasing data size. The proposed algorithms were compared to existing approaches on four different biological networks. The resulting networks were evaluated based on the benchmark precision and recall metrics and the results favour our approach. 
    Conclusions: To alleviate the effects of data size on information theory based GRN inference algorithms, novel time lag based information theoretic approaches to infer gene regulatory networks have been proposed. The results show that the time lags of regulatory effects between any pair of genes play an important role in GRN inference schemes
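
The idea of a pair-wise time-lagged mutual information can be illustrated with a minimal sketch. It is not the authors' TLMI algorithm: it assumes expression values have already been discretized, uses a simple plug-in (frequency-count) MI estimate, and picks the lag by exhaustive search. The function names `mutual_information`, `time_lagged_mi` and `best_lag`, and the toy series, are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two
    equal-length sequences of discrete symbols."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("sequences must be non-empty and of equal length")
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def time_lagged_mi(xs, ys, lag):
    """MI between xs[t] and ys[t + lag] for a non-negative lag."""
    if lag < 0 or lag >= len(xs):
        raise ValueError("lag must satisfy 0 <= lag < len(xs)")
    return mutual_information(xs[:len(xs) - lag], ys[lag:])


def best_lag(xs, ys, max_lag):
    """Return the lag in 0..max_lag with the highest time-lagged MI,
    together with the score for every lag tried."""
    scores = {lag: time_lagged_mi(xs, ys, lag) for lag in range(max_lag + 1)}
    return max(scores, key=scores.get), scores


# Toy data (assumed, already binarized): the target copies the
# regulator two time-points later.
regulator = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
target = [0, 0] + regulator[:-2]

lag, scores = best_lag(regulator, target, max_lag=4)
print(lag, scores)
```

On the toy series the search recovers the two-step delay, because only at that lag is the target fully determined by the regulator. With few time-points the plug-in estimate is biased, which is one reason real data would need more care than this sketch takes.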