86 research outputs found

    A Winnow-Based Approach to Context-Sensitive Spelling Correction

    A large class of machine-learning problems in natural language require the characterization of linguistic context. Two characteristic properties of such problems are that their feature space is of very high dimensionality, and their target concepts refer to only a small subset of the features in the space. Under such conditions, multiplicative weight-update algorithms such as Winnow have been shown to have exceptionally good theoretical properties. We present an algorithm combining variants of Winnow and weighted-majority voting, and apply it to a problem in the aforementioned class: context-sensitive spelling correction. This is the task of fixing spelling errors that happen to result in valid words, such as substituting "to" for "too", "casual" for "causal", etc. We evaluate our algorithm, WinSpell, by comparing it against BaySpell, a statistics-based method representing the state of the art for this task. We find: (1) When run with a full (unpruned) set of features, WinSpell achieves accuracies significantly higher than BaySpell was able to achieve in either the pruned or unpruned condition; (2) When compared with other systems in the literature, WinSpell exhibits the highest performance; (3) The primary reason that WinSpell outperforms BaySpell is that WinSpell learns a better linear separator; (4) When run on a test set drawn from a different corpus than the training set was drawn from, WinSpell is better able than BaySpell to adapt, using a strategy we will present that combines supervised learning on the training set with unsupervised learning on the (noisy) test set. Comment: To appear in Machine Learning, Special Issue on Natural Language Learning, 1999. 25 pages

    Applying Winnow to Context-Sensitive Spelling Correction

    Multiplicative weight-updating algorithms such as Winnow have been studied extensively in the COLT literature, but only recently have people started to use them in applications. In this paper, we apply a Winnow-based algorithm to a task in natural language: context-sensitive spelling correction. This is the task of fixing spelling errors that happen to result in valid words, such as substituting "to" for "too", "casual" for "causal", and so on. Previous approaches to this problem have been statistics-based; we compare Winnow to one of the more successful such approaches, which uses Bayesian classifiers. We find that: (1) When the standard (heavily-pruned) set of features is used to describe problem instances, Winnow performs comparably to the Bayesian method; (2) When the full (unpruned) set of features is used, Winnow is able to exploit the new features and convincingly outperform Bayes; and (3) When a test set is encountered that is dissimilar to the training set, Winnow is better than Bayes at adapting to the unfamiliar test set, using a strategy we will present for combining learning on the training set with unsupervised learning on the (noisy) test set. Comment: 9 pages

    A comparison of methods for the registration of tractographic fibre images

    Diffusion tensor imaging (DTI) and tractography have opened up new avenues in neuroscience and are allowing previously unexplored areas of neuroanatomy and function to be researched

    Minimally invasive determination of mRNA concentration in single living bacteria

    Fluorescence correlation spectroscopy (FCS) has permitted the characterization of high concentrations of noncoding RNAs in a single living bacterium. Here, we extend the use of FCS to low concentrations of coding RNAs in single living cells. We genetically fuse a red fluorescent protein (RFP) gene and two binding sites for an RNA-binding protein, whose translated product is the RFP protein alone. Using this construct, we determine in single cells both the absolute [mRNA] concentration and the associated [RFP] expressed from an inducible plasmid. We find that the FCS method allows us to reliably monitor in real-time [mRNA] down to ∼40 nM (i.e. approximately two transcripts per volume of detection). To validate these measurements, we show that [mRNA] is proportional to the associated expression of the RFP protein. This FCS-based technique establishes a framework for minimally invasive measurements of mRNA concentration in individual living bacteria

    A Tale of Two Stories from "Below the Line": Comment Fields at the Guardian

    This article analyzes the nature of debate on “below the line” comment fields at the United Kingdom’s Guardian, and how, if at all, such debates are impacting journalism practice. The article combines a content analysis of 3,792 comments across eighty-five articles that focused on the UN Climate Change Summit, with ten interviews with journalists, two with affiliated commentators, plus the community manager. The results suggest a more positive picture than has been found by many existing studies: Debates were often deliberative in nature, and journalists reported that it was positively impacting their practice in several ways, including providing new story leads and enhanced critical reflection. However, citizen–journalist debate was limited. The results are attributed to the normalization of comment fields into everyday journalism practice, extensive support and encouragement from senior management, and a realization that comment fields can actually make the journalists’ life a little easier

    Targeting DNA-PKcs and ATM with miR-101 Sensitizes Tumors to Radiation

    Radiotherapy kills tumor-cells by inducing DNA double strand breaks (DSBs). However, the efficient repair of tumors frequently prevents successful treatment. Therefore, identifying new practical sensitizers is an essential step towards successful radiotherapy. In this study, we tested the new hypothesis: identifying the miRNAs to target DNA DSB repair genes could be a new way for sensitizing tumors to ionizing radiation. Here, we chose two genes: DNA-PKcs (an essential factor for non-homologous end-joining repair) and ATM (an important checkpoint regulator for promoting homologous recombination repair) as the targets to search their regulating miRNAs. By combining the database search and the bench work, we picked out miR-101. We identified that miR-101 could efficiently target DNA-PKcs and ATM via binding to the 3'-UTR of DNA-PKcs or ATM mRNA. Up-regulating miR-101 efficiently reduced the protein levels of DNA-PKcs and ATM in these tumor cells and most importantly, sensitized the tumor cells to radiation in vitro and in vivo. These data demonstrate for the first time that miRNAs could be used to target DNA repair genes and thus sensitize tumors to radiation. These results provide a new way for improving tumor radiotherapy

    Genome-Wide Association Meta-Analysis of Cortical Bone Mineral Density Unravels Allelic Heterogeneity at the RANKL Locus and Potential Pleiotropic Effects on Bone

    Previous genome-wide association (GWA) studies have identified SNPs associated with areal bone mineral density (aBMD). However, this measure is influenced by several different skeletal parameters, such as periosteal expansion, cortical bone mineral density (BMDC), cortical thickness, trabecular number, and trabecular thickness, which may be under distinct biological and genetic control. We have carried out a GWA and replication study of BMDC, as measured by peripheral quantitative computed tomography (pQCT), a more homogenous and valid measure of actual volumetric bone density. After initial GWA meta-analysis of two cohorts (ALSPAC n = 999, aged ∼15 years and GOOD n = 935, aged ∼19 years), we attempted to replicate the BMDC associations that had p < 1×10⁻⁵ in an independent sample of ALSPAC children (n = 2803) and in a cohort of elderly men (MrOS Sweden, n = 1052). The rs1021188 SNP (near RANKL) was associated with BMDC in all cohorts (overall p = 2×10⁻¹⁴, n = 5739). Each minor allele was associated with a decrease in BMDC of ∼0.14 SD. There was also evidence for an interaction between this variant and sex (p = 0.01), with a stronger effect in males than females (at age 15, males −6.77 mg/cm³ per C allele, p = 2×10⁻⁶; females −2.79 mg/cm³ per C allele, p = 0.004). Furthermore, in a preliminary analysis, the rs1021188 minor C allele was associated with higher circulating levels of sRANKL (p < 0.005). We show this variant to be independent from the previously aBMD associated SNP (rs9594738) and possibly from a third variant in the same RANKL region, which demonstrates important allelic heterogeneity at this locus. Associations with skeletal parameters reflecting bone dimensions were either not found or were much less pronounced. This finding implicates RANKL as a locus containing variation associated with volumetric bone density and provides further insight into the mechanism by which the RANK/RANKL/OPG pathway may be involved in skeletal development

    Glutamate is required for depression but not potentiation of long-term presynaptic function

    Hebbian plasticity is thought to require glutamate signalling. We show this is not the case for hippocampal presynaptic long-term potentiation (LTPpre), which is expressed as an increase in transmitter release probability (Pr). We find that LTPpre can be induced by pairing pre- and postsynaptic spiking in the absence of glutamate signalling. LTPpre induction involves a non-canonical mechanism of retrograde nitric oxide signalling, which is triggered by Ca²⁺ influx from L-type voltage-gated Ca²⁺ channels, not postsynaptic NMDA receptors (NMDARs), and does not require glutamate release. When glutamate release occurs, it decreases Pr by activating presynaptic NMDARs, and promotes presynaptic long-term depression. Net changes in Pr, therefore, depend on two opposing factors: (1) Hebbian activity, which increases Pr, and (2) glutamate release, which decreases Pr. Accordingly, release failures during Hebbian activity promote LTPpre induction. Our findings reveal a novel framework of presynaptic plasticity that radically differs from traditional models of postsynaptic plasticity.

    Model-Based Deconvolution of Cell Cycle Time-Series Data Reveals Gene Expression Details at High Resolution

    In both prokaryotic and eukaryotic cells, gene expression is regulated across the cell cycle to ensure “just-in-time” assembly of select cellular structures and molecular machines. However, present in all time-series gene expression measurements is variability that arises from both systematic error in the cell synchrony process and variance in the timing of cell division at the level of the single cell. Thus, gene or protein expression data collected from a population of synchronized cells is an inaccurate measure of what occurs in the average single-cell across a cell cycle. Here, we present a general computational method to extract “single-cell”-like information from population-level time-series expression data. This method removes the effects of 1) variance in growth rate and 2) variance in the physiological and developmental state of the cell. Moreover, this method represents an advance in the deconvolution of molecular expression data in its flexibility, minimal assumptions, and the use of a cross-validation analysis to determine the appropriate level of regularization. Applying our deconvolution algorithm to cell cycle gene expression data from the dimorphic bacterium Caulobacter crescentus, we recovered critical features of cell cycle regulation in essential genes, including ctrA and ftsZ, that were obscured in population-based measurements. In doing so, we highlight the problem with using population data alone to decipher cellular regulatory mechanisms and demonstrate how our deconvolution algorithm can be applied to produce a more realistic picture of temporal regulation in a cell