
    Similarity-based word sense disambiguation

    We describe a method for automatic word sense disambiguation using a text corpus and a machine-readable dictionary (MRD). The method is based on word similarity and context similarity measures. Words are considered similar if they appear in similar contexts; contexts are similar if they contain similar words. The circularity of this definition is resolved by an iterative, converging process, in which the system learns from the corpus a set of typical usages for each of the senses of the polysemous word listed in the MRD. A new instance of a polysemous word is assigned the sense associated with the typical usage most similar to its context. Experiments show that this method can learn even from very sparse training data, achieving over 92% correct disambiguation.
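The mutually recursive definition in the abstract above (similar words appear in similar contexts, similar contexts contain similar words) can be illustrated with a toy sketch. This is not the authors' algorithm: the averaging and best-match formulas, the fixed number of rounds in place of a convergence test, the function names (`pair_sim`, `context_sim`, `learn_word_sim`, `disambiguate`), and the four-context corpus are all assumptions made for illustration.

```python
from itertools import product


def pair_sim(word_sim, a, b):
    # Unseen pairs fall back to identity: 1.0 for the same word, else 0.0.
    return word_sim.get((a, b), 1.0 if a == b else 0.0)


def context_sim(c1, c2, word_sim):
    # Average, over the words of c1, of each word's best match in c2.
    # Assumes both contexts are non-empty sets of words.
    return sum(max(pair_sim(word_sim, a, b) for b in c2) for a in c1) / len(c1)


def learn_word_sim(contexts, rounds=3):
    # Alternate between the two measures for a fixed number of rounds
    # (an assumption; the abstract only says the process converges).
    vocab = sorted({w for c in contexts for w in c})
    occurs = {w: [i for i, c in enumerate(contexts) if w in c] for w in vocab}
    word_sim = {}  # empty table, so pair_sim starts out as identity
    idx = range(len(contexts))
    for _ in range(rounds):
        ctx_sim = {(i, j): context_sim(contexts[i], contexts[j], word_sim)
                   for i, j in product(idx, idx)}
        # Two words are similar if the contexts they occur in are similar.
        word_sim = {
            (a, b): 1.0 if a == b else
            sum(max(ctx_sim[i, j] for j in occurs[b]) for i in occurs[a])
            / len(occurs[a])
            for a, b in product(vocab, vocab)
        }
    return word_sim


def disambiguate(context, sense_examples, word_sim):
    # Pick the sense whose typical usage is most similar to the new context.
    def score(sense):
        return max(context_sim(context, ex, word_sim)
                   for ex in sense_examples[sense])
    return max(sense_examples, key=score)


# Hypothetical toy corpus for the polysemous word "bank".
corpus = [
    {"deposit", "money", "loan"},
    {"river", "water", "shore"},
    {"loan", "interest", "money"},
    {"water", "fishing", "shore"},
]
senses = {"bank/finance": [corpus[0]], "bank/river": [corpus[1]]}
sim = learn_word_sim(corpus)
print(disambiguate({"interest"}, senses, sim))  # bank/finance
```

In this toy run, "interest" never appears in the finance example context, yet it is still matched to that sense because it shares a context with "loan" and "money". That indirect transfer is the kind of behavior that lets a similarity-based approach cope with sparse training data.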

    MicroRNA expression detected by oligonucleotide microarrays: System establishment and expression profiling in human tissues

    MicroRNAs (MIRs) are a novel group of conserved, short (∼22-nucleotide) RNAs with important roles in regulating gene expression. We have established a MIR-specific oligonucleotide microarray system that enables efficient analysis of the expression of the human MIRs identified so far. We show that the 60-mer oligonucleotide probes on the microarrays hybridize with labeled cRNA of MIRs, but not with their precursor hairpin RNAs, derived from amplified, size-fractionated, total RNA of human origin. Signal intensity is related to the location of the MIR sequences within the 60-mer probes, with location at the 5′ region giving the highest signals and location at the 3′ end giving the lowest. Accordingly, 60-mer probes harboring one MIR copy at the 5′ end gave signals of similar intensity to probes containing two or three MIR copies. Mismatch analysis shows that mutations within the MIR sequence significantly reduce or eliminate the signal, suggesting that the observed signals faithfully reflect the abundance of matching MIRs in the labeled cRNA. Expression profiling of 150 MIRs in five human tissues and in HeLa cells revealed good overall concordance with previously published results, but also some differences. We present novel data on MIR expression in thymus, testes, and placenta, and have identified MIRs highly enriched in these tissues. Taken together, these results highlight the increased sensitivity of the DNA microarray over other methods for the detection and study of MIRs, and the immense potential in applying such microarrays for the study of MIRs in health and disease.