181,774 research outputs found

    An Efficient Distribution of Labor in a Two Stage Robust Interpretation Process

    Although Minimum Distance Parsing (MDP) offers a theoretically attractive solution to the problem of extragrammaticality, it is often computationally infeasible in large scale practical applications. In this paper we present an alternative approach where the labor is distributed between a more restrictive partial parser and a repair module. Though two stage approaches have grown in popularity in recent years because of their efficiency, they have done so at the cost of requiring hand coded repair heuristics. In contrast, our two stage approach does not require any hand coded knowledge sources dedicated to repair, thus making it possible to achieve a similar run time advantage over MDP without losing the quality of domain independence. Comment: 9 pages, 1 Postscript figure, uses aclap.sty and psfig.tex, In Proceedings of EMNLP 199

    WordRank: Learning Word Embeddings via Robust Ranking

    Embedding words in a vector space has gained a lot of attention in recent years. While state-of-the-art methods provide efficient computation of word similarities via a low-dimensional matrix embedding, their motivation is often left unclear. In this paper, we argue that word embedding can be naturally viewed as a ranking problem due to the ranking nature of the evaluation metrics. Then, based on this insight, we propose a novel framework WordRank that efficiently estimates word representations via robust ranking, in which the attention mechanism and robustness to noise are readily achieved via the DCG-like ranking losses. The performance of WordRank is measured in word similarity and word analogy benchmarks, and the results are compared to the state-of-the-art word embedding techniques. Our algorithm is very competitive with the state of the art on large corpora, while outperforming it by a significant margin when the training set is limited (i.e., sparse and noisy). With 17 million tokens, WordRank performs almost as well as existing methods using 7.2 billion tokens on a popular word similarity benchmark. Our multi-node distributed implementation of WordRank is publicly available for general usage. Comment: Conference on Empirical Methods in Natural Language Processing (EMNLP), November 1-5, 2016, Austin, Texas, US

    A robust high-sensitivity algorithm for automated detection of proteins in two-dimensional electrophoresis gels

    The automated interpretation of two-dimensional gel electrophoresis images used in protein separation and analysis presents a formidable problem in the detection and characterization of ill-defined spatial objects. We describe in this paper a hierarchical algorithm that provides a robust, high-sensitivity solution to this problem, which can be easily adapted to a variety of experimental situations. The software implementation of this algorithm functions as part of a complete package designed for general protein gel analysis applications.

    Digital Image Access & Retrieval

    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.