45,317 research outputs found

    Can Automatic Abstracting Improve on Current Extracting Techniques in Aiding Users to Judge the Relevance of Pages in Search Engine Results?

    No full text
    Current search engines use sentence extraction techniques to produce snippet result summaries, which users may find less than ideal for determining the relevance of pages. Unlike extracting, abstracting programs analyse the context of documents and rewrite them into informative summaries. Our project aims to produce abstracting summaries which are coherent and easy to read thereby lessening users’ time in judging the relevance of pages. However, automatic abstracting technique has its domain restriction. For solving this problem we propose to employ text classification techniques. We propose a new approach to initially classify whole web documents into sixteen top level ODP categories by using machine learning and a Bayesian classifier. We then manually create sixteen templates for each category. The summarisation techniques we use include a natural language processing techniques to weight words and analyse lexical chains to identify salient phrases and place them into relevant template slots to produce summaries

    Powellsnakes II: a fast Bayesian approach to discrete object detection in multi-frequency astronomical data sets

    Get PDF
    Powellsnakes is a Bayesian algorithm for detecting compact objects embedded in a diffuse background, and was selected and successfully employed by the Planck consortium in the production of its first public deliverable: the Early Release Compact Source Catalogue (ERCSC). We present the critical foundations and main directions of further development of PwS, which extend it in terms of formal correctness and the optimal use of all the available information in a consistent unified framework, where no distinction is made between point sources (unresolved objects), SZ clusters, single or multi-channel detection. An emphasis is placed on the necessity of a multi-frequency, multi-model detection algorithm in order to achieve optimality

    People on Drugs: Credibility of User Statements in Health Communities

    Full text link
    Online health communities are a valuable source of information for patients and physicians. However, such user-generated resources are often plagued by inaccuracies and misinformation. In this work we propose a method for automatically establishing the credibility of user-generated medical statements and the trustworthiness of their authors by exploiting linguistic cues and distant supervision from expert sources. To this end we introduce a probabilistic graphical model that jointly learns user trustworthiness, statement credibility, and language objectivity. We apply this methodology to the task of extracting rare or unknown side-effects of medical drugs --- this being one of the problems where large scale non-expert data has the potential to complement expert medical knowledge. We show that our method can reliably extract side-effects and filter out false statements, while identifying trustworthy users that are likely to contribute valuable medical information

    Improving the translation environment for professional translators

    Get PDF
    When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view, as well as from a purely technological side. This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project
    • …
    corecore