Can Automatic Abstracting Improve on Current Extracting Techniques in Aiding Users to Judge the Relevance of Pages in Search Engine Results?

Author: Liang SF
Publication date: 01/01/2004

Current search engines use sentence extraction techniques to produce snippet result summaries, which users may find less than ideal for determining the relevance of pages. Unlike extracting programs, abstracting programs analyse the context of documents and rewrite them into informative summaries. Our project aims to produce abstracting summaries that are coherent and easy to read, thereby reducing the time users spend judging the relevance of pages. However, automatic abstracting techniques are restricted to particular domains. To solve this problem, we propose to employ text classification techniques. We propose a new approach that first classifies whole web documents into sixteen top-level ODP categories using machine learning and a Bayesian classifier. We then manually create sixteen templates for each category. The summarisation techniques we use include natural language processing techniques to weight words and analyse lexical chains in order to identify salient phrases and place them into relevant template slots to produce summaries.
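
The classification step described in this abstract can be illustrated with a minimal sketch of a multinomial naive Bayes text classifier. This is not the author's implementation: the abstract only says "a Bayesian classifier", so the multinomial model, the add-one smoothing, the whitespace tokenizer, the class name NaiveBayesClassifier, and the toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (assumed; a real system would do more)."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Hypothetical multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # training documents per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()                  # all words seen in training

    def train(self, text, category):
        """Record one labelled training document."""
        self.doc_counts[category] += 1
        for word in tokenize(text):
            self.word_counts[category][word] += 1
            self.vocabulary.add(word)

    def classify(self, text):
        """Return the category with the highest posterior log-probability."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        words = tokenize(text)
        best_category, best_score = None, float("-inf")
        for category, n_docs in self.doc_counts.items():
            # Start from the log prior P(category).
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[category].values())
            for word in words:
                # Add the smoothed log likelihood P(word | category).
                count = self.word_counts[category][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Toy training data (invented for this sketch, not taken from the paper).
classifier = NaiveBayesClassifier()
classifier.train("football match goal league team", "Sports")
classifier.train("doctor medicine symptoms treatment clinic", "Health")
print(classifier.classify("the team scored a goal"))
```

In a pipeline like the one the abstract outlines, the predicted category would then select which manually written template the summariser fills in.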

Southampton (e-Prints Soton)

Powellsnakes II: a fast Bayesian approach to discrete object detection in multi-frequency astronomical data sets

Author: Pedro Carvalho
Graça Rocha
M. P. Hobson
A. Lasenby
Publication venue: Wiley
Publication date: 20/12/2011

Powellsnakes (PwS) is a Bayesian algorithm for detecting compact objects embedded in a diffuse background. It was selected and successfully employed by the Planck consortium in the production of its first public deliverable, the Early Release Compact Source Catalogue (ERCSC). We present the critical foundations and main directions of further development of PwS, which extend it in terms of formal correctness and the optimal use of all the available information in a consistent unified framework, where no distinction is made between point sources (unresolved objects) and SZ clusters, or between single-channel and multi-channel detection. Emphasis is placed on the necessity of a multi-frequency, multi-model detection algorithm in order to achieve optimality.

arXiv.org e-Print Archive

Crossref

People on Drugs: Credibility of User Statements in Health Communities

Author: Danescu-Niculescu-Mizil Cristian
Mukherjee Subhabrata
Weikum Gerhard
Publication date: 06/05/2017

Online health communities are a valuable source of information for patients and physicians. However, such user-generated resources are often plagued by inaccuracies and misinformation. In this work we propose a method for automatically establishing the credibility of user-generated medical statements and the trustworthiness of their authors by exploiting linguistic cues and distant supervision from expert sources. To this end we introduce a probabilistic graphical model that jointly learns user trustworthiness, statement credibility, and language objectivity. We apply this methodology to the task of extracting rare or unknown side-effects of medical drugs, one of the problems where large-scale non-expert data has the potential to complement expert medical knowledge. We show that our method can reliably extract side-effects and filter out false statements, while identifying trustworthy users that are likely to contribute valuable medical information.

arXiv.org e-Print Archive

MPG.PuRe

Improving the translation environment for professional translators

Author: Augustinus Liesbeth
Bulté Bram
Buysschaert Joost
Coppers Sven
Daems Joke
Heyman Geert
Hoste Veronique
Lefever Els
Luyten Kris
Macken Lieve
Moens Marie-Francine
Pelemans Joris
Rigouts Terryn Ayla
Steurs Frieda
Tezcan Arda
Van den Bergh Jan
van der Lek-Ciudin Iulianna
Van Eynde Frank
Vanallemeersch Tom
Vandeghinste Vincent
Verwimp Lyan
Wambacq Patrick
Publication venue: MDPI AG
Publication date: 01/01/2019

When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological side. This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
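
For readers unfamiliar with fuzzy matching in a translation memory, the sketch below shows a common baseline: score each stored source segment by word-level edit distance against the new sentence and return the closest one above a threshold. It is not the improved matching method developed in SCATE, which the abstract does not specify. The function names, the 0.7 threshold, and the toy English-Dutch memory are assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_score(source, candidate):
    """Word-level similarity in [0, 1]; 1.0 means identical token sequences."""
    a, b = source.lower().split(), candidate.lower().split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def best_match(source, memory, threshold=0.7):
    """Return (score, stored_source, translation) for the closest entry,
    or None if the memory is empty or nothing reaches the threshold."""
    if not memory:
        return None
    scored = [(fuzzy_score(source, src), src, tgt) for src, tgt in memory]
    best = max(scored, key=lambda entry: entry[0])
    return best if best[0] >= threshold else None


# Toy translation memory (invented for this sketch).
memory = [
    ("press the green button", "druk op de groene knop"),
    ("close the window", "sluit het venster"),
]
print(best_match("press the red button", memory))
```

A translator would be shown the retrieved translation together with its score, and would edit only the part that differs from the new sentence.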

Multidisciplinary Digital Publishing Institute

Ghent University Academic Bibliography