726 research outputs found

CLEF 2005: Ad Hoc track overview

Author: Di Nunzio Giorgio Maria
Ferro Nicola
Jones Gareth J.F.
Peters Carol
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

We describe the objectives and organization of the CLEF 2005 ad hoc track and discuss the main characteristics of the tasks offered to test monolingual, bilingual and multilingual textual document retrieval. The performance achieved for each task is presented and a preliminary analysis of results is given. The paper focuses in particular on the multilingual tasks which reused the test collection created in CLEF 2003 in an attempt to see if an improvement in system performance over time could be measured, and also to examine the multilingual results merging problem.

CiteSeerX

DCU Online Research Access Service

Archivio istituzionale della ricerca - Università di Padova

Principles of content analysis for information retrieval systems: an overview

Author: Krause Jürgen
Publication venue: Mannheim
Publication date: 01/01/1995

"Unquestionably, the content analysis which has emerged as part of Information Retrieval Systems (IRS, e.g. literature databases) over the past 20 years has much in common with the content analysis used by linguists or in the social sciences. However, its intrinsic value stems from the special context in which it is used: a) Close interdependencies link the selected content analysis with the retrieval situation. The user’s retrieval strategies, which are intended to obtain information relevant to the current problem situation, and the available aids (e.g. expansion lists or user-friendly browsing tools) affect the efficacy of some analysis techniques (e.g. noun phrase analysis from computer linguistics) to a considerable extent. b) Normally, a commercial IRS handles mass data, thus necessitating the use of a reduced content analysis even today. Full morphological, syntactic, semantic and pragmatic text analyses are unthinkable simply for efficiency reasons but also for knowledge reasons. Content analysis in IRS is therefore a component part of a special type of restricted system which obeys its own laws. Against the backdrop of these considerations, forms of content analysis in present-day commercial retrieval systems are studied and promising expansions and alternatives are proposed." (author's abstract

SSOAR - Social Science Open Access Repository

DeepKAF: A Heterogeneous CBR & Deep Learning Approach for NLP Prototyping

Author: Althoff Klaus-Dieter
Amin Kareem
Dengel Andreas
Kapetanakis Stelios
Polatidis Nikolaos
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/09/2020

University of Brighton Research Portal

Extracting knowledge from web communities and linked data for case-based reasoning systems

Author: Aamodt
Bergmann
Bergmann
Bizer
Boyd
Bridge
Church
Fayyad
Gennari
Recio-García
Richter
Roth-Berghofer
Sauer
Sauer
Publication venue: 'Wiley'
Publication date: 10/11/2013

Web communities and the Web 2.0 provide a huge amount of experiences and there has been a growing availability of Linked Open Data. Making experiences and data available as knowledge to be used in case-based reasoning (CBR) systems is a current research effort. Extracting such knowledge from the diverse data types used in web communities, transforming data obtained from Linked Data sources, and then formalising it for CBR is not an easy task. In this paper, we present a prototype, the Knowledge Extraction Workbench (KEWo), which supports the knowledge engineer in this task. We integrated the KEWo into the open-source case-based reasoning tool myCBR Workbench. We provide details on the abilities of the KEWo to extract vocabularies from Linked Data sources and generate taxonomies from Linked Data as well as from web community data in the form of semi-structured texts.

Crossref

UWL Repository

Relevance distributions across Bradford Zones: Can Bradfordizing improve search?

Author: Mayr Philipp
Publication date: 01/01/2013

The purpose of this paper is to describe the evaluation of the effectiveness of the bibliometric technique Bradfordizing in an information retrieval (IR) scenario. Bradfordizing is used to re-rank topical document sets from conventional abstracting & indexing (A&I) databases into core and more peripheral document zones. Bradfordized lists of journal articles and monographs will be tested in a controlled scenario consisting of different A&I databases from social and political sciences, economics, psychology and medical science, 164 standardized IR topics and intellectual assessments of the listed documents. Does Bradfordizing improve the ratio of relevant documents in the first third (core) compared to the second and last third (zone 2 and zone 3, respectively)? The IR tests show that relevance distributions after re-ranking improve at a significant level if documents in the core are compared with documents in the succeeding zones. After Bradfordizing of document pools, the core has a significantly better average precision than zone 2, zone 3 and baseline. This paper should be seen as an argument in favour of alternative non-textual (bibliometric) re-ranking methods which can be simply applied in text-based retrieval systems and in particular in A&I databases. Comment: 11 pages, 2 figures, Preprint of a full paper @ 14th International Society of Scientometrics and Informetrics Conference (ISSI 2013)
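
The re-ranking idea described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name bradfordize, the (doc_id, journal) input format, and the rule of cutting the re-ranked list into three equal-sized thirds by position are all assumptions made for illustration.

```python
from collections import Counter


def bradfordize(docs):
    """Re-rank a result set so documents from the most productive journals come first.

    docs: list of (doc_id, journal) pairs in their original retrieval order
          (hypothetical input format, assumed for this sketch).
    Returns a list of (doc_id, journal, zone) with zone 1 = core, 2 and 3 = periphery.
    """
    # Count how many documents each journal contributes to this result set.
    counts = Counter(journal for _, journal in docs)

    # Stable sort: journals with more documents first; ties keep the original order.
    ranked = sorted(docs, key=lambda d: -counts[d[1]])

    # Assumption: zones are simply the first, second and last third of the list.
    n = len(ranked)
    result = []
    for i, (doc_id, journal) in enumerate(ranked):
        zone = min(3, 1 + (3 * i) // n)
        result.append((doc_id, journal, zone))
    return result


if __name__ == "__main__":
    sample = [("d1", "A"), ("d2", "B"), ("d3", "A"),
              ("d4", "C"), ("d5", "A"), ("d6", "B")]
    for row in bradfordize(sample):
        print(row)
```

Cutting by position can split one journal across two zones. A stricter variant would assign whole journals to zones; the sketch keeps the simpler rule so the thirds are easy to compare.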

arXiv.org e-Print Archive

SSOAR - Social Science Open Access Repository

Formation of microtubule-based traps controls the sorting and concentration of vesicles to restricted sites of regenerating neurons after axotomy

Author: De Zeeuw Chris I.
Erez Hadas
Hoogenraad Casper C.
Malkinson Guy
Prager-Khoutorsky Masha
Spira Micha E.
Publication venue: The Rockefeller University Press
Publication date: 01/01/2007

Transformation of a transected axonal tip into a growth cone (GC) is a critical step in the cascade leading to neuronal regeneration. Critical to the regrowth is the supply and concentration of vesicles at restricted sites along the cut axon. The mechanisms underlying these processes are largely unknown. Using online confocal imaging of transected, cultured Aplysia californica neurons, we report that axotomy leads to reorientation of the microtubule (MT) polarities and formation of two distinct MT-based vesicle traps at the cut axonal end. Approximately 100 μm proximal to the cut end, a selective trap for anterogradely transported vesicles is formed, which is the plus end trap. Distally, a minus end trap is formed that exclusively captures retrogradely transported vesicles. The concentration of anterogradely transported vesicles in the former trap optimizes the formation of a GC after axotomy.

Crossref

PubMed Central

EUR Research Repository

Deriving case base vocabulary from web community data

Author: Althoff Klaus-Dieter
Bach Kerstin
Sauer Christian
Publication date: 01/07/2010

This paper presents an approach for knowledge extraction for Case-Based Reasoning systems. The recent development of the WWW, especially the Web 2.0, shows that many successful applications are web based. Moreover, the Web 2.0 offers many experiences and our approach uses those experiences to fill the knowledge containers. We are especially focusing on vocabulary knowledge and are using forum posts to create domain-dependent taxonomies that can be directly used in Case-Based Reasoning systems. This paper introduces the applied knowledge extraction process based on the KDD process and explains its application on a web forum for travelers.

UWL Repository

Time series classification with ensembles of elastic distance measures

Author: A Stefan
Anthony Bagnall
H Deng
J Demšar
J Lin
J Lin
J Rodriguez
J Tanner
Jason Lines
L Breiman
M Baydogan
M Hall
PF Marteau
T Górecki
Y Jeong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/05/2015

Several alternative distance measures for comparing time series have recently been proposed and evaluated on time series classification (TSC) problems. These include variants of dynamic time warping (DTW), such as weighted and derivative DTW, and edit distance-based measures, including longest common subsequence, edit distance with real penalty, time warp with edit, and move–split–merge. These measures have the common characteristic that they operate in the time domain and compensate for potential localised misalignment through some elastic adjustment. Our aim is to experimentally test two hypotheses related to these distance measures. Firstly, we test whether there is any significant difference in accuracy for TSC problems between nearest neighbour classifiers using these distance measures. Secondly, we test whether combining these elastic distance measures through simple ensemble schemes gives significantly better accuracy. We test these hypotheses by carrying out one of the largest experimental studies ever conducted into time series classification. Our first key finding is that there is no significant difference between the elastic distance measures in terms of classification accuracy on our data sets. Our second finding, and the major contribution of this work, is to define an ensemble classifier that significantly outperforms the individual classifiers. We also demonstrate that the ensemble is more accurate than approaches not based in the time domain. Nearly all TSC papers in the data mining literature cite DTW (with warping window set through cross validation) as the benchmark for comparison. We believe that our ensemble is the first ever classifier to significantly outperform DTW and as such raises the bar for future work in this area.
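
The benchmark this abstract refers to, a nearest neighbour classifier using DTW with a warping window, can be sketched as follows. This shows only the single-measure baseline, not the authors' ensemble. The function names dtw_distance and nn_classify, the squared pointwise cost, and the (series, label) training format are assumptions made for illustration.

```python
import math


def dtw_distance(a, b, window=None):
    """Dynamic time warping distance between two numeric sequences.

    window: optional warping-window half-width (Sakoe-Chiba style band);
            None means no constraint. Pointwise cost is the squared difference
            (an assumption of this sketch).
    """
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide, otherwise no warping path exists.
    w = max(n, m) if window is None else max(window, abs(n - m))

    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats a match with b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats a match with a[i-1]
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


def nn_classify(query, train, window=None):
    """Return the label of the training series closest to query under DTW.

    train: list of (series, label) pairs (hypothetical format).
    """
    return min(train, key=lambda t: dtw_distance(query, t[0], window))[1]


if __name__ == "__main__":
    train = [([1, 2, 2, 3], "up"), ([3, 2, 1], "down")]
    print(nn_classify([1, 2, 3], train))
```

In the setting the abstract describes, the window would be chosen by cross validation on the training data; here it is just a parameter.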

Crossref

University of East Anglia digital repository

Opinion Holder and Target Extraction on Opinion Compounds – A Linguistic Approach

Author: Bocionek Christine
Ruppenhofer Josef
Wiegand Michael
Publication venue: San Diego (California) : Association for Computational Linguistics
Publication date: 01/01/2016

We present an approach to the new task of opinion holder and target extraction on opinion compounds. Opinion compounds (e.g. user rating or victim support) are noun compounds whose head is an opinion noun. We do not only examine features known to be effective for noun compound analysis, such as paraphrases and semantic classes of heads and modifiers, but also propose novel features tailored to this new task. Among them, we examine paraphrases that jointly consider holders and targets, a verb detour in which noun heads are replaced by related verbs, a global head constraint allowing inferencing between different compounds, and the categorization of the sentiment view that the head conveys.

Crossref

Publikationsserver des Instituts für Deutsche Sprache