10,933 research outputs found

Automatic extraction of knowledge from web documents

Author: Alani Harith
Hall Wendy
Kim Sanghee
Lewis Paul H.
Millard David E.
Shadbolt Nigel R.
Weal Mark J.
Publication date: 01/01/2003

A large amount of digital information available is written as text documents in the form of web pages, reports, papers, emails, etc. Extracting the knowledge of interest from such documents from multiple sources in a timely fashion is therefore crucial. This paper provides an update on the Artequakt system which uses natural language tools to automatically extract knowledge about artists from multiple documents based on a predefined ontology. The ontology represents the type and form of knowledge to extract. This knowledge is then used to generate tailored biographies. The information extraction process of Artequakt is detailed and evaluated in this paper

CiteSeerX

Southampton (e-Prints Soton)

Open Research Online (The Open University)

Exploiting Image-trained CNN Architectures for Unconstrained Video Classification

Author: Andrews Walter
Luisier Florian
Salakhutdinov Ruslan
Srivastava Nitish
Zha Shengxin
Publication date: 01/01/2015

We conduct an in-depth exploration of different strategies for doing event detection in videos using convolutional neural networks (CNNs) trained for image classification. We study different ways of performing spatial and temporal pooling, feature normalization, choice of CNN layers as well as choice of classifiers. Making judicious choices along these dimensions led to a very significant increase in performance over more naive approaches that have been used till now. We evaluate our approach on the challenging TRECVID MED'14 dataset with two popular CNN architectures pretrained on ImageNet. On this MED'14 dataset, our methods, based entirely on image-trained CNN features, can outperform several state-of-the-art non-CNN models. Our proposed late fusion of CNN- and motion-based features can further increase the mean average precision (mAP) on MED'14 from 34.95% to 38.74%. The fusion approach achieves the state-of-the-art classification performance on the challenging UCF-101 dataset

arXiv.org e-Print Archive

Crossref
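
The pooling, normalization, and late-fusion steps named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of average or max pooling over frames, and the weighted-average fusion rule with a `cnn_weight` parameter are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def temporal_pool(frame_features, mode="average"):
    """Collapse per-frame feature vectors into one video-level vector.

    frame_features: list of equal-length lists, one per frame (hypothetical input).
    """
    columns = list(zip(*frame_features))  # group values by feature dimension
    if mode == "average":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def late_fusion(cnn_scores, motion_scores, cnn_weight=0.5):
    """Combine per-class scores from two models by weighted averaging.

    The weighted average is an assumed fusion rule, chosen only for simplicity.
    """
    return [
        cnn_weight * c + (1.0 - cnn_weight) * m
        for c, m in zip(cnn_scores, motion_scores)
    ]


# Two frames with two feature dimensions each (toy values).
frames = [[1.0, 0.0], [3.0, 4.0]]
video_avg = temporal_pool(frames, mode="average")
video_max = temporal_pool(frames, mode="max")
video_unit = l2_normalize(video_max)

# Per-class scores from a CNN-based and a motion-based classifier (toy values).
fused = late_fusion([0.2, 0.8], [0.6, 0.4], cnn_weight=0.5)
```

Late fusion here means that each model is scored separately and only the final scores are combined, so the two feature types never need to share a feature space.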

A reproducible approach with R markdown to automatic classification of medical certificates in French

Author: Beghini Federica
Di Nunzio Giorgio Maria
Henrot Geneviève
Vezzani Federica
Publication venue: CEUR-WS
Publication date: 01/01/2017

In this paper, we report the ongoing developments of our first participation in the Cross-Language Evaluation Forum (CLEF) eHealth Task 1: “Multilingual Information Extraction - ICD10 coding” (Névéol et al., 2017). The task consists of labelling death certificates in French with international standard codes. In particular, we wanted to accomplish the goal of the ‘Replication track’ of this Task, which promotes the sharing of tools and the dissemination of solid, reproducible results

Archivio istituzionale della ricerca - Università di Padova

Web based knowledge extraction and consolidation for automatic ontology instantiation

Author: Alani Harith
Hall Wendy
Kim Sanghee
Lewis Paul H.
Millard David E.
Shadbolt Nigel
Weal Mark J.
Publication date: 01/01/2003

The Web is probably the largest and richest information repository available today. Search engines are the common access routes to this valuable source. However, the role of these search engines is often limited to the retrieval of lists of potentially relevant documents. The burden of analysing the returned documents and identifying the knowledge of interest is therefore left to the user. The Artequakt system aims to deploy natural language tools to automatically extract and consolidate knowledge from web documents and instantiate a given ontology, which dictates the type and form of knowledge to extract. Artequakt focuses on the domain of artists, and uses the harvested knowledge to generate tailored biographies. This paper describes the latest developments of the system and discusses the problem of knowledge consolidation

CiteSeerX

Southampton (e-Prints Soton)

Open Research Online (The Open University)

Vocal Access to a Newspaper Archive: Design Issues and Preliminary Investigation

Author: Crestani Fabio
Publication date: 10/12/1998

This paper presents the design and the current prototype implementation of an interactive vocal Information Retrieval system that can be used to access articles of a large newspaper archive using a telephone. The results of preliminary investigation into the feasibility of such a system are also presented

arXiv.org e-Print Archive

CiteSeerX

Crossref

University of Strathclyde Institutional Repository

The effects of topic familiarity on user search behavior in question answering systems

Author: Al-Maskari A.
Sanderson M.
Publication date: 01/01/2006

This paper reports on experiments that attempt to characterize the relationship between users and their knowledge of the search topic in a Question Answering (QA) system. It also investigates user search behavior with respect to the length of answers presented by a QA system. Two lengths of answers were compared: snippets (one to two sentences of text) and exact answers. In a user test, 44 participants judged 92 factoid questions, to explore the participants’ preferences, feelings and opinions about QA system tasks. The conclusions drawn from the results were that participants preferred the snippets set and obtained higher accuracy in finding answers from it. However, accuracy varied according to users’ topic familiarity: users were only substantially helped by the wider context of a snippet if they were already familiar with the topic of the question. Without such familiarity, users were about as accurate at locating answers from the snippets as they were from the exact answers set

RMIT Research Repository

White Rose Research Online