An efficient method of indexing for image retrieval from pdf files

Author: Crespo Mariano
Mata Vázquez Jacinto
Maña López Manuel Jesús
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2010

One of the areas currently attracting the most interest among researchers and users of Information Retrieval systems is the retrieval of documents containing images relevant to an information need. In this case, the main objective is not to retrieve the documents relevant to the user's information need, but to obtain the images relevant to that need. At present, document collections can be found in a variety of formats (HTML, XML, PDF, etc.). In this paper we present an efficient method for indexing a collection of documents in PDF format to improve the retrieval of the images contained in them. The experiments we carried out show that the presented method achieves better results than indexing the full text.

This work has been partially funded by the Spanish Ministry of Science and Innovation, the Spanish Government's Plan E, and the European Union through the ERDF (TIN2009-14057-C03-03).

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Arias Montano: Institutional Repository of the University of Huelva

AORESCU: Opinion Analysis in Social Networks and User-Generated Contents

Author: Cruz Mata Fermín
Enríquez de Salamanca Ros Fernando
Maña López Manuel Jesús
Troyano Jiménez José Antonio
Ureña López Luis Alfonso
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2015

The main goals of the AORESCU project are the retrieval and processing of information generated by users about an entity, in order to derive indicators that help assess how users perceive that entity. Two types of information can be retrieved from Web 2.0 sources: structured (e.g. numerical ratings) and unstructured (mainly texts in natural language). The techniques and tools used in the project are adaptable to any domain. We chose the tourism sector as the application domain because it has significant economic activity and because user-generated content about tourism resources is easy to find.
The project has four main parts: the retrieval of information from different sources about the entities in the application domain (for the tourism sector: hotels, restaurants, natural spaces, monuments, ...), the definition of a data model to represent this information, the development of text analysis tools to process user comments, and the development of a web application to query and analyze the processed data.

The AORESCU project (P11-TIC-7684 MO) is funded by the Consejería de Innovación, Ciencia y Empresas of the Junta de Andalucía.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

idUS. Depósito de Investigación Universidad de Sevilla

¿Qué otros factores nos interesa conocer de la función visual del paciente con baja visión?

Author: Antón López Alfonso
Cardona Torradeflot Genís
Pérez Maña Luis
Publication date: 01/03/2018

Peer reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Acceso a la información bilingüe utilizando ontologías específicas del dominio biomédico

Author: Buenaga Rodríguez Manuel de
Carrero García Francisco
Gómez Hidalgo José María
Mata Vázquez Jacinto
Maña López Manuel Jesús
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2007

One of the most promising approaches to Cross-Language Information Retrieval is the use of lexical-semantic resources for concept indexing of documents and queries. We have followed this approach to propose an information access system for healthcare professionals that eases the preparation of clinical cases and the conduct of studies and research. In our proposal, patient documentation (the clinical record), in Spanish, is connected to related scientific information (research papers), in English and Spanish, using high-quality, wide-coverage resources such as the SNOMED ontology. We also describe how the confidentiality of the information is handled.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Arias Montano: Institutional Repository of the University of Huelva

Un Sistema de Recuperación de Información Biomédica en Dispositivos Móviles basado en Agrupamiento

Author: Maña López Manuel Jesús
Millán Manuel
Muñoz Alejandro
Villa Cordero Manuel de la
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2010

Information overload caused by the increasing availability of online texts and publications of interest is a problem that worsens when such information is needed for decision making, as in the biomedical field. It is in this domain that we present an information retrieval system for mobile devices, which adds to the traditional indexing and search processes the ability to return results grouped into clusters according to their content.

This work has been partially funded by the Spanish Ministry of Science and Innovation and the European Union through the ERDF (TIN2009-14057-C03-03).

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Rule extraction from medical data without discretization of numerical attributes

Author: Domínguez Olmedo Juan Luis
Mata Vázquez Jacinto
Maña López Manuel Jesús
Pachón Álvarez Victoria
Publication venue: Scitepress
Publication date: 01/01/2012

Association rule mining is a popular technique used to find associations between attributes in a dataset. When deterministic algorithms are used and the attributes have numerical values, the usual approach is to discretize them by defining suitable intervals. However, the discretization can notably affect the quality of the rules generated. This work presents a method based on a deterministic exploration of the interval search space without prior discretization of the numerical attributes. It has been applied to medical data from an atherosclerosis study. The quality of the obtained rules seems to support this method as a valid alternative for this kind of rule extraction.
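As a rough illustration of what a rule over a numerical interval looks like, the sketch below computes the support and confidence of a rule of the form "low <= value <= high implies label == target" directly on raw values, with no discretization step beforehand. It is not the method from the paper, whose search strategy is not described in the abstract; the function name, the attribute names, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def interval_rule_metrics(
    values: Sequence[float],
    labels: Sequence[str],
    low: float,
    high: float,
    target: str,
) -> Tuple[float, float]:
    """Support and confidence of: low <= value <= high  =>  label == target."""
    if len(values) == 0 or len(values) != len(labels):
        raise ValueError("values and labels must be non-empty and equally long")
    # Records whose numeric attribute falls inside the candidate interval.
    covered = [lab for v, lab in zip(values, labels) if low <= v <= high]
    # Covered records that also satisfy the consequent.
    hits = sum(1 for lab in covered if lab == target)
    support = hits / len(values)
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


# Hypothetical toy data, not taken from the atherosclerosis study.
cholesterol = [180, 195, 210, 240, 260, 275, 290, 300]
risk = ["low", "low", "low", "high", "low", "high", "high", "high"]

support, confidence = interval_rule_metrics(cholesterol, risk, 240, 300, "high")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

A search over intervals would call a function like this for many candidate `(low, high)` pairs and keep the rules that meet minimum support and confidence thresholds.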

Arias Montano: Institutional Repository of the University of Huelva

Interactive information retrieval

Author: Allan
Barry
Bates
Beaulieu
Beaulieu
Belkin
Belkin
Bhavnani
Blair
Borgman
Borgman
Brajnik
Broder
Buyukkokten
Byström
Campbell
Case
Chen
Cove
Crestani
Crouch
Downie
Dumais
Eastman
Efthimiadis
Ellis
Ellis
Fidel
Ford
Ford
Foster
Fox
Hansen
Harper
Hearst
Hearst
Hearst
Heinström
Hill
Ingwersen
Ingwersen
Jansen
Jansen
Jones
Jones
Kang
Kelly
Kelly
Kim
Konstan
Kruschwitz
Kuhlthau
Legg
Lin
Lin
Lorigo
Lynch
López-Ostenero
Maña-López
Niemi
Norman
Over
Pirkola
Pu
Radev
Reid
Reid
Riedl
Rieh
Robertson
Rosenfeld
Roussinov
Ruthven
Ruthven
Savolainen
Shipman
Shneiderman
Sihvonen
Slone
Smeaton
Spink
Spink
Spink
Spink
Spink
Spink
Spärck Jones
Spärck Jones
Sweeney
Tombros
Tombros
Toms
Topi
Topi
Vakkari
Vakkari
Vakkari
Vakkari
van der Eijk
Vechtomova
Voorhees
White
White
White
White
Wiesman
Wu
Xie
Publication venue: Wiley
Publication date: 01/11/2008

Crossref

University of Strathclyde Institutional Repository

Overview of BioCreative II gene mention recognition

Author: Adriaans P.
Baumgartner (jr.) W.A.
Blaschke C.
Carpenter B.
Chen Y.
Chung I-F.
Dai H.-J.
Divoli A.
Friedrich C.M.
Ganchev K.
Haddow B.
Hsu C.-N.
Hunter L.
Johnson R.
Katrenko S.
Klinger R.
Kuo C.-J.
Lin Y.-S.
Liu F.
Liu H.
Mata J.
Maña-López M.
Nakov P.
Neves M.
Povinelli R.J.
Smith L.
Struble C.A.
Sun C.
Tanabe L.K.
Torii M.
Torres R.
Tsai R.T.-H.
Vlachos A.
Wilbur W.J.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

International Migration, Integration and Social Cohesion online publications

Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task, participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of methods were used, and the results varied, with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is feasible, and furthermore that the best result makes use of the lowest scoring submissions.
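For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from counts of true positives, false positives, and false negatives. It is a generic illustration only: the counts are made up, and the workshop's own scoring rules for matching gene mentions are not reproduced here.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw match counts."""
    if tp == 0:
        # No correct mentions: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # fraction of predicted mentions that are correct
    recall = tp / (tp + fn)     # fraction of gold mentions that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from the BioCreative II results.
score = f1_score(tp=80, fp=20, fn=20)
print(f"F1 = {score:.4f}")
```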

epublications@Marquette

Fraunhofer-ePrints

PubMed Central

Edinburgh Research Explorer

Publications at Bielefeld University

Apollo (Cambridge)

White Rose Research Online

UvA-DARE

Setting a baseline for an automatic extractive concepts-based summarization on the biomedical domain

Author: Maña López Manuel Jesús
Villa Cordero Manuel de la
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2009

Summarization methods based on extractive techniques have proved very useful because of their adaptability and fast response time in any kind of domain. In the biomedical domain, numerous studies report information overload and the need for efficient retrieval and summarization techniques to support the proper practice of evidence-based medicine. In this context we present a proposed methodology for automatic summarization based on structured knowledge and graphs. Starting from a graph representation of the source document, and applying similarity measures between sentences and the biomedical concepts they contain, the most relevant sentences are selected to form the final summary.

This work has been funded by the Spanish Ministry of Science and Innovation through the CICYT projects TIN2007-67843-C06-03 and TIN2005-08998-C02-02.
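The general idea of ranking sentences in a similarity graph can be sketched as follows. This is not the methodology from the paper, which the abstract does not specify in detail. It assumes each sentence has already been mapped to a set of concept identifiers, uses Jaccard overlap as the similarity measure, and scores each sentence by its weighted degree in the graph. The function names, the threshold, and the toy concept sets are all hypothetical.

```python
from itertools import combinations
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap between two concept sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_sentences(concepts: List[Set[str]], threshold: float = 0.2) -> List[int]:
    """Return sentence indices ordered by weighted degree in the similarity graph."""
    scores = [0.0] * len(concepts)
    for i, j in combinations(range(len(concepts)), 2):
        sim = jaccard(concepts[i], concepts[j])
        if sim >= threshold:
            # An edge between i and j contributes its weight to both endpoints.
            scores[i] += sim
            scores[j] += sim
    return sorted(range(len(concepts)), key=lambda k: scores[k], reverse=True)


# Hypothetical concept sets, one per sentence of a source document.
sentences = [
    {"hypertension", "stroke"},
    {"hypertension", "stroke", "aspirin"},
    {"aspirin", "bleeding"},
    {"weather"},
]

ranking = rank_sentences(sentences)
print(ranking)  # the top-ranked indices would form the extractive summary
```

Taking the first few indices of `ranking` and restoring document order gives a simple extractive summary; a real system would obtain the concept sets from a biomedical knowledge resource rather than by hand.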

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Arias Montano: Institutional Repository of the University of Huelva