12 research outputs found

    Sistema de búsqueda de respuestas sobre DBpedia

    Searching the Web for answers to specific questions is often a difficult and time-consuming task. On the one hand, this is due to the huge and ever-growing amount of available information. On the other hand, current search engines are based on keyword matching: they retrieve the documents in which the query words occur and return long lists of potentially relevant documents, without analyzing, understanding, or exploiting the underlying semantics (meanings and relations) of either queries or documents. Question Answering (QA) systems aim to address this problem by letting the user ask questions in natural language instead of keyword-based queries, and by returning specific answers, already processed and verified, in natural language rather than as a list of documents. Question Answering is a complex, open research problem that has not been solved satisfactorily, especially for syntactically unrestricted questions over multiple domains. Addressing these limitations, this Bachelor Thesis aims to develop a prototype QA system that uses advanced Natural Language Processing tools to process and understand open-domain questions, and that flexibly accesses knowledge bases from the Semantic Web (in particular DBpedia, the structured version of Wikipedia) to automatically obtain the answers to the input questions.

    AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture

    The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research. Furthermore, the value of data increases exponentially when they are properly stored, described, integrated and shared, so that they can be easily utilized in future analyses. AgBioData (https://www.agbiodata.org) is a consortium of people working at agricultural biological databases, data archives and knowledgebases who strive to identify common issues in database development, curation and management, with the goal of creating database products that are more Findable, Accessible, Interoperable and Reusable. We strive to promote authentic, detailed, accurate and explicit communication between all parties involved in scientific data. As a step toward this goal, we present the current state of biocuration, ontologies, metadata and persistence, database platforms, programmatic (machine) access to data, communication and sustainability with regard to data curation. Each section describes challenges and opportunities for these topics, along with recommendations and best practices.

    PerCon: A Personal Digital Library for Heterogeneous Data Management and Analysis

    Systems are needed to support access to and analysis of larger and more heterogeneous scientific datasets. Users need support in the location, organization, analysis, and interpretation of data to support their current activities with appropriate services and tools. We developed PerCon, a data management and analysis environment, to support such use. PerCon processes and integrates data gathered via queries to existing data providers to create a personal or a small group digital library of data. Users may then search, browse, visualize, annotate, and organize the data as they proceed with analysis and interpretation. Analysis and interpretation in PerCon takes place in a visual workspace in which multiple data visualizations and annotations are placed into spatial arrangements based on the current task. The system watches for patterns in the user’s data selection, exploration, and organization, then through mixed-initiative interaction assists users by suggesting potentially relevant data from unexplored data sources. In order to identify relevant data, PerCon builds up various precomputed feature tables of data objects including their metadata (e.g. similarities, distances) and a user interest model to infer the user interest or specific information need. In particular, probabilistic networks in PerCon model user interactions (i.e. event features) and predict the data type of greatest interest through network training. In turn, the most relevant data objects of interest in the inferred data type are identified through a weighted feature computation and then recommended to the user. PerCon’s data location and analysis capabilities were evaluated in a controlled study with 24 users. The study participants were asked to locate and analyze heterogeneous weather and river data with and without the visual workspace and mixed-initiative interaction, respectively. Results indicate that the visual workspace facilitated information representation and aided in the identification of relationships between datasets. The system’s suggestions encouraged data exploration, leading participants to identify more evidence of correlation among data streams and more potential interactions among weather and river data.

    The Scholarly Electronic Publishing Bibliography: 2008 Annual Edition

    The Scholarly Electronic Publishing Bibliography: 2008 Annual Edition presents over 3,350 English-language articles, books, and other printed and electronic sources that are useful in understanding scholarly electronic publishing efforts on the Internet. Most sources have been published from 1990 through 2008; however, a limited number of key sources published prior to 1990 are also included. Where possible, links are provided to works that are freely available on the Internet, including e-prints in disciplinary archives and institutional repositories. It is available under a Creative Commons Attribution-Noncommercial 3.0 United States License.

    Scholarly Electronic Publishing Bibliography 2010

    The Scholarly Electronic Publishing Bibliography 2010 presents over 3,800 selected English-language articles, books, and other textual sources that are useful in understanding scholarly electronic publishing efforts on the Internet. It covers digital copyright, digital libraries, digital preservation, digital rights management, digital repositories, economic issues, electronic books and texts, electronic serials, license agreements, metadata, publisher issues, open access, and other related topics. Most sources have been published from 1990 through 2010. Many references have links to freely available copies of included works. It is available under a Creative Commons Attribution-Noncommercial 3.0 United States License. Cite as: Bailey, Charles W., Jr. Scholarly Electronic Publishing Bibliography 2010. Houston: Digital Scholarship, 2011.