80 research outputs found

    Citation Recommendation: Approaches and Datasets

    Get PDF
    Citation recommendation describes the task of recommending citations for a given text. Due to the overload of published scientific works in recent years on the one hand, and the need to cite the most appropriate publications when writing scientific texts on the other hand, citation recommendation has emerged as an important research topic. In recent years, several approaches and evaluation data sets have been presented. However, to the best of our knowledge, no literature survey has been conducted explicitly on citation recommendation. In this article, we give a thorough introduction into automatic citation recommendation research. We then present an overview of the approaches and data sets for citation recommendation and identify differences and commonalities using various dimensions. Last but not least, we shed light on the evaluation methods, and outline general challenges in the evaluation and how to meet them. We restrict ourselves to citation recommendation for scientific publications, as this document type has been studied the most in this area. However, many of the observations and discussions included in this survey are also applicable to other types of text, such as news articles and encyclopedic articles.Comment: to be published in the International Journal on Digital Librarie

    Music information retrieval: conceptuel framework, annotation and user behaviour

    Get PDF
    Understanding music is a process both based on and influenced by the knowledge and experience of the listener. Although content-based music retrieval has been given increasing attention in recent years, much of the research still focuses on bottom-up retrieval techniques. In order to make a music information retrieval system appealing and useful to the user, more effort should be spent on constructing systems that both operate directly on the encoding of the physical energy of music and are flexible with respect to users’ experiences. This thesis is based on a user-centred approach, taking into account the mutual relationship between music as an acoustic phenomenon and as an expressive phenomenon. The issues it addresses are: the lack of a conceptual framework, the shortage of annotated musical audio databases, the lack of understanding of the behaviour of system users and shortage of user-dependent knowledge with respect to high-level features of music. In the theoretical part of this thesis, a conceptual framework for content-based music information retrieval is defined. The proposed conceptual framework - the first of its kind - is conceived as a coordinating structure between the automatic description of low-level music content, and the description of high-level content by the system users. A general framework for the manual annotation of musical audio is outlined as well. A new methodology for the manual annotation of musical audio is introduced and tested in case studies. The results from these studies show that manually annotated music files can be of great help in the development of accurate analysis tools for music information retrieval. Empirical investigation is the foundation on which the aforementioned theoretical framework is built. Two elaborate studies involving different experimental issues are presented. In the first study, elements of signification related to spontaneous user behaviour are clarified. In the second study, a global profile of music information retrieval system users is given and their description of high-level content is discussed. This study has uncovered relationships between the users’ demographical background and their perception of expressive and structural features of music. Such a multi-level approach is exceptional as it included a large sample of the population of real users of interactive music systems. Tests have shown that the findings of this study are representative of the targeted population. Finally, the multi-purpose material provided by the theoretical background and the results from empirical investigations are put into practice in three music information retrieval applications: a prototype of a user interface based on a taxonomy, an annotated database of experimental findings and a prototype semantic user recommender system. Results are presented and discussed for all methods used. They show that, if reliably generated, the use of knowledge on users can significantly improve the quality of music content analysis. This thesis demonstrates that an informed knowledge of human approaches to music information retrieval provides valuable insights, which may be of particular assistance in the development of user-friendly, content-based access to digital music collections

    Recent Application in Biometrics

    Get PDF
    In the recent years, a number of recognition and authentication systems based on biometric measurements have been proposed. Algorithms and sensors have been developed to acquire and process many different biometric traits. Moreover, the biometric technology is being used in novel ways, with potential commercial and practical implications to our daily activities. The key objective of the book is to provide a collection of comprehensive references on some recent theoretical development as well as novel applications in biometrics. The topics covered in this book reflect well both aspects of development. They include biometric sample quality, privacy preserving and cancellable biometrics, contactless biometrics, novel and unconventional biometrics, and the technical challenges in implementing the technology in portable devices. The book consists of 15 chapters. It is divided into four sections, namely, biometric applications on mobile platforms, cancelable biometrics, biometric encryption, and other applications. The book was reviewed by editors Dr. Jucheng Yang and Dr. Norman Poh. We deeply appreciate the efforts of our guest editors: Dr. Girija Chetty, Dr. Loris Nanni, Dr. Jianjiang Feng, Dr. Dongsun Park and Dr. Sook Yoon, as well as a number of anonymous reviewers

    Modeling and querying spatio-temporal clinical databases with multiple granularities

    Get PDF
    In molti campi di ricerca, i ricercatori hanno la necessit\ue0 di memorizzare, gestire e interrogare dati spazio-temporali. Tali dati sono classici dati alfanumerici arricchiti per\uf2 con una o pi\uf9 componenti temporali, spaziali e spazio-temporali che, con diversi possibili significati, li localizzano nel tempo e/o nello spazio. Ambiti in cui tali dati spazio-temporali devono essere raccolti e gestiti sono, per esempio, la gestione del territorio o delle risorse naturali, l'epidemiologia, l'archeologia e la geografia. Pi\uf9 in dettaglio, per esempio nelle ricerche epidemiologiche, i dati spazio-temporali possono servire a rappresentare diversi aspetti delle malattie e delle loro caratteristiche, quali per esempio la loro origine, espansione ed evoluzione e i fattori di rischio potenzialmente connessi alle malattie e al loro sviluppo. Le componenti spazio-temporali dei dati possono essere considerate come dei "meta-dati" che possono essere sfruttati per introdurre nuovi tipi di analisi sui dati stessi. La gestione di questi "meta-dati" pu\uf2 avvenire all'interno di diversi framework proposti in letteratura. Uno dei concetti proposti a tal fine \ue8 quello delle granularit\ue0. In letteratura c'\ue8 ampio consenso sul concetto di granularit\ue0 temporale, di cui esistono framework basati su diversi approcci. D'altro canto, non esiste invece un consenso generale sulla definizione di un framework completo, come quello delle granularit\ue0 temporali, per le granularit\ue0 spaziali e spazio-temporali. Questa tesi ha lo scopo di riempire questo vuoto proponendo un framework per le granularit\ue0 spaziali e, basandosi su questo e su quello gi\ue0 presente in letteratura per le granularit\ue0 temporali, un framework per le granularit\ue0 spazio-temporali. I framework proposti vogliono essere completi, per questo, oltre alle definizioni dei concetti di granularit\ue0 spaziale e spazio-temporale, includono anche la definizione di diversi concetti legati alle granularit\ue0, quali per esempio le relazioni e le operazioni tra granularit\ue0. Le relazioni permettono di conoscere come granularit\ue0 diverse sono legate tra loro, costruendone anche una gerarchia. Tali informazioni sono poi utili al fine di conoscere se e come \ue8 possibile confrontare dati associati e rappresentati con granularit\ue0 diverse. Le operazioni permettono invece di creare nuove granularit\ue0 a partire da altre granularit\ue0 gi\ue0 definite nel sistema, manipolando o selezionando alcune loro componenti. Basandosi su questi framework, l'obiettivo della tesi si sposta poi sul mostrare come le granularit\ue0 possano essere utilizzate per arricchire basi di dati spazio-temporali gi\ue0 esistenti al fine di una loro migliore e pi\uf9 ricca gestione e interrogazione. A tal fine, proponiamo qui una base di dati per la gestione dei dati riguardanti le granularit\ue0 temporali, spaziali e spazio-temporali. Nella base di dati proposta possono essere rappresentate tutte le componenti di una granularit\ue0 come definito nei framework proposti. La base di dati pu\uf2 poi essere utilizzata per estendere una base di dati spazio-temporale esistente aggiungendo alle tuple di quest'ultima delle referenze alle granularit\ue0 dove quei dati possono essere localizzati nel tempo e/o nel spazio. Per dimostrare come ci\uf2 possa essere fatto, nella tesi introduciamo la base di dati sviluppata ed utilizzata dal Servizio Psichiatrico Territoriale (SPT) di Verona. Tale base di dati memorizza le informazioni su tutti i pazienti venuti in contatto con l'SPT negli ultimi 30 anni e tutte le informazioni sui loro contatti con il servizio stesso (per esempio: chiamate telefoniche, visite a domicilio, ricoveri). Parte di tali informazioni hanno una componente spazio-temporale e possono essere quindi analizzate studiandone trend e pattern nel tempo e nello spazio. Nella tesi quindi estendiamo questa base di dati psichiatrica collegandola a quella proposta per la gestione delle granularit\ue0. A questo punto i dati psichiatrici possono essere interrogati anche sulla base di vincoli spazio-temporali basati su granularit\ue0. L'interrogazione di dati spazio-temporali associati a granularit\ue0 richiede l'utilizzo di un linguaggio d'interrogazione che includa, oltre a strutture, operatori e funzioni spazio-temporali per la gestione delle componenti spazio-temporali dei dati, anche costrutti per l'utilizzo delle granularit\ue0 nelle interrogazioni. Quindi, partendo da un linguaggio d'interrogazione spazio-temporale gi\ue0 presente in letteratura, in questa tesi proponiamo anche un linguaggio d'interrogazione che permetta ad un utente di recuperare dati da una base di dati spazio-temporale anche sulla base di vincoli basati su granularit\ue0. Il linguaggio viene introdotto fornendone la sintassi e la semantica. Inoltre per mostrare l'effettivo ruolo delle granularit\ue0 nell'interrogazione di una base di dati clinica, mostreremo diversi esempi di interrogazioni, scritte con il linguaggio d'interrogazione proposto, sulla base di dati psichiatrica dell'SPT di Verona. Tali interrogazioni spazio-temporali basate su granularit\ue0 possono essere utili ai ricercatori ai fini di analisi epidemiologiche dei dati psichiatrici.In several research fields, temporal, spatial, and spatio-temporal data have to be managed and queried with several purposes. These data are usually composed by classical data enriched with a temporal and/or a spatial qualification. For instance, in epidemiology spatio-temporal data may represent surveillance data, origins of disease and outbreaks, and risk factors. In order to better exploit the time and spatial dimensions, spatio-temporal data could be managed considering their spatio-temporal dimensions as meta-data useful to retrieve information. One way to manage spatio-temporal dimensions is by using spatio-temporal granularities. This dissertation aims to show how this is possible, in particular for epidemiological spatio-temporal data. For this purpose, in this thesis we propose a framework for the definition of spatio-temporal granularities (i.e., partitions of a spatio-temporal dimension) with the aim to improve the management and querying of spatio-temporal data. The framework includes the theoretical definitions of spatial and spatio-temporal granularities (while for temporal granularities we refer to the framework proposed by Bettini et al.) and all related notions useful for their management, e.g., relationships and operations over granularities. Relationships are useful for relating granularities and then knowing how data associated with different granularities can be compared. Operations allow one to create new granularities from already defined ones, manipulating or selecting their components. We show how granularities can be represented in a database and can be used to enrich an existing spatio-temporal database. For this purpose, we conceptually and logically design a relational database for temporal, spatial, and spatio-temporal granularities. The database stores all data about granularities and their related information we defined in the theoretical framework. This database can be used for enriching other spatio-temporal databases with spatio-temporal granularities. We introduce the spatio-temporal psychiatric case register, developed by the Verona Community-based Psychiatric Service (CPS), for storing and managing information about psychiatric patient, their personal information, and their contacts with the CPS occurred in last 30 years. The case register includes both clinical and statistical information about contacts, that are also temporally and spatially qualified. We show how the case register database can be enriched with spatio-temporal granularities both extending its structure and introducing a spatio-temporal query language dealing with spatio-temporal data and spatio-temporal granularities. Thus, we propose a new spatio-temporal query language, by defining its syntax and semantics, that includes ad-hoc features and constructs for dealing with spatio-temporal granularities. Finally, using the proposed query language, we report several examples of spatio-temporal queries on the psychiatric case register showing the ``usage'' of granularities and their role in spatio-temporal queries useful for epidemiological studies

    Unsupervised learning of relation detection patterns

    Get PDF
    L'extracció d'informació és l'àrea del processament de llenguatge natural l'objectiu de la qual és l'obtenir dades estructurades a partir de la informació rellevant continguda en fragments textuals. L'extracció d'informació requereix una quantitat considerable de coneixement lingüístic. La especificitat d'aquest coneixement suposa un inconvenient de cara a la portabilitat dels sistemes, ja que un canvi d'idioma, domini o estil té un cost en termes d'esforç humà. Durant dècades, s'han aplicat tècniques d'aprenentatge automàtic per tal de superar aquest coll d'ampolla de portabilitat, reduint progressivament la supervisió humana involucrada. Tanmateix, a mida que augmenta la disponibilitat de grans col·leccions de documents, esdevenen necessàries aproximacions completament nosupervisades per tal d'explotar el coneixement que hi ha en elles. La proposta d'aquesta tesi és la d'incorporar tècniques de clustering a l'adquisició de patrons per a extracció d'informació, per tal de reduir encara més els elements de supervisió involucrats en el procés En particular, el treball se centra en el problema de la detecció de relacions. L'assoliment d'aquest objectiu final ha requerit, en primer lloc, el considerar les diferents estratègies en què aquesta combinació es podia dur a terme; en segon lloc, el desenvolupar o adaptar algorismes de clustering adequats a les nostres necessitats; i en tercer lloc, el disseny de procediments d'adquisició de patrons que incorporessin la informació de clustering. Al final d'aquesta tesi, havíem estat capaços de desenvolupar i implementar una aproximació per a l'aprenentatge de patrons per a detecció de relacions que, utilitzant tècniques de clustering i un mínim de supervisió humana, és competitiu i fins i tot supera altres aproximacions comparables en l'estat de l'art.Information extraction is the natural language processing area whose goal is to obtain structured data from the relevant information contained in textual fragments. Information extraction requires a significant amount of linguistic knowledge. The specificity of such knowledge supposes a drawback on the portability of the systems, as a change of language, domain or style demands a costly human effort. Machine learning techniques have been applied for decades so as to overcome this portability bottleneck¿progressively reducing the amount of involved human supervision. However, as the availability of large document collections increases, completely unsupervised approaches become necessary in order to mine the knowledge contained in them. The proposal of this thesis is to incorporate clustering techniques into pattern learning for information extraction, in order to further reduce the elements of supervision involved in the process. In particular, the work focuses on the problem of relation detection. The achievement of this ultimate goal has required, first, considering the different strategies in which this combination could be carried out; second, developing or adapting clustering algorithms suitable to our needs; and third, devising pattern learning procedures which incorporated clustering information. By the end of this thesis, we had been able to develop and implement an approach for learning of relation detection patterns which, using clustering techniques and minimal human supervision, is competitive and even outperforms other comparable approaches in the state of the art.Postprint (published version

    The Sixth Annual Workshop on Space Operations Applications and Research (SOAR 1992)

    Get PDF
    This document contains papers presented at the Space Operations, Applications, and Research Symposium (SOAR) hosted by the U.S. Air Force (USAF) on 4-6 Aug. 1992 and held at the JSC Gilruth Recreation Center. The symposium was cosponsored by the Air Force Material Command and by NASA/JSC. Key technical areas covered during the symposium were robotic and telepresence, automation and intelligent systems, human factors, life sciences, and space maintenance and servicing. The SOAR differed from most other conferences in that it was concerned with Government-sponsored research and development relevant to aerospace operations. The symposium's proceedings include papers covering various disciplines presented by experts from NASA, the USAF, universities, and industry

    Organization based multiagent architecture for distributed environments

    Get PDF
    [EN]Distributed environments represent a complex field in which applied solutions should be flexible and include significant adaptation capabilities. These environments are related to problems where multiple users and devices may interact, and where simple and local solutions could possibly generate good results, but may not be effective with regards to use and interaction. There are many techniques that can be employed to face this kind of problems, from CORBA to multi-agent systems, passing by web-services and SOA, among others. All those methodologies have their advantages and disadvantages that are properly analyzed in this documents, to finally explain the new architecture presented as a solution for distributed environment problems. The new architecture for solving complex solutions in distributed environments presented here is called OBaMADE: Organization Based Multiagent Architecture for Distributed Environments. It is a multiagent architecture based on the organizations of agents paradigm, where the agents in the architecture are structured into organizations to improve their organizational capabilities. The reasoning power of the architecture is based on the Case-Based Reasoning methology, being implemented in a internal organization that uses agents to create services to solve the external request made by the users. The OBaMADE architecture has been successfully applied to two different case studies where its prediction capabilities have been properly checked. Those case studies have showed optimistic results and, being complex systems, have demonstrated the abstraction and generalizations capabilities of the architecture. Nevertheless OBaMADE is intended to be able to solve much other kind of problems in distributed environments scenarios. It should be applied to other varieties of situations and to other knowledge fields to fully develop its potencial.[ES]Los entornos distribuidos representan un campo de conocimiento complejo en el que las soluciones a aplicar deben ser flexibles y deben contar con gran capacidad de adaptación. Este tipo de entornos está normalmente relacionado con problemas donde varios usuarios y dispositivos entran en juego. Para solucionar dichos problemas, pueden utilizarse sistemas locales que, aunque ofrezcan buenos resultados en términos de calidad de los mismos, no son tan efectivos en cuanto a la interacción y posibilidades de uso. Existen múltiples técnicas que pueden ser empleadas para resolver este tipo de problemas, desde CORBA a sistemas multiagente, pasando por servicios web y SOA, entre otros. Todas estas mitologías tienen sus ventajas e inconvenientes, que se analizan en este documento, para explicar, finalmente, la nueva arquitectura presentada como una solución para los problemas generados en entornos distribuidos. La nueva arquitectura aquí se llama OBaMADE, que es el acrónimo del inglés Organization Based Multiagent Architecture for Distributed Environments (Arquitectura Multiagente Basada en Organizaciones para Entornos Distribuidos). Se trata de una arquitectura multiagente basasa en el paradigma de las organizaciones de agente, donde los agentes que forman parte de la arquitectura se estructuran en organizaciones para mejorar sus capacidades organizativas. La capacidad de razonamiento de la arquitectura está basada en la metodología de razonamiento basado en casos, que se ha implementado en una de las organizaciones internas de la arquitectura por medio de agentes que crean servicios que responden a las solicitudes externas de los usuarios. La arquitectura OBaMADE se ha aplicado de forma exitosa a dos casos de estudio diferentes, en los que se han demostrado sus capacidades predictivas. Aplicando OBaMADE a estos casos de estudio se han obtenido resultados esperanzadores y, al ser sistemas complejos, se han demostrado las capacidades tanto de abstracción como de generalización de la arquitectura presentada. Sin embargo, esta arquitectura está diseñada para poder ser aplicada a más tipo de problemas de entornos distribuidos. Debe ser aplicada a más variadas situaciones y a otros campos de conocimiento para desarrollar completamente el potencial de esta arquitectura
    corecore