106 research outputs found

    Contextual Cross-Referencing of Species Names for Fiddler Crabs (Genus Uca): An Experiment in Cyber-Taxonomy

    Cyber-taxonomy of name usage has focused primarily on producing authoritative lists of names or cross-linking names and data across disparate databases. A feature missing from much of this work is the recording and analysis of the context in which a name was used. This context can be critical for understanding not only what name an author used, but also to which currently recognized species they actually refer. An experiment on recording contextual information associated with name usage was conducted for the fiddler crabs (genus Uca). Data from approximately one quarter of all publications that mention fiddler crabs, including 95% of those published prior to 1924 and 67% of those published prior to 1976, have currently been recorded in a database. Approaches and difficulties in recording and analyzing the context of name use are discussed. These results are not meant to be a full solution, but rather to highlight problems which have not been previously investigated and which may act as a springboard for broader approaches and discussion. Some data on the accessibility of the literature, including in particular electronic forms of publication, are also presented. The resulting data have been integrated for general browsing into the website http://www.fiddlercrab.info; the raw data and code used to construct the website are available at https://github.com/msrosenberg/fiddlercrab.info

    20 GB in 10 minutes: a case for linking major biodiversity databases using an open socio-technical infrastructure and a pragmatic, cross-institutional collaboration

    Biodiversity information is made available through numerous databases that each have their own data models, web services, and data types. Combining data across databases leads to new insights, but is not easy because each database uses its own system of identifiers. In the absence of stable and interoperable identifiers, databases are often linked using taxonomic names. This labor-intensive, error-prone, and lengthy process relies on accessible versions of nomenclatural authorities and fuzzy-matching algorithms. Linking diverse data takes more than technology. Concrete progress on finding relationships between biodiversity datasets requires new social collaborations like the Global Unified Open Data Architecture (GUODA), which combines the skills of computer engineers from iDigBio, server resources from the Advanced Computing and Information Systems (ACIS) Lab, global-scale data presentation from EOL, and independent developers and researchers. This paper discusses a technical solution developed by the GUODA collaboration for faster linking across databases, with a use case linking Wikidata and the Global Biotic Interactions database (GloBI). The GUODA infrastructure is a 12-node, high-performance computing cluster made up of about 192 threads with 12 TB of storage and 288 GB of memory. Using GUODA, 20 GB of compressed JSON from Wikidata was processed and linked to GloBI in about 10–11 min. Instead of comparing name strings or relying on a single identifier, Wikidata and GloBI were linked by comparing graphs of biodiversity identifiers external to each system. This method resulted in adding 119,957 Wikidata links in GloBI, an increase of 13.7% of all outgoing name links in GloBI. Wikidata and GloBI were compared to the Open Tree of Life Reference Taxonomy to examine consistency and coverage. The process of parsing the Wikidata, Open Tree of Life Reference Taxonomy, and GloBI archives and calculating consistency metrics was done in minutes on the GUODA platform. As a model collaboration, GUODA has the potential to revolutionize biodiversity science by bringing diverse technically minded people together with high-performance computing resources that are accessible from a laptop or desktop. However, participating in such a collaboration still requires basic programming skills.

    LL(O)D and NLP perspectives on semantic change for humanities research

    CC BY 4.0. This paper presents an overview of the LL(O)D and NLP methods, tools and data for detecting and representing semantic change, with its main application in humanities research. The paper’s aim is to provide the starting point for the construction of a workflow and set of multilingual diachronic ontologies within the humanities use case of the COST Action Nexus Linguarum, European network for Web-centred linguistic data science, CA18209. The survey focuses on the essential aspects needed to understand the current trends and to build applications in this area of study.

    Agent-based management of clinical guidelines

    Clinical guidelines (CGs) contain a set of directions or principles to assist the health care practitioner with patient care decisions about appropriate diagnostic, therapeutic, or other clinical procedures for specific clinical circumstances. It is widely accepted that the adoption of guideline-execution engines in daily practice would improve patient care by standardising care procedures. Guideline-based systems can constitute part of a knowledge-based decision support system in order to deliver the right knowledge to the right people in the right form at the right time. The automation of the guideline execution process is a basic step towards its widespread use in medical centres. To achieve this general goal, different topics should be tackled, such as the acquisition and representation of clinical guidelines, their formal verification, and finally their execution. This dissertation focuses on the execution of CGs and proposes the implementation of an agent-based platform in which the actors involved in health care coordinate their activities to perform the complex task of guideline enactment. The management of medical and organizational knowledge, and the formal representation of the CGs, are two knowledge-related topics addressed in this dissertation and tackled through the design of several application ontologies. The separation of the knowledge from its use is fully intentional, and allows the CG execution engine to be easily customised to different medical centres with varying personnel and resources. In parallel with the execution of CGs, the system handles patient preferences and uses them to implement patient-centred services. With respect to this issue, the following tasks have been developed: a) definition of the user's criteria, b) use of the patient's profile to rank the alternatives presented to them, and c) implementation of an unsupervised learning method to adapt the user's profile dynamically and automatically. Finally, several ideas of this dissertation are being directly applied in two ongoing funded research projects, covering the agent-based execution of CGs and the ontological management of medical and organizational knowledge.