11,156 research outputs found

    Application of Semantics to Solve Problems in Life Sciences

    Thesis defense date: 10 December 2018. The amount of information generated on the Web has increased in recent years. Most of this information is accessible as text, with humans being the main users of the Web. However, despite all the advances in natural language processing, computers still struggle to process this textual information. In this context, there are application domains, such as the Life Sciences, in which large amounts of information are being published as structured data. Analysing these data is vitally important not only for the advancement of science but also for progress in healthcare. However, these data are located in different repositories and stored in different formats, which makes them difficult to integrate. In this context, the Linked Data paradigm emerges as a technology that applies several standards proposed by the W3C community, such as HTTP URIs, RDF and OWL. Using this technology, this doctoral thesis was developed to meet the following main objectives: 1) to promote the use of Linked Data among users in the Life Sciences community; 2) to facilitate the design of SPARQL queries by discovering the underlying model of RDF repositories; 3) to create a collaborative environment that makes it easier for end users to consume Linked Data; 4) to develop an algorithm that automatically discovers the OWL semantic model of an RDF repository; 5) to develop an OWL representation of ICD-10-CM, called Dione, that provides an automatic methodology for classifying patients' diseases and subsequently validating the classification with an OWL reasoner.

    Designing a resource-efficient data structure for mobile data systems

    Designing data structures for use in mobile devices requires attention to optimising data volumes, with associated benefits for data transmission, storage space and battery use. For semi-structured data, tree summarisation techniques can be used to reduce the volume of structured elements, while dictionary compression can deal efficiently with value-based predicates. This project seeks to investigate and evaluate an integration of the two approaches. The key strength of this technique is that both structural and value predicates could be resolved within one graph, while further allowing for compression of the resulting data structure. As the current trend is towards working with larger semi-structured data sets, this work would allow much larger data sets to be used while reducing bandwidth requirements and minimising the memory needed for both storing and querying the data.
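
As a rough illustration of the dictionary compression idea mentioned in this abstract, the sketch below replaces repeated string values with small integer codes and answers an equality predicate on the codes without decoding them. It is not the project's data structure: the function names are hypothetical, the sample values are made up, and the tree summarisation half of the approach is not shown.

```python
def dictionary_encode(values):
    """Replace each value with an integer code; return (codes, dictionary)."""
    dictionary = []  # code -> value, in order of first appearance
    index = {}       # value -> code, for constant-time lookup while encoding
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return codes, dictionary


def dictionary_decode(codes, dictionary):
    """Recover the original sequence of values from the codes."""
    return [dictionary[code] for code in codes]


def find_equal(codes, dictionary, wanted):
    """Return positions whose value equals `wanted`, comparing integer codes only."""
    if wanted not in dictionary:
        return []
    code = dictionary.index(wanted)
    return [position for position, c in enumerate(codes) if c == code]


if __name__ == "__main__":
    # Made-up element values, as might repeat across a semi-structured data set.
    values = ["London", "Leeds", "London", "York", "Leeds", "London"]
    codes, dictionary = dictionary_encode(values)
    print(codes, dictionary)
    print(find_equal(codes, dictionary, "Leeds"))
```

Each distinct string is stored once, so the saving grows with the amount of repetition in the values; comparing integers rather than strings is what makes value predicates cheap to resolve.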

    Gower as Data: Exploring the Application of Machine Learning to Gower’s Middle English Corpus

    Distant reading, a digital humanities method in wide use, involves processing and analyzing a large amount of text through computer programs. In treating texts as data, these methods can highlight trends in diction, themes, and linguistic patterns that individual readers may miss or critical traditions may obscure. Though several scholars have undertaken projects using topic models and text mining on Middle English texts, the nonstandard orthography of Middle English makes this process more challenging than it is for our counterparts working on later literature. This collaborative project uses Gower’s Confessio Amantis as a small, fixed corpus for analysis. We employ natural language processing to reexamine the Confessio’s themes, adding data analysis to the more traditional close reading strategies of Gower scholarship. We use Gower’s work as a case study both to help reduce the potential variants across textual versions and to investigate the corpus more deeply than distant reading normally allows. Here, we share our initial findings as well as our methodologies. We hope to share resources that will allow other scholars to engage in similar types of projects.

    Improving Open Science Using Linked Open Data: CONICET Digital Use Case (Mejorando la Ciencia Abierta Usando Datos Abiertos Enlazados: Caso de Uso CONICET Digital)

    Scientific publication services are changing drastically: researchers demand intelligent search services to discover and relate scientific publications. Publishers need to incorporate semantic information to better organize their digital assets and make publications more discoverable. In this paper, we present ongoing work to publish a subset of the scientific publications of CONICET Digital as Linked Open Data. The objective of this work is to improve the retrieval and reuse of data through Semantic Web and Linked Data technologies in the domain of scientific publications. To achieve these goals, Semantic Web standards and reference RDF schemas (Dublin Core, FOAF, VoID, etc.) have been taken into account. The conversion and publication process follows the methodological guidelines for publishing government linked data. We also outline how these data can be linked to other datasets on the web of data, such as DBLP, Wikidata and DBpedia. Finally, we show some example queries that answer questions that CONICET Digital did not originally support.
    Affiliation: Zárate, Marcos Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Centro para el Estudio de Sistemas Marinos; Argentina
    Affiliation: Carlos Buckle. Universidad Nacional de la Patagonia "San Juan Bosco"; Argentina
    Affiliation: Mazzanti, Renato. Universidad Nacional de la Patagonia "San Juan Bosco"; Argentina
    Affiliation: Samec, Gustavo Daniel. Universidad Nacional de la Patagonia "San Juan Bosco"; Argentina

    Enriched property ontology for knowledge systems : a thesis presented in partial fulfilment of the requirements for the degree of Master of Information Systems in Information Systems, Massey University, Palmerston North, New Zealand

    "It is obvious that every individual thing or event has an indefinite number of properties or attributes observable in it and might therefore be considered as belonging to an indefinite number of different classes of things" [Venn 1876]. The world that we try to mimic in Knowledge Based (KB) systems is extremely complex, especially when we attempt to develop systems that cover a domain of discourse with an almost infinite number of possible properties. Thus, if we are to develop such systems, how do we know which properties to extract in order to make a decision, and how do we ensure that our findings are the most relevant to our decision making? Equally, how do we keep computations tractable, given the potential computational complexity of the systems required for decision making within a very large domain? In this thesis we consider this problem in terms of medical decision making. Medical KB systems have the potential to be very useful aids for diagnosis, medical guidance and patient data monitoring. For example, in certain diagnostic scenarios patients may present various potential symptoms of a disease and have defining characteristics. Although considerable information could be obtained, it may be difficult to correlate a patient's data with known diseases in an economical and efficient manner. This would occur where a practitioner lacks specific specialised knowledge. Considering the vastness of knowledge in the domain of medicine, this could occur frequently. For example, a physician with considerable experience in a specialised domain such as breast cancer may easily be able to diagnose patients and decide on the value of appropriate symptoms given an abstraction process; however, an inexperienced physician or generalist may not have this facility. [FROM INTRODUCTION]

    Visual exploration and retrieval of XML document collections with the generic system X2

    This article reports on the XML retrieval system X2, which has been developed at the University of Munich over the last five years. In a typical session with X2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semiautomatically. After query evaluation, the full set of answers is presented in a visual and structured way. X2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR and database-oriented querying, thus bridging the gap between these three views on the data to be retrieved. Another salient characteristic of X2, which distinguishes it from other visual query systems for XML, is that it supports various levels of detail in the presentation of answers, as well as techniques for dynamically reordering and grouping retrieved elements once the complete answer set has been computed.

    Knowledge Organization Systems (KOS) in the Semantic Web: A Multi-Dimensional Review

    Since the Simple Knowledge Organization System (SKOS) specification and its SKOS eXtension for Labels (SKOS-XL) became formal W3C recommendations in 2009, a significant number of conventional knowledge organization systems (KOS), including thesauri, classification schemes, name authorities, and lists of codes and terms produced before the arrival of the ontology wave, have made their journeys to join the Semantic Web mainstream. This paper uses "LOD KOS" as an umbrella term to refer to all of the value vocabularies and lightweight ontologies within the Semantic Web framework. The paper provides an overview of what the LOD KOS movement has brought to various communities and users. These are not limited to the colonies of the value vocabulary constructors and providers, nor to the catalogers and indexers who have a long history of applying the vocabularies to their products. The LOD dataset producers and LOD service providers, the information architects and interface designers, and researchers in the sciences and humanities are also direct beneficiaries of LOD KOS. The paper examines a set of collected cases (experimental or in real applications) and aims to identify the uses of LOD KOS in order to share practices and ideas among communities and users. Through the viewpoints of a number of different user groups, the functions of LOD KOS are examined from multiple dimensions. This paper focuses on the LOD dataset producers, vocabulary producers, and researchers (as end-users of KOS). Comment: 31 pages, 12 figures, accepted paper in International Journal on Digital Libraries

    Referencing Sources of Molecular Spectroscopic Data in the Era of Data Science: Application to the HITRAN and AMBDAS Databases

    The application described has been designed to automatically create bibliographic entries in large databases with diverse sources, which reduces both the frequency of mistakes and the workload for the administrators. This new system uniquely identifies each reference by its digital object identifier (DOI) and retrieves the corresponding bibliographic information from any of several online services, including the SAO/NASA Astrophysics Data System (ADS) and CrossRef APIs. Once the information has been parsed into a relational database, the software is able to produce bibliographies in any of several formats, including HTML and BibTeX, for use on websites or in printed articles. The application is provided free of charge for general use by any scientific database. The power of this application is demonstrated by using it to populate reference data for the HITRAN and AMBDAS databases as test cases. HITRAN contains data provided by researchers and collaborators throughout the spectroscopic community. These contributors are credited for their contributions through the bibliography produced alongside the data returned by an online search in HITRAN. Prior to the work presented here, HITRAN and AMBDAS created these bibliographies manually, which is a tedious, time-consuming and error-prone process. The complete code for the new referencing system can be found at https://github.com/hitranonline/refs. Comment: 11 pages, 5 figures, already published online at https://doi.org/10.3390/atoms802001
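
The workflow this abstract describes (key each reference by its DOI, then emit it in a format such as BibTeX) might look roughly like the sketch below. This is not the code from the repository named above: the record layout, the function names and the sample record with its DOI are all made up for illustration, and the step that retrieves metadata from ADS or CrossRef is left out so that the example runs offline.

```python
import re


def normalize_doi(doi):
    """Strip a resolver prefix and lowercase, so one DOI maps to one key."""
    doi = doi.strip()
    doi = re.sub(r"^https?://(dx\.)?doi\.org/", "", doi, flags=re.IGNORECASE)
    return doi.lower()


def bibtex_key(record):
    """Build a citation key from the first author's surname and the year."""
    surname = record["authors"][0].split(",")[0]
    return re.sub(r"[^a-z0-9]", "", surname.lower()) + str(record["year"])


def to_bibtex(record):
    """Render one bibliographic record (a plain dict) as a BibTeX @article entry."""
    fields = [
        ("author", " and ".join(record["authors"])),
        ("title", record["title"]),
        ("journal", record["journal"]),
        ("year", str(record["year"])),
        ("doi", normalize_doi(record["doi"])),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@article{{{bibtex_key(record)},\n{body}\n}}"


if __name__ == "__main__":
    # Hypothetical record; in the described system this would come from an online service.
    record = {
        "authors": ["Doe, Jane", "Roe, Richard"],
        "title": "An Example Article",
        "journal": "Journal of Examples",
        "year": 2020,
        "doi": "https://doi.org/10.1234/EXAMPLE",
    }
    store = {normalize_doi(record["doi"]): record}  # one entry per DOI
    for entry in store.values():
        print(to_bibtex(entry))
```

Keying the store on the normalized DOI means that the same reference submitted twice, with or without the resolver prefix, occupies a single entry.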