3,589 research outputs found

    Exploiting synergy between ontologies and recommender systems

    Get PDF
    Recommender systems learn about user preferences over time, automatically finding things of similar interest. This reduces the burden of creating explicit queries. Recommender systems do, however, suffer from cold-start problems where no initial information is available early on upon which to base recommendations.Semantic knowledge structures, such as ontologies, can provide valuable domain knowledge and user information. However, acquiring such knowledge and keeping it up to date is not a trivial task and user interests are particularly difficult to acquire and maintain. This paper investigates the synergy between a web-based research paper recommender system and an ontology containing information automatically extracted from departmental databases available on the web. The ontology is used to address the recommender systems cold-start problem. The recommender system addresses the ontology's interest-acquisition problem. An empirical evaluation of this approach is conducted and the performance of the integrated systems measured

    Enterprise information integration: on discovering links using genetic programming

    Get PDF
    Both established and emergent business rely heavily on data, chiefly those that wish to become game changers. The current biggest source of data is the Web, where there is a large amount of sparse data. The Web of Data aims at providing a unified view of these islands of data. To realise this vision, it is required that the resources in different data sources that refer to the same real-world entities must be linked, which is they key factor for such a unified view. Link discovery is a trending task that aims at finding link rules that specify whether these links must be established or not. Currently there are many proposals in the literature to produce these links, especially based on meta-heuristics. Unfortunately, creating proposals based on meta-heuristics is not a trivial task, which has led to a lack of comparison between some well-established proposals. On the other hand, it has been proved that these link rules fall short in cases in which resources that refer to different real-world entities are very similar or vice versa. In this dissertation, we introduce several proposals to address the previous lacks in the literature. On the one hand we, introduce Eva4LD, which is a generic framework to build genetic programming proposals for link discovery; which are a kind of meta-heuristics proposals. Furthermore, our framework allows to implement many proposals in the literature and compare their results fairly. On the other hand, we introduce Teide, which applies effectively the link rules increasing significantly their precision without dropping their recall significantly. Unfortunately, Teide does not learn link rules, and applying all the provided link rules is computationally expensive. Due to this reason we introduce Sorbas, which learns what we call contextual link rules.Las empresas que desean establecer un precedente en el panorama actual tienden a recurrir al uso de datos para mejorar sus modelos de negocio. La mayor fuente de datos disponible es la Web, donde una gran cantidad es accesible aunque se encuentre fragmentada en islas de datos. La Web de los Datos tiene como objetivo dar una visión unificada de dichas islas, aunque el almacenamiento de los mismos siga siendo distribuido. Para ofrecer esta visión es necesario enlazar los recursos presentes en las islas de datos que hacen referencia a las mismas entidades del mundo real. Link discovery es el nombre atribuido a esta tarea, la cual se basa en generar reglas de enlazado que permiten establecer bajo qué circunstancias dos recursos deben ser enlazados. Se pueden encontrar diferentes propuestas en la literatura de link discovery, especialmente basadas en meta-heurísticas. Por desgracia comparar propuestas basadas en meta-heurísticas no es trivial. Por otro lado, se ha probado que estas reglas de enlazado no funcionan bien cuando los recursos que hacen referencia a dos entidades distintas del mundo real son muy parecidos, o por el contrario, cuando dos recursos muy distintos hacen referencia a la misma entidad. En esta tesis presentamos varias propuestas. Por un lado, Eva4LD es un framework genérico para desarrollar propuestas de link discovery basadas en programación genética, que es un tipo de meta-heurística. Gracias a nuestro framework, hemos podido implementar distintas propuestas de la literatura y comprar justamente sus resultados. Por otro lado, en la tesis presentamos Teide, una propuesta que recibiendo varias reglas de enlazado las aplica de tal modo que mejora significativamente la precisión de las mismas sin reducir significativamente su cobertura. Por desgracia, Teide es computacionalmente costoso debido a que no aprende reglas. Debido a este motivo, presentamos Sorbas que aprende un tipo de reglas de enlazado que denominamos reglas de enlazado con contexto

    A simulation-based algorithm for solving the resource-assignment problem in satellite telecommunication networks

    Get PDF
    This paper proposes an heuristic for the scheduling of capacity requests and the periodic assignment of radio resources in geostationary (GEO) satellite networks with star topology, using the Demand Assigned Multiple Access (DAMA) protocol in the link layer, and Multi-Frequency Time Division Multiple Access (MF-TDMA) and Adaptive Coding and Modulation (ACM) in the physical layer.En este trabajo se propone una heurística para la programación de las solicitudes de capacidad y la asignación periódica de los recursos de radio en las redes de satélites geoestacionarios (GEO) con topología en estrella, con la demanda de acceso múltiple de asignación (DAMA) de protocolo en la capa de enlace, y el Multi-Frequency Time Division (Acceso múltiple por MF-TDMA) y codificación y modulación Adaptable (ACM) en la capa física.En aquest treball es proposa una heurística per a la programació de les sol·licituds de capacitat i l'assignació periòdica dels recursos de ràdio en les xarxes de satèl·lits geoestacionaris (GEO) amb topologia en estrella, amb la demanda d'accés múltiple d'assignació (DAMA) de protocol en la capa d'enllaç, i el Multi-Frequency Time Division (Accés múltiple per MF-TDMA) i codificació i modulació Adaptable (ACM) a la capa física

    The scale of population structure in Arabidopsis thaliana

    Get PDF
    The population structure of an organism reflects its evolutionary history and influences its evolutionary trajectory. It constrains the combination of genetic diversity and reveals patterns of past gene flow. Understanding it is a prerequisite for detecting genomic regions under selection, predicting the effect of population disturbances, or modeling gene flow. This paper examines the detailed global population structure of Arabidopsis thaliana. Using a set of 5,707 plants collected from around the globe and genotyped at 149 SNPs, we show that while A. thaliana as a species self-fertilizes 97% of the time, there is considerable variation among local groups. This level of outcrossing greatly limits observed heterozygosity but is sufficient to generate considerable local haplotypic diversity. We also find that in its native Eurasian range A. thaliana exhibits continuous isolation by distance at every geographic scale without natural breaks corresponding to classical notions of populations. By contrast, in North America, where it exists as an exotic species, A. thaliana exhibits little or no population structure at a continental scale but local isolation by distance that extends hundreds of km. This suggests a pattern for the development of isolation by distance that can establish itself shortly after an organism fills a new habitat range. It also raises questions about the general applicability of many standard population genetics models. Any model based on discrete clusters of interchangeable individuals will be an uneasy fit to organisms like A. thaliana which exhibit continuous isolation by distance on many scales

    On learning context-aware rules to link RDF datasets

    Get PDF
    Integrating RDF datasets has become a relevant problem for both researchers and practitioners. In the literature, there are many genetic proposals that learn rules that allow to link the resources that refer to the same real-world entities, which is paramount to integrating the datasets. Unfortunately, they are context-unaware because they focus on the resources and their attributes but forget about their neighbours. This implies that they fall short in cases in which different resources have similar attributes but refer to different real-world entities or cases in which they have dissimilar attributes but refer to the same real-world entities. In this article, we present a proposal that learns context-aware rules that take into account both the attributes of the resources and their neighbours. We have conducted an extensive experimentation that proves that it outperforms the most advanced genetic proposal. Our conclusions were checked using statistically sound methods.Ministerio de Economía y Competitividad TIN2013-40848-RMinisterio de Economía y Competitividad TIN2016-75394-RJunta de Andalucía P18- RT-106

    GI Systems for public health with an ontology based approach

    Get PDF
    Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.Health is an indispensable attribute of human life. In modern age, utilizing technologies for health is one of the emergent concepts in several applied fields. Computer science, (geographic) information systems are some of the interdisciplinary fields which motivates this thesis. Inspiring idea of the study is originated from a rhetorical disease DbHd: Database Hugging Disorder, defined by Hans Rosling at World Bank Open Data speech in May 2010. The cure of this disease can be offered as linked open data, which contains ontologies for health science, diseases, genes, drugs, GEO species etc. LOD-Linked Open Data provides the systematic application of information by publishing and connecting structured data on the Web. In the context of this study we aimed to reduce boundaries between semantic web and geo web. For this reason a use case data is studied from Valencia CSISP- Research Center of Public Health in which the mortality rates for particular diseases are represented spatio-temporally. Use case data is divided into three conceptual domains (health, spatial, statistical), enhanced with semantic relations and descriptions by following Linked Data Principles. Finally in order to convey complex health-related information, we offer an infrastructure integrating geo web and semantic web. Based on the established outcome, user access methods are introduced and future researches/studies are outlined
    corecore