1,680 research outputs found

    A spatial mediator model for integrating heterogeneous spatial data

    Get PDF
    The complexity and richness of geospatial data create specific problems in heterogeneous data integration. To deal with this type of data integration, we propose a spatial mediator embedded in a large distributed mobile environment (GeoGrid). The spatial mediator takes a user request from a field application and uses the request to select the appropriate data sources, constructs subqueries for the selected data sources, defines the process of combining the results from the subqueries, and develop an integration script that controls the integration process in order to respond to the request. The spatial mediator uses ontologies to support search for both geographic location based on symbolic terms as well as providing a term-based index to spatial data sources based on the relational model. In our approach, application designers only need to be aware of a minimum amount about the queries needed to supply users with the required data. The key part of this research has been the development of the spatial mediator that can dynamically respond to requests within the GeoGrid environment for geographic maps and related relational spatial data

    Types With Extents: On Transforming and Querying Self-Referential Data-Structures (Dissertation Proposal)

    Get PDF
    The central theme of this paper is to study the properties and expressive power of data-models which use type systems with extents in order to represent recursive or self-referential data-structures. A standard type system is extended with classes which represent the finite extents of values stored in a database. Such an extended type system expresses constraints about a database instance which go beyond those normally associated with the typing of data-values, and takes on an important part of the functionality of a database schema. Recursion in data-structures is then constrained to be defined via these finite extents, so that all values in a database have a finite representation. The idea of extending a type system with such classes is not new. In particular [2] introduced a type system and data models equivalent to those used here. However such existing work focuses on the expressive power of systems which allow the dynamic creation of recursive values, while we are concerned more with the properties of querying and manipulating databases containing known static extensions of data-values

    Semantic Similarity of Spatial Scenes

    Get PDF
    The formalization of similarity in spatial information systems can unleash their functionality and contribute technology not only useful, but also desirable by broad groups of users. As a paradigm for information retrieval, similarity supersedes tedious querying techniques and unveils novel ways for user-system interaction by naturally supporting modalities such as speech and sketching. As a tool within the scope of a broader objective, it can facilitate such diverse tasks as data integration, landmark determination, and prediction making. This potential motivated the development of several similarity models within the geospatial and computer science communities. Despite the merit of these studies, their cognitive plausibility can be limited due to neglect of well-established psychological principles about properties and behaviors of similarity. Moreover, such approaches are typically guided by experience, intuition, and observation, thereby often relying on more narrow perspectives or restrictive assumptions that produce inflexible and incompatible measures. This thesis consolidates such fragmentary efforts and integrates them along with novel formalisms into a scalable, comprehensive, and cognitively-sensitive framework for similarity queries in spatial information systems. Three conceptually different similarity queries at the levels of attributes, objects, and scenes are distinguished. An analysis of the relationship between similarity and change provides a unifying basis for the approach and a theoretical foundation for measures satisfying important similarity properties such as asymmetry and context dependence. The classification of attributes into categories with common structural and cognitive characteristics drives the implementation of a small core of generic functions, able to perform any type of attribute value assessment. Appropriate techniques combine such atomic assessments to compute similarities at the object level and to handle more complex inquiries with multiple constraints. These techniques, along with a solid graph-theoretical methodology adapted to the particularities of the geospatial domain, provide the foundation for reasoning about scene similarity queries. Provisions are made so that all methods comply with major psychological findings about people’s perceptions of similarity. An experimental evaluation supplies the main result of this thesis, which separates psychological findings with a major impact on the results from those that can be safely incorporated into the framework through computationally simpler alternatives

    Learning Collective Behavior in Multi-relational Networks

    Get PDF
    With the rapid expansion of the Internet and WWW, the problem of analyzing social media data has received an increasing amount of attention in the past decade. The boom in social media platforms offers many possibilities to study human collective behavior and interactions on an unprecedented scale. In the past, much work has been done on the problem of learning from networked data with homogeneous topologies, where instances are explicitly or implicitly inter-connected by a single type of relationship. In contrast to traditional content-only classification methods, relational learning succeeds in improving classification performance by leveraging the correlation of the labels between linked instances. However, networked data extracted from social media, web pages, and bibliographic databases can contain entities of multiple classes and linked by various causal reasons, hence treating all links in a homogeneous way can limit the performance of relational classifiers. Learning the collective behavior and interactions in heterogeneous networks becomes much more complex. The contribution of this dissertation include 1) two classification frameworks for identifying human collective behavior in multi-relational social networks; 2) unsupervised and supervised learning models for relationship prediction in multi-relational collaborative networks. Our methods improve the performance of homogeneous predictive models by differentiating heterogeneous relations and capturing the prominent interaction patterns underlying the network structure. The work has been evaluated in various real-world social networks. We believe that this study will be useful for analyzing human collective behavior and interactions specifically in the scenario when the heterogeneous relationships in the network arise from various causal reasons

    Formal methods applied to the analysis of phylogenies: Phylogenetic model checking

    Get PDF
    Los árboles filogenéticos son abstracciones útiles para modelar y caracterizar la evolución de un conjunto de especies o poblaciones respecto del tiempo. La proposición, verificación y generalización de hipótesis sobre un árbol filogenético inferido juegan un papel importante en el estudio y comprensión de las relaciones evolutivas. Actualmente, uno de los principales objetivos científicos es extraer o descubrir los mensajes biológicos implícitos y las propiedades estructurales subyacentes en la filogenia. Por ejemplo, la integración de información genética en una filogenia ayuda al descubrimiento de genes conservados en todo o parte del árbol, la identificación de posiciones covariantes en el ADN o la estimación de las fechas de divergencia entre especies. Consecuentemente, los árboles ayudan a comprender el mecanismo que gobierna la deriva evolutiva. Hoy en día, el amplio espectro de métodos y herramientas heterogéneas para el análisis de filogenias enturbia y dificulta su utilización, además del fuerte acoplamiento entre la especificación de propiedades y los algoritmos utilizados para su evaluación (principalmente scripts ad hoc). Este problema es el punto de arranque de esta tesis, donde se analiza como solución la posibilidad de introducir un entorno formal de verificación de hipótesis que, de manera automática y modular, estudie la veracidad de dichas propiedades definidas en un lenguaje genérico e independiente (en una lógica formal asociada) sobre uno de los múltiples softwares preparados para ello. La contribución principal de la tesis es la propuesta de un marco formal para la descripción, verificación y manipulación de relaciones causales entre especies de forma independiente del código utilizado para su valoración. Para ello, exploramos las características de las técnicas de model checking, un paradigma en el que una especificación expresada en lógica temporal se verifica con respecto a un modelo del sistema que representa una implementación a un cierto nivel de detalle. Se ha aplicado satisfactoriamente en la industria para el modelado de sistemas y su verificación, emergiendo del ámbito de las ciencias de la computación. Las contribuciones concretas de la tesis han sido: A) La identificación e interpretación de los árboles filogeneticos como modelos de la evolución, adaptados al entorno de las técnicas de model checking. B) La definición de una lógica temporal que captura las propiedades filogenéticas habituales junto con un método de construcción de propiedades. C) La clasificación de propiedades filogenéticas, identificando categorías de propiedades según estén centradas en la estructura del árbol, en las secuencias o sean híbridas. D) La extensión de las lógicas y modelos para contemplar propiedades cuantitativas de tiempo, probabilidad y de distancias. E) El desarrollo de un entorno para la verificación de propiedades booleanas, cuantitativas y paramétricas. F) El establecimiento de los principios para la manipulación simbolica de objetos filogenéticos, p. ej., clados. G) La explotación de las herramientas de model checking existentes, detectando sus problemas y carencias en el campo de filogenia y proponiendo mejoras. H) El desarrollo de técnicas "ad hoc" para obtener ganancia de complejidad alrededor de dos frentes: distribución de los cálculos y datos, y el uso de sistemas de información. Los puntos A-F se centran en las aportaciones conceptuales de nuestra aproximación, mientras que los puntos G-H enfatizan la parte de herramientas e implementación. Los contenidos de la tesis están contrastados por la comunidad científica mediante las siguientes publicaciones en conferencias y revistas internacionales. La introducción de model checking como entorno formal para analizar propiedades biológicas (puntos A-C) ha llevado a la publicación de nuestro primer artículo de congreso [1]. En [2], desarrollamos la verificación de hipótesis filogenéticas sobre un árbol de ejemplo construido a partir de las relaciones impuestas por un conjunto de proteínas codificadas por el ADN mitocondrial humano (ADNmt). En ese ejemplo, usamos una herramienta automática y genérica de model checking (punto G). El artículo de revista [7] resume lo básico de los artículos de congreso previos y extiende la aplicación de lógicas temporales a propiedades filogenéticas no consideradas hasta ahora. Los artículos citados aquí engloban los contenidos presentados en las Parte I--II de la tesis. El enorme tamaño de los árboles y la considerable cantidad de información asociada a los estados (p.ej., la cadena de ADN) obligan a la introducción de adaptaciones especiales en las herramientas de model checking para mantener un rendimiento razonable en la verificación de propiedades y aliviar también el problema de la explosión de estados (puntos G-H). El artículo de congreso [3] presenta las ventajas de rebanar el ADN asociado a los estados, la partición de la filogenia en pequeños subárboles y su distribución entre varias máquinas. Además, la idea original del model checking rebanado se complementa con la inclusión de una base de datos externa para el almacenamiento de secuencias. El artículo de revista [4] reúne las nociones introducidas en [3] junto con la implementación y resultados preliminares presentados [5]. Este tema se corresponde con lo presentado en la Parte III de la tesis. Para terminar, la tesis reaprovecha las extensiones de las lógicas temporales con tiempo explícito y probabilidades a fin de manipular e interrogar al árbol sobre información cuantitativa. El artículo de congreso [6] ejemplifica la necesidad de introducir probabilidades y tiempo discreto para el análisis filogenético de un fenotipo real, en este caso, el ratio de distribución de la intolerancia a la lactosa entre diversas poblaciones arraigadas en las hojas de la filogenia. Esto se corresponde con el Capítulo 13, que queda englobado dentro de las Partes IV--V. Las Partes IV--V completan los conceptos presentados en ese artículo de conferencia hacia otros dominios de aplicación, como la puntuación de árboles, y tiempo continuo (puntos E-F). La introducción de parámetros en las hipótesis filogenéticas se plantea como trabajo futuro. Referencias [1] Roberto Blanco, Gregorio de Miguel Casado, José Ignacio Requeno, and José Manuel Colom. Temporal logics for phylogenetic analysis via model checking. In Proceedings IEEE International Workshop on Mining and Management of Biological and Health Data, pages 152-157. IEEE, 2010. [2] José Ignacio Requeno, Roberto Blanco, Gregorio de Miguel Casado, and José Manuel Colom. Phylogenetic analysis using an SMV tool. In Miguel P. Rocha, Juan M. Corchado Rodríguez, Florentino Fdez-Riverola, and Alfonso Valencia, editors, Proceedings 5th International Conference on Practical Applications of Computational Biology and Bioinformatics, volume 93 of Advances in Intelligent and Soft Computing, pages 167-174. Springer, Berlin, 2011. [3] José Ignacio Requeno, Roberto Blanco, Gregorio de Miguel Casado, and José Manuel Colom. Sliced model checking for phylogenetic analysis. In Miguel P. Rocha, Nicholas Luscombe, Florentino Fdez-Riverola, and Juan M. Corchado Rodríguez, editors, Proocedings 6th International Conference on Practical Applications of Computational Biology and Bioinformatics, volume 154 of Advances in Intelligent and Soft Computing, pages 95-103. Springer, Berlin, 2012. [4] José Ignacio Requeno and José Manuel Colom. Model checking software for phylogenetic trees using distribution and database methods. Journal of Integrative Bioinformatics, 10(3):229-233, 2013. [5] José Ignacio Requeno and José Manuel Colom. Speeding up phylogenetic model checking. In Mohd Saberi Mohamad, Loris Nanni, Miguel P. Rocha, and Florentino Fdez-Riverola, editors, Proceedings 7th International Conference on Practical Applications of Computational Biology and Bioinformatics, volume 222 of Advances in Intelligent Systems and Computing, pages 119-126. Springer, Berlin, 2013. [6] José Ignacio Requeno and José Manuel Colom. Timed and probabilistic model checking over phylogenetic trees. In Miguel P. Rocha et al., editors, Proceedings 8th International Conference on Practical Applications of Computational Biology and Bioinformatics, Advances in Intelligent and Soft Computing. Springer, Berlin, 2014. [7] José Ignacio Requeno, Gregorio de Miguel Casado, Roberto Blanco, and José Manuel Colom. Temporal logics for phylogenetic analysis via model checking. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 10(4):1058-1070, 2013

    Topological Equivalence and Similarity in Multi-Representation Geographic Databases

    Get PDF
    Geographic databases contain collections of spatial data representing the variety of views for the real world at a specific time. Depending on the resolution or scale of the spatial data, spatial objects may have different spatial dimensions, and they may be represented by point, linear, or polygonal features, or combination of them. The diversity of data that are collected over the same area, often from different sources, imposes a question of how to integrate and to keep them consistent in order to provide correct answers for spatial queries. This thesis is concerned with the development of a tool to check topological equivalence and similarity for spatial objects in multi-representation databases. The main question is what are the components of a model to identify topological consistency, based on a set of possible transitions for the different types of spatial representations. This work develops a new formalism to model consistently spatial objects and spatial relations between several objects, each represented at multiple levels of detail. It focuses on the topological consistency constraints that must hold among the different representation of objects, but it is not concerned about generalization operations of how to derive one representation level from another. The result of this thesis is a?computational tool to evaluate topological equivalence and similarity across multiple representations. This thesis proposes to organize a spatial scene -a set of spatial objects and their embeddings in space- directly as a relation-based model that uses a hierarchical graph representation. The focus of the relation-based model is on relevant object representations. Only the highest-dimensional object representations are explicitly stored, while their parts are not represented in the graph

    CRIS-IR 2006

    Get PDF
    The recognition of entities and their relationships in document collections is an important step towards the discovery of latent knowledge as well as to support knowledge management applications. The challenge lies on how to extract and correlate entities, aiming to answer key knowledge management questions, such as; who works with whom, on which projects, with which customers and on what research areas. The present work proposes a knowledge mining approach supported by information retrieval and text mining tasks in which its core is based on the correlation of textual elements through the LRD (Latent Relation Discovery) method. Our experiments show that LRD outperform better than other correlation methods. Also, we present an application in order to demonstrate the approach over knowledge management scenarios.Fundação para a Ciência e a Tecnologia (FCT) Denmark's Electronic Research Librar
    corecore