559 research outputs found

    <strong> </strong>Multitemporal conditional schema evolution

    Get PDF

    A Rule-Based Approach to Analyzing Database Schema Objects with Datalog

    Full text link
    Database schema elements such as tables, views, triggers and functions are typically defined with many interrelationships. In order to support database users in understanding a given schema, a rule-based approach for analyzing the respective dependencies is proposed using Datalog expressions. We show that many interesting properties of schema elements can be systematically determined this way. The expressiveness of the proposed analysis is exemplarily shown with the problem of computing induced functional dependencies for derived relations. The propagation of functional dependencies plays an important role in data integration and query optimization but represents an undecidable problem in general. And yet, our rule-based analysis covers all relational operators as well as linear recursive expressions in a systematic way showing the depth of analysis possible by our proposal. The analysis of functional dependencies is well-integrated in a uniform approach to analyzing dependencies between schema elements in general.Comment: Pre-proceedings paper presented at the 27th International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur, Belgium, 10-12 October 2017 (arXiv:1708.07854

    Modelos de compressão e ferramentas para dados ómicos

    Get PDF
    The ever-increasing growth of the development of high-throughput sequencing technologies and as a consequence, generation of a huge volume of data, has revolutionized biological research and discovery. Motivated by that, we investigate in this thesis the methods which are capable of providing an efficient representation of omics data in compressed or encrypted manner, and then, we employ them to analyze omics data. First and foremost, we describe a number of measures for the purpose of quantifying information in and between omics sequences. Then, we present finite-context models (FCMs), substitution-tolerant Markov models (STMMs) and a combination of the two, which are specialized in modeling biological data, in order for data compression and analysis. To ease the storage of the aforementioned data deluge, we design two lossless data compressors for genomic and one for proteomic data. The methods work on the basis of (a) a combination of FCMs and STMMs or (b) the mentioned combination along with repeat models and a competitive prediction model. Tested on various synthetic and real data showed their outperformance over the previously proposed methods in terms of compression ratio. Privacy of genomic data is a topic that has been recently focused by developments in the field of personalized medicine. We propose a tool that is able to represent genomic data in a securely encrypted fashion, and at the same time, is able to compact FASTA and FASTQ sequences by a factor of three. It employs AES encryption accompanied by a shuffling mechanism for improving the data security. The results show it is faster than general-purpose and special-purpose algorithms. Compression techniques can be employed for analysis of omics data. Having this in mind, we investigate the identification of unique regions in a species with respect to close species, that can give us an insight into evolutionary traits. For this purpose, we design two alignment-free tools that can accurately find and visualize distinct regions among two collections of DNA or protein sequences. Tested on modern humans with respect to Neanderthals, we found a number of absent regions in Neanderthals that may express new functionalities associated with evolution of modern humans. Finally, we investigate the identification of genomic rearrangements, that have important roles in genetic disorders and cancer, by employing a compression technique. For this purpose, we design a tool that is able to accurately localize and visualize small- and large-scale rearrangements between two genomic sequences. The results of applying the proposed tool on several synthetic and real data conformed to the results partially reported by wet laboratory approaches, e.g., FISH analysis.O crescente crescimento do desenvolvimento de tecnologias de sequenciamento de alto rendimento e, como consequência, a geração de um enorme volume de dados, revolucionou a pesquisa e descoberta biológica. Motivados por isso, nesta tese investigamos os métodos que fornecem uma representação eficiente de dados ómicros de maneira compactada ou criptografada e, posteriormente, os usamos para análise. Em primeiro lugar, descrevemos uma série de medidas com o objetivo de quantificar informação em e entre sequencias ómicas. Em seguida, apresentamos modelos de contexto finito (FCMs), modelos de Markov tolerantes a substituição (STMMs) e uma combinação dos dois, especializados na modelagem de dados biológicos, para compactação e análise de dados. Para facilitar o armazenamento do dilúvio de dados acima mencionado, desenvolvemos dois compressores de dados sem perda para dados genómicos e um para dados proteómicos. Os métodos funcionam com base em (a) uma combinação de FCMs e STMMs ou (b) na combinação mencionada, juntamente com modelos de repetição e um modelo de previsão competitiva. Testados em vários dados sintéticos e reais mostraram a sua eficiência sobre os métodos do estado-de-arte em termos de taxa de compressão. A privacidade dos dados genómicos é um tópico recentemente focado nos desenvolvimentos do campo da medicina personalizada. Propomos uma ferramenta capaz de representar dados genómicos de maneira criptografada com segurança e, ao mesmo tempo, compactando as sequencias FASTA e FASTQ para um fator de três. Emprega criptografia AES acompanhada de um mecanismo de embaralhamento para melhorar a segurança dos dados. Os resultados mostram que ´e mais rápido que os algoritmos de uso geral e específico. As técnicas de compressão podem ser exploradas para análise de dados ómicos. Tendo isso em mente, investigamos a identificação de regiões únicas em uma espécie em relação a espécies próximas, que nos podem dar uma visão das características evolutivas. Para esse fim, desenvolvemos duas ferramentas livres de alinhamento que podem encontrar e visualizar com precisão regiões distintas entre duas coleções de sequências de DNA ou proteínas. Testados em humanos modernos em relação a neandertais, encontrámos várias regiões ausentes nos neandertais que podem expressar novas funcionalidades associadas à evolução dos humanos modernos. Por último, investigamos a identificação de rearranjos genómicos, que têm papéis importantes em desordens genéticas e cancro, empregando uma técnica de compressão. Para esse fim, desenvolvemos uma ferramenta capaz de localizar e visualizar com precisão os rearranjos em pequena e grande escala entre duas sequências genómicas. Os resultados da aplicação da ferramenta proposta, em vários dados sintéticos e reais, estão em conformidade com os resultados parcialmente relatados por abordagens laboratoriais, por exemplo, análise FISH.Programa Doutoral em Engenharia Informátic

    Diseño, implementación y optimización del sistema de compresión de imágenes sobre el ordenador de a bordo del proyecto de nanosátelite Eye-Sat

    Get PDF
    Eye-Sat es un Proyecto de nano satélites, dirigido por el CNES (Centre National d’Etudes Spatiales) y desarrollado principalmente por estudiantes de varias escuelas de ingeniería del territorio francés. El objetivo de este pequeño telescopio no solo radica en la oportunidad de realizar la demostración de distintos dispositivos tecnológicos, sino que también tiene como misión la adquisición de fotografías en la bandas de color e infrarrojo de la vía Láctea, así como el estudio de la intensidad y polarización de la luz Zodiacal. Los requerimientos de la misión exigen el desarrollo de un algoritmo de compresión de imágenes sin pérdidas para las imágenes “Color Filter Array” CFA (Bayer) e infrarrojas adquiridas por el satélite. Como miembro de la comisión consultativa para los sistemas espaciales, CNES ha seleccionado el estándar CCSDS-123.0-B como algoritmo base para cumplir los requerimientos de la misión. A este algoritmo se le añadirán modificaciones o mejoras, adaptadas a las imágenes tipo, con el fin de mejorar las prestaciones de compresión y de complejidad. La implementación y la optimización del algoritmo será desarrollada sobre la plataforma Xilinx Zynq® All Programmable SoC, el cual incluye una FPGA y un Dual-core ARM® Cortex™-A9 processor with NEONTM DSP/FPU Engine

    Web Engineering for Workflow-based Applications: Models, Systems and Methodologies

    Get PDF
    This dissertation presents novel solutions for the construction of Workflow-based Web applications: The Web Engineering DSL Framework, a stakeholder-oriented Web Engineering methodology based on Domain-Specific Languages; the Workflow DSL for the efficient engineering of Web-based Workflows with strong stakeholder involvement; the Dialog DSL for the usability-oriented development of advanced Web-based dialogs; the Web Engineering Reuse Sphere enabling holistic, stakeholder-oriented reuse

    Model-based programming environments for spreadsheets

    Get PDF
    Spreadsheets can be seen as a flexible programming environment. However, they lack some of the concepts of regular programming languages, such as structured data types. This can lead the user to edit the spreadsheet in a wrong way and perhaps cause corrupt or redundant data. We devised a method for extraction of a relational model from a spreadsheet and the subsequent embedding of the model back into the spreadsheet to create a model-based spreadsheet programming environment. The extraction algorithm is specific for spreadsheets since it considers particularities such as layout and column arrangement. The extracted model is used to generate formulas and visual elements that are then embedded in the spreadsheet helping the user to edit data in a correct way. We present preliminary experimental results from applying our approach to a sample of spreadsheets from the EUSES Spreadsheet Corpus. Finally, we conduct the first systematic empirical study to assess the effectiveness and efficiency of this approach. A set of spreadsheet end users worked with two different model-based spreadsheets, and we present and analyze here the results achieved.This work is funded by ERDF European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the FCT Fundacao para a Ciencia e a Tecnologia (Portuguese Foundation for Science and Technology) within project FCOMP-01-0124-FEDER-010048. The first author is supported by the FCT grant SFRH/BPD/73358/2010

    The SP theory of intelligence: benefits and applications

    Full text link
    This article describes existing and expected benefits of the "SP theory of intelligence", and some potential applications. The theory aims to simplify and integrate ideas across artificial intelligence, mainstream computing, and human perception and cognition, with information compression as a unifying theme. It combines conceptual simplicity with descriptive and explanatory power across several areas of computing and cognition. In the "SP machine" -- an expression of the SP theory which is currently realized in the form of a computer model -- there is potential for an overall simplification of computing systems, including software. The SP theory promises deeper insights and better solutions in several areas of application including, most notably, unsupervised learning, natural language processing, autonomous robots, computer vision, intelligent databases, software engineering, information compression, medical diagnosis and big data. There is also potential in areas such as the semantic web, bioinformatics, structuring of documents, the detection of computer viruses, data fusion, new kinds of computer, and the development of scientific theories. The theory promises seamless integration of structures and functions within and between different areas of application. The potential value, worldwide, of these benefits and applications is at least $190 billion each year. Further development would be facilitated by the creation of a high-parallel, open-source version of the SP machine, available to researchers everywhere.Comment: arXiv admin note: substantial text overlap with arXiv:1212.022
    corecore