
    Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management

    Spreadsheet software is the tool of choice for interactive ad-hoc data management, with adoption by billions of users. However, spreadsheets are not scalable, unlike database systems. On the other hand, database systems, while highly scalable, do not support interactivity as a first-class primitive. We are developing DataSpread to holistically integrate spreadsheets as a front-end interface with databases as a back-end datastore, providing scalability to spreadsheets and interactivity to databases, an integration we term presentational data management (PDM). In this paper, we make a first step towards this vision: developing a storage engine for PDM, studying how to flexibly represent spreadsheet data within a database and how to support and maintain access by position. We first conduct an extensive survey of spreadsheet use to motivate our functional requirements for a storage engine for PDM. We develop a natural set of mechanisms for flexibly representing spreadsheet data and demonstrate that identifying the optimal representation is NP-Hard; however, we develop an efficient approach to identify the optimal representation from an important and intuitive subclass of representations. We extend our mechanisms with positional access mechanisms that don't suffer from cascading update issues, leading to constant time access and modification performance. We evaluate these representations on a workload of typical spreadsheets and spreadsheet operations, providing up to 20% reduction in storage, and up to 50% reduction in formula evaluation time.

    A qualitative approach to the identification, visualisation and interpretation of repetitive motion patterns in groups of moving point objects

    Discovering repetitive patterns is important in a wide range of research areas, such as bioinformatics and human movement analysis. This study puts forward a new methodology to identify, visualise and interpret repetitive motion patterns in groups of Moving Point Objects (MPOs). The methodology consists of three steps. First, motion patterns are qualitatively described using the Qualitative Trajectory Calculus (QTC). Second, a similarity analysis is conducted to compare motion patterns and identify repetitive patterns. Third, repetitive motion patterns are represented and interpreted in a continuous triangular model. As an illustration of the usefulness of combining these hitherto separated methods, a specific movement case is examined: Samba dance, a rhythmical dance with many repetitive movements. The results show that the presented methodology is able to successfully identify, visualise and interpret the contained repetitive motions.

    Stochastic dynamics of macromolecular-assembly networks

    The formation and regulation of macromolecular complexes provides the backbone of most cellular processes, including gene regulation and signal transduction. The inherent complexity of assembling macromolecular structures makes current computational methods strongly limited for understanding how the physical interactions between cellular components give rise to systemic properties of cells. Here we present a stochastic approach to study the dynamics of networks formed by macromolecular complexes in terms of the molecular interactions of their components. Exploiting key thermodynamic concepts, this approach makes it possible to both estimate reaction rates and incorporate the resulting assembly dynamics into the stochastic kinetics of cellular networks. As prototype systems, we consider the lac operon and phage lambda induction switches, which rely on the formation of DNA loops by proteins and on the integration of these protein-DNA complexes into intracellular networks. This cross-scale approach offers an effective starting point to move forward from network diagrams, such as those of protein-protein and DNA-protein interaction networks, to the actual dynamics of cellular processes. Comment: Open Access article available at http://www.nature.com/msb/journal/v2/n1/full/msb4100061.htm

    Protein sectors: statistical coupling analysis versus conservation

    Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed "sectors". The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. Comment: 36 pages, 17 figures