Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management
Spreadsheet software is the tool of choice for interactive ad-hoc data
management, with adoption by billions of users. However, spreadsheets are not
scalable, unlike database systems. On the other hand, database systems, while
highly scalable, do not support interactivity as a first-class primitive. We
are developing DataSpread, to holistically integrate spreadsheets as a
front-end interface with databases as a back-end datastore, providing
scalability to spreadsheets, and interactivity to databases, an integration we
term presentational data management (PDM). In this paper, we make a first step
towards this vision: developing a storage engine for PDM, studying how to
flexibly represent spreadsheet data within a database and how to support and
maintain access by position. We first conduct an extensive survey of
spreadsheet use to motivate our functional requirements for a storage engine
for PDM. We develop a natural set of mechanisms for flexibly representing
spreadsheet data and demonstrate that identifying the optimal representation is
NP-Hard; however, we develop an efficient approach to identify the optimal
representation from an important and intuitive subclass of representations. We
extend our mechanisms with positional access mechanisms that don't suffer from
cascading update issues, leading to constant time access and modification
performance. We evaluate these representations on a workload of typical
spreadsheets and spreadsheet operations, providing up to 20% reduction in
storage, and up to 50% reduction in formula evaluation time.
A qualitative approach to the identification, visualisation and interpretation of repetitive motion patterns in groups of moving point objects
Discovering repetitive patterns is important in a wide range of research areas, such as bioinformatics and human movement analysis. This study puts forward a new methodology to identify, visualise and interpret repetitive motion patterns in groups of Moving Point Objects (MPOs). The methodology consists of three steps. First, motion patterns are qualitatively described using the Qualitative Trajectory Calculus (QTC). Second, a similarity analysis is conducted to compare motion patterns and identify repetitive patterns. Third, repetitive motion patterns are represented and interpreted in a continuous triangular model. As an illustration of the usefulness of combining these hitherto separated methods, a specific movement case is examined: Samba dance, a rhythmical dance with many repetitive movements. The results show that the presented methodology is able to successfully identify, visualise and interpret the contained repetitive motions.
Stochastic dynamics of macromolecular-assembly networks
The formation and regulation of macromolecular complexes provides the
backbone of most cellular processes, including gene regulation and signal
transduction. The inherent complexity of assembling macromolecular structures
makes current computational methods strongly limited for understanding how the
physical interactions between cellular components give rise to systemic
properties of cells. Here we present a stochastic approach to study the
dynamics of networks formed by macromolecular complexes in terms of the
molecular interactions of their components. Exploiting key thermodynamic
concepts, this approach makes it possible to both estimate reaction rates and
incorporate the resulting assembly dynamics into the stochastic kinetics of
cellular networks. As prototype systems, we consider the lac operon and phage
lambda induction switches, which rely on the formation of DNA loops by proteins
and on the integration of these protein-DNA complexes into intracellular
networks. This cross-scale approach offers an effective starting point to move
forward from network diagrams, such as those of protein-protein and DNA-protein
interaction networks, to the actual dynamics of cellular processes. Comment: Open Access article available at
http://www.nature.com/msb/journal/v2/n1/full/msb4100061.htm
Protein sectors: statistical coupling analysis versus conservation
Statistical coupling analysis (SCA) is a method for analyzing multiple
sequence alignments that was used to identify groups of coevolving residues
termed "sectors". The method applies spectral analysis to a matrix obtained by
combining correlation information with sequence conservation. It has been
asserted that the protein sectors identified by SCA are functionally
significant, with different sectors controlling different biochemical
properties of the protein. Here we reconsider the available experimental data
and note that it involves almost exclusively proteins with a single sector. We
show that in this case sequence conservation is the dominating factor in SCA,
and can alone be used to make statistically equivalent functional predictions.
Therefore, we suggest shifting the experimental focus to proteins for which SCA
identifies several sectors. Correlations in protein alignments, which have been
shown to be informative in a number of independent studies, would then be less
dominated by sequence conservation. Comment: 36 pages, 17 figures