
    Data Centric Peer-to-Peer Communication in Power Grids

    We study the use of peer-to-peer-based declarative data management to enable efficient monitoring and control of power transmission and distribution networks. We propose methods and an architecture for data-centric communication in power networks; a proof-of-concept decentralized communication infrastructure is presented that uses and advances state-of-the-art peer-to-peer and distributed data management protocols to provide real-time access to network state information. We propose methods for adaptive network reconfiguration and self-repair mechanisms to handle fault situations. To efficiently handle complex queries, we present a centralized metadata index and propose a query language and execution method that allow us to handle high-volume data streams in-network.
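
    The abstract names the data-centric idea but shows no mechanism. Below is a minimal Python sketch of one plausible building block: consistent-hashing-based key placement, where a measurement is addressed by its key rather than by a server address, in the spirit of the DHT-style peer-to-peer protocols the paper builds on. All identifiers here (Ring, put, get, the key scheme) are illustrative assumptions, not the paper's API.

        import hashlib
        from bisect import bisect_right

        # Toy consistent-hash ring: peers and keys share one hash space, and
        # the first peer clockwise from a key's hash owns that key. This is
        # only a sketch of data-centric addressing, not the paper's protocol.
        class Ring:
            def __init__(self, peers):
                self._ring = sorted((self._h(p), p) for p in peers)

            @staticmethod
            def _h(s):
                return int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")

            def owner(self, key):
                hashes = [h for h, _ in self._ring]
                i = bisect_right(hashes, self._h(key)) % len(self._ring)
                return self._ring[i][1]

        peers = ["peer-a", "peer-b", "peer-c"]
        ring = Ring(peers)
        store = {p: {} for p in peers}  # stand-in for per-peer local storage

        def put(key, value):            # any peer can write by key alone ...
            store[ring.owner(key)][key] = value

        def get(key):                   # ... and any peer can read by key alone
            return store[ring.owner(key)].get(key)

        put("substation-17/feeder-3/voltage", 231.4)   # hypothetical key scheme
        print(get("substation-17/feeder-3/voltage"))   # -> 231.4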

    What if an SQL Statement Returned a Database?

    Every SQL statement is limited to returning a single, possibly denormalized, table. This design decision has far-reaching consequences: (1) for database users, in terms of slow query performance, long query result transfer times, and usability issues of SQL in web applications and object-relational mappers; (2) for database architects, when designing query optimizers, leading to logical (algebraic) join enumeration effort, memory consumption for intermediate result materialization, and physical operator selection effort. So, basically, the entire query optimization stack is shaped by that design decision. In this paper, we argue that the single-table limitation should be dropped. We extend the SELECT clause of SQL with a keyword 'RESULTDB' to support returning a result database. Our approach has clear semantics: our extended SQL returns subsets of all tables with only those tuples that would be part of the traditional (single-table) query result set, but without performing any denormalization through joins. Our SQL extension is downward compatible. Moreover, we discuss the surprisingly long list of benefits of our approach. First, for database users: far simpler and more readable application code, better query performance, smaller query results, and better query result transfer times. Second, for database architects, we present how to leverage existing closed-source systems as well as how to change open-source database systems to support our feature. We propose several algorithms to integrate our feature into both closed-source and open-source database systems. We present an initial experimental study with promising results.
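
    Since RESULTDB is a proposed keyword rather than shipping SQL, a hedged emulation on a stock engine may clarify the stated semantics: for each base table, return only the tuples that would participate in the traditional join result, with no denormalization. The schema and queries below are invented for illustration; one semi-join per base table reproduces the described result database.

        import sqlite3

        # Hedged sketch: SQLite has no RESULTDB keyword. We emulate the
        # abstract's semantics by reducing each base table with a semi-join.
        con = sqlite3.connect(":memory:")
        con.executescript("""
            CREATE TABLE customer(cid INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE orders(oid INTEGER PRIMARY KEY, cid INTEGER, item TEXT);
            INSERT INTO customer VALUES (1,'Ada'), (2,'Bob'), (3,'Cid');
            INSERT INTO orders   VALUES (10,1,'disk'), (11,1,'cpu'), (12,2,'ram');
        """)

        join_cond = "customer.cid = orders.cid"

        # The traditional result: one denormalized table, with duplication.
        flat = con.execute(f"SELECT * FROM customer, orders WHERE {join_cond}").fetchall()
        print(len(flat), "rows in the single denormalized result")

        # The emulated result *database*: one reduced subset per base table,
        # containing exactly the tuples that contribute to the join result.
        result_db = {
            "customer": con.execute(
                "SELECT * FROM customer WHERE EXISTS "
                f"(SELECT 1 FROM orders WHERE {join_cond})").fetchall(),
            "orders": con.execute(
                "SELECT * FROM orders WHERE EXISTS "
                f"(SELECT 1 FROM customer WHERE {join_cond})").fetchall(),
        }
        print(result_db)   # Ada and Bob survive; Cid has no orders; no duplication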

    Object Graph Programming

    We introduce Object Graph Programming (OGO), which enables reading and modifying an object graph (i.e., the entire state of the object heap) via declarative queries. OGO models the objects and their relations in the heap as an object graph, thereby treating the heap as a graph database: each node in the graph is an object (e.g., an instance of a class or an instance of a metadata class) and each edge is a relation between objects (e.g., a field of one object references another object). We leverage Cypher, the most popular query language for graph databases, as OGO's query language. Unlike LINQ, which uses collections (e.g., List) as a source of data, OGO views the entire object graph as a single "collection". OGO is ideal for querying collections (just like LINQ), introspecting the runtime system state (e.g., finding all instances of a given class or accessing fields via reflection), and writing assertions that have access to the entire program state. We prototyped OGO for Java in two ways: (a) by translating an object graph into a Neo4j database on which we run Cypher queries, and (b) by implementing our own in-memory graph query engine that directly queries the object heap. We used OGO to rewrite hundreds of statements in large open-source projects into OGO queries. We report our experience and the performance of our prototypes.
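
    OGO itself targets Java and Cypher, neither of which is shown in the abstract. As a loose analogue only, Python's gc module can treat the live heap as a graph in the same spirit: nodes are tracked objects, edges are reference links. The Node class and the Cypher strings in the comments are our own illustrations, not OGO code.

        import gc

        # Loose Python analogue of heap-as-graph querying; not OGO itself.
        class Node:
            def __init__(self, label):
                self.label = label
                self.next = None

        a, b = Node("a"), Node("b")
        a.next = b

        # Analogue of "MATCH (n:Node) RETURN n":
        # scan all GC-tracked objects for instances of a class.
        nodes = [o for o in gc.get_objects() if isinstance(o, Node)]

        # Analogue of "MATCH (n:Node)-->(m:Node) RETURN n, m":
        # follow outgoing references (via each instance's attribute dict).
        edges = [(n.label, m.label)
                 for n in nodes
                 for m in gc.get_referents(n.__dict__)
                 if isinstance(m, Node)]

        print(sorted(x.label for x in nodes))  # ['a', 'b']
        print(edges)                           # [('a', 'b')]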

    Towards a Formal Theory of Interoperability

    This dissertation proposes a formal theory of interoperability that explains (1) what interoperability is, as opposed to how it works; (2) how to tell whether two or more systems can interoperate; and (3) how to identify whether systems are interoperating or merely exchanging bits and bytes. The research provides a formal model of data in M&S that captures all possible representations of a real or imagined thing and distinguishes between existential dependencies and transformational dependencies. Existential dependencies capture the relationships within a model, while transformational dependencies capture the relationships between interactions with a model. These definitions are used to formally specify interoperation, the ability to exchange information, as a necessary condition for interoperability. Theorems of interoperation that capture the nature and boundaries of the interoperation space, and how to measure it, are formulated. Interoperability is formally captured as the subset of the interoperation space for which transformational dependencies can be fulfilled. Theorems of interoperability that capture the interoperability space and how to measure it are presented. Using graph theory and complexity theory, the model of data is reformulated as a graph, and the complexity of interoperation and interoperability is shown to be at least NP-complete. Model Based Data Engineering (MBDE) is formally defined using the model of data introduced earlier and transformed into a heuristic that supports interoperability. This heuristic is shown to be more powerful than current approaches in that it is consistent and can easily be verified.
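
    The abstract's central definition can be restated in one line. The notation below is assumed for illustration; the dissertation's own symbols are not given here.

        % Assumed notation, for illustration only: O(A,B) is the interoperation
        % space of systems A and B (the information exchanges x that A can
        % produce and B can accept), and TD(x) states that the transformational
        % dependencies of x are fulfilled. The abstract then defines
        \[
          \mathrm{Interoperability}(A,B) \;=\; \{\, x \in O(A,B) \mid TD(x) \,\}
          \;\subseteq\; O(A,B).
        \]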

    Digital Energetics

    Media and energy require joint theorization, as they are bound together across contemporary informational and fossil regimes. Digital Energetics traces the contours of a media analytic of energy and an energy analytic of media across the cultural, environmental, and labor relations they subtend. Focusing specifically on digital operations, its authors analyze how data and energy have jointly modulated the character of data work and politics in a warming world.

    Contributions to the Definition of a New Language for Working with Relational Databases

    The goal of the DOMINUS project is to define a DBMS model suited to the development of autonomous data repository services capable of maintaining a high standard of integrity and reliability in a contemporary application context. This thesis, carried out within the DOMINUS project, contributes to the definition of a first language conforming to this model, Discipulus, and to the implementation of a first experimental translator for this language. The DOMINUS model remains based on E. F. Codd's relational model, first because it is simple, easy to grasp, and rests on solid theoretical foundations that notably allow the associated manipulation languages to be defined formally, and second because it is proven, as demonstrated by more than thirty years of uninterrupted predominance. The evolution of information management has seen the emergence of new applications (integrated management systems, image processing, video, and so on) requiring the use of increasingly large and complex databases. These new applications have exposed major shortcomings of existing relational systems based on the SQL language: (1) the inadequacy of the relational model for directly representing complex data, such as structured medical records, radiographic images, or annotated texts; (2) insufficient performance in manipulating such data. These shortcomings have led some to advocate replacing the relational model with the object-oriented model. Indeed, the notion of object (more precisely, of class) makes it possible to model complex, composite elements of the real world. The first object-oriented database management systems appeared in 1990, but, given the performance and maturity of relational database systems, object systems never gained a significant foothold within organizations. The path explored here is instead the integration of the object model into the relational model, the latter remaining preeminent. Adopting both structures (the relation and the class) therefore seems necessary in order to meet the needs and requirements of complex applications while retaining the simplicity and conceptual cohesion necessary for verification and validation. The DOMINUS model is thus inspired by the foundational work of E. F. Codd and his successors, including C. J. Date and H. Darwen [S1], as well as by the algorithmic and typing models of B. Meyer [L13]. In the end, the Discipulus language retains several assets of SQL, also draws on the Tutorial D language, and borrows the general structure and several syntactic mechanisms of the Eiffel language [L13]. Our proposal also differs substantially in both substance and form [L1, L7]. These contributions are presented throughout the thesis.
    The Discipulus language was designed to allow the rigorous expression of complex models (full integration of classes, tuples, and relations into a single homogeneous and coherent type system) while promoting reuse (a package system intended for developing cohesive modules while allowing their simple reuse in the development of other systems), evolvability (the adoption of multiple inheritance avoids code redundancy and facilitates software extensibility and, consequently, evolvability, without compromising integrity and reliability), and reliability (incorporation of design-by-contract principles and their extension to relational operators, and coherent handling of nullability).
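
    The abstract describes design goals rather than syntax, so Discipulus code cannot be reproduced here. The Python sketch below mirrors just one stated goal, extending design-by-contract to relational operators, using an invented Relation class with a class invariant and pre/postconditions on projection.

        # Illustrative only: this is not Discipulus, just design-by-contract
        # applied to a relational operator, as the abstract describes.
        class Relation:
            def __init__(self, heading, tuples):
                self.heading = frozenset(heading)   # attribute names
                self.tuples = set(tuples)           # each tuple: frozenset of (attr, value) pairs
                self._invariant()

            def _invariant(self):
                # Class invariant: every tuple carries exactly the heading's attributes.
                for t in self.tuples:
                    assert {a for a, _ in t} == set(self.heading), "malformed tuple"

            def project(self, attrs):
                # Precondition: may only project onto existing attributes.
                assert set(attrs) <= self.heading, "unknown attribute in projection"
                out = Relation(attrs, {frozenset((a, v) for a, v in t if a in attrs)
                                       for t in self.tuples})
                # Postcondition: projection never grows the relation.
                assert len(out.tuples) <= len(self.tuples)
                return out

        r = Relation({"id", "name"}, [frozenset({("id", 1), ("name", "Ada")}),
                                      frozenset({("id", 2), ("name", "Bob")})])
        print(len(r.project({"name"}).tuples))  # 2 distinct names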

    Effective data versioning for collaborative data analytics

    With the massive proliferation of datasets in a variety of sectors, data science teams in these sectors spend vast amounts of time collaboratively constructing, curating, and analyzing these datasets. Versions of datasets are routinely generated during this data science process via various data processing operations such as data transformation and cleaning, feature engineering, and normalization, among others. However, no existing systems enable us to effectively store, track, and query these versioned datasets, leading to massive redundancy in versioned data storage and making true collaboration and sharing impossible. In this thesis, we develop solutions for versioned data management for collaborative data analytics. In the first part of this thesis, we extend a relational database to support versioning of structured data. Specifically, we build a system, OrpheusDB, on top of a relational database with a carefully designed data representation and an intelligent partitioning algorithm for fast version control operations. OrpheusDB inherits many of the benefits of relational databases, while also compactly storing, keeping track of, and recreating versions on demand. However, OrpheusDB implicitly makes a few assumptions, namely: (a) the SQL assumption: a SQL-like language is the best fit for querying data and versioning information; (b) the structural assumption: the data is in a relational format with a regular structure; and (c) the from-scratch assumption: users adopt OrpheusDB from the very beginning of their project and register each data version, along with full metadata, in the system. In the second part of this thesis, we remove each of these assumptions, one at a time. First, we remove the SQL assumption and propose a generalized query language for querying data along with versioning and provenance information. Second, we remove the structural assumption and develop solutions for compact storage and fast retrieval of arbitrary data representations. Finally, we remove the “from-scratch” assumption by developing techniques to infer lineage relationships among versions residing in an existing data repository.
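
    A hedged sketch of the storage idea attributed to OrpheusDB above: store every record once and represent a version as a set of record ids, so committing a new version reuses unchanged records instead of copying them. The commit/checkout names and the flat layout are assumptions for illustration; the system's actual representation and partitioning algorithm are more elaborate.

        # Deduplicated versioned storage, sketched. Not OrpheusDB's real code.
        records = {}        # rid -> record, stored once across all versions
        versions = {}       # version name -> frozenset of rids
        _next_rid = 0

        def commit(name, rows, parent=None):
            """Create a version from `rows`, reusing ids of unchanged records."""
            global _next_rid
            seen = {records[r]: r for r in versions.get(parent, ())}
            rids = set()
            for row in rows:
                if row in seen:                # unchanged record: no new copy
                    rids.add(seen[row])
                else:                          # new or changed record: store once
                    records[_next_rid] = row
                    rids.add(_next_rid)
                    _next_rid += 1
            versions[name] = frozenset(rids)

        def checkout(name):
            return [records[r] for r in sorted(versions[name])]

        commit("v1", [("ada", 1), ("bob", 2)])
        commit("v2", [("ada", 1), ("bob", 3)], parent="v1")  # only bob changed
        print(len(records))     # 3 stored records back 4 logical rows
        print(checkout("v2"))   # [('ada', 1), ('bob', 3)]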

    Flexibility in Data Management

    With the ongoing expansion of information technology, new fields of application requiring data management emerge virtually every day. In our knowledge culture, increasing amounts of data and a workforce organized in more creativity-oriented ways also radically change traditional fields of application and call established assumptions about data management into question. For instance, investigative analytics and agile software development move towards a very agile and flexible handling of data. As the primary facilitators of data management, database systems have to reflect and support these developments. However, traditional database management technology, in particular relational database systems, is built on assumptions of relatively stable application domains. The need to model all data up front in a prescriptive database schema has earned relational database management systems the reputation among developers of being inflexible, dated, and cumbersome to work with. Nevertheless, relational systems still dominate the database market. They are a proven, standardized, and interoperable technology, well known in IT departments with a workforce of experienced and trained developers and administrators. This thesis aims at resolving the growing contradiction between the popularity and omnipresence of relational systems in companies and their increasingly bad reputation among developers. It adapts relational database technology towards more agility and flexibility. We envision a descriptive schema-comes-second relational database system, which is entity-oriented instead of schema-oriented; descriptive rather than prescriptive. The thesis provides four main contributions: (1) a flexible relational data model, which frees relational data management from having a prescriptive schema; (2) autonomous physical entity domains, which partition self-descriptive data according to their schema properties for better query performance; (3) a freely adjustable storage engine, which allows adapting the physical data layout to properties of the data and of the workload; and (4) a self-managed indexing infrastructure, which autonomously collects and adapts index information in the presence of dynamic workloads and evolving schemas. The flexible relational data model is the thesis' central contribution. It describes the functional appearance of the descriptive schema-comes-second relational database system. The other three contributions improve components in the architecture of database management systems to increase the query performance and the manageability of descriptive schema-comes-second relational database systems. We are confident that these four contributions can help pave the way to a more flexible future for relational database management technology.
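
    A minimal sketch of the schema-comes-second idea under our own assumptions: entities are accepted without an up-front schema, grouped into domains by the attribute combination they actually exhibit, and a descriptive schema is derived afterwards. EntityStore and its methods are invented names, not the thesis' architecture.

        from collections import defaultdict

        # Entity-oriented storage with no prescriptive schema; illustration only.
        class EntityStore:
            def __init__(self):
                self.domains = defaultdict(list)   # schema signature -> entities

            def insert(self, entity):
                # No up-front schema check: any attribute set is accepted, and
                # the entity's own attributes form its domain's signature.
                signature = frozenset(entity)
                self.domains[signature].append(entity)

            def describe(self):
                # Descriptive schema, derived *after* the data: one entry per
                # observed attribute combination, with its entity count.
                return {tuple(sorted(sig)): len(es) for sig, es in self.domains.items()}

        store = EntityStore()
        store.insert({"id": 1, "name": "Ada"})
        store.insert({"id": 2, "name": "Bob", "email": "bob@example.org"})
        store.insert({"id": 3, "name": "Cid"})
        print(store.describe())
        # {('id', 'name'): 2, ('email', 'id', 'name'): 1}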