95,445 research outputs found

    Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources

    Apache Calcite is a foundational software framework that provides query processing, optimization, and query language support to many popular open-source data processing systems such as Apache Hive, Apache Storm, Apache Flink, Druid, and MapD. Calcite's architecture consists of a modular and extensible query optimizer with hundreds of built-in optimization rules, a query processor capable of processing a variety of query languages, an adapter architecture designed for extensibility, and support for heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). This flexible, embeddable, and extensible architecture is what makes Calcite an attractive choice for adoption in big-data frameworks. It is an active project that continues to introduce support for new types of data sources, query languages, and approaches to query processing and optimization. Comment: SIGMOD'1

    Extending the data dictionary for data/knowledge management

    Current relational database technology provides the means for efficiently storing and retrieving large amounts of data. By combining techniques learned from the field of artificial intelligence with this technology, it is possible to expand the capabilities of such systems. This paper suggests using the expanded domain concept, an object-oriented organization, and the storing of knowledge rules within the relational database as a solution to the unique problems associated with CAD/CAM and engineering data.

    Information Integration - the process of integration, evolution and versioning

    At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those sources. Gathering this information is a tedious and time-consuming job, and automating the process would assist the user in this task. Integrating the information sources provides a global information source in which all the needed information is present. All of these information sources also change over time. With each change to an information source, the schema of that source can change as well. The data contained in the information source, however, cannot be changed every time, because of the huge amount of data that would have to be converted to conform to the most recent schema. In this report we describe the current approaches to information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues when integrating XML data sources.

    Knowledge Discovery in the SCADA Databases Used for the Municipal Power Supply System

    This paper examines problems in developing an intelligent data analysis system to support decision making in the management of municipal power supply services. The management problems of the municipal power supply system are specified, taking into account current trends in new technologies that increase energy efficiency. The findings of an analysis of the system problems related to integrated computer-aided control of the city's power supply are given. A hierarchy-level management decomposition model is considered. The objective is defined as increasing energy efficiency in order to minimize expenditures and energy losses during the generation and transportation of energy carriers to the consumer, optimizing power consumption at the prescribed level of reliability of pipelines and networks, and satisfying consumers. To improve decision support, a new approach is proposed for monitoring engineering systems and technological processes related to energy consumption and transportation, using geospatial analysis and Knowledge Discovery in Databases (KDD) technologies. Data acquisition for the analytical problems is carried out in a wireless heterogeneous medium that includes soft-touch VPN segments of ZigBee technology realizing the 6LoWPAN standard over the IEEE 802.15.4 standard, as well as segments of cellular communication networks. JBoss Application Server is used as the server-based platform for the tools that retrieve data collected from sensor nodes, PLCs, and energy consumption recording devices. The KDD tools are developed using the Java Enterprise Edition platform together with Spring and the Hibernate ORM.

    A Review of integrity constraint maintenance and view updating techniques

    Two interrelated problems may arise when updating a database. On one hand, when an update is applied to the database, integrity constraints may become violated. In such a case, the integrity constraint maintenance approach tries to obtain additional updates that keep the integrity constraints satisfied. On the other hand, when updates of derived or view facts are requested, a view updating mechanism must be applied to translate the update request into correct updates of the underlying base facts. This survey reviews the research performed on integrity constraint maintenance and view updating. A general framework is proposed to classify and compare methods that tackle integrity constraint maintenance and/or view updating. We then analyze some of these methods in more detail to identify their actual contribution and the main limitations they may present. Postprint (published version)