Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources
Apache Calcite is a foundational software framework that provides query
processing, optimization, and query language support to many popular
open-source data processing systems such as Apache Hive, Apache Storm, Apache
Flink, Druid, and MapD. Calcite's architecture consists of a modular and
extensible query optimizer with hundreds of built-in optimization rules, a
query processor capable of processing a variety of query languages, an adapter
architecture designed for extensibility, and support for heterogeneous data
models and stores (relational, semi-structured, streaming, and geospatial).
This flexible, embeddable, and extensible architecture is what makes Calcite an
attractive choice for adoption in big-data frameworks. It is an active project
that continues to introduce support for new types of data sources, query
languages, and approaches to query processing and optimization.
Comment: SIGMOD'1
Extending the data dictionary for data/knowledge management
Current relational database technology provides the means for efficiently storing and retrieving large amounts of data. By combining techniques learned from the field of artificial intelligence with this technology, it is possible to expand the capabilities of such systems. This paper suggests using the expanded domain concept, an object-oriented organization, and the storing of knowledge rules within the relational database as a solution to the unique problems associated with CAD/CAM and engineering data.
Information Integration - the process of integration, evolution and versioning
At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those information sources. Gathering this information is a tedious and time-consuming job. Automating this process would assist users in their task. Integrating the information sources provides a single global information source containing all the information needed. All of these information sources also change over time. With each change to an information source, the schema of that source can change as well. The data contained in the information source, however, cannot be converted every time, because of the huge amount of data that would have to be converted to conform to the most recent schema.
In this report we describe the current approaches to information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues that arise when integrating XML data sources.
Knowledge Discovery in the SCADA Databases Used for the Municipal Power Supply System
This paper examines problems in developing an intelligent data analysis
system to support decision making in the management of municipal power supply
services. The management problems of the municipal power supply system are
specified with regard to modern trends in new technologies that allow for
increased energy efficiency. Findings are presented from an analysis of the
problems of integrated computer-aided control of the city's power supply. A
hierarchical management decomposition model is considered. The objective is
defined as increasing energy efficiency so as to minimize expenditures and
energy losses during the generation and transportation of energy carriers to
the consumer, optimizing power consumption at a prescribed level of
reliability of pipelines and networks, and satisfying consumers. To improve
decision support, a new approach is proposed to monitoring the engineering
systems and technological processes related to energy consumption and
transportation, using geospatial analysis and Knowledge Discovery in Databases
(KDD) technologies. Data acquisition for the analytical problems is carried
out in a wireless heterogeneous medium, which includes soft-touch VPN
segments of ZigBee technology implementing the 6LoWPAN standard over the IEEE
802.15.4 standard, as well as segments of cellular communication networks.
JBoss Application Server is used as the server platform for the tools that
retrieve data collected from sensor nodes, PLCs, and energy consumption
recording devices. The KDD tools are developed using the Java Enterprise
Edition platform together with Spring and Hibernate ORM technologies.
A Review of integrity constraint maintenance and view updating techniques
Two interrelated problems may arise when updating a database. On one
hand, when an update is applied to the database, integrity constraints
may become violated. In such a case, the integrity constraint maintenance
approach tries to obtain additional updates to keep integrity
constraints satisfied. On the other hand, when updates of derived or
view facts are requested, a view updating mechanism must be applied to
translate the update request into correct updates of the underlying base
facts.
This survey reviews the research performed on integrity constraint
maintenance and view updating. A general framework is proposed to
classify and compare methods that tackle integrity constraint
maintenance and/or view updating. Then, we analyze some of these methods
in more detail to identify their actual contribution and the main
limitations they may present.
Postprint (published version)