860 research outputs found
Managing XML Data to optimize Performance into Object-Relational Databases
This paper propose some possibilities for manage XML data in order to optimize performance into object-relational databases. It is detailed the possibility of storing XML data into such databases, using for exemplification an Oracle database and there are tested some optimizing techniques of the queries over XMLType tables, like indexing and partitioning tables
Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources
Apache Calcite is a foundational software framework that provides query
processing, optimization, and query language support to many popular
open-source data processing systems such as Apache Hive, Apache Storm, Apache
Flink, Druid, and MapD. Calcite's architecture consists of a modular and
extensible query optimizer with hundreds of built-in optimization rules, a
query processor capable of processing a variety of query languages, an adapter
architecture designed for extensibility, and support for heterogeneous data
models and stores (relational, semi-structured, streaming, and geospatial).
This flexible, embeddable, and extensible architecture is what makes Calcite an
attractive choice for adoption in big-data frameworks. It is an active project
that continues to introduce support for the new types of data sources, query
languages, and approaches to query processing and optimization.Comment: SIGMOD'1
An Extensible "SCHEMA-LESS" Database Framework for Managing High-Throughput Semi-Structured Documents
Object-Relational database management system is an integrated hybrid cooperative approach to combine the best practices of both the relational model utilizing SQL queries and the object-oriented, semantic paradigm for supporting complex data creation. In this paper, a highly scalable, information on demand database framework, called NETMARK, is introduced. NETMARK takes advantages of the Oracle 8i object-relational database using physical addresses data types for very efficient keyword search of records spanning across both context and content. NETMARK was originally developed in early 2000 as a research and development prototype to solve the vast amounts of unstructured and semistructured documents existing within NASA enterprises. Today, NETMARK is a flexible, high-throughput open database framework for managing, storing, and searching unstructured or semi-structured arbitrary hierarchal models, such as XML and HTML
An Expressive Language and Efficient Execution System for Software Agents
Software agents can be used to automate many of the tedious, time-consuming
information processing tasks that humans currently have to complete manually.
However, to do so, agent plans must be capable of representing the myriad of
actions and control flows required to perform those tasks. In addition, since
these tasks can require integrating multiple sources of remote information ?
typically, a slow, I/O-bound process ? it is desirable to make execution as
efficient as possible. To address both of these needs, we present a flexible
software agent plan language and a highly parallel execution system that enable
the efficient execution of expressive agent plans. The plan language allows
complex tasks to be more easily expressed by providing a variety of operators
for flexibly processing the data as well as supporting subplans (for
modularity) and recursion (for indeterminate looping). The executor is based on
a streaming dataflow model of execution to maximize the amount of operator and
data parallelism possible at runtime. We have implemented both the language and
executor in a system called THESEUS. Our results from testing THESEUS show that
streaming dataflow execution can yield significant speedups over both
traditional serial (von Neumann) as well as non-streaming dataflow-style
execution that existing software and robot agent execution systems currently
support. In addition, we show how plans written in the language we present can
represent certain types of subtasks that cannot be accomplished using the
languages supported by network query engines. Finally, we demonstrate that the
increased expressivity of our plan language does not hamper performance;
specifically, we show how data can be integrated from multiple remote sources
just as efficiently using our architecture as is possible with a
state-of-the-art streaming-dataflow network query engine
- …