12 research outputs found

    SciQL, Bridging the Gap between Science and Relational DBMS

    Scientific discoveries increasingly rely on the ability to efficiently grind massive amounts of experimental data using database technologies. To bridge the gap between the needs of the Data-Intensive Research fields and the current DBMS technologies, we propose SciQL (pronounced as ‘cycle’), the first SQL-based query language for scientific applications with both tables and arrays as first class citizens. It provides a seamless symbiosis of array-, set- and sequence-interpretations. A key innovation is the extension of value-based grouping of SQL:2003 with structural grouping, i.e., fixed-sized and unbounded groups based on explicit relationships between element positions. This leads to a generalisation of window-based query processing with wide applicability in science domains. This paper describes the main language features of SciQL and illustrates it using time-series concepts.

    SciQL, A query language for science applications

    Scientific applications are still poorly served by contemporary relational database systems. At best, the system provides a bridge towards an external library using user-defined functions, explicit import/export facilities or linked-in Java/C# interpreters. The time has come to rectify this with SciQL, an SQL-based query language for science applications with arrays as first class citizens. It provides a seamless symbiosis of array-, set-, and sequence-interpretations using a clear separation of the mathematical object from its underlying storage representation. The language extends value-based grouping in SQL with structural grouping, i.e., fixed-sized and unbounded groups based on explicit relationships between its index attributes. It leads to a generalization of window-based query processing. The SciQL architecture benefits from a column store system with an adaptive storage scheme, including keeping multiple representations around for reduced impedance mismatch. This paper focuses on the language features, their architectural consequences and extensive examples of their intended use.

    Astronomical Data Processing Using SciQL, an SQL Based Query Language for Array Data

    SciQL (pronounced as ‘cycle’) is a novel SQL-based array query language for scientific applications with both tables and arrays as first class citizens. SciQL lowers the entrance fee of adopting relational DBMS (RDBMS) in scienti

    Real-Time Wildfire Monitoring Using Scientific Database and Linked Data Technologies

    We present a real-time wildfire monitoring service that exploits satellite images and linked geospatial data to detect hotspots and monitor the evolution of fire fronts. The service makes heavy use of scientific database technologies (array databases, SciQL, data vaults) and linked data technologies (ontologies, linked geospatial data, stSPARQL) and is implemented on top of MonetDB and Strabon. The service is now operational at the National Observatory of Athens and was used during the previous summer by emergency managers monitoring wildfires in Greece.

    UpStream: storage-centric load management for streaming applications with update semantics

    This paper addresses the problem of minimizing the staleness of query results for streaming applications with update semantics under overload conditions. Staleness is a measure of how out-of-date the results are compared with the latest data arriving on the input. Real-time streaming applications are subject to overload due to unpredictably increasing data rates, while in many of them, we observe that data streams and queries in fact exhibit “update semantics” (i.e., the latest input data are all that really matters when producing a query result). Under such semantics, overload will cause staleness to build up. The key to avoiding this is to exploit the update semantics of applications as early as possible in the processing pipeline. In this paper, we propose UpStream, a storage-centric framework for load management over streaming applications with update semantics. We first describe how we model streams and queries that possess update semantics, providing definitions of correctness and staleness for the query results. Then, we show how staleness can be minimized based on intelligent update key scheduling techniques applied at the queue level, while preserving the correctness of the results, even for complex queries that involve sliding windows. UpStream is based on the simple idea of applying the updates in place, yet with great returns in terms of lowering staleness and memory consumption, as we also experimentally verify on the Borealis system.

    Context-aware pending interest table management scheme for NDN-based VANETs

    In terms of delivery effectiveness, Vehicular Ad hoc NETwork (VANET) applications have multiple, possibly conflicting, and disparate needs (e.g., latency, reliability, and delivery priorities). Named Data Networking (NDN) has attracted the attention of the research community for effective content retrieval and dissemination in mobile environments such as VANETs. A vehicle in a VANET application is heavily reliant on information about the content, network, and application, which can be obtained from a variety of sources. The information gathered can be used as context to make better decisions. While it is difficult to obtain the necessary context information at the IP network layer, the emergence of NDN is changing the tide. The Pending Interest Table (PIT) is an important player in NDN data retrieval. PIT size is the bottleneck due to the limited opportunities provided by current memory technologies. PIT overflow results in service disruptions, as new Interest messages cannot be added to the PIT. Adaptive, context-aware PIT entry management solutions must be introduced to NDN-based VANETs for effective content dissemination. In this context, our main contribution is a decentralised, context-aware PIT entry management (CPITEM) protocol. The simulation results show that the proposed CPITEM protocol achieves lower Interest Satisfaction Delay and effective PIT utilization based on context when compared to existing PIT entry replacement protocols.

    Window-based Query Processing

    Data streams are infinite in nature. As a result, a query that executes over data streams specifies a “window” of focus, i.e., the part of the data stream that is of interest to the query. When new data items arrive in the data stream, the window may either expand or slide to allow the query to process these new data items. Hence, queries over data streams are continuous in nature, i.e., the query is continuously re-evaluated each time the query window slides. Window-based query processing on data streams refers to the various ways and techniques for processing and evaluating continuous queries over windows of data stream items.

    Scientific Data Processing Using SciQL

    Scientific discoveries increasingly rely on the ability to efficiently grind massive amounts of experimental data using database technologies. To bridge the gap between the needs of the Data-Intensive Research fields and the current DBMS technologies, we are developing SciQL (pronounced as ‘cycle’), an SQL-based query language for scientific applications with both tables and arrays as first class citizens. It provides a seamless symbiosis of array-, set- and sequence-interpretations. A key innovation is the extension of value-based grouping of SQL:2003 with structural grouping, i.e., fixed-sized and unbounded groups based on explicit relationships between element positions. This leads to a generalisation of window-based query processing with wide applicability in science domains. In this demo, we show the main features of SciQL using use cases of remote sensing image processing.

    SciQL: bridging the gap between science and relational DBMS

    Scientific discoveries increasingly rely on the ability to efficiently grind massive amounts of experimental data using database technologies. To bridge the gap between the needs of the Data-Intensive Research fields and the current DBMS technologies, we propose SciQL (pronounced as 'cycle'), the first SQL-based query language for scientific applications with both tables and arrays as first class citizens. It provides a seamless symbiosis of array-, set- and sequence-interpretations. A key innovation is the extension of value-based grouping of SQL:2003 with structural grouping, i.e., fixed-sized and unbounded groups based on explicit relationships between element positions. This leads to a generalisation of window-based query processing with wide applicability in science domains. This paper describes the main language features of SciQL and illustrates it using time-series concepts.