
    AsterixDB: A Scalable, Open Source BDMS

    AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem. Its features make it well suited to applications such as web data warehousing, social data storage and analysis, and other Big Data use cases. AsterixDB has a flexible NoSQL-style data model; a query language that supports a wide range of queries; a scalable runtime; partitioned, LSM-based data storage and indexing (including B+-tree, R-tree, and text indexes); support for external as well as natively stored data; a rich set of built-in types; support for fuzzy, spatial, and temporal types and queries; a built-in notion of data feeds for data ingestion; and transaction support akin to that of a NoSQL store. Development of AsterixDB began in 2009 and led to an initial open source release in mid-2013. This paper is the first complete description of the resulting open source AsterixDB system. It covers the system's data model, its query language, and its software architecture. Also included are a summary of the current status of the project and a first glimpse into how AsterixDB performs compared to alternative technologies, including a parallel relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data analytics platform, on tasks that both AsterixDB and the compared technology can perform. The paper closes with a brief description of some initial trials that the system has undergone and the lessons learned (and plans laid) based on those early "customer" engagements.

    A rapid prototyping/artificial intelligence approach to space station-era information management and access

    Applications of rapid prototyping and Artificial Intelligence techniques to problems associated with Space Station-era information management systems are described. In particular, the work centers on two issues: (1) intelligent man-machine interfaces applied to scientific data user support, and (2) the requirement that intelligent information management systems (IIMS) be able to efficiently process metadata updates concerning the types of data handled. The advanced IIMS represents functional capabilities driven almost entirely by the needs of potential users. The volume of scientific data projected to be generated in the Space Station era is likely to be significantly greater than the volume currently processed and analyzed. Information about scientific data must be presented clearly, concisely, and with support features that give users at all levels of expertise efficient and cost-effective data access. Additionally, mechanisms for more efficient IIMS metadata update processes must be addressed. The work reported covers the following IIMS design aspects: IIMS data and metadata modeling, including the automatic updating of IIMS-contained metadata; IIMS user-system interface considerations, including significant problems associated with remote access, user profiles, and on-line tutorial capabilities; and development of an IIMS query and browse facility, including the capability to deal with spatial information. A working prototype has been developed and is being enhanced.

    Handling imperfect information in criterion evaluation, aggregation and indexing


    The design and implementation of fuzzy query processing on sensor networks

    Sensor nodes and Wireless Sensor Networks (WSN) enable observation of the physical world at unprecedented levels of granularity. A growing number of environmental monitoring applications are being designed to leverage the data collection features of WSN, increasing the need for efficient data management techniques and for comparative analysis of those techniques. My research leverages aspects of fuzzy databases, specifically fuzzy data representation and fuzzy (flexible) queries, to improve upon the efficiency of existing data management techniques by exploiting the inherent uncertainty of the data collected by WSN. Herein I present my research contributions. I provide a classification of WSN middleware to illustrate varying approaches to data management for WSN. I identify a need to better handle the uncertainty inherent in data collected from physical environments, and to take advantage of the imprecision of that data to increase the efficiency of WSN by requiring that less information be transmitted to adequately answer queries posed by WSN monitoring applications. In this dissertation, I present a novel approach to querying WSN, in which semantic knowledge about sensor attributes is represented as fuzzy terms. I present an enhanced simulation environment that supports more flexible and realistic analysis by using cellular automata models to separately model the deployed WSN and the underlying physical environment. Simulation experiments are used to evaluate my fuzzy query approach for environmental monitoring applications. My analysis shows that using fuzzy queries improves upon other data management techniques by reducing the amount of data that needs to be collected to accurately satisfy application requests. This reduction in data transmission results in increased battery life within sensors, an important measure of cost and performance for WSN applications.
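
    As a rough illustration of the idea in the abstract above (sensor attributes described by fuzzy terms, with only sufficiently matching readings transmitted), here is a minimal Python sketch. It is not the dissertation's implementation: the trapezoidal membership shape, the `FuzzyTerm` class, the `readings_to_transmit` helper, the "hot" breakpoints, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyTerm:
    """Hypothetical fuzzy term with a trapezoidal membership function."""
    name: str
    a: float  # membership is 0 at or below a
    b: float  # membership rises linearly from a and reaches 1 at b
    c: float  # membership stays 1 up to c
    d: float  # membership falls linearly from c and is 0 at or above d

    def membership(self, x: float) -> float:
        """Degree in [0, 1] to which reading x matches this term."""
        if x <= self.a or x >= self.d:
            return 0.0
        if x < self.b:
            return (x - self.a) / (self.b - self.a)
        if x <= self.c:
            return 1.0
        return (self.d - x) / (self.d - self.c)


def readings_to_transmit(readings, term, threshold):
    """Keep only (node_id, degree) pairs whose degree meets the threshold.

    Nodes below the threshold would send nothing, which is where the
    assumed saving in transmitted data comes from.
    """
    result = []
    for node_id, value in readings.items():
        degree = term.membership(value)
        if degree >= threshold:
            result.append((node_id, degree))
    return result


# Assumed breakpoints for "hot", in degrees Celsius (illustrative only).
HOT = FuzzyTerm("hot", a=25.0, b=30.0, c=45.0, d=50.0)

# Hypothetical temperature readings keyed by node id.
readings = {"n1": 18.0, "n2": 27.5, "n3": 31.0, "n4": 48.0}

# Fuzzy query: "which nodes are hot, to degree at least 0.5?"
matches = readings_to_transmit(readings, HOT, threshold=0.5)
print(matches)
```

    In this sketch only two of the four nodes report back, so the query is answered with fewer transmissions than collecting every raw reading; how the threshold and term shapes are chosen in practice is left open here.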