115 research outputs found

    Content-based dissemination of XML data

    Get PDF
    Ph.D. (Doctor of Philosophy)

    Music-Related Media-Contents Synchronization over the Web: the IEEE 1599 Initiative

    Get PDF
    IEEE 1599 is an international standard originally conceived for music, which aims at providing a comprehensive description of the media contents related to a music piece within a multi-layer and synchronized environment. A number of off-line and stand-alone software prototypes have been realized since its standardization in 2008. Recently, thanks to technological advances (e.g. the release of HTML5), the engine of the IEEE 1599 parser has been ported to the Web. Some non-trivial problems have been solved along the way, e.g. the management of multiple simultaneous media streams in a client-server architecture. After providing an overview of the IEEE 1599 standard, this article presents a survey of recent initiatives regarding audio-driven synchronization over the Web.
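    To make the time-to-position mapping at the heart of audio-driven synchronization concrete, here is a minimal sketch: given anchor points pairing audio timestamps with score positions, a binary search recovers the position active at any playback time. The class and method names and the anchor format are illustrative assumptions, not the IEEE 1599 API.

        import bisect

        class SyncMap:
            """Maps a master audio clock to logical score positions via
            (time_seconds, measure_index) anchor points, sorted by time."""

            def __init__(self, anchors):
                self.times = [t for t, _ in anchors]
                self.measures = [m for _, m in anchors]

            def position_at(self, t):
                """Return the score position active at playback time t."""
                i = bisect.bisect_right(self.times, t) - 1
                return self.measures[max(i, 0)]

        # Example: at 7.2 s the anchor at 6.8 s (measure 3) is the active one.
        sync = SyncMap([(0.0, 1), (2.5, 2), (6.8, 3), (10.1, 4)])
        print(sync.position_at(7.2))  # -> 3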

    Continuously Providing Approximate Results under Limited Resources: Load Shedding and Spilling in XML Streams

    Get PDF
    Because of the high volume and unpredictable arrival rates, stream processing systems may not always be able to keep up with the input data streams, resulting in buffer overflow and uncontrolled loss of data. To continuously supply online results, two alternative solutions to this problem of unpredictable failures of such overloaded systems can be identified. One technique, called load shedding, drops some fraction of the data from the input stream to reduce the memory and CPU requirements of the workload. However, dropping portions of the input data means that the accuracy of the output is reduced, since some data is lost. To produce eventually complete results, the second technique, called data spilling, temporarily pushes some fraction of the data to persistent storage when the processing speed cannot keep up with the arrival rate. The processing of the disk-resident data is then postponed until a later time when system resources become available. This dissertation explores these load reduction technologies in the context of XML stream systems. Load shedding in the specific context of XML streams poses several unique opportunities and challenges. Since XML data is hierarchical, subelements extracted from different positions of the XML tree structure may vary in their importance. Further, dropping different subelements may yield different savings in storage and computation. Hence, unlike prior work in the literature that drops data completely or not at all, in this dissertation we introduce the notion of structure-oriented load shedding, meaning that selected XML subelements are shed from the possibly complex XML objects in the XML stream. First, we develop a preference model that enables users to specify the relative importance of preserving different subelements within the XML result structure. This transforms shedding into the problem of rewriting the user query into shed queries that return approximate answers, with their utility measured by the user preference model. Our optimizer finds the appropriate shed queries to maximize the output utility driven by our structure-based preference model under the limitation of available computation resources. The experimental results demonstrate that our proposed XML-specific shedding solution consistently achieves higher-utility results compared to existing relational shedding techniques. Second, we introduce structure-based spilling, a spilling technique customized for XML streams that considers spilling partial substructures of possibly complex XML elements. Several new challenges raised by structure-based spilling are addressed. When a path is spilled, multiple other paths may be affected. We categorize the types of side effects that spilling has on the query. How to execute the reduced query to produce the correct runtime output is also studied. Three optimization strategies are developed to select the reduced query that maximizes the output quality. We also examine the clean-up stage, guaranteeing that an entire result set is eventually generated by producing supplementary results to complement the partial results output earlier. The experimental study demonstrates that our proposed solutions consistently achieve higher-quality results compared to state-of-the-art techniques. Third, we design an integrated framework that combines both shedding and spilling policies into one comprehensive methodology.
    Decisions on whether to shed or spill data may be affected by application needs and data arrival patterns. For some input data, it may be worth flushing it to disk if a delayed output of its result will still be valuable, while other data is best dropped directly from the system, given that a delayed delivery of its results would no longer be meaningful to the application. We therefore need technologies capable of deploying both shedding and spilling within one integrated strategy, able to make the most appropriate decision for each specific circumstance. We propose a novel, flexible framework for structure-based shed and spill approaches, applicable in any XML stream system. We propose a solution space that represents all shed and spill candidates. An age-based quality model is proposed for evaluating the output quality of different reduced-query and supplementary-query pairs. We also propose a family of four optimization strategies: OptF, OptSmart, HiX and Fex. OptF and OptSmart are both guaranteed to identify an optimal pair of reduced and supplementary queries, with OptSmart exhibiting significantly less overhead than OptF. HiX and Fex use heuristic-based approaches that are much more efficient than OptF and OptSmart.
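    As a minimal sketch of the structure-oriented shedding trade-off described above, the following greedy heuristic keeps the subelements with the best utility-per-cost ratio until the resource budget is spent. The function name, the path encoding, and the (utility, cost) model are illustrative assumptions; the dissertation's optimizer is considerably more elaborate.

        def shed_plan(subelements, budget):
            """Greedy structure-oriented shedding sketch: rank subelement
            paths by utility per unit cost (per a user preference model)
            and keep them while the processing budget allows.

            subelements: dict mapping an XML path to a (utility, cost) pair.
            budget:      available computation/memory units.
            """
            ranked = sorted(subelements.items(),
                            key=lambda kv: kv[1][0] / kv[1][1], reverse=True)
            kept, shed, spent = [], [], 0
            for path, (utility, cost) in ranked:
                if spent + cost <= budget:
                    kept.append(path)
                    spent += cost
                else:
                    shed.append(path)  # drop, or spill to disk for later clean-up
            return kept, shed

        kept, shed = shed_plan({'/order/id': (10, 1),
                                '/order/item': (8, 4),
                                '/order/item/note': (2, 5)}, budget=5)
        print(kept, shed)  # ['/order/id', '/order/item'] ['/order/item/note']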

    Adaptive Caching of Distributed Components

    Get PDF
    Locality of reference is an important property of distributed applications. Caching is typically employed during the development of such applications to exploit this property by locally storing queried data: subsequent accesses can be accelerated by serving their results immediately from the local store. Current middleware architectures, however, hardly support this non-functional aspect. The thesis at hand thus tries to outsource caching as a separate, configurable middleware service. Integration into the software development lifecycle provides for early capturing, modeling, and later reuse of caching-related metadata. At runtime, the implemented system can adapt to changing access characteristics with respect to data cacheability, thus healing misconfigurations and optimizing itself to an appropriate configuration. Speculative prefetching of data likely to be queried in the immediate future complements the presented approach.
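    To illustrate the runtime adaptation, here is a toy sketch (assumed names and policy, not the thesis's implementation) of a cache that stops caching keys whose observed invalidation rate outweighs their hit rate:

        class AdaptiveCache:
            """Caches results of remote queries; a key whose hit ratio falls
            below a threshold is treated as non-cacheable from then on."""

            def __init__(self, min_hit_ratio=0.5):
                self.store = {}           # key -> cached value
                self.hits = {}            # key -> served-from-cache count
                self.invalidations = {}   # key -> remote-update count
                self.min_hit_ratio = min_hit_ratio

            def cacheable(self, key):
                h = self.hits.get(key, 0)
                i = self.invalidations.get(key, 0)
                return i == 0 or h / (h + i) >= self.min_hit_ratio

            def get(self, key, fetch_remote):
                if key in self.store:
                    self.hits[key] = self.hits.get(key, 0) + 1
                    return self.store[key]
                value = fetch_remote(key)  # call out to the remote component
                if self.cacheable(key):
                    self.store[key] = value
                return value

            def invalidate(self, key):
                """Called when the remote component reports a data change."""
                self.invalidations[key] = self.invalidations.get(key, 0) + 1
                self.store.pop(key, None)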

    Multifaceted Optimization of Energy Efficiency for Stationary WSN Applications

    Get PDF
    Stationary Wireless Sensor Networks (S-WSNs) consist of battery-powered and resource-constrained sensor nodes distributed at fixed locations to cooperatively monitor the environment or an object and provide persistent data acquisition. These systems are deployed in many applications, ranging from disaster warning systems for instant event detection to structural health monitoring for effective maintenance. Despite the diversity of S-WSN applications, one common requirement is to achieve a long lifespan for a higher value-to-cost ratio. However, the variety of WSN deployment environments and use cases implies that there is no silver bullet to solve the energy issue completely. This thesis is a summary of six publications. Our contributions include four energy optimization techniques on three layers for S-WSN applications. From the bottom up, we designed an ultra-low power smart trigger to integrate environment perceptibility into the hardware. On the network layer, we propose a reliable clustering protocol and a cluster-based data aggregation scheme; this scheme offers topology optimization together with in-network data processing. On the application layer, we extend the industry-standard protocol XMPP to incorporate WSN characteristics for unified information dissemination. Our protocol extensions facilitate WSN application development by adopting IMPS on the Internet. In addition, we conducted a performance analysis of HIP Diet Exchange (DEX), a lightweight security protocol for WSNs that is being standardized by the IETF, and suggested a few improvements and potential applications for it. In the process of improving energy efficiency, we pursue modular and generic designs for better system integration and scalability. Our hardware design can be extended with new onboard transducers. The clustering protocol and data aggregation scheme provide a general, self-adaptive method to increase information throughput per unit of energy while tolerating network dynamics. The unified XMPP extensions aim to support seamless information flow for the Web of Things. The results presented in this thesis demonstrate the importance of a multifaceted optimization strategy in WSN development: an optimal WSN system should take multiple factors into account to boost energy efficiency in a holistic way.
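    As an illustration of the cluster-based data aggregation idea (a sketch under assumed names, not the thesis's actual scheme), a cluster head can condense its members' raw samples into a single summary packet, trading many radio transmissions, the dominant energy cost, for a little local computation:

        def aggregate_at_cluster_head(readings):
            """Condense (node_id, value) samples from cluster members into
            one summary packet for the sink instead of forwarding them all."""
            values = [v for _, v in readings]
            return {
                'count': len(values),
                'min': min(values),
                'max': max(values),
                'mean': sum(values) / len(values),
            }

        # One packet to the sink instead of four:
        print(aggregate_at_cluster_head([(1, 21.3), (2, 21.9), (3, 22.4), (4, 21.1)]))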

    The LOFAR Transients Pipeline

    Get PDF
    Current and future astronomical survey facilities provide a remarkably rich opportunity for transient astronomy, combining unprecedented fields of view with high sensitivity and the ability to access previously unexplored wavelength regimes. This is particularly true of LOFAR, a recently commissioned, low-frequency radio interferometer, based in the Netherlands and with stations across Europe. The identification of and response to transients is one of LOFAR's key science goals. However, the large data volumes which LOFAR produces, combined with the scientific requirement for rapid response, make automation essential. To support this, we have developed the LOFAR Transients Pipeline, or TraP. The TraP ingests multi-frequency image data from LOFAR or other instruments and searches it for transients and variables, providing automatic alerts of significant detections and populating a lightcurve database for further analysis by astronomers. Here, we discuss the scientific goals of the TraP and how it has been designed to meet them. We describe its implementation, including both the algorithms adopted to maximize performance and the development methodology used to ensure it is robust and reliable, particularly in the presence of artefacts typical of radio astronomy imaging. Finally, we report on a series of tests of the pipeline carried out using simulated LOFAR observations with a known population of transients.
    Comment: 30 pages, 11 figures; Accepted for publication in Astronomy & Computing; Code at https://github.com/transientskp/tk
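    To give a flavor of the variability search, the sketch below computes two lightcurve statistics of the kind the TraP uses to flag candidates: a weighted reduced chi-squared against a constant-flux model and a modulation index. The exact definitions and names in the pipeline may differ; treat this as an illustrative sketch.

        from math import sqrt

        def variability_stats(fluxes, errors):
            """fluxes: per-epoch flux measurements; errors: their 1-sigma
            uncertainties. Returns (eta, v) variability measures."""
            n = len(fluxes)
            weights = [1.0 / e**2 for e in errors]
            wmean = sum(w * f for w, f in zip(weights, fluxes)) / sum(weights)
            # Reduced weighted chi-squared against a constant-flux model:
            eta = sum(w * (f - wmean)**2 for w, f in zip(weights, fluxes)) / (n - 1)
            # Modulation index: fractional scatter about the plain mean.
            mean = sum(fluxes) / n
            v = sqrt(sum((f - mean)**2 for f in fluxes) / (n - 1)) / mean
            return eta, v

        eta, v = variability_stats([10.2, 9.8, 15.1, 10.0], [0.5, 0.5, 0.5, 0.5])
        print(eta, v)  # large values flag a variable-source candidate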

    Mapping Scholarly Communication Infrastructure: A Bibliographic Scan of Digital Scholarly Communication Infrastructure

    Get PDF
    This bibliography scan covers a lot of ground. In it, I have attempted to capture relevant recent literature across the whole of the digital scholarly communications infrastructure. I have used that literature to identify significant projects and then document them with descriptions and basic information. Structurally, this review has three parts. In the first, I begin with a diagram showing the way the projects reviewed fit into the research workflow; then I cover a number of topics and functional areas related to digital scholarly communication. I make no attempt to be comprehensive, especially regarding the technical literature; rather, I have tried to identify major articles and reports, particularly those addressing the library community. The second part of this review is a list of projects or programs arranged by broad functional categories. The third part lists individual projects and the organizations—both commercial and nonprofit—that support them. I have identified 206 projects. Of these, 139 are nonprofit and 67 are commercial. There are 17 organizations that support multiple projects, and six of these—Artefactual Systems, Atypon/Wiley, Clarivate Analytics, Digital Science, Elsevier, and MDPI—are commercial. The remaining 11—Center for Open Science, Collaborative Knowledge Foundation (Coko), LYRASIS/DuraSpace, Educopia Institute, Internet Archive, JISC, OCLC, OpenAIRE, Open Access Button, Our Research (formerly Impactstory), and the Public Knowledge Project—are nonprofit.
    Andrew W. Mellon Foundation