272 research outputs found

    Performance assessment of real-time data management on wireless sensor networks

    Get PDF
    Technological advances in recent years have allowed the maturity of Wireless Sensor Networks (WSNs), which aim at performing environmental monitoring and data collection. This sort of network is composed of hundreds, thousands or even millions of tiny smart computers known as wireless sensor nodes, which are typically battery powered and equipped with sensors, a radio transceiver, a Central Processing Unit (CPU) and some memory. However, due to their small size and low-cost requirements, sensor node resources such as processing power, storage and especially energy are very limited. Once the sensors perform their measurements of the environment, the problem of storing and querying data arises. In fact, the sensors have restricted storage capacity, and the ongoing interaction between sensors and environment produces huge amounts of data. Techniques for data storage and querying in WSNs can be based on either external or local storage. External storage, called the warehousing approach, is a centralized scheme in which the data gathered by the sensors are periodically sent to a central database server where user queries are processed. Local storage, on the other hand, called the distributed approach, exploits the computational capabilities of the sensors, which act as local databases. Data can then be stored both in a central database server and in the devices themselves, enabling queries over both. WSNs are used in a wide variety of applications, which may perform certain operations on collected sensor data. For certain applications, such as real-time applications, the sensor data must closely reflect the current state of the targeted environment. However, the environment changes constantly and the data is collected at discrete moments in time. As such, the collected data has a temporal validity and, as time advances, becomes less accurate, until it no longer reflects the state of the environment. Applications such as industrial automation, aviation and sensor networks must therefore query and analyze the data within a bounded time in order to make decisions and react efficiently. In this context, the design of efficient real-time data management solutions is necessary to deal with both time constraints and energy consumption. This thesis studies real-time data management techniques for WSNs. In particular, it focuses on the challenges of handling real-time data storage and querying for WSNs and on efficient real-time data management solutions for WSNs. First, the main specifications of real-time data management are identified and the real-time data management solutions for WSNs available in the literature are presented. Second, in order to provide an energy-efficient real-time data management solution, the techniques used to manage data and queries in WSNs based on the distributed paradigm are studied in depth. In fact, many research works argue that the distributed approach is the most energy-efficient way of managing data and queries in WSNs, instead of warehousing. In addition, this approach can provide quasi-real-time query processing, because the most current data is retrieved from the network. Third, based on these two studies and considering the complexity of developing, testing and debugging this kind of complex system, a model for a simulation framework for real-time database management on WSNs using the distributed approach, together with its implementation, is proposed.
    This will help to explore various real-time database techniques on WSNs before deployment, saving money and time. Moreover, the proposed model can be improved by adding the simulation of protocols, or part of this simulator can be placed on another available simulator. To validate the model, a case study considering real-time constraints as well as energy constraints is discussed. Fourth, a new architecture that combines statistical modeling techniques with the distributed approach, together with a query processing algorithm to optimize real-time user query processing, is proposed. This combination allows a query processing algorithm based on admission control that uses the error tolerance and the probabilistic confidence interval as admission parameters. Experiments on real-world as well as synthetic data sets demonstrate that the proposed solution optimizes real-time query processing, saving energy while meeting low-latency requirements.
    Fundação para a Ciência e Tecnologia
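    To make the admission-control idea concrete, the following is a minimal sketch (ours, not the thesis implementation, with illustrative names and thresholds): a query carries an error tolerance, and a model-based answer is admitted only if its probabilistic confidence interval fits within that tolerance; otherwise fresh readings are pulled from the network.

```python
from dataclasses import dataclass

# Hypothetical sketch (not the thesis code): admission control for real-time
# queries over a WSN. A query is answered from a local statistical model when
# the model's confidence interval fits the user's error tolerance; otherwise
# energy is spent retrieving fresh data from the sensor nodes.

Z = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}  # normal-approximation z-scores

@dataclass
class Model:          # per-sensor statistical summary maintained at the sink
    mean: float
    std: float

@dataclass
class Query:
    error_tolerance: float    # max acceptable deviation from the true value
    confidence: float = 0.95

def admit(model: Model, q: Query) -> bool:
    """Admit the query if the CI half-width is within the tolerance."""
    return Z[q.confidence] * model.std <= q.error_tolerance

def process_query(q: Query, model: Model, fetch_from_network) -> float:
    if admit(model, q):
        return model.mean           # local answer: no radio traffic, low latency
    return fetch_from_network()     # otherwise retrieve fresh readings

# Example: a tight tolerance forces an in-network read.
print(process_query(Query(0.5), Model(21.3, 0.2), lambda: 21.4))  # model answer
print(process_query(Query(0.1), Model(21.3, 0.2), lambda: 21.4))  # network read
```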

    Cost- and workload-driven data management in the cloud

    Get PDF
    This thesis deals with the challenge of finding the right balance between consistency, availability, latency and costs, captured by the CAP/PACELC trade-offs, in the context of distributed data management in the Cloud. At the core of this work, cost- and workload-driven data management protocols, called CCQ protocols, are developed. First, this includes the development of C3, an adaptive consistency protocol that is able to adjust consistency at runtime by considering consistency and inconsistency costs. Second, the development of Cumulus, an adaptive data partitioning protocol that can adapt partitions by considering the application workload, so that expensive distributed transactions are minimized or avoided. And third, the development of QuAD, a quorum-based replication protocol that constructs quorums so that, given a set of constraints, the best possible performance is achieved. The behavior of each CCQ protocol is steered by a cost model, which aims at reducing the costs and overhead of providing the desired data management guarantees. The CCQ protocols are able to continuously assess their behavior and, if necessary, to adapt it at runtime based on the application workload and the cost model. This property is crucial for applications deployed in the Cloud, as they are characterized by a highly dynamic workload and high scalability and availability demands. The dynamic adaptation of behavior at runtime does not come for free, and may generate considerable overhead that can outweigh the gain of adaptation. The CCQ cost models therefore incorporate a control mechanism that avoids expensive and unnecessary adaptations which provide no benefit to applications. Adaptation is a distributed activity that requires coordination between the sites of a distributed database system. The CCQ protocols implement safe online adaptation approaches, which exploit the properties of 2PC and 2PL to ensure that all sites behave in accordance with the cost model, even in the presence of arbitrary failures. It is crucial to guarantee a globally consistent view of the behavior, as otherwise the effects of the cost models are nullified. The presented protocols are implemented as part of a prototypical database system. Their modular architecture allows for a seamless extension of the optimization capabilities at any level of their implementation. Finally, the protocols are quantitatively evaluated in a series of experiments executed in a real Cloud environment. The results show their feasibility and ability to reduce application costs, and to dynamically adjust their behavior at runtime without violating correctness
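    As an illustration of the cost-driven adaptation idea (our generic sketch in the spirit of C3, not the thesis code), a protocol can compare, per time window, the cost of running strongly consistent operations against the expected penalty of serving stale reads, and pick the cheaper mode:

```python
# Illustrative sketch of a cost-driven consistency switch (all numbers are
# assumptions, not measurements from the thesis). Per time window, compare
# the overhead of strong consistency against the expected inconsistency
# penalty of a weaker mode, and choose the cheaper of the two.

def choose_consistency(ops_per_window: int,
                       strong_overhead_per_op: float,  # extra cost per strongly consistent op
                       stale_read_rate: float,         # fraction of reads stale under weak mode
                       penalty_per_stale_read: float) -> str:
    consistency_cost = ops_per_window * strong_overhead_per_op
    inconsistency_cost = ops_per_window * stale_read_rate * penalty_per_stale_read
    return "strong" if consistency_cost < inconsistency_cost else "weak"

# Example: cheap strong ops and a high stale-read penalty favor strong mode.
print(choose_consistency(10_000, 0.0001, 0.02, 0.5))   # -> strong
print(choose_consistency(10_000, 0.0100, 0.001, 0.5))  # -> weak
```

    A control mechanism like the one described above would additionally suppress a mode switch when the expected saving is smaller than the coordination overhead of switching.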

    Effective web crawlers

    Get PDF
    Web crawlers are the component of a search engine that must traverse the Web, gathering documents in a local repository for indexing by a search engine so that they can be ranked by their relevance to user queries. Whenever data is replicated in an autonomously updated environment, there are issues with maintaining up-to-date copies of documents. When documents have been retrieved by a crawler and are subsequently altered on the Web, the result is an inconsistency in user search results. While the impact depends on the type and volume of change, many existing algorithms do not take the degree of change into consideration, instead using simple measures that consider any change as significant. Furthermore, many crawler evaluation metrics do not consider index freshness or the impact that crawling algorithms have on user results. Most of the existing work makes assumptions about the change rate of documents on the Web, or relies on the availability of a long history of change. Our work investigates approaches to improving index consistency: detecting meaningful change, measuring the impact of a crawl on collection freshness from a user perspective, developing a framework for evaluating crawler performance, determining the effectiveness of stateless crawl ordering schemes, and proposing and evaluating the effectiveness of a dynamic crawl approach. Our work is concerned specifically with cases where there are few or no past change statistics with which to make predictions. Our work analyses different measures of change and introduces a novel approach to measuring the impact of recrawl schemes on search engine users. Our schemes detect important changes that affect user results. Other well-known and widely used schemes have to retrieve around twice as much data to achieve the same effectiveness as our schemes. Furthermore, while many studies have assumed that the Web changes according to a model, our experimental results are based on real web documents. We analyse various stateless crawl ordering schemes that have no past change statistics with which to predict which documents will change, none of which, to our knowledge, has been tested to determine effectiveness in crawling changed documents. We empirically show that the effectiveness of these schemes depends on the topology and dynamics of the domain crawled, and that no one static crawl ordering scheme can effectively maintain freshness, motivating our work on dynamic approaches. We present our novel approach to maintaining freshness, which uses the anchor text linking documents to determine the likelihood of a document changing, based on statistics gathered during the current crawl. We show that this scheme is highly effective when combined with existing stateless schemes. When we combine our scheme with PageRank, our approach allows the crawler to improve both the freshness and the quality of a collection. Our scheme improves freshness regardless of which stateless scheme it is used in conjunction with, since it uses both positive and negative reinforcement to determine which document to retrieve. Finally, we present the design and implementation of Lara, our own distributed crawler, which we used to develop our testbed
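    A hedged sketch of this dynamic ordering idea (ours, not Lara's actual code): anchor-text terms vote on how likely a target document is to have changed, and the votes are reinforced positively when a fetched page turns out to be changed and negatively when it is unchanged.

```python
from collections import defaultdict

# Illustrative sketch of anchor-text-driven crawl ordering with positive and
# negative reinforcement, using statistics gathered during the current crawl.
# Names and the learning rate are assumptions, not values from the thesis.

term_weight = defaultdict(lambda: 1.0)  # per-anchor-term change-likelihood weight

def change_score(anchor_terms: list[str]) -> float:
    """Score a URL by the average weight of the anchor text pointing at it."""
    if not anchor_terms:
        return 1.0
    return sum(term_weight[t] for t in anchor_terms) / len(anchor_terms)

def reinforce(anchor_terms: list[str], changed: bool, rate: float = 0.1) -> None:
    """Positive reinforcement for changed pages, negative for unchanged ones."""
    for t in anchor_terms:
        term_weight[t] *= (1 + rate) if changed else (1 - rate)

# During a crawl: fetch the frontier URL with the highest change_score,
# compare it against the stored copy, then reinforce its anchor terms.
reinforce(["breaking", "news"], changed=True)
reinforce(["faq", "archive"], changed=False)
print(change_score(["breaking", "news"]), change_score(["faq", "archive"]))
```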

    A Survey on Consensus Mechanisms and Mining Strategy Management in Blockchain Networks

    Full text link
    The past decade has witnessed a rapid evolution in blockchain technologies, which has attracted tremendous interest from both research communities and industry. Blockchain networks originated in the Internet financial sector as decentralized, immutable ledger systems for ordering transactional data. Nowadays, they are envisioned as a powerful backbone/framework for decentralized data processing and data-driven self-organization in flat, open-access networks. In particular, the appealing characteristics of decentralization, immutability and self-organization are primarily owing to the unique decentralized consensus mechanisms introduced by blockchain networks. This survey is motivated by the lack of a comprehensive literature review on the development of decentralized consensus mechanisms in blockchain networks. In this paper, we provide a systematic vision of the organization of blockchain networks. Emphasizing the unique characteristics of decentralized consensus in blockchain networks, our in-depth review of state-of-the-art consensus protocols covers both the perspective of distributed consensus system design and the perspective of incentive mechanism design. From a game-theoretic point of view, we also provide a thorough review of the strategies adopted for self-organization by the individual nodes in blockchain backbone networks. We then provide a comprehensive survey of the emerging applications of blockchain networks in a broad area of telecommunications, highlighting our special interest in how the consensus mechanisms impact these applications. Finally, we discuss several open issues in the protocol design for blockchain consensus and the related potential research directions
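    For readers unfamiliar with the best-known family of consensus protocols the survey covers, the following is a deliberately simplified sketch of the proof-of-work lottery in Nakamoto-style consensus (real networks hash binary block headers and adjust difficulty dynamically; this toy version hashes a string):

```python
import hashlib

# Simplified illustration of proof-of-work: repeatedly hash the block header
# with a candidate nonce until the double-SHA256 digest falls below a target
# derived from the difficulty. Winning grants the right to append a block.

def mine(header: str, difficulty_bits: int) -> int:
    """Find a nonce whose double-SHA256 hash is below the target."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(
            hashlib.sha256(f"{header}{nonce}".encode()).digest()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

# A low difficulty keeps this toy example fast (~2**16 hashes on average).
print("found nonce:", mine("prev_hash|merkle_root|timestamp", difficulty_bits=16))
```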

    Weiterentwicklung analytischer Datenbanksysteme

    Get PDF
    This thesis contributes to the state of the art in analytical database systems. First, we identify and explore extensions to better support analytics on event streams. Second, we propose a novel polygon index to enable efficient geospatial data processing in main memory. Third, we contribute a new deep learning approach to cardinality estimation, which is the core problem in cost-based query optimization.
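    As a hedged illustration of the learned cardinality estimation direction (our generic sketch, not the thesis model), a small network can map a featurized query to a log-scale cardinality, since cardinalities span many orders of magnitude:

```python
import torch
import torch.nn as nn

# Generic sketch of learned cardinality estimation: a small MLP maps an
# encoded query (tables, joins, predicates) to log(cardinality). The feature
# encoding, network size, and loss are illustrative assumptions.

class CardinalityEstimator(nn.Module):
    def __init__(self, num_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, query_features: torch.Tensor) -> torch.Tensor:
        return self.net(query_features)  # predicts log(cardinality)

model = CardinalityEstimator(num_features=32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # MSE on log-cardinalities as a simple proxy loss

features = torch.randn(128, 32)                 # stand-in for encoded queries
true_log_card = torch.randn(128, 1).abs() * 10  # stand-in training labels

optimizer.zero_grad()
loss = loss_fn(model(features), true_log_card)
loss.backward()
optimizer.step()
print("training loss:", loss.item())
```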

    Smart Wireless Sensor Networks

    Get PDF
    The recent development of communication and sensor technology has resulted in the growth of an attractive and challenging new area: wireless sensor networks (WSNs). A wireless sensor network, which consists of a large number of sensor nodes, is deployed in environmental fields to serve various applications. Facilitated with the ability of wireless communication and intelligent computation, these nodes become smart sensors which not only perceive ambient physical parameters but are also able to process information, cooperate with each other and self-organize into a network. These new features allow the sensor nodes, as well as the network as a whole, to operate more efficiently in terms of both data acquisition and energy consumption. The special purposes of these applications require the design and operation of WSNs to differ from conventional networks such as the Internet. The network design must take into account the objectives of specific applications, and the nature of the deployment environment must be considered. The limited resources of sensor nodes, such as memory, computational ability, communication bandwidth and energy, are the challenges in network design. A smart wireless sensor network must be able to deal with these constraints as well as to guarantee the connectivity, coverage, reliability and security of the network's operation for a maximized lifetime. This book discusses various aspects of designing such smart wireless sensor networks. Main topics include: design methodologies, network protocols and algorithms, quality of service management, coverage optimization, time synchronization and security techniques for sensor networks

    Cooperative mechanisms for information dissemination and retrieval in networks with autonomous nodes

    Get PDF
    This thesis contributes to the literature by proposing and modeling novel algorithms and schemes that allow the tasks of information dissemination and retrieval – and more generally of content management – to be performed more efficiently in a modern networking environment. Apart from information dissemination and retrieval, other aspects of content management we examine are content storage and classification. The most important challenge that preoccupies many of the proposed schemes is the need to manage the autonomy of nodes while preserving the distributed, as well as the open, nature of the system. In designing distributed mechanisms in networks with autonomous nodes, an important challenge is also to develop incentives for nodes to cooperate while performing communication tasks. A novel characteristic of most of the proposed schemes is the exploitation of the social characteristics of nodes, focusing on how common interests of nodes can be used to improve communication efficiency. In order to evaluate the performance of the proposed algorithms and schemes, we mainly develop mathematical stochastic models and obtain numerical results. Where it is deemed necessary, we provide simulation results that verify the accuracy of these models. Real network traces are used where we want to further support the rationale for proposing a certain scheme. A key tool for modeling and analyzing cooperation problems in networks with autonomous nodes is game theory, which is used in parts of this thesis to help identify the feasibility of sustaining cooperation between nodes in the network. By exploiting the social characteristics of nodes, we also enter the field of social network analysis, and use related metrics and techniques
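    To make the game-theoretic angle concrete (a textbook sketch of ours, not a model from the thesis), consider when cooperation among autonomous forwarding nodes is sustainable in an infinitely repeated game under grim-trigger strategies: a node forgoes the one-shot gain of defection when the discounted future value of cooperation outweighs it.

```python
# Textbook sketch (not a model from the thesis): with payoff R for mutual
# cooperation, T for unilateral defection (T > R) and P for mutual defection
# (P < R), grim-trigger cooperation in an infinitely repeated game is an
# equilibrium when the discount factor satisfies delta >= (T - R) / (T - P),
# obtained by comparing R / (1 - delta) against T + delta * P / (1 - delta).

def cooperation_sustainable(T: float, R: float, P: float, delta: float) -> bool:
    """Check the grim-trigger condition for an infinitely repeated game."""
    return delta >= (T - R) / (T - P)

# A node that values future interactions enough (high delta) keeps forwarding.
print(cooperation_sustainable(T=5, R=3, P=1, delta=0.6))  # True:  0.6 >= 0.5
print(cooperation_sustainable(T=5, R=3, P=1, delta=0.4))  # False: 0.4 <  0.5
```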

    Adaptive Caching of Distributed Components

    Get PDF
    Locality of reference is an important property of distributed applications. Caching is typically employed during the development of such applications to exploit this property by locally storing queried data: subsequent accesses can be accelerated by serving their results immediately from the local store. Current middleware architectures, however, hardly support this non-functional aspect. The thesis at hand thus tries to outsource caching as a separate, configurable middleware service. Integration into the software development lifecycle provides for early capturing, modeling, and later reuse of caching-related metadata. At runtime, the implemented system can adapt to changing access characteristics with respect to data cacheability, thus healing misconfigurations and optimizing itself toward an appropriate configuration. Speculative prefetching of data likely to be queried in the immediate future complements the presented approach
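    A minimal sketch of such runtime cacheability adaptation (ours, not the thesis middleware; thresholds and names are illustrative): the cache tracks hits and invalidations per key and stops caching items whose data change too often for cached copies to pay off.

```python
from collections import defaultdict

# Illustrative sketch of adaptive cacheability: items whose observed hit
# ratio falls below a threshold are treated as non-cacheable, so frequently
# invalidated data no longer pollutes the local store.

class AdaptiveCache:
    def __init__(self, min_hit_ratio: float = 0.5):
        self.store = {}
        self.hits = defaultdict(int)
        self.invalidations = defaultdict(int)
        self.min_hit_ratio = min_hit_ratio

    def cacheable(self, key) -> bool:
        total = self.hits[key] + self.invalidations[key]
        # Cache optimistically until enough statistics have accumulated.
        return total < 10 or self.hits[key] / total >= self.min_hit_ratio

    def get(self, key, fetch_remote):
        if key in self.store:
            self.hits[key] += 1
            return self.store[key]       # served from the local store
        value = fetch_remote(key)        # remote invocation
        if self.cacheable(key):          # adapt: only cache stable data
            self.store[key] = value
        return value

    def invalidate(self, key):
        self.invalidations[key] += 1
        self.store.pop(key, None)

cache = AdaptiveCache()
print(cache.get("component:42", lambda k: f"state of {k}"))  # remote fetch
print(cache.get("component:42", lambda k: f"state of {k}"))  # cache hit
```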