1,873 research outputs found

    Contribution to the publish/subscribe communication model for the development of ubiquitous services in wireless sensor networks

    Advances in wireless technologies and the rapid development of integrated electronics have made wireless sensor networks (WSNs) one of the key technologies of the Internet of Things (IoT). Because these networks can measure physical phenomena in their environment and process and communicate that information wirelessly, they have excelled in applications that respond or adapt to a user's context, as in ubiquitous environments such as smart cities, industrial automation and e-health. In addition, the IoT has made it possible for intelligent objects or devices to exchange status, condition and capacity information in order to interact with each other, in the same way that people have done through presence-based systems. These systems require real-time information about an event in order to react in a timely manner to the conditions or context of the user or device. Such applications pose new challenges for managing WSN resources, since these networks operate in environments that are generally prone to packet loss and consist of small nodes with limited memory, processing, bandwidth and power. The main objective of this thesis is to develop several mechanisms that adapt Publish/Subscribe protocols to the characteristics and limitations of WSNs for the provision of ubiquitous services in the context of the IoT. In addition, QoS support is provided through mechanisms that offer reliability and timeliness in packet delivery, and data aggregation techniques are applied to use power and WSN bandwidth efficiently.
Our research proposes an architecture that provides a presence service for WSNs based on a distributed Publish/Subscribe model, focused on mechanisms such as data aggregation and on-demand publication of messages to achieve energy and bandwidth efficiency. All these mechanisms have been applied in the design of a system called PASH, aimed at home control based on the concept of Ambient Assisted Living (AAL). The reliability provided by Publish/Subscribe protocols in WSNs is of great importance in the design of applications that must receive a message in order to react on time or in real time to an event. Initially, we focused our study on increasing the packet delivery ratio (PDR) at the destination node by improving reliability mechanisms. We evaluated the reliability mechanism of the MQTT-SN protocol and several mechanisms proposed for the CoAP protocol. From this evaluation, we propose a new and simple adaptive retransmission mechanism that responds to packet loss in the most appropriate way. Finally, we consider that applications such as e-health and critical infrastructure control and monitoring, among others, must meet different QoS requirements, such as reliability and timeliness, for each type of message received. In addition, data aggregation techniques play an important role in WSNs in reducing power consumption and bandwidth. In this thesis, we propose a mechanism that provides the application with three different levels of QoS: we improve our previous retransmission mechanism for reliability, we include data aggregation in our reliability mechanism and we provide a timeliness mechanism for packet delivery. Postprint (published version)
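The abstract above does not describe the proposed adaptive retransmission mechanism itself. As a rough illustration of what adaptive retransmission can look like in general, the sketch below uses the smoothed round-trip-time estimator known from TCP (RFC 6298) together with exponential backoff on timeout. It is not the thesis's mechanism; the class name, method names and default values are assumptions made for this example.

```python
class AdaptiveRetransmitTimer:
    """Illustrative RTT-based retransmission timer (RFC 6298 style).

    Hypothetical helper: not taken from the thesis, MQTT-SN or CoAP.
    All times are in seconds.
    """

    def __init__(self, initial_rto=3.0, min_rto=0.1, max_rto=60.0):
        self.srtt = None      # smoothed round-trip time
        self.rttvar = None    # round-trip time variation
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.rto = initial_rto

    def on_ack(self, rtt):
        """Update the timeout from a measured round-trip time."""
        if self.srtt is None:
            # First sample: initialise the estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Update the variation first, using the previous SRTT.
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt
        rto = self.srtt + 4 * self.rttvar
        self.rto = max(self.min_rto, min(self.max_rto, rto))

    def on_timeout(self):
        """Back off exponentially when no acknowledgement arrived in time."""
        self.rto = min(self.rto * 2, self.max_rto)
```

A sender would wait `timer.rto` seconds for an acknowledgement, call `on_ack` with the measured delay when one arrives, and call `on_timeout` before retransmitting otherwise, so the waiting time follows the observed link conditions instead of staying fixed.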

    Dynamic adaptation of interaction models for stateful web services

    Dissertation submitted for the degree of Master in Computer Science Engineering (Engenharia Informática). Wireless Sensor Networks (WSNs) are accepted as one of the fundamental technologies for current and future science in all domains, where WSNs formed from either static or mobile sensor devices allow low-cost, high-resolution sensing of the environment. This opens the possibility of developing new kinds of crucial applications or providing more accurate data to more traditional ones. Examples range from large-scale WSNs deployed on oceans contributing to weather prediction simulations, to large numbers of diverse sensor devices deployed over a geographical area at different heights from the ground to collect more accurate data for cyclic wildfire spread simulations, to networks of mobile phones contributing to urban traffic management via Participatory Sensing applications. To simplify data access, network parameterisation and WSN aggregation, WSNs have been integrated into Web environments, namely through high-level standard interfaces such as Web services. However, the typical interface access usually supports a restricted number of interaction models, and the available mechanisms for their run-time adaptation are still scarce. Nevertheless, applications demand richer and more flexible control over interface accesses; for example, such accesses may depend on contextual information and, consequently, may evolve over time. Additionally, Web services have become increasingly popular in recent years, and their usage has led to the need to aggregate and coordinate them and also to represent state between Web service invocations. Current standard composition languages for Web services (WS-BPEL, WSCI, BPML) deal with the traditional forms of service aggregation and coordination, while the WS-Resource Framework (WSRF) deals with accessing services pertaining to state concerns (relating to both executing applications and the runtime environment).
Underlying the notion of service coordination is the need to capture dependencies among services (through the workflow concept, for instance), to reuse common interaction models, e.g. those embodied in common behavioural patterns such as Client/Server, Publish/Subscribe and Stream, and to respond to dynamic events in the system (novel user requests, service failures, etc.). Dynamic adaptation, in particular, is a pressing requirement for current service-based systems because of the growing trend towards XaaS ("everything as a service"), which promises to reduce the costs of application development and infrastructure support, as is already apparent in the Cloud computing domain. Self-adaptive (or dynamic/adaptive) systems therefore present themselves as a solution to the above concerns. However, since they comprise a vast area, this thesis focuses only on self-adaptive software. Concretely, we propose a novel model for dynamic interactions, in particular with Stateful Web Services, i.e. services interfacing continued activities. The solution consists of a middleware prototype based on pattern abstractions, which can provide novel, richer interaction models and a few structured dynamic adaptation mechanisms, captured in the context of a "Session" abstraction. The middleware was implemented on top of a pre-existing framework supporting Web-enabled access to WSNs, and some evaluation scenarios were tested in this setting. This area was chosen as the application domain that contextualizes this work because it contributes to the development of increasingly important applications that need high-resolution, low-cost sensing of the environment. The result is a novel way to specify richer and more dynamic modes of accessing and acquiring data generated by WSNs. This work was partially funded by the Centro de Informática e Tecnologias da Informação (CITI) and by the Fundação para a Ciência e a Tecnologia (FCT / MCTES) in research projects

    Digital Preservation Services : State of the Art Analysis

    Research report funded by the DC-NET project. An overview of the state of the art in service provision for digital preservation and curation. Its focus is on the areas where gaps need to be bridged between e-Infrastructures and efficient, forward-looking digital preservation services. Based on a desktop study and a rapid analysis of some 190 currently available tools and services for digital preservation, the deliverable provides a high-level view of the range of instruments currently on offer to support various functions within a preservation system. European Commission, FP7. Peer-reviewed

    Analyzing audit trails in a distributed and hybrid intrusion detection platform

    Efforts have been made over the last decades to design and perfect Intrusion Detection Systems (IDS). In addition to the widespread use of Intrusion Prevention Systems (IPS) as perimeter defense devices in systems and networks, various IDS solutions are used together as elements of holistic approaches to cyber security incident detection and prevention, including Network Intrusion Detection Systems (NIDS) and Host Intrusion Detection Systems (HIDS). Nevertheless, specific IDS and IPS technologies face several effectiveness challenges in responding to the increasing scale and complexity of information systems and the sophistication of attacks. The use of isolated IDS components, focused on one-dimensional approaches, strongly limits a common analysis based on evidence correlation. Today, most organizations' cyber-security operations centers still rely on conventional SIEM (Security Information and Event Management) technology. However, SIEM platforms also have significant drawbacks in dealing with heterogeneous and specialized security event sources, lacking support for flexible and uniform multi-level analysis of security audit trails involving distributed and heterogeneous systems. In this thesis, we propose an auditing solution that leverages different intrusion detection components and combines them in a Distributed and Hybrid IDS (DHIDS) platform, taking advantage of their benefits while overcoming the effectiveness drawbacks of each one. In this approach, security events are detected by multiple probes forming a pervasive, heterogeneous and distributed monitoring environment spread over the network, integrating NIDS, HIDS and specialized Honeypot probing systems. Events from those heterogeneous sources are converted to a canonical representation format and then conveyed through a Publish-Subscribe middleware to a dedicated logging and auditing system, built on top of an elastic and scalable document-oriented storage system.
The aggregated events can then be queried and matched against suspicious attack signature patterns by means of a proposed declarative query language that provides event-correlation semantics.
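The pipeline described in this abstract (normalize heterogeneous events, convey them over publish/subscribe, then correlate them in an audit store) can be pictured with a small in-memory sketch. Everything here is an assumption made for illustration: the `Event` fields, the raw record layouts, the topic name and the `correlated_hosts` rule are hypothetical and do not reflect the thesis's canonical format, middleware or query language.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical canonical event shared by all probe types."""
    source: str       # "nids", "hids", "honeypot", ...
    host: str         # host the event refers to
    kind: str         # normalized event type
    timestamp: float  # seconds


class Broker:
    """Minimal in-process publish/subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self._subscribers[topic]:
            handler(event)


def normalize_nids(raw):
    # Assumed raw NIDS record layout, for illustration only.
    return Event("nids", raw["dst_ip"], raw["alert"], raw["ts"])


def normalize_hids(raw):
    # Assumed raw HIDS record layout, for illustration only.
    return Event("hids", raw["hostname"], raw["rule"], raw["time"])


def correlated_hosts(log, kinds, window):
    """Hosts on which every kind in `kinds` was seen and all matching
    events lie within `window` seconds of each other."""
    by_host = defaultdict(list)
    for event in log:
        if event.kind in kinds:
            by_host[event.host].append(event)
    hits = set()
    for host, events in by_host.items():
        times = [e.timestamp for e in events]
        if {e.kind for e in events} == set(kinds) and max(times) - min(times) <= window:
            hits.add(host)
    return hits


# The audit log subscribes to the topic on which probes publish.
audit_log = []
broker = Broker()
broker.subscribe("security/events", audit_log.append)
```

A probe would call, for example, `broker.publish("security/events", normalize_nids(record))`, and an analyst-side check such as `correlated_hosts(audit_log, {"port_scan", "failed_login"}, 60)` stands in for a signature query over the stored events.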

    MementoMap: A Web Archive Profiling Framework for Efficient Memento Routing

    With the proliferation of public web archives, it is becoming more important to better profile their contents, both to understand their immense holdings and to support routing of requests in Memento aggregators. A memento is a past version of a web page, and a Memento aggregator is a tool or service that aggregates mementos from many different web archives. To save resources, the Memento aggregator should only poll the archives that are likely to have a copy of the requested Uniform Resource Identifier (URI). Using the Crawler Index (CDX), we generate profiles of the archives that summarize their holdings and use them to inform routing of the Memento aggregator's URI requests. Additionally, we use full-text search (when available) or sample URI lookups to build an understanding of an archive's holdings. Previous work in profiling ranged from using full URIs (no false positives, but with large profiles) to using only top-level domains (TLDs) (smaller profiles, but with many false positives). This work explores strategies in between these two extremes. For evaluation we used CDX files from Archive-It, UK Web Archive, Stanford Web Archive Portal, and Arquivo.pt. Moreover, we used web server access log files from the Internet Archive's Wayback Machine, UK Web Archive, Arquivo.pt, LANL's Memento Proxy, and ODU's MemGator Server. In addition, we utilized a historical dataset of URIs from DMOZ. In early experiments with various URI-based static profiling policies, we successfully identified about 78% of the URIs that were not present in the archive with less than 1% relative cost compared to the complete knowledge profile, and 94% of the URIs with less than 10% relative cost, without any false negatives. In another experiment we found that we can correctly route 80% of the requests while maintaining about 0.9 recall by discovering only 10% of the archive holdings and generating a profile that costs less than 1% of the complete knowledge profile.
We created MementoMap, a framework that allows web archives and third parties to express holdings and/or voids of an archive of any size with varying levels of detail to fulfil various application needs. Our archive profiling framework enables tools and services to predict and rank archives where mementos of a requested URI are likely to be present. In static profiling policies, we predefined for each policy the maximum depth of host and path segments of URIs that are used as URI keys. This gave us a good baseline for evaluation, but was not suitable for merging profiles with different policies. Later, we introduced a more flexible means of representing URI keys that uses wildcard characters to indicate whether a URI key was truncated. Moreover, we developed an algorithm to roll up URI keys dynamically at arbitrary depths when sufficient archiving activity is detected under certain URI prefixes. In an experiment with dynamic profiling of archival holdings, we found that a MementoMap of less than 1.5% relative cost can correctly identify the presence or absence of 60% of the lookup URIs in the corresponding archive without any false negatives (i.e., 100% recall). In addition, we separately evaluated archival voids based on the most frequently accessed resources in the access log and found that we could have avoided more than 8% of the false positives without introducing any false negatives. We defined a routing score that can be used for Memento routing. Using a cut-off threshold technique on our routing score, we achieved over 96% accuracy if we accept about 89% recall, and for a recall of 99% we managed to get about 68% accuracy, which translates to about 72% saving in wasted lookup requests in our Memento aggregator.
Moreover, when using top-k archives based on our routing score for routing and choosing only the topmost archive, we missed only about 8% of the sample URIs that are present in at least one archive, but when we selected the top-2 archives, we missed less than 2% of these URIs. We also evaluated a machine learning-based routing approach, which resulted in better overall accuracy but poorer recall due to the low prevalence of the sample lookup URI dataset in different web archives. We contributed various algorithms, such as a space- and time-efficient approach to ingesting large lists of URIs to generate MementoMaps and a Random Searcher Model to discover samples of holdings of web archives. We contributed numerous tools to support various aspects of web archiving and replay, such as MemGator (a Memento aggregator), InterPlanetary Wayback (a novel archival replay system), Reconstructive (a client-side request rerouting ServiceWorker), and AccessLog Parser. Moreover, this work yielded a file format specification draft called Unified Key Value Store (UKVS) that we use for serialization and dissemination of MementoMaps. It is a flexible and extensible file format that allows easy interactions with Unix text processing tools. UKVS can be used in many applications beyond MementoMaps.
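The idea of URI keys truncated at a maximum host and path depth, with a wildcard marking the truncation, can be sketched as follows. The key syntax used here (reversed host labels joined by commas, then `)/` and the path, with `*` for truncation) is an assumption chosen for illustration, as are the function names and default depths; the abstract does not specify MementoMap's actual key format or lookup procedure.

```python
from urllib.parse import urlsplit


def _split(uri):
    """Return (reversed host labels, non-empty path segments) of a URI."""
    parts = urlsplit(uri if "://" in uri else "http://" + uri)
    labels = (parts.hostname or "").lower().split(".")[::-1]
    segments = [s for s in parts.path.split("/") if s]
    return labels, segments


def uri_key(uri, max_host=3, max_path=1):
    """Build a profile key, truncating beyond the given depths.

    A trailing '*' marks that the key was truncated.
    """
    labels, segments = _split(uri)
    if len(labels) > max_host:
        return ",".join(labels[:max_host]) + ",*"
    host = ",".join(labels) + ")/"
    if len(segments) > max_path:
        kept = "/".join(segments[:max_path])
        return host + (kept + "/*" if kept else "*")
    return host + "/".join(segments)


def candidate_keys(uri):
    """Yield every key, from shortest wildcard prefix to the exact key,
    under which a profile could have recorded this URI."""
    labels, segments = _split(uri)
    for depth in range(1, len(labels)):
        yield ",".join(labels[:depth]) + ",*"
    host = ",".join(labels) + ")/"
    for depth in range(len(segments)):
        kept = "/".join(segments[:depth])
        yield host + (kept + "/*" if kept else "*")
    yield host + "/".join(segments)


def may_hold(profile, uri):
    """True if the profile suggests the archive may hold the URI.

    A wildcard match can be a false positive; a miss means the
    profile records nothing under any prefix of the URI.
    """
    return any(key in profile for key in candidate_keys(uri))
```

In this sketch an aggregator would keep one `profile` set per archive and poll only the archives for which `may_hold` returns true. Shallower depths give smaller profiles at the price of more false positives, which is the trade-off between full URIs and TLD-only profiles that the abstract describes.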

    Automated tools and techniques for distributed Grid Software: Development of the testbed infrastructure

    Grid technology is becoming increasingly important as the new paradigm for sharing computational resources securely across different organizations. The power of this solution requires the definition of a generic stack of services and protocols, and this is the scope of the different Grid initiatives. As a result of international collaborations on its development, the Open Grid Forum created the Open Grid Services Architecture (OGSA), which aims to define the common set of services that will enable interoperability across the different implementations. This master's thesis was developed in this framework, as part of the two European-funded projects ETICS and OMII-Europe. The main objective is to contribute to the design and maintenance of large distributed development projects with an automated tool that makes it possible to apply Software Engineering techniques aimed at achieving an acceptable level of quality in the release process. Specifically, this thesis develops the testbed concept as a virtual, production-like scenario in which to perform compliance tests. As a proof of concept, the OGSA Basic Execution Service was chosen in order to implement and execute conformance tests within the ETICS automated testbed framework.