513 research outputs found

    A Grammar for Reproducible and Painless Extract-Transform-Load Operations on Medium Data

    Get PDF
    Many interesting data sets available on the Internet are of a medium size---too big to fit into a personal computer's memory, but not so large that they won't fit comfortably on its hard disk. In the coming years, data sets of this magnitude will inform vital research in a wide array of application domains. However, due to a variety of constraints they are cumbersome to ingest, wrangle, analyze, and share in a reproducible fashion. These obstructions hamper thorough peer-review and thus disrupt the forward progress of science. We propose a predictable and pipeable framework for R (the state-of-the-art statistical computing environment) that leverages SQL (the venerable database architecture and query language) to make reproducible research on medium data a painless reality.Comment: 30 pages, plus supplementary material

    Assurance of Distributed Algorithms and Systems: Runtime Checking of Safety and Liveness

    Full text link
    This paper presents a general framework and methods for complete programming and checking of distributed algorithms at a high-level, as in pseudocode languages, but precisely specified and directly executable, as in formal specification languages and practical programming languages, respectively. The checking framework, as well as the writing of distributed algorithms and specification of their safety and liveness properties, use DistAlgo, a high-level language for distributed algorithms. We give a complete executable specification of the checking framework, with a complete example algorithm and example safety and liveness properties.Comment: Small fixes to improve property specifications, including improvements not in the RV 2020 final versio

    In-Production Continuous Testing for Future Telco Cloud

    Get PDF
    Software Defined Networking (SDN) is an emerging paradigm to design, build and operate networks. The driving motivation of SDN was the need for a major change in network technologies to support a configuration, management, operation, reconfiguration and evolution than in current computer networks. In the SDN world, performance it is not only related to the behaviour of the data plane. As the separation of control plane and data plane makes the latter significantly more agile, it lays off all the complex processing workload to the control plane. This is further exacerbated in distributed network controller, where the control plane is additionally loaded with the state synchronization overhead. Furthermore, the introduction of SDNs technologies has raised advanced challenges in achieving failure resilience, meant as the persistence of service delivery that can justifiably be trusted, when facing changes, and fault tolerance, meant as the ability to avoid service failures in the presence of faults. Therefore, along with the “softwarization” of network services, it is an important goal in the engineering of such services, e.g. SDNs and NFVs, to be able to test and assess the proper functioning not only in emulated conditions before release and deployment, but also “in-production”, when the system is under real operating conditions.   The goal of this thesis is to devise an approach to evaluate not only the performance, but also the effectiveness of the failure detection, and mitigation mechanisms provided by SDN controllers, as well as the capability of the SDNs to ultimately satisfy nonfunctional requirements, especially resiliency, availability, and reliability. The approach consists of exploiting benchmarking techniques, such as the failure injection, to get continuously feedback on the performance as well as capabilities of the SDN services to survive failures, which is of paramount importance to improve the effective- ness of the system internal mechanisms in reacting to anomalous situations potentially occurring in operation, while its services are regularly updated or improved. Within this vision, this dissertation first presents SCP-CLUB (SDN Control Plane CLoUd-based Benchmarking), a benchmarking frame- work designed to automate the characterization of SDN control plane performance, resilience and fault tolerance in telco cloud deployments. The idea is to provide the same level of automation available in deploying NFV function, for the testing of different configuration, using idle cycles of the telco cloud infrastructure. Then, the dissertation proposes an extension of the framework with mechanisms to evaluate the runtime behaviour of a Telco Cloud SDN under (possibly unforeseen) failure conditions, by exploiting the software failure injection

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    Get PDF
    After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we have identified and analyzed gaps within European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio- economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges

    Macro-micro approach for mining public sociopolitical opinion from social media

    Get PDF
    During the past decade, we have witnessed the emergence of social media, which has prominence as a means for the general public to exchange opinions towards a broad range of topics. Furthermore, its social and temporal dimensions make it a rich resource for policy makers and organisations to understand public opinion. In this thesis, we present our research in understanding public opinion on Twitter along three dimensions: sentiment, topics and summary. In the first line of our work, we study how to classify public sentiment on Twitter. We focus on the task of multi-target-specific sentiment recognition on Twitter, and propose an approach which utilises the syntactic information from parse-tree in conjunction with the left-right context of the target. We show the state-of-the-art performance on two datasets including a multi-target Twitter corpus on UK elections which we make public available for the research community. Additionally we also conduct two preliminary studies including cross-domain emotion classification on discourse around arts and cultural experiences, and social spam detection to improve the signal-to-noise ratio of our sentiment corpus. Our second line of work focuses on automatic topical clustering of tweets. Our aim is to group tweets into a number of clusters, with each cluster representing a meaningful topic, story, event or a reason behind a particular choice of sentiment. We explore various ways of tackling this challenge and propose a two-stage hierarchical topic modelling system that is efficient and effective in achieving our goal. Lastly, for our third line of work, we study the task of summarising tweets on common topics, with the goal to provide informative summaries for real-world events/stories or explanation underlying the sentiment expressed towards an issue/entity. As most existing tweet summarisation approaches rely on extractive methods, we propose to apply state-of-the-art neural abstractive summarisation model for tweets. We also tackle the challenge of cross-medium supervised summarisation with no target-medium training resources. To the best of our knowledge, there is no existing work on studying neural abstractive summarisation on tweets. In addition, we present a system for providing interactive visualisation of topic-entity sentiments and the corresponding summaries in chronological order. Throughout our work presented in this thesis, we conduct experiments to evaluate and verify the effectiveness of our proposed models, comparing to relevant baseline methods. Most of our evaluations are quantitative, however, we do perform qualitative analyses where it is appropriate. This thesis provides insights and findings that can be used for better understanding public opinion in social media

    Network Traffic Measurements, Applications to Internet Services and Security

    Get PDF
    The Internet has become along the years a pervasive network interconnecting billions of users and is now playing the role of collector for a multitude of tasks, ranging from professional activities to personal interactions. From a technical standpoint, novel architectures, e.g., cloud-based services and content delivery networks, innovative devices, e.g., smartphones and connected wearables, and security threats, e.g., DDoS attacks, are posing new challenges in understanding network dynamics. In such complex scenario, network measurements play a central role to guide traffic management, improve network design, and evaluate application requirements. In addition, increasing importance is devoted to the quality of experience provided to final users, which requires thorough investigations on both the transport network and the design of Internet services. In this thesis, we stress the importance of users’ centrality by focusing on the traffic they exchange with the network. To do so, we design methodologies complementing passive and active measurements, as well as post-processing techniques belonging to the machine learning and statistics domains. Traffic exchanged by Internet users can be classified in three macro-groups: (i) Outbound, produced by users’ devices and pushed to the network; (ii) unsolicited, part of malicious attacks threatening users’ security; and (iii) inbound, directed to users’ devices and retrieved from remote servers. For each of the above categories, we address specific research topics consisting in the benchmarking of personal cloud storage services, the automatic identification of Internet threats, and the assessment of quality of experience in the Web domain, respectively. Results comprise several contributions in the scope of each research topic. In short, they shed light on (i) the interplay among design choices of cloud storage services, which severely impact the performance provided to end users; (ii) the feasibility of designing a general purpose classifier to detect malicious attacks, without chasing threat specificities; and (iii) the relevance of appropriate means to evaluate the perceived quality of Web pages delivery, strengthening the need of users’ feedbacks for a factual assessment

    Resilience-Building Technologies: State of Knowledge -- ReSIST NoE Deliverable D12

    Get PDF
    This document is the first product of work package WP2, "Resilience-building and -scaling technologies", in the programme of jointly executed research (JER) of the ReSIST Network of Excellenc

    CORPORATE SOCIAL RESPONSIBILITY IN ROMANIA

    Get PDF
    The purpose of this paper is to identify the main opportunities and limitations of corporate social responsibility (CSR). The survey was defined with the aim to involve the highest possible number of relevant CSR topics and give the issue a more wholesome perspective. It provides a basis for further comprehension and deeper analyses of specific CSR areas. The conditions determining the success of CSR in Romania have been defined in the paper on the basis of the previously cumulative knowledge as well as the results of various researches. This paper provides knowledge which may be useful in the programs promoting CSR.Corporate social responsibility, Supportive policies, Romania
    • 

    corecore