git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories

Author: Gote Christoph
Scholtes Ingo
Schweitzer Frank
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 25/03/2019

Data from software repositories have become an important foundation for the empirical study of software engineering processes. A recurring theme in the repository mining literature is the inference of developer networks capturing e.g. collaboration, coordination, or communication from the commit history of projects. Most of the studied networks are based on the co-authorship of software artefacts defined at the level of files, modules, or packages. While this approach has led to insights into the social aspects of software development, it neglects detailed information on code changes and code ownership, e.g. which exact lines of code have been authored by which developers, that is contained in the commit log of software projects. Addressing this issue, we introduce git2net, a scalable Python software that facilitates the extraction of fine-grained co-editing networks in large git repositories. It uses text mining techniques to analyse the detailed history of textual modifications within files. This information allows us to construct directed, weighted, and time-stamped networks, where a link signifies that one developer has edited a block of source code originally written by another developer. Our tool is applied in case studies of an Open Source and a commercial software project. We argue that it opens up a massive new source of high-resolution data on human collaboration patterns. Comment: MSR 2019, 12 pages, 10 figures
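The network construction described in this abstract can be sketched in a few lines. This is a minimal illustration and not git2net's actual API: the record layout `(editor, original_author, timestamp, lines_changed)`, the function name `build_coediting_network`, the choice to weight links by changed lines, and the decision to skip self-edits are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical edit records: (editor, original_author, timestamp, lines_changed).
# In practice these would be mined from the commit history of a repository.
edits = [
    ("alice", "bob", datetime(2019, 1, 5), 3),
    ("alice", "bob", datetime(2019, 2, 1), 2),
    ("carol", "alice", datetime(2019, 2, 3), 7),
    ("bob", "bob", datetime(2019, 2, 4), 1),  # self-edit, skipped below (assumption)
]


def build_coediting_network(records):
    """Aggregate edit records into directed, weighted, time-stamped links.

    A link (editor, owner) means the editor changed code originally
    written by the owner. The weight sums the changed lines (assumed
    weighting), and every contributing timestamp is kept.
    """
    links = defaultdict(lambda: {"weight": 0, "timestamps": []})
    for editor, owner, when, n_lines in records:
        if editor == owner:
            continue  # only edits of another developer's code form a link
        link = links[(editor, owner)]
        link["weight"] += n_lines
        link["timestamps"].append(when)
    return dict(links)


network = build_coediting_network(edits)
for (editor, owner), link in network.items():
    print(f"{editor} -> {owner}: weight={link['weight']}, "
          f"edits={len(link['timestamps'])}")
```

Keeping the timestamps per link, rather than only the aggregate weight, is what would allow the network to be sliced into time windows later.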

arXiv.org e-Print Archive

ZORA

Datacenter Traffic Control: Understanding Techniques and Trade-offs

Author: Noormohammadpour Mohammad
Raghavendra Cauligi S.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 10/12/2017

Datacenters provide cost-effective and flexible access to scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected with a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements. This includes user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters and not to survey all existing solutions (as it is virtually impossible due to the massive body of existing research). We hope to provide readers with a wide range of options and factors while considering a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks that connect geographically dispersed datacenters, which have been receiving increasing attention recently and pose interesting and novel research problems. Comment: Accepted for Publication in IEEE Communications Surveys and Tutorials

arXiv.org e-Print Archive

ZENODO

FigShare

Recursive SDN for Carrier Networks

Author: Koponen Teemu
Liu Zhi
McCauley James
Panda Aurojit
Raghavan Barath
Rexford Jennifer
Shenker Scott
Publication date: 25/05/2016

Control planes for global carrier networks should be programmable (so that new functionality can be easily introduced) and scalable (so they can handle the numerical scale and geographic scope of these networks). Neither traditional control planes nor new SDN-based control planes meet both of these goals. In this paper, we propose a framework for recursive routing computations that combines the best of SDN (programmability) and traditional networks (scalability through hierarchy) to achieve these two desired properties. Through simulation on graphs of up to 10,000 nodes, we evaluate our design's ability to support a variety of routing and traffic engineering solutions, while incorporating a fast failure recovery mechanism.

arXiv.org e-Print Archive

Princeton University Open Access Repository

Fine Grained Component Engineering of Adaptive Overlays: Experiences and Perspectives

Author: Blair Gordon S.
Grace P.
Mauthe Andreas
Tyson Gareth
Publication venue: Lancaster University
Publication date: 06/07/2009

Recent years have seen significant research being carried out into peer-to-peer (P2P) systems. This work has focused on the styles and applications of P2P computing, from grid computation to content distribution; however, little investigation has been performed into how these systems are built. Component-based engineering is an approach that has seen successful deployment in the field of middleware development; functionality is encapsulated in ‘building blocks’ that can be dynamically plugged together to form complete systems. This allows efficient, flexible and adaptable systems to be built with lower overhead and development complexity. This paper presents an investigation into the potential of using component-based engineering in the design and construction of peer-to-peer overlays. It is highlighted that the quality of these properties is dictated by the component architecture used to implement the system. Three reusable decomposition architectures are designed and evaluated using Chord and Pastry case studies. These demonstrate that significant improvements can be made over traditional design approaches, resulting in much more reusable, (re)configurable and extensible systems.

Lancaster E-Prints

ScaRR: Scalable Runtime Remote Attestation for Complex Systems

Author: Biondo Andrea
Conti Mauro
Losiouk Eleonora
Toffalini Flavio
Zhou Jianying
Publication date: 01/01/2019

The introduction of remote attestation (RA) schemes has allowed academia and industry to enhance the security of their systems. The commercial products currently available enable only the validation of static properties, such as application fingerprints, and do not handle runtime properties, such as control-flow correctness. This limitation pushed researchers towards the identification of new approaches, called runtime RA. However, those mainly work on embedded devices, which share very few common features with complex systems, such as virtual machines in a cloud. A naive deployment of runtime RA schemes for embedded devices on complex systems faces scalability problems, such as the representation of complex control-flows or a slow verification phase. In this work, we present ScaRR: the first Scalable Runtime Remote attestation schema for complex systems. Thanks to its novel control-flow model, ScaRR enables the deployment of runtime RA on any application regardless of its complexity, while also achieving good performance. We implemented ScaRR and tested it on the benchmark suite SPEC CPU 2017. We show that ScaRR can validate on average 2M control-flow events per second, definitely outperforming existing solutions. Comment: 14 pages

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova

DESIGN AND IMPLEMENTATION OF PATH FINDING AND VERIFICATION IN THE INTERNET

Author: Cai Hao
Publication venue: ScholarWorks@UMass Amherst
Publication date: 16/07/2020

In the Internet, network traffic between endpoints typically follows one path that is determined by the control plane. Endpoints have little control over the choice of which path their network traffic takes and little ability to verify if the traffic indeed follows a specific path. With the emergence of software-defined networking (SDN), more control over connections can be exercised, and thus the opportunity for novel solutions exists. However, there remain concerns about the attack surface exposed by fine-grained control, which may allow attackers to inject and redirect traffic. To address these opportunities and concerns, we consider two specific challenges: (1) How can the network determine the choices of paths available to connect endpoints, especially when multiple criteria can be considered? And (2) how can endpoints verify the integrity of the path over which network traffic is sent? The latter consists of two subproblems, determining that the source of traffic is authentic and determining that a specified path is traversed without deviation. In this dissertation, we investigate and present solutions for both the network path finding problem and the verification problem. We first address path finding, or routing, which is a core functionality in the Internet. Existing approaches are either based on a single criterion (such as path length, delay, or an artificially defined "weight") or use a combinatorial optimization function when there are multiple criteria. We present a multi-criteria routing algorithm that can search the whole space of all possible paths. To achieve the scalability of our solution, we limit the search to only Pareto-optimal paths, which allows us to prune sub-optimal paths quickly and reduce computational complexity. We show that our approach is tractable on a variety of realistic topologies and that the resulting Pareto-optimal paths can be clustered to present a few alternative options.
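The Pareto-optimality criterion mentioned above can be illustrated with a small filter. This is a sketch under stated assumptions, not the dissertation's algorithm: the abstract describes pruning during the path search, whereas this example only filters a finished list of candidate paths. The path names, the two cost criteria (hop count, delay in ms), and the function names `dominates` and `pareto_front` are hypothetical, and lower values are assumed to be better on every criterion.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b on every criterion
    and strictly better on at least one (lower is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(paths):
    """Keep only the paths whose cost vector no other path dominates."""
    return {
        name: cost
        for name, cost in paths.items()
        if not any(dominates(other, cost)
                   for other_name, other in paths.items()
                   if other_name != name)
    }


# Hypothetical candidates: (hop count, delay in ms).
candidates = {
    "p1": (3, 40),
    "p2": (4, 25),
    "p3": (5, 45),  # worse than p1 on both criteria, so it is pruned
    "p4": (6, 20),
}

front = pareto_front(candidates)
print(sorted(front))
```

No single path in the remaining set is best on both criteria, which is why the result is a set of alternatives rather than one answer.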
We then address path verification in the Internet, which consists of source authentication and path validation. Once a path has been selected, we show that an endpoint can validate that traffic indeed traverses the chosen path. Prior work has relied on cryptographic approaches for such validation, which need significant computational resources. In contrast, we propose a lightweight and scalable technique to address this problem, which uses a set of orthogonal sequences as credentials in the packets. The verification of these orthogonal credentials is based on inner product computations, which can be easily implemented by basic bitwise operations in a processor. We show that the proposed approach can achieve the necessary security properties for both source authentication and path validation. Results from a prototype implementation show that the proposed technique can be implemented efficiently and adds only a small computational overhead. The results of our work enable novel uses of networks with fine-grained traffic control, such as enabling more path choices in networks where multiple performance criteria matter. In addition, our work contributes to efforts to make the Internet more secure by presenting techniques that allow endpoints to validate the source and path of network traffic. We believe that these contributions help with improving both the current Internet and future networks.
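To show why an inner product of orthogonal sequences reduces to bitwise operations, here is a minimal sketch. It assumes Walsh-Hadamard rows as the orthogonal sequences, which the abstract does not specify, and it illustrates only the arithmetic, not the security properties of the proposed scheme. The names `walsh_row`, `inner_product`, and `verify`, and the sequence length `N = 8`, are hypothetical.

```python
N = 8  # assumed sequence length; must be a power of two


def walsh_row(i, n=N):
    """Row i of an n x n Walsh-Hadamard matrix, packed into an int.

    Bit j is 1 where the matrix entry is -1 and 0 where it is +1.
    The entry at (i, j) is -1 exactly when i & j has odd parity.
    """
    bits = 0
    for j in range(n):
        if bin(i & j).count("1") % 2:
            bits |= 1 << j
    return bits


def inner_product(a, b, n=N):
    """Inner product of two packed +/-1 sequences.

    Positions where the bits differ contribute -1 and the rest +1,
    so the sum is n minus twice the popcount of a XOR b.
    """
    return n - 2 * bin(a ^ b).count("1")


def verify(credential, expected, n=N):
    """Accept only if the credential correlates fully with the expected one."""
    return inner_product(credential, expected, n) == n


rows = [walsh_row(i) for i in range(N)]
print(verify(rows[3], rows[3]), verify(rows[3], rows[5]))
```

Distinct rows give an inner product of zero and a row with itself gives `N`, so a single XOR followed by a population count is enough to tell a matching credential from a non-matching one.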

ScholarWorks@UMass Amherst