Parallel Discrete Event Simulation with Erlang
Discrete Event Simulation (DES) is a widely used technique in which the state
of the simulator is updated by events happening at discrete points in time
(hence the name). DES is used to model and analyze many kinds of systems,
including computer architectures, communication networks, street traffic, and
others. Parallel and Distributed Simulation (PADS) aims at improving the
efficiency of DES by partitioning the simulation model across multiple
processing elements, in order to enable larger and/or more detailed studies
to be carried out. Interest in PADS has grown with the widespread
availability of multicore processors and affordable high performance computing
clusters. However, designing parallel simulation models requires considerable
expertise, the result being that PADS techniques are not as widespread as they
could be. In this paper we describe ErlangTW, a parallel simulation middleware
based on the Time Warp synchronization protocol. ErlangTW is entirely written
in Erlang, a concurrent, functional programming language specifically targeted
at building distributed systems. We argue that writing parallel simulation
models in Erlang is considerably easier than using conventional programming
languages. Moreover, ErlangTW allows simulation models to be executed on
single-core, multicore, and distributed computing architectures. We describe the
design and prototype implementation of ErlangTW, and report some preliminary
performance results on multicore and distributed architectures using the well
known PHOLD benchmark.
Comment: Proceedings of the ACM SIGPLAN Workshop on Functional High-Performance Computing (FHPC 2012), in conjunction with ICFP 2012. ISBN: 978-1-4503-1577-
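The optimistic Time Warp synchronization that ErlangTW implements can be illustrated with a toy sketch. This is a minimal Python model of a single logical process (not the actual ErlangTW code, which is in Erlang): events are processed optimistically, and a straggler event arriving in the past triggers a rollback to a saved checkpoint. Anti-messages and GVT computation, which a real Time Warp kernel needs, are omitted.

```python
import heapq

class LogicalProcess:
    """Toy Time Warp logical process: optimistic execution with
    state saving and rollback on straggler events."""

    def __init__(self):
        self.lvt = 0              # local virtual time
        self.state = 0            # toy state: number of events handled
        self.queue = []           # pending event timestamps (min-heap)
        self.saved = [(0, 0)]     # checkpoints: (lvt, state)

    def schedule(self, ts):
        heapq.heappush(self.queue, ts)
        if ts < self.lvt:         # straggler: arrived in the past
            self.rollback(ts)

    def rollback(self, ts):
        # restore the latest checkpoint not later than the straggler
        while len(self.saved) > 1 and self.saved[-1][0] > ts:
            self.saved.pop()
        self.lvt, self.state = self.saved[-1]

    def run(self):
        # optimistically process everything currently pending
        while self.queue:
            ts = heapq.heappop(self.queue)
            self.lvt = ts
            self.state += 1
            self.saved.append((self.lvt, self.state))

lp = LogicalProcess()
for ts in (10, 20, 30):
    lp.schedule(ts)
lp.run()          # lvt advances optimistically to 30
lp.schedule(15)   # straggler: rolls back to the checkpoint at t=10
lp.run()          # re-executes forward; lvt is now 15
```

In ErlangTW each logical process would be an Erlang process and events would be messages, which is what makes the language a natural fit for this protocol.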
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
This paper presents a state-of-the-art model for visual question answering
(VQA), which won the first place in the 2017 VQA Challenge. VQA is a task of
significant importance for research in artificial intelligence, given its
multimodal nature, clear evaluation protocol, and potential real-world
applications. The performance of deep neural networks for VQA is very dependent
on choices of architectures and hyperparameters. To help further research in
the area, we describe in detail our high-performing, though relatively simple
model. Through a massive exploration of architectures and hyperparameters
representing more than 3,000 GPU-hours, we identified tips and tricks that lead
to its success, namely: sigmoid outputs, soft training targets, image features
from bottom-up attention, gated tanh activations, output embeddings initialized
using GloVe and Google Images, large mini-batches, and smart shuffling of
training data. We provide a detailed analysis of their impact on performance to
assist others in making an appropriate selection.
Comment: Winner of the 2017 Visual Question Answering (VQA) Challenge at CVPR
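Two of the tricks listed above, sigmoid outputs with soft training targets and gated tanh activations, are easy to state concretely. The following NumPy sketch shows the general idea (it is an illustration, not the authors' code): each candidate answer gets an independent sigmoid so several answers can be simultaneously correct, and the loss is binary cross-entropy against soft targets in [0, 1] reflecting partial annotator agreement.

```python
import numpy as np

def soft_target_bce(logits, targets):
    """Binary cross-entropy against soft targets in [0, 1].
    One independent sigmoid per candidate answer (multi-label)."""
    probs = 1.0 / (1.0 + np.exp(-logits))     # per-answer sigmoid
    eps = 1e-12                                # numerical safety
    return -np.mean(targets * np.log(probs + eps)
                    + (1 - targets) * np.log(1 - probs + eps))

def gated_tanh(x, w_t, w_g):
    """Gated tanh activation: a tanh branch modulated elementwise
    by a sigmoid gate (weight matrices w_t, w_g are illustrative)."""
    return np.tanh(x @ w_t) * (1.0 / (1.0 + np.exp(-(x @ w_g))))

# soft targets: annotators partially agree on answers 0 and 1
logits = np.array([4.0, 0.5, -4.0])
targets = np.array([1.0, 0.6, 0.0])
loss = soft_target_bce(logits, targets)
```

Logits that agree with the soft targets yield a lower loss than logits that contradict them, which is what drives training toward the annotator distribution.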
LUNES: Agent-based Simulation of P2P Systems (Extended Version)
We present LUNES, an agent-based Large Unstructured NEtwork Simulator, which
makes it possible to simulate complex networks composed of a large number of
nodes. LUNES is modular: it separates the three phases of network topology
creation, protocol simulation, and performance evaluation, which makes it easy
to integrate external software tools into the main software architecture. The
simulation of the interaction protocols among network nodes is performed via a
simulation middleware that supports both the sequential and the
parallel/distributed simulation approaches. In the latter case, a specific
mechanism for reducing communication overhead is used; this guarantees
high levels of performance and scalability. To demonstrate the efficiency of
LUNES, we test the simulator with gossip protocols executed on top of networks
(representing peer-to-peer overlays), generated with different topologies.
Results demonstrate the effectiveness of the proposed approach.
Comment: Proceedings of the International Workshop on Modeling and Simulation of Peer-to-Peer Architectures and Systems (MOSPAS 2011), as part of the 2011 International Conference on High Performance Computing and Simulation (HPCS 2011).
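The gossip protocols evaluated on top of LUNES follow a simple push pattern that can be sketched in a few lines. This Python fragment is a stand-in illustration (not LUNES itself, which is a C-based middleware): each node that first receives a message forwards it to a bounded number of random neighbours, and dissemination coverage depends on the fanout and the overlay topology.

```python
import random

def gossip(adjacency, source, fanout, seed=0):
    """Push-gossip dissemination over an overlay given as an
    adjacency dict {node: [neighbours]}. Returns the set of
    nodes reached from `source`."""
    rng = random.Random(seed)      # seeded for reproducible runs
    reached = {source}
    frontier = [source]
    while frontier:
        nxt = []
        for node in frontier:
            neigh = adjacency.get(node, [])
            # forward to at most `fanout` random neighbours
            for peer in rng.sample(neigh, min(fanout, len(neigh))):
                if peer not in reached:
                    reached.add(peer)
                    nxt.append(peer)
        frontier = nxt
    return reached

overlay = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
gossip(overlay, source=0, fanout=2)   # reaches all three nodes
```

Running this over overlays with different topologies, as the paper does, shows how coverage and message overhead trade off against the fanout parameter.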
Performance analysis of the doubly-linked list protocol family for distributed shared memory systems
The 2nd International Conference on Algorithms and Architectures for Parallel Processing, Singapore, 11-13 June 1996.
The doubly-linked list (DLL) protocol provides a memory-efficient, scalable, high-performance, and yet easy to implement method to maintain memory coherence in distributed shared memory (DSM) systems. In this paper, the performance analysis of the DLL family of protocols is presented. Theoretically, the DLL protocol with stable owners has the shortest remote memory access latency among the DLL protocol family. According to the simulated performance evaluation, the DLL-S protocol is 65.7% faster than the DDM algorithm for the linear equation solver, and 16.5% faster for the matrix multiplier. The trend of the performance figures suggests that the improvement due to the DLL-S protocol will be considerably greater when a larger number of processors is used, indicating that the DLL-S protocol is also the most scalable of the protocols tested.
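The core idea of a doubly-linked sharer list can be sketched as follows. This Python fragment is an illustrative model, not the paper's protocol: nodes caching a copy of a page form a doubly-linked list, so joining costs O(1) pointer updates, and a writer invalidates all other copies by walking the list.

```python
class DLLDirectory:
    """Toy doubly-linked sharer list for one DSM page: the list
    head doubles as the owner; sharers link in at the head."""

    def __init__(self, owner):
        self.head = owner
        self.next = {owner: None}
        self.prev = {owner: None}

    def add_sharer(self, node):
        # O(1) insert at the head of the sharer list
        self.next[node] = self.head
        self.prev[self.head] = node
        self.prev[node] = None
        self.head = node

    def invalidate_all(self, writer):
        # a writer walks the list, invalidating every other copy,
        # then becomes the sole owner of the page
        invalidated = []
        cur = self.head
        while cur is not None:
            if cur != writer:
                invalidated.append(cur)
            cur = self.next[cur]
        self.__init__(writer)
        return invalidated

d = DLLDirectory("A")
d.add_sharer("B")
d.add_sharer("C")
d.invalidate_all("C")   # invalidates B and A; C becomes sole owner
```

Keeping the owner stable at one end of the list, as in the DLL variant with stable owners, is what shortens the remote access path the paper analyzes.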
Creation of Flexible Data Structure for an Emerging Network Control Protocol
Due to the increasing number of versions of the OpenFlow protocol, it is becoming harder to maintain isolated data-structure support for each version. There is a high degree of variability between versions of the OpenFlow protocol: each version specifies an interface and the collection of abstractions present in a switch that can be manipulated. The focus of this thesis is therefore to use a data structure (Avro) that supports the OpenFlow protocol through the software infrastructure proposed by the Warp development group. Using this, we have developed OpenFlow version 1.2 support for the Warp controller. The Warp architecture uses the Avro data structure, which has advantages such as easy integration of new versions, updating of existing versions, run-time changes, version control, data exchange, and easy schema processing, all of which heavily affect the performance and flexibility of an OpenFlow controller. These factors are compared against other OpenFlow controller architectures such as Floodlight and Ryu. The observations obtained lead to the conclusion that Warp is a more flexible architecture than Floodlight and Ryu.
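The schema-evolution property that makes Avro attractive here can be shown in miniature. The sketch below is plain Python, not the Avro library or the Warp API: a reader schema supplies defaults for fields a writer's older record lacks, which is what lets a controller add fields for a new OpenFlow version without breaking messages produced against an old one. The field names are hypothetical.

```python
def read_with_schema(record, reader_schema):
    """Avro-style schema resolution in miniature: fields missing
    from an old record are filled from the reader schema's
    defaults (illustrative; not the actual Avro/Warp API)."""
    return {name: record.get(name, default)
            for name, default in reader_schema.items()}

# a v1-era message, read under a newer schema that added `table_id`
v1_flow_mod = {"match": "ip", "action": "output:1"}
v2_schema = {"match": None, "action": None, "table_id": 0}
read_with_schema(v1_flow_mod, v2_schema)
# -> {"match": "ip", "action": "output:1", "table_id": 0}
```

Real Avro resolves writer and reader schemas field by field in this spirit, with defaults declared in the reader schema making additions backward compatible.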
Computing infrastructure issues in distributed communications systems : a survey of operating system transport system architectures
The performance of distributed applications (such as file transfer, remote login, tele-conferencing, full-motion video, and scientific visualization) is influenced by several factors that interact in complex ways. In particular, application performance is significantly affected both by communication infrastructure factors and computing infrastructure factors. Communication infrastructure factors include channel speed, bit-error rate, and congestion at intermediate switching nodes. Computing infrastructure factors include (among other things) both protocol processing activities (such as connection management, flow control, error detection, and retransmission) and general operating system factors (such as memory latency, CPU speed, interrupt and context-switching overhead, process architecture, and message buffering). Due to a several-orders-of-magnitude increase in network channel speed and an increase in application diversity, performance bottlenecks are shifting from the network factors to the transport system factors.
This paper defines an abstraction called an "Operating System Transport System Architecture" (OSTSA) that is used to classify the major components and services in the computing infrastructure. End-to-end network protocols such as TCP, TP4, VMTP, XTP, and Delta-t typically run on general-purpose computers, where they utilize various operating system resources such as processors, virtual memory, and network controllers. The OSTSA provides services that integrate these resources to support distributed applications running on local and wide area networks.
A taxonomy is presented to evaluate OSTSAs in terms of their support for protocol processing activities. We use this taxonomy to compare and contrast five general-purpose commercial and experimental operating systems: System V UNIX, BSD UNIX, the x-kernel, Choices, and Xinu.
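The bottleneck shift described above follows from a simple cost model, sketched here with illustrative numbers (the parameters are assumptions, not figures from the survey): total transfer time is wire time plus a fixed per-packet host overhead for protocol processing, context switches, and buffering. As channel speed grows, the host overhead comes to dominate.

```python
def transfer_time(nbytes, channel_bps, per_packet_overhead_s, mtu=1500):
    """Toy model: (wire_seconds, host_seconds) for one transfer.
    Host cost is a fixed transport-system overhead per packet."""
    packets = -(-nbytes // mtu)                 # ceiling division
    wire = nbytes * 8 / channel_bps             # serialization time
    host = packets * per_packet_overhead_s      # OS/protocol overhead
    return wire, host

# 1 MB transfer, assuming 20 microseconds of host overhead per packet
slow = transfer_time(1_000_000, 10e6, 20e-6)    # 10 Mbit/s: wire-bound
fast = transfer_time(1_000_000, 10e9, 20e-6)    # 10 Gbit/s: host-bound
```

At 10 Mbit/s the wire time dominates; at 10 Gbit/s the same per-packet host overhead exceeds the wire time, which is exactly why the survey focuses on transport system architectures rather than network factors.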
Optimization driven multi-hop network design and experimentation: the approach of the FP7 project OPNEX
The OPNEX project exemplifies system and optimization theory as the foundations for algorithms that provably maximize the capacity of wireless networks. The algorithms, expressed in abstract network models, have been converted into protocols and architectures practically applicable to wireless systems. A validation methodology based on experimental protocol evaluation in real network testbeds has been proposed and used. OPNEX uses recent advances in system-theoretic network control, including the Back-Pressure principle, max-weight scheduling, utility optimization, congestion control, and the primal-dual method for deriving network algorithms. These approaches exhibited vast potential for achieving high capacity and full exploitation of resources in abstract network models, and found their way into reality in the high-performance architectures developed as a result of the research conducted within OPNEX.
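The max-weight (Back-Pressure) principle named above has a compact textbook form, sketched here in Python as an illustration rather than as OPNEX's implementation: in each slot, among the feasible activation sets (links that may transmit simultaneously under interference constraints), pick the one maximizing the sum of queue-differential times rate over its links.

```python
def max_weight_schedule(queues, activations, rates):
    """One slot of max-weight scheduling.
    queues:      {node: backlog}
    activations: list of feasible link sets, each a list of (u, v)
    rates:       {(u, v): achievable rate on that link}
    Returns the activation set with the largest total weight."""
    def weight(act):
        # back-pressure weight: positive queue differential x rate
        return sum(max(queues[u] - queues[v], 0) * rates[(u, v)]
                   for (u, v) in act)
    return max(activations, key=weight)

queues = {"a": 10, "b": 2, "c": 0}
rates = {("a", "b"): 1.0, ("b", "c"): 1.0}
# interference: only one of the two links may be active per slot
activations = [[("a", "b")], [("b", "c")]]
max_weight_schedule(queues, activations, rates)   # serves a -> b
```

Serving the link with the larger backlog differential is what yields the throughput-optimality guarantees that OPNEX carries from abstract models into deployed protocols.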
Multistage Switching Architectures for Software Routers
Software routers based on personal computer (PC) architectures are becoming an important alternative to proprietary and expensive network devices. However, software routers suffer from many limitations of the PC architecture, including, among others, limited bus and central processing unit (CPU) bandwidth, high memory access latency, limited scalability in terms of number of network interface cards, and lack of resilience mechanisms. Multistage PC-based architectures can be an interesting alternative since they permit us to i) increase the performance of single software routers, ii) scale router size, iii) distribute packet manipulation and control functionality, iv) recover from single-component failures, and v) incrementally upgrade router performance. We propose a specific multistage architecture, exploiting PC-based routers as switching elements, to build a high-speed, large-size, scalable, and reliable software router. A small-scale prototype of the multistage router is currently up and running in our labs, and performance evaluation is under way.
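One way the multistage idea distributes load can be sketched as follows. This is an illustrative fragment, not the paper's exact scheme: a first-stage element hashes each flow onto one of several PC-based back-end routers, so aggregate capacity scales with the number of back ends, and a failed back end can simply be dropped from the pool (at the cost of re-mapping its flows).

```python
import hashlib

def pick_backend(flow_id, backends):
    """First-stage dispatch: hash a flow identifier onto one of the
    back-end PC routers. Deterministic, so a flow always follows
    the same path while the pool is unchanged."""
    h = int(hashlib.sha256(flow_id.encode()).hexdigest(), 16)
    return backends[h % len(backends)]

pool = ["router1", "router2", "router3"]
pick_backend("10.0.0.1:443->10.0.0.2:80", pool)   # stable choice
```

A consistent-hashing variant would reduce the number of flows re-mapped when the pool shrinks; the simple modulo form above is kept for clarity.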