
    Parallel Discrete Event Simulation with Erlang

    Discrete Event Simulation (DES) is a widely used technique in which the state of the simulator is updated by events happening at discrete points in time (hence the name). DES is used to model and analyze many kinds of systems, including computer architectures, communication networks, and street traffic. Parallel and Distributed Simulation (PADS) aims at improving the efficiency of DES by partitioning the simulation model across multiple processing elements, enabling larger and/or more detailed studies to be carried out. Interest in PADS has grown with the widespread availability of multicore processors and affordable high performance computing clusters. However, designing parallel simulation models requires considerable expertise, with the result that PADS techniques are not as widespread as they could be. In this paper we describe ErlangTW, a parallel simulation middleware based on the Time Warp synchronization protocol. ErlangTW is entirely written in Erlang, a concurrent, functional programming language specifically targeted at building distributed systems. We argue that writing parallel simulation models in Erlang is considerably easier than using conventional programming languages. Moreover, ErlangTW allows simulation models to be executed on single-core, multicore, and distributed computing architectures. We describe the design and prototype implementation of ErlangTW, and report some preliminary performance results on multicore and distributed architectures using the well known PHOLD benchmark.
    Comment: Proceedings of ACM SIGPLAN Workshop on Functional High-Performance Computing (FHPC 2012) in conjunction with ICFP 2012. ISBN: 978-1-4503-1577-
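
As a rough illustration of the DES idea described in the abstract above (state updated by events at discrete points in time), here is a minimal sequential sketch in Python. It is not ErlangTW and does not implement Time Warp; the names `run_simulation`, `on_arrival`, and `on_departure`, and the 1.5 time-unit service delay, are hypothetical and chosen only for the example.

```python
import heapq

def run_simulation(initial_events, handlers, end_time):
    """Process events in timestamp order until the queue is empty or end_time is passed.

    initial_events: list of (time, kind) pairs.
    handlers: dict mapping an event kind to a function that takes the state
              and returns a list of (delay, new_kind) events to schedule.
    """
    queue = []
    seq = 0  # tie-breaker so events with equal timestamps keep insertion order
    for time, kind in initial_events:
        heapq.heappush(queue, (time, seq, kind))
        seq += 1

    state = {"clock": 0.0, "log": []}
    while queue:
        time, _, kind = heapq.heappop(queue)
        if time > end_time:
            break
        # The simulation clock jumps directly to the next event's timestamp.
        state["clock"] = time
        state["log"].append((time, kind))
        for delay, new_kind in handlers[kind](state):
            heapq.heappush(queue, (time + delay, seq, new_kind))
            seq += 1
    return state

def on_arrival(state):
    # Hypothetical model: every arrival is served in 1.5 time units.
    return [(1.5, "departure")]

def on_departure(state):
    # Departures schedule nothing further.
    return []

result = run_simulation(
    [(0.0, "arrival"), (1.0, "arrival")],
    {"arrival": on_arrival, "departure": on_departure},
    end_time=10.0,
)
print(result["log"])
```

A parallel simulator would partition the model across several such event loops and synchronize them; that synchronization is the part Time Warp addresses and is not shown here.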

    Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge

    This paper presents a state-of-the-art model for visual question answering (VQA), which won first place in the 2017 VQA Challenge. VQA is a task of significant importance for research in artificial intelligence, given its multimodal nature, clear evaluation protocol, and potential real-world applications. The performance of deep neural networks for VQA depends heavily on the choice of architecture and hyperparameters. To help further research in the area, we describe in detail our high-performing, though relatively simple, model. Through a massive exploration of architectures and hyperparameters representing more than 3,000 GPU-hours, we identified tips and tricks that led to its success, namely: sigmoid outputs, soft training targets, image features from bottom-up attention, gated tanh activations, output embeddings initialized using GloVe and Google Images, large mini-batches, and smart shuffling of training data. We provide a detailed analysis of their impact on performance to assist others in making an appropriate selection.
    Comment: Winner of the 2017 Visual Question Answering (VQA) Challenge at CVP
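
To make two of the listed tricks concrete (sigmoid outputs and soft training targets), the sketch below scores each candidate answer independently with a sigmoid and compares it to a target in [0, 1] using binary cross-entropy. The abstract does not give the exact loss, so the choice of binary cross-entropy is an assumption, and the function names `sigmoid` and `soft_bce_loss` are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic function mapping a raw score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def soft_bce_loss(logits, soft_targets):
    """Sum of per-answer binary cross-entropy terms with targets in [0, 1].

    Assumption: each candidate answer has its own sigmoid output, so several
    answers can be partly correct at once (unlike a softmax over answers).
    """
    total = 0.0
    for z, s in zip(logits, soft_targets):
        p = sigmoid(z)
        # Clamp to avoid log(0) for very large or very small logits.
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= s * math.log(p) + (1.0 - s) * math.log(1.0 - p)
    return total

# Two candidate answers: one fully correct (1.0), one partly correct (0.3).
loss = soft_bce_loss([0.0, 0.0], [1.0, 0.3])
print(loss)
```

With soft targets, an answer given by only some annotators contributes a fractional target rather than being forced to 0 or 1.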

    LUNES: Agent-based Simulation of P2P Systems (Extended Version)

    We present LUNES, an agent-based Large Unstructured NEtwork Simulator, which can simulate complex networks composed of a large number of nodes. LUNES is modular, since it separates the three phases of network topology creation, protocol simulation, and performance evaluation. This makes it easy to integrate external software tools into the main software architecture. The simulation of the interaction protocols among network nodes is performed via a simulation middleware that supports both the sequential and the parallel/distributed simulation approaches. In the latter case, a specific mechanism for reducing communication overhead is used; this guarantees high levels of performance and scalability. To demonstrate the efficiency of LUNES, we test the simulator with gossip protocols executed on top of networks (representing peer-to-peer overlays) generated with different topologies. Results demonstrate the effectiveness of the proposed approach.
    Comment: Proceedings of the International Workshop on Modeling and Simulation of Peer-to-Peer Architectures and Systems (MOSPAS 2011). As part of the 2011 International Conference on High Performance Computing and Simulation (HPCS 2011

    Performance analysis of the doubly-linked list protocol family for distributed shared memory systems

    The 2nd International Conference on Algorithms and Architectures for Parallel Processing, Singapore, 11-13 June 1996
    The doubly-linked list (DLL) protocol provides a memory-efficient, scalable, high-performance, and yet easy-to-implement method to maintain memory coherence in distributed shared memory (DSM) systems. In this paper, the performance analysis of the DLL family of protocols is presented. Theoretically, the DLL protocol with stable owners has the shortest remote memory access latency among the DLL protocol family. According to the simulated performance evaluation, the DLL-S protocol is 65.7% faster than the DDM algorithm for the linear equation solver, and 16.5% faster for the matrix multiplier. From the trend of the performance figures, it is predicted that the improvement in performance due to the DLL-S protocol will be considerably greater when a larger number of processors is used, indicating that the DLL-S protocol is also the most scalable of the protocols tested.

    Creation of Flexible Data Structure for an Emerging Network Control Protocol

    Due to the increasing number of versions of the OpenFlow protocol, it is becoming steadily harder to use isolated data structure support for the protocol. There is a high degree of variability between versions of OpenFlow. Each version of OpenFlow specifies an interface and the collection of abstractions present in a switch that can be manipulated. The focus of this thesis is therefore to use a data structure (Avro) that supports the OpenFlow protocol through the software infrastructure proposed by the Warp development group. Using this, we have added OpenFlow version 1.2 support to the Warp controller. The Warp architecture uses the Avro data structure, which offers advantages such as easy integration of new versions, updates to existing versions, run-time changes, version control, data exchange, and easy schema processing, all of which heavily affect the performance and flexibility of an OpenFlow controller. These factors are compared against other OpenFlow controller architectures such as Floodlight and Ryu. The comparison concludes that Warp is a more flexible architecture than Floodlight and Ryu.
    Engineering Technology, Department o

    Optimization driven multi-hop network design and experimentation: the approach of the FP7 project OPNEX

    The OPNEX project exemplifies system and optimization theory as the foundations for algorithms that provably maximize the capacity of wireless networks. The algorithms, expressed in abstract network models, have been converted to protocols and architectures practically applicable to wireless systems. A validation methodology through experimental protocol evaluation in real network testbeds has been proposed and used. OPNEX uses recent advances in system theoretic network control, including the Back-Pressure principle, max-weight scheduling, utility optimization, congestion control, and the primal-dual method for extracting network algorithms. These approaches exhibited vast potential for achieving high capacity and full exploitation of resources in abstract network models and found their way to reality in high performance architectures developed as a result of the research conducted within OPNEX.

    Multistage Switching Architectures for Software Routers

    Software routers based on personal computer (PC) architectures are becoming an important alternative to proprietary and expensive network devices. However, software routers suffer from many limitations of the PC architecture, including limited bus and central processing unit (CPU) bandwidth, high memory access latency, limited scalability in terms of number of network interface cards, and lack of resilience mechanisms. Multistage PC-based architectures can be an interesting alternative since they permit us to i) increase the performance of single software routers, ii) scale router size, iii) distribute packet manipulation and control functionality, iv) recover from single-component failures, and v) incrementally upgrade router performance. We propose a specific multistage architecture, exploiting PC-based routers as switching elements, to build a high-speed, large-size, scalable, and reliable software router. A small-scale prototype of the multistage router is currently up and running in our labs, and performance evaluation is under wa