    A characteristic particle method for traffic flow simulations on highway networks

    A characteristic particle method for the simulation of first-order macroscopic traffic models on road networks is presented. The approach is based on the method "particleclaw", which solves scalar one-dimensional hyperbolic conservation laws exactly, except for a small error right around shocks. The method is generalized to nonlinear network flows, where particle approximations on the edges are suitably coupled together at the network nodes. Numerical examples demonstrate that the resulting particle method can approximate traffic jams accurately while devoting only a few degrees of freedom to each edge of the network. Comment: 15 pages, 5 figures. Accepted to the proceedings of the Sixth International Workshop Meshfree Methods for PDE 201
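
    For orientation, first-order macroscopic traffic models are commonly written as a scalar conservation law for the vehicle density. The sketch below assumes the Greenshields flux f(rho) = rho * v_max * (1 - rho / rho_max), which the abstract does not specify, and computes two quantities a characteristic particle approach would rely on: the characteristic speed f'(rho) and the Rankine-Hugoniot speed of a shock between two densities. The function names are illustrative and are not taken from particleclaw.

```python
def flux(rho, v_max=1.0, rho_max=1.0):
    """Greenshields flux (an assumed model, not stated in the abstract)."""
    return rho * v_max * (1.0 - rho / rho_max)


def characteristic_speed(rho, v_max=1.0, rho_max=1.0):
    """f'(rho): the speed at which a particle carrying density rho travels."""
    return v_max * (1.0 - 2.0 * rho / rho_max)


def shock_speed(rho_left, rho_right, v_max=1.0, rho_max=1.0):
    """Rankine-Hugoniot speed of a shock separating two distinct densities."""
    jump_in_flux = flux(rho_right, v_max, rho_max) - flux(rho_left, v_max, rho_max)
    return jump_in_flux / (rho_right - rho_left)


# Light traffic: information travels downstream (positive speed).
free_flow = characteristic_speed(0.1)
# Dense traffic: information travels upstream (negative speed).
congested = characteristic_speed(0.9)
# Light traffic running into a jam: the jam front itself moves upstream.
jam_front = shock_speed(0.2, 0.9)
```

    With these normalized values the characteristic speeds have opposite signs in light and dense traffic, and the jam front between densities 0.2 and 0.9 moves slowly backwards.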

    The Potential of the Intel Xeon Phi for Supervised Deep Learning

    Supervised learning of Convolutional Neural Networks (CNNs), also known as supervised Deep Learning, is a computationally demanding process. To find the most suitable parameters of a network for a given application, numerous training sessions are required. Therefore, reducing the training time per session is essential to fully utilize CNNs in practice. While numerous research groups have addressed the training of CNNs using GPUs, little attention has so far been paid to the Intel Xeon Phi coprocessor. In this paper we investigate empirically and theoretically the potential of the Intel Xeon Phi for supervised learning of CNNs. We design and implement a parallelization scheme named CHAOS that exploits both the thread- and SIMD-parallelism of the coprocessor. Our approach is evaluated on the Intel Xeon Phi 7120P using the MNIST dataset of handwritten digits for various thread counts and CNN architectures. Results show a 103.5x speedup when training our large network for 15 epochs using 244 threads, compared to one thread on the coprocessor. Moreover, we develop a performance model and use it to assess our implementation and answer what-if questions. Comment: The 17th IEEE International Conference on High Performance Computing and Communications (HPCC 2015), Aug. 24 - 26, 2015, New York, US
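
    To put the reported speedup in context, parallel efficiency is the speedup divided by the number of workers. The sketch below applies that standard definition to the figures quoted in the abstract. The core count of 61 is an assumption taken from Intel's published specifications for the 7120P (61 cores with 4 hardware threads each), not from the abstract.

```python
def parallel_efficiency(speedup, workers):
    """Standard definition: achieved speedup per worker."""
    return speedup / workers


SPEEDUP = 103.5   # from the abstract: 244 threads versus 1 thread
THREADS = 244     # from the abstract
CORES = 61        # assumption: 7120P core count, not stated in the abstract

per_thread = parallel_efficiency(SPEEDUP, THREADS)
per_core = parallel_efficiency(SPEEDUP, CORES)
```

    Under these assumptions the efficiency is about 0.42 per hardware thread, while the speedup per physical core is about 1.7. That would be consistent with hardware threads on one core sharing execution resources, since the baseline is a single thread rather than a fully loaded core.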

    Synchronization and Redundancy: Implications for Robustness of Neural Learning and Decision Making

    Learning and decision making in the brain are key processes critical to survival, and yet they are implemented by non-ideal biological building blocks that can introduce significant error. We explore quantitatively how the brain might cope with this inherent source of error by taking advantage of two ubiquitous mechanisms, redundancy and synchronization. In particular we consider a neural process whose goal is to learn a decision function by implementing a nonlinear gradient dynamics. The dynamics, however, are assumed to be corrupted by perturbations modeling the error which might be incurred due to limitations of the biology, intrinsic neuronal noise, and imperfect measurements. We show that error, and the associated uncertainty surrounding a learned solution, can be controlled in large part by trading off synchronization strength among multiple redundant neural systems against the noise amplitude. The impact of the coupling between such redundant systems is quantified by the spectrum of the network Laplacian, and we discuss the role of network topology in synchronization and in reducing the effect of noise. A range of situations in which the mechanisms we model arise in brain science are discussed, and we draw attention to experimental evidence suggesting that cortical circuits capable of implementing the computations of interest here can be found on several scales. Finally, simulations comparing theoretical bounds to the relevant empirical quantities show that the theoretical estimates we derive can be tight. Comment: Preprint, accepted for publication in Neural Computation
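
    The trade-off between synchronization strength and noise can be illustrated with a toy simulation. This is a minimal sketch under simplifying assumptions that are not the paper's model: each redundant system performs noisy gradient descent on the quadratic loss x^2 / 2, the systems are coupled all-to-all (the Laplacian of a complete graph), and the dynamics are integrated with an Euler-Maruyama step. All names and parameter values are hypothetical.

```python
import random


def average_disagreement(n_systems, coupling, noise=0.5, dt=0.01,
                         steps=4000, seed=0):
    """Simulate coupled noisy gradient systems and return the time-averaged
    variance of the states around their common mean."""
    rng = random.Random(seed)
    x = [1.0] * n_systems
    total, count = 0.0, 0
    for step in range(steps):
        mean = sum(x) / n_systems
        updated = []
        for xi in x:
            # Gradient of x^2 / 2 is x; all-to-all diffusive coupling
            # sum_j (x_j - x_i) equals n * (mean - x_i).
            drift = -xi + coupling * n_systems * (mean - xi)
            shock = noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
            updated.append(xi + drift * dt + shock)
        x = updated
        if step >= steps // 2:  # discard the first half as burn-in
            m = sum(x) / n_systems
            total += sum((xi - m) ** 2 for xi in x) / n_systems
            count += 1
    return total / count


uncoupled = average_disagreement(n_systems=8, coupling=0.0)
coupled = average_disagreement(n_systems=8, coupling=2.0)
```

    In this toy setting, stronger coupling pulls the redundant systems toward their common mean, so the disagreement among them is much smaller in the coupled run than in the uncoupled one at the same noise amplitude.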

    Desynchronization: Synthesis of asynchronous circuits from synchronous specifications

    Asynchronous implementation techniques, which measure logic delays at run time and activate registers accordingly, are inherently more robust than their synchronous counterparts, which estimate worst-case delays at design time and constrain the clock cycle accordingly. De-synchronization is a new paradigm to automate the design of asynchronous circuits from synchronous specifications, thus permitting widespread adoption of asynchronicity without requiring special design skills or tools. In this paper, we first study different protocols for de-synchronization and formally prove their correctness, using techniques originally developed for distributed deployment of synchronous language specifications. We also provide a taxonomy of existing protocols for asynchronous latch controllers, covering in particular the four-phase handshake protocols devised in the literature for micro-pipelines. We then propose a new controller which exhibits provably maximal concurrency, and analyze the performance of desynchronized circuits with respect to the original synchronous optimized implementation. Finally, we prove the feasibility and effectiveness of our approach by showing its application to a set of real designs, including a complete implementation of the DLX microprocessor architecture.

    Max-plus algebra in the history of discrete event systems

    This paper is a survey of the history of max-plus algebra and its role in the field of discrete event systems during the last three decades. It is based on the authors' perspective, but it covers a wide variety of topics in which max-plus algebra plays a key role.
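
    As a brief reminder of the algebra itself: in max-plus algebra, "addition" is the maximum and "multiplication" is ordinary addition, so a timed event system can be written as x(k+1) = A ⊗ x(k). The sketch below implements that matrix-vector product. The matrix entries are made-up delays chosen only for illustration.

```python
EPS = float("-inf")  # the max-plus "zero": neutral element for max


def maxplus_matvec(A, x):
    """Max-plus product: result[i] = max over j of (A[i][j] + x[j])."""
    return [max(a_ij + x_j for a_ij, x_j in zip(row, x)) for row in A]


# A[i][j] is the (hypothetical) delay from event j to event i;
# EPS would mark a missing connection.
A = [[3, 7],
     [2, 4]]

x0 = [0, 0]                 # initial firing times
x1 = maxplus_matvec(A, x0)  # [7, 4]
x2 = maxplus_matvec(A, x1)  # [11, 9]
```

    Each entry of the result is the earliest time an event can fire once all of its predecessors have fired and their delays have elapsed, which is why this linear-looking recursion describes synchronization in discrete event systems.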

    Towards a new generation of transport services adapted to multimedia application

    A partial order and partial reliability connection (POC, partial order connection) is a transport connection that is allowed to lose certain objects and also to deliver them in an order that may differ from the order in which they were sent. The POC approach establishes a conceptual link between best-effort connectionless protocols and reliable connection-oriented protocols. The POC concept is motivated by the fact that in heterogeneous connectionless networks such as the Internet, transmitted packets may be lost or arrive out of order, which reduces the performance of conventional protocols. Moreover, it is shown that a protocol associated with the transport of a multimedia stream allows a very significant reduction in the use of communication and storage resources, as well as a decrease in the mean transit time. In this article, a temporal extension of POC, named TPOC (timed POC), is introduced. It provides a conceptual framework for taking into account the quality-of-service requirements of distributed multimedia applications. An architecture offering a TPOC service is also introduced and evaluated in the context of MPEG video transport. It is thus demonstrated that POC connections not only bridge the conceptual gap between connectionless and connection-oriented protocols, but also outperform those protocols when multimedia data (such as MPEG video) is transported.