
    Representation of coherency classes for parallel systems

    Some parallel applications do not require a precise imitation of the behaviour of the physically shared memory programming model. Consequently, certain parallel machine architectures have elected to emphasise different required coherency properties because of possible efficiency gains. This has led to various definitions of models of store coherency. These definitions have not been amenable to detailed analysis and, consequently, inconsistencies have resulted. In this paper a unified framework is proposed in which different models of store coherency are developed systematically by progressively relaxing the constraints that they have to satisfy. A demonstration is given of how formal reasoning can be carried out to compare different models. Some real-life systems are considered and a definition of a version of weak coherency is found to be incomplete.

    Going Deeper into Action Recognition: A Survey

    Understanding human actions in visual data is tied to advances in complementary research areas including object recognition, human dynamics, domain adaptation and semantic segmentation. Over the last decade, human action analysis evolved from earlier schemes that are often limited to controlled environments to today's advanced solutions that can learn from millions of videos and apply to almost all daily activities. Given the broad range of applications from video surveillance to human-computer interaction, scientific milestones in action recognition are achieved more rapidly, eventually leading to the demise of what used to be good in a short time. This motivated us to provide a comprehensive review of the notable steps taken towards recognizing human actions. To this end, we start our discussion with the pioneering methods that use handcrafted representations, and then navigate into the realm of deep learning based approaches. We aim to remain objective throughout this survey, touching upon encouraging improvements as well as inevitable fallbacks, in the hope of raising fresh questions and motivating new research directions for the reader.

    BarrierPoint: sampled simulation of multi-threaded applications

    Sampling is a well-known technique to speed up architectural simulation of long-running workloads while maintaining accurate performance predictions. A number of sampling techniques have recently been developed that extend well-known single-threaded techniques to allow sampled simulation of multi-threaded applications. Unfortunately, prior work is limited to non-synchronizing applications (e.g., server throughput workloads); requires the functional simulation of the entire application using a detailed cache hierarchy which limits the overall simulation speedup potential; leads to different units of work across different processor architectures which complicates performance analysis; or requires massive machine resources to achieve reasonable simulation speedups. In this work, we propose BarrierPoint, a sampling methodology to accelerate simulation by leveraging globally synchronizing barriers in multi-threaded applications. BarrierPoint collects microarchitecture-independent code and data signatures to determine the most representative inter-barrier regions, called barrierpoints. BarrierPoint estimates total application execution time (and other performance metrics of interest) through detailed simulation of these barrierpoints only, leading to substantial simulation speedups. Barrierpoints can be simulated in parallel, use fewer simulation resources, and define fixed units of work to be used in performance comparisons across processor architectures. Our evaluation of BarrierPoint using NPB and Parsec benchmarks reports average simulation speedups of 24.7x (and up to 866.6x) with an average simulation error of 0.9% and 2.9% at most. On average, BarrierPoint reduces the number of simulation machine resources needed by 78x.

    Frequency-Domain Stochastic Modeling of Stationary Bivariate or Complex-Valued Signals

    There are three equivalent ways of representing two jointly observed real-valued signals: as a bivariate vector signal, as a single complex-valued signal, or as two analytic signals known as the rotary components. Each representation has unique advantages depending on the system of interest and the application goals. In this paper we provide a joint framework for all three representations in the context of frequency-domain stochastic modeling. This framework allows us to extend many established statistical procedures for bivariate vector time series to complex-valued and rotary representations. These include procedures for parametrically modeling signal coherence, estimating model parameters using the Whittle likelihood, performing semi-parametric modeling, and choosing between classes of nested models using model choice. We also provide a new method of testing for impropriety in complex-valued signals, which tests for noncircular or anisotropic second-order statistical structure when the signal is represented in the complex plane. Finally, we demonstrate the usefulness of our methodology in capturing the anisotropic structure of signals observed from fluid dynamic simulations of turbulence. Comment: To appear in IEEE Transactions on Signal Processing

    Temporal structure in neuronal activity during working memory in Macaque parietal cortex

    A number of cortical structures are reported to have elevated single unit firing rates sustained throughout the memory period of a working memory task. How the nervous system forms and maintains these memories is unknown but reverberating neuronal network activity is thought to be important. We studied the temporal structure of single unit (SU) activity and simultaneously recorded local field potential (LFP) activity from area LIP in the inferior parietal lobe of two awake macaques during a memory-saccade task. Using multitaper techniques for spectral analysis, which play an important role in obtaining the present results, we find elevations in spectral power in a 50–90 Hz (gamma) frequency band during the memory period in both SU and LFP activity. The activity is tuned to the direction of the saccade, providing evidence for temporal structure that codes for movement plans during working memory. We also find SU and LFP activity are coherent during the memory period in the 50–90 Hz gamma band and no consistent relation is present during simple fixation. Finally, we find organized LFP activity in a 15–25 Hz frequency band that may be related to movement execution and preparatory aspects of the task. Neuronal activity could be used to control a neural prosthesis but SU activity can be hard to isolate with cortical implants. As the LFP is easier to acquire than SU activity, our finding of rich temporal structure in LFP activity related to movement planning and execution may accelerate the development of this medical application. Comment: Originally submitted to the neuro-sys archive which was never publicly announced (was 0005002