Representation of coherency classes for parallel systems
Some parallel applications do not require a precise
imitation of the behaviour of the physically shared
memory programming model. Consequently, certain
parallel machine architectures have elected to emphasise
different required coherency properties because of
possible efficiency gains. This has led to various definitions
of models of store coherency. These definitions
have not been amenable to detailed analysis and, consequently,
inconsistencies have resulted.
In this paper a unified framework is proposed in
which different models of store coherency are developed
systematically by progressively relaxing the constraints
that they have to satisfy. A demonstration is given of
how formal reasoning can be carried out to compare
different models. Some real-life systems are considered
and a definition of a version of weak coherency is
found to be incomplete.
Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis evolved from earlier schemes that are often limited to controlled
environments to nowadays advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications from video surveillance to human-computer interaction, scientific
milestones in action recognition are achieved more rapidly, eventually leading
to the demise of what used to be good in a short time. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations, and then, navigate into the realm of deep
learning based approaches. We aim to remain objective throughout this survey,
touching upon encouraging improvements as well as inevitable fallbacks, in the
hope of raising fresh questions and motivating new research directions for the
reader.
BarrierPoint: sampled simulation of multi-threaded applications
Sampling is a well-known technique to speed up architectural simulation of long-running workloads while maintaining accurate performance predictions. A number of sampling techniques have recently been developed that extend well-known single-threaded techniques to allow sampled simulation of multi-threaded applications. Unfortunately, prior work is limited to non-synchronizing applications (e.g., server throughput workloads); requires the functional simulation of the entire application using a detailed cache hierarchy, which limits the overall simulation speedup potential; leads to different units of work across different processor architectures, which complicates performance analysis; or requires massive machine resources to achieve reasonable simulation speedups. In this work, we propose BarrierPoint, a sampling methodology to accelerate simulation by leveraging globally synchronizing barriers in multi-threaded applications. BarrierPoint collects microarchitecture-independent code and data signatures to determine the most representative inter-barrier regions, called barrierpoints. BarrierPoint estimates total application execution time (and other performance metrics of interest) through detailed simulation of these barrierpoints only, leading to substantial simulation speedups. Barrierpoints can be simulated in parallel, use fewer simulation resources, and define fixed units of work to be used in performance comparisons across processor architectures. Our evaluation of BarrierPoint using NPB and Parsec benchmarks reports average simulation speedups of 24.7x (and up to 866.6x) with an average simulation error of 0.9% (2.9% at most). On average, BarrierPoint reduces the number of simulation machine resources needed by 78x.
Frequency-Domain Stochastic Modeling of Stationary Bivariate or Complex-Valued Signals
There are three equivalent ways of representing two jointly observed
real-valued signals: as a bivariate vector signal, as a single complex-valued
signal, or as two analytic signals known as the rotary components. Each
representation has unique advantages depending on the system of interest and
the application goals. In this paper we provide a joint framework for all three
representations in the context of frequency-domain stochastic modeling. This
framework allows us to extend many established statistical procedures for
bivariate vector time series to complex-valued and rotary representations.
These include procedures for parametrically modeling signal coherence,
estimating model parameters using the Whittle likelihood, performing
semi-parametric modeling, and choosing between classes of nested models using
model choice. We also provide a new method of testing for impropriety in
complex-valued signals, which tests for noncircular or anisotropic second-order
statistical structure when the signal is represented in the complex plane.
Finally, we demonstrate the usefulness of our methodology in capturing the
anisotropic structure of signals observed from fluid dynamic simulations of
turbulence.
Comment: To appear in IEEE Transactions on Signal Processing
Temporal structure in neuronal activity during working memory in Macaque parietal cortex
A number of cortical structures are reported to have elevated single unit
firing rates sustained throughout the memory period of a working memory task.
How the nervous system forms and maintains these memories is unknown but
reverberating neuronal network activity is thought to be important. We studied
the temporal structure of single unit (SU) activity and simultaneously recorded
local field potential (LFP) activity from area LIP in the inferior parietal
lobe of two awake macaques during a memory-saccade task. Using multitaper
techniques for spectral analysis, which play an important role in obtaining the
present results, we find elevations in spectral power in a 50–90 Hz (gamma)
frequency band during the memory period in both SU and LFP activity. The
activity is tuned to the direction of the saccade providing evidence for
temporal structure that codes for movement plans during working memory. We also
find SU and LFP activity are coherent during the memory period in the 50–90 Hz
gamma band and no consistent relation is present during simple fixation.
Finally, we find organized LFP activity in a 15–25 Hz frequency band that may
be related to movement execution and preparatory aspects of the task. Neuronal
activity could be used to control a neural prosthesis but SU activity can be
hard to isolate with cortical implants. As the LFP is easier to acquire than SU
activity, our finding of rich temporal structure in LFP activity related to
movement planning and execution may accelerate the development of this medical
application.
Comment: Originally submitted to the neuro-sys archive which was never
publicly announced (was 0005002