Propagation and Decay of Injected One-Off Delays on Clusters: A Case Study
Analytic, first-principles performance modeling of distributed-memory
applications is difficult due to a wide spectrum of random disturbances caused
by the application and the system. These disturbances (commonly called "noise")
destroy the assumptions of regularity that one usually employs when
constructing simple analytic models. Despite numerous efforts to quantify,
categorize, and reduce such effects, a comprehensive quantitative understanding
of their performance impact is not available, especially for long delays that
have global consequences for the parallel application. In this work, we
investigate various traces collected from synthetic benchmarks that mimic real
applications on simulated and real message-passing systems in order to pinpoint
the mechanisms behind delay propagation. We analyze how the
propagation speed of idle waves emanating from injected delays depends on
the execution and communication properties of the application, and study how such
delays decay under increased noise levels and how they interact with each
other. We also show how fine-grained noise can make a system immune to the
adverse effects of propagating idle waves. Our results contribute to a better
understanding of the collective phenomena that manifest themselves in
distributed-memory parallel applications.
Comment: 10 pages, 9 figures; title change
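The idle-wave picture described in this abstract can be illustrated with a toy model. The sketch below is not the authors' benchmark or simulator; it assumes a noise-free open chain of ranks, zero communication time, and a blocking exchange with both nearest neighbours after every iteration. The function name and all parameters (`simulate_idle_wave`, `n_ranks`, `t_exec`, `delay`, `delayed_rank`) are hypothetical.

```python
def simulate_idle_wave(n_ranks=9, n_iters=6, t_exec=1.0, delay=5.0, delayed_rank=4):
    """Toy model: return idle[k][i], the time rank i waits after iteration k.

    Assumptions (not taken from the paper): open chain of ranks, zero
    communication time, and each rank blocks until both neighbours have
    finished the same iteration. A one-off delay is injected on one rank
    in iteration 0.
    """
    start = [0.0] * n_ranks
    idle = []
    for k in range(n_iters):
        # Every rank computes for t_exec; the delayed rank takes longer once.
        finish = [
            start[i] + t_exec + (delay if (k == 0 and i == delayed_rank) else 0.0)
            for i in range(n_ranks)
        ]
        # The next iteration starts when the rank and its neighbours are done.
        next_start = []
        for i in range(n_ranks):
            neighbours = [j for j in (i - 1, i, i + 1) if 0 <= j < n_ranks]
            next_start.append(max(finish[j] for j in neighbours))
        idle.append([next_start[i] - finish[i] for i in range(n_ranks)])
        start = next_start
    return idle


if __name__ == "__main__":
    # One row per iteration; '#' marks a rank that is idle in that iteration.
    for k, row in enumerate(simulate_idle_wave()):
        print(k, "".join("#" if w > 0 else "." for w in row))
```

In this simplified setting the idle period moves outward by one rank per iteration and does not decay; the decay under noise and the dependence on communication properties studied in the paper are outside the scope of the sketch.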
Characterizing Attention Cascades in WhatsApp Groups
An important political and social phenomenon discussed in several countries,
like India and Brazil, is the use of WhatsApp to spread false or misleading
content. However, little is known about the information dissemination process
in WhatsApp groups. Attention affects the dissemination of information in
WhatsApp groups, determining what topics or subjects are more attractive to
participants of a group. In this paper, we characterize and analyze how
attention propagates among the participants of a WhatsApp group. An attention
cascade begins when a user asserts a topic in a message to the group, which
could include written text, photos, or links to articles online. Others then
propagate the information by responding to it. We analyzed attention cascades
in more than 1.7 million messages posted in 120 groups over one year. Our
analysis focused on the structural and temporal evolution of attention cascades
as well as on the behavior of users that participate in them. We found specific
characteristics in cascades associated with groups that discuss political
subjects and false information. For instance, we observe that cascades with
false information tend to be deeper, reach more users, and last longer in
political groups than in non-political groups.
Comment: Accepted as a full paper at the 11th International ACM Web Science
Conference (WebSci 2019). Please cite the WebSci versio
Nonlinear brain dynamics as macroscopic manifestation of underlying many-body field dynamics
Neural activity patterns related to behavior occur at many scales in time and
space from the atomic and molecular to the whole brain. Here we explore the
feasibility of interpreting neurophysiological data in the context of many-body
physics by using tools that physicists have devised to analyze comparable
hierarchies in other fields of science. We focus on a mesoscopic level that
offers a multi-step pathway between the microscopic functions of neurons and
the macroscopic functions of brain systems revealed by hemodynamic imaging. We
use electroencephalographic (EEG) records collected from high-density electrode
arrays fixed on the epidural surfaces of primary sensory and limbic areas in
rabbits and cats trained to discriminate conditioned stimuli (CS) in the
various modalities. High temporal resolution of EEG signals with the Hilbert
transform gives evidence for diverse intermittent spatial patterns of amplitude
(AM) and phase modulations (PM) of carrier waves that repeatedly re-synchronize
in the beta and gamma ranges at near zero time lags over long distances. The
dominant mechanism for neural interactions by axodendritic synaptic
transmission should impose distance-dependent delays on the EEG oscillations
owing to finite propagation velocities. It does not. EEGs instead show evidence
for anomalous dispersion: the existence in neural populations of a low velocity
range of information and energy transfers, and a high velocity range of the
spread of phase transitions. This distinction labels the phenomenon but does
not explain it. In this report we explore the analysis of these phenomena using
concepts of energy dissipation, the maintenance by cortex of multiple ground
states corresponding to AM patterns, and the exclusive selection by spontaneous
breakdown of symmetry (SBS) of single states in sequences.
Comment: 31 page
Design of multimedia processor based on metric computation
Media-processing applications, such as signal processing, 2D and 3D graphics
rendering, and image compression, are the dominant workloads in many embedded
systems today. The real-time constraints of these media applications place
taxing demands on processor performance, which must be delivered at low cost,
low power, and reduced design delay. To meet these challenges, a fast and efficient
strategy is to upgrade a low-cost general-purpose processor core. This
approach is based on customizing a general RISC processor core
according to the requirements of the target multimedia application. Thus, if the extra
cost is justified, the general-purpose processor (GPP) core can be reinforced with
instruction level coprocessors, coarse grain dedicated hardware, ad hoc
memories or new GPP cores. In this way the final design solution is tailored to
the application requirements. The proposed approach is based on three main
steps: the first one is the analysis of the targeted application using
efficient metrics. The second step is the selection of the appropriate
architecture template according to the first step results and recommendations.
The third step is the architecture generation. The approach is evaluated
on various image and video algorithms, demonstrating its feasibility.
CampProf: A Visual Performance Analysis Tool for Memory Bound GPU Kernels
Current GPU tools and performance models provide some common architectural insights that guide programmers toward writing optimal code. We challenge these performance models by modeling and analyzing a lesser-known but very severe performance pitfall in NVIDIA GPUs, called 'Partition Camping'. Partition Camping is caused by memory accesses that are skewed towards a subset of the available memory partitions, which may degrade the performance of memory-bound CUDA kernels by up to a factor of seven. No existing tool can detect the partition camping effect in CUDA kernels.
We complement the existing tools by developing 'CampProf', a spreadsheet-based visual analysis tool that detects the degree to which any memory-bound kernel suffers from partition camping. In addition, CampProf predicts the kernel's performance at all execution configurations if its performance parameters are known at any one of them. To demonstrate the utility of CampProf, we analyze three different applications using our tool and show how it can be used to discover partition camping. We also demonstrate how CampProf can be used to monitor the performance improvements in the kernels as the partition camping effect is being removed.
The performance model that drives CampProf was developed by applying multiple linear regression techniques over a set of specific micro-benchmarks that simulated the partition camping behavior. Our results show that the geometric mean of errors in our prediction model is within 12% of the actual execution times. In summary, CampProf is a new, accurate, and easy-to-use tool that can be used in conjunction with the existing tools to analyze and improve the overall performance of memory-bound CUDA kernels
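The access skew that the abstract calls partition camping can be made concrete with a small address-mapping sketch. This is not CampProf's regression model; it only counts how a list of byte addresses spreads over memory partitions. The partition count (8) and partition width (256 bytes) are assumptions chosen for illustration, and `partition_histogram` and `camping_factor` are hypothetical helper names.

```python
from collections import Counter


def partition_histogram(addresses, n_partitions=8, partition_width=256):
    """Count accesses per memory partition.

    Assumes (for illustration only) that consecutive chunks of
    `partition_width` bytes are assigned to partitions round-robin.
    """
    counts = Counter((addr // partition_width) % n_partitions for addr in addresses)
    return [counts.get(p, 0) for p in range(n_partitions)]


def camping_factor(hist):
    """Ratio of the busiest partition's load to the mean load.

    1.0 means perfectly even access; len(hist) means every access
    lands in a single partition.
    """
    total = sum(hist)
    if total == 0:
        raise ValueError("no accesses recorded")
    return max(hist) * len(hist) / total


if __name__ == "__main__":
    # 64 hypothetical thread blocks, each touching one address.
    spread = [block * 256 for block in range(64)]       # stride = partition width
    camped = [block * 256 * 8 for block in range(64)]   # stride = width * partitions
    print(partition_histogram(spread), camping_factor(partition_histogram(spread)))
    print(partition_histogram(camped), camping_factor(partition_histogram(camped)))
```

The factor computed here is only a measure of skew in the access pattern; it is not a prediction of the slowdown, which the paper obtains from micro-benchmarks and multiple linear regression.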
Exploiting programmable architectures for WiFi/ZigBee inter-technology cooperation
The increasing complexity of wireless standards has shown that protocols cannot be designed once for all possible deployments, especially when unpredictable and mutating interference situations are present due to the coexistence of heterogeneous technologies. As such, flexibility and (re)programmability of wireless devices is crucial in the emerging scenarios of technology proliferation and unpredictable interference conditions.
In this paper, we focus on improving the coexistence performance of WiFi and ZigBee networks by exploiting novel programmable architectures of wireless devices that support run-time modifications of medium access operations. Unlike software-defined radio (SDR) platforms, in which every function is programmed from scratch, our programmable architectures are based on a clear decoupling between elementary commands (hard-coded into the devices) and programmable protocol logic (injected into the devices), according to which command execution is scheduled.
Our contribution is two-fold: first, we designed and implemented a cross-technology time division multiple access (TDMA) scheme devised to provide a global synchronization signal and allocate alternating channel intervals to WiFi and ZigBee programmable nodes; second, we used the OMF control framework to define an interference detection and adaptation strategy that in principle could work in independent and autonomous networks. Experimental results prove the benefits of the envisioned solution
Problems in characterizing barrier performance
The barrier is a synchronization construct that is useful for separating a parallel program into parallel sections that are executed in sequence. The completion of a barrier requires cooperation among all executing processes. This requirement not only introduces the wait-for-the-slowest-process delay, which is inherent in the definition of the synchronization, but also has implications for the efficient implementation and measurement of barrier performance in different systems. Types of barrier implementation and their relationship to different multiprocessor environments are described. Then the problem of measuring the performance of barrier implementations on specific machine architectures is discussed. The fact that barrier synchronization requires the cooperation of all processes makes the problem of performance measurement similarly global. Making non-intrusive measurements of sufficient accuracy can be tricky on systems offering only rudimentary measurement tools.
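The wait-for-the-slowest-process delay mentioned in this abstract can be written down directly. The sketch below is an idealized model, assuming a barrier with zero implementation overhead that releases all processes the moment the last one arrives; `barrier_waits` is a hypothetical helper, not something taken from the paper.

```python
def barrier_waits(arrival_times):
    """Idealized barrier: everyone is released when the slowest process arrives.

    Returns the release time and the per-process wait. Real barriers add
    implementation overhead on top of this, which is what makes them hard
    to measure; that overhead is deliberately left out here.
    """
    if not arrival_times:
        raise ValueError("at least one process is required")
    release = max(arrival_times)
    return release, [release - t for t in arrival_times]


if __name__ == "__main__":
    release, waits = barrier_waits([1.0, 3.0, 2.0])
    print("release at", release, "waits", waits)
```

Because the release time depends on the maximum over all processes, any measurement of a single process's wait already mixes in the behaviour of every other process, which is the global character of the measurement problem that the abstract points out.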
Neural Dynamics of Autistic Behaviors: Cognitive, Emotional, and Timing Substrates
What brain mechanisms underlie autism and how do they give rise to autistic behavioral symptoms? This article describes a neural model, called the iSTART model, which proposes how cognitive, emotional, timing, and motor processes may interact to create and perpetuate autistic symptoms. These model processes were originally developed to explain data concerning how the brain controls normal behaviors. The iSTART model shows how autistic behavioral symptoms may arise from prescribed breakdowns in these brain processes.
Air Force Office of Scientific Research (F49620-01-1-0397); Office of Naval Research (N00014-01-1-0624
- …