4,378 research outputs found

    Predicting coherence communication by tracking synchronization points at run time

    Predicting the target processors to which a coherence request must be delivered can improve miss-handling latency in shared memory systems. In directory coherence protocols, directly communicating with the predicted processors avoids costly indirection through the directory. In snooping protocols, prediction relaxes the high bandwidth requirements by replacing broadcast with multicast. In this work, we propose a new run-time coherence target prediction scheme that exploits the inherent correlation between synchronization points in a program and coherence communication. Our workload-driven analysis shows that by exposing synchronization points to hardware and tracking them at run time, we can simply and effectively track stable and repetitive communication patterns. Based on this observation, we build a predictor that improves the miss latency of a directory protocol by 13%. Compared with existing address- and instruction-based prediction techniques, our predictor achieves comparable performance with substantially lower power and storage overheads.
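
    The abstract does not describe the predictor's organization, so the following is only a minimal sketch under assumed names: a small table indexed by a synchronization-point identifier that remembers which cores were communicated with during the previous dynamic instance of that point, and predicts the same set as multicast targets the next time the point is reached. Mispredictions would simply fall back to the directory.

```python
# Minimal sketch (assumed names, not the paper's actual design): a coherence
# target predictor indexed by synchronization-point ID. The sharers observed
# between two dynamic instances of the same sync point are recorded and then
# predicted as the multicast targets the next time that sync point is reached.

from collections import defaultdict

class SyncPointTargetPredictor:
    def __init__(self):
        self.learned_targets = defaultdict(set)  # sync ID -> cores seen last epoch
        self.current_sync = None
        self.observed = set()

    def on_sync_point(self, sync_id):
        """Hypothetical hook: a thread reached a synchronization point."""
        if self.current_sync is not None:
            # Commit what was observed during the epoch that just ended.
            self.learned_targets[self.current_sync] = set(self.observed)
        self.current_sync = sync_id
        self.observed = set()

    def predict(self):
        """Predicted multicast targets for a miss in the current epoch."""
        return self.learned_targets.get(self.current_sync, set())

    def train(self, actual_sharers):
        """Record the cores that actually had to be contacted for a miss."""
        self.observed.update(actual_sharers)
```
    Indexing by synchronization point rather than by address or instruction is what keeps such a table small, which is presumably where the storage and power savings claimed above would come from.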

    Interacting Turing-Hopf Instabilities Drive Symmetry-Breaking Transitions in a Mean-Field Model of the Cortex: A Mechanism for the Slow Oscillation

    Electrical recordings of brain activity during the transition from wake to anesthetic coma show temporal and spectral alterations that are correlated with gross changes in the underlying brain state. Entry into anesthetic unconsciousness is signposted by the emergence of large, slow oscillations of electrical activity (≲1 Hz) similar to the slow waves observed in natural sleep. Here we present a two-dimensional mean-field model of the cortex in which slow spatiotemporal oscillations arise spontaneously through a Turing (spatial) symmetry-breaking bifurcation that is modulated by a Hopf (temporal) instability. In our model, populations of neurons are densely interlinked by chemical synapses, and by interneuronal gap junctions represented as an inhibitory diffusive coupling. To demonstrate cortical behavior over a wide range of distinct brain states, we explore model dynamics in the vicinity of a general-anesthetic-induced transition from “wake” to “coma.” In this region, the system is poised at a codimension-2 point where competing Turing and Hopf instabilities coexist. We model anesthesia as a moderate reduction in inhibitory diffusion, paired with an increase in inhibitory postsynaptic response, producing a coma state that is characterized by emergent low-frequency oscillations whose dynamics is chaotic in time and space. The effect of long-range axonal white-matter connectivity is probed with the inclusion of a single idealized point-to-point connection. We find that the additional excitation from the long-range connection can provoke seizurelike bursts of cortical activity when inhibitory diffusion is weak, but has little impact on an active cortex. Our proposed dynamic mechanism for the origin of anesthetic slow waves complements—and contrasts with—conventional explanations that require cyclic modulation of ion-channel conductances. We postulate that a similar bifurcation mechanism might underpin the slow waves of natural sleep and comment on the possible consequences of chaotic dynamics for memory processing and learning.
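
    The abstract does not reproduce the model equations, so purely as background, the block below recalls the textbook linear-stability picture for a generic two-component reaction-diffusion system, showing how separate Turing (spatial) and Hopf (temporal) thresholds can meet at a codimension-2 point. This is an illustration of the mechanism, not the authors' cortical model.

```latex
% Generic two-component reaction--diffusion system (illustration only,
% not the cortical mean-field equations of the paper):
\begin{align*}
  \partial_t u &= f(u,v) + D_u \nabla^2 u, &
  \partial_t v &= g(u,v) + D_v \nabla^2 v.
\end{align*}
% Linearizing about a homogeneous steady state with Jacobian
% J = \begin{pmatrix} f_u & f_v \\ g_u & g_v \end{pmatrix}:
\begin{align*}
  \text{Hopf (temporal, } k = 0\text{):}\quad
    & \operatorname{tr} J > 0, \quad \det J > 0; \\
  \text{Turing (spatial, } k \neq 0\text{):}\quad
    & \operatorname{tr} J < 0, \quad \det J > 0, \quad
      D_v f_u + D_u g_v > 2\sqrt{D_u D_v \det J}.
\end{align*}
% A codimension-2 Turing--Hopf point is where both thresholds are crossed at
% once; the paper's "wake" to "coma" transition is explored near such a point.
```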

    Synchronization-Point Driven Resource Management in Chip Multiprocessors.

    With the proliferation of Chip Multiprocessors (CMPs), shared memory multi-threaded programs are expanding fast in every application domain. These programs exhibit execution characteristics that go beyond those observed in single-threaded programs, mainly due to data sharing and synchronization. To ensure that next generation CMPs will perform well on such anticipated workloads, it is vital to understand how these programs and architectures interact, and to exploit the unique opportunities presented. This thesis examines the time-varying execution characteristics of shared memory workloads in conjunction with the synchronization points that exist in the programs. The main hypothesis is that the type, the position, and the repetitive execution of synchronization constructs can be exploited to unfold important execution phases and enable new optimization opportunities. The research provides a simple application-driven approach for predicting program behavior and effectively driving dynamic performance optimization and resource management actions in future CMPs. In the first part of this thesis, I show how synchronization points relate to various program-wide periodic behaviors. Based on these observations, I develop a framework where user-level synchronization primitives are exposed to the hardware and monitored to detect program phases and guide dynamic adaptation. Through workload-driven evaluation, I demonstrate the effectiveness of the framework in improving the performance and power of on-chip interconnects. The second part of the thesis explores inter-thread communication behaviors in depth. I show that although synchronization points under the shared memory model do not expose any communication details, they indicate well the points where coherence communication patterns change or repeat. By leveraging this property, I design a synchronization-point-based coherence predictor that uncovers communication patterns with high accuracy while consuming significantly fewer hardware resources than existing predictors. In the last part, I investigate the underlying reasons causing threads to wait at synchronization points, wasting resources. I show that these reasons can vary even across different program phases, and that existing critical-path predictors can become ineffective under certain conditions. I then present a new scheme that improves predictability by incorporating history information from previous synchronization points. The new design is robust and can amortize run-time imbalances to improve the system's performance and/or energy.
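
    The last part of the thesis, predicting which thread will gate progress at a synchronization point using history from previous instances, can be pictured with the small sketch below. The structure and names are hypothetical and only illustrate the idea of carrying arrival history across repeated executions of the same barrier; this is not the thesis' actual scheme.

```python
# Illustrative sketch (assumed design, not the thesis' scheme): a history-based
# predictor of the "critical" (last-arriving) thread at each barrier. Non-critical
# threads could then be slowed down (e.g., via DVFS) without stretching the
# critical path, absorbing run-time imbalance as described above.

from collections import Counter, defaultdict, deque

class BarrierCriticalThreadPredictor:
    def __init__(self, history_len=4):
        # barrier ID -> IDs of the threads that arrived last recently
        self.history = defaultdict(lambda: deque(maxlen=history_len))

    def record(self, barrier_id, arrival_order):
        """arrival_order: thread IDs in the order they reached the barrier."""
        self.history[barrier_id].append(arrival_order[-1])

    def predict_critical(self, barrier_id):
        """Thread most likely to arrive last at this barrier next time."""
        past_last = self.history[barrier_id]
        if not past_last:
            return None
        return Counter(past_last).most_common(1)[0][0]
```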

    Getting back on the beat: links between auditory-motor integration and precise auditory processing at fast time scales

    The auditory system is unique in its ability to precisely detect the timing of perceptual events and use this information to update motor plans, a skill crucial for language. The characteristics of the auditory system which enable this temporal precision, however, are only beginning to be understood. Previous work has shown that participants who can tap consistently to a metronome have neural responses to sound with greater phase coherence from trial to trial. We hypothesized that this relationship is driven by a link between the updating of motor output by auditory feedback and neural precision. Moreover, we hypothesized that neural phase coherence at both fast time scales (reflecting subcortical processing) and slow time scales (reflecting cortical processing) would be linked to auditory-motor timing integration. To test these hypotheses, we asked participants to synchronize to a pacing stimulus and then changed either the tempo or the timing of the stimulus to assess whether they could rapidly adapt. Participants who could rapidly and accurately resume synchronization had neural responses to sound with greater phase coherence. However, this precise timing was limited to the time scale of 10 ms (100 Hz) or faster; neural phase coherence at slower time scales was unrelated to performance on this task.
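
    The trial-to-trial phase coherence referred to above is commonly quantified as inter-trial phase coherence. The sketch below shows that computation for a single frequency, assuming time-locked epochs in an array of shape (trials, samples); the study's actual pipeline and parameters are not given in the abstract, so the names and values here are illustrative.

```python
import numpy as np

def inter_trial_phase_coherence(trials, fs, freq):
    """Phase coherence across trials at one frequency.

    trials: array of shape (n_trials, n_samples), time-locked epochs
    fs: sampling rate in Hz; freq: frequency of interest in Hz
    Returns a value in [0, 1]; 1 means identical phase on every trial.
    """
    n_trials, n_samples = trials.shape
    t = np.arange(n_samples) / fs
    # Single-frequency Fourier coefficient per trial; only its phase is used.
    coeffs = trials @ np.exp(-2j * np.pi * freq * t)
    unit_phasors = coeffs / np.abs(coeffs)
    return np.abs(unit_phasors.mean())

# Example at 100 Hz, the "fast time scale" highlighted in the abstract.
rng = np.random.default_rng(0)
fake_trials = rng.standard_normal((50, 1024))     # placeholder data, not real EEG
print(inter_trial_phase_coherence(fake_trials, fs=2000.0, freq=100.0))
```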

    Brain interaction during cooperation: Evaluating local properties of multiple-brain network

    Interaction between subjects is at the core of most human activities, which is why missed goals are often due to a lack of coordination rather than to individual failure. While there are various subjective and objective measures to assess the mental effort a subject must invest as a situation becomes harder (that is, mental workload), defining an objective measure based on whether and how team members are interacting is not so straightforward. In this study, behavioral, subjective and synchronized electroencephalographic data were collected from couples involved in a cooperative task to describe the relationship between task difficulty and team coordination, in the sense of interaction aimed at cooperatively performing the assignment. Multiple-brain connectivity analysis provided information about the whole interacting system. The results showed that averaged local properties of the multiple-brain network were affected by task difficulty. In particular, node strength changed significantly with task difficulty and clustering coefficients correlated strongly with the workload itself: a higher workload corresponded to lower clustering values over the central and parietal brain areas. These results have been interpreted as a less efficient organization of the network when the subjects' activities, under high workload, were less coordinated.
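
    The two local properties reported, node strength and the clustering coefficient, are standard weighted-graph measures. The sketch below computes them from a symmetric connectivity matrix using the widely used Onnela et al. weighted clustering definition; the study's exact estimator may differ, so this is only an illustration of the quantities involved.

```python
import numpy as np

def node_strength(W):
    """Strength of each node: the sum of its connection weights."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    return W.sum(axis=1)

def weighted_clustering(W):
    """Per-node weighted clustering coefficient (Onnela et al. definition)."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    Wn = W / W.max()                         # normalize weights to [0, 1]
    cube = np.linalg.matrix_power(np.cbrt(Wn), 3)
    triangles = np.diagonal(cube)            # weighted triangle intensity per node
    k = (W > 0).sum(axis=1)                  # node degree
    denom = k * (k - 1)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(denom > 0, triangles / denom, 0.0)

# Example on a tiny symmetric connectivity matrix (e.g., between EEG channels).
W = np.array([[0.0, 0.8, 0.2],
              [0.8, 0.0, 0.5],
              [0.2, 0.5, 0.0]])
print(node_strength(W), weighted_clustering(W))
```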

    Visual cues in musical synchronisation

    Although music performance is generally thought of as an auditory activity in the Western tradition, the presence of continuous visual information in live music contributes to the cohesiveness of music ensembles, which presents an interesting psychological phenomenon in which audio and visual cues are presumably integrated. In order to investigate how auditory and visual sensory information are combined in the basic process of synchronising movements with music, this thesis focuses on both musicians and nonmusicians as they respond to two sources of visual information common to ensembles: the conductor, and the ancillary movements (movements that do not directly create sound; e.g. body sway or head nods) of co-performers. These visual cues were hypothesized to improve the timing of intentional synchronous action (matching a musical pulse), as well as to increase the synchrony of emergent ancillary movements between participant and stimulus. The visual cues were tested in controlled renderings of ensemble music arrangements, and were derived from real, biological motion. All three experiments employed the same basic synchronisation task: participants drummed along to the pulse of tempo-changing music while observing various visual cues. For each experiment, participants' drum timing and upper-body movements were recorded as they completed the synchronisation task. The analyses used to quantify drum timing and ancillary movements came from two theoretical approaches to movement timing and entrainment: information processing and dynamical systems. Overall, this thesis shows that basic musical timing is a common ability that is facilitated by visual cues in certain contexts, and that emergent ancillary movements and intentional synchronous movements in combination may best explain musical timing and synchronisation.
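
    The drum-timing analyses mentioned above typically start from tap-to-beat asynchronies and their circular statistics (mean relative phase and resultant vector length). The sketch below illustrates those basic quantities for a constant-tempo pulse; it is a simplification (the thesis uses tempo-changing music) and the function names are hypothetical, not code from the thesis.

```python
import numpy as np

def tap_beat_asynchronies(tap_times, beat_times):
    """Signed asynchrony (in seconds) between each beat and its nearest tap."""
    taps = np.asarray(tap_times)
    return np.array([taps[np.argmin(np.abs(taps - b))] - b for b in beat_times])

def circular_synchrony(asynchronies, beat_period):
    """Mean relative phase (radians) and resultant vector length R.

    R close to 1 means asynchronies cluster tightly on the beat cycle,
    i.e., consistent synchronisation; R near 0 means no stable relationship.
    """
    phases = 2 * np.pi * np.asarray(asynchronies) / beat_period
    vector = np.exp(1j * phases).mean()
    return np.angle(vector), np.abs(vector)

# Example: taps that slightly anticipate a steady 120 BPM (0.5 s) pulse.
beats = np.arange(0.0, 10.0, 0.5)
taps = beats - 0.03 + 0.01 * np.random.default_rng(1).standard_normal(beats.size)
print(circular_synchrony(tap_beat_asynchronies(taps, beats), beat_period=0.5))
```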

    Multicore-Aware Reuse Distance Analysis

    This paper presents and validates methods to extend reuse distance analysis of application locality characteristics to shared-memory multicore platforms by accounting for invalidation-based cache coherence and inter-core cache sharing. Existing reuse distance analysis methods track the number of distinct addresses referenced between reuses of the same address by a given thread, but do not model the effects of data references by other threads. This paper shows several methods to keep reuse stacks consistent so that they account for invalidations and cache sharing, either as references arise in a simulated execution or at synchronization points. These methods are evaluated against a Simics-based coherent cache simulator running several OpenMP and transaction-based benchmarks. The results show that adding multicore-awareness substantially improves the ability of reuse distance analysis to model cache behavior, reducing the error in miss ratio prediction (relative to cache simulation for a specific cache size) by an average of 69% for per-core caches and an average of 84% for shared caches.
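
    For reference, a reuse (stack) distance is the number of distinct addresses touched between two references to the same address. The paper's contribution is keeping such stacks consistent across cores; the sketch below shows only plain per-thread reuse-distance tracking with a simple invalidation hook standing in for that idea, and is not the paper's mechanism.

```python
from collections import OrderedDict

class ReuseDistanceTracker:
    """Per-thread reuse (stack) distance with a toy invalidation hook."""

    def __init__(self):
        self.stack = OrderedDict()   # address -> None, most recently used last

    def reference(self, addr):
        """Return the reuse distance of this reference (inf on first touch)."""
        if addr in self.stack:
            keys = list(self.stack)
            # Distinct addresses touched since the previous reference to addr.
            dist = len(keys) - keys.index(addr) - 1
            del self.stack[addr]
        else:
            dist = float("inf")
        self.stack[addr] = None
        return dist

    def invalidate(self, addr):
        """Another core wrote the line: drop it so the next reuse is a miss."""
        self.stack.pop(addr, None)

# Without the invalidation, the second reference to "A" would hit at distance 2;
# the intervening invalidation turns it back into an (infinite-distance) miss.
rd = ReuseDistanceTracker()
for a in ["A", "B", "C"]:
    rd.reference(a)
rd.invalidate("A")
print(rd.reference("A"))   # inf
```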

    The locality-aware adaptive cache coherence protocol

    Next generation multicore applications will process massive amounts of data with significant sharing. Data movement and management impact memory access latency and consume power. Therefore, harnessing data locality is of fundamental importance in future processors. We propose a scalable, efficient shared memory cache coherence protocol that enables seamless adaptation between private and logically shared caching of on-chip data at the fine granularity of cache lines. Our data-centric approach relies on in-hardware yet low-overhead runtime profiling of the locality of each cache line and only allows private caching for data blocks with high spatio-temporal locality. This allows us to better exploit the private caches and enable low-latency, low-energy memory access, while retaining the convenience of shared memory. On a set of parallel benchmarks, our low-overhead locality-aware mechanisms reduce overall energy by 25% and completion time by 15% in an NoC-based multicore with the Reactive-NUCA on-chip cache organization and the ACKwise limited directory-based coherence protocol.
    Funding: United States Defense Advanced Research Projects Agency, Ubiquitous High Performance Computing Program.
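
    The core run-time decision described above is whether a given cache line should be cached privately or accessed remotely at its shared location, based on profiled locality. The sketch below shows one way such a per-line classification could look, with an assumed saturating reuse counter and threshold; the protocol's actual hardware structures and thresholds are not given in the abstract.

```python
# Minimal sketch (assumed names and threshold, not the actual protocol):
# classify each cache line as PRIVATE (replicate locally) or REMOTE (access at
# the shared/home location) from a saturating reuse counter that is
# re-evaluated whenever the line is evicted or invalidated.

PRIVATE, REMOTE = "private", "remote"

class LineLocalityClassifier:
    def __init__(self, threshold=4, max_count=15):
        self.threshold = threshold    # reuses needed to justify a private copy
        self.max_count = max_count    # saturating counter limit
        self.reuse_count = {}         # line address -> reuses since last (re)fill
        self.mode = {}                # line address -> PRIVATE or REMOTE

    def classify(self, line):
        return self.mode.get(line, PRIVATE)   # start optimistically private

    def on_access(self, line):
        self.reuse_count[line] = min(self.reuse_count.get(line, 0) + 1,
                                     self.max_count)

    def on_eviction_or_invalidation(self, line):
        # Low reuse while privately cached is not worth the coherence traffic:
        # demote to remote access; high reuse keeps (or restores) private caching.
        reuses = self.reuse_count.pop(line, 0)
        self.mode[line] = PRIVATE if reuses >= self.threshold else REMOTE
```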