4,671 research outputs found

    Architectural study of reconfigurable photonic networks-on-chip for multi-core processors

    Photonic Networks-on-Chip (NoCs) have become a promising route to interconnect processor cores on chip multiprocessors (CMPs) in a power-efficient way. Although several photonic NoC proposals exist, their use is limited to the communication of large data messages due to a relatively long set-up time for the photonic channels. In this work, we evaluate a reconfigurable photonic NoC in which the topology is adapted automatically to the evolving traffic situation. This way, long photonic channel set-up times can be tolerated, which makes our approach more compatible with shared-memory CMPs.

    Green Communication via Power-optimized HARQ Protocols

    Recently, efficient use of energy has become an essential research topic for green communication. This paper studies the effect of optimal power controllers on the performance of delay-sensitive communication setups utilizing hybrid automatic repeat request (HARQ). The results are obtained for repetition time diversity (RTD) and incremental redundancy (INR) HARQ protocols. In all cases, the optimal power allocation, minimizing the outage-limited average transmission power, is obtained under both continuous and bursting communication models. Also, we investigate the system throughput in different conditions. The results indicate that the power efficiency is increased substantially if adaptive power allocation is utilized. For instance, assume Rayleigh-fading channel, a maximum of two (re)transmission rounds with rates $\{1,\frac{1}{2}\}$ nats-per-channel-use and an outage probability constraint $10^{-3}$. Then, compared to uniform power allocation, optimal power allocation in RTD reduces the average power by 9 and 11 dB in the bursting and continuous communication models, respectively. In INR, these values are obtained to be 8 and 9 dB, respectively.
    Comment: Accepted for publication on IEEE Transactions on Vehicular Technology

    Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency

    Persistent memory provides high-performance data persistence at main memory. Memory writes need to be performed in strict order to satisfy storage consistency requirements and enable correct recovery from system crashes. Unfortunately, adhering to such a strict order significantly degrades system performance and persistent memory endurance. This paper introduces a new mechanism, Loose-Ordering Consistency (LOC), that satisfies the ordering requirements at significantly lower performance and endurance loss. LOC consists of two key techniques. First, Eager Commit eliminates the need to perform a persistent commit record write within a transaction. We do so by ensuring that we can determine the status of all committed transactions during recovery by storing necessary metadata information statically with blocks of data written to memory. Second, Speculative Persistence relaxes the write ordering between transactions by allowing writes to be speculatively written to persistent memory. A speculative write is made visible to software only after its associated transaction commits. To enable this, our mechanism supports the tracking of committed transaction ID and multi-versioning in the CPU cache. Our evaluations show that LOC reduces the average performance overhead of memory persistence from 66.9% to 34.9% and the memory write traffic overhead from 17.1% to 3.4% on a variety of workloads.
    Comment: This paper has been accepted by IEEE Transactions on Parallel and Distributed Systems

    The Keck Cosmic Web Imager

    We are designing the Keck Cosmic Web Imager (KCWI) as a new facility instrument for the Keck II telescope at the W. M. Keck Observatory (WMKO). KCWI is based on the Cosmic Web Imager (CWI), an instrument that has recently had first light at the Hale Telescope. KCWI is a wide-field integral-field spectrograph (IFS) optimized for precision sky-limited spectroscopy of low surface brightness phenomena. KCWI will feature high throughput, and flexibility in field of view (FOV), spatial sampling, bandpass, and spectral resolution. KCWI will provide full wavelength coverage (0.35 to 1.05 μm) using optimized blue and red channels. KCWI will provide a unique and complementary capability at WMKO (optical band integral field spectroscopy) that is directly connected to one of the Observatory's strategic goals (faint object, high precision spectroscopy), at a modest cost and on a competitive time scale, made possible by its simple concept and the prior demonstration of CWI.

    Natural Regulation of Energy Flow in a Green Quantum Photocell

    Manipulating the flow of energy in nanoscale and molecular photonic devices is of both fundamental interest and central importance for applications in light harvesting optoelectronics. Under erratic solar irradiance conditions, unregulated power fluctuations in a light harvesting photocell lead to inefficient energy storage in conventional solar cells and potentially fatal oxidative damage in photosynthesis. Here, we show that regulation against these fluctuations arises naturally within a two-channel quantum heat engine photocell, thus enabling the efficient conversion of varying incident solar spectrum at Earth's surface. Remarkably, absorption in the green portion of the spectrum is avoided, as it provides no inherent regulatory benefit. Our findings illuminate a quantum structural origin of regulation, provide a novel optoelectronic design strategy, and may elucidate the link between photoprotection in photosynthesis and the predominance of green plants on Earth.
    Comment: 17 pages, 4 figures

    SCORPIO: A 36-Core Research Chip Demonstrating Snoopy Coherence on a Scalable Mesh NoC with In-Network Ordering

    In the many-core era, scalable coherence and on-chip interconnects are crucial for shared memory processors. While snoopy coherence is common in small multicore systems, directory-based coherence is the de facto choice for scalability to many cores, as snoopy relies on ordered interconnects which do not scale. However, directory-based coherence does not scale beyond tens of cores due to excessive directory area overhead or inaccurate sharer tracking. Prior techniques supporting ordering on arbitrary unordered networks are impractical for full multicore chip designs. We present SCORPIO, an ordered mesh Network-on-Chip (NoC) architecture with a separate fixed-latency, bufferless network to achieve distributed global ordering. Message delivery is decoupled from the ordering, allowing messages to arrive in any order and at any time, and still be correctly ordered. The architecture is designed to plug-and-play with existing multicore IP and with practicality, timing, area, and power as top concerns. Full-system 36 and 64-core simulations on SPLASH-2 and PARSEC benchmarks show an average application run time reduction of 24.1% and 12.9%, in comparison to distributed directory and AMD HyperTransport coherence protocols, respectively. The SCORPIO architecture is incorporated in an 11 mm-by-13 mm chip prototype, fabricated in IBM 45nm SOI technology, comprising 36 Freescale e200 Power Architecture™ cores with private L1 and L2 caches interfacing with the NoC via ARM AMBA, along with two Cadence on-chip DDR2 controllers. The chip prototype achieves a post-synthesis operating frequency of 1 GHz (833 MHz post-layout) with an estimated power of 28.8 W (768 mW per tile), while the network consumes only 10% of tile area and 19% of tile power.
    Sponsors: United States. Defense Advanced Research Projects Agency (DARPA UHPC grant at MIT (Angstrom)); Center for Future Architectures Research; Microelectronics Advanced Research Corporation (MARCO); Semiconductor Research Corporation

    Coherence in Large-Scale Networks: Dimension-Dependent Limitations of Local Feedback

    We consider distributed consensus and vehicular formation control problems. Specifically, we address the question of whether local feedback is sufficient to maintain coherence in large-scale networks subject to stochastic disturbances. We define macroscopic performance measures which are global quantities that capture the notion of coherence: a notion of global order that quantifies how closely the formation resembles a solid object. We consider how these measures scale asymptotically with network size in the topologies of regular lattices in 1, 2 and higher dimensions, with vehicular platoons corresponding to the 1-dimensional case. A common phenomenon appears where a higher spatial dimension implies a more favorable scaling of coherence measures, with a dimension of 3 being necessary to achieve coherence in consensus and vehicular formations under certain conditions. In particular, we show that it is impossible to have large coherent one-dimensional vehicular platoons with only local feedback. We analyze these effects in terms of the underlying energetic modes of motion, showing that they take the form of large temporal and spatial scales resulting in an accordion-like motion of formations. A conclusion can be drawn that in low spatial dimensions, local feedback is unable to regulate large-scale disturbances, but it can in higher spatial dimensions. This phenomenon is distinct from, and unrelated to, string instability issues which are commonly encountered in control problems for automated highways.
    Comment: To appear in IEEE Trans. Automat. Control; 15 pages, 2 figures

    Programmable Logic Devices in Experimental Quantum Optics

    We discuss the unique capabilities of programmable logic devices (PLDs) for experimental quantum optics and describe basic procedures of design and implementation. Examples of advanced applications include optical metrology and feedback control of quantum dynamical systems. As a tutorial illustration of the PLD implementation process, a field programmable gate array (FPGA) controller is used to stabilize the output of a Fabry-Perot cavity.

    Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips

    The trend in industry is towards heterogeneous multicore processors (HMCs), including chips with CPUs and massively-threaded throughput-oriented processors (MTTOPs) such as GPUs. Although current homogeneous chips tightly couple the cores with cache-coherent shared virtual memory (CCSVM), this is not the communication paradigm used by any current HMC. In this paper, we present a CCSVM design for a CPU/MTTOP chip, as well as an extension of the pthreads programming model, called xthreads, for programming this HMC. Our goal is to evaluate the potential performance benefits of tightly coupling heterogeneous cores with CCSVM.