4,671 research outputs found
Architectural study of reconfigurable photonic networks-on-chip for multi-core processors
Photonic Networks-on-Chip (NoCs) have become a promising route to interconnect processor cores on chip multiprocessors (CMP) in a power-efficient way. Although several photonic NoC proposals exist, their use is limited to the communication of large data messages due to a relatively long set-up time for the photonic channels. In this work, we evaluate a reconfigurable photonic NoC in which the topology is adapted automatically to the evolving traffic situation. This way, long photonic channel set-up times can be tolerated, which makes our approach more compatible in the context of shared-memory CMPs.
Green Communication via Power-optimized HARQ Protocols
Recently, efficient use of energy has become an essential research topic for
green communication. This paper studies the effect of optimal power controllers
on the performance of delay-sensitive communication setups utilizing hybrid
automatic repeat request (HARQ). The results are obtained for repetition time
diversity (RTD) and incremental redundancy (INR) HARQ protocols. In all cases,
the optimal power allocation, minimizing the outage-limited average
transmission power, is obtained under both continuous and bursting
communication models. Also, we investigate the system throughput in different
conditions. The results indicate that the power efficiency is increased
substantially if adaptive power allocation is utilized. For instance, assume a
Rayleigh-fading channel, a maximum of two (re)transmission rounds with rates
[...] nats-per-channel-use and an outage probability constraint
[...]. Then, compared to uniform power allocation, optimal power
allocation in RTD reduces the average power by 9 and 11 dB in the bursting and
continuous communication models, respectively. In INR, these values are
obtained to be 8 and 9 dB, respectively.
Comment: Accepted for publication in IEEE Transactions on Vehicular Technology
Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency
Persistent memory provides high-performance data persistence at main memory.
Memory writes need to be performed in strict order to satisfy storage
consistency requirements and enable correct recovery from system crashes.
Unfortunately, adhering to such a strict order significantly degrades system
performance and persistent memory endurance. This paper introduces a new
mechanism, Loose-Ordering Consistency (LOC), that satisfies the ordering
requirements at significantly lower performance and endurance loss. LOC
consists of two key techniques. First, Eager Commit eliminates the need to
perform a persistent commit record write within a transaction. We do so by
ensuring that we can determine the status of all committed transactions during
recovery by storing necessary metadata information statically with blocks of
data written to memory. Second, Speculative Persistence relaxes the write
ordering between transactions by allowing writes to be speculatively written to
persistent memory. A speculative write is made visible to software only after
its associated transaction commits. To enable this, our mechanism supports the
tracking of committed transaction ID and multi-versioning in the CPU cache. Our
evaluations show that LOC reduces the average performance overhead of memory
persistence from 66.9% to 34.9% and the memory write traffic overhead from
17.1% to 3.4% on a variety of workloads.
Comment: This paper has been accepted by IEEE Transactions on Parallel and
Distributed Systems
The Keck Cosmic Web Imager
We are designing the Keck Cosmic Web Imager (KCWI) as a new facility instrument for the Keck II telescope at the W. M. Keck Observatory (WMKO). KCWI is based on the Cosmic Web Imager (CWI), an instrument that has recently had first light at the Hale Telescope. KCWI is a wide-field integral-field spectrograph (IFS) optimized for precision sky-limited spectroscopy of low surface brightness phenomena. KCWI will feature high throughput and flexibility in field of view (FOV), spatial sampling, bandpass, and spectral resolution. KCWI will provide full wavelength coverage (0.35 to 1.05 μm) using optimized blue and red channels. KCWI will provide a unique and complementary capability at WMKO (optical band integral field spectroscopy) that is directly connected to one of the Observatory's strategic goals (faint object, high precision spectroscopy), at a modest cost and on a competitive time scale, made possible by its simple concept and the prior demonstration of CWI.
Natural Regulation of Energy Flow in a Green Quantum Photocell
Manipulating the flow of energy in nanoscale and molecular photonic devices
is of both fundamental interest and central importance for applications in
light harvesting optoelectronics. Under erratic solar irradiance conditions,
unregulated power fluctuations in a light harvesting photocell lead to
inefficient energy storage in conventional solar cells and potentially fatal
oxidative damage in photosynthesis. Here, we show that regulation against these
fluctuations arises naturally within a two-channel quantum heat engine
photocell, thus enabling the efficient conversion of varying incident solar
spectrum at Earth's surface. Remarkably, absorption in the green portion of the
spectrum is avoided, as it provides no inherent regulatory benefit. Our
findings illuminate a quantum structural origin of regulation, provide a novel
optoelectronic design strategy, and may elucidate the link between
photoprotection in photosynthesis and the predominance of green plants on
Earth.
Comment: 17 pages, 4 figures
SCORPIO: A 36-Core Research Chip Demonstrating Snoopy Coherence on a Scalable Mesh NoC with In-Network Ordering
In the many-core era, scalable coherence and on-chip interconnects are crucial for shared memory processors. While snoopy coherence is common in small multicore systems, directory-based coherence is the de facto choice for scalability to many cores, as snoopy relies on ordered interconnects which do not scale. However, directory-based coherence does not scale beyond tens of cores due to excessive directory area overhead or inaccurate sharer tracking. Prior techniques supporting ordering on arbitrary unordered networks are impractical for full multicore chip designs. We present SCORPIO, an ordered mesh Network-on-Chip (NoC) architecture with a separate fixed-latency, bufferless network to achieve distributed global ordering. Message delivery is decoupled from the ordering, allowing messages to arrive in any order and at any time, and still be correctly ordered. The architecture is designed to plug-and-play with existing multicore IP and with practicality, timing, area, and power as top concerns. Full-system 36- and 64-core simulations on SPLASH-2 and PARSEC benchmarks show an average application run time reduction of 24.1% and 12.9%, in comparison to distributed directory and AMD HyperTransport coherence protocols, respectively. The SCORPIO architecture is incorporated in an 11 mm-by-13 mm chip prototype, fabricated in IBM 45nm SOI technology, comprising 36 Freescale e200 Power Architecture™ cores with private L1 and L2 caches interfacing with the NoC via ARM AMBA, along with two Cadence on-chip DDR2 controllers. The chip prototype achieves a post-synthesis operating frequency of 1 GHz (833 MHz post-layout) with an estimated power of 28.8 W (768 mW per tile), while the network consumes only 10% of tile area and 19% of tile power.
United States. Defense Advanced Research Projects Agency (DARPA UHPC grant at MIT (Angstrom)); Center for Future Architectures Research; Microelectronics Advanced Research Corporation (MARCO); Semiconductor Research Corporation
Coherence in Large-Scale Networks: Dimension-Dependent Limitations of Local Feedback
We consider distributed consensus and vehicular formation control problems.
Specifically we address the question of whether local feedback is sufficient to
maintain coherence in large-scale networks subject to stochastic disturbances.
We define macroscopic performance measures which are global quantities that
capture the notion of coherence; a notion of global order that quantifies how
closely the formation resembles a solid object. We consider how these measures
scale asymptotically with network size in the topologies of regular lattices in
1, 2 and higher dimensions, with vehicular platoons corresponding to the 1
dimensional case. A common phenomenon appears where a higher spatial dimension
implies a more favorable scaling of coherence measures, with a dimension of 3
being necessary to achieve coherence in consensus and vehicular formations
under certain conditions. In particular, we show that it is impossible to have
large coherent one dimensional vehicular platoons with only local feedback. We
analyze these effects in terms of the underlying energetic modes of motion,
showing that they take the form of large temporal and spatial scales resulting
in an accordion-like motion of formations. A conclusion can be drawn that in
low spatial dimensions, local feedback is unable to regulate large-scale
disturbances, but it can in higher spatial dimensions. This phenomenon is
distinct from, and unrelated to, string instability issues which are commonly
encountered in control problems for automated highways.
Comment: To appear in IEEE Trans. Automat. Control; 15 pages, 2 figures
Programmable Logic Devices in Experimental Quantum Optics
We discuss the unique capabilities of programmable logic devices (PLDs) for
experimental quantum optics and describe basic procedures of design and
implementation. Examples of advanced applications include optical metrology and
feedback control of quantum dynamical systems. As a tutorial illustration of
the PLD implementation process, a field-programmable gate array (FPGA)
controller is used to stabilize the output of a Fabry-Perot cavity.
Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips
The trend in industry is towards heterogeneous multicore processors (HMCs),
including chips with CPUs and massively-threaded throughput-oriented processors
(MTTOPs) such as GPUs. Although current homogeneous chips tightly couple the
cores with cache-coherent shared virtual memory (CCSVM), this is not the
communication paradigm used by any current HMC. In this paper, we present a
CCSVM design for a CPU/MTTOP chip, as well as an extension of the pthreads
programming model, called xthreads, for programming this HMC. Our goal is to
evaluate the potential performance benefits of tightly coupling heterogeneous
cores with CCSVM.