RETROSPECTIVE: Corona: System Implications of Emerging Nanophotonic Technology
The 2008 Corona effort was inspired by a pressing need for more of
everything, as demanded by the salient problems of the day. Dennard scaling was
no longer in effect. A lot of computer architecture research was in the
doldrums. Papers often showed incremental subsystem performance improvements,
but at incommensurate cost and complexity. The many-core era was moving
rapidly, and the approach with many simpler cores was at odds with the better
and more complex subsystem publications of the day. Core counts were doubling
every 18 months, while per-pin bandwidth was expected to double, at best, over
the next decade. Memory bandwidth and capacity had to increase to keep pace
with ever more powerful multi-core processors. With increasing core counts per
die, inter-core communication bandwidth and latency became more important. At
the same time, the area and power of electrical networks-on-chip were
increasingly problematic: To be reliably received, any signal that traverses a
wire spanning a full reticle-sized die would need significant equalization,
re-timing, and multiple clock cycles. This additional time, area, and power was
the crux of the concern, and things looked to get worse in the future.
Silicon nanophotonics was of particular interest and seemed to be improving
rapidly. This led us to consider taking advantage of 3D packaging, where one
die in the 3D stack would be a photonic network layer. Our focus was on a
system that could be built about a decade out. Thus, we tried to predict how
the technologies and the system performance requirements would converge in
about 2018. Corona was the result of this exercise; now, 15 years later, it's
interesting to look back at the effort.
Comment: 2 pages. Proceedings of ISCA-50: 50 Years of the International
Symposia on Computer Architecture (selected papers), June 17-21, Orlando,
Florida.
An evaluation of server consolidation workloads for multi-core designs
Abstract — While chip multiprocessors with ten or more cores will be feasible within a few years, the search for applications that fully exploit their attributes continues. In the meantime, one sure-fire application for such machines will be to serve as consolidation platforms for sets of workloads that previously occupied multiple discrete systems. Such server consolidation scenarios will simplify system administration and lead to savings in power, cost, and physical infrastructure. This paper studies the behavior of server consolidation workloads, focusing particularly on sharing of caches across a variety of configurations. Noteworthy interactions emerge within a workload and, notably, across workloads when multiple server workloads are scheduled on the same chip. These workloads present an interesting design point and will help designers better evaluate trade-offs as we push forward into the many-core era.