Analysis and Design of Robust and High-Performance Complex Dynamical Networks
In the first part of this dissertation, we develop some basic principles to investigate performance deterioration of dynamical networks subject to external disturbances. First, we propose a graph-theoretic methodology to relate structural specifications of the coupling graph of a linear consensus network to its performance measure. Moreover, for this class of linear consensus networks, we introduce new insights into network centrality based not only on the network graph but also on a more structured model of network uncertainties. Then, for the class of generic linear networks, we show that the H_2-norm, as a performance measure, can be tightly bounded from below and above by some spectral functions of state and output matrices of the system. Finally, we study nonlinear autocatalytic networks and exploit their structural properties to characterize their existing hard limits and essential tradeoffs.
In the second part, we consider problems of network synthesis for performance enhancement. First, we propose an axiomatic approach for the design and performance analysis of linear consensus networks by introducing a notion of systemic performance measure. We build upon this new notion and investigate a general form of the combinatorial problem of growing a linear consensus network via minimizing a given systemic performance measure. Two efficient polynomial-time approximation algorithms are devised to tackle this network synthesis problem. Then, we investigate the optimal design problem of distributed system throttlers. A throttler is a mechanism that limits the flow rate of incoming metrics, e.g., bytes per second, network bandwidth usage, capacity, traffic, etc. Finally, a framework is developed to produce a sparse approximation of a given large-scale network with guaranteed performance bounds using a nearly-linear time algorithm.
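The throttler mentioned in the abstract above is described only as a mechanism that limits an incoming flow rate such as bytes per second. One common way to realize such a limit is a token bucket; the sketch below is a minimal illustration under that assumption and is not taken from the dissertation. The class name `TokenBucket`, its parameters `rate` and `capacity`, and the explicit `now` timestamp are all hypothetical choices made for the example.

```python
class TokenBucket:
    """Minimal token-bucket throttler (illustrative sketch, not the dissertation's design)."""

    def __init__(self, rate, capacity):
        self.rate = float(rate)          # tokens (e.g. bytes) replenished per second
        self.capacity = float(capacity)  # largest burst that can be admitted at once
        self.tokens = float(capacity)    # the bucket starts full
        self.last = 0.0                  # time of the last refill, in seconds

    def allow(self, amount, now):
        """Admit `amount` units at time `now` if enough tokens are available."""
        # Refill in proportion to the elapsed time, never exceeding the capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False  # over budget: the caller would delay or drop the request


bucket = TokenBucket(rate=100, capacity=200)  # 100 bytes/s, bursts of up to 200 bytes
print(bucket.allow(150, now=0.0))  # admitted from the initial burst allowance
print(bucket.allow(100, now=0.0))  # rejected: only 50 tokens remain
print(bucket.allow(100, now=0.5))  # admitted: 50 tokens were refilled in 0.5 s
```

Passing the timestamp explicitly keeps the example deterministic; a deployed throttler would more likely read a monotonic clock.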
Mechanisms to improve the efficiency of hardware data prefetchers
A well-known performance bottleneck in computer architecture is the so-called memory
wall. This term refers to the huge disparity between on-chip and off-chip access
latencies. Historically speaking, the operating frequency of processors has increased at
a steady pace, while most past advances in memory technology have been in density,
not speed. Nowadays, the trend for ever increasing processor operating frequencies
has been replaced by an increasing number of CPU cores per chip. This will continue
to exacerbate the memory wall problem, as several cores now have to compete for
off-chip data access. As multi-core systems pack more and more cores, it is expected
that the access latency as observed by each core will continue to increase. Although
the causes of the memory wall have changed, it is, and will continue to be in the near
future, a very significant challenge in terms of computer architecture design.
Prefetching has been an important technique to amortize the effect of the memory
wall. With prefetching, data or instructions that are expected to be used in the near
future are speculatively moved up in the memory hierarchy, where the access latency is
smaller. This dissertation focuses on hardware data prefetching at the last cache level
before memory (last level cache, LLC). Prefetching at the LLC usually offers the best
performance increase, as this is where the disparity between hit and miss latencies is
the largest.
Hardware prefetchers operate by examining the miss address stream generated
by the cache and identifying patterns and correlations between the misses. Most
prefetchers divide the global miss stream into several sub-streams, according to some
pre-specified criteria. This process is known as localization. The benefits of localization
are well established: it increases the accuracy of the predictions and helps
filter out spurious, non-predictable misses. However, localization has one important
drawback: since the misses are classified into different sub-streams, important chronological
information is lost. A consequence of this is that most localizing prefetchers
issue prefetches in an untimely manner, fetching data too far in advance. This behavior
promotes data pollution in the cache.
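The localization step described above can be sketched in a few lines. The example assumes the program counter (PC) of the missing instruction as the localization criterion and uses a simple constant-stride check as the predictor; both are simplifications chosen for illustration, and the function names are hypothetical. It is not an implementation of PC/DC or C/DC.

```python
from collections import defaultdict


def localize_by_pc(miss_stream):
    """Split a global miss stream of (pc, address) pairs into per-PC sub-streams."""
    streams = defaultdict(list)
    for pc, address in miss_stream:
        streams[pc].append(address)
    return dict(streams)


def deltas(addresses):
    """Differences between consecutive miss addresses."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


def predict_next(addresses):
    """Predict the next miss address if the last two deltas agree, else None."""
    d = deltas(addresses)
    if len(d) >= 2 and d[-1] == d[-2]:
        return addresses[-1] + d[-1]
    return None


# Interleaved misses from three instructions (hypothetical values).
global_stream = [(0x400, 100), (0x500, 9000), (0x400, 164),
                 (0x500, 8990), (0x400, 228), (0x600, 77)]

sub_streams = localize_by_pc(global_stream)
print(deltas([addr for _, addr in global_stream]))  # irregular in the global stream
print(deltas(sub_streams[0x400]))                   # regular within one sub-stream
print(predict_next(sub_streams[0x400]))
```

The sub-streams expose a regular stride that the interleaved global stream hides, but they no longer record how misses from different instructions were ordered relative to one another, which is the chronological information the abstract refers to.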
The first part of this thesis proposes a new class of prefetchers based on the novel
concept of Stream Chaining. With Stream Chaining, the prefetcher tries to reconstruct
the chronological information lost in the process of localization, while at the
same time keeping its benefits. We describe two novel Stream Chaining prefetching
algorithms based on two state-of-the-art localizing prefetchers: PC/DC and C/DC. We show how both prefetchers issue prefetches in a more timely manner than their non-chaining
counterparts, increasing performance by as much as 55% (10% on average)
on a suite of sequential benchmarks, while consuming roughly the same amount of
memory bandwidth.
In order to hide the effects of the memory wall, hardware prefetchers are usually
configured to aggressively prefetch as much data as possible. However, a highly aggressive
prefetcher can have negative effects on performance. Factors such as prefetching
accuracy, cache pollution and memory bandwidth consumption have to be taken
into account. This is especially important in the context of multi-core systems, where
typically each core has its own prefetching engine and there is high competition for
accessing memory. Several prefetch throttling and filtering mechanisms have been
proposed to maximize the effect of prefetching in multi-core systems. The general
strategy behind these heuristics is to promote prefetches that are more likely to be used
and cause less interference. Traditionally, these methods operate at the source level,
i.e., directly in the prefetch engine they are assigned to control.
In multi-core systems all prefetches are aggregated in a FIFO-like data structure
called the Prefetch Request Queue (PRQ), where they wait to be dispatched to memory.
The second part of this thesis shows that a traditional FIFO PRQ does not promote
a timely prefetching behavior and usually hinders part of the performance benefits
achieved by throttling heuristics. We propose a novel approach to prefetch aggressiveness
control in multi-cores that performs throttling at the PRQ (i.e., global) level, using
global knowledge of the metrics of all prefetchers and information about the global
state of the PRQ. To do this, we introduce the Resizable Prefetching Heap (RPH), a
data structure modeled after a binary heap that promotes timely dispatch of prefetches
as well as fairness in the distribution of prefetching bandwidth. The RPH is designed as
a drop-in replacement of traditional FIFO PRQs. We compare our proposal against a
state-of-the-art source-level throttling algorithm (HPAC) in an 8-core system. Unlike
previous research, we evaluate both multiprogrammed and multithreaded (parallel)
workloads, using a modern prefetching algorithm (C/DC). Our experimental results
show that RPH-based throttling increases the throttling performance benefits obtained
by HPAC by as much as 148% (53.8% average) in multiprogrammed workloads and
as much as 237% (22.5% average) in parallel benchmarks, while consuming roughly
the same amount of memory bandwidth. When comparing the speedup over fixed-degree
prefetching, RPH increased the average speedup of HPAC from 7.1% to 10.9% in
multiprogrammed workloads, and from 5.1% to 7.9% in parallel benchmarks.
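To make the contrast between a FIFO PRQ and a heap-ordered one concrete, the sketch below shows a bounded, resizable request queue built on a binary heap. It is only an illustration of the general idea: the abstract does not specify how the RPH assigns priorities, when it resizes, or which requests it drops, so the `priority` value, the drop policy, and every name here are assumptions.

```python
import heapq
import itertools


class HeapPrefetchQueue:
    """Bounded prefetch request queue ordered by priority (hypothetical sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO order within one priority

    def push(self, priority, core_id, address):
        """Enqueue a request; a lower priority value is dispatched earlier.

        Returns False if the queue is full and the new request is the least urgent.
        """
        entry = (priority, next(self._seq), core_id, address)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        if not self._heap:
            return False  # zero-capacity queue accepts nothing
        worst = max(self._heap)  # assumed policy: evict the least urgent request
        if entry < worst:
            self._heap.remove(worst)
            heapq.heapify(self._heap)
            heapq.heappush(self._heap, entry)
            return True
        return False

    def pop(self):
        """Dispatch the most urgent request as (core_id, address), or None if empty."""
        if not self._heap:
            return None
        _, _, core_id, address = heapq.heappop(self._heap)
        return core_id, address

    def resize(self, capacity):
        """Change the capacity, keeping only the most urgent requests if shrinking."""
        self.capacity = capacity
        if len(self._heap) > capacity:
            self._heap = heapq.nsmallest(capacity, self._heap)
            heapq.heapify(self._heap)


queue = HeapPrefetchQueue(capacity=2)
queue.push(5, core_id=0, address=0x1000)
queue.push(1, core_id=1, address=0x2000)
queue.push(3, core_id=2, address=0x3000)  # displaces the priority-5 request
print(queue.pop())  # the priority-1 request from core 1 is dispatched first
```

In a FIFO queue the request from core 0 would have been dispatched first simply because it arrived first; with a heap, dispatch order follows whatever urgency score the throttling policy supplies.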
Cruising for Community: Youth Culture and Politics in Los Angeles, 1910-1970.
"Cruising for Community" examines youth culture in Los Angeles from the Progressive era of the early 1900s to the civil rights, antiwar, and counterculture movements of the 1960s. During this period, youth culture developed as a product of the triangular relationship of the state, the market, and youth subcultures. From early hot rodders to post-industrial punks, youth subcultures provided young people a means to develop local music, dancing, sports, and fashion. Through subcultures, young Angelinos like Jewish socialists and Chicano activists struggled to create a more just and multicultural city. L.A.'s suburban sprawl and corresponding social structures coordinated subcultures, and youth culture was expressed spatially. Cruising (parading without permit) represents young Angelinos' appropriation of the street to forge belonging, friendship and new identities.
Whereas many historians have claimed that generations are essential to historical change, this dissertation identifies instances of collaboration as well as resistance across age groups. Local middlemen saw the profitability of youth subcultures and through co-optation placed locally generated products on the national market. Concurrently, adult youth experts lobbied to manage youth culture as a way to ensure social stability and common civic identity. This sometimes resulted in draconian policies such as the closing of cruising strips; at other points, youth experts encouraged collaboration, leading to organizations like adult-sponsored car clubs. The mobilizing power of youth culture was recognized by progressive youth leaders, who supported groups of young Angelinos in challenging the social inequities found within their communities; political demonstrations and school walkouts appropriated the city's structures to critique inequity, creating the means for a shared political identity.
While cruising represented a balance between the market, the state, and young people, other alignments alienated youth, often along class, race, ethnic, and gender lines, and denied them autonomy with dramatic consequences, such as the Zoot Suit Riot and Watts Uprising. "Cruising for Community" gives an analysis of local youth culture that accounts for its evolution, attendant subcultures, and role in 20th century American history. As such, the dissertation connects cultural studies of youth with American urban history, critically contributing to investigations of modern youth, youth culture, and politics.
Ph.D., History, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/62236/1/mides_1.pd