739 research outputs found

Sinking of Anchors and Other Subsea Structures due to Wave-Induced Seabed Liquefaction

Author: Kirca V. S. Ozgur
Sumer B. Mutlu
Publication venue
Publication date: 01/01/2019
Field of study

DSPatch: Dual Spatial Pattern Prefetcher

Author: Bera Rahul
Mutlu Onur
Nori Anant V.
Subramoney Sreenivas
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/10/2019
Field of study

High main memory latency continues to limit performance of modern high-performance out-of-order cores. While DRAM latency has remained nearly the same over many generations, DRAM bandwidth has grown significantly due to higher frequencies, newer architectures (DDR4, LPDDR4, GDDR5) and 3D-stacked memory packaging (HBM). Current state-of-the-art prefetchers do not do well in extracting higher performance when higher DRAM bandwidth is available. Prefetchers need the ability to dynamically adapt to available bandwidth, boosting prefetch count and prefetch coverage when headroom exists and throttling down to achieve high accuracy when the bandwidth utilization is close to peak. To this end, we present the Dual Spatial Pattern Prefetcher (DSPatch) that can be used as a standalone prefetcher or as a lightweight adjunct spatial prefetcher to the state-of-the-art delta-based Signature Pattern Prefetcher (SPP). DSPatch builds on a novel and intuitive use of modulated spatial bit-patterns. The key idea is to: (1) represent program accesses on a physical page as a bit-pattern anchored to the first "trigger" access, (2) learn two spatial access bit-patterns: one biased towards coverage and another biased towards accuracy, and (3) select one bit-pattern at run-time based on the DRAM bandwidth utilization to generate prefetches. Across a diverse set of workloads, using only 3.6KB of storage, DSPatch improves performance over an aggressive baseline with a PC-based stride prefetcher at the L1 cache and the SPP prefetcher at the L2 cache by 6% (9% in memory-intensive workloads and up to 26%). Moreover, the performance of DSPatch+SPP scales with increasing DRAM bandwidth, growing from 6% over SPP to 10% when DRAM bandwidth is doubled. Comment: This work is to appear in MICRO 201
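The three-part key idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the two patterns are learned, so the bitwise OR (coverage-biased) and bitwise AND (accuracy-biased) updates below are assumptions, as are the 64-line page, the 0.75 utilization threshold, and every name in the code.

```python
LINES_PER_PAGE = 64  # assumption: 4 KB page with 64-byte cache lines
MASK = (1 << LINES_PER_PAGE) - 1


def anchor(pattern: int, trigger: int) -> int:
    """Rotate a page access bit-pattern right so the trigger line lands on bit 0."""
    return ((pattern >> trigger) | (pattern << (LINES_PER_PAGE - trigger))) & MASK


def unanchor(pattern: int, trigger: int) -> int:
    """Rotate an anchored pattern left so bit 0 maps back to the trigger line."""
    return ((pattern << trigger) | (pattern >> (LINES_PER_PAGE - trigger))) & MASK


class DualPattern:
    """Holds one coverage-biased and one accuracy-biased anchored pattern."""

    def __init__(self) -> None:
        self.cov = 0        # assumed update rule: OR of all observed patterns
        self.acc = 0        # assumed update rule: AND of all observed patterns
        self.trained = False

    def train(self, anchored: int) -> None:
        self.cov |= anchored
        self.acc = anchored if not self.trained else self.acc & anchored
        self.trained = True

    def select(self, bw_util: float, threshold: float = 0.75) -> int:
        """Pick the accuracy-biased pattern when bandwidth utilization is high."""
        return self.acc if bw_util >= threshold else self.cov


page1 = (1 << 3) | (1 << 4) | (1 << 6)       # lines 3, 4, 6 touched; trigger is line 3
page2 = (1 << 10) | (1 << 11) | (1 << 12)    # lines 10, 11, 12 touched; trigger is line 10

learned = DualPattern()
learned.train(anchor(page1, 3))
learned.train(anchor(page2, 10))

# With low utilization the wider pattern is chosen; it is then rotated back
# relative to the new trigger line to obtain the lines to prefetch.
prefetch_lines = unanchor(learned.select(bw_util=0.2), trigger=60)
```

Anchoring to the trigger makes patterns from different pages comparable regardless of where in the page the first access fell, which is what allows a single stored pattern to be reused for a new page.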

arXiv.org e-Print Archive

Crossref

Exploring collaboration in challenging environments: from the car to the factory and beyond

Author: Evers Vanessa
Meschtscherjakov A.
Mutlu B.
Tscheligi M.
Weiss A.
Wulf V.
Publication venue: Association for Computing Machinery
Publication date: 01/02/2012
Field of study

We propose a daylong workshop at CSCW2012 on the topic of collaboration in challenging and difficult environments, which to our understanding are all contexts that go beyond traditional working/office settings. Examples of these environments are the automotive context or the context of a semiconductor factory, which show very specific contextual conditions and therefore offer special research challenges: How to address all passengers in the car, not only the driver? How to explore operator tasks in a cleanroom? How could the long-term (social) collaboration of robots and humans be investigated in privacy critical environments?

Crossref

University of Twente Research Information

Longitudinal dispersion of heavy particles in an oscillating tunnel and application to wave boundary layers

Author: Fuhrman David R.
Jensen Karsten Lindegård
Kirca V. S. Ozgur
Steffensen Michael
Sumer B. Mutlu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/12/2015
Field of study

Crossref

Online Research Database In Technology

Exploiting Inter- and Intra-Memory Asymmetries for Data Mapping in Hybrid Tiered-Memories

Author: Antognetti P.
Arafa M.
Arjomand M.
Bhattacharyya A.
Blagodurov S.
Cao Y.
Chang Y.-M.
Cho B.-H.
Das A.
Das A.
Dray C.
Goda A.
Huang Y.
Jayasena N. S.
Kang U.
Kim Y.
Lee D.
Mallik A.
Mutlu O.
Mutlu O.
Pourshirazi B.
Qureshi M. K.
Qureshi M. K.
Redaelli A.
Rixner S.
Sandhu B. S.
Seong N. H.
Seshadri V.
Srinivasan J.
Stuecheli J.
Yoon H.
Yue J.
Zhang L.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/05/2020
Field of study

Modern computing systems are embracing hybrid memory comprising DRAM and non-volatile memory (NVM) to combine the best properties of both memory technologies, achieving low latency, high reliability, and high density. A prominent characteristic of DRAM-NVM hybrid memory is that it has NVM access latency much higher than DRAM access latency. We call this inter-memory asymmetry. We observe that parasitic components on a long bitline are a major source of high latency in both DRAM and NVM, and a significant factor contributing to high-voltage operations in NVM, which impact their reliability. We propose an architectural change, where each long bitline in DRAM and NVM is split into two segments by an isolation transistor. One segment can be accessed with lower latency and operating voltage than the other. By introducing tiers, we enable non-uniform accesses within each memory type (which we call intra-memory asymmetry), leading to performance and reliability trade-offs in DRAM-NVM hybrid memory. We extend existing NVM-DRAM OS in three ways. First, we exploit both inter- and intra-memory asymmetries to allocate and migrate memory pages between the tiers in DRAM and NVM. Second, we improve the OS's page allocation decisions by predicting the access intensity of a newly-referenced memory page in a program and placing it in a matching tier during its initial allocation. This minimizes page migrations during program execution, lowering the performance overhead. Third, we propose a solution to migrate pages between the tiers of the same memory without transferring data over the memory channel, minimizing channel occupancy and improving performance. Our overall approach, which we call MNEME, to enable and exploit asymmetries in DRAM-NVM hybrid tiered memory improves both performance and reliability for both single-core and multi-programmed workloads. Comment: 15 pages, 29 figures, accepted at ACM SIGPLAN International Symposium on Memory Management

arXiv.org e-Print Archive

Crossref

Mutation of Directed Graphs -- Corresponding Regular Expressions and Complexity of Their Generation

Author: A. C. Shaw
A. Gill
A. Salomaa
A. Salomaa
B. Beizer
Bianca Truthe
E. F. Moore
F. Belli
F. Belli
F. Belli
Fevzi Belli
G. H. Mealy
Giovanni Pighizzini
J. A. Brzozowski
J. E. Hopcroft
J. Hromkovic
J. Myhill
Jürgen Dassow
Mutlu Beyazit
R. A. DeMillo
R. E. Stearns
R. V. Binder
S. Gossens
V. Geffert
Y. Han
Publication venue: 'Open Publishing Association'
Publication date: 01/07/2009
Field of study

Directed graphs (DG), interpreted as state transition diagrams, are traditionally used to represent finite-state automata (FSA). In the context of formal languages, both FSA and regular expressions (RE) are equivalent in that they accept and generate, respectively, type-3 (regular) languages. Based on our previous work, this paper analyzes effects of graph manipulations on corresponding RE. At this initial stage, we assume that the DG under consideration contains no cycles. Graph manipulation is performed by deleting or inserting nodes or arcs. Combined and/or multiple application of these basic operators enables a great variety of transformations of DG (and corresponding RE) that can be seen as mutants of the original DG (and corresponding RE). DG are popular for modeling complex systems; however, they easily become intractable if the system under consideration is complex and/or large. In such situations, we propose to switch to corresponding RE in order to benefit from their compact format for modeling and algebraic operations for analysis. The results of the study are of great potential interest to mutation testing.
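The relationship between an acyclic DG and its RE, and the effect of a node deletion on that RE, can be shown with a small sketch. This is not the construction used in the paper, which the abstract does not describe; it is one straightforward recursive translation, assuming node-labeled graphs with single-character labels, one start node, and one final node. All names are hypothetical.

```python
from functools import lru_cache


def dag_to_regex(succ: dict, start: str, final: str):
    """Return a regex over node labels for all start-to-final paths in an acyclic graph.

    Returns None when the final node cannot be reached from start.
    """

    @lru_cache(maxsize=None)
    def expr(node: str):
        if node == final:
            return node
        # Keep only successors from which the final node is still reachable.
        alts = [e for e in (expr(n) for n in succ.get(node, ())) if e is not None]
        if not alts:
            return None
        if len(alts) == 1:
            return node + alts[0]
        return node + "(" + "|".join(alts) + ")"

    return expr(start)


# Original graph: a -> b -> d and a -> c -> d
original = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

# Mutant obtained by deleting node c together with its arcs
mutant = {"a": ["b"], "b": ["d"], "d": []}

original_re = dag_to_regex(original, "a", "d")
mutant_re = dag_to_regex(mutant, "a", "d")
```

Here the original graph yields `a(bd|cd)` and the mutant yields `abd`, so the node deletion shows up as a dropped alternative in the expression.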

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

An Introduction to Silanes and Their Clinical Applications in Dentistry

Author: Lassila Lippo V. J.
Matinlinna Jukka P.
Pekka K. Vallittu
Yli-Urpo Antti
Özcan Mutlu
Publication venue
Publication date
Field of study

An Approximate Dynamic Programming Approach to Urban Freight Distribution with Batch Arrivals

Author: AG Lium
AS Minkoff
CF Daganzo
F Mutlu
F Robusté
H Topaloglu
JH Bookbinder
LC Coelho
M SteadieSeifi
V Pillac
WB Powell
Publication venue: Springer
Publication date: 01/01/2015
Field of study

We study an extension of the delivery dispatching problem (DDP) with time windows, applied to LTL orders arriving at an urban consolidation center. Order properties (e.g., destination, size, dispatch window) may be highly varying, and directly distributing an incoming order batch may yield high costs. Instead, the hub operator may wait to consolidate with future arrivals. A consolidation policy is required to decide which orders to ship and which orders to hold. We model the dispatching problem as a Markov decision problem. Dynamic Programming (DP) is applied to solve toy-sized instances to optimality. For larger instances, we propose an Approximate Dynamic Programming (ADP) approach. Through numerical experiments, we show that ADP closely approximates the optimal values for small instances, and outperforms two myopic benchmark policies for larger instances. We contribute to the literature by (i) formulating a DDP with dispatch windows and (ii) proposing an approach to solve this DDP.

Crossref

University of Twente Research Information

Improving Phase Change Memory Performance with Data Content Aware Access

Author: Ahn S. J.
Alshboul M.
Awad A.
Awad A.
Bock S.
Bock S.
Bondurant D.
Boroumand A.
Burr G. W.
Chen J.
Chhabra S.
Dogan H.
Du Y.
Ferreira A. P.
Frigo P.
Gueron S.
Guerra J.
Ham T. J.
Hashemi M.
Hsieh K.
Hwang W.
Jia Y.
Jiang L.
Joo Y.
Kang U.
Karlsson M.
Kim J.
Kim Y.
Kim Y.
Lalam A.
Lam C. H.
Lee J. I.
Mallik A.
Marathe V. J.
Meza J.
Morikawa T.
Mutlu O.
Mutlu O.
Pourshirazi B.
Qureshi M. K.
Qureshi M. K.
Saileshwar G.
Seong N. H.
Seshadri V.
Stuecheli J.
Villa C.
Wang Y.
Wang Z.
Wuttig M.
Yamada N.
Yang J.
Yue J.
Zhang L.
Zhou M.
Zhou M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/05/2020
Field of study

A prominent characteristic of write operation in Phase-Change Memory (PCM) is that its latency and energy are sensitive to the data to be written as well as the content that is overwritten. We observe that overwriting unknown memory content can incur significantly higher latency and energy compared to overwriting known all-zeros or all-ones content. This is because all-zeros or all-ones content is overwritten by programming the PCM cells only in one direction, i.e., using either SET or RESET operations, not both. In this paper, we propose data content aware PCM writes (DATACON), a new mechanism that reduces the latency and energy of PCM writes by redirecting these requests to overwrite memory locations containing all-zeros or all-ones. DATACON operates in three steps. First, it estimates how much a PCM write access would benefit from overwriting known content (e.g., all-zeros, or all-ones) by comprehensively considering the number of set bits in the data to be written, and the energy-latency trade-offs for SET and RESET operations in PCM. Second, it translates the write address to a physical address within memory that contains the best type of content to overwrite, and records this translation in a table for future accesses. We exploit data access locality in workloads to minimize the address translation overhead. Third, it re-initializes unused memory locations with known all-zeros or all-ones content in a manner that does not interfere with regular read and write accesses. DATACON overwrites unknown content only when it is absolutely necessary to do so. We evaluate DATACON with workloads from state-of-the-art machine learning applications, SPEC CPU2017, and NAS Parallel Benchmarks. 
Results demonstrate that DATACON significantly improves system performance and memory system energy consumption compared to the best of performance-oriented state-of-the-art techniques. Comment: 18 pages, 21 figures, accepted at ACM SIGPLAN International Symposium on Memory Management (ISMM)
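The first DATACON step, estimating which known content is cheaper to overwrite, can be sketched as follows. The abstract says the estimate considers the number of set bits and the SET/RESET trade-offs but gives no formula, so the linear per-bit cost model, the cost parameters, and the function name here are all assumptions made for illustration.

```python
def best_overwrite_target(data: bytes, set_cost: float, reset_cost: float) -> str:
    """Choose whether writing `data` is cheaper over all-zeros or all-ones content.

    Assumed cost model: over all-zeros content only the 1 bits need a SET
    operation; over all-ones content only the 0 bits need a RESET operation.
    """
    ones = sum(bin(byte).count("1") for byte in data)
    zeros = 8 * len(data) - ones
    cost_over_zeros = ones * set_cost
    cost_over_ones = zeros * reset_cost
    return "all-zeros" if cost_over_zeros <= cost_over_ones else "all-ones"


# Hypothetical relative costs: a SET is taken to cost twice as much as a RESET.
sparse_choice = best_overwrite_target(b"\x01\x00", set_cost=2.0, reset_cost=1.0)
dense_choice = best_overwrite_target(b"\xff\xfe", set_cost=2.0, reset_cost=1.0)
```

Under this model, data with few set bits is steered to all-zeros locations and data with many set bits to all-ones locations, so each write programs cells in only one direction.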

arXiv.org e-Print Archive

Crossref

Metamaterial Polarization Converter Analysis: Limits of Performance

Author: A. Drezet
A. Pors
A. Roberts
A.C. Strikwerda
A.C. Strikwerda
Andrei Andryieuski
Andrei V. Lavrinenko
C. Sabah
D. Molter
D.-H. Kwon
Dmitry L. Markovich
F. Wang
J. Hao
J. Hao
J.B. Masson
J.K. Gansel
J.Y. Chin
J.Y. Chin
M. Mutlu
M. Tonouchi
Maksim Zalkovskij
P. Weis
P.U. Jepsen
R. Singh
Radu Malureanu
S.C Saha
S.X. Li
T. Arikawa
T. Kleine-Ostmann
T. Li
W. Sun
X.G. Peralta
Y. Zhao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/09/2012
Field of study

In this paper we analyze the theoretical limits of a metamaterial converter that allows for linear-to-elliptical polarization transformation with any desired ellipticity and ellipse orientation. We employ the transmission line approach providing a needed level of the design generalization. Our analysis reveals that the maximal conversion efficiency for transmission through a single metamaterial layer is 50%, while the realistic reflection configuration can give the conversion efficiency up to 90%. We show that a double layer transmission converter and a single layer with a ground plane can have 100% polarization conversion efficiency. We tested our conclusions numerically reaching the designated limits of efficiency using a simple metamaterial design. Our general analysis provides useful guidelines for the metamaterial polarization converter design for virtually any frequency range of the electromagnetic waves. Comment: 10 pages, 11 figures, 2 tables

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology