20 research outputs found

CS 172.02: Introduction to Computer Modeling

Author: Demme John
Publication venue: ScholarWorks at University of Montana
Publication date: 01/09/2002

University of Montana

Overcoming the Intuition Wall: Measurement and Analysis in Computer Architecture

Author: Demme John David
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2014

These are exciting times for computer architecture research. Today there is significant demand to improve the performance and energy efficiency of emerging, transformative applications, which are being hammered out by the hundreds for new computing platforms and usage models. This booming growth of applications, and the variety of programming languages used to create them, is challenging our ability as architects to rapidly and rigorously characterize these applications. Concurrently, hardware has become more complex with the emergence of accelerators, multicore systems, and heterogeneity caused by further divergence between processor market segments. No one architect can now understand all the complexities of many systems and reason about the full impact of changes or new applications. To that end, this dissertation presents four case studies in quantitative methods. Each case study attacks a different application and proposes a new measurement or analytical technique. In each case study we find at least one surprising or unintuitive result that would likely not have been found without the application of our method.

Columbia University Academic Commons

Anti-Virus in Silicon

Author: Demme John David
Sethumadhavan Simha
Stolfo Salvatore
Tang Beng Chiew
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2015

Anti-virus (AV) software is fundamentally broken. AV systems today rely on correct functioning of not only the AV software but also the underlying OS and VMM. Thus proper functioning of software AV requires millions of lines of complex code – which houses thousands of bugs – to work correctly. Needless to say, and as evidenced in numerous software AV attacks, effective software AV systems have been difficult to build. At the same time, malware incidents are increasing and there is strong demand for good anti-virus solutions; the software anti-virus market is estimated at close to 8B dollars annually. In this work we present a new class of robust AV systems called Silicon anti-virus systems. Unlike software AV systems, these systems are lean and mostly implemented in hardware to avoid reliance on complex software, but, like software AV systems, are updatable in the field when new malware is encountered. We describe the first generation of silicon AV that uses simple machine learning techniques with existing performance counter infrastructure. Our published and unpublished work shows that common malware such as viruses and adware, and even zero-day exploits, can be detected accurately. These systems form a very effective, energy-efficient first line of defense against malware.

Columbia University Academic Commons

On the feasibility of online malware detection with performance counters

Author: Adam Waksman
Adrian Tang
Jared Schmitz
John Demme
Matthew Maycock
Salvatore Stolfo
Simha Sethumadhavan
Stone-Gross B.
Xia Y.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services

Author: Burger Doug
Caulfield Adrian M.
Chiou Derek
Chung Eric S.
Constantinides Kypros
Demme John
Esmaeilzadeh Hadi
Fowers Jeremy
Gopal Gopi Prashanth
Gray Jan
Haselman Michael
Hauck Scott
Heil Stephen
Hormati Amir
Kim Joo-Young
Lanka Sitaram
Larus James
Peterson Eric
Pope Simon
Putnam Andrew
Smith Aaron
Thong Jason
Xiao Phillip Yi
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/01/2017

Datacenter workloads demand high computational capabilities, flexibility, power efficiency, and low cost. It is challenging to improve all of these factors simultaneously. To advance datacenter capabilities beyond what commodity server designs can provide, we designed and built a composable, reconfigurable hardware fabric based on field-programmable gate arrays (FPGAs). Each server in the fabric contains one FPGA, and all FPGAs within a 48-server rack are interconnected over a low-latency, high-bandwidth network. We describe a medium-scale deployment of this fabric on a bed of 1632 servers, and measure its effectiveness in accelerating the ranking component of the Bing web search engine. We describe the requirements and architecture of the system, detail the critical engineering challenges and solutions needed to make the system robust in the presence of failures, and measure the performance, power, and resilience of the system. Under high load, the large-scale reconfigurable fabric improves the ranking throughput of each server by 95% at a desirable latency distribution or reduces tail latency by 29% at a fixed throughput. In other words, the reconfigurable fabric enables the same throughput using only half the number of servers.

Infoscience - École polytechnique fédérale de Lausanne

Rapid identification of architectural bottlenecks via precise event counting

Author: John Demme
Simha Sethumadhavan
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

ELite: Cost-effective approximation of exploration-based graph analysis

Author: Condie Tyson
Demme John
Gonzalez Joseph E.
Wang Kai
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/06/2020

Vertex-centric block synchronous processing systems, exemplified by Pregel and Giraph, have received extensive attention for graph processing. These systems allow programmers to think only about operations that take place at one vertex and provide the underlying computation framework that involves multiple iterations (supersteps) with communication between neighboring vertices between supersteps. As graphs grow in size to billions of vertices and trillions of edges, processing them in this model faces challenges: (1) the poor latency of supersteps dominated by the tasks performed on high-degree vertices or densely connected components; and (2) the overwhelming network communication among vertices, which can be shown to be highly redundant. For many applications, approximate results are acceptable, and if these can be computed rapidly, they may be preferable. Many of the existing approximate solutions suffer from algorithm-specific designs that are not generic or lack theoretical guarantees on the results' quality. In this paper we tackle this problem using a generic approach that can be incorporated into the graph processing platform. The approach we advocate involves communicating vertex states to a subset of the neighbors at each superstep; this is called selective edge lookup. We show how this approach can be incorporated into two primitive graph operators, BFS and DFS, which can be the basis of many graph analysis workloads. Extensive experiments over real-world and synthetic graphs validate the effectiveness and efficiency of the selective edge lookup approach.

University of Memphis Digital Commons

Crossref
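
The selective edge lookup idea in the ELite abstract above (each vertex communicates its state to only a subset of its neighbors per superstep) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how the subset is chosen, so the uniform random sampling, the `fraction` parameter, and the function name `selective_bfs` are all assumptions made here for illustration.

```python
import math
import random


def selective_bfs(graph, source, fraction=0.5, seed=0):
    """Approximate BFS over an adjacency-list dict.

    In each superstep, every frontier vertex sends its distance to a
    sampled subset of its neighbors instead of all of them.
    ASSUMPTION: the subset is a uniform random sample of size
    ceil(fraction * degree); the real selection policy is not given
    in the abstract.
    """
    rng = random.Random(seed)
    dist = {source: 0}
    frontier = [source]
    while frontier:  # one loop iteration corresponds to one superstep
        next_frontier = []
        for v in frontier:
            neighbors = graph.get(v, [])
            if not neighbors:
                continue
            k = min(len(neighbors), max(1, math.ceil(fraction * len(neighbors))))
            for u in rng.sample(neighbors, k):
                if u not in dist:
                    dist[u] = dist[v] + 1
                    next_frontier.append(u)
        frontier = next_frontier
    return dist


# Small hypothetical graph for demonstration.
example_graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
exact = selective_bfs(example_graph, 0, fraction=1.0)
approx = selective_bfs(example_graph, 0, fraction=0.5)
```

With `fraction=1.0` every edge is examined, so the sketch reduces to ordinary level-synchronous BFS. With a smaller fraction, fewer messages are sent per superstep, at the cost that some vertices may be missed or assigned a distance larger than the true one.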