WMTrace : a lightweight memory allocation tracker and analysis framework
The diverging gap between processor and memory performance has been a well-discussed aspect of computer architecture literature for some years. The use of multi-core processor designs has, however, brought new problems to the design of memory architectures: increased core density without matched improvement in memory capacity is reducing the available memory per parallel process, and multiple cores accessing memory simultaneously degrade performance as a result of resource contention for memory channels and physical DIMMs. These issues combine to ensure that memory remains an ongoing challenge in the design of parallel algorithms which scale. In this paper we present WMTrace, a lightweight tool to trace and analyse memory allocation events in parallel applications. The tool dynamically links to pre-existing application binaries, requiring no source code modification or recompilation. A post-execution analysis stage enables in-depth analysis of traces, allowing memory allocations to be analysed by time, size or function. The second half of this paper features a case study in which we apply WMTrace to five parallel scientific applications and benchmarks, demonstrating its effectiveness at recording high-water-mark memory consumption as well as per-function memory use over time. An in-depth analysis is provided for an unstructured mesh benchmark, which reveals significant memory allocation imbalance across its participating processes.
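The post-execution analysis described above can be illustrated with a minimal sketch: replaying a trace of allocation and free events to recover the high-water mark and the live memory attributable to each function. The event format, field names and function names below are assumptions for illustration only, not WMTrace's actual trace format.

```python
from collections import defaultdict

def analyse_trace(events):
    """Replay (op, func, addr, size) allocation events and report the
    high-water mark plus live bytes attributed to each function.

    Hypothetical event tuples: op is "alloc" or "free"; func is the
    allocating function (ignored for frees); addr identifies the block.
    """
    live = {}                    # addr -> (func, size) for outstanding allocations
    per_func = defaultdict(int)  # live bytes currently attributed to each function
    current = high_water = 0
    for op, func, addr, size in events:
        if op == "alloc":
            live[addr] = (func, size)
            per_func[func] += size
            current += size
            high_water = max(high_water, current)
        elif op == "free" and addr in live:
            f, s = live.pop(addr)
            per_func[f] -= s
            current -= s
    return high_water, dict(per_func)
```

Because the trace retains timestamps and call sites in the real tool, the same replay loop could equally bucket allocations by time window or filter by function, matching the by-time/size/function analyses the abstract describes.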
Drilling Down I/O Bottlenecks with Cross-layer I/O Profile Exploration
I/O performance monitoring tools such as Darshan and Recorder collect I/O-related metrics on production systems and help users understand application behavior. However, some gaps prevent end-users from seeing the whole picture when it comes to detecting, and drilling down to, the root causes of I/O performance slowdowns and where those problems originate. These gaps arise from limitations in the available metrics, their collection strategy, and the lack of translation into actionable items that could advise on optimizations. This paper highlights such gaps and proposes solutions that drill down to the source-code level to pinpoint the root causes of the I/O bottlenecks scientific applications face, relying on cross-layer analysis that combines multiple performance metrics related to the I/O software layers. We demonstrate with two real applications how metrics collected in high-level libraries (which are closer to the data models used by an application), enhanced by source-code insights and natural-language translations, can help streamline the understanding of I/O behavior and provide guidance to end-users, developers, and supercomputing facilities on how to improve I/O performance. Using this cross-layer analysis and the heuristic recommendations, we attained up to a 6.9× speedup over run-as-is executions.
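The cross-layer attribution idea can be sketched as a simple aggregation: given timing records collected at several I/O software layers, rank (layer, call site) pairs by total time so that users are pointed back at the source locations dominating I/O cost. The record format and layer names here are hypothetical; the paper's actual analysis combines far richer metrics than wall time alone.

```python
from collections import defaultdict

def attribute_bottlenecks(records):
    """Aggregate per-call I/O timing records from several software layers
    (e.g. a high-level library vs. POSIX) and rank (layer, call_site)
    pairs by total seconds spent, most expensive first."""
    totals = defaultdict(float)
    for layer, call_site, seconds in records:
        totals[(layer, call_site)] += seconds
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

Ranking at the (layer, call site) granularity, rather than per layer alone, is what lets such an analysis translate a slowdown into an actionable pointer at source code.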
On the Efficacy of Live DDoS Detection with Hadoop
Distributed Denial of Service (DDoS) flooding attacks are one of the biggest challenges to the availability of online services today. These attacks overwhelm the victim with a huge volume of traffic, rendering it incapable of performing normal communication or crashing it completely. If there are delays in detecting a flooding attack, little can be done except to manually disconnect the victim and fix the problem. With the rapid increase in DDoS volume and frequency, current DDoS detection technologies struggle to handle huge attack volumes within a reasonable and affordable response time.

In this paper, we propose HADEC, a Hadoop-based live DDoS detection framework that enables efficient analysis of flooding attacks by harnessing MapReduce and HDFS. We implemented a counter-based DDoS detection algorithm for four major flooding attacks (TCP-SYN, HTTP GET, UDP and ICMP) in MapReduce, consisting of map and reduce functions. We deployed a testbed to evaluate the performance of the HADEC framework for live DDoS detection. Our experiments show that HADEC is capable of processing and detecting DDoS attacks in affordable time.
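A counter-based detection algorithm expressed as map and reduce functions might look like the sketch below, here simulated in plain Python rather than running on Hadoop. The log-line format, the threshold value and the flagging rule are assumptions for illustration; the abstract does not specify HADEC's actual parsing logic or thresholds.

```python
from collections import defaultdict

# Hypothetical per-source packet threshold; HADEC's real thresholds
# are not given in the abstract.
THRESHOLD = 1000

def map_packets(log_lines):
    """Map phase: emit (source_ip, 1) for each packet whose type is one
    of the four monitored flooding-attack types."""
    for line in log_lines:
        src, pkt_type = line.split()[:2]
        if pkt_type in {"TCP-SYN", "HTTP-GET", "UDP", "ICMP"}:
            yield src, 1

def reduce_counts(pairs):
    """Reduce phase: sum counts per source IP and flag sources whose
    packet count exceeds the threshold as suspected attackers."""
    counts = defaultdict(int)
    for src, n in pairs:
        counts[src] += n
    return {src: c for src, c in counts.items() if c > THRESHOLD}
```

On a real Hadoop deployment the same map and reduce bodies would be split across workers, with HDFS holding the captured traffic logs; counting per source is embarrassingly parallel, which is what makes the MapReduce formulation attractive for high attack volumes.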
GAPP: A Fast Profiler for Detecting Serialization Bottlenecks in Parallel Linux Applications
We present a parallel profiling tool, GAPP, that identifies serialization bottlenecks in parallel Linux applications arising from load imbalance or contention for shared resources. It works by tracing kernel context-switch events using kernel probes managed by the extended Berkeley Packet Filter (eBPF) framework. The overhead is thus extremely low (an average 4% run-time overhead for the applications explored), the tool requires no program instrumentation, and it works for a variety of serialization bottlenecks. We evaluate GAPP using the Parsec 3.0 benchmark suite and two large open-source projects: MySQL and Nektar++ (a spectral/hp element framework). We show that GAPP is able to reveal a wide range of bottleneck-related performance issues, for example those arising from synchronization primitives, busy-wait loops, memory operations, thread imbalance and resource contention.
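To illustrate what a context-switch trace makes possible, the sketch below performs a much simpler analysis than GAPP's actual bottleneck criterion (which the abstract does not detail): it accumulates per-thread off-CPU time from deschedule/resume events, so that threads with large blocked intervals stand out as candidates for serialization bottlenecks. The event format is an assumption; in practice such events would come from eBPF probes on the kernel's context-switch path.

```python
def off_cpu_time(switch_events):
    """Given context-switch events (timestamp, tid, state), where state is
    'off' when a thread is descheduled and 'on' when it resumes, return
    the total blocked (off-CPU) time per thread id."""
    off_since = {}  # tid -> timestamp of last deschedule
    blocked = {}    # tid -> accumulated off-CPU seconds
    for ts, tid, state in switch_events:
        if state == "off":
            off_since[tid] = ts
        elif state == "on" and tid in off_since:
            blocked[tid] = blocked.get(tid, 0.0) + ts - off_since.pop(tid)
    return blocked
```

Collecting only switch events, rather than sampling or instrumenting every synchronization call, is what keeps the overhead of this style of profiling low.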