3,492 research outputs found
Performance and Optimization Abstractions for Large Scale Heterogeneous Systems in the Cactus/Chemora Framework
We describe a set of lower-level abstractions to improve performance on
modern large scale heterogeneous systems. These provide portable access to
system- and hardware-dependent features, automatically apply dynamic
optimizations at run time, and target stencil-based codes used in finite
differencing, finite volume, or block-structured adaptive mesh refinement
codes.
These abstractions include a novel data structure to manage refinement
information for block-structured adaptive mesh refinement, an iterator
mechanism to efficiently traverse multi-dimensional arrays in stencil-based
codes, and a portable API and implementation for explicit SIMD vectorization.
These abstractions can either be employed manually, or be targeted by
automated code generation, or be used via support libraries by compilers during
code generation. The implementations described below are available in the
Cactus framework, and are used e.g. in the Einstein Toolkit for relativistic
astrophysics simulations
Reviewer Integration and Performance Measurement for Malware Detection
We present and evaluate a large-scale malware detection system integrating
machine learning with expert reviewers, treating reviewers as a limited
labeling resource. We demonstrate that even in small numbers, reviewers can
vastly improve the system's ability to keep pace with evolving threats. We
conduct our evaluation on a sample of VirusTotal submissions spanning 2.5 years
and containing 1.1 million binaries with 778GB of raw feature data. Without
reviewer assistance, we achieve 72% detection at a 0.5% false positive rate,
performing comparable to the best vendors on VirusTotal. Given a budget of 80
accurate reviews daily, we improve detection to 89% and are able to detect 42%
of malicious binaries undetected upon initial submission to VirusTotal.
Additionally, we identify a previously unnoticed temporal inconsistency in the
labeling of training datasets. We compare the impact of training labels
obtained at the same time training data is first seen with training labels
obtained months later. We find that using training labels obtained well after
samples appear, and thus unavailable in practice for current training data,
inflates measured detection by almost 20 percentage points. We release our
cluster-based implementation, as well as a list of all hashes in our evaluation
and 3% of our entire dataset.Comment: 20 papers, 11 figures, accepted at the 13th Conference on Detection
of Intrusions and Malware & Vulnerability Assessment (DIMVA 2016
JPEG steganography with particle swarm optimization accelerated by AVX
Digital steganography aims at hiding secret messages in digital data transmitted over insecure channels. The JPEG format is prevalent in digital communication, and images are often used as cover objects in digital steganography. Optimization methods can improve the properties of images with embedded secret but introduce additional computational complexity to their processing. AVX instructions available in modern CPUs are, in this work, used to accelerate data parallel operations that are part of image steganography with advanced optimizations.Web of Science328art. no. e544
Semantic HMC for Big Data Analysis
Analyzing Big Data can help corporations to im-prove their efficiency. In
this work we present a new vision to derive Value from Big Data using a
Semantic Hierarchical Multi-label Classification called Semantic HMC based in a
non-supervised Ontology learning process. We also proposea Semantic HMC
process, using scalable Machine-Learning techniques and Rule-based reasoning
- …