TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments
Deep neural networks (DNNs) have become core computation components within
low latency Function as a Service (FaaS) prediction pipelines, including image
recognition, object detection, natural language processing, speech synthesis,
and personalized recommendation pipelines. Cloud computing, as the de-facto
backbone of modern computing infrastructure for both enterprise and consumer
applications, has to be able to handle user-defined pipelines of diverse DNN
inference workloads while maintaining isolation and latency guarantees, and
minimizing resource waste. The current solution for guaranteeing isolation
within FaaS is suboptimal -- suffering from "cold start" latency. A major cause
of such inefficiency is the need to move large amounts of model data within and
across servers. We propose TrIMS as a novel solution to address these issues.
Our proposed solution consists of a persistent model store across the GPU, CPU,
local storage, and cloud storage hierarchy, an efficient resource management
layer that provides isolation, and a succinct set of application APIs and
container technologies for easy and transparent integration with FaaS, Deep
Learning (DL) frameworks, and user code. We demonstrate our solution by
interfacing TrIMS with the Apache MXNet framework and demonstrate up to 24x
speedup in latency for image classification models and up to 210x speedup for
large models. We achieve up to 8x system throughput improvement. Comment: In Proceedings CLOUD 201
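To make the storage-hierarchy idea concrete, the following is a minimal sketch of a tiered lookup in which a model is searched for in the fastest tier first and copied into faster tiers once found. It is an illustration only, not the TrIMS implementation: the class `TieredModelStore`, its tier names, and its `load` method are hypothetical, and the sketch ignores eviction, isolation, and cross-process sharing.

```python
class TieredModelStore:
    """Hypothetical multi-tier model cache (illustrative, not the TrIMS API)."""

    # Tiers ordered from fastest to slowest, mirroring the hierarchy in the abstract.
    TIERS = ("gpu", "cpu", "local_storage", "cloud_storage")

    def __init__(self, cloud_models):
        # Every tier starts empty except the slowest one, which holds all models.
        self.tiers = {name: {} for name in self.TIERS}
        self.tiers["cloud_storage"] = dict(cloud_models)

    def load(self, model_id):
        """Return (weights, tier_name) for the fastest tier that holds the model."""
        for depth, name in enumerate(self.TIERS):
            if model_id in self.tiers[name]:
                weights = self.tiers[name][model_id]
                # Promote the model into every faster tier so that the next
                # request is served without touching the slower tiers again.
                for faster in self.TIERS[:depth]:
                    self.tiers[faster][model_id] = weights
                return weights, name
        raise KeyError(model_id)


store = TieredModelStore({"resnet50": b"weights"})
first = store.load("resnet50")   # served from the slowest tier ("cold")
second = store.load("resnet50")  # served from the fastest tier ("warm")
```

In this sketch the first request pays for the slow path and later requests are served from the fastest tier, which is the kind of "cold start" cost the abstract attributes to moving model data.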
Optically-Nonactive Assorted Helices Array with Interchangeable Magnetic/Electric Resonance
We report here the design of an optically-nonactive metamaterial by
assembling metallic helices with different chirality. With linearly polarized
incident light, pure electric or magnetic resonance can be selectively
realized, which leads to negative permittivity or negative permeability
accordingly. Further, we show that pure electric or magnetic resonance can be
interchanged at the same frequency band by merely rotating the polarization of
the incident light by 90 degrees. This design demonstrates a unique approach to
constructing metamaterials. Comment: 15 pages, 4 figure
Mitochondrial amyloid-beta peptide: Pathogenesis or late-phase development?
This is the publisher's version, also available electronically from http://iospress.metapress.com/content/8q4cf2u7gw6cllxt/?genre=article&issn=1387-2877&volume=9&issue=2&spage=127
Mitochondrial and metabolic dysfunction have been linked to Alzheimer's disease for some time. Key questions regarding this association concern the nature and mechanisms of mitochondrial dysfunction, and whether such changes in metabolic properties are pathogenic or secondary, with respect to neuronal degeneration. In terms of mitochondria and Alzheimer's, altered function could reflect intrinsic properties of this organelle, potentially due to mutations in mitochondrial DNA, or extrinsic changes secondary to signal transduction mechanisms activated in the cytosol. This review presents data relevant to these questions, and considers the implication of recent findings demonstrating the presence of amyloid-β peptide in mitochondria, as well as intra-mitochondrial molecular targets with which it can interact. Regardless of the underlying mechanism(s), it is likely that mitochondrial dysfunction contributes to oxidant stress which is commonly observed in brains of patients with Alzheimer's and transgenic models of Alzheimer's-like pathology.
Accelerating Reduction and Scan Using Tensor Core Units
Driven by deep learning, there has been a surge of specialized processors for
matrix multiplication, referred to as Tensor Core Units (TCUs). These TCUs are
capable of performing matrix multiplications on small matrices (usually 4x4 or
16x16) to accelerate the convolutional and recurrent neural networks in deep
learning workloads. In this paper we leverage NVIDIA's TCU to express both
reduction and scan with matrix multiplication and show the benefits -- in terms
of program simplicity, efficiency, and performance. Our algorithm exercises the
NVIDIA TCUs which would otherwise be idle, achieves 89%-98% of peak memory copy
bandwidth, and is orders of magnitude faster (up to 100x for reduction and 3x
for scan) than state-of-the-art methods for small segment sizes -- common in
machine learning and scientific applications. Our algorithm achieves this while
decreasing the power consumption by up to 22% for reduction and 16% for scan. Comment: In Proceedings of the ACM International Conference on Supercomputing
(ICS '19
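The idea of expressing reduction and scan as matrix multiplication can be illustrated with a small sketch. It shows only the underlying algebra, assuming each segment is laid out as one matrix row: multiplying by a column of ones sums each row, and multiplying by an upper-triangular matrix of ones yields each row's inclusive prefix sums. The function names are hypothetical, the plain-Python `matmul` merely stands in for a hardware matrix multiply, and this is not the paper's CUDA algorithm.

```python
def matmul(a, b):
    """Dense matrix product on lists of lists; stands in for a TCU multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def to_rows(values, seg):
    """Lay out a flat list as a matrix with one segment of length `seg` per row."""
    assert len(values) % seg == 0, "length must be a multiple of the segment size"
    return [values[i:i + seg] for i in range(0, len(values), seg)]


def segmented_reduce(values, seg):
    """Sum each segment: A @ ones, where ones is a (seg x 1) column of ones."""
    ones = [[1] for _ in range(seg)]
    return [row[0] for row in matmul(to_rows(values, seg), ones)]


def segmented_scan(values, seg):
    """Inclusive prefix sum per segment: A @ U with U[i][j] = 1 when i <= j."""
    upper = [[1 if i <= j else 0 for j in range(seg)] for i in range(seg)]
    return [x for row in matmul(to_rows(values, seg), upper) for x in row]


data = [1, 2, 3, 4, 5, 6, 7, 8]
sums = segmented_reduce(data, 4)   # [10, 26]
scans = segmented_scan(data, 4)    # [1, 3, 6, 10, 5, 11, 18, 26]
```

Actual tensor core hardware multiplies fixed-size tiles, so a practical version would need to tile and pad the data accordingly; that detail is omitted here.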
Chiral plasmonics and enhanced chiral light-matter interactions
Chirality, which describes the broken mirror symmetry in geometric structures, exists macroscopically in our daily life as well as microscopically down to molecular levels. Correspondingly, chiral molecules interact differently with circularly polarized light exhibiting opposite handedness (left-handed and right-handed). However, the interaction between chiral molecules and chiral light is very weak. In contrast, artificial chiral plasmonic structures can generate “super-chiral” plasmonic near-field, leading to enhanced chiral light-matter (or chiroptical) interactions. The “super-chiral” near-field presents different amplitude and phase under opposite handedness incidence, which can be utilized to engineer linear and nonlinear chiroptical interactions. Specifically, in the interaction between quantum emitters and chiral plasmonic structures, the chiral hot spots can favour the emission with a specific handedness. This article reviews the state-of-the-art research on the design, fabrication and chiroptical response of different chiral plasmonic nanostructures or metasurfaces. This review also discusses enhanced chiral light-matter interactions that are essential for applications like chirality sensing, chiral selective light emitting and harvesting. In the final part, the review ends with a perspective on future directions of chiral plasmonics.
SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration
Point cloud registration, a fundamental task in 3D computer vision, has
remained largely unexplored in cross-source point clouds and unstructured
scenes. The primary challenges arise from noise, outliers, and variations in
scale and density. However, current methods neglect the geometric nature of
point clouds, which restricts their performance. In this paper, we propose a novel
method termed SPEAL to leverage skeletal representations for effective learning
of intrinsic topologies of point clouds, facilitating robust capture of
geometric intricacy. Specifically, we design the Skeleton Extraction Module to
extract skeleton points and skeletal features in an unsupervised manner, which
is inherently robust to noise and density variances. Then, we propose the
Skeleton-Aware GeoTransformer to encode high-level skeleton-aware features. It
explicitly captures the topological nature and inter-point-cloud skeletal
correlations with the noise-robust and density-invariant skeletal
representations. Next, we introduce the Correspondence Dual-Sampler to
facilitate correspondences by augmenting the correspondence set with skeletal
correspondences. Furthermore, we construct a challenging novel large-scale
cross-source point cloud dataset named KITTI CrossSource for benchmarking
cross-source point cloud registration methods. Extensive quantitative and
qualitative experiments are conducted to demonstrate our approach's superiority
and robustness on both cross-source and same-source datasets. To the best of
our knowledge, our approach is the first to facilitate point cloud registration
with skeletal geometric priors. Comment: Accepted by AAAI202