554 research outputs found
Influence of quantum confinement on the ferromagnetism of (Ga,Mn)As diluted magnetic semiconductor
We investigate the effect of quantum confinement on the ferromagnetism of the diluted magnetic semiconductor (Ga,Mn)As using a combination of tight-binding and density functional methods. We observe strong majority-spin Mn-As hybridization, as well as half-metallic behavior, down to sizes as small as 20 \AA in diameter. Below this critical size, the doped holes are self-trapped at the Mn sites, signalling both valence and electronic transitions. Our results imply that magnetically doped III-V nanoparticles provide a medium for manipulating the electronic structure of dilute magnetic semiconductors while preserving their ferromagnetic properties, and even enhancing them in certain size regimes.
Comment: 4 pages, 3 figures
Simultaneous control of nanocrystal size and nanocrystal-nanocrystal separation in CdS nanocrystal assembly
We report an easy, one-pot synthesis of ordered CdS nanocrystal assemblies with varying inter-particle separation, and characterize the particle separation using x-ray diffraction at low and wide angles.
On Achieving Privacy-Preserving State-of-the-Art Edge Intelligence
Deep Neural Network (DNN) inference in Edge Computing, often called Edge Intelligence, requires solutions that ensure the confidentiality of sensitive data and intellectual property is not compromised in the process. Privacy-preserving Edge Intelligence is only emerging, despite the growing prevalence of Edge Computing as a context for Machine-Learning-as-a-Service. Solutions are yet to be applied, and possibly adapted, to state-of-the-art DNNs. This position paper provides an original assessment of the compatibility of existing techniques for privacy-preserving DNN inference with the characteristics of an Edge Computing setup, highlighting the appropriateness of secret sharing in this context. We then address the future role of model compression methods in the research towards secret sharing on DNNs with state-of-the-art performance.
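The secret sharing the abstract highlights can be illustrated with a minimal additive-sharing sketch: a tensor is split into random shares that individually reveal nothing, yet sum back to the original modulo a fixed ring size. This is a generic sketch of the primitive, not the paper's protocol; the choice of modulus and the mapping to fixed-point DNN arithmetic are assumptions here.

```python
import numpy as np

MODULUS = 2**31  # illustrative ring size, not from the paper

def share(x, n=3, modulus=MODULUS):
    # Split an integer tensor x into n additive shares:
    # x == sum(shares) mod modulus, and any n-1 shares are uniformly random.
    shares = [np.random.randint(0, modulus, size=x.shape, dtype=np.int64)
              for _ in range(n - 1)]
    shares.append((x - sum(shares)) % modulus)
    return shares

def reconstruct(shares, modulus=MODULUS):
    # Only the sum of all shares recovers the secret.
    return sum(shares) % modulus

x = np.array([[5, 123], [7, 9]], dtype=np.int64)
s = share(x)
assert np.array_equal(reconstruct(s), x)
```

Because the scheme is linear, parties holding shares of two tensors can add them locally without communication, which is what makes it attractive for distributing DNN inference across edge nodes.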
CPU-GPU Layer-Switched Low Latency CNN Inference
Convolutional Neural Network (CNN) inference on Heterogeneous Multi-Processor System-on-Chips (HMPSoCs) in edge devices represents cutting-edge embedded machine learning. The embedded CPU and GPU within an HMPSoC can both perform inference using CNNs. However, common practice is to run a CNN entirely on whichever HMPSoC component (CPU or GPU) provides the best performance (lowest latency) for that CNN. CNNs are not monolithic; they are composed of several layers of different types. Some of these layers have lower latency on the CPU, while others execute faster on the GPU. In this work, we investigate the reason behind this observation. We also propose a CNN execution that switches between the CPU and GPU at layer granularity, wherein each CNN layer executes on the component that provides it with the lowest latency. Switching back and forth between the CPU and the GPU mid-inference introduces additional overhead (delay) into the inference. Despite this overhead, we show in this work that CPU-GPU layer-switched execution results, on average, in 4.72% lower CNN inference latency on the Khadas VIM 3 board with its Amlogic A311D HMPSoC.
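The layer-switched idea can be sketched as a small dynamic program over per-layer latency profiles: for each layer, track the best achievable total latency ending on the CPU or on the GPU, charging a fixed overhead whenever the schedule switches components. The latency numbers and switch cost below are made-up illustrative values, not measurements from the paper.

```python
# Hypothetical per-layer latencies in ms (illustrative, not measured).
cpu_ms = {"conv1": 4.1, "pool1": 0.6, "conv2": 7.9, "fc": 1.2}
gpu_ms = {"conv1": 2.3, "pool1": 0.9, "conv2": 3.5, "fc": 2.0}
switch_ms = 0.4  # assumed cost of one CPU<->GPU transition

def best_schedule(cpu_ms, gpu_ms, switch_ms):
    # dp[dev] = minimal total latency so far with the last layer on dev
    dp = {"cpu": 0.0, "gpu": 0.0}
    for layer in cpu_ms:
        new = {}
        for dev, cost in (("cpu", cpu_ms[layer]), ("gpu", gpu_ms[layer])):
            other = "gpu" if dev == "cpu" else "cpu"
            # Either stay on the same component or pay the switch overhead.
            new[dev] = min(dp[dev], dp[other] + switch_ms) + cost
        dp = new
    return min(dp.values())
```

On these toy numbers the layer-switched plan (GPU for the convolutions, CPU for pool1 and fc) beats both the all-CPU schedule (13.8 ms) and the all-GPU schedule (8.7 ms) even after paying for two switches, which mirrors the effect the abstract reports.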
PELSI: Power-Efficient Layer-Switched Inference
Convolutional Neural Networks (CNNs) are now quintessential kernels within embedded computer vision applications deployed in edge devices. Heterogeneous Multi-Processor System-on-Chips (HMPSoCs) with Dynamic Voltage and Frequency Scaling (DVFS) capable components (CPUs and GPUs) allow for low-latency, low-power CNN inference on resource-constrained edge devices when employed efficiently. CNNs comprise several heterogeneous layer types that execute with different degrees of power efficiency on different HMPSoC components at different frequencies. We propose the first framework, PELSI, that exploits this layer-wise power efficiency heterogeneity for power-efficient CPU-GPU layer-switched CNN inference on HMPSoCs. PELSI executes each layer of a CNN on an HMPSoC component (CPU or GPU) clocked at just the right frequency for that layer, such that the CNN meets its inference latency target with minimal power consumption while still accounting for the power-performance overhead of switching multiple times between the CPU and GPU mid-inference. PELSI incorporates a Genetic Algorithm (GA) to identify the near-optimal CPU-GPU layer-switched CNN inference configuration, from within the exponentially large design space, that meets the given latency requirement most power-efficiently. We evaluate PELSI on the Rock-Pi embedded platform. The platform contains an RK3399Pro HMPSoC with DVFS-capable CPU clusters and GPU. Empirical evaluations with five different CNNs show a 44.48% improvement in power efficiency for CNN inference under PELSI over the state-of-the-art.
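A GA search over per-layer (component, frequency) assignments, as described for PELSI, can be sketched as follows. Everything here is a toy stand-in: the cost model, switch overhead, latency target, and GA hyperparameters are invented for illustration and only mirror the shape of the search, not PELSI itself.

```python
import random

random.seed(0)

LAYERS = 4
DEVICES = ("cpu", "gpu")
FREQS = (0, 1, 2)  # low, mid, high DVFS levels (hypothetical)
SWITCH_MS = 0.4
LATENCY_TARGET_MS = 30.0

def cost(layer, dev, freq):
    # Toy model: higher frequency lowers latency but raises power superlinearly.
    base = 5.0 if dev == "cpu" else 3.0
    lat = base * (LAYERS - layer % 2) / (1 + freq)
    pwr = (200 if dev == "cpu" else 350) * (1 + freq) ** 2
    return lat, pwr

def evaluate(cfg):
    # cfg: list of (device, freq) per layer -> (total latency, energy proxy)
    lat = energy = 0.0
    prev = None
    for i, (dev, freq) in enumerate(cfg):
        l, p = cost(i, dev, freq)
        lat += l + (SWITCH_MS if prev is not None and dev != prev else 0.0)
        energy += p * l  # energy proxy: power * time
        prev = dev
    return lat, energy

def fitness(cfg):
    lat, energy = evaluate(cfg)
    return energy + (1e6 if lat > LATENCY_TARGET_MS else 0.0)  # penalize misses

def random_cfg():
    return [(random.choice(DEVICES), random.choice(FREQS)) for _ in range(LAYERS)]

def ga(pop=30, gens=40):
    population = [random_cfg() for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness)
        elite = population[: pop // 4]          # keep the fittest quarter
        children = []
        while len(elite) + len(children) < pop:
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, LAYERS)
            child = a[:cut] + b[cut:]           # one-point crossover
            if random.random() < 0.3:           # point mutation
                i = random.randrange(LAYERS)
                child[i] = (random.choice(DEVICES), random.choice(FREQS))
            children.append(child)
        population = elite + children
    return min(population, key=fitness)

best = ga()
```

The latency-violation penalty in `fitness` is one simple way to fold the hard latency constraint into a single GA objective; the real framework's encoding and operators may differ.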
- …