Magnetophotoluminescence of negatively charged excitons in narrow quantum wells
We present the results of photoluminescence experiments on the negatively charged exciton X- in GaAs/AlxGa1-xAs quantum wells (QWs) in high magnetic fields (≤50 T). Three different QW widths are used here: 100, 120, and 150 Å. All optically allowed transitions of X- are observed, enabling us to experimentally verify its energy-level diagram. All samples behave consistently with this diagram. We have determined the binding energy Eb of the singlet and triplet states of X- between 23 and 50 T for the 120 and 150 Å QWs, while only the triplet Eb is observed for the 100 Å QW. A detailed comparison with recent theoretical calculations shows agreement for all samples across this entire field range
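The binding energies quoted in these abstracts follow the usual convention for trions; as a sketch of that convention (this definition is standard textbook material, not taken from the abstract itself):

```latex
% Standard definition of the trion binding energy: the energy required to
% dissociate X^- into a neutral exciton plus a free electron,
\[
  E_b(X^-) = E(X) + E(e^-) - E(X^-),
\]
% where E(X) is the neutral-exciton energy and E(e^-) the energy of a free
% electron in the lowest subband (lowest Landau level at finite field).
```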
Magnetic-field dependence of the spin states of the negatively charged exciton in GaAs quantum wells
We present high-field (<50 T) photoluminescence measurements of the binding energy of the singlet and triplet states of the negatively charged exciton in a 200 Å quantum well. Comparing our data with those of other groups and with theoretical predictions, we clearly show how the singlet, "bright" and "dark" triplet states may be identified according to the high-field dependence of their binding energies. We demonstrate that a very consistent behavior of the binding energy in a magnetic field has been observed in quantum wells of different widths by different groups, and conclude that the triplet state found in this, as well as nearly all other experiments, is undoubtedly the bright triplet. By combining our data with that in the literature we are able to present the generic form of the binding energy of the spin states of the charged exciton in a magnetic field, which reveals the predicted singlet to dark-triplet ground-state transition at about 20 T
SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks
Going deeper and wider in neural architectures improves accuracy, while the limited GPU DRAM places an undesired restriction on the network design domain. Deep Learning (DL) practitioners either need to change to less desirable network architectures, or nontrivially dissect a network across multiple GPUs. Both options distract DL practitioners from concentrating on their original machine learning tasks. We present SuperNeurons: a dynamic GPU memory scheduling runtime that enables network training far beyond the GPU DRAM capacity. SuperNeurons features three memory optimizations, Liveness Analysis, Unified Tensor Pool, and Cost-Aware Recomputation, which together reduce the network-wide peak memory usage down to the maximal memory usage among layers. We also address the performance issues in these memory-saving techniques. Given the limited GPU DRAM, SuperNeurons not only provisions the necessary memory for training, but also dynamically allocates memory for convolution workspaces to achieve high performance. Evaluations against Caffe, Torch, MXNet and TensorFlow demonstrate that SuperNeurons trains networks at least 3.2432× deeper than current ones with leading performance. In particular, SuperNeurons can train ResNet2500, which has 10^4 basic network layers, on a 12 GB K40c.

Comment: PPoPP '2018: 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
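A much-simplified sketch of the idea behind recomputation-based memory saving, in the spirit of the abstract's Cost-Aware Recomputation but not the SuperNeurons implementation itself: keep only every k-th activation during the forward pass and re-materialize the rest during backward, trading compute for memory. All numbers below are illustrative.

```python
import math

# Hedged sketch (not the SuperNeurons runtime): checkpointing a linear chain
# of n layers with unit-size activations. Keeping only every k-th activation
# cuts peak activation memory from O(n) to O(n/k + k), minimized near
# k = sqrt(n).

def peak_activations(n_layers, k):
    """Checkpoints kept in DRAM plus one re-materialized segment of size k."""
    checkpoints = math.ceil(n_layers / k)
    return checkpoints + k

n = 100
naive = n  # keep every forward activation until its backward pass
best_k = min(range(1, n + 1), key=lambda k: peak_activations(n, k))
print(naive, best_k, peak_activations(n, best_k))  # peak drops 100 -> 20
```

The same trade-off underlies why peak usage can approach the per-layer maximum rather than the network-wide sum.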
Rethinking the Inception Architecture for Computer Vision
Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we explore ways to scale up networks that aim to utilize the added computation as efficiently as possible through suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set and demonstrate substantial gains over the state of the art: 21.2% top-1 and 5.6% top-5 error for single-frame evaluation using a network with a computational cost of 5 billion multiply-adds per inference and fewer than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3.5% top-5 error and 17.3% top-1 error
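The factorized convolutions the abstract mentions can be illustrated with a simple parameter count (channel counts below are generic, not taken from the paper's tables): replacing a 5×5 convolution with two stacked 3×3 convolutions, or a 3×3 with a 1×3 followed by a 3×1, preserves the receptive field while shrinking parameters and compute.

```python
# Hedged illustration of convolution factorization; c = 192 is an assumed
# channel count for the example, not a quoted architecture detail.

def conv_params(k_h, k_w, c_in, c_out):
    return k_h * k_w * c_in * c_out  # weights only, biases ignored

c = 192
full_5x5 = conv_params(5, 5, c, c)
two_3x3 = 2 * conv_params(3, 3, c, c)                         # same 5x5 receptive field
asym_3x3 = conv_params(1, 3, c, c) + conv_params(3, 1, c, c)  # replaces one 3x3

print(two_3x3 / full_5x5)                    # 18/25 = 0.72
print(asym_3x3 / conv_params(3, 3, c, c))    # 6/9  ~ 0.67
```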
Classification of crystallization outcomes using deep convolutional neural networks
The Machine Recognition of Crystallization Outcomes (MARCO) initiative has assembled roughly half a million annotated images of macromolecular crystallization experiments from various sources and setups. Here, state-of-the-art machine learning algorithms are trained and tested on different parts of this data set. We find that more than 94% of the test images can be correctly labeled, irrespective of their experimental origin. Because crystal recognition is key to high-density screening and the systematic analysis of crystallization experiments, this approach opens the door to both industrial and fundamental research applications
DeepMon: Mobile GPU-based deep learning framework for continuous vision applications
The rapid emergence of head-mounted devices such as the Microsoft HoloLens enables a wide variety of continuous vision applications. Such applications often adopt deep-learning algorithms such as CNNs and RNNs to extract rich contextual information from first-person-view video streams. Despite the high accuracy, the use of deep learning algorithms on mobile devices raises critical challenges, i.e., high processing latency and power consumption. In this paper, we propose DeepMon, a mobile deep learning inference system that runs a variety of deep learning inferences purely on a mobile device in a fast and energy-efficient manner. For this, we designed a suite of optimization techniques to efficiently offload convolutional layers to mobile GPUs and accelerate the processing; note that the convolutional layers are the common performance bottleneck of many deep learning models. Our experimental results show that DeepMon can classify an image with the VGG-VeryDeep-16 deep learning model in 644 ms on a Samsung Galaxy S7, taking an important step towards continuous vision without imposing any privacy concerns or networking cost
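The abstract's claim that convolutional layers are the common bottleneck can be made concrete with a standard multiply-accumulate (MAC) count; the layer shapes below are illustrative VGG-style values, not DeepMon's measured workload.

```python
# Hedged sketch: MAC count of a conv layer is H_out * W_out * C_out * k * k * C_in,
# versus n_in * n_out for a fully connected layer. Shapes are assumptions
# chosen to resemble an early VGG conv layer and a large FC layer.

def conv_macs(h_out, w_out, c_in, c_out, k):
    return h_out * w_out * c_out * k * k * c_in

def fc_macs(n_in, n_out):
    return n_in * n_out

conv = conv_macs(224, 224, 64, 64, 3)  # ~1.85e9 MACs for one conv layer
fc = fc_macs(4096, 4096)               # ~1.68e7 MACs for a 4096x4096 FC layer
print(conv / fc)                       # conv dominates by ~two orders of magnitude
```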
Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis
Data movement is the dominating factor affecting performance and energy in modern computing systems. Consequently, many algorithms have been developed to minimize the number of I/O operations for common computing patterns. Matrix multiplication is no exception, and lower bounds have been proven and implemented both for shared and distributed memory systems. Reconfigurable hardware platforms are a lucrative target for I/O minimizing algorithms, as they offer full control of memory accesses to the programmer. While bounds developed in the context of fixed architectures still apply to these platforms, the spatially distributed nature of their computational and memory resources requires a decentralized approach to optimize algorithms for maximum hardware utilization. We present a model to optimize matrix multiplication for FPGA platforms, simultaneously targeting maximum performance and minimum off-chip data movement, within constraints set by the hardware. We map the model to a concrete architecture using a high-level synthesis tool, maintaining a high level of abstraction, supporting arbitrary data types, and enabling maintainability and portability across FPGA devices. Kernels generated from our architecture are shown to offer competitive performance in practice, scaling with both compute and memory resources. We offer our design as an open source project to encourage the open development of linear algebra and I/O minimizing algorithms on reconfigurable hardware platforms
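A software analogue of the communication-avoiding idea (a hedged sketch of the general tiling technique, not the paper's FPGA architecture): if a fast memory holds M words, processing A, B, and C in b×b tiles with 3·b² ≤ M moves O(n³/√M) words between slow and fast memory instead of O(n³) for the naive triple loop.

```python
# Hedged sketch: blocked (tiled) matrix multiplication over lists of lists.
# Each (i0, j0, k0) iteration touches only one b x b tile of A, B, and C,
# modeling the working set that would sit in fast/on-chip memory.

def tiled_matmul(A, B, b):
    """C = A @ B for square matrices, processed in b x b tiles."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, b):
        for j0 in range(0, n, b):
            for k0 in range(0, n, b):  # one tile triple "resident" at a time
                for i in range(i0, min(i0 + b, n)):
                    for k in range(k0, min(k0 + b, n)):
                        a = A[i][k]
                        for j in range(j0, min(j0 + b, n)):
                            C[i][j] += a * B[k][j]
    return C
```

The tile size plays the role the paper's model assigns to on-chip memory constraints: larger tiles mean fewer passes over off-chip data, up to the capacity bound.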
Magnetic field dependence of the energy of negatively charged excitons in semiconductor quantum wells
A variational calculation of the spin-singlet and spin-triplet states of a negatively charged exciton (trion) confined to a single quantum well in the presence of a perpendicular magnetic field is presented. We calculated the probability density and the pair correlation function of the singlet and triplet trion states. The dependence of the energy levels and of the binding energy on the well width and on the magnetic field strength was investigated. We compared our results with the available experimental data on GaAs/AlGaAs quantum wells and find that in the low magnetic field region (B<18 T) the observed transitions are those of the singlet and the dark triplet trion (with angular momentum ), while for high magnetic fields (B>25 T) the dark trion becomes optically inactive and possibly a transition to a bright triplet trion (angular momentum ) state is observed.

Comment: 9 pages, 10 figures, submitted to Phys. Rev.
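The variational method the abstract relies on can be illustrated at textbook scale (this sketch uses the hydrogen atom in atomic units, NOT the trion Hamiltonian, which requires a multi-particle trial wavefunction): choose a trial state, evaluate the energy expectation as a function of its parameter, and minimize.

```python
import math

# Hedged sketch of the variational principle. For a normalized Gaussian trial
# wavefunction exp(-alpha * r^2) and the hydrogen Hamiltonian (atomic units),
# <H>(alpha) = 3*alpha/2 - 2*sqrt(2*alpha/pi); the optimum is alpha = 8/(9*pi).

def energy(alpha):
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)

# Scan the variational parameter on a grid and keep the minimum.
alphas = [0.001 * i for i in range(1, 2000)]
best = min(alphas, key=energy)
e_min = energy(best)
# Variational principle: e_min is an upper bound on the exact -0.5 Hartree.
print(best, e_min)
```

Trion calculations like the one in the abstract follow the same logic with far richer trial functions, which is why quantities such as pair correlation functions come out of the optimized state.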
Binding Energy of Charged Excitons in ZnSe-based Quantum Wells
Excitons and charged excitons (trions) are investigated in ZnSe-based quantum well structures with (Zn,Be,Mg)Se and (Zn,Mg)(S,Se) barriers by means of magneto-optical spectroscopy. Binding energies of negatively (X-) and positively (X+) charged excitons are measured as functions of quantum well width, free carrier density and external magnetic fields up to 47 T. The binding energy of X- shows a strong increase from 1.4 to 8.9 meV with decreasing quantum well width from 190 to 29 Å. The binding energies of X+ are about 25% smaller than the X- binding energies in the same structures. The magnetic field behavior of the X- and X+ binding energies differs qualitatively. With growing magnetic field strength, X- increases its binding energy by 35-150%, while for X+ it decreases by 25%. Zeeman spin splittings and oscillator strengths of excitons and trions are measured and discussed