Plasmonic photoconductive terahertz focal-plane array with pixel super-resolution
Imaging systems operating in the terahertz part of the electromagnetic
spectrum are in great demand because of the distinct characteristics of
terahertz waves in penetrating many optically-opaque materials and providing
unique spectral signatures of various chemicals. However, the use of terahertz
imagers in real-world applications has been limited by the slow speed, large
size, high cost, and complexity of the existing imaging systems. These
limitations are mainly imposed due to the lack of terahertz focal-plane arrays
(THz-FPAs) that can directly provide the frequency-resolved and/or
time-resolved spatial information of the imaged objects. Here, we report the
first THz-FPA that can directly provide the spatial amplitude and phase
distributions, along with the ultrafast temporal and spectral information of an
imaged object. It consists of a two-dimensional array of ~0.3 million plasmonic
photoconductive nanoantennas optimized to rapidly detect broadband terahertz
radiation with a high signal-to-noise ratio. As a first proof-of-concept, we
utilized the multispectral nature of the amplitude and phase data captured by
these plasmonic nanoantennas to realize pixel super-resolution imaging of
objects. We successfully imaged and super-resolved etched patterns in a silicon
substrate and reconstructed both the shape and depth of these structures with
an effective number of pixels that exceeds 1 kilopixel. By eliminating the
need for raster scanning and spatial terahertz modulation, our THz-FPA offers
more than a 1000-fold increase in the imaging speed compared to the
state-of-the-art. Beyond this proof-of-concept super-resolution demonstration,
the unique capabilities enabled by our plasmonic photoconductive THz-FPA offer
transformative advances in a broad range of applications that rely on
hyperspectral and three-dimensional terahertz images of objects.
Comment: 62 pages
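The pixel super-resolution idea above, i.e. combining several low-resolution measurements onto a finer grid, can be illustrated with a minimal shift-and-add sketch. This toy uses known sub-pixel spatial shifts in place of the paper's multispectral amplitude/phase diversity, so all names and the reconstruction rule here are illustrative assumptions, not the authors' actual algorithm.

```python
import numpy as np

def shift_and_add_sr(low_res_stack, shifts, factor):
    """Place each low-resolution frame, captured at a known sub-pixel
    shift, onto its positions in a finer grid (toy shift-and-add)."""
    h, w = low_res_stack[0].shape
    hi = np.zeros((h * factor, w * factor))
    for frame, (dy, dx) in zip(low_res_stack, shifts):
        # each low-res sample lands at its own sub-pixel site on the fine grid
        hi[dy::factor, dx::factor] = frame
    return hi

# toy example: a 4x4 "scene" observed as four shifted 2x2 frames
scene = np.arange(16, dtype=float).reshape(4, 4)
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames = [scene[dy::2, dx::2] for dy, dx in shifts]
recon = shift_and_add_sr(frames, shifts, factor=2)
assert np.allclose(recon, scene)  # exact recovery in the noise-free toy case
```

In the noise-free case with complementary shifts the fine grid is tiled exactly; real multispectral data would instead be fused by solving an inverse problem.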
Learning Compact Compositional Embeddings via Regularized Pruning for Recommendation
Latent factor models are the dominant backbones of contemporary recommender
systems (RSs) given their performance advantages, where a unique vector
embedding with a fixed dimensionality (e.g., 128) is required to represent each
entity (commonly a user/item). Due to the large number of users and items on
e-commerce sites, the embedding table is arguably the least memory-efficient
component of RSs. For any lightweight recommender that aims to efficiently
scale with the growing size of users/items or to remain applicable in
resource-constrained settings, existing solutions either reduce the number of
embeddings needed via hashing, or sparsify the full embedding table to switch
off selected embedding dimensions. However, as hash collision arises or
embeddings become overly sparse, especially when adapting to a tighter memory
budget, those lightweight recommenders inevitably have to compromise their
accuracy. To this end, we propose a novel compact embedding framework for RSs,
namely Compositional Embedding with Regularized Pruning (CERP). Specifically,
CERP represents each entity by combining a pair of embeddings from two
independent, substantially smaller meta-embedding tables, which are then
jointly pruned via a learnable element-wise threshold. In addition, we
innovatively design a regularized pruning mechanism in CERP, such that the two
sparsified meta-embedding tables are encouraged to encode information that is
mutually complementary. Given its model-agnostic compatibility with latent factor
models, we pair CERP with two popular recommendation models for extensive
experiments, where results on two real-world datasets under different memory
budgets demonstrate its superiority against state-of-the-art baselines. The
codebase of CERP is available at https://github.com/xurong-liang/CERP.
Comment: Accepted by ICDM'2
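The compositional scheme described above can be sketched in a few lines: each entity id indexes one row in each of two small meta-tables, the rows are pruned element-wise by a threshold, and the results are combined. The quotient/remainder indexing, the additive combination, and the hard threshold are all assumptions for illustration; CERP learns the threshold with a differentiable relaxation and a regularizer encouraging the two tables to be complementary.

```python
import numpy as np

rng = np.random.default_rng(0)

class CompositionalEmbedding:
    """Toy sketch: entity id -> a row from each of two small meta-tables,
    pruned element-wise and summed (details are illustrative, not CERP's
    exact training procedure)."""

    def __init__(self, n_buckets_q, n_buckets_r, dim, threshold=0.1):
        # two meta-tables, far smaller than one row per entity
        self.Q = rng.normal(scale=0.1, size=(n_buckets_q, dim))
        self.R = rng.normal(scale=0.1, size=(n_buckets_r, dim))
        self.n_buckets_r = n_buckets_r
        self.threshold = threshold  # learned jointly with the tables in CERP

    def _prune(self, v):
        # hard mask for illustration; CERP uses a learnable soft threshold
        return np.where(np.abs(v) > self.threshold, v, 0.0)

    def __call__(self, entity_id):
        q, r = divmod(entity_id, self.n_buckets_r)
        return self._prune(self.Q[q]) + self._prune(self.R[r])

# two 100-row tables cover 100 * 100 = 10,000 entities
emb = CompositionalEmbedding(n_buckets_q=100, n_buckets_r=100, dim=8)
v = emb(1234)  # rows Q[12] and R[34], pruned and summed
```

Storage drops from 10,000 embedding rows to 200, at the cost of entities sharing meta-embedding rows, which the learned pruning helps disambiguate.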
Near-threshold photoproduction of $J/\psi$ in two-gluon exchange model
The near-threshold photoproduction of $J/\psi$ is regarded as a golden
process to unveil the nucleon mass structure, pentaquark states involving
charm quarks, and the poorly constrained gluon distribution of the nucleon at
large $x$. In this paper, we present an analysis of the current
experimental data under a two-gluon exchange model, which shows good
consistency. Using a parameterized function form with three free parameters, we
have determined the nucleonic gluon distribution at the $J/\psi$ mass scale.
Moreover, we predict the differential cross section of the electroproduction of
$J/\psi$ as a function of the invariant mass of the final hadrons, $W$, at EicC,
as a practical application of the model and the obtained gluon distribution.
According to our estimation, thousands of $J/\psi$ events can be detected per
year at EicC near the threshold. Therefore, the relevant experimental
measurements are suggested to be carried out at EicC.
Comment: 6 pages, 7 figures
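A three-parameter function form for a gluon distribution, as mentioned above, can be sketched with a standard ansatz of the type $x\,g(x) = A\,x^{a}(1-x)^{b}$. Both the functional form and the numerical values below are assumptions for illustration; the paper's actual parameterization and fitted parameters may differ.

```python
import numpy as np

def xg(x, A, a, b):
    """Three-parameter ansatz x*g(x) = A * x**a * (1 - x)**b
    (a common functional form; values here are illustrative only)."""
    return A * x**a * (1.0 - x)**b

# momentum fraction carried by gluons: integral of x*g(x) over x in (0, 1)
x = np.linspace(1e-4, 1.0 - 1e-4, 100_000)
A, a, b = 2.0, -0.2, 4.0          # illustrative parameter values
dx = x[1] - x[0]
frac = np.sum(xg(x, A, a, b)) * dx  # simple Riemann-sum estimate
```

With such a form, the large-$x$ behavior that near-threshold $J/\psi$ data constrain is controlled mainly by the exponent $b$.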
Rapid Sensing of Hidden Objects and Defects using a Single-Pixel Diffractive Terahertz Processor
Terahertz waves offer numerous advantages for the nondestructive detection of
hidden objects/defects in materials, as they can penetrate through most
optically-opaque materials. However, existing terahertz inspection systems are
restricted in their throughput and accuracy (especially for detecting small
features) due to their limited speed and resolution. Furthermore, machine
vision-based continuous sensing systems that use large-pixel-count imaging are
generally bottlenecked due to their digital storage, data transmission and
image processing requirements. Here, we report a diffractive processor that
rapidly detects hidden defects/objects within a target sample using a
single-pixel spectroscopic terahertz detector, without scanning the sample or
forming/processing its image. This terahertz processor consists of passive
diffractive layers that are optimized using deep learning to modify the
spectrum of the terahertz radiation according to the absence/presence of hidden
structures or defects. After its fabrication, the resulting diffractive
processor all-optically probes the structural information of the sample volume
and outputs a spectrum that directly indicates the presence or absence of
hidden structures, not visible from outside. As a proof-of-concept, we trained
a diffractive terahertz processor to sense hidden defects (including
subwavelength features) inside test samples, and evaluated its performance by
analyzing the detection sensitivity as a function of the size and position of
the unknown defects. We validated its feasibility using a single-pixel
terahertz time-domain spectroscopy setup and 3D-printed diffractive layers,
successfully detecting hidden defects using pulsed terahertz illumination. This
technique will be valuable for various applications, e.g., security screening,
biomedical sensing, quality control, anti-counterfeiting measures and cultural
heritage protection.
Comment: 23 pages, 5 figures
Spectrally-Encoded Single-Pixel Machine Vision Using Diffractive Networks
3D engineering of matter has opened up new avenues for designing systems that
can perform various computational tasks through light-matter interaction. Here,
we demonstrate the design of optical networks in the form of multiple
diffractive layers that are trained using deep learning to transform and encode
the spatial information of objects into the power spectrum of the diffracted
light, which are used to perform optical classification of objects with a
single-pixel spectroscopic detector. Using a time-domain spectroscopy setup
with a plasmonic nanoantenna-based detector, we experimentally validated this
machine vision framework in the terahertz spectrum to optically classify the images
of handwritten digits by detecting the spectral power of the diffracted light
at ten distinct wavelengths, each representing one class/digit. We also report
the coupling of this spectral encoding achieved through a diffractive optical
network with a shallow electronic neural network, separately trained to
reconstruct the images of handwritten digits based on solely the spectral
information encoded in these ten distinct wavelengths within the diffracted
light. These reconstructed images demonstrate task-specific image decompression
and can also be cycled back as new inputs to the same diffractive network to
improve its optical object classification. This unique machine vision framework
merges the power of deep learning with the spatial and spectral processing
capabilities of diffractive networks, and can also be extended to other
spectral-domain measurement systems to enable new 3D imaging and sensing
modalities integrated with spectrally encoded classification tasks performed
through diffractive optical networks.
Comment: 21 pages, 5 figures, 1 table
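The single-pixel readout described above (one spectral channel per class, prediction by the strongest channel) can be mimicked with a toy stand-in: a fixed linear map from object pixels to power in ten spectral channels, followed by an argmax. The random matrix below is purely an assumption; real diffractive layers implement a trained, physical phase transform, not a random projection.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the diffractive network: a fixed linear map from the
# object's pixels to power in ten spectral channels (one per class/digit).
n_pixels, n_classes = 64, 10
optical_map = rng.random((n_classes, n_pixels))

def classify(image_flat):
    """Predict the class as the spectral channel with the highest power
    read by a single-pixel spectroscopic detector (toy sketch)."""
    spectrum = optical_map @ image_flat   # ten-channel spectral readout
    return int(np.argmax(spectrum))       # strongest wavelength wins

pred = classify(rng.random(n_pixels))     # some class index in 0..9
```

The appeal of this scheme is that all spatial processing happens optically; the electronic side only compares ten scalar power values.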
Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition
Automatic recognition of disordered speech remains a highly challenging task
to date. The underlying neuro-motor conditions, often compounded with
co-occurring physical disabilities, lead to the difficulty in collecting large
quantities of impaired speech required for ASR system development. This paper
presents novel variational auto-encoder generative adversarial network
(VAE-GAN) based personalized disordered speech augmentation approaches that
simultaneously learn to encode, generate and discriminate synthesized impaired
speech. Separate latent features are derived to learn dysarthric speech
characteristics and phoneme context representations. Self-supervised
pre-trained Wav2vec 2.0 embedding features are also incorporated. Experiments
conducted on the UASpeech corpus suggest the proposed adversarial data
augmentation approach consistently outperformed the baseline speed perturbation
and non-VAE GAN augmentation methods with trained hybrid TDNN and End-to-end
Conformer systems. After LHUC speaker adaptation, the best system using VAE-GAN
based augmentation produced an overall WER of 27.78% on the UASpeech test set
of 16 dysarthric speakers, and the lowest published WER of 57.31% on the subset
of speakers with "Very Low" intelligibility.
Comment: Submitted to ICASSP 202
Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems
Speaker adaptation techniques provide a powerful solution to customise
automatic speech recognition (ASR) systems for individual users. Practical
application of unsupervised model-based speaker adaptation techniques to data
intensive end-to-end ASR systems is hindered by the scarcity of speaker-level
data and performance sensitivity to transcription errors. To address these
issues, a set of compact and data efficient speaker-dependent (SD) parameter
representations are used to facilitate both speaker adaptive training and
test-time unsupervised speaker adaptation of state-of-the-art Conformer ASR
systems. The sensitivity to supervision quality is reduced using a confidence
score-based selection of the less erroneous subset of speaker-level adaptation
data. Two lightweight confidence score estimation modules are proposed to
produce more reliable confidence scores. The data sparsity issue, which is
exacerbated by data selection, is addressed by modelling the SD parameter
uncertainty using Bayesian learning. Experiments on the benchmark 300-hour
Switchboard and the 233-hour AMI datasets suggest that the proposed confidence
score-based adaptation schemes consistently outperformed the baseline
speaker-independent (SI) Conformer model and conventional non-Bayesian, point
estimate-based adaptation using no speaker data selection. Similar consistent
performance improvements were retained after external Transformer and LSTM
language model rescoring. In particular, on the 300-hour Switchboard corpus,
statistically significant WER reductions of 1.0%, 1.3%, and 1.4% absolute
(9.5%, 10.9%, and 11.3% relative) were obtained over the baseline SI Conformer
on the NIST Hub5'00, RT02, and RT03 evaluation sets respectively. Similar WER
reductions of 2.7% and 3.3% absolute (8.9% and 10.2% relative) were also
obtained on the AMI development and evaluation sets.
Comment: IEEE/ACM Transactions on Audio, Speech, and Language Processing
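The confidence-score-based data selection above, i.e. adapting only on the less erroneous hypotheses, reduces to a simple filtering step. The data layout and the 0.8 threshold below are illustrative assumptions; the paper derives its confidence scores from dedicated estimation modules rather than a fixed cutoff.

```python
def select_adaptation_data(utterances, min_confidence=0.8):
    """Keep only hypotheses whose confidence exceeds a threshold, so
    speaker-dependent parameters are estimated on a cleaner subset.
    `utterances` is a list of (features, hypothesis, confidence) triples;
    the threshold value is an illustrative assumption."""
    return [(feats, hyp) for feats, hyp, conf in utterances
            if conf >= min_confidence]

data = [("f1", "hello", 0.95), ("f2", "wrld", 0.40), ("f3", "test", 0.85)]
selected = select_adaptation_data(data)
# the low-confidence "wrld" hypothesis is excluded from adaptation
```

Because such filtering shrinks the already scarce speaker-level data, the paper pairs it with Bayesian estimation of the speaker-dependent parameters to handle the added sparsity.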