89 research outputs found
Statistical PT-symmetric lasing in an optical fiber network
PT-symmetry in optics is a condition whereby the real and imaginary parts of
the refractive index across a photonic structure are deliberately balanced.
This balance can lead to a host of novel optical phenomena, such as
unidirectional invisibility, loss-induced lasing, single-mode lasing from
multimode resonators, and non-reciprocal effects in conjunction with
nonlinearities. Because PT-symmetry has been thought of as fragile,
experimental realizations to date have usually been restricted to on-chip
micro-devices. Here, we demonstrate that certain features of PT-symmetry are
sufficiently robust to survive the statistical fluctuations associated with a
macroscopic optical cavity. We construct optical-fiber-based coupled cavities
in excess of a kilometer in length (the free spectral range is less than 0.8
fm) with balanced gain and loss in two sub-cavities and examine the lasing
dynamics. In such a macroscopic system, fluctuations can lead to a
cavity-detuning exceeding the free spectral range. Nevertheless, by varying the
gain-loss contrast, we observe that both the lasing threshold and the growth of
the laser power follow the predicted behavior of a stable PT-symmetric
structure. Furthermore, a statistical symmetry-breaking point is observed upon
varying the cavity loss. These findings indicate that PT-symmetry is a more
robust optical phenomenon than previously expected, and point to potential
applications in optical fiber networks and fiber lasers.
Comment: Submitted to Nature Communications, Pages 1-19: Main manuscript;
Pages 20-38: Supplementary material
Nonlinear reversal of PT symmetric phase transition in a system of coupled semiconductor micro-ring resonators
A system of two coupled semiconductor-based resonators is studied when lasing
around an exceptional point. We show that the presence of nonlinear saturation
effects can have important ramifications on the transition behavior of this
system. In sharp contrast with linear PT-symmetric configurations, nonlinear
processes are capable of reversing the order in which the symmetry breaking
occurs. Yet, even in the nonlinear regime, the resulting non-Hermitian states
still retain the structural form of the corresponding linear eigenvectors
expected above and below the phase transition point. The conclusions of our
analysis are in agreement with experimental data.
Comment: 9 pages, 8 figure
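As a rough illustration of the symmetry-breaking transition that the two abstracts above refer to, the sketch below uses the standard textbook linear two-mode model, not the specific fiber or micro-ring models of those papers. It assumes two resonators with coupling `kappa`, one with gain `+g` and one with equal loss `-g`; the function names and the sign convention (a positive imaginary part meaning growth) are illustrative assumptions. Saturation and other nonlinear effects are not modelled.

```python
import cmath


def pt_dimer_eigenvalues(kappa: float, g: float) -> tuple[complex, complex]:
    """Eigenvalues of H = [[1j*g, kappa], [kappa, -1j*g]].

    They are measured relative to the bare resonance frequency and equal
    +/- sqrt(kappa**2 - g**2).
    """
    root = cmath.sqrt(kappa**2 - g**2)
    return root, -root


def is_pt_broken(kappa: float, g: float, tol: float = 1e-12) -> bool:
    """True when the eigenvalues have a nonzero imaginary part (broken phase)."""
    return any(abs(ev.imag) > tol for ev in pt_dimer_eigenvalues(kappa, g))


if __name__ == "__main__":
    kappa = 1.0  # hypothetical coupling strength, arbitrary units
    for g in (0.5, 1.0, 1.5):  # below, at, and above the exceptional point
        print(g, pt_dimer_eigenvalues(kappa, g), is_pt_broken(kappa, g))
```

In this toy model the eigenvalues are real for `g < kappa` (unbroken phase), coalesce at the exceptional point `g == kappa`, and become a complex-conjugate pair for `g > kappa` (broken phase).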
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
End-to-end DNN architectures have pushed the state-of-the-art in speech
technologies, as well as in other spheres of AI, leading researchers to train
more complex and deeper models. These improvements came at the cost of
transparency. DNNs are innately opaque and difficult to interpret. We no longer
understand what features are learned, where they are preserved, and how they
inter-operate. Such an analysis is important for better model understanding,
debugging, and ensuring fairness in ethical decision making. In this work, we
analyze the representations trained within deep speech models for the tasks
of speaker recognition, dialect identification, and reconstruction of masked
signals. We carry out a layer- and neuron-level analysis of the utterance-level
representations captured within pretrained speech models for speaker, language,
and channel properties. We study: Is this information captured in the learned
representations? Where is it preserved? How is it distributed? Can we
identify a minimal subset of the network that possesses this information? Using
diagnostic classifiers, we answer these questions. Our results reveal: (i)
channel and gender information are omnipresent and redundantly distributed;
(ii) complex properties such as dialectal information are encoded only in the
task-oriented pretrained network and are localised in the upper layers; (iii) a
minimal subset of neurons can be extracted to encode the predefined property;
(iv) salient neurons are sometimes shared between properties and can highlight
the presence of biases in the network. Our cross-architectural comparison indicates
that (v) the pretrained models capture speaker-invariant information and (vi)
the pretrained CNN models are competitive with the Transformers for encoding
information for the studied properties. To the best of our knowledge, this is
the first study to investigate neuron analysis on speech models.
Comment: Submitted to CSL. Keywords: Speech, Neuron Analysis,
Interpretability, Diagnostic Classifier, AI explainability, End-to-End
Architectur
Integrable nonlinear parity-time symmetric optical oscillator
The nonlinear dynamics of a balanced parity-time symmetric optical microring
arrangement are analytically investigated. By considering gain and loss
saturation effects, the pertinent conservation laws are explicitly obtained in
the Stokes domain, thus establishing integrability. Our analysis indicates the
existence of two regimes of oscillatory dynamics and frequency locking, both of
which are analogous to those expected in linear parity-time symmetric systems.
Unlike other saturable parity-time symmetric systems considered before, the
model studied in this work first operates in the symmetric regime and then
enters the broken parity-time phase.
Comment: 6 pages, 5 figures, accepted for publicatio
Multi-View Multi-Task Representation Learning for Mispronunciation Detection
The disparity in phonology between a learner's native (L1) and target (L2)
languages poses a significant challenge for mispronunciation detection and
diagnosis (MDD) systems. This challenge is further intensified by the lack of
annotated L2 data. This paper proposes a novel MDD architecture that exploits
multiple 'views' of the same input data, assisted by auxiliary tasks, to learn
more distinctive phonetic representations in a low-resource setting. Using
mono- and multilingual encoders, the model learns multiple views of the input
and captures the sound properties across diverse languages and accents. These
encoded representations are further enriched by learning articulatory features
in a multi-task setup. Our reported results using the L2-ARCTIC data
outperformed the SOTA models, with a phoneme error rate reduction of 11.13% and
8.60% and an absolute F1 score increase of 5.89% and 2.49% compared to the
single-view mono- and multilingual systems, with a limited L2 dataset.
Comment: 5 page
The complementary roles of non-verbal cues for Robust Pronunciation Assessment
Research on pronunciation assessment systems focuses on utilizing phonetic
and phonological aspects of non-native (L2) speech, often neglecting the rich
layer of information hidden within the non-verbal cues. In this study, we
propose a novel pronunciation assessment framework, IntraVerbalPA. The
framework incorporates both fine-grained frame-level and abstract
utterance-level non-verbal cues, alongside the conventional speech and phoneme
representations. Additionally, we introduce a "Goodness of phonemic-duration"
metric to effectively model duration distribution within the framework. Our
results validate the effectiveness of the proposed IntraVerbalPA framework and
its individual components, yielding performance that either matches or
outperforms existing research works.
Comment: 5 pages, submitted to ICASSP 202
Automatic Pronunciation Assessment -- A Review
Pronunciation assessment and its application in computer-aided pronunciation
training (CAPT) have seen impressive progress in recent years. With the rapid
growth in language processing and deep learning over the past few years, there
is a need for an updated review. In this paper, we review methods employed in
pronunciation assessment for both phonemic and prosodic aspects. We categorize
the main challenges observed in prominent research trends and highlight existing
limitations and available resources. This is followed by a discussion of the
remaining challenges and possible directions for future work.
Comment: 9 pages, accepted to EMNLP Finding