105 research outputs found
Memory-augmented cognitive radar for obstacle avoidance using nearest steering vector search
Abstract This study describes a cognitive radar architecture with application to real-time obstacle avoidance in mobile robotic platforms. The concept of a world memory map is introduced as a means of providing an enhanced perception of the environment around the robotic platform. This is combined with a specially designed obstacle avoidance algorithm, Nearest Steering Vector Searching, all capable of operating in real-time. The study analytically derives the radar signal processing algorithm, starting from range-angle maps, so that a collision free course to a set destination point can be robustly navigated. Finally, the performance of this cognitive approach is examined through a number of proof-of-concept experiments using a commercial off-the-shelf radar mounted on a mobile ground robotic platform
Integration of E-education and Knowledge Management
Abstract. With the realization that knowledge is a core resource, organizations are now attempting to manage knowledge in a more systematic and more effective way. However, managing knowledge is not always an easy task. In particular contexts, such as online e-education, knowledge is distributed across both time and space and may be constrained by social, cultural and language differences. This paper demonstrates the common characteristics of knowledge management (KM) and e-education, and identifies current potential problems in e-education. The authors develop a set of guidelines to help overcome these problems using tools and techniques from KM. They propose three strategies: corporate explicit knowledge and tacit knowledge; use the theory of KM to guide e-education resource management; use the theory of KM to guide e-education resource management. These strategies will help us to develop a better e-education framework
Delay-penalized transducer for low-latency streaming ASR
In streaming automatic speech recognition (ASR), it is desirable to reduce
latency as much as possible while having minimum impact on recognition
accuracy. Although a few existing methods are able to achieve this goal, they
are difficult to implement due to their dependency on external alignments. In
this paper, we propose a simple way to penalize symbol delay in transducer
model, so that we can balance the trade-off between symbol delay and accuracy
for streaming models without external alignments. Specifically, our method adds
a small constant times (T/2 - t), where T is the number of frames and t is the
current frame, to all the non-blank log-probabilities (after normalization)
that are fed into the two dimensional transducer recursion. For both streaming
Conformer models and unidirectional long short-term memory (LSTM) models,
experimental results show that it can significantly reduce the symbol delay
with an acceptable performance degradation. Our method achieves similar
delay-accuracy trade-off to the previously published FastEmit, but we believe
our method is preferable because it has a better justification: it is
equivalent to penalizing the average symbol delay. Our work is open-sourced and
publicly available (https://github.com/k2-fsa/k2). Comment: Submitted to 2023 IEEE International Conference on Acoustics, Speech
and Signal Processing
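The penalty described in the abstract above can be illustrated with a minimal sketch. It assumes the lattice is given as a nested list of normalized log-probabilities indexed by frame, label position and vocabulary entry. The function name `apply_delay_penalty` and the parameters `blank_id` and `scale` are hypothetical, chosen only for illustration, and this is not the k2 implementation.

```python
def apply_delay_penalty(log_probs, blank_id=0, scale=0.01):
    """Add scale * (T/2 - t) to every non-blank log-probability.

    log_probs[t][u][v] is assumed to be the normalized log-probability
    of vocabulary entry v at frame t and label position u (hypothetical
    layout, chosen for this sketch).
    """
    num_frames = len(log_probs)  # T in the abstract
    penalized = []
    for t, frame in enumerate(log_probs):
        # Positive for early frames, negative for late frames, so
        # emitting a symbol earlier is rewarded.
        offset = scale * (num_frames / 2.0 - t)
        penalized.append([
            [lp if v == blank_id else lp + offset for v, lp in enumerate(row)]
            for row in frame
        ])
    return penalized


# Toy lattice: 4 frames, 1 label position, 3 vocabulary entries.
toy_lattice = [[[0.0, 0.0, 0.0]] for _ in range(4)]
toy_penalized = apply_delay_penalty(toy_lattice, blank_id=0, scale=0.1)
```

The blank entries are left untouched, so only the timing of non-blank emissions is affected; the penalized values would then be fed into the transducer recursion in place of the originals.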
PromptASR for contextualized ASR with controllable style
Prompts are crucial to large language models as they provide context
information such as topic or logical relationships. Inspired by this, we
propose PromptASR, a framework that integrates prompts in end-to-end automatic
speech recognition (E2E ASR) systems to achieve contextualized ASR with
controllable style of transcriptions. Specifically, a dedicated text encoder
encodes the text prompts and the encodings are injected into the speech encoder
by cross-attending the features from two modalities. When using the ground
truth text from preceding utterances as content prompt, the proposed system
achieves 21.9% and 6.8% relative word error rate reductions on a book reading
dataset and an in-house dataset compared to a baseline ASR system. The system
can also take word-level biasing lists as prompt to improve recognition
accuracy on rare words. An additional style prompt can be given to the text
encoder and guide the ASR system to output different styles of transcriptions.
The code is available at icefall. Comment: Submitted to ICASSP202
Delay-penalized CTC implemented based on Finite State Transducer
Connectionist Temporal Classification (CTC) suffers from the latency problem
when applied to streaming models. We argue that in the CTC lattice, the alignments
that can access more future context are preferred during training, thereby
leading to higher symbol delay. In this work we propose the delay-penalized CTC
which is augmented with latency penalty regularization. We devise a flexible
and efficient implementation based on the differentiable Finite State
Transducer (FST). Specifically, by attaching a binary attribute to the CTC
topology, we can locate the frames that first emit non-blank tokens on the
resulting CTC lattice, and add the frame offsets to the log-probabilities.
Experimental results demonstrate the effectiveness of our proposed
delay-penalized CTC, which is able to balance the delay-accuracy trade-off.
Furthermore, combining it with the delay-penalized transducer enables the CTC model to
achieve better performance and lower latency. Our work is open-sourced and
publicly available at https://github.com/k2-fsa/k2. Comment: Accepted in INTERSPEECH 202
Multi-channel quantum noise suppression and phase-sensitive modulation in a hybrid optical resonant cavity system
Quantum noise suppression and phase-sensitive modulation of continuous
variables in vacuum and squeezed fields in a hybrid resonant cavity system are
investigated theoretically. Multiple dark windows similar to electromagnetically
induced transparency (EIT) are observed in the quantum noise fluctuation curve.
The effects of pumping light on both the suppression of quantum noise and the control
of the widths of dark windows are carefully analyzed, and the saturation point of
pumping light for nonlinear crystal conversion is obtained. We find that the
noise suppression effect is strongly sensitive to the pumping light power. The
degree of noise suppression can be up to 13.9 dB when the pumping light power
is 6.5 Beta_th. Moreover, a phase-sensitive modulation scheme is demonstrated,
which addresses the difficulty of realizing multi-channel quantum noise
suppression at the quadrature amplitude of the squeezed field. Our result
is meaningful for various applications in precise measurement physics, quantum
information processing and quantum communications of system-on-a-chip
Fast and parallel decoding for transducer
The transducer architecture is becoming increasingly popular in the field of
speech recognition, because it is naturally streaming as well as high in
accuracy. One of the drawbacks of transducer is that it is difficult to decode
in a fast and parallel way due to an unconstrained number of symbols that can
be emitted per time step. In this work, we introduce a constrained version of
transducer loss to learn strictly monotonic alignments between the sequences;
we also improve the standard greedy search and beam search algorithms by
limiting the number of symbols that can be emitted per time step in transducer
decoding, making it more efficient to decode in parallel with batches.
Furthermore, we propose a finite state automaton (FSA)-based parallel beam
search algorithm that can run with graphs on GPU efficiently. The experiment
results show that we achieve slight word error rate (WER) improvement as well
as significant speedup in decoding. Our work is open-sourced and publicly
available at https://github.com/k2-fsa/icefall. Comment: Submitted to 2023 IEEE International Conference on Acoustics, Speech
and Signal Processing
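The idea of limiting the number of symbols emitted per time step during greedy search can be sketched as follows. This is a toy illustration under stated assumptions, not the icefall implementation: `joiner` is a hypothetical callable that returns one score per vocabulary entry for a given frame index and partial hypothesis, and `max_sym_per_frame` is an illustrative parameter name.

```python
def greedy_search(num_frames, joiner, blank_id=0, max_sym_per_frame=1):
    """Greedy transducer decoding with a cap on symbols per frame.

    joiner(t, hyp) is assumed to return a list of scores, one per
    vocabulary entry, for frame t given the symbols decoded so far.
    """
    hyp = []
    for t in range(num_frames):
        emitted = 0
        # Stay on frame t until blank wins or the cap is reached.
        while emitted < max_sym_per_frame:
            scores = joiner(t, hyp)
            best = max(range(len(scores)), key=scores.__getitem__)
            if best == blank_id:
                break
            hyp.append(best)
            emitted += 1
    return hyp


def toy_joiner(t, hyp):
    # Hypothetical scorer: prefers symbol 1 until three symbols exist,
    # then prefers blank (index 0).
    return [0.0, 1.0] if len(hyp) < 3 else [1.0, 0.0]
```

With a cap of one symbol per frame, every frame performs at most one emission, which gives each utterance in a batch the same bounded amount of work per frame and makes batched decoding simpler to organize.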
An association of a simultaneous nuclear and cytoplasmic localization of Fra-1 with breast malignancy
BACKGROUND: Overexpression of Fra-1 in fibroblasts causes anchorage-independent cell growth and oncogenic transformation. A high level of Fra-1 expression is found in various tumors and tumorigenic cell lines, suggesting that Fra-1 may be involved in malignant progression. This study aimed to investigate the significance of Fra-1 expression in breast carcinogenesis. METHODS: The expression of Fra-1 was investigated by immunohistochemistry in neoplastic breast diseases ranging from benign fibroadenoma to very aggressive undifferentiated carcinoma. The correlations of Fra-1 expression with other indicators of breast carcinoma prognosis (ER, PR and ErbB2 receptors) were analyzed. RESULTS: All neoplastic breast tissues, either benign or malignant breast tissues, were nuclear immunoreactive for Fra-1-recognizing antibody. The pattern of Fra-1 expression by benign neoplastic cells was predominantly nuclear. However, the nuclear/cytoplasmic concomitant immunoreactivity was observed in all types of breast carcinomas. A clear shift in Fra-1 immunoreactivity, from an exclusively nuclear to a simultaneous nuclear and cytoplasmic localization was noticed in ~90% of breast carcinomas. CONCLUSION: The overall expression, pattern and intensity of Fra-1 proteins were correlated with breast oncogenesis. Overexpression of Fra-1, leading to a persistent high cytoplasmic accumulation, may play a role in the process of breast carcinogenesis
Thirty-six months recurrence after acute ischemic stroke among patients with comorbid type 2 diabetes: A nested case-control study
Background: Stroke patients face a high risk of recurrence, especially those with comorbid T2DM, which usually leads to much more serious neurologic damage and an increased likelihood of death. This study aimed to explore determinants of stroke relapse among patients with comorbid T2DM. Materials and methods: We conducted this case-control study nested in a prospective cohort of ischemic stroke (IS) with comorbid T2DM. During 36-month follow-up, a second stroke occurred in 84 diabetic IS patients, who were allocated to the case group, while 613 patients without recurrence were the controls. We collected the demographic data, behaviors and habits, therapies, and family history at baseline, and measured the variables during follow-up. LASSO and logistic regression analyses were carried out to develop a prediction model of stroke recurrence. The receiver operating characteristic (ROC) curve was employed to evaluate the performance of the prediction model. Results: Compared to participants without recurrence, those with recurrence had a higher pulse rate (78.29 ± 12.79 vs. 74.88 ± 10.93) and a higher prevalence of hypertension (72.6 vs. 61.2 %) at baseline. Moreover, a lower level of physical activity (77.4 vs. 90.4 %), as well as a higher proportion of hypoglycemic therapy (36.9 vs. 23.3 %) was also observed during 36-month follow-up. Multivariate logistic regression revealed that higher pulse rate at admission (OR = 1.027, 95 % CI = 1.005 – 1.049), lacking physical activity (OR = 2.838, 95 % CI = 1.418 – 5.620) and not receiving hypoglycemic therapy (OR = 1.697, 95 % CI = 1.013 – 2.843) during follow-up increased the risk of stroke recurrence. We developed a prediction model using baseline pulse rate, hypoglycemic therapy, and physical activity, which produced an area under the ROC curve (AUC) of 0.689. Conclusion: Physical activity and hypoglycemic therapy play a protective role for IS patients with comorbid diabetes.
In addition to targeted therapeutics, the improvement of daily-life habits contributes to slowing the progression of IS
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation
Knowledge distillation (KD) is a common approach to improve model performance
in automatic speech recognition (ASR), where a student model is trained to
imitate the output behaviour of a teacher model. However, traditional KD
methods suffer from a teacher label storage issue, especially when the training
corpora are large. Although on-the-fly teacher label generation tackles this
issue, the training speed is significantly slower as the teacher model has to
be evaluated every batch. In this paper, we reformulate the generation of
teacher label as a codec problem. We propose a novel Multi-codebook Vector
Quantization (MVQ) approach that compresses teacher embeddings to codebook
indexes (CI). Based on this, a KD training framework (MVQ-KD) is proposed where
a student model predicts the CI generated from the embeddings of a
self-supervised pre-trained teacher model. Experiments on the LibriSpeech
100-hour clean subset show that the MVQ-KD framework achieves performance comparable to
traditional KD methods (l1, l2), while requiring 256 times less storage. When
the full LibriSpeech dataset is used, the MVQ-KD framework results in 13.8% and
8.2% relative word error rate reductions (WERRs) for the non-streaming transducer
on test-clean and test-other and 4.0% and 4.9% for the streaming transducer. The
implementation of this work is already released as a part of the open-source
project icefall. Comment: Submitted to ICASSP 202