
    Memory-augmented cognitive radar for obstacle avoidance using nearest steering vector search

    This study describes a cognitive radar architecture with application to real‐time obstacle avoidance in mobile robotic platforms. The concept of a world memory map is introduced as a means of providing an enhanced perception of the environment around the robotic platform. This is combined with a specially designed obstacle avoidance algorithm, Nearest Steering Vector Searching, all capable of operating in real‐time. The study analytically derives the radar signal processing algorithm, starting from range‐angle maps, so that a collision-free course to a set destination point can be robustly navigated. Finally, the performance of this cognitive approach is examined through a number of proof‐of‐concept experiments using a commercial off‐the‐shelf radar mounted on a mobile ground robotic platform.

    Integration of E-education and Knowledge Management

    With the realization that knowledge is a core resource, organizations are now attempting to manage knowledge in a more systematic and more effective way. However, managing knowledge is not always an easy task. In particular contexts, such as online e-education, knowledge is distributed across both time and space and may be constrained by social, cultural and language differences. This paper identifies the characteristics common to knowledge management (KM) and e-education, and outlines the current potential problems in e-education. The authors develop a set of guidelines to help overcome these problems using tools and techniques from KM, and propose three strategies: corporate explicit knowledge and tacit knowledge; use the theory of KM to guide e-education resource management; use the theory of KM to guide e-education resource management. These strategies will help to develop a better e-education framework.

    Delay-penalized transducer for low-latency streaming ASR

    In streaming automatic speech recognition (ASR), it is desirable to reduce latency as much as possible while having minimum impact on recognition accuracy. Although a few existing methods are able to achieve this goal, they are difficult to implement due to their dependency on external alignments. In this paper, we propose a simple way to penalize symbol delay in the transducer model, so that we can balance the trade-off between symbol delay and accuracy for streaming models without external alignments. Specifically, our method adds a small constant times (T/2 - t), where T is the number of frames and t is the current frame, to all the non-blank log-probabilities (after normalization) that are fed into the two-dimensional transducer recursion. For both streaming Conformer models and unidirectional long short-term memory (LSTM) models, experimental results show that it can significantly reduce the symbol delay with an acceptable performance degradation. Our method achieves a delay-accuracy trade-off similar to that of the previously published FastEmit, but we believe our method is preferable because it has a better justification: it is equivalent to penalizing the average symbol delay. Our work is open-sourced and publicly available (https://github.com/k2-fsa/k2). Comment: Submitted to 2023 IEEE International Conference on Acoustics, Speech and Signal Processing
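
    The penalty described in this abstract can be illustrated with a minimal sketch. It is not the authors' k2 implementation: the function name, the nested-list layout `log_probs[t][u][k]`, the 0-based frame index, and the default `penalty_scale` are all assumptions made for illustration.

```python
def add_delay_penalty(log_probs, blank_id=0, penalty_scale=0.01):
    """Add penalty_scale * (T/2 - t) to every non-blank log-probability.

    log_probs[t][u][k] is assumed to hold the normalized log-probability
    of symbol k at frame t and label position u (hypothetical layout).
    Frames are counted from 0 here, which is an assumption.
    """
    num_frames = len(log_probs)  # T in the abstract
    penalized = []
    for t, frame in enumerate(log_probs):
        # Early frames get a bonus and late frames a penalty, so that
        # alignments emitting symbols sooner score higher.
        offset = penalty_scale * (num_frames / 2 - t)
        penalized.append([
            [lp if k == blank_id else lp + offset for k, lp in enumerate(row)]
            for row in frame
        ])
    return penalized
```

    Blank log-probabilities are left unchanged, so only the timing of symbol emissions is affected; the penalized values would then be passed to the usual transducer recursion in place of the originals.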

    PromptASR for contextualized ASR with controllable style

    Prompts are crucial to large language models as they provide context information such as topic or logical relationships. Inspired by this, we propose PromptASR, a framework that integrates prompts in end-to-end automatic speech recognition (E2E ASR) systems to achieve contextualized ASR with controllable style of transcriptions. Specifically, a dedicated text encoder encodes the text prompts and the encodings are injected into the speech encoder by cross-attending the features from two modalities. When using the ground truth text from preceding utterances as content prompt, the proposed system achieves 21.9% and 6.8% relative word error rate reductions on a book reading dataset and an in-house dataset compared to a baseline ASR system. The system can also take word-level biasing lists as prompt to improve recognition accuracy on rare words. An additional style prompt can be given to the text encoder and guide the ASR system to output different styles of transcriptions. The code is available at icefall. Comment: Submitted to ICASSP202

    Delay-penalized CTC implemented based on Finite State Transducer

    Connectionist Temporal Classification (CTC) suffers from the latency problem when applied to streaming models. We argue that in the CTC lattice, the alignments that can access more future context are preferred during training, thereby leading to higher symbol delay. In this work we propose the delay-penalized CTC, which is augmented with latency penalty regularization. We devise a flexible and efficient implementation based on the differentiable Finite State Transducer (FST). Specifically, by attaching a binary attribute to the CTC topology, we can locate the frames that first emit non-blank tokens on the resulting CTC lattice, and add the frame offsets to the log-probabilities. Experimental results demonstrate the effectiveness of our proposed delay-penalized CTC, which is able to balance the delay-accuracy trade-off. Furthermore, combining it with the delay-penalized transducer enables the CTC model to achieve better performance and lower latency. Our work is open-sourced and publicly available (https://github.com/k2-fsa/k2). Comment: Accepted in INTERSPEECH 202

    Multi-channel quantum noise suppression and phase-sensitive modulation in a hybrid optical resonant cavity system

    Quantum noise suppression and phase-sensitive modulation of continuous variables in vacuum and squeezed fields in a hybrid resonant cavity system are investigated theoretically. Multiple dark windows similar to electromagnetically induced transparency (EIT) are observed in the quantum noise fluctuation curve. The effects of the pumping light on both the suppression of quantum noise and the control of the widths of the dark windows are carefully analyzed, and the saturation point of the pumping light for nonlinear crystal conversion is obtained. We find that the noise suppression effect is strongly sensitive to the pumping light power. The degree of noise suppression can be up to 13.9 dB when the pumping light power is 6.5 Beta_th. Moreover, a phase-sensitive modulation scheme is demonstrated, which addresses the difficulty of realizing multi-channel quantum noise suppression at the quadrature amplitude of the squeezed field. Our result is meaningful for various applications in precise measurement physics, quantum information processing and quantum communications of system-on-a-chip.

    Fast and parallel decoding for transducer

    The transducer architecture is becoming increasingly popular in the field of speech recognition, because it is naturally streaming as well as high in accuracy. One of the drawbacks of the transducer is that it is difficult to decode in a fast and parallel way due to the unconstrained number of symbols that can be emitted per time step. In this work, we introduce a constrained version of the transducer loss to learn strictly monotonic alignments between the sequences; we also improve the standard greedy search and beam search algorithms by limiting the number of symbols that can be emitted per time step in transducer decoding, making it more efficient to decode in parallel with batches. Furthermore, we propose a finite state automaton-based (FSA) parallel beam search algorithm that can run with graphs on GPU efficiently. The experimental results show that we achieve a slight word error rate (WER) improvement as well as a significant speedup in decoding. Our work is open-sourced and publicly available (https://github.com/k2-fsa/icefall). Comment: Submitted to 2023 IEEE International Conference on Acoustics, Speech and Signal Processing
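
    The idea of capping the number of symbols emitted per time step can be sketched as a toy greedy search. This is an illustration only, not the authors' icefall code: `step_fn` is a hypothetical stand-in for the encoder, decoder and joiner followed by an argmax, and the parameter names are assumptions.

```python
def greedy_search(num_frames, step_fn, blank_id=0, max_symbols_per_frame=1):
    """Greedy transducer decoding with a cap on symbols per frame.

    step_fn(t, hyp) is a hypothetical callable returning the most likely
    token id at frame t given the tokens emitted so far.
    """
    hyp = []
    for t in range(num_frames):
        emitted = 0
        # Without the cap this loop could run indefinitely on one frame;
        # with it, every frame costs at most max_symbols_per_frame steps.
        while emitted < max_symbols_per_frame:
            token = step_fn(t, hyp)
            if token == blank_id:
                break  # blank means: move on to the next frame
            hyp.append(token)
            emitted += 1
    return hyp
```

    Because each frame now has a bounded amount of work, all utterances in a batch can advance frame by frame in lockstep, which is what makes batched decoding straightforward.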

    An association of a simultaneous nuclear and cytoplasmic localization of Fra-1 with breast malignancy

    BACKGROUND: Overexpression of Fra-1 in fibroblasts causes anchorage-independent cell growth and oncogenic transformation. A high level of Fra-1 expression is found in various tumors and tumorigenic cell lines, suggesting that Fra-1 may be involved in malignant progression. This study aimed to investigate the significance of Fra-1 expression in breast carcinogenesis. METHODS: The expression of Fra-1 was investigated by immunohistochemistry in neoplastic breast diseases ranging from benign fibroadenoma to very aggressive undifferentiated carcinoma. The correlations of Fra-1 expression with other indicators of breast carcinoma prognosis (ER, PR and ErbB2 receptors) were analyzed. RESULTS: All neoplastic breast tissues, either benign or malignant breast tissues, were nuclear immunoreactive for Fra-1-recognizing antibody. The pattern of Fra-1 expression by benign neoplastic cells was predominantly nuclear. However, the nuclear/cytoplasmic concomitant immunoreactivity was observed in all types of breast carcinomas. A clear shift in Fra-1 immunoreactivity, from an exclusively nuclear to a simultaneous nuclear and cytoplasmic localization was noticed in ~90% of breast carcinomas. CONCLUSION: The overall expression, pattern and intensity of Fra-1 proteins were correlated with breast oncogenesis. Overexpression of Fra-1, leading to a persistent high cytoplasmic accumulation, may play a role in the process of breast carcinogenesis

    Thirty-six months recurrence after acute ischemic stroke among patients with comorbid type 2 diabetes: A nested case-control study

    Background: Stroke patients face a high risk of recurrence, especially those with comorbid T2DM, which usually leads to much more serious neurologic damage and an increased likelihood of death. This study aimed to explore determinants of stroke relapse among patients with comorbid T2DM. Materials and methods: We conducted this case-control study nested in a prospective cohort of ischemic stroke (IS) with comorbid T2DM. During 36-month follow-up, a second stroke occurred in 84 diabetic IS patients, who were allocated to the case group, while 613 patients without recurrence were the controls. We collected the demographic data, behaviors and habits, therapies, and family history at baseline, and measured the variables during follow-up. LASSO and logistic regression analyses were carried out to develop a prediction model of stroke recurrence. The receiver operator characteristic (ROC) curve was employed to evaluate the performance of the prediction model. Results: Compared to participants without recurrence, higher levels of pulse rate (78.29 ± 12.79 vs. 74.88 ± 10.93) and hypertension (72.6 vs. 61.2 %) were recorded at baseline in the case group. Moreover, a lower level of physical activity (77.4 vs. 90.4 %), as well as a higher proportion of hypoglycemic therapy (36.9 vs. 23.3 %), was also observed during 36-month follow-up. Multivariate logistic regression revealed that higher pulse rate at admission (OR = 1.027, 95 % CI = 1.005 – 1.049), lacking physical activity (OR = 2.838, 95 % CI = 1.418 – 5.620) and not receiving hypoglycemic therapy (OR = 1.697, 95 % CI = 1.013 – 2.843) during follow-up increased the risk of stroke recurrence. We developed a prediction model using baseline pulse rate, hypoglycemic therapy, and physical activity, which produced an area under the ROC curve (AUC) of 0.689. Conclusion: Physical activity and hypoglycemic therapy play a protective role for IS patients with comorbid diabetes. In addition to targeted therapeutics, the improvement of daily-life habits contributes to slowing the progress of IS.

    Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation

    Knowledge distillation (KD) is a common approach to improve model performance in automatic speech recognition (ASR), where a student model is trained to imitate the output behaviour of a teacher model. However, traditional KD methods suffer from a teacher label storage issue, especially when the training corpora are large. Although on-the-fly teacher label generation tackles this issue, the training speed is significantly slower as the teacher model has to be evaluated every batch. In this paper, we reformulate the generation of teacher labels as a codec problem. We propose a novel Multi-codebook Vector Quantization (MVQ) approach that compresses teacher embeddings to codebook indexes (CI). Based on this, a KD training framework (MVQ-KD) is proposed where a student model predicts the CI generated from the embeddings of a self-supervised pre-trained teacher model. Experiments on the LibriSpeech clean-100 hour show that the MVQ-KD framework achieves performance comparable to traditional KD methods (l1, l2), while requiring 256 times less storage. When the full LibriSpeech dataset is used, the MVQ-KD framework results in 13.8% and 8.2% relative word error rate reductions (WERRs) for the non-streaming transducer on test-clean and test-other, and 4.0% and 4.9% for the streaming transducer. The implementation of this work is already released as a part of the open-source project icefall. Comment: Submitted to ICASSP 202