10 research outputs found

    Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications

    The representation learning of speech, without textual resources, is an area of significant interest for many low resource speech applications. In this paper, we describe an approach to self-supervised representation learning from raw audio using a hidden unit clustering (HUC) framework. The input to the model consists of audio samples that are windowed and processed with 1-D convolutional layers. The learned "time-frequency" representations from the convolutional neural network (CNN) module are further processed with long short-term memory (LSTM) layers which generate a contextual vector representation for every windowed segment. The HUC framework, allowing the categorization of the representations into a small number of phoneme-like units, is used to train the model for learning semantically rich speech representations. The targets consist of phoneme-like pseudo labels for each audio segment and these are generated with an iterative k-means algorithm. We explore techniques that improve the speaker invariance of the learned representations and illustrate the effectiveness of the proposed approach on two settings: (i) completely unsupervised speech applications on the sub-tasks described as part of the ZeroSpeech 2021 challenge and (ii) semi-supervised automatic speech recognition (ASR) applications on the TIMIT dataset and on the GramVaani challenge Hindi dataset. In these experiments, we achieve state-of-the-art results for various ZeroSpeech tasks. Further, on the ASR experiments, the HUC representations are shown to improve significantly over other established benchmarks based on Wav2vec, HuBERT and Best-RQ.
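The pseudo-label step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each windowed segment has already been mapped to a fixed-length vector, and it uses plain k-means with a deterministic, evenly spaced initialization chosen here only to keep the toy example reproducible. The function names and the toy `frames` data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_labels(frames, centroids):
    """Give each frame the index of its nearest centroid (its pseudo label)."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(f, centroids[k]))
        for f in frames
    ]


def update_centroids(frames, labels, centroids):
    """Recompute each centroid as the mean of the frames assigned to it."""
    dim = len(frames[0])
    updated = []
    for k, old in enumerate(centroids):
        members = [f for f, lab in zip(frames, labels) if lab == k]
        if not members:
            updated.append(old)  # keep the old centroid for an empty cluster
        else:
            updated.append(
                [sum(m[d] for m in members) / len(members) for d in range(dim)]
            )
    return updated


def kmeans_pseudo_labels(frames, num_units, num_iters=20):
    """Iterate assignment and update steps until the labels stop changing."""
    # Assumption: evenly spaced frames serve as initial centroids.
    step = len(frames) // num_units
    centroids = [list(frames[i * step]) for i in range(num_units)]
    labels = assign_labels(frames, centroids)
    for _ in range(num_iters):
        centroids = update_centroids(frames, labels, centroids)
        new_labels = assign_labels(frames, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Toy stand-in for per-segment representations: two well-separated groups.
frames = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
          [5.0, 5.1], [5.1, 5.0], [4.9, 5.0]]
labels, centroids = kmeans_pseudo_labels(frames, num_units=2)
print(labels)
```

In a training loop of the kind the abstract describes, such labels would serve as classification targets for the model, with clustering repeated as the representations change; that loop is outside the scope of this sketch.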

    Heart-on-a-Chip: A Closed-loop Testing Platform for Implantable Pacemakers

    Implantable cardiac pacemakers restore normal heart rhythm by delivering external electrical pacing to the heart. The pacemaker software is life-critical as the timing of the pulses determines its ability to control the heart rate. Recalls due to software issues have been on the rise with the increasing complexity of pacing algorithms. Open-loop testing remains the primary approach to evaluate the safety of pacemaker software. While this tests how the pacemaker responds to stimulus, it cannot reveal pacemaker malfunctions which drive the heart into an unsafe state over multiple cycles. To evaluate the safety and efficacy of pacemaker software we have developed a heart model to generate different heart conditions and interact with real pacemakers. In this paper, we introduce the closed-loop testing platform which consists of a programmable hardware implementation of the heart that can interact with a commercial pacemaker in closed loop. The heart-on-a-chip implementation is automatically generated from the Virtual Heart Model in Simulink which models different heart conditions. We describe a case study of Endless Loop Tachycardia to demonstrate potential closed-loop pacemaker malfunctions which inappropriately increase the heart rate. The test platform is part of our model-based design framework for verification and testing of medical devices with the patient-in-the-loop.

    Unsupervised Syntactically Controlled Paraphrase Generation with Abstract Meaning Representations

    Syntactically controlled paraphrase generation has become an emerging research direction in recent years. Most existing approaches require annotated paraphrase pairs for training and are thus costly to extend to new domains. Unsupervised approaches, on the other hand, do not need paraphrase pairs but suffer from relatively poor performance in terms of syntactic control and quality of generated paraphrases. In this paper, we demonstrate that leveraging Abstract Meaning Representations (AMR) can greatly improve the performance of unsupervised syntactically controlled paraphrase generation. Our proposed model, AMR-enhanced Paraphrase Generator (AMRPG), separately encodes the AMR graph and the constituency parse of the input sentence into two disentangled semantic and syntactic embeddings. A decoder is then learned to reconstruct the input sentence from the semantic and syntactic embeddings. Our experiments show that AMRPG generates more accurate syntactically controlled paraphrases, both quantitatively and qualitatively, compared to the existing unsupervised approaches. We also demonstrate that the paraphrases generated by AMRPG can be used for data augmentation to improve the robustness of NLP models. Comment: Paper accepted by EMNLP 2022 Findings. The first two authors contribute equally.

    Novel bisphosphonate-based cathepsin K-triggered compound targets the enthesis without impairing soft tissue-to-bone healing

    Background: Osteoadsorptive fluorogenic sentinel 3 (OFS-3) is a recently described compound that contains a bone-targeting bisphosphonate (BP) and cathepsin K (Ctsk)-triggered fluorescence signal. A prior study in a murine Achilles repair model demonstrated its effectiveness at targeting the site of tendon-to-bone repair, but the intrinsic effect of this novel bisphosphonate chaperone on tendon-to-bone healing has not been previously explored. We hypothesized that application of this bisphosphonate-fluorophore cargo conjugate would not affect the biomechanical properties or histologic appearance of tendon-bone repairs. Materials and Methods: Right hindlimb Achilles tendon-to-bone repair was performed on 12-week-old male mice. Animals were divided into 2 groups of 18 each: 1) Achilles repair with OFS-3 applied directly to the repair site prior to closure, and 2) Achilles repair with saline applied prior to closure. Repaired hindlimbs from 12 animals per group were harvested at 6 weeks for biomechanical analysis with a custom 3D-printed jig. At 4 and 6 weeks, repaired hindlimbs from the remaining animals were assessed histologically using H&E, immunohistochemistry (IHC) staining for the presence of Ctsk, and second harmonic generation (SHG) imaging to evaluate collagen fibers. Results: At 6 weeks, there was no significant difference in failure load, stiffness, toughness, or displacement to failure between repaired hindlimbs that received OFS-3 versus saline. There was no difference in tissue healing on H&E or Ctsk staining on immunohistochemistry between animals that received OFS-3 versus saline. Finally, second harmonic generation imaging demonstrated no difference in collagen fiber parameters between the two groups. Conclusion: OFS-3 did not significantly affect the biomechanical properties or histologic appearance of murine Achilles tendon-to-bone repairs. This study demonstrates that OFS-3 can target the site of tendon-to-bone repair without causing intrinsic negative effects on healing. Further development of this drug delivery platform to target growth factors to the site of tendon-bone repair is warranted.

    ECLIPSE: Holistic AI System for Preparing Insurer Policy Data

    Reinsurers possess high volumes of policy listings data from insurers, which they use to provide insurers with analytical insights and modeling that guide reinsurance treaties. These insurers often act on the same data for their own internal modeling and analytics needs. The problem is that this data is messy and needs significant preparation in order to extract meaningful insights. Traditionally, this has required intensive manual labor from actuaries. However, a host of modern AI techniques and ML system architectures introduced in the past decade can be applied to the problem of insurance data preparation. In this paper, we explore a novel application of AI/ML on policy listings data that poses its own unique challenges, by outlining the holistic AI-based platform we developed, ECLIPSE (Elegant Cleaning and Labeling of Insurance Policies while Standardizing Entities). With ECLIPSE, actuaries not only save time on data preparation but can build more effective loss models and provide crisper insights.

    Induced Churn as Shelter from Routing-Table Poisoning

    Structured overlays are an important and powerful class of overlay networks that has emerged in recent years. They are typically targeted at peer-to-peer deployments involving millions of user-managed machines on the Internet. In this paper we address routing-table poisoning attacks against structured overlays, in which adversaries attempt to intercept traffic and control the system by convincing other nodes to use compromised nodes as their overlay network neighbors. In keeping with the fully decentralized goals of structured overlay design, we propose a defense mechanism that makes minimal use of centralized infrastructure. Our approach, induced churn, utilizes periodic routing-table resets, unpredictable identifier changes, and a rate limit on routing-table updates. Induced churn leaves adversaries at the mercy of chance: they have little opportunity to strategize their positions in the overlay, and cannot entrench themselves in any position that they do acquire. We implement induced churn in Maelstrom, an extension to the broadly used Bamboo distributed hash table. Our Maelstrom experiments over a simulated network demonstrate robust routing with very modest costs in bandwidth and latency, at levels of adversarial activity where unprotected overlays are rendered almost completely useless.
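The three ingredients named in this abstract (periodic routing-table resets, unpredictable identifier changes, and a rate limit on routing-table updates) can be sketched for a single node as follows. This is an illustrative toy, not Maelstrom or Bamboo code: the class `ChurnNode`, its parameters, and the use of locally generated random identifiers are all assumptions made for the example, and the abstract does not say how identifiers are actually chosen.

```python
import secrets


class ChurnNode:
    """Toy node that resets itself every epoch and rate-limits table updates."""

    def __init__(self, reset_interval, max_updates_per_epoch):
        self.reset_interval = reset_interval      # epoch length, in seconds
        self.max_updates = max_updates_per_epoch  # rate limit per epoch
        self.node_id = secrets.token_hex(16)      # assumption: local randomness
        self.routing_table = set()
        self.epoch_start = 0.0
        self.updates_this_epoch = 0

    def tick(self, now):
        """Start a new epoch if the current one has expired."""
        if now - self.epoch_start >= self.reset_interval:
            self.node_id = secrets.token_hex(16)  # unpredictable new identifier
            self.routing_table.clear()            # periodic routing-table reset
            self.epoch_start = now
            self.updates_this_epoch = 0
            return True
        return False

    def offer_neighbor(self, neighbor_id, now):
        """Try to add a neighbor; refuse once the epoch's update budget is spent."""
        self.tick(now)
        if neighbor_id in self.routing_table:
            return True
        if self.updates_this_epoch >= self.max_updates:
            return False
        self.routing_table.add(neighbor_id)
        self.updates_this_epoch += 1
        return True


# Usage with a simulated clock: two updates are allowed per 10-second epoch.
node = ChurnNode(reset_interval=10.0, max_updates_per_epoch=2)
first_id = node.node_id
accepted = [node.offer_neighbor(n, now=t) for n, t in [("a", 0.0), ("b", 1.0), ("c", 2.0)]]
after_reset = node.offer_neighbor("c", now=10.0)  # new epoch: table was cleared
print(accepted, after_reset, sorted(node.routing_table))
```

The point of the sketch is that an entry an adversary manages to insert survives for at most one epoch, and the number of entries that can be inserted per epoch is bounded.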