
    HuBERT-TR: Reviving Turkish Automatic Speech Recognition with Self-supervised Speech Representation Learning

    While the Turkish language is listed among low-resource languages, literature on Turkish automatic speech recognition (ASR) is relatively old. In this paper, we present HuBERT-TR, a speech representation model for Turkish, based on HuBERT. HuBERT-TR achieves state-of-the-art results on several Turkish ASR datasets. We investigate pre-training HuBERT for Turkish with large-scale data curated from online resources. We pre-train HuBERT-TR using over 6,500 hours of speech data curated from YouTube that includes extensive variability in terms of quality and genre. We show that language-specific models are superior to other pre-trained models: our Turkish model HuBERT-TR/base performs better than the 10 times larger state-of-the-art multilingual XLS-R-1b model in low-resource settings. Moreover, we study the effect of scaling on ASR performance by scaling our models up to 1B parameters. Our best model yields a state-of-the-art word error rate of 4.97% on the Turkish Broadcast News dataset. Models are available at https://huggingface.co/asafaya. Comment: Submitted to ICASSP202

    Automated Extraction of Socio-political Events from News (AESPEN): Workshop and Shared Task Report

    We describe our effort on automated extraction of socio-political events from news in the scope of a workshop and a shared task we organized at the Language Resources and Evaluation Conference (LREC 2020). We believe the event extraction studies in computational linguistics and social and political sciences should further support each other in order to enable large-scale socio-political event information collection across sources, countries, and languages. The event consists of two tracks: regular research papers and a shared task on event sentence coreference identification (ESCI). All submissions were reviewed by five members of the program committee. The workshop attracted research papers related to evaluation of machine learning methodologies, language resources, material conflict forecasting, and a shared task participation report in the scope of socio-political event information collection. It has shown us the volume and variety of both the data sources and event information collection approaches related to socio-political events and the need to fill the gap between automated text processing techniques and requirements of social and political sciences.

    Lipoprotein associated phospholipase A2: role in atherosclerosis and utility as a biomarker for cardiovascular risk

    Atherosclerosis and its clinical manifestations are widely prevalent throughout the world. Atherogenesis is highly complex and modulated by numerous genetic and environmental risk factors. A large body of basic scientific and clinical research supports the conclusion that inflammation plays a significant role in atherogenesis along the entire continuum of its progression. Inflammation adversely impacts intravascular lipid handling and metabolism, resulting in the development of macrophage foam cells, fatty streaks, and atheromatous plaque. Given the enormous human and economic cost of myocardial infarction, ischemic stroke, peripheral arterial disease and amputation, and premature death and disability, considerable effort is being committed to refining our ability to correctly identify patients at heightened risk for atherosclerotic vascular disease and acute cardiovascular events so that they can be treated earlier and more aggressively. Serum markers of inflammation have emerged as an important component of risk factor burden. Lipoprotein-associated phospholipase A2 (Lp-PLA2) potentiates intravascular inflammation and atherosclerosis. A variety of epidemiologic studies support the utility of Lp-PLA2 measurements for estimating and further refining cardiovascular disease risk. Drug therapies to inhibit Lp-PLA2 are in development and show considerable promise, including darapladib, a specific molecular inhibitor of the enzyme. In addition to substantially inhibiting Lp-PLA2 activity, darapladib reduces progression of the necrotic core volume of human coronary artery atheromatous plaque. The growing body of evidence points to an important role and utility for Lp-PLA2 testing in preventive and personalized clinical medicine.