
    Multilingual representations for low resource speech recognition and keyword search

    © 2015 IEEE. This paper examines the impact of multilingual (ML) acoustic representations on Automatic Speech Recognition (ASR) and keyword search (KWS) for low resource languages in the context of the OpenKWS15 evaluation of the IARPA Babel program. The task is to develop Swahili ASR and KWS systems within two weeks using as little as 3 hours of transcribed data. Multilingual acoustic representations proved to be crucial for building these systems under strict time constraints. The paper discusses several key insights on how these representations are derived and used. First, we present a data sampling strategy that can speed up the training of multilingual representations without appreciable loss in ASR performance. Second, we show that fusion of diverse multilingual representations developed at different LORELEI sites yields substantial ASR and KWS gains. Speaker adaptation and data augmentation of these representations improve both ASR and KWS performance (up to 8.7% relative). Third, incorporating un-transcribed data through semi-supervised learning improves WER and KWS performance. Finally, we show that these multilingual representations significantly improve ASR and KWS performance (relative 9% for WER and 5% for MTWV) even when forty hours of transcribed audio in the target language are available. Multilingual representations significantly contributed to the LORELEI KWS systems winning the OpenKWS15 evaluation.

    The Simons Observatory: Cryogenic Half Wave Plate Rotation Mechanism for the Small Aperture Telescopes

    We present the requirements, design and evaluation of the cryogenic continuously rotating half-wave plate (CHWP) for the Simons Observatory (SO). SO is a cosmic microwave background (CMB) polarization experiment at Parque Astronómico Atacama in northern Chile that covers a wide range of angular scales using both small (0.42 m) and large (6 m) aperture telescopes. In particular, the small aperture telescopes (SATs) focus on large angular scales for primordial B-mode polarization. To this end, the SATs employ a CHWP to modulate the polarization of the incident light at 8 Hz, suppressing atmospheric 1/f noise and mitigating systematic uncertainties that would otherwise arise due to the differential response of detectors sensitive to orthogonal polarizations. The CHWP consists of a 505 mm diameter achromatic sapphire HWP and a cryogenic rotation mechanism, both of which are cooled down to ~50 K to reduce detector thermal loading. Under normal operation the HWP is suspended by a superconducting magnetic bearing and rotates with a constant 2 Hz frequency, controlled by an electromagnetic synchronous motor. The rotation angle is detected through an angular encoder with a noise level of 0.07 μrad√s. During a cooldown, the rotor is held in place by a grip-and-release mechanism that serves as both an alignment device and a thermal path. In this paper we provide an overview of the SO SAT CHWP: its requirements, hardware design, and laboratory performance. Comment: 19 pages, 21 figures, submitted to RS
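
The abstract pairs a 2 Hz rotation with an 8 Hz polarization modulation. That factor of four follows from the textbook model of an ideal half-wave plate, in which the linearly polarized part of the detector signal varies with four times the plate angle. The Python sketch below illustrates this under stated assumptions: it is not taken from the paper, and the function names, Stokes values, and sampling parameters are all hypothetical choices made for the illustration.

```python
import cmath
import math

F_ROT_HZ = 2.0          # HWP rotation frequency quoted in the abstract
SAMPLE_RATE_HZ = 200    # assumed sampling rate, for illustration only
DURATION_S = 1.0        # assumed record length, for illustration only


def detector_signal(t, i=1.0, q=0.1, u=0.05):
    """Ideal-HWP toy model: d(t) = I + Q cos(4*chi) + U sin(4*chi).

    chi is the plate angle at time t. The Stokes values are arbitrary.
    """
    chi = 2.0 * math.pi * F_ROT_HZ * t
    return i + q * math.cos(4.0 * chi) + u * math.sin(4.0 * chi)


def dft_amplitude(samples, freq_hz):
    """Normalized magnitude of the discrete Fourier sum at one frequency."""
    n = len(samples)
    acc = sum(
        s * cmath.exp(-2j * math.pi * freq_hz * k / SAMPLE_RATE_HZ)
        for k, s in enumerate(samples)
    )
    return abs(acc) / n


n_samples = int(SAMPLE_RATE_HZ * DURATION_S)
samples = [detector_signal(k / SAMPLE_RATE_HZ) for k in range(n_samples)]

# Scan integer frequencies from 1 Hz to 20 Hz and pick the strongest line.
peak_hz = max(range(1, 21), key=lambda f: dft_amplitude(samples, f))
print(f"rotation: {F_ROT_HZ} Hz, dominant modulation: {peak_hz} Hz")
```

In this toy model the only non-DC spectral line sits at four times the rotation frequency, which moves the polarization signal away from the low frequencies where 1/f noise dominates.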

    Spoken term detection ALBAYZIN 2014 evaluation: overview, systems, results, and discussion

    The electronic version of this article is the complete one and can be found online at: http://dx.doi.org/10.1186/s13636-015-0063-8. Spoken term detection (STD) aims at retrieving data from a speech repository given a textual representation of the search term. It is currently receiving much interest due to the large volume of multimedia information. STD differs from automatic speech recognition (ASR) in that ASR is interested in all the terms/words that appear in the speech data, whereas STD focuses on a selected list of search terms that must be detected within the speech data. This paper presents the systems submitted to the STD ALBAYZIN 2014 evaluation, held as a part of the ALBAYZIN 2014 evaluation campaign within the context of the IberSPEECH 2014 conference. This is the first STD evaluation that deals with the Spanish language. The evaluation consists of retrieving the speech files that contain the search terms, indicating their start and end times within the appropriate speech file, along with a score value that reflects the confidence given to the detection of the search term. The evaluation is conducted on a Spanish spontaneous speech database, which comprises a set of talks from workshops and amounts to about 7 h of speech. We present the database, the evaluation metrics, the systems submitted to the evaluation, the results, and a detailed discussion. Four different research groups took part in the evaluation. Evaluation results show reasonable performance for a moderate out-of-vocabulary term rate. This paper compares the systems submitted to the evaluation and provides an in-depth analysis based on some search term properties (term length, in-vocabulary/out-of-vocabulary terms, single-word/multi-word terms, and in-language/foreign terms). This work has been partly supported by project CMC-V2 (TEC2012-37585-C02-01) from the Spanish Ministry of Economy and Competitiveness. This research was also funded by the European Regional Development Fund, the Galician Regional Government (GRC2014/024, “Consolidation of Research Units: AtlantTIC Project” CN2012/160).

    Results of gravitational lensing and primordial gravitational waves from the POLARBEAR experiment

    POLARBEAR is a cosmic microwave background (CMB) polarization experiment located in the Atacama Desert in Chile. The scientific goals of the experiment are to characterize the B-mode signal from gravitational lensing, as well as to search for B-mode signals created by primordial gravitational waves (PGWs). POLARBEAR started observations in 2012 and has published a series of results. These include the first measurement of a nonzero B-mode angular auto-power spectrum at sub-degree scales where the dominant signal is gravitational lensing of the CMB. In addition, we have achieved the first measurement of cross-correlation between the lensing potential, which was reconstructed from the CMB polarization data alone by POLARBEAR, and the cosmic shear field from galaxy shapes by the Subaru Hyper Suprime-Cam (HSC) survey. In 2014, we installed a continuously rotating half-wave plate (CRHWP) at the focus of the primary mirror to search for PGWs and demonstrated the control of low-frequency noise. We have found that the low-frequency B-mode power in the combined dataset with the Planck high-frequency maps is consistent with Galactic dust foreground, thus placing an upper limit on the tensor-to-scalar ratio of r < 0.90 at the 95% confidence level after marginalizing over the foregrounds.

    Screening for pulmonary tuberculosis in a Tanzanian prison and computer-aided interpretation of chest X-rays

    Tanzania is a high-burden country for tuberculosis (TB), and prisoners are a high-risk group that should be screened actively, as recommended by the World Health Organization. Screening algorithms, starting with chest X-rays (CXRs), can detect asymptomatic cases, but depend on experienced readers, who are scarce in the penitentiary setting. Recent studies with patients seeking health care for TB-related symptoms showed good diagnostic performance of the computer software CAD4TB. To assess the potential of computer-assisted screening using CAD4TB in a predominantly asymptomatic prison population. Cross-sectional study. CAD4TB and seven health care professionals reading CXRs in local tuberculosis wards evaluated a set of 511 CXRs from the Ukonga prison in Dar es Salaam. Performance was compared using a radiological reference. Two readers performed significantly better than CAD4TB, three were comparable, and two performed significantly worse (area under the curve 0.75 in receiver operating characteristics analysis). On a superset of 1321 CXRs, CAD4TB successfully interpreted >99%, with a predictably short time to detection, while 160 (12.2%) reports were delayed by over 24 h with conventional CXR reading. CAD4TB reliably evaluates CXRs from a mostly asymptomatic prison population, with a diagnostic performance inferior to that of expert readers but comparable to local readers.

    A high-performance Cantonese keyword search system

    We present a system for keyword search on Cantonese conversational telephony audio, collected for the IARPA Babel program, that achieves good performance by combining postings lists produced by diverse speech recognition systems from three different research groups. We describe the keyword search task, the data on which the work was done, four different speech recognition systems, and our approach to system combination for keyword search. We show that the combination of four systems outperforms the best single system by 7%, achieving an actual term-weighted value of 0.517. © 2013 IEEE

    Speech Recognition for DARPA Communicator

    We report the results of investigations in acoustic modeling, language modeling and decoding techniques for DARPA Communicator, a speaker-independent, telephone-based dialog system. By a combination of methods, including enlarging the acoustic model, augmenting the recognizer vocabulary, conditioning the language model upon dialog state, and applying a post-processing decoding method, we lowered the overall word error rate from 21.9% to 15.0%, a gain of 6.9% absolute and 31.5% relative.
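
The absolute and relative gains quoted above follow directly from the two word error rates. A minimal Python sketch of that arithmetic is shown here; the helper name `wer_reduction` is hypothetical and is not taken from the paper.

```python
def wer_reduction(baseline_wer: float, improved_wer: float) -> tuple[float, float]:
    """Return (absolute, relative) WER reduction, both in percent.

    The absolute reduction is the difference in percentage points; the
    relative reduction expresses that difference as a share of the baseline.
    """
    absolute = baseline_wer - improved_wer
    relative = 100.0 * absolute / baseline_wer
    return absolute, relative


# Word error rates reported in the abstract.
absolute, relative = wer_reduction(21.9, 15.0)
print(f"absolute: {absolute:.1f} points, relative: {relative:.1f}%")
```

The same calculation applies to any relative WER improvement, as long as both rates are measured on the same test set.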

    Understanding Somali piracy through cognitive resources theory

    This paper examines Somali piracy through the lens of Cognitive Resources Theory (CRT). By and large, Somali piracy consists of hijacking ships, mostly in the Indian Ocean and adjacent areas, and collecting ransom money to fund future pirate operations. CRT postulates that stress is the enemy of rationality, harming a group’s ability to operate logically and analytically, and impacting both leadership and group performance. Cognitive resources refer to a group’s combined skills and its leader’s experience and decision-making abilities. Based on these cognitive resources, CRT asserts that high and low levels of stress affect a leader’s ability to employ his or her intelligence and experience. To carry out a smooth hijacking operation, a Somali pirate group must alleviate stress. It must have a leader with the intelligence and experience to carry out a successful mission in situations of high stress. It is the leader’s responsibility to maintain group unity by providing a supportive, directive environment. As we will see in this analysis, one of the many ways to achieve group normalcy and commitment is by imposing the diya and heer codes – a long-established system of norms and honour within Somali clans.