85 research outputs found

    Updated Corpora and Benchmarks for Long-Form Speech Recognition

    The vast majority of ASR research uses corpora in which both the training and test data have been pre-segmented into utterances. In most real-world ASR use cases, however, test audio is not segmented, leading to a mismatch between inference-time conditions and models trained on segmented utterances. In this paper, we re-release three standard ASR corpora - TED-LIUM 3, GigaSpeech, and VoxPopuli-en - with updated transcription and alignments to enable their use for long-form ASR research. We use these reconstituted corpora to study the train-test mismatch problem for transducers and attention-based encoder-decoders (AEDs), confirming that AEDs are more susceptible to this issue. Finally, we benchmark a simple long-form training for these models, showing its efficacy for model robustness under this domain shift. Comment: Submitted to ICASSP 202

    A summary of water-quality and salt marsh monitoring, Humboldt Bay, California

    This report summarizes data-collection activities associated with the U.S. Geological Survey Humboldt Bay Water-Quality and Salt Marsh Monitoring Project. This work was undertaken to gain a comprehensive understanding of water-quality conditions, salt marsh accretion processes, marsh-edge erosion, and soil-carbon storage in Humboldt Bay, California. Multiparameter sondes recorded water temperature, specific conductance, and turbidity at a 15-minute timestep at two U.S. Geological Survey water-quality stations: (1) Mad River Slough near Arcata, California (U.S. Geological Survey station 405219124085601) and (2) Hookton Slough near Loleta, California (U.S. Geological Survey station 404038124131801). At each station, discrete water samples were collected to develop surrogate regression models that were used to compute a continuous time series of suspended-sediment concentration from continuously measured turbidity. Data loggers recorded water depth at a 6-minute timestep in the primary tidal channels (Mad River Slough and Hookton Slough) in two adjacent marshes (Mad River marsh and Hookton marsh). The marsh monitoring network included five study marshes. Three marshes (Mad River, Manila, and Jacoby) are in the northern embayment of Humboldt Bay and two marshes (White and Hookton) are in the southern embayment. Surface deposition and elevation change were measured using deep rod surface elevation tables and feldspar marker horizons. Sediment characteristics and soil-carbon storage were measured using a total of 10 shallow cores, distributed across 5 study marshes, collected using an Eijkelkamp peat sampler. Rates of marsh edge erosion (2010–19) were quantified in four marshes (Mad River, Manila, Jacoby, and White) by estimating changes in the areal extent of the vegetated marsh plain using repeat aerial imagery and light detection and ranging (LiDAR)-derived elevation data.
During the monitoring period (2016–19), the mean suspended-sediment concentration computed for Hookton Slough (50±20 milligrams per liter [mg/L]) was higher than that computed for Mad River Slough (18±7 mg/L). Uncertainty in mean suspended-sediment concentration values is reported using a 90-percent confidence interval. Across the five study marshes, elevation change (+1.8±0.6 millimeters per year [mm/yr]) and surface deposition (+2.5±0.5 mm/yr) were lower than published values of local sea-level rise (4.9±0.8 mm/yr), and mean carbon density was 0.029±0.005 grams of carbon per cubic centimeter. From 2010 to 2019, marsh edge erosion and soil carbon loss were greatest in low-elevation marshes with the marsh edge characterized by a gentle transition from mudflat to vegetated marsh (herein, ramped edge morphology) and larger wind-wave exposure. Jacoby Creek marsh experienced the greatest edge erosion. In total, marsh edge erosion was responsible for 62.3 metric tons of estuarine soil carbon storage loss across four study marshes. Salt marshes are an important component of coastal carbon, which is frequently referred to as “blue carbon.” The monitoring data presented in this report provide fundamental information needed to manage blue carbon stocks, assess marsh vulnerability, inform sea-level rise adaptation planning, and build coastal resiliency to climate change.

    Accents in Speech Recognition through the Lens of a World Englishes Evaluation Set

    Automatic Speech Recognition (ASR) systems generalize poorly on accented speech, creating bias issues for users and providers. The phonetic and linguistic variability of accents presents challenges for ASR systems in both data collection and modeling strategies. We present two promising approaches to accented speech recognition, custom vocabulary and multilingual modeling, and highlight key challenges in the space. Among these, lack of a standard benchmark makes research and comparison difficult. We address this with a novel corpus of accented speech: Earnings-22, a 125-file, 119-hour corpus of English-language earnings calls gathered from global companies. We compare commercial models, showing variation in performance when taking country of origin into consideration, and demonstrate targeted improvements using the methods we introduce.

    Antibodies to Henipavirus or Henipa-Like Viruses in Domestic Pigs in Ghana, West Africa

    Henipaviruses, Hendra virus (HeV) and Nipah virus (NiV), have Pteropid bats as their known natural reservoirs. Antibodies against henipaviruses have been found in Eidolon helvum, an old world fruit bat species, and henipavirus-like nucleic acid has been detected in faecal samples from E. helvum in Ghana. The initial outbreak of NiV in Malaysia led to over 265 human encephalitis cases, including 105 deaths, with infected pigs acting as amplifier hosts for NiV during the outbreak. We detected non-neutralizing antibodies against viruses of the genus Henipavirus in approximately 5% of pig sera (N = 97) tested in Ghana, but not in a small sample of other domestic species sampled under an E. helvum roost. Although we did not detect neutralizing antibody, our results suggest prior exposure of the Ghana pig population to henipavirus(es). Because a wide diversity of henipavirus-like nucleic acid sequences has been found in Ghanaian E. helvum, we hypothesise that these pigs might have been infected by henipavirus(es) sufficiently divergent from HeV or NiV to produce cross-reactive, but not cross-neutralizing, antibodies to HeV or NiV.

    Optical Molecular Imaging in the Gastrointestinal Tract

    Recent developments in optical molecular imaging allow for real-time identification of morphological and biochemical changes in tissue associated with gastrointestinal neoplasia. This review summarizes widefield and high-resolution imaging modalities currently in pre-clinical and clinical evaluation for the detection of colorectal cancer and esophageal cancer. Widefield techniques discussed include high-definition white light endoscopy, narrow band imaging, autofluorescence imaging, and chromoendoscopy; high-resolution techniques discussed include probe-based confocal laser endomicroscopy, high-resolution microendoscopy, and optical coherence tomography. Finally, new approaches to enhance image contrast using vital dyes and molecular-specific targeted contrast agents are evaluated.

    Heterologous hyperimmune polyclonal antibodies against SARS-CoV-2: A broad coverage, affordable, and scalable potential immunotherapy for COVID-19

    The emergence and dissemination of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the resulting COVID-19 pandemic triggered a global public health crisis. Although several SARS-CoV-2 vaccines have been developed, demand far exceeds supply, access to them is inequitable, and thus, populations in low- and middle-income countries are unlikely to be protected soon (1). Furthermore, there are no specific therapies available, which is a challenge for COVID-19 patient care (2). Thus, the appearance of SARS-CoV-2 variants and reports of reinfections associated with immune escape (3, 4) highlight the urgent need for effective and broad coverage COVID-19 therapeutics. Intravenous administration of human or heterologous antibodies is a therapy successfully used in patients with viral respiratory diseases (5). Accordingly, formulations containing SARS-CoV-2 specific antibodies are an attractive therapeutic option for COVID-19 patients (6). SARS-CoV-2 specific antibodies could limit infection by direct virion neutralization and/or by targeting infected cells for elimination via complement or antibody-mediated cytotoxicity (6). Specific SARS-CoV-2 antibody-based therapeutics include convalescent plasma (CP), monoclonal antibodies (mAbs), human polyclonal IgG formulations purified from CP or transgenic animals, and heterologous hyperimmune polyclonal antibodies (pAbs) (6). 
Although the window for using antibody-based therapeutics varies, clinical data show that they are mainly effective if administered early after symptom onset (6).
Funding: Universidad de Costa Rica (UCR), Costa Rica [741-C0-198]; Caja Costarricense del Seguro Social (CCSS), Costa Rica; Banco Centroamericano de Integración Económica (BCIE), Costa Rica; German Academic Exchange Service (DAAD), Germany [57592642].
Affiliations: UCR::Vicerrectoría de Investigación::Unidades de Investigación::Ciencias de la Salud::Instituto Clodomiro Picado (ICP); UCR::Vicerrectoría de Docencia::Salud::Facultad de Medicina::Escuela de Medicina; UCR::Vicerrectoría de Investigación::Unidades de Investigación::Ciencias de la Salud::Centro de Investigación en Enfermedades Tropicales (CIET)

    Deciphering the origin and evolution of Hepatitis B viruses by means of a family of non-enveloped fish viruses

    Hepatitis B viruses (HBVs), which are enveloped viruses with reverse-transcribed DNA genomes, constitute the family Hepadnaviridae. An outstanding feature of HBVs is their streamlined genome organization with extensive gene overlap. Remarkably, the ∼1,100 bp open reading frame (ORF) encoding the envelope proteins is fully nested within the ORF of the viral replicase P. Here, we report the discovery of a diversified family of fish viruses, designated nackednaviruses, which lack the envelope protein gene, but otherwise exhibit key characteristics of HBVs including genome replication via protein-primed reverse-transcription and utilization of structurally related capsids. Phylogenetic reconstruction indicates that these two virus families separated more than 400 million years ago, before the rise of tetrapods. We show that HBVs are of ancient origin, descending from non-enveloped progenitors in fishes. Their envelope protein gene emerged de novo, leading to a major transition in viral lifestyle, followed by co-evolution with their hosts over geologic eras.

    Henipavirus Neutralising Antibodies in an Isolated Island Population of African Fruit Bats

    Isolated islands provide valuable opportunities to study the persistence of viruses in wildlife populations, including population size thresholds such as the critical community size. The straw-coloured fruit bat, Eidolon helvum, has been identified as a reservoir for henipaviruses (serological evidence) and Lagos bat virus (LBV; virus isolation and serological evidence) in continental Africa. Here, we sampled from a remote population of E. helvum annobonensis fruit bats on Annobón island in the Gulf of Guinea to investigate whether antibodies to these viruses also exist in this isolated subspecies. Henipavirus serological analyses (Luminex multiplexed binding and inhibition assays, virus neutralisation tests and western blots) and lyssavirus serological analyses (LBV: modified Fluorescent Antibody Virus Neutralisation test, LBV and Mokola virus: lentivirus pseudovirus neutralisation assay) were undertaken on 73 and 70 samples respectively. Given the isolation of fruit bats on Annobón and their lack of connectivity with other populations, it was expected that the population size on the island would be too small to allow persistence of viruses that are thought to cause acute and immunising infections. However, the presence of antibodies against henipaviruses was detected using the Luminex binding assay and confirmed using alternative assays. Neutralising antibodies to LBV were detected in one bat using both assays. We demonstrate clear evidence for exposure of multiple individuals to henipaviruses in this remote population of E. helvum annobonensis fruit bats on Annobón island. The situation is less clear for LBV. Seroprevalences to henipaviruses and LBV in Annobón are notably different to those in E. helvum in continental locations studied using the same sampling techniques and assays. 
Whilst cross-sectional serological studies in wildlife populations cannot provide details on viral dynamics within populations, valuable information on the presence or absence of viruses may be obtained and utilised for informing future studies.

    Study protocol for the multicentre cohorts of Zika virus infection in pregnant women, infants, and acute clinical cases in Latin America and the Caribbean: The ZIKAlliance consortium

    Background: The European Commission (EC) Horizon 2020 (H2020)-funded ZIKAlliance Consortium designed a multicentre study including pregnant women (PW), children (CH) and natural history (NH) cohorts. Clinical sites were selected over a wide geographic range within Latin America and the Caribbean, taking into account the dynamic course of the ZIKV epidemic. Methods: Recruitment to the PW cohort will take place in antenatal care clinics. PW will be enrolled regardless of symptoms and followed over the course of pregnancy, approximately every 4 weeks. PW will be revisited at delivery (or after miscarriage/abortion) to assess birth outcomes, including microcephaly and other congenital abnormalities according to the evolving definition of congenital Zika syndrome (CZS). After birth, children will be followed for 2 years in the CH cohort. Follow-up visits are scheduled at ages 1-3, 4-6, 12, and 24 months to assess neurocognitive and developmental milestones. In addition, an NH cohort for the characterization of symptomatic rash/fever illness was designed, including follow-up to capture persisting health problems. Blood, urine, and other biological materials will be collected and tested for ZIKV and other relevant arboviral diseases (dengue, chikungunya, yellow fever) using RT-PCR or serological methods. A virtual, decentralized biobank will be created. Reciprocal clinical monitoring has been established between partner sites. Substudies of ZIKV seroprevalence, transmissio