378 research outputs found

    Comparing Human and Machine Errors in Conversational Speech Transcription

    Recent work in automatic recognition of conversational telephone speech (CTS) has achieved accuracy levels comparable to human transcribers on the NIST 2000 CTS evaluation set, although there is some debate over how to quantify human performance on this task precisely. This raises the question of what systematic differences, if any, distinguish human from machine transcription errors. In this paper we approach this question by comparing the output of our most accurate CTS recognition system to that of a standard speech transcription vendor pipeline. We find that the most frequent substitution, deletion and insertion error types of both outputs show a high degree of overlap. The only notable exception is that the automatic recognizer tends to confuse filled pauses ("uh") and backchannel acknowledgments ("uhhuh"). Humans tend not to make this error, presumably due to the distinctive and opposing pragmatic functions attached to these words. Furthermore, we quantify the correlation between human and machine errors at the speaker level, and investigate the effect of speaker overlap between training and test data. Finally, we report on an informal "Turing test" asking humans to discriminate between automatic and human transcription error cases.

    Advances in All-Neural Speech Recognition

    This paper advances the design of CTC-based all-neural (or end-to-end) speech recognizers. We propose a novel symbol inventory and a novel iterated-CTC method in which a second system is used to transform a noisy initial output into a cleaner version. We present a number of stabilization and initialization methods we have found useful in training these networks. We evaluate our system on the commonly used NIST 2000 conversational telephony test set, and significantly exceed the previously published performance of similar systems, both with and without the use of an external language model and decoding technology.

    Composite suspended sediment particles and flocculation in glacial meltwaters: preliminary evidence from Alpine and Himalayan basins

    Research over the last decade has shown that the suspended sediment loads of many rivers are dominated by composite particles. These particles are also known as aggregates or flocs, and are commonly made up of organic matter and constituent mineral particles that exhibit a wide range of grain sizes. The resulting in situ or effective particle size characteristics of fluvial suspended sediment exert a major control on all processes of entrainment, transport and deposition. The significance of composite suspended sediment particles in glacial meltwater streams has, however, not been established. Existing data on the particle size characteristics of suspended sediment in glacial meltwaters relate to the dispersed mineral fraction (absolute particle size), which, for certain size fractions, may bear little relationship to the effective or in situ distribution. Existing understanding of composite particle formation within freshwater environments would suggest that in-stream flocculation processes do not take place in glacial meltwater systems because of the absence of organic binding agents. However, we report preliminary scanning electron microscopy data for one Alpine and two Himalayan glaciers that show composite particles are present in the suspended sediment load of the meltwater system. The genesis and structure of these composite particles and their constituent grain size characteristics are discussed. We present evidence for the existence of both aggregates, or composite particles whose features are largely inherited from source materials, and flocs, which represent composite particles produced by in-stream flocculation processes. In the absence of organic materials, the latter may result solely from electrochemical flocculation in the meltwater sediment system. This type of floc formation has not been reported previously in the freshwater fluvial environment. Further work is needed to test the wider significance of these data and to investigate the effective particle size characteristics of suspended sediment associated with high-concentration outburst events. Such events make a major contribution to suspended sediment fluxes in meltwater streams and may provide conditions that are conducive to composite particle formation by flocculation.

    The Microsoft 2017 Conversational Speech Recognition System

    We describe the 2017 version of Microsoft's conversational speech recognition system, in which we update our 2016 system with recent developments in neural-network-based acoustic and language modeling to further advance the state of the art on the Switchboard speech recognition task. The system adds a CNN-BLSTM acoustic model to the set of model architectures we combined previously, and includes character-based and dialog-session-aware LSTM language models in rescoring. For system combination we adopt a two-stage approach, whereby subsets of acoustic models are first combined at the senone/frame level, followed by word-level voting via confusion networks. We also added a confusion network rescoring step after system combination. The resulting system yields a 5.1% word error rate on the 2000 Switchboard evaluation set.

    The Microsoft 2016 Conversational Speech Recognition System

    We describe Microsoft's conversational speech recognition system, in which we combine recent developments in neural-network-based acoustic and language modeling to advance the state of the art on the Switchboard recognition task. Inspired by machine learning ensemble techniques, the system uses a range of convolutional and recurrent neural networks. I-vector modeling and lattice-free MMI training provide significant gains for all acoustic model architectures. Language model rescoring with multiple forward- and backward-running RNNLMs and word-posterior-based system combination provide a 20% boost. The best single system uses a ResNet architecture acoustic model with RNNLM rescoring, and achieves a word error rate of 6.9% on the NIST 2000 Switchboard task. The combined system has an error rate of 6.2%, representing an improvement over previously reported results on this benchmark task.

    Erosion characteristics and floc strength of Athabasca river cohesive sediments: towards managing sediment-related issues

    Purpose: Most of Canada's tar sands operations are located in the Athabasca river basin. Cohesive sediments deposited in the Athabasca river and its tributaries are a potential source of PAHs in the basin. The erosional behavior of cohesive sediments depends not only on fluid turbulence but also on sediment structure, particularly the influence of organic content. This research aims to describe this behavior in Athabasca river sediments. Methods: An experimental study of cohesive sediment dynamics in one of the tributaries, the Muskeg river, was carried out in a rotating annular flume. Varying the shear stress allowed the erosional strength to be determined for beds with different consolidation periods. Particle size measurements were made with a laser diffraction device operated in continuous flow-through mode. Optical analyses of flocs (ESEM and TEM) were performed on samples taken at the end of the experiments. Results: An inverse relationship between suspended sediment concentration (SS) and the consolidation period was found. In this research the differences are attributed to the increase in the organic content of the sediments with consolidation period. The particle size measurements during the experiments showed differences in floc strength that are also related to changes in organic content across consolidation periods. ESEM and TEM observations confirm the structural differences for beds with different consolidation periods. The effects of SFGL on floc structure and on biostabilization of the bed are discussed. Conclusions: This paper recommends that the consolidation period be taken into account when modeling the erosion of cohesive sediments in the Athabasca river. For pollutant (PAH) transport models, it is highly recommended that the organic content of flocs, particularly algae, be considered in the resuspension module. Environment Canada, CONACY

    The Prevalence of Freshwater Flocculation in Cold Regions: A Case Study from the Mackenzie River Delta, Northwest Territories, Canada

    The Mackenzie River Delta (MRD) is used as a case study for evaluating the extent to which flocculation may play an important role in the transport of sediment and associated contaminants in arctic regions. Samples were collected for nondestructive analysis of particle/floc size, major ions, particulate organic carbon (POC), dissolved organic carbon (DOC), bacterial counts, and suspended solid (SS) concentrations. On-site measurements were made for pH, conductivity, and temperature. Results indicate that the dominant form of sediment transport to and within the MRD is flocs, and not traditionally sized primary particles. It is shown that the flocs of the Mackenzie Delta are at times larger in size than those in southern Ontario rivers that have been studied. The sediment distributions were bimodal in nature; the particle-deficient zone potentially represented a preferential particle size for flocculation. Spatial and temporal trends in the grain-size distributions suggest site-specific controlling factors of flocculation, such as source area and sediment characteristics. It is hypothesized that water temperature, suspended solid concentration, and bacteria are the important factors in controlling flocculation within the Delta.