2,024 research outputs found

    Spatial, Spectral, and Perceptual Nonlinear Noise Reduction for Hands-free Microphones in a Car

    Speech enhancement in an automobile is a challenging problem because interference can come from engine noise, fans, music, wind, road noise, reverberation, echo, and passengers engaging in other conversations. Hands-free microphones make the situation worse because the strength of the desired speech signal reduces with increased distance between the microphone and talker. Automobile safety is improved when the driver can use a hands-free interface to phones and other devices instead of taking his eyes off the road. The demand for high quality hands-free communication in the automobile requires the introduction of more powerful algorithms. This thesis shows that a unique combination of five algorithms can achieve superior speech enhancement for a hands-free system when compared to beamforming or spectral subtraction alone. Several different designs were analyzed and tested before converging on the configuration that achieved the best results. Beamforming, voice activity detection, spectral subtraction, perceptual nonlinear weighting, and talker isolation via pitch tracking all work together in a complementary iterative manner to create a speech enhancement system capable of significantly enhancing real world speech signals. The following conclusions are supported by the simulation results using data recorded in a car and are in strong agreement with theory. Adaptive beamforming, like the Generalized Side-lobe Canceller (GSC), can be effectively used if the filters only adapt during silent data frames because too much of the desired speech is cancelled otherwise. Spectral subtraction removes stationary noise while perceptual weighting prevents the introduction of offensive audible noise artifacts. Talker isolation via pitch tracking can perform better when used after beamforming and spectral subtraction because of the higher accuracy obtained after initial noise removal. 
Iterating the algorithm once increases the accuracy of the Voice Activity Detection (VAD), which improves the overall performance of the algorithm. Placing the microphone(s) on the ceiling above the head and slightly forward of the desired talker appears to be the best location in an automobile based on the experiments performed in this thesis. Objective speech quality measures show that the algorithm removes a majority of the stationary noise in a hands-free environment of an automobile with relatively minimal speech distortion.
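
    The abstract above says that stationary noise is removed by spectral subtraction and that adaptation is safest during silent frames. A minimal sketch of that idea follows. It is not the thesis's implementation: it assumes per-frame magnitude spectra and VAD flags are already available, and it replaces the perceptual weighting with a simple spectral floor. The names `estimate_noise`, `spectral_subtract`, and `floor` are hypothetical.

```python
from typing import List, Sequence


def estimate_noise(frames: Sequence[Sequence[float]],
                   is_speech: Sequence[bool]) -> List[float]:
    """Average the magnitude spectrum over frames flagged as silent by the VAD."""
    silent = [f for f, speech in zip(frames, is_speech) if not speech]
    if not silent:
        raise ValueError("need at least one silent frame to estimate noise")
    n_bins = len(silent[0])
    return [sum(f[k] for f in silent) / len(silent) for k in range(n_bins)]


def spectral_subtract(frames: Sequence[Sequence[float]],
                      noise: Sequence[float],
                      floor: float = 0.05) -> List[List[float]]:
    """Subtract the noise estimate from each bin, never going below
    floor * original magnitude (an assumed stand-in for perceptual weighting)."""
    return [[max(m - n, floor * m) for m, n in zip(frame, noise)]
            for frame in frames]


# Toy data: two silent frames followed by one speech frame (3 frequency bins).
frames = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [5.0, 1.0, 3.0]]
is_speech = [False, False, True]  # assumed to come from a separate VAD

noise = estimate_noise(frames, is_speech)
cleaned = spectral_subtract(frames, noise)
print(noise)
print(cleaned[2])
```

    The floor keeps bins from being zeroed outright, which is one common way to limit audible artifacts. The thesis's perceptual nonlinear weighting may work quite differently.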

    AudioPairBank: Towards A Large-Scale Tag-Pair-Based Audio Content Analysis

    Recently, sound recognition has been used to identify sounds, such as car and river. However, sounds have nuances that may be better described by adjective-noun pairs such as slow car, and verb-noun pairs such as flying insects, which are underexplored. Therefore, in this work we investigate the relation between audio content and both adjective-noun pairs and verb-noun pairs. Due to the lack of datasets with these kinds of annotations, we collected and processed the AudioPairBank corpus consisting of a combined total of 1,123 pairs and over 33,000 audio files. One contribution is the previously unavailable documentation of the challenges and implications of collecting audio recordings with these types of labels. A second contribution is to show the degree of correlation between the audio content and the labels through sound recognition experiments, which yielded results of 70% accuracy, hence also providing a performance benchmark. The results and study in this paper encourage further exploration of the nuances in audio and are meant to complement similar research performed on images and text in multimedia analysis. Comment: This paper is a revised version of "AudioSentibank: Large-scale Semantic Ontology of Acoustic Concepts for Audio Content Analysis".

    A New Lean Model: Improving Race Team Performance through Team-Driver Communication Efficacy

    In some organizational settings and in the field of competitive automobile racing, certain situations and rules place an emphasis on and sometimes escalate the need for effective team communications. This dissertation hypothesizes that effective and dense communications contributes directly to team performance. Supported by organizational behavioral and lean six sigma theory, communications is declared a form of waste within the context of Industrial Engineering subject to data collection, measurements, and real-time, value-added metrics. Measuring and reporting trends in communications provides a basis for a new and unique model called a Communications Productivity Model (CPM) with an associated Communications Density Report (CDR). Industrial Engineering productivity, statistics, linguistic and text analysis tools were combined to develop a unique Dynamic Productivity Index (DPI) enhancing the CDR as a means to rapidly provide meaningful and value-added feedback on recent and future performance. Data was collected on actual automobile racing teams to validate the new communications model, report on the results using the CDR and introduce the DPI. Future research is also proposed in this dissertation to enhance the new communications model whereby speech recognition technologies are evaluated and tested.

    Terminology mining in social media

    The highly variable and dynamic word usage in social media presents serious challenges for both research and those commercial applications that are geared towards blogs or other user-generated non-editorial texts. This paper discusses and exemplifies a terminology mining approach for dealing with the productive character of the textual environment in social media. We explore the challenges of practically acquiring new terminology, and of modeling similarity and relatedness of terms from observing realistic amounts of data. We also discuss semantic evolution and density, and investigate novel measures for characterizing the preconditions for terminology mining.

    Digital Signal Processing Research Program

    Contains table of contents for Section 2, an introduction, reports on twenty-one research projects and a list of publications. U.S. Navy - Office of Naval Research Grant N00014-93-1-0686; Lockheed Sanders, Inc. Contract P.O. BY5561; U.S. Air Force - Office of Scientific Research Grant AFOSR 91-0034; National Science Foundation Grant MIP 95-02885; U.S. Navy - Office of Naval Research Grant N00014-95-1-0834; MIT-WHOI Joint Graduate Program in Oceanographic Engineering; AT&T Laboratories Doctoral Support Program; Defense Advanced Research Projects Agency/U.S. Navy - Office of Naval Research Grant N00014-89-J-1489; Lockheed Sanders/U.S. Navy - Office of Naval Research Grant N00014-91-C-0125; U.S. Navy - Office of Naval Research Grant N00014-89-J-1489; National Science Foundation Grant MIP 95-02885; Defense Advanced Research Projects Agency/U.S. Navy Contract DAAH04-95-1-0473; U.S. Navy - Office of Naval Research Grant N00014-91-J-1628; University of California/Scripps Institute of Oceanography Contract 1003-73-5.

    Neuromorphic engineering needs closed-loop benchmarks

    Neuromorphic engineering aims to build (autonomous) systems by mimicking biological systems. It is motivated by the observation that biological organisms, from algae to primates, excel in sensing their environment, reacting promptly to their perils and opportunities. Furthermore, they do so more resiliently than our most advanced machines, at a fraction of the power consumption. It follows that the performance of neuromorphic systems should be evaluated in terms of real-time operation, power consumption, and resiliency to real-world perturbations and noise using task-relevant evaluation metrics. Yet, following in the footsteps of conventional machine learning, most neuromorphic benchmarks rely on recorded datasets that foster sensing accuracy as the primary measure for performance. Sensing accuracy is but an arbitrary proxy for the actual system's goal: taking a good decision in a timely manner. Moreover, static datasets hinder our ability to study and compare closed-loop sensing and control strategies that are central to survival for biological organisms. This article makes the case for a renewed focus on closed-loop benchmarks involving real-world tasks. Such benchmarks will be crucial in developing and progressing neuromorphic intelligence. The shift towards dynamic real-world benchmarking tasks should usher in richer, more resilient, and robust artificially intelligent systems in the future.