215 research outputs found

    Hidden Markov Models

    Hidden Markov Models (HMMs), although known for decades, are now widely used and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neuroscience, computational biology, bioinformatics, seismology, environmental protection, and engineering. I hope that readers will find this book useful and helpful for their own research.

    Audio Spatio-Temporal Fingerprints for Cloudless Real-Time Hands-Free Diarization on Mobile Devices

    In this paper, we propose a new low-bit-rate representation of a sound field and a corresponding method for cloudless, low-delay, hands-free diarization suitable for low-performance mobile devices, e.g. mobile phones. The proposed audio spatio-temporal fingerprint representation results in a low bit rate (500 bytes/second) yet contains complete information about continuous audio tracking of multiple acoustic sources in an open, unconstrained environment. The core of the algorithm is based on simultaneous processing of multiple data streams using the audio spatio-temporal fingerprint representation to cover higher-level events relevant for diarization, e.g. turns, interruptions, crosstalk, and speech and non-speech segments. Performance levels achieved to date on 5 hours of hand-labelled datasets have shown the feasibility of the approach, with a 7.58% CPU load on a 1-core ultra-low-power mobile processor running at 1 GHz and a low algorithmic delay of 112 ms.

    Cross-layer design for multimedia applications in cognitive radio networks.

    Ph.D. University of KwaZulu-Natal, Durban 2015. The exponential growth in wireless services and the current trend of development in wireless communication technologies have resulted in a radio spectrum so overcrowded that it can no longer meet the ever-increasing requirements of wireless applications. In contrast, literature surveys indicate that a large portion of the licensed radio spectrum bands is underutilized. This has created the need for efficient spectrum sharing among different systems, applications and services in dynamic wireless environments. Cognitive radio (CR) technology has emerged as a way to improve the overall efficiency of radio spectrum utilization by allowing unlicensed users (also known as secondary users) to utilize a licensed band when it is vacant. Multimedia applications are being targeted for CR networks. However, the performance and success of CR technology will be determined by the quality of service (QoS) perceived by secondary users. To transmit multimedia content, which has stringent QoS requirements, over CR networks, many technical challenges imposed by the layered protocol architecture have to be addressed. Cross-layer design has shown promise as an approach to optimizing network performance across different layers. This work addresses the question of how to provide QoS guarantees for multimedia transmission over CR networks in terms of throughput maximization while ensuring that interference to primary users is avoided or minimized. Spectrum sensing is a fundamental problem in cognitive radio networks for the protection of primary users, and therefore the first part of this work provides a review of some low-complexity spectrum sensing schemes. A cooperative spectrum sensing scheme in which multiple users independently perform spectrum sensing is also developed. 
To address the hidden node problem, a cooperative relay based on the amplify-and-forward (AF) technique is formulated. Usually the performance of a spectrum sensor is evaluated using the receiver operating characteristic (ROC) curve, which captures the trade-off between the probability of missed detection and the probability of false alarm. Due to hardware limitations, the spectrum sensor cannot sense the whole range of the radio spectrum, which results in partial information about the channel state. To model a media access control (MAC) protocol that can make channel access decisions under partial information about the state of the system, we apply a partially observable Markov decision process (POMDP), a suitable tool for decision making under uncertainty. A throughput-optimizing MAC scheme in the presence of spectrum sensing errors is then developed using the concept of cross-layer design, which integrates the design of spectrum sensing at the physical layer (PHY) with sensing and access strategies at the MAC layer in order to maximize the overall network throughput. The problem is formulated as a POMDP and the throughput performance of the scheme is evaluated using computer simulations under a greedy sensing algorithm. Simulation results demonstrate improved overall throughput performance. Furthermore, multiple channels with multiple secondary users having random message arrivals are considered during simulation, and the throughput performance is evaluated under the greedy sensing scheme, which forms a benchmark for the cross-layer MAC scheme in the presence of spectrum sensing errors. Recognizing that speech communication is still the most dominant and common service in wireless applications, we develop a cross-layer MAC scheme for speech transmission in CR networks. 
The design aims to maximize the throughput of secondary users by integrating the design of spectrum sensing at PHY, the quantization parameter of speech traffic at the application layer (APP), and the strategy for spectrum access at the MAC layer, with the main goal of improving the QoS perceived by secondary users in CR networks. Simulation results demonstrate improved throughput performance and hence improved QoS. One of the main features of modern communication systems is parameterized operation at different layers of the protocol stack. This feature aims to give them the capability to adapt to rapidly changing traffic, channel and system conditions. Another interesting research problem in this thesis is the combination of individual adaptation mechanisms into a cross-layer design that can maximize their effectiveness. We propose a joint cross-layer MAC scheme that integrates the design of spectrum sensing at the PHY layer, access at the MAC layer and APP information in order to improve the QoS for video transmission in CR networks. The end-to-end video distortion, which is considered an APP parameter, resides in the video encoder. It is integrated into the state space and the problem is formulated as a constrained POMDP. The H.264 coding algorithm, one of the highly efficient video coding standards, is considered. The objective is to minimize this end-to-end video distortion while maximizing the overall network throughput for video transmission in CR networks. The end-to-end video distortion has a significant effect on the QoS perceived by the user and is viewed as the cost in the overall system design. Given the target system throughput and the packet loss ratio when the system is in state i and a composite action is taken in time slot t, the system's immediate cost is evaluated. The expected total cost for overall end-to-end video distortion over the total time slots is then computed. 
A joint optimal policy that minimizes the expected total end-to-end distortion over the total time slots is computed iteratively. The minimum expected cost (also known as the value function) is also evaluated iteratively over the total time slots. The throughput performance of the proposed scheme is evaluated through computer simulation using four simulation scenarios: A, B, C and D. In simulation scenario A, the average throughput performance as a function of time horizon is studied. The throughput performance under a channel access decision based on the belief vector is compared with that under a channel access decision based on the end-to-end distortion. Simulation results show that the channel access decision based on end-to-end distortion outperforms the channel access decision based on a belief vector. In simulation scenario B, we studied the spectral efficiency as a function of the prescribed collision probability. The simulation results show that at large values of collision probability the overall spectral efficiency is poor. However, there is an optimal value of collision probability at which the spectral efficiency approaches that of the perfect channel access decision. In simulation scenario C, we studied the average throughput performance and the spectral efficiency, both as functions of the prescribed collision probability. The simulation results show that both the average throughput and the spectral efficiency are highly affected by an increase in collision probability. However, there is an optimal prescribed collision probability that achieves the maximum average throughput and maximum spectral efficiency.
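
The belief vector mentioned in the abstract above can be illustrated with a small sketch. This is not the thesis's implementation: it assumes each channel is a two-state (busy/idle) Markov chain with hypothetical transition probabilities `p01` (busy to idle) and `p11` (idle to idle), and an imperfect sensor described by a false-alarm probability `p_fa` and a missed-detection probability `p_md`. The function names are illustrative only. The belief is the probability that a channel is idle; it is propagated through the Markov chain, corrected by Bayes' rule after each sensing outcome, and a greedy rule picks the channel most likely to be idle in the next slot.

```python
def predict(belief, p01, p11):
    """Propagate P(idle) one slot ahead through the assumed two-state Markov chain."""
    return belief * p11 + (1.0 - belief) * p01


def update(prior, sensed_idle, p_fa, p_md):
    """Bayes update of P(idle) given an imperfect sensing outcome.

    p_fa: probability the sensor reports busy when the channel is idle.
    p_md: probability the sensor reports idle when the channel is busy.
    """
    if sensed_idle:
        like_idle, like_busy = 1.0 - p_fa, p_md
    else:
        like_idle, like_busy = p_fa, 1.0 - p_md
    evidence = prior * like_idle + (1.0 - prior) * like_busy
    if evidence == 0.0:
        # Observation has zero probability under the model; keep the prior.
        return prior
    return prior * like_idle / evidence


def greedy_channel(beliefs, p01, p11):
    """Greedy sensing rule: index of the channel most likely idle in the next slot."""
    predicted = [predict(b, p01, p11) for b in beliefs]
    return max(range(len(predicted)), key=predicted.__getitem__)


if __name__ == "__main__":
    beliefs = [0.2, 0.9, 0.5]          # illustrative starting beliefs
    k = greedy_channel(beliefs, p01=0.2, p11=0.8)
    prior = predict(beliefs[k], 0.2, 0.8)
    print(k, update(prior, sensed_idle=True, p_fa=0.1, p_md=0.1))
```

A full POMDP policy would weigh future rewards as well; the greedy rule here only maximizes the immediate chance of finding an idle channel, which matches the benchmark role the abstract gives to greedy sensing.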

    Multi-categories tool wear classification in micro-milling

    Ph.D., Doctor of Philosophy

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation of speech signals and the methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and applications that operate in real-world environments, such as mobile communication services and smart homes.