317 research outputs found

    A survey on artificial intelligence-based acoustic source identification

    The concept of Acoustic Source Identification (ASI), which refers to the process of identifying noise sources, has attracted increasing attention in recent years. ASI technology can be used for surveillance, monitoring, and maintenance applications in a wide range of sectors, such as defence, manufacturing, healthcare, and agriculture. Acoustic signature analysis and pattern recognition remain the core technologies for noise source identification. Manual identification of acoustic signatures, however, has become increasingly challenging as dataset sizes grow. As a result, the use of Artificial Intelligence (AI) techniques for identifying noise sources has become increasingly relevant and useful. In this paper, we provide a comprehensive review of AI-based acoustic source identification techniques. We analyze the strengths and weaknesses of AI-based ASI processes and associated methods proposed by researchers in the literature. Additionally, we present a detailed survey of ASI applications in machinery, underwater applications, environment/event source recognition, healthcare, and other fields. We also highlight relevant research directions.

    A Review on the Applications of Machine Learning for Tinnitus Diagnosis Using EEG Signals

    Tinnitus is a prevalent hearing disorder that can be caused by various factors such as age, hearing loss, exposure to loud noises, ear infections or tumors, certain medications, head or neck injuries, and psychological conditions like anxiety and depression. While not every patient requires medical attention, about 20% of sufferers seek clinical intervention. Early diagnosis is crucial for effective treatment, and recent developments in tinnitus detection aim to support it. Over the past few years, there has been notable growth in the use of electroencephalography (EEG) to study variations in oscillatory brain activity related to tinnitus. However, the results obtained from numerous studies vary greatly, leading to conflicting conclusions. Currently, clinicians rely solely on their expertise to identify individuals with tinnitus. Researchers in this field have incorporated various data modalities and machine-learning techniques to aid clinicians in identifying tinnitus characteristics and classifying people with tinnitus. This article reviews studies that use machine learning (ML) to identify or predict tinnitus patients using EEG signals as input data. We evaluated 11 articles published between 2016 and 2023 using a systematic literature review (SLR) method. This article summarizes all the research reviewed and compares the significant aspects of each study. Additionally, we performed statistical analyses to gain a deeper understanding of the most recent research in this area. Almost all of the reviewed articles followed a five-step procedure to achieve the goal of tinnitus disclosure. Finally, we discuss the open issues and challenges in this method of tinnitus recognition or prediction and suggest future directions for research.

    Strategies for neural networks in ballistocardiography with a view towards hardware implementation

    A thesis submitted for the degree of Doctor of Philosophy at the University of Luton. The work described in this thesis is based on the results of a clinical trial conducted by the research team at the Medical Informatics Unit of the University of Cambridge, which show that the Ballistocardiogram (BCG) has prognostic value in detecting impaired left ventricular function before it becomes clinically overt as myocardial infarction leading to sudden death. The objective of this study is to develop and demonstrate a framework for realising an on-line BCG signal classification model in a portable device that would have the potential to find pathological signs as early as possible for home health care. Two new on-line automatic BCG classification models for time domain BCG classification are proposed. Both systems are based on a two-stage process: input feature extraction followed by a neural classifier. One system uses a principal component analysis neural network, and the other a discrete wavelet transform, to reduce the input dimensionality. Results of the classification, dimensionality reduction, and comparison are presented. The results indicate that the combined wavelet transform and MLP system performs more reliably than the combined neural networks system in situations where the data available to determine the network parameters is limited. Moreover, the wavelet transform requires no prior knowledge of the statistical distribution of data samples, and the computational complexity and training time are reduced. Overall, a methodology for realising an automatic BCG classification system for a portable instrument is presented. A fully parallel neural network design for a low cost platform using field programmable gate arrays (Xilinx's XC4000 series) is explored. This addresses the potential speed requirements in the biomedical signal processing field. It also demonstrates a flexible hardware design approach so that an instrument's parameters can be updated as data expands with time. To reduce the hardware design complexity and to increase the system performance, a hybrid learning algorithm using random optimisation and the backpropagation rule is developed to achieve an efficient weight update mechanism in low weight precision learning. The simulation results show that the hybrid learning algorithm is effective in solving the network paralysis problem and that convergence is much faster than with the standard backpropagation rule. The hidden and output layer nodes have been mapped on Xilinx FPGAs with automatic placement and routing tools. The static timing analysis results suggest that the proposed network implementation could achieve a performance of 2.7 billion connections per second.
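The wavelet-based dimensionality reduction mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which wavelet family or how many decomposition levels the thesis uses, so the Haar wavelet and the helper names `haar_dwt` and `reduce_features` below are assumptions chosen for brevity, not the author's method.

```python
import math


def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficient lists, each half the
    length of the input. The Haar wavelet is an illustrative assumption.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def reduce_features(signal, levels):
    """Keep only the approximation coefficients after `levels` Haar steps.

    Each level halves the number of inputs a classifier would receive.
    """
    coeffs = list(signal)
    for _ in range(levels):
        coeffs, _detail = haar_dwt(coeffs)
    return coeffs


if __name__ == "__main__":
    beat = [0.0, 0.2, 0.9, 1.0, 0.4, 0.1, -0.3, -0.1]  # made-up samples
    print(reduce_features(beat, levels=2))  # 8 samples reduced to 2 features
```

Because the transform is a fixed linear operation, it needs no training data, which is consistent with the abstract's remark that the wavelet approach requires no prior knowledge of the data distribution.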

    Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View

    The attention that deep learning has garnered from the academic community and industry continues to grow year over year, and it has been said that we are in a new golden age of artificial intelligence research. However, neural networks are still often seen as a "black box" where learning occurs but cannot be understood in a human-interpretable way. Since these machine learning systems are increasingly being adopted in security contexts, it is important to explore these interpretations. We consider an Android malware traffic dataset for approaching this problem. Then, using the information plane, we explore how homeomorphism affects learned representation of the data and the invariance of the mutual information captured by the parameters on that data. We empirically validate these results, using accuracy as a second measure of similarity of learned representations. Our results suggest that although the details of learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same. Furthermore, our results show that since mutual information remains invariant under homeomorphism, only feature engineering methods that alter the entropy of the dataset will change the outcome of the neural network. This means that for some datasets and tasks, neural networks require meaningful, human-driven feature engineering or changes in architecture to provide enough information for the neural network to generate a sufficient statistic. Applying our results can serve to guide analysis methods for machine learning engineers and suggests that neural networks that can exploit the convolution theorem are as accurate as standard convolutional neural networks, and can be more computationally efficient.
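The invariance claim in this abstract can be illustrated on a toy discrete example. This is only a sketch under stated assumptions: a bijective relabeling of a finite feature alphabet stands in for a homeomorphism, the data are made up, and `mutual_information` is a hypothetical helper using a plug-in estimate, not the paper's procedure.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) / (p(x) p(y)) simplifies to count * n / (px[x] * py[y])
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


if __name__ == "__main__":
    features = [0, 0, 1, 1, 2, 2]  # toy feature values
    labels = [0, 0, 1, 1, 0, 0]    # toy class labels

    # Invertible relabeling: no information about the label is lost.
    bijection = {0: 7, 1: 3, 2: 5}
    # Non-invertible map: values 1 and 2 are merged, discarding information.
    merge = {0: 0, 1: 1, 2: 1}

    print(mutual_information(features, labels))
    print(mutual_information([bijection[x] for x in features], labels))
    print(mutual_information([merge[x] for x in features], labels))
```

The first two estimates agree, while the merged encoding yields a smaller value, which mirrors the abstract's point that only transformations that change the information content of the features can change what a network is able to learn from them.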

    Multilabel Sound Event Classification with Neural Networks

    Multiple sound events occur simultaneously in a real-life audio recording collected, for example, at a busy street in rush hour. The events may include traffic noise, the sound of rain, people talking, etc. Humans are remarkably good at distinguishing these individual events, but as yet no machine can detect them with (even close to) human accuracy. The polyphonic nature of environmental audio recordings makes it hard to detect single sound events when many events overlap. With the large audio databases and state-of-the-art machine learning methods of the digital age, this is bound to change. In this thesis, we use frequency-domain features to represent the audio input and multilabel deep neural networks (DNN) to detect multiple, simultaneous sound events in a real-life recording. We extract frequency-domain features from these recordings in short time frames. DNNs are artificial neural networks (ANN) with two or more hidden layers, and they are especially good at modeling highly nonlinear relations and finding intermediate representations between system input and output. This is exactly the case in real-life sound event detection. Each extracted feature vector is used as a training example, and we train the neural network with these examples. For the evaluation of this work, we focus on the performance of different DNN topologies used in this task. A large number of hyperparameters define the structure of a DNN, such as the number of neurons in a layer, the learning rate used during training, the number of hidden layers, etc. The effects of each of these parameters are investigated in detail. A detection accuracy of 66.5% is achieved, which outperforms the state-of-the-art method by a large margin.
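A multilabel output stage differs from a single-label one in that each event class is decided independently, so several events can be reported for the same frame. The sketch below shows one common design, a sigmoid per class followed by a threshold. The abstract does not specify the thesis's output activation or threshold, so both, along with the class names and the helper `multilabel_predict`, are assumptions.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def multilabel_predict(logits, labels, threshold=0.5):
    """Return every label whose sigmoid activation reaches the threshold.

    Unlike softmax with argmax, each class is decided on its own, so
    zero, one, or several events can be active in the same time frame.
    """
    return [label for z, label in zip(logits, labels) if sigmoid(z) >= threshold]


if __name__ == "__main__":
    classes = ["traffic", "rain", "speech"]  # hypothetical event classes
    frame_logits = [2.0, -1.0, 0.3]          # made-up network outputs for one frame
    print(multilabel_predict(frame_logits, classes))  # ['traffic', 'speech']
```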

    Increasing Accuracy Performance through Optimal Feature Extraction Algorithms

    This research developed models and techniques to improve the three key modules of popular recognition systems: preprocessing, feature extraction, and classification. Improvements were made in four key areas: processing speed, algorithm complexity, storage space, and accuracy. The focus was on the application areas of face, traffic sign, and speaker recognition. In the preprocessing module of facial and traffic sign recognition, improvements were made through the use of grayscaling and anisotropic diffusion. In the feature extraction module, improvements were made in two ways: first, through the use of mixed transforms, and second, through a convolutional neural network (CNN) that best fits specific datasets. The mixed transform system consists of various combinations of the Discrete Wavelet Transform (DWT) and Discrete Cosine Transform (DCT), which have a reliable track record for image feature extraction. For the proposed CNN, a neuroevolution system was used to determine the characteristics and layout of a CNN that best extracts image features for particular datasets. In the speaker recognition system, the improvement to the feature extraction module comprised a quantized spectral covariance matrix and a two-dimensional Principal Component Analysis (2DPCA) function. In the classification module, enhancements were made in visual recognition through the use of two neural networks: the multilayer sigmoid and convolutional neural network. Results show that the proposed improvements in the three modules led to an increase in accuracy as well as reduced algorithmic complexity, with corresponding reductions in storage space and processing time.