
    Deep Learning Techniques for Music Generation -- A Survey

    Full text link
    This paper is a survey and analysis of different ways of using deep learning (deep artificial neural networks) to generate musical content. We propose a methodology based on five dimensions for our analysis:
    - Objective: What musical content is to be generated (e.g., melody, polyphony, accompaniment or counterpoint)? For what destination and use: to be performed by a human (a musical score) or by a machine (an audio file)?
    - Representation: What concepts are to be manipulated (e.g., waveform, spectrogram, note, chord, meter, beat)? What format is to be used (e.g., MIDI, piano roll or text)? How will the representation be encoded (e.g., scalar, one-hot or many-hot)?
    - Architecture: What type(s) of deep neural network are to be used (e.g., feedforward network, recurrent network, autoencoder or generative adversarial network)?
    - Challenge: What are the limitations and open challenges (e.g., variability, interactivity, creativity)?
    - Strategy: How do we model and control the generation process (e.g., single-step feedforward, iterative feedforward, sampling or input manipulation)?
    For each dimension, we conduct a comparative analysis of various models and techniques and propose a tentative multidimensional typology. This typology is bottom-up, based on the analysis of many existing deep-learning-based systems for music generation selected from the relevant literature. These systems are described and used to exemplify the various choices of objective, representation, architecture, challenge and strategy. The last section includes some discussion and prospects.
    Comment: 209 pages. This paper is a simplified version of the book: J.-P. Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music Generation, Computational Synthesis and Creative Systems, Springer, 2019.
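
    As a minimal illustration of the encoding choices under the Representation dimension, the sketch below one-hot encodes a monophonic melody as a piano-roll-style matrix. The pitch range and the example melody are assumptions for illustration, not taken from the survey.

        import numpy as np

        # One-hot encoding of a monophonic melody: each time step activates
        # exactly one pitch. Polyphony would use a many-hot encoding instead.
        PITCH_RANGE = list(range(60, 73))  # MIDI pitches C4..C5 (assumed range)

        def one_hot_melody(midi_pitches):
            """Encode a melody as a (time_steps, num_pitches) one-hot matrix."""
            encoding = np.zeros((len(midi_pitches), len(PITCH_RANGE)))
            for t, pitch in enumerate(midi_pitches):
                encoding[t, PITCH_RANGE.index(pitch)] = 1.0
            return encoding

        print(one_hot_melody([60, 62, 64, 65, 67]).shape)  # (5, 13)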

    Gas Detection and Identification Using Multimodal Artificial Intelligence Based Sensor Fusion

    Get PDF
    With rapid industrialization and technological advancement, engineering solutions that are cost-effective, fast and easy to implement are essential. One area of concern is the rising number of accidents caused by gas leaks at coal mines, chemical plants, home appliances, etc. In this paper we propose a novel approach to detecting and identifying gaseous emissions using multimodal AI fusion techniques. Most gases and their fumes are colorless, odorless and tasteless, challenging our normal human senses. Sensing based on a single sensor may not be accurate, and sensor fusion is essential for robust and reliable detection in many real-world applications. We manually collected 6,400 gas samples (1,600 samples per class for four classes) using two sensors: an array of seven semiconductor gas sensors and a thermal camera. We apply the early fusion method of multimodal AI: the network architecture consists of a feature-extraction module for each modality, whose outputs are fused in a merge layer followed by a dense layer that produces a single output identifying the gas. We obtained a testing accuracy of 96% for the fused model, as opposed to individual-model accuracies of 82% (gas-sensor data with an LSTM) and 93% (thermal-image data with a CNN). The results demonstrate that fusing multiple sensors and modalities outperforms a single sensor.
    Comment: 14 pages, 9 figures.
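
    A minimal sketch of the fusion architecture described above, assuming Keras: an LSTM branch for the gas-sensor time series and a small CNN branch for the thermal images, concatenated and passed through dense layers. The sequence length, image resolution and layer sizes are assumptions, not the authors' exact configuration.

        from tensorflow.keras import layers, Model

        TIME_STEPS, N_SENSORS, N_CLASSES = 100, 7, 4  # 7-sensor array, 4 gas classes

        # Gas-sensor branch: time series -> LSTM features
        sensor_in = layers.Input(shape=(TIME_STEPS, N_SENSORS))
        sensor_feat = layers.LSTM(64)(sensor_in)

        # Thermal-camera branch: image -> CNN features (64x64 grayscale assumed)
        thermal_in = layers.Input(shape=(64, 64, 1))
        x = layers.Conv2D(16, 3, activation="relu")(thermal_in)
        x = layers.MaxPooling2D()(x)
        x = layers.Conv2D(32, 3, activation="relu")(x)
        x = layers.GlobalAveragePooling2D()(x)

        # Merge layer followed by a dense layer giving a single identification
        merged = layers.concatenate([sensor_feat, x])
        hidden = layers.Dense(64, activation="relu")(merged)
        out = layers.Dense(N_CLASSES, activation="softmax")(hidden)

        model = Model([sensor_in, thermal_in], out)
        model.compile(optimizer="adam", loss="categorical_crossentropy",
                      metrics=["accuracy"])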

    Peak Alignment of Gas Chromatography-Mass Spectrometry Data with Deep Learning

    Full text link
    We present ChromAlignNet, a deep learning model for alignment of peaks in Gas Chromatography-Mass Spectrometry (GC-MS) data. In GC-MS data, a compound's retention time (RT) may not stay fixed across multiple chromatograms. Using GC-MS data for biomarker discovery requires aligning the RTs of identical analytes across different samples. Current alignment methods are all based on sets of formal, mathematical rules. We present a solution to GC-MS alignment using deep neural networks, which are more adept at complex, fuzzy data sets. We tested our model on several GC-MS data sets of varying complexity and analysed the alignment results quantitatively. We show the model performs very well (AUC ~1 for simple data sets and AUC ~0.85 for very complex data sets) and easily outperforms existing algorithms on complex data sets. Compared with existing methods, ChromAlignNet is very easy to use, as it requires no user input of reference chromatograms or parameters. The method can easily be adapted to other similar data, such as those from liquid chromatography. The source code is written in Python and is available online.
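
    The AUC figures above summarize alignment quality as a binary-classification score. A toy sketch of that evaluation, assuming a pairwise framing (does a peak pair from two chromatograms belong to the same compound?) and synthetic match scores, since the abstract does not fix these details:

        import numpy as np
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(0)
        # 1 = peak pair is the same compound, 0 = different (synthetic labels)
        y_true = rng.integers(0, 2, size=200)
        # Hypothetical model scores, correlated with the true labels
        y_score = np.clip(y_true * 0.7 + rng.normal(0.15, 0.2, size=200), 0, 1)

        print(f"AUC = {roc_auc_score(y_true, y_score):.2f}")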

    Smartphone Gesture-Based Authentication

    Get PDF
    In this research, we consider the problem of authentication on a smartphone based on gestures, that is, movements of the phone. We collected accelerometer data from a number of subjects and analyzed it using a variety of machine learning techniques, including support vector machines (SVM) and convolutional neural networks (CNN). We analyze both the fraud rate (false accept rate) and the insult rate (false reject rate) in each case.
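
    A minimal sketch of the two error rates named above, assuming an SVM classifier and synthetic stand-in features (the real study uses collected accelerometer data):

        import numpy as np
        from sklearn.model_selection import train_test_split
        from sklearn.svm import SVC

        rng = np.random.default_rng(1)
        X_genuine = rng.normal(0.0, 1.0, size=(200, 30))   # enrolled user's gestures
        X_imposter = rng.normal(0.8, 1.0, size=(200, 30))  # other subjects' gestures
        X = np.vstack([X_genuine, X_imposter])
        y = np.array([1] * 200 + [0] * 200)                # 1 = genuine, 0 = imposter

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
        pred = SVC(kernel="rbf").fit(X_tr, y_tr).predict(X_te)

        fraud_rate = np.mean(pred[y_te == 0] == 1)   # imposters accepted (false accept)
        insult_rate = np.mean(pred[y_te == 1] == 0)  # genuine users rejected (false reject)
        print(f"fraud rate = {fraud_rate:.2f}, insult rate = {insult_rate:.2f}")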

    A NOVEL APPROACH FOR FAULT DETECTION IN THE AIRCRAFT EXTERIOR BODY USING IMAGE PROCESSING

    Get PDF
    The primary objective of this thesis is to develop innovative techniques for the inspection and maintenance of aircraft structures. We aim to streamline the process by using images to detect potential defects in the aircraft body, comparing them against images of properly functioning sections to determine whether a given section of the aircraft is faulty. We achieve this by employing image processing to train a model capable of identifying faulty images. Our methodology uses images of both defective and operational parts of the aircraft's exterior, which undergo a preprocessing phase that preserves valuable details. During training, a new image of the same section of the aircraft is used to validate the model; after processing, the algorithm classifies the image as faulty or normal. To facilitate our study, we rely on the Convolutional Neural Network (CNN) approach, which extracts distinguishing features from the patches produced as the CNN kernel slides over the image frame. Furthermore, we process the images with various filters using the image-processing toolbox available in Python. In our initial trials, we observed that the CNN model overfit the faulty class. To address this, we applied image augmentation, expanding a small dataset of 87 images into an augmented dataset of 4,000 images. After passing the data through multiple convolutional layers and running multiple epochs, our proposed model achieved a training accuracy of 98.28%. In addition, we designed a GUI-based interface that lets users input an image and view the result as faulty or normal. Finally, we propose the application of this research in the field of robotics as an ideal area for future work.
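
    A minimal sketch of the augmentation step described above, assuming Keras' ImageDataGenerator and 128x128 grayscale inputs; the transform parameters are illustrative, not the thesis' exact settings.

        import numpy as np
        from tensorflow.keras.preprocessing.image import ImageDataGenerator

        augmenter = ImageDataGenerator(
            rotation_range=15,       # small random rotations
            width_shift_range=0.1,   # random horizontal shifts
            height_shift_range=0.1,  # random vertical shifts
            zoom_range=0.1,
            horizontal_flip=True,
        )

        originals = np.random.rand(87, 128, 128, 1)  # stand-in for the 87 images
        batches = []
        for batch in augmenter.flow(originals, batch_size=87, shuffle=False):
            batches.append(batch)
            if len(batches) * 87 >= 4000:            # stop near the 4,000 target
                break
        augmented = np.concatenate(batches)[:4000]
        print(augmented.shape)                       # (4000, 128, 128, 1)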