372 research outputs found

    Real-time Detection of AI-Generated Speech for DeepFake Voice Conversion

    Generative AI in the speech domain has growing implications, as it enables voice cloning and real-time voice conversion from one individual to another. This technology poses a significant ethical threat and could lead to breaches of privacy and misrepresentation, so there is an urgent need for real-time detection of AI-generated speech for DeepFake Voice Conversion. To address these emerging issues, this study generates the DEEP-VOICE dataset, comprising real human speech from eight well-known figures and their speech converted to one another using Retrieval-based Voice Conversion. The task is framed as a binary classification problem of whether the speech is real or AI-generated, and statistical analysis of temporal audio features through t-testing reveals that their distributions differ significantly. Hyperparameter optimisation is implemented for machine learning models to identify the source of speech. Following the training of 208 individual machine learning models over 10-fold cross validation, it is found that the Extreme Gradient Boosting model can achieve an average classification accuracy of 99.3% and can classify speech in real-time, at around 0.004 milliseconds given one second of speech. All data generated for this study is released publicly for future research on AI speech detection.
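
The t-testing step described in the abstract above can be illustrated with a minimal sketch. It assumes Welch's unequal-variance form of the test (the abstract does not say which variant was used), and the feature values and the names `real_feature` and `converted_feature` are made up for illustration; they are not taken from the DEEP-VOICE dataset.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample (sample variance / n).
    a = variance(sample_a) / n_a
    b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(a + b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (n_a - 1) + b ** 2 / (n_b - 1))
    return t, df


# Hypothetical values of one temporal audio feature for each class.
real_feature = [0.10, 0.12, 0.11, 0.13]
converted_feature = [0.20, 0.22, 0.21, 0.23]

t_stat, dof = welch_t(real_feature, converted_feature)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A large absolute t value relative to the degrees of freedom suggests that the feature is distributed differently in the two classes. Turning the statistic into a p-value requires the Student t distribution, which a statistics library such as SciPy provides (`scipy.stats.ttest_ind` with `equal_var=False`).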

    A Deep Evolutionary Approach to Bioinspired Classifier Optimisation for Brain-Machine Interaction

    This study suggests a new approach to EEG data classification by exploring the idea of using evolutionary computation to both select useful discriminative EEG features and optimise the topology of Artificial Neural Networks. An evolutionary algorithm is applied to select the most informative features from an initial set of 2550 EEG statistical features. Optimisation of a Multilayer Perceptron (MLP) is performed with an evolutionary approach before classification to estimate the best hyperparameters of the network. Deep learning and tuning with Long Short-Term Memory (LSTM) are also explored, and Adaptive Boosting of the two types of models is tested for each problem. Three experiments are provided for comparison using different classifiers: one for attention state classification, one for emotional sentiment classification, and a third in which the goal is to guess the number a subject is thinking of. The obtained results show that an Adaptive Boosted LSTM can achieve an accuracy of 84.44%, 97.06%, and 9.94% on the attentional, emotional, and number datasets, respectively. An evolutionary-optimised MLP achieves results close to the Adaptive Boosted LSTM for the first two experiments and significantly higher for the number-guessing experiment, with an Adaptive Boosted DEvo MLP reaching 31.35% while being significantly quicker to train and classify. In particular, the accuracy of the nonboosted DEvo MLP was 79.81%, 96.11%, and 27.07% on the same benchmarks. Two datasets for the experiments were gathered using a Muse EEG headband with four electrodes corresponding to the TP9, AF7, AF8, and TP10 locations of the international EEG placement standard. The EEG MindBigData digits dataset was gathered from the TP9, FP1, FP2, and TP10 locations.

    Cross-domain MLP and CNN Transfer Learning for Biological Signal Processing: EEG and EMG

    In this work, we show the success of unsupervised transfer learning between Electroencephalographic (brainwave) classification and Electromyographic (muscular wave) domains with both MLP and CNN methods. To achieve this, signals are measured from both the brain and the forearm muscles: EMG data is gathered from a 4-class gesture classification experiment via the Myo Armband, and a 3-class mental state EEG dataset is acquired via the Muse EEG Headband. A hyperheuristic multi-objective evolutionary search method is used to find the best network hyperparameters. We then use this optimised topology of deep neural network to classify both EMG and EEG signals, attaining results of 84.76% and 62.37% accuracy, respectively. Next, when pre-trained weights from the EMG classification model are used for initial distribution rather than random weight initialisation for EEG classification, 93.82% (+29.95) accuracy is reached. When EEG pre-trained weights are used for initial weight distribution for EMG, 85.12% (+0.36) accuracy is achieved. When the EMG network attempts to classify EEG, it outperforms the EEG network even without any training (+30.25% to 82.39% at epoch 0), and similarly the EEG network attempting to classify EMG data outperforms the EMG network (+2.38% at epoch 0). All transfer networks achieve higher pre-training abilities, curves, and asymptotes, indicating that knowledge transfer is possible between the two signal domains. In a second experiment with CNN transfer learning, the same datasets are projected as 2D images and the same learning process is carried out. In the CNN experiment, EMG to EEG transfer learning is found to be successful but not vice-versa, although EEG to EMG transfer learning did exhibit a higher starting classification accuracy. The significance of this work lies in the successful transfer of ability between models trained on two different biological signal domains, reducing the need for building more computationally complex models in future research.

    Towards AI-based interactive game intervention to monitor concentration levels in children with attention deficit

    Preliminary results of a new approach to neurocognitive training on academic engagement and monitoring of attention levels in children with learning difficulties are presented. Machine Learning (ML) techniques and a Brain-Computer Interface (BCI) are used to develop an interactive AI-based game for educational therapy to monitor the progress of children’s concentration levels during specific cognitive tasks. Our approach acquires children's brainwave data using electroencephalography (EEG) and classifies concentration levels through model calibration. The real-time brainwave patterns are inputs to our game interface to monitor concentration levels. When concentration drops, the educational game can adapt to the user by changing the difficulty of the training or by providing new visual or auditory stimuli in order to reduce the attention loss. To understand concentration level patterns, we collected brainwave data from children at various primary schools in Brazil who have intellectual disabilities, e.g. autism spectrum disorder and attention deficit hyperactivity disorder. Preliminary results show that we successfully benchmarked (96%) the acquired brainwave patterns using various classical ML techniques. The results obtained through the automatic classification of brainwaves will be fundamental to further developing our full approach. Positive feedback from questionnaires was obtained for both the AI-based game and the engagement and motivation during the training sessions.

    Country-level pandemic risk and preparedness classification based on COVID-19 data: A machine learning approach

    In this work we present a three-stage Machine Learning strategy for country-level risk classification based on countries that are reporting COVID-19 information. A K% binning discretisation (K = 25) is used to create four risk groups of countries based on the risk of transmission (coronavirus cases per million population), risk of mortality (coronavirus deaths per million population), and risk of inability to test (coronavirus tests per million population). The four risk groups produced by K% binning are labelled as ‘low’, ‘medium-low’, ‘medium-high’, and ‘high’. Coronavirus-related data are then removed, and the attributes for prediction of the three types of risk are the geopolitical and demographic data describing each country. Thus, the class labels are calculated from coronavirus data, but the input attributes are country-level information independent of coronavirus data. The three four-class classification problems are then explored and benchmarked through leave-one-country-out cross validation to find the strongest model, producing a Stack of Gradient Boosting and Decision Tree algorithms for risk of transmission, a Stack of Support Vector Machine and Extra Trees for risk of mortality, and a Gradient Boosting algorithm for the risk of inability to test. It is noted that high risk for inability to test is often coupled with low risks for transmission and mortality; therefore the risk of inability to test should be interpreted first, before consideration is given to the predicted transmission and mortality risks. Finally, the approach is applied to more recent data from September 2020, and weaker results are noted because the growth of international collaboration reduces the useful knowledge carried by country-level attributes, which suggests that similar machine learning approaches are more useful early on, before a situation has unfolded.
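
The K% binning (K = 25) and leave-one-country-out steps described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `k_percent_binning` and `leave_one_country_out`, the country names, and the case counts are all hypothetical, and the bins are assumed to be equal-frequency groups ordered by ascending value. The abstract does not say how the direction is handled for tests per million, so that is left out.

```python
LABELS = ["low", "medium-low", "medium-high", "high"]


def k_percent_binning(values, k=25, labels=LABELS):
    """Label each key by the equal-frequency bin (k% of items each) its value falls in."""
    n_bins = 100 // k
    if n_bins != len(labels):
        raise ValueError("need exactly one label per bin")
    # Country names sorted by ascending value; ties keep their input order.
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    # Integer arithmetic maps rank 0..n-1 onto bin 0..n_bins-1.
    return {name: labels[rank * n_bins // n] for rank, name in enumerate(ordered)}


def leave_one_country_out(countries):
    """Yield (training countries, held-out country) pairs, one per country."""
    for held_out in countries:
        yield [c for c in countries if c != held_out], held_out


# Hypothetical cases per million population for eight made-up countries.
cases = {"A": 10, "B": 20, "C": 30, "D": 40, "E": 50, "F": 60, "G": 70, "H": 80}

risk = k_percent_binning(cases)
splits = list(leave_one_country_out(list(cases)))
print(risk)
print(len(splits), "train/test splits")
```

In the setting the abstract describes, the labels produced this way would serve as class targets, while the model inputs would be the geopolitical and demographic attributes of each training country. Each split trains on all countries but one and evaluates on the one held out.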