464 research outputs found

    Building Machine Learning systems for multi-atoms structures: CH3NH3PbI3 perovskite nanoparticles

    In this study, we built a variety of Machine Learning (ML) systems over 23 different sizes of CH3NH3PbI3 perovskite nanoparticles (NPs) to predict the atoms in the NPs from their geometric locations. Our findings show that one class of ML algorithms, the tree-based models Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Decision Trees (DT), can perfectly learn CH3NH3PbI3 perovskite NPs. Surprisingly, some popular ML algorithms, such as Naive Bayes (NB), Support Vector Machines (SVM), Partial Least Squares (PLS), Regularized Logistic Regression (LR), Neural Networks (NN), Stacked Auto-Encoder Deep Neural Network (DNN), and K-Nearest Neighbor (KNN), fail to learn CH3NH3PbI3 perovskite NPs.

    Machine learning for Arabic phonemes recognition using electrolarynx speech

    Automatic speech recognition is one of the essential ways of interacting with machines. Interest in speech-based intelligent systems has grown in the past few decades. There is therefore a need to develop more efficient methods for human speech recognition to ensure reliable communication between individuals and machines. This paper is concerned with Arabic phoneme recognition for speech produced with an electrolarynx device. The electrolarynx is a device used by cancer patients whose vocal cords have been removed. Speech recognition is used here to find the machine learning model that best classifies phonemes produced with the electrolarynx device. The phoneme recognition employs different machine learning schemes, including a convolutional neural network, a recurrent neural network, an artificial neural network (ANN), random forest, extreme gradient boosting (XGBoost), and long short-term memory. Modern Standard Arabic is used for the training and testing phases of the recognition system. The dataset covers both ordinary speech and electrolarynx speech recorded by the same person. Mel frequency cepstral coefficients are used as speech features. The results show that the ANN outperformed the other methods, with an accuracy of 75%, a precision of 77%, and a phoneme error rate (PER) of 21.85%.

    Rapidly predicting Kohn–Sham total energy using data-centric AI

    Predicting material properties by solving the Kohn-Sham (KS) equation, which is the basis of modern computational approaches to electronic structure, has led to significant advances in materials science. Despite these contributions, both DFT and DFTB calculations are limited by the number of electrons and atoms, which translates into increasingly long run-times. In this work we introduce a novel, data-centric machine learning framework that rapidly and accurately predicts the KS total energy of anatase TiO2 nanoparticles (NPs) at different temperatures using only a small amount of theoretical data. The proposed framework, which we call co-modeling, eliminates the need for experimental data and is general enough to be applied to any NPs to determine electronic structure and, consequently, to study physical and chemical properties more efficiently. We include a web service to demonstrate the effectiveness of our approach. © 2022, The Author(s)

    Use of EEG-Based Machine Learning to Predict Music-Related Brain Activity

    Music has many awe-inspiring characteristics. Some may refer to it as a "universal language" with the ability to transcend the barriers of speech, while others may describe its ability to evoke intense emotional experiences for the listener. Regardless of the description, it is a commonly held view that music can have many profound effects. Studies of music's effects have found these beliefs to be more than pure conjecture, finding that music interacts with and changes our brains in physical and emotional ways. Music can even have clinical applications, such as music therapy. This type of therapy has been shown to be beneficial in many areas, ranging from stroke rehabilitation to mental health treatment. The mechanisms behind music's therapeutic benefit have to do with neuroplastic effects; being able to harness this benefit in a therapeutic setting could make treatments for mental disorders and brain injuries even more effective. This thesis aimed to discover whether musical thoughts could be interpreted using machine learning, potentially opening the door to the use of thought-based musical training for therapeutic benefit. For this study, EEG data was collected while people were thinking of 5 melodies, then machine learning models were trained on labeled datasets. The models were then tasked with categorizing unlabeled sets of EEG data, in other words, predicting which melody a subject was thinking of while the data was being recorded. The accuracy of the predictions ranged from 45% to 80%, which means that the programs were 2-4 times more accurate than random guessing. This shows that these programs could potentially be used to examine the effects of musical thinking on neuroplasticity. While this topic is still exploratory and requires more research, these results could lead to a promising future of development of music-based brain-computer interfaces.
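    The "2-4 times more accurate than random guessing" figure in the abstract above can be checked with a short calculation. This is a minimal sketch that assumes the 5 melodies are equally likely, so that random guessing scores 1/5 = 20%; the abstract does not state the class balance, and the function name `chance_ratio` is introduced here for illustration only.

```python
def chance_ratio(accuracy, n_classes):
    """Return how many times better `accuracy` is than uniform random guessing.

    Assumes balanced classes, so the chance level is 1 / n_classes.
    """
    chance_level = 1.0 / n_classes
    return accuracy / chance_level


# Reported accuracy range from the abstract, with 5 melodies (classes).
low = chance_ratio(0.45, 5)
high = chance_ratio(0.80, 5)
print(f"{low:.2f}x to {high:.2f}x chance level")
```

    Under the balanced-class assumption, 45% and 80% correspond to about 2.25 and 4 times the chance level, which is consistent with the stated range.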

    A framework for feature selection through boosting

    As dimensions of datasets in predictive modelling continue to grow, feature selection becomes increasingly practical. Datasets with complex feature interactions and high levels of redundancy still present a challenge to existing feature selection methods. We propose a novel framework for feature selection that relies on boosting, or sample re-weighting, to select sets of informative features in classification problems. The method uses as its basis the feature rankings derived from fast and scalable tree-boosting models, such as XGBoost. We compare the proposed method to standard feature selection algorithms on 9 benchmark datasets. We show that the proposed approach reaches higher accuracies with fewer features on most of the tested datasets, and that the selected features have lower redundancy

    Exploring Time Series Spectral Features in Viral Hashtags Prediction

    Viral hashtags spread across a large population of Internet users very quickly. Previous studies use features mostly in an aggregate sense to predict the popularity of hashtags, for example, the total number of hyperlinks in early tweets adopting a tag. Since each tweet is time stamped, many aggregate features can be decomposed into fine-grained time series such as a series of numbers of hyperlinks in early adopting tweets. This research utilizes frequency domain tools to analyze these time series. In particular, we apply scalogram analysis to study the series of adoption time lapses and the series of mentions and hyperlinks in early adopting tweets. Besides continuous wavelet transforms (CWTs), we also use fast wavelet transforms (FWTs) to analyze the time series. Through experiments with two sets of tweets collected in different seasons, out-of-sample cross validations show that wavelet spectral features can generally improve the prediction performance, and discrete FWT yields results as good as the more complicated CWT-based methods with scalogram analysis
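    To illustrate what a discrete fast wavelet transform yields for a short time series, here is a minimal sketch using the Haar wavelet. The abstract does not say which wavelet family or which features the authors used, so the Haar choice, the per-level energy features, and the example counts are all assumptions made purely for illustration.

```python
import math


def haar_step(values):
    """One Haar level: scaled pairwise sums (approximation) and differences (detail)."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) * scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) * scale for i in range(0, len(values), 2)]
    return approx, detail


def haar_fwt(signal):
    """Full Haar decomposition of a signal whose length is a power of two.

    Returns the final approximation coefficient and a list of detail
    bands ordered from finest to coarsest scale.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = [float(x) for x in signal]
    details = []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx[0], details


def band_energies(details):
    """Sum of squared detail coefficients per level, a simple spectral feature."""
    return [sum(c * c for c in band) for band in details]


# Made-up counts of hyperlinks in eight consecutive early time bins.
signal = [3, 5, 2, 8, 6, 4, 7, 1]
approx, details = haar_fwt(signal)
features = band_energies(details)
```

    Because the orthonormal Haar transform preserves energy, the squared approximation coefficient plus the band energies equals the sum of squares of the input, which gives a quick sanity check on the decomposition.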

    Algorithmes de recommandation musicale

    This thesis is composed of three papers united under the general theme of large-scale music recommendation. The first paper presents a recommendation technique that works by collecting text descriptions (tags) of items and using this textual aura to compute the similarity between them using techniques drawn from information retrieval. We show how this representation can be used to explain the similarities between items using terms from the textual aura, and further how it can be used to steer the recommender. Because our system is content-based, it does not suffer from the usual problems associated with collaborative filtering recommenders, such as the cold start problem. The second paper presents a machine learning model which automatically applies tags to music. The model uses features extracted from the audio files and was trained on a very large data set constructed with social data from the online community Last.fm. The third paper presents an approach to generating steerable playlists. We first demonstrate a method for learning song transition probabilities from audio features extracted from songs played in professional radio station playlists. We then show that by using this learnt similarity function as a prior, we are able to generate steerable playlists by choosing the next song to play based not simply on that prior, but also on a tag cloud that the user is able to manipulate to express the high-level characteristics of the music they wish to listen to.