24,585 research outputs found

    A Modified Boosted Ensemble Classifier on Location Based Social Networking

    Get PDF
    One of the research issues that researchers are interested in is unbalanced data classification techniques. Boosting approaches like Wang\u27s Boosting and Modified Boosted SVM (MBSVM) have been demonstrated to be more effective for unbalanced data. Our proposal The Modified Boosted Random Forest (MBRF) classifier is a Random Forest classifier that uses the Boosting approach. The main motivation of the study is to analyze sentiment of geotagged tweets understanding the state of mind of people at FIFA and Olympics datasets. Tree based model Random Forest algorithm using boosting approach classifies the tweets to build a recommendation system with an idea of providing commercial suggestions to participants, recommending local places to visit or perform activities. MBRF employs various strategies: i) a distance-based weight-update method based on K-Medoids ii) a sign-based classifier elimination technique. We have equally partitioned the datasets as 70% of data allocated for training and the remaining 30% data as test data. Our imbalanced data ratio measured 3.1666 and 4.6 for FIFA and Olympics datasets. We looked at accuracy, precision, recall and ROC curves for each event. The average AUC achieved by MBRF on FIFA dataset is 0.96 and Olympics is 0.97. A comparison of MBRF and Decision tree model using \u27Entropy\u27 proved MBRF better

    Music Similarity Estimation

    Get PDF
    Music is a complicated form of communication, where creators and culture communicate and expose their individuality. After music digitalization took place, recommendation systems and other online services have become indispensable in the field of Music Information Retrieval (MIR). To build these systems and recommend the right choice of song to the user, classification of songs is required. In this paper, we propose an approach for finding similarity between music based on mid-level attributes like pitch, midi value corresponding to pitch, interval, contour and duration and applying text based classification techniques. Our system predicts jazz, metal and ragtime for western music. The experiment to predict the genre of music is conducted based on 450 music files and maximum accuracy achieved is 95.8% across different n-grams. We have also analyzed the Indian classical Carnatic music and are classifying them based on its raga. Our system predicts Sankarabharam, Mohanam and Sindhubhairavi ragas. The experiment to predict the raga of the song is conducted based on 95 music files and the maximum accuracy achieved is 90.3% across different n-grams. Performance evaluation is done by using the accuracy score of scikit-learn

    Algorithm Selection Framework for Cyber Attack Detection

    Full text link
    The number of cyber threats against both wired and wireless computer systems and other components of the Internet of Things continues to increase annually. In this work, an algorithm selection framework is employed on the NSL-KDD data set and a novel paradigm of machine learning taxonomy is presented. The framework uses a combination of user input and meta-features to select the best algorithm to detect cyber attacks on a network. Performance is compared between a rule-of-thumb strategy and a meta-learning strategy. The framework removes the conjecture of the common trial-and-error algorithm selection method. The framework recommends five algorithms from the taxonomy. Both strategies recommend a high-performing algorithm, though not the best performing. The work demonstrates the close connectedness between algorithm selection and the taxonomy for which it is premised.Comment: 6 pages, 7 figures, 1 table, accepted to WiseML '2
    corecore