12 research outputs found

    Recognition of Promoters in DNA Sequences Using Weightily Averaged One-dependence Estimators

    Abstract: The completion of the human genome project in the last decade has generated a strong demand for computational analysis techniques in order to fully exploit the acquired human genome database. The human genome project generated a perplexing mass of genetic data which necessitates automatic genome annotation. There is a growing interest in the process of gene finding and gene recognition from DNA sequences. In genetics, a promoter is a segment of DNA that marks the starting point of transcription of a particular gene. Therefore, recognizing promoters is one step towards gene finding in DNA sequences. Promoters also play a fundamental role in many other vital cellular processes. Aberrant promoters can cause a wide range of diseases, including cancers. This paper describes a state-of-the-art machine learning approach called weightily averaged one-dependence estimators to tackle the problem of recognizing promoters in genetic sequences. To lower the computational complexity and to increase the generalization capability of the system, we employ an entropy-based feature extraction approach to select relevant nucleotides that are directly responsible for promoter recognition. We carried out experiments on a dataset extracted from the biological literature as a proof of concept. The proposed system achieved an accuracy of 97.17% in classifying promoters. The experimental results demonstrate the efficacy of our framework and encourage us to extend it to recognize promoter sequences in various species of higher eukaryotes.
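
    As a rough illustration of entropy-based selection of nucleotide positions, the sketch below ranks sequence positions by information gain. The abstract does not say which entropy criterion the authors use, so information gain, the function names, and the toy sequences are all assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(sequences, labels, position):
    """Reduction in label entropy after splitting on the nucleotide at `position`."""
    groups = {}
    for seq, label in zip(sequences, labels):
        groups.setdefault(seq[position], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def top_positions(sequences, labels, k):
    """Return the k positions with the highest information gain (ties: lowest index first)."""
    gains = [(information_gain(sequences, labels, p), p) for p in range(len(sequences[0]))]
    return [p for _, p in sorted(gains, key=lambda t: (-t[0], t[1]))[:k]]


# Hypothetical toy data: "+" marks a promoter, "-" a non-promoter.
toy_sequences = ["ACGT", "ACTT", "TCGA", "TCTA"]
toy_labels = ["+", "+", "-", "-"]
selected = top_positions(toy_sequences, toy_labels, k=2)
```

    In this toy data only the first and last positions separate the two classes, so they are the ones kept; a classifier would then be trained on the selected positions only.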

    Bayes classifiers for imbalanced traffic accidents datasets

    Traffic accident data sets are usually imbalanced: the number of instances classified under the killed or severe injuries class (minority) is much lower than the number classified under the slight injuries class (majority). This poses a challenging problem for classification algorithms and may yield a model that covers the slight injuries instances well while frequently misclassifying the killed or severe injuries instances. Based on traffic accident data collected on urban and suburban roads in Jordan over three years (2009-2011), three different data balancing techniques were used: undersampling, which removes some instances of the majority class; oversampling, which creates new instances of the minority class; and a mixed technique that combines both. In addition, different Bayes classifiers were compared on the imbalanced and balanced data sets (Averaged One-Dependence Estimators, Weightily Averaged One-Dependence Estimators, and Bayesian networks) in order to identify factors that affect the severity of an accident. The results indicated that using the balanced data sets, especially those created using oversampling techniques, with Bayesian networks improved the classification of a traffic accident according to its severity and reduced the misclassification of killed and severe injuries instances. In addition, the following variables were found to contribute to the occurrence of a killed casualty or a severe injury in a traffic accident: number of vehicles involved, accident pattern, number of directions, accident type, lighting, surface condition, and speed limit. This work, to the knowledge of the authors, is the first that aims at analyzing historical data records for traffic accidents occurring in Jordan and the first to apply balancing techniques to analyze injury severity of traffic accidents. (C) 2015 Elsevier Ltd. All rights reserved.
    The authors are grateful to the Police Traffic Department in Jordan for providing the data necessary for this research. Griselda Lopez wishes to express her acknowledgement to the regional ministry of Economy, Innovation and Science of the regional government of Andalusia (Spain) for their scholarship to train teachers and researchers in Deficit Areas, which has made this work possible. The authors appreciate the reviewers' comments and effort in order to improve the paper.
    Mujalli, R.; López-Maldonado, G.; Garach, L. (2016). Bayes classifiers for imbalanced traffic accidents datasets. Accident Analysis & Prevention. 88:37-51. https://doi.org/10.1016/j.aap.2015.12.003
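
    The undersampling and oversampling ideas named in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names and toy labels are hypothetical, and the oversampler simply duplicates existing minority instances at random, which may differ from the specific technique used in the paper.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def undersample(rows, labels, seed=0):
    """Randomly drop instances so every class is as small as the rarest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(items) for items in by_class.values())
    return [(row, label)
            for label, items in by_class.items()
            for row in rng.sample(items, target)]


def oversample(rows, labels, seed=0):
    """Randomly duplicate instances so every class is as large as the biggest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((row, label) for row in items + extra)
    return balanced


# Hypothetical toy data: 8 "slight" accidents and 2 "severe" ones.
toy_rows = list(range(10))
toy_labels = ["slight"] * 8 + ["severe"] * 2
under_counts = Counter(label for _, label in undersample(toy_rows, toy_labels))
over_counts = Counter(label for _, label in oversample(toy_rows, toy_labels))
```

    A mixed strategy, as mentioned in the abstract, would pick a target size between the two class sizes and apply both steps.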

    Locally weighted learning: How and when does it work in Bayesian networks?

    © 2016, Taylor and Francis Ltd. All rights reserved. A Bayesian network (BN), a simple graphical notation for conditional independence assertions, promises to represent the probabilistic relationships between diseases and symptoms. Learning the structure of a Bayesian network classifier (BNC) encodes conditional independence assumptions between attributes, which may degrade classification performance. One major approach to mitigating the BNC's primary weakness (the attribute independence assumption) is the locally weighted approach. This type of approach has been shown to achieve good performance for naive Bayes (NB), a BNC with a simple structure. However, it is not known whether, or how effectively, it improves the performance of complex BNCs. In this paper, we first survey the complex structure models for BNCs and their improvements, then carry out a systematic experimental analysis to investigate the effectiveness of the locally weighted method for complex BNCs, e.g., tree-augmented naive Bayes (TAN), averaged one-dependence estimators (AODE), and hidden naive Bayes (HNB), measured by classification accuracy (ACC) and the area under the ROC curve ranking (AUC). Experiments and comparisons on 36 benchmark data sets collected from the University of California, Irvine (UCI), run in the Weka system, demonstrate that locally weighted complex BNCs only slightly outperform unweighted complex BNCs on ACC and AUC. In other words, although local weighting can significantly improve the performance of NB, it does not work well on BNCs with complex structures. This is because the performance improvements of BNCs are attributable to their structures, not to the local weighting.

    Contents


    Data Fusion for Real-time Multimodal Emotion Recognition through Webcams and Microphones in E-Learning

    The original article is available on the Taylor & Francis Online website at the following link: http://www.tandfonline.com/doi/abs/10.1080/10447318.2016.1159799?journalCode=hihc20
    This paper describes the validation study of our software that uses combined webcam and microphone data for real-time, continuous, unobtrusive emotion recognition as part of our FILTWAM framework. FILTWAM aims at deploying a real-time multimodal emotion recognition method for providing more adequate feedback to learners during online communication skills training. Here, timely feedback is needed that reflects the emotions learners intended to show and that also helps to increase learners' awareness of their own behaviour. At a minimum, a reliable and valid software interpretation of performed face and voice emotions is needed to warrant such adequate feedback. This validation study therefore calibrates our software. The study uses a multimodal fusion method. Twelve test persons performed computer-based tasks in which they were asked to mimic specific facial and vocal emotions. All test persons' behaviour was recorded on video, and two raters independently scored the displayed emotions, which were contrasted with the software recognition outcomes. A hybrid method for multimodal fusion in our software shows accuracy between 96.1% and 98.6% for the best-chosen WEKA classifiers over predicted emotions. The software fulfils its requirements of real-time data interpretation and reliable results.
    The Netherlands Laboratory for Lifelong Learning (NELLL) of the Open University Netherlands

    SODE: Self-Adaptive One-Dependence Estimators for classification

    © 2015 Elsevier Ltd. SuperParent-One-Dependence Estimators (SPODEs) represent a family of semi-naive Bayesian classifiers which relax the attribute independence assumption of Naive Bayes (NB) to allow each attribute to depend on a common single attribute (superparent). SPODEs can effectively handle data with attribute dependency but still inherit NB's key advantages, such as computational efficiency and robustness for high-dimensional data. In reality, determining an optimal superparent for SPODEs is difficult. One common approach is to use weighted combinations of multiple SPODEs, each having a different superparent with a properly assigned weight value (i.e., a weight value is assigned to each attribute). In this paper, we propose a self-adaptive SPODE method, namely SODE, which uses immunity theory in artificial immune systems to automatically and self-adaptively select the weight for each single SPODE. SODE does not need to know the importance of individual SPODEs nor the relevance among SPODEs, and can flexibly and efficiently search for optimal weight values for each SPODE during the learning process. Extensive experiments and comparisons on 56 benchmark data sets, and validations on image and text classification, demonstrate that SODE outperforms state-of-the-art weighted SPODE algorithms and is suitable for a wide range of learning tasks. Results also confirm that SODE provides an appropriate balance between runtime efficiency and accuracy.
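
    The weighted combination of SPODEs described in the abstract above can be sketched for categorical data as follows. This is an illustrative sketch, not SODE itself: the class name and toy data are hypothetical, Laplace smoothing is an assumption, and the weights are simply passed in (uniform by default, which corresponds to plain AODE-style averaging) rather than searched with an artificial immune system.

```python
from collections import Counter


class WeightedSPODEs:
    """Weighted sum of one-dependence estimators, one per superparent attribute."""

    def __init__(self, weights=None):
        self.weights = weights  # None means uniform weights

    def fit(self, X, y):
        self.n, self.d = len(X), len(X[0])
        self.classes = sorted(set(y))
        self.values = [sorted({row[i] for row in X}) for i in range(self.d)]
        self.joint = Counter()  # (p, x_p, y) -> count
        self.pair = Counter()   # (p, x_p, i, x_i, y) -> count
        for row, label in zip(X, y):
            for p in range(self.d):
                self.joint[(p, row[p], label)] += 1
                for i in range(self.d):
                    if i != p:
                        self.pair[(p, row[p], i, row[i], label)] += 1
        if self.weights is None:
            self.weights = [1.0 / self.d] * self.d
        return self

    def _spode(self, row, label, p):
        """Smoothed estimate of P(y, x_p) * prod_{i != p} P(x_i | y, x_p)."""
        cells = len(self.classes) * len(self.values[p])
        parent_count = self.joint[(p, row[p], label)]
        score = (parent_count + 1) / (self.n + cells)
        for i in range(self.d):
            if i != p:
                child_count = self.pair[(p, row[p], i, row[i], label)]
                score *= (child_count + 1) / (parent_count + len(self.values[i]))
        return score

    def predict_proba(self, row):
        scores = {c: sum(w * self._spode(row, c, p) for p, w in enumerate(self.weights))
                  for c in self.classes}
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}

    def predict(self, row):
        proba = self.predict_proba(row)
        return max(proba, key=proba.get)


# Hypothetical XOR-style data: the class depends on both attributes jointly,
# a case that naive Bayes cannot model but a single SPODE can.
toy_X = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
toy_y = [0, 1, 1, 0] * 5
model = WeightedSPODEs().fit(toy_X, toy_y)
```

    A weight-learning method such as the one the abstract proposes would replace the uniform default with weights tuned during training.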

    Analyzing and enhancing music mood classification : an empirical study

    In the computer age, managing large data repositories is a common challenge, especially for music data. Categorizing, manipulating, and refining music tracks are among the most complex tasks in Music Information Retrieval (MIR). Classification is one of the core functions in MIR; it organizes music data from different perspectives, from genre to instrument to mood. The primary focus of this study is music mood classification. Mood is a subjective phenomenon in MIR, which involves different considerations, such as psychology, musicology, culture, and social behavior. One of the most significant prerequisites in music mood classification is answering these questions: What combination of acoustic features helps to improve classification accuracy in this area? What type of classifier is appropriate for music mood classification? How can we increase the accuracy of music mood classification using several classifiers? To answer these questions, we empirically explored different acoustic features and classification schemes for mood classification in music data. We also identified two approaches that use several classifiers simultaneously to label music tracks with moods automatically. These methods are two voting procedures, namely Plurality Voting and Borda Count. They fall under ensemble techniques, which combine a group of classifiers to reach better accuracy. The proposed ensemble methods are implemented and verified through empirical experiments. The results of the experiments show that the proposed approaches can improve the accuracy of music mood classification.
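
    The two voting procedures named in the abstract above, Plurality Voting and Borda Count, can be sketched as follows. The function names, mood labels, and rankings are hypothetical; each ranking stands for one classifier's ordering of the labels from most to least likely, and how the study derives those rankings is not stated in the abstract.

```python
from collections import Counter, defaultdict


def plurality_vote(rankings):
    """Each classifier votes for its top-ranked label; the label with most votes wins."""
    votes = Counter(ranking[0] for ranking in rankings)
    return votes.most_common(1)[0][0]


def borda_count(rankings):
    """Each classifier awards n-1 points to its first choice, n-2 to its second, and so on."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += n - 1 - position
    return max(scores, key=scores.get)


# Hypothetical rankings from five classifiers for one music track.
toy_rankings = [
    ["happy", "calm", "sad"],
    ["happy", "calm", "sad"],
    ["happy", "calm", "sad"],
    ["calm", "sad", "happy"],
    ["calm", "sad", "happy"],
]
plurality_winner = plurality_vote(toy_rankings)
borda_winner = borda_count(toy_rankings)
```

    On this toy input the two procedures disagree: plurality picks "happy" (three first-place votes), while Borda picks "calm" (7 points against 6), because Borda also rewards labels that are consistently ranked second. Ties are not handled specially in this sketch.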