
    Visualizing classification of natural video sequences using sparse, hierarchical models of cortex.

    Recent work on hierarchical models of visual cortex has reported state-of-the-art accuracy on whole-scene labeling using natural still imagery. This raises the question of whether the reported accuracy may be due to the sophisticated, non-biological back-end supervised classifiers typically used (support vector machines) and/or the limited number of images used in these experiments. In particular, is the model classifying features from the object or from the background? Previous work (Landecker, Brumby, et al., COSYNE 2010) proposed tracing the spatial support of a classifier’s decision back through a hierarchical cortical model to determine which parts of the image contributed to the classification, and comparing those parts to the positions of objects in the scene. In this way, we can go beyond standard measures of accuracy to provide tools for visualizing and analyzing high-level object classification. We now describe new work exploring the extension of these ideas to the detection of objects in video sequences of natural scenes.

    COVID-19 Chest X-Ray Detection Performance Through Variations of Wavelets Basis Function

    Our previous work on X-ray detection of COVID-19, which combined Haar wavelet feature extraction with a Support Vector Machine (SVM) classifier, showed that the two methods together detect COVID-19 well. This raised the question of whether Haar is the best wavelet for the task. In this study we therefore ran experiments with several wavelet families (biorthogonal, coiflet, Daubechies, Haar, and symlets) for chest X-ray feature extraction on the same dataset. The extracted features were classified using SVM, and the quality of the classification model was measured by accuracy, error rate, recall, specificity, and precision. The results showed that the Daubechies wavelet gave the best performance on all of these measures: 95.47% accuracy, 4.53% error rate, 98.75% recall, 92.19% specificity, and 93.45% precision.
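The five quality measures named in the abstract above can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return standard binary classification measures from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. The counts passed in below are made up for illustration.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,    # complement of accuracy
        "recall": tp / (tp + fn),        # also called sensitivity
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }


# Hypothetical counts, chosen only to exercise the function.
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because error rate is defined as the complement of accuracy, the two always sum to 1, which matches the reported pair of 95.47% and 4.53%.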

    Malware Classification with BERT

    Malware classification is used to distinguish distinct types of malware from each other. This project aims to carry out malware classification using word embeddings, which are used in Natural Language Processing (NLP) to identify and evaluate the relationships between the words of a sentence. Word embeddings are generated by BERT and Word2Vec for malware samples and then used to carry out multi-class classification. BERT is a transformer-based pre-trained NLP model which can be used for a wide range of tasks such as question answering, paraphrase generation and next sentence prediction. However, the attention mechanism of a pre-trained BERT model can also be used in malware classification by capturing information about the relation between each opcode and every other opcode belonging to a malware family. Word2Vec generates word embeddings in which words with similar contexts lie closer together. The word embeddings generated by Word2Vec would help classify malware samples belonging to a certain family based on similarity. Classification will be carried out using classifiers such as Support Vector Machines (SVM), Logistic Regression, Random Forests and Multi-Layer Perceptron (MLP). The classification accuracy obtained with BERT-generated word embeddings can be compared with the accuracy obtained with Word2Vec, which establishes a baseline for the results.

    What kind of questions do developers ask on Stack Overflow? A comparison of automated approaches to classify posts into question categories

    On question and answer sites, such as Stack Overflow (SO), developers use tags to label the content of a post and to support other developers in searching and browsing questions. However, these tags mainly refer to technological aspects instead of the purpose of the question. Tagging questions with their purpose can add a new dimension to the identification of discussed topics in posts on SO. In this paper, we aim at automating the classification of SO question posts into seven question categories. As a first step, we harmonized existing taxonomies of question categories and then manually classified 1,000 SO questions according to our new taxonomy. In addition to the question category, we marked the phrases that indicate a question category for each of the posts. We then used this data set to automate the classification of posts using two approaches. For the first approach, we manually analyzed the phrases to find patterns and derived regular expressions from them. Based on these regular expressions, we implemented a classifier for each of the categories that determines whether a post belongs to that category. In the second approach, we used the curated data set to train classification models with supervised machine learning algorithms (Random Forest and Support Vector Machines). For the machine learning algorithms, we experimented with 1,312 different configurations regarding the preprocessing of the text and the representation of the input data. Then, we compared the performance of the regex approach with the performance of the best machine learning configuration on a validation set of 110 posts. The results show that using the regular expression approach, we can classify posts into the correct question category with an average precision and recall of 0.90, and an MCC of 0.68. Additionally, we applied the regex approach to all questions on SO that deal with Android app development and investigated the co-occurrence of question categories in posts. We found that the categories API usage, Conceptual, and Discrepancy are the most frequently assigned question categories and that they also occur together frequently. Our approach can be used to support developers in browsing SO discussions or researchers in building recommender systems based on SO.
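The regex approach described above, with one binary classifier per category, might look roughly like the sketch below. The category names API usage, Conceptual, and Discrepancy come from the abstract; the patterns themselves are hypothetical stand-ins invented for illustration, not the expressions the authors derived from their 1,000 labeled posts.

```python
import re

# One compiled pattern per category. These patterns are illustrative
# assumptions, not the paper's actual regular expressions.
PATTERNS = {
    "API usage": re.compile(
        r"\bhow (?:do|can|to)\b.*\b(?:use|call|implement)\b", re.IGNORECASE
    ),
    "Conceptual": re.compile(
        r"\b(?:what is|difference between|why (?:is|does))\b", re.IGNORECASE
    ),
    "Discrepancy": re.compile(
        r"\b(?:not working|doesn't work|unexpected|crash(?:es|ed)?)\b",
        re.IGNORECASE,
    ),
}


def classify_post(text):
    """Return every category whose pattern matches the post text.

    Each category is decided independently, so a post can receive
    several categories or none.
    """
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]


print(classify_post("How can I use AsyncTask? My app crashes and it's not working."))
```

Deciding each category independently is what makes the co-occurrence analysis mentioned in the abstract possible, since a single post can match more than one pattern.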

    Designing Semantic Kernels as Implicit Superconcept Expansions

    Recently, there has been an increased interest in the exploitation of background knowledge in the context of text mining tasks, especially text classification. At the same time, kernel-based learning algorithms like Support Vector Machines have become a dominant paradigm in the text mining community. Amongst other reasons, this is due to their capability to achieve more accurate learning results by replacing the standard linear kernel (bag-of-words) with customized kernel functions which incorporate additional a priori knowledge. In this paper we propose a new approach to the design of ‘semantic smoothing kernels’ by means of an implicit superconcept expansion using well-known measures of term similarity. The experimental evaluation on two different datasets indicates that our approach consistently improves performance in situations where (i) training data is scarce or (ii) the bag-of-words representation is too sparse to build stable models when using the linear kernel.
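To illustrate why a semantic smoothing kernel can help when the bag-of-words representation is sparse, here is a minimal sketch of the general form k(d1, d2) = (d1 S)(d2 S)^T, where S is a term-similarity matrix. This is the generic construction rather than the paper's implicit superconcept expansion, and the three-term vocabulary and similarity values are made-up assumptions.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def smooth(doc, similarity):
    """Map a bag-of-words row vector d to d S (one dot product per column of S)."""
    return [dot(doc, column) for column in zip(*similarity)]


def linear_kernel(d1, d2):
    """Standard bag-of-words kernel: terms contribute only on exact matches."""
    return dot(d1, d2)


def semantic_kernel(d1, d2, similarity):
    """Smoothed kernel: related terms contribute even without exact overlap."""
    return dot(smooth(d1, similarity), smooth(d2, similarity))


# Hypothetical vocabulary ["cat", "dog", "car"]; "cat" and "dog" are assumed
# to be related (similarity 0.5), for instance through a shared superconcept.
S = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
doc_cat = [1, 0, 0]  # document containing only "cat"
doc_dog = [0, 1, 0]  # document containing only "dog"

print(linear_kernel(doc_cat, doc_dog))       # no shared terms
print(semantic_kernel(doc_cat, doc_dog, S))  # related terms now overlap
```

With the linear kernel the two documents look entirely unrelated, whereas the smoothed kernel assigns them a positive similarity, which is the effect the abstract attributes to incorporating background knowledge.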