3 research outputs found

    Hybridized term-weighting method for dark web classification

    No full text
    Intelligence and security informatics based on statistical computation is playing a growing role in the proactive detection of terrorist activities, as extremist groups misuse many of the facilities available on the Internet to incite violence and hatred. However, the performance of statistical methods is limited: their accuracy suffers because they cannot comprehend texts written by humans. In this paper, we propose a hybridized feature selection method, based on the basic term-weighting techniques, for accurate detection of terrorist activities in textual contexts. The proposed method combines the feature sets selected by different individual feature selection methods into one feature space for effective web page classification. UNION and Symmetric Difference combination functions are proposed for reducing the dimensionality of the combined feature space. The method is tested on a selected dataset from the Dark Web Forum Portal and benchmarked using several well-known text classifiers. Experimental results show that the hybridized method efficiently identifies terrorist activity content and outperforms the individual methods. Furthermore, the results reveal that hybridizing only a few feature sets yields classification performance that is competitive, relative to the number of features used, with higher hybridization levels. Finally, the experiments on the hybridizing functions show that applying the Symmetric Difference function to combine feature sets significantly reduces their dimensionality.
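    The two combination functions named in the abstract can be illustrated with plain set operations. The following is a minimal sketch, not the authors' implementation: the function names and the toy terms are hypothetical, and folding more than two sets with pairwise symmetric difference (which keeps terms that occur in an odd number of sets) is an assumption, since the abstract does not say how more than two sets are combined.

```python
from functools import reduce


def combine_union(feature_sets):
    """Keep every term selected by at least one method."""
    return set().union(*feature_sets)


def combine_symmetric_difference(feature_sets):
    """Fold the sets with pairwise symmetric difference.

    Assumption: for more than two sets this keeps the terms that
    appear in an odd number of the input sets.
    """
    return reduce(lambda a, b: a ^ b, feature_sets, set())


# Hypothetical feature sets, e.g. the top terms chosen by two weighting schemes.
tf_terms = {"alpha", "beta", "gamma", "delta"}
tfidf_terms = {"alpha", "beta", "epsilon", "zeta"}

union_terms = combine_union([tf_terms, tfidf_terms])
symdiff_terms = combine_symmetric_difference([tf_terms, tfidf_terms])
```

    With two input sets, the symmetric difference drops the terms both methods agree on, so it is never larger than the union, which is consistent with the dimensionality reduction the abstract reports.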

    Hybridized term-weighting method for web contents classification using SVM

    No full text
    Intelligence and security informatics based on statistical computation is playing a growing role in the proactive detection of terrorist activities, as extremist groups misuse many of the facilities available on the Internet to incite violence and hatred. However, the performance of statistical methods is reported to be limited: their accuracy suffers because they cannot comprehend the meaning of texts written by humans. Misclassifying actual terrorism web content as non-terrorism, or vice versa, reduces the usefulness of intelligent techniques in supporting efforts against potential threats, and limits the opportunities for using intelligence and security informatics effectively in the early detection of terrorist activities. In this paper, we propose a hybridized method, based on the basic term-weighting techniques, for accurate detection of terrorist activities in textual contexts. The proposed method combines the feature sets generated by different individual term-weighting techniques, such as Term Frequency (TF), Document Frequency (DF), Term Frequency-Inverse Document Frequency (TF-IDF), Glasgow, and Entropy, into one feature set for effective classification. In addition, two combination functions are proposed to reduce the dimensionality of the combined feature set. The method is tested on a selected dataset from the Dark Web Portal Forum (DWPF) and benchmarked using Support Vector Machine (SVM) and other well-known text classifiers: K-Nearest Neighbor (KNN), Decision Trees (DT), Naïve Bayes (NB), and Extreme Learning Machine (ELM). Experimental results show that the hybridized method efficiently identifies terrorist activity content and outperforms the individual methods. The results further reveal that hybridizing only a few feature sets yields classification performance that is competitive, relative to the number of features used, with higher hybridization levels. Finally, the experiments on the hybridizing functions show that applying the symmetric difference function to combine feature sets significantly reduces their dimensionality.
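    To make the per-technique feature sets concrete, here is a small sketch of how TF, DF, and TF-IDF scores could each yield their own top-k set of terms. It is an illustration under stated assumptions, not the paper's procedure: the helper names, the toy corpus, whitespace tokenization, collection-level term counts for TF, and the unsmoothed `log(N / df)` form of IDF are all choices made here. Glasgow and Entropy weighting are omitted because the abstract does not give their formulas.

```python
import math
from collections import Counter


def term_scores(docs):
    """Score every term in a corpus under TF, DF, and TF-IDF."""
    tf = Counter()  # total occurrences across the corpus (assumption)
    df = Counter()  # number of documents that contain the term
    for doc in docs:
        tokens = doc.lower().split()
        tf.update(tokens)
        df.update(set(tokens))
    n_docs = len(docs)
    # Unsmoothed IDF (assumption): a term present in every document scores 0.
    tfidf = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    return {"tf": dict(tf), "df": dict(df), "tfidf": tfidf}


def top_k(scores, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term for term, _ in ranked[:k]}


# Hypothetical toy corpus.
docs = ["red red red blue", "red green", "red yellow yellow"]
scores = term_scores(docs)
selected = {name: top_k(s, 2) for name, s in scores.items()}
```

    On this corpus the three schemes pick different terms, which is the situation in which combining their feature sets can add information that no single scheme captures.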