14 research outputs found

    Deep learning for religious and continent-based toxic content detection and classification

    Get PDF
    With time, numerous online communication platforms have emerged that allow people to express themselves, increasing the dissemination of toxic languages, such as racism, sexual harassment, and other negative behaviors that are not accepted in polite society. As a result, toxic language identification in online communication has emerged as a critical application of natural language processing. Numerous academic and industrial researchers have recently researched toxic language identification using machine learning algorithms. However, Nontoxic comments, including particular identification descriptors, such as Muslim, Jewish, White, and Black, were assigned unrealistically high toxicity ratings in several machine learning models. This research analyzes and compares modern deep learning algorithms for multilabel toxic comments classification. We explore two scenarios: the first is a multilabel classification of Religious toxic comments, and the second is a multilabel classification of race or toxic ethnicity comments with various word embeddings (GloVe, Word2vec, and FastText) without word embeddings using an ordinary embedding layer. Experiments show that the CNN model produced the best results for classifying multilabel toxic comments in both scenarios. We compared the outcomes of these modern deep learning model performances in terms of multilabel evaluation metrics

    Authorship identification using ensemble learning

    Get PDF
    With time, textual data is proliferating, primarily through the publications of articles. With this rapid increase in textual data, anonymous content is also increasing. Researchers are searching for alternative strategies to identify the author of an unknown text. There is a need to develop a system to identify the actual author of unknown texts based on a given set of writing samples. This study presents a novel approach based on ensemble learning, DistilBERT, and conventional machine learning techniques for authorship identification. The proposed approach extracts the valuable characteristics of the author using a count vectorizer and bi-gram Term frequency-inverse document frequency (TF-IDF). An extensive and detailed dataset, All the news is used in this study for experimentation. The dataset is divided into three subsets (article1, article2, and article3). We limit the scope of the dataset and selected ten authors in the first scope and 20 authors in the second scope for experimentation. The experimental results of proposed ensemble learning and DistilBERT provide better performance for all the three subsets of the All the news dataset. In the first scope, the experimental results prove that the proposed ensemble learning approach from 10 authors provides a better accuracy gain of 3.14% and from DistilBERT 2.44% from the article1 dataset. Similarly, in the second scope from 20 authors, the proposed ensemble learning approach provides a better accuracy gain of 5.25% and from DistilBERT 7.17% from the article1 dataset, which is better than previous state-of-the-art studies

    Data Augmentation-based Novel Deep Learning Method for Deepfaked Images Detection

    Get PDF
    Recent advances in artificial intelligence have led to deepfake images, enabling users to replace a real face with a genuine one. deepfake images have recently been used to malign public figures, politicians, and even average citizens. deepfake but realistic images have been used to stir political dissatisfaction, blackmail, propagate false news, and even carry out bogus terrorist attacks. Thus, identifying real images from fakes has got more challenging. To avoid these issues, this study employs transfer learning and data augmentation technique to classify deepfake images. For experimentation, 190,335 RGB-resolution deepfake and real images and image augmentation methods are used to prepare the dataset. The experiments use the deep learning models: convolutional neural network (CNN), Inception V3, visual geometry group (VGG19) and VGG16 with a transfer learning approach. Essential evaluation metrics (accuracy, precision, recall, F1-score, confusion matrix and AUC-ROC curve score) are used to test the efficacy of the proposed approach. Results revealed that the proposed approach achieves an accuracy, recall, F1-score and AUC-ROC score of 90% and 91% precision, with our fine-tuned VGG16 model outperforming other DL models in recognizing real and deepfakes

    Edge-Preserving Image Denoising Based on Lipschitz Estimation

    No full text
    International audienceThis article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC B

    Are Computer Experience and Anxiety Irrelevant? Towards a Simple Model for Adoption of E-Learning Systems

    No full text
    Massive growth of technology based e-learning systems is enabling student access to academic content from higher education institutions around the world. This study explores the antecedents of behavioral intention of students to use e-learning systems in university education to supplement classroom learning. A quantitative approach involving a structural equation model is adopted and research data collected from 358 undergraduate students is used for analysis. The study framework is based upon the technology acceptance model (TAM) and three external factors are proposed to influence the behavioral intention of students to use e-learning. Frequently used external factors in previous researches like computer experience and anxiety were not used and alternate factors were explored. Results show that self-efficacy, enjoyment and results demonstrability have a significant positive influence on perceived usefulness and on perceived ease of use of the e-learning system. The study contributes to understanding such contributory factors from the viewpoint of a student by suggesting that these factors hold well in the Pakistani academia culture where sufficient relevant empirical evidence did not exist due to lack of prior studies

    Cyber Security in IoT-Based Cloud Computing:A Comprehensive Survey

    No full text
    Cloud computing provides the flexible architecture where data and resources are dispersed at various locations and are accessible from various industrial environments. Cloud computing has changed the using, storing, and sharing of resources such as data, services, and applications for industrial applications. During the last decade, industries have rapidly switched to cloud computing for having more comprehensive access, reduced cost, and increased performance. In addition, significant improvement has been observed in the internet of things (IoT) with the integration of cloud computing. However, this rapid transition into the cloud raised various security issues and concerns. Traditional security solutions are not directly applicable and sometimes ineffective for cloud-based systems. Cloud platforms’ challenges and security concerns have been addressed during the last three years, despite the successive use and proliferation of multifaceted cyber weapons. The rapid evolution of deep learning (DL) in the artificial intelligence (AI) domain has brought many benefits that can be utilized to address industrial security issues in the cloud. The findings of the proposed research include the following: we present a comprehensive survey of enabling cloud-based IoT architecture, services, configurations, and security models; the classification of cloud security concerns in IoT into four major categories (data, network and service, applications, and people-related security issues), which are discussed in detail; we identify and inspect the latest advancements in cloud-based IoT attacks; we identify, discuss, and analyze significant security issues in each category and present the limitations from a general, artificial intelligence and deep learning perspective; we provide the technological challenges identified in the literature and then identify significant research gaps in the IoT-based cloud infrastructure to highlight future research directions to blend cybersecurity in cloud

    Cross corpus multi-lingual speech emotion recognition using ensemble learning

    No full text
    Receiving an accurate emotional response from robots has been a challenging task for researchers for the past few years. With the advancements in technology, robots like service robots interact with users of different cultural and lingual backgrounds. The traditional approach towards speech emotion recognition cannot be utilized to enable the robot and give an efficient and emotional response. The conventional approach towards speech emotion recognition uses the same corpus for both training and testing of classifiers to detect accurate emotions, but this approach cannot be generalized for multi-lingual environments, which is a requirement for robots used by people all across the globe. In this paper, a series of experiments are conducted to highlight an ensemble learning effect using a majority voting technique for cross-corpus, multi-lingual speech emotion recognition system. A comparison of the performance of an ensemble learning approach against traditional machine learning algorithms is performed. This study tests a classifier’s performance trained on one corpus with data from another corpus to evaluate its efficiency for multi-lingual emotion detection. According to experimental analysis, different classifiers give the highest accuracy for different corpora. Using an ensemble learning approach gives the benefit of combining all classifiers’ effect instead of choosing one classifier and compromising certain language corpus’s accuracy. Experiments show an increased accuracy of 13% for Urdu corpus, 8% for German corpus, 11% for Italian corpus, and 5% for English corpus from with-in corpus testing. For cross-corpus experiments, an improvement of 2% when training on Urdu data and testing on German data and 15% when training on Urdu data and testing on Italian data is achieved. An increase of 7% in accuracy is obtained when testing on Urdu data and training on German data, 3% when testing on Urdu data and training on Italian data, and 5% when testing on Urdu data and training on English data. Experiments prove that the ensemble learning approach gives promising results against other state-of-the-art techniques

    Identification and Categorization of Unusual Internet of Vehicles Events in Noisy Audio

    No full text
    The volume of multimedia data produced by various smart devices has increased dramatically with the advent of new digital technologies, including in the Internet of Vehicles (IoV). It has become more difficult to extract valuable insights from multimedia data due to several challenges during data analysis. The main problem is the need to quickly and precisely identify abnormalities in multimedia data. This research presents an unusual occurrence of the audio forensics database named UOAFDB and a practical method for identifying and categorizing unusual occurrences in audio files. To study the detection of abnormal audio and the classification of rare sound (e.g., car crash—machine gun, explosion) events for audio forensics, we construct a large audio dataset containing ten rare special events (anomalies) with 15 different background environmental settings (e.g., beach, restaurant, and train). The suggested method determines the optimal amount of features using the best feature extraction methodology available by extracting Mel-frequency cepstral coefficients (MFCCs) features from the audio signals of the newly formed dataset. Modern deep learning algorithms use these features as input to assess performance. Additionally, we apply deep learning methods to the most recent and best available dataset and obtain promising outcomes. The experimental findings demonstrate promising results on the UOAFDB dataset

    Deepfake Audio Detection via MFCC Features Using Machine Learning

    Get PDF
    Deepfake content is created or altered synthetically using artificial intelligence (AI) approaches to appear real. It can include synthesizing audio, video, images, and text. Deepfakes may now produce natural-looking content, making them harder to identify. Much progress has been achieved in identifying video deepfakes in recent years; nevertheless, most investigations in detecting audio deepfakes have employed the ASVSpoof or AVSpoof dataset and various machine learning, deep learning, and deep learning algorithms. This research uses machine and deep learning-based approaches to identify deepfake audio. Mel-frequency cepstral coefficients (MFCCs) technique is used to acquire the most useful information from the audio. We choose the Fake-or-Real dataset, which is the most recent benchmark dataset. The dataset was created with a text-to-speech model and is divided into four sub-datasets: for-rece, for-2-sec, for-norm and for-original. These datasets are classified into sub-datasets mentioned above according to audio length and bit rate. The experimental results show that the support vector machine (SVM) outperformed the other machine learning (ML) models in terms of accuracy on for-rece and for-2-sec datasets, while the gradient boosting model performed very well using for-norm dataset. The VGG-16 model produced highly encouraging results when applied to the for-original dataset. The VGG-16 model outperforms other state-of-the-art approaches
    corecore