2 research outputs found

    An Improved Similarity Matching based Clustering Framework for Short and Sentence Level Text

    Text clustering plays a key role in navigation and browsing. Efficient text clustering groups a large amount of information into meaningful clusters. Many text clustering techniques do not address issues such as high time and space complexity, inability to capture the relational and contextual attributes of words, low robustness, and privacy exposure risks. To address these issues, an efficient text-based clustering framework is proposed. The Reuters dataset is chosen as the input dataset. Once the input dataset is preprocessed, the similarity between words is computed using cosine similarity. The similarities between the components are compared and the vector data is created. From the vector data, the clustering particle is computed. To optimize the clustering results, mutation is applied to the vector data. The performance of the proposed text-based clustering framework is analyzed using Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and processing time. The experimental results show that the proposed framework produced optimal MSE, PSNR, and processing time when compared to the existing Fuzzy C-Means (FCM) and Pairwise Random Swap (PRS) methods.
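
The cosine similarity step mentioned in the abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and raw term counts, and it compares short texts rather than individual words. The abstract does not specify the preprocessing or vector representation, so the function names `term_vector` and `cosine_similarity` and the sample sentences are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter

def term_vector(text):
    """Lowercase the text, split on whitespace, and count term occurrences."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical short texts, not taken from the Reuters dataset.
docs = [
    "oil prices rise on supply fears",
    "oil prices fall as supply grows",
    "central bank holds interest rates",
]
vectors = [term_vector(d) for d in docs]
sim_01 = cosine_similarity(vectors[0], vectors[1])  # three of six terms shared
sim_02 = cosine_similarity(vectors[0], vectors[2])  # no terms shared
```

Texts that share more terms score closer to 1, and texts with no terms in common score 0, which is what lets a clustering step group related sentences.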

    Detection of fake news from social media using support vector machine learning algorithms

    The spread of fake news on today's scale has never happened before in human history; the development of the World Wide Web and the adoption of social media have given people a pathway to spread misinformation to the world. Everyone is using the Internet, creating and sharing content on social media, but not all of the information is valid, and no one is verifying the originality of the content. Identifying the essence of the content is sometimes complicated, even for researchers. For example, during Covid-19, misinformation about the outbreak spread worldwide, and much false information spread faster than the virus. This misinformation creates problems for the public and misleads people about which medicine to take. This work aims to improve the prediction rate. We investigate the ability of machine learning classifiers (Naive Bayes, Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, and K-Nearest Neighbor) and deep learning models (Convolutional Neural Networks and Long Short-Term Memory (LSTM)). The models will be trained and tested using a Covid-19 dataset (1,375,592 tweets).
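
As an illustration of the simplest classifier in that list, here is a minimal sketch of multinomial Naive Bayes with add-one smoothing, written from scratch. The abstract gives no implementation details, so the function names `train_naive_bayes` and `predict`, the whitespace tokenization, and the four toy training sentences are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(samples):
    """Collect class and per-class word counts from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy examples, not drawn from the dataset in the abstract.
train = [
    ("miracle cure kills virus overnight", "fake"),
    ("secret cure hidden by doctors", "fake"),
    ("health agency reports new case numbers", "real"),
    ("agency publishes vaccine trial results", "real"),
]
model = train_naive_bayes(train)
label = predict(model, "miracle cure hidden")
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and add-one smoothing keeps a word unseen in one class from forcing that class's probability to zero.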