9 research outputs found

    Leveraging Machine Learning and Semi-Structured Information to Identify Political Views from Social Media Posts

    Social media platforms make a significant contribution to modeling and influencing people’s opinions and decisions, including political views and orientation. Analyzing social media content can reveal trends and key triggers that will influence society. This paper presents an exhaustive analysis of the performance of various implementations of the Naïve Bayes classifier, combined with a semi-structured information approach, to identify the political orientation of Twitter users based on their posts. As our research methodology, we aggregate in a semi-structured format a database of over 86,000 political posts from Democrat (left) and Republican (right) ideologies. Such an approach allows us to associate a Democrat or Republican label with each tweet in order to create and train the model. The semi-structured input data are processed using several NLP techniques, and the model is then trained to classify political orientation based on semantic criteria and semi-structured information. This paper examines several variations of the Naïve Bayes classifier suite (Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Calibrated Naïve Bayes algorithms) and tracks a variety of performance indices and their graphical representations: Prediction Accuracy, Precision, Recall, Confusion Matrix, Brier Score Loss, etc. We obtained an accuracy of around 80–85% in identifying the political orientation of the users. This leads us to the conclusion that this type of application can be integrated into a more complex system and can help in determining political trends or election results.
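
    As a rough illustration of the Multinomial Naïve Bayes variant mentioned in this abstract, the sketch below trains a tiny classifier with Laplace smoothing in pure Python. It is not the authors' pipeline: the function names (`train_multinomial_nb`, `predict`), the whitespace tokenizer, and the four toy posts are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Collect class frequencies and per-class token counts."""
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_counts": class_counts,
        "token_counts": token_counts,
        "vocab": vocab,
        "alpha": alpha,  # Laplace smoothing strength
        "n_docs": len(docs),
    }


def predict(model, text):
    """Return the label with the highest smoothed log-posterior."""
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_class in model["class_counts"].items():
        counts = model["token_counts"][label]
        total = sum(counts.values())
        score = math.log(n_class / model["n_docs"])  # log prior
        for token in text.lower().split():
            if token not in model["vocab"]:
                continue  # ignore tokens never seen in training
            score += math.log((counts[token] + alpha) / (total + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy posts for illustration only (not taken from the paper's dataset).
toy_posts = [
    "expand public healthcare funding",
    "raise the minimum wage now",
    "cut taxes and shrink government",
    "secure the border and cut taxes",
]
toy_labels = ["Democrat", "Democrat", "Republican", "Republican"]

model = train_multinomial_nb(toy_posts, toy_labels)
print(predict(model, "cut taxes"))
print(predict(model, "healthcare funding"))
```

    A real system along the lines the abstract describes would replace the tokenizer with proper NLP preprocessing and evaluate on held-out data with the listed metrics; the toy data here only shows the mechanics of the classifier.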

    Fostering Cyber-Physical Social Systems through an Ontological Approach to Personality Classification Based on Social Media Posts

    The exponential growth of social networks has led to an emergent convergence of cyber-physical systems (CPS) and social computing, accelerating the creation of smart communities and smart organizations and enabling the concept of cyber-physical social systems. Social media platforms have made a significant contribution to what we call human behavior modeling. This paper presents a novel approach to developing a user segmentation tool for the Romanian language, built on the four DISC personality types and on the analysis of social media statements. We propose and design an ontological model of the vocabulary specific to each personality type and its mapping to text from posts on social networks. This research proposal adds significant value both in terms of scientific and technological contributions (by developing semantic technologies and tools) and in terms of business, social, and economic impact (by supporting the investigation of smart communities in the context of cyber-physical social systems). To validate the model, we used a dataset of almost 2000 posts retrieved from 10 social media accounts (Facebook and Twitter) and obtained an accuracy of over 90% in identifying the personality profile of the users.

    Organizing Web Search Results Using Clustering By Compression

    Current Web search engines return long lists of ranked documents that users are forced to sift through to find relevant documents. This paper introduces an interactive presentation method for search results, based on the notion of clustering by compression. Compression algorithms make it possible to define a similarity measure based on the degree of common information. Clustering methods make it possible to group similar data without any prior knowledge. For this work, we have developed a Java application which retrieves the first 50 results returned by the Google search engine in response to a query, applies some text processing techniques to the Web documents, and finally applies the clustering by compression algorithm. The result is a binary tree that enhances the visualization of the formed clusters.
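
    The compression-based similarity measure described in this abstract can be sketched with the normalized compression distance (NCD), a common formulation of this idea. The abstract does not say which compressor or exact formula the authors used, so the choice of `zlib` and the helper names (`compressed_size`, `ncd`) are assumptions for illustration, written in Python rather than the paper's Java.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum compression level)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: lower values suggest more shared information."""
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # compress the concatenation of both texts
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy texts standing in for retrieved Web documents (invented for this example).
doc_a = "the quick brown fox jumps over the lazy dog " * 20
doc_b = "pack my box with five dozen liquor jugs " * 20

print(ncd(doc_a, doc_a))  # a document compared with itself: small distance
print(ncd(doc_a, doc_b))  # unrelated documents: larger distance
```

    A pairwise matrix of such distances over the retrieved documents could then be passed to a hierarchical clustering step to obtain a binary tree of the kind the abstract mentions.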