1,765 research outputs found

    A survey of data mining techniques for social media analysis

    Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are becoming more interested in, and reliant on, social networks for information, news and the opinions of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues, namely size, noise and dynamism. These issues often make social network data too complex to analyse manually, which makes computational analysis necessary. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns and rules, in massive datasets [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis and data interpretation in the course of analysis. This survey discusses the data mining techniques used to mine diverse aspects of social networks over the decades, from historical techniques to up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, along with the tools employed and the names of their authors.

    The role of textual data in finance: methodological issues and empirical evidence

    This thesis investigates the role of textual data in the financial field. Textual data fall into the broader category of alternative data. These types of data, such as reviews, blog posts and tweets, are constantly growing, which reinforces their importance in several domains. The thesis explores different applications of textual data in finance to show how this type of data can be used and how doing so can add value to financial analysis. The first application concerns the use of a lexicon-based approach in a credit scoring model. The second application proposes a method for detecting causality between financial and sentiment data using an information-theoretic measure, the transfer entropy. The last application concerns the use of sentiment analysis in a network model, called BGVAR, to analyze the financial impact of the Covid-19 pandemic. Overall, this thesis shows that combining textual data with traditional financial data can yield more insightful knowledge and, therefore, a more in-depth analysis, allowing for a broader understanding of economic events and of financial relationships among economic entities of any kind.

    A Survey of Data Mining Techniques for Social Network Analysis

    Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are becoming more interested in, and reliant on, social networks for information, news and the opinions of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues, namely size, noise and dynamism. These issues often make social network data too complex to analyse manually, which makes computational analysis necessary. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns and rules, in massive datasets [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis and data interpretation in the course of analysis. This survey discusses the data mining techniques used to mine diverse aspects of social networks over the decades, from historical techniques to up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, along with the tools employed and the names of their authors.

    Information extraction from multimedia web documents: an open-source platform and testbed

    The LivingKnowledge project aimed to enhance the current state of the art in search, retrieval and knowledge management on the web by advancing the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques has been integrated into a single but extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents and, unlike earlier platforms, it exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software to the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system and describes two applications that utilise the system for multimedia information retrieval.

    A social media and crowd-sourcing data mining system for crime prevention during and post-crisis situations

    A number of large crisis situations, such as natural disasters, have affected the planet over the last decade. The outcomes of such disasters are catastrophic for the infrastructures of modern societies. Furthermore, after large disasters, societies come face-to-face with important issues, such as the loss of human lives, missing people and an increase in the crime rate. On many occasions, they seem unprepared to face such issues. This paper presents an automated system for coordinating the police and Law Enforcement Agencies (LEAs) in order to prevent criminal activities during and after a large crisis situation. The paper reviews the literature, focusing on the necessity of using data mining in combination with advanced web technologies, such as social media and crowd-sourcing, to resolve the problems related to criminal activities arising during and after crisis situations. The paper introduces examples of different techniques and algorithms used for social media and crowd-sourcing scanning, such as sentiment analysis and link analysis. The main focus of the paper is the ATHENA Crisis Management system. The ATHENA system relies on social media and crowd-sourcing to collect crisis-related information. The system uses a number of data mining techniques to collect and analyze data from social media for the purpose of crime prevention. A number of conclusions are drawn on the significance of social media and crowd-sourcing data mining techniques for resolving problems related to large crisis situations, with emphasis on the ATHENA system.

    A Survey on Mining Top-k Competitors from Large Unstructured Dataset Using k_means Clustering Algorithm and Sentiment Analysis Approach

    A long line of research has shown the vital significance of identifying and monitoring a company's competitors. In this context, various questions emerge: How do we formalize and measure the competitiveness between two items? Who are the most important competitors of a given item? Which features of an item affect its competitiveness? Motivated by this problem, the marketing and management community has focused on empirical methods for competitor identification as well as on methods for analysing known competitors. Existing research on the former has focused on mining comparative expressions (e.g. one product is superior to another product) from the web or other textual sources. Although such expressions can indeed be indicators of competitiveness, they are absent in many domains. By surveying various papers, we conclude that the competitiveness between two items is of basic significance and can be assessed on the basis of market segments.