6,081 research outputs found

    Precursors and Laggards: An Analysis of Semantic Temporal Relationships on a Blog Network

    We explore the hypothesis that it is possible to obtain information about the dynamics of a blog network by analysing the temporal relationships between blogs at a semantic level, and that this type of analysis adds to the knowledge that can be extracted by studying the network only at the structural level of URL links. We present an algorithm to automatically detect fine-grained discussion topics, characterized by n-grams and time intervals. We then propose a probabilistic model to estimate the temporal relationships that blogs have with one another. We define the precursor score of blog A in relation to blog B as the probability that A enters a new topic before B, discounting the effect created by asymmetric posting rates. Network-level metrics of precursor and laggard behavior are derived from these dyadic precursor score estimations. This model is used to analyze a network of French political blogs. The scores are compared to traditional link degree metrics. We obtain insights into the dynamics of topic participation on this network, as well as the relationship between precursor/laggard and linking behaviors. We validate and analyze results with the help of an expert on the French blogosphere. Finally, we propose possible applications to the improvement of search engine ranking algorithms
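As a rough illustration of the idea behind the precursor score, the sketch below estimates how often blog A enters a topic before blog B, using only the topics both blogs took part in. It is a naive empirical fraction, not the probabilistic model of the paper: in particular it does not discount asymmetric posting rates, which the abstract names as part of the actual definition. The function name and the input layout (a mapping from topic id to the time of the blog's first post on that topic) are assumptions made for this example.

```python
from datetime import datetime


def naive_precursor_score(first_posts_a, first_posts_b):
    """Fraction of shared topics that blog A entered strictly before blog B.

    Each argument maps a topic id to the time of that blog's first post on
    the topic (hypothetical input layout). Returns None when the two blogs
    share no topic. No correction for posting rates is applied.
    """
    shared = first_posts_a.keys() & first_posts_b.keys()
    if not shared:
        return None
    earlier = sum(1 for topic in shared
                  if first_posts_a[topic] < first_posts_b[topic])
    return earlier / len(shared)


# Made-up example data: the blogs share topics "t1" and "t2" only.
blog_a = {"t1": datetime(2009, 1, 1), "t2": datetime(2009, 1, 5),
          "t3": datetime(2009, 2, 1)}
blog_b = {"t1": datetime(2009, 1, 2), "t2": datetime(2009, 1, 4)}
score = naive_precursor_score(blog_a, blog_b)
```

A blog that simply posts more often would tend to score higher under this naive estimate, which is the bias the paper's discounting is meant to remove.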


    Investigating the Impact of the Blogsphere: Using PageRank to Determine the Distribution of Attention

    Much has been written in recent years about the blogosphere and its impact on political, educational and scientific debates. Lately the issue has received significant attention from the industry. As the blogosphere continues to grow, even doubling its size every six months, this paper investigates its apparent impact on the overall Web itself. We use the popular Google PageRank algorithm, which employs a model of Web use to measure the distribution of user attention across sites in the blogosphere. The paper is based on an analysis of the PageRank distribution for 8.8 million blogs in 2005 and 2006. This paper addresses the following key questions: How is PageRank distributed across the blogosphere? Does it indicate the existence of measurable, visible effects of blogs on the overall mediasphere? Can we compare the distribution of attention to blogs as characterised by the PageRank with the situation for other forms of Web content? Has there been a growth in the impact of the blogosphere on the Web over the two years analysed here? Finally, it will also be necessary to examine the limitations of a PageRank-centred approach.
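For readers unfamiliar with PageRank, the following is a minimal sketch of the textbook power-iteration formulation on a small link graph. The abstract does not say how the PageRank values for the 8.8 million blogs were obtained, so this should not be read as the paper's procedure; the damping factor of 0.85, the fixed iteration count, and the dictionary-based graph layout are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to (assumed
    layout). Pages without outgoing links spread their rank evenly over
    all pages, so the ranks always sum to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Total rank currently held by pages with no outgoing links.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Made-up three-page graph: both "a" and "b" link to "c", and "c" links to "a".
example_ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy graph "c" collects links from both other pages and ends up with the largest share of rank, while "b", which nobody links to, keeps only the baseline share.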

    Collective Influence of Multiple Spreaders Evaluated by Tracing Real Information Flow in Large-Scale Social Networks

    Identifying the most influential spreaders that maximize information flow is a central question in network theory. Recently, a scalable method called "Collective Influence (CI)" has been put forward through collective influence maximization. In contrast to heuristic methods evaluating nodes' significance separately, the CI method inspects the collective influence of multiple spreaders. Although CI applies to the influence maximization problem in the percolation model, it is still important to examine its efficacy in realistic information spreading. Here, we examine real-world information flow in various social and scientific platforms including the American Physical Society, Facebook, Twitter and LiveJournal. Since empirical data cannot be directly mapped to ideal multi-source spreading, we leverage the behavioral patterns of users extracted from data to construct "virtual" information spreading processes. Our results demonstrate that the set of spreaders selected by CI can induce a larger scale of information propagation. Moreover, local measures such as the number of connections or citations are not necessarily the determining factors of nodes' importance in realistic information spreading. This result has significance for ranking scientists in scientific networks like the APS, where the commonly used number of citations can be a poor indicator of the collective influence of authors in the community.
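The CI score of a single node, as defined in the original Collective Influence work, is CI_l(i) = (k_i - 1) * sum of (k_j - 1) over all nodes j at shortest-path distance exactly l from i, where k denotes node degree. The sketch below computes only this score for one node of an undirected graph; the full method additionally removes the top-scoring node and recomputes scores repeatedly, which is not shown. The function name and the adjacency-dictionary layout are assumptions for this example.

```python
from collections import deque


def collective_influence(adj, node, radius=2):
    """CI score of `node`: (k_i - 1) times the sum of (k_j - 1) over all
    nodes j whose shortest-path distance from `node` is exactly `radius`.

    `adj` maps each node to a list of its neighbours in an undirected
    graph (assumed layout).
    """
    dist = {node: 0}
    queue = deque([node])
    frontier = []  # nodes found at distance exactly `radius`
    while queue:
        current = queue.popleft()
        if dist[current] == radius:
            frontier.append(current)
            continue  # do not expand beyond the ball boundary
        for neighbour in adj[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return (len(adj[node]) - 1) * sum(len(adj[j]) - 1 for j in frontier)


# Made-up small tree: 0 connects to 1 and 2; 1 to 3 and 4; 2 to 5.
example_graph = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5], 3: [1], 4: [1], 5: [2]}
ci_of_root = collective_influence(example_graph, 0, radius=1)
```

Because the score looks at degrees on the boundary of a ball rather than at the node's own degree alone, a modestly connected node surrounded by hubs can outrank a hub surrounded by leaves.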

    PREDICTION IN SOCIAL MEDIA FOR MONITORING AND RECOMMENDATION

    Social media including blogs and microblogs provide a rich window into user online activity. Monitoring social media datasets can be expensive due to the scale and inherent noise in such data streams. Monitoring and prediction can provide significant benefit for many applications including brand monitoring and making recommendations. Consider a focal topic and posts on multiple blog channels on this topic. Being able to target a few potentially influential blog channels which will contain relevant posts is valuable. Once these channels have been identified, a user can proactively join the conversation themselves to encourage positive word-of-mouth and to mitigate negative word-of-mouth. Links between different blog channels, and retweets and mentions between different microblog users, are a proxy of information flow and influence. When trying to monitor where information will flow and who will be influenced by a focal user, it is valuable to predict future links, retweets and mentions. Predictions of users who will post on a focal topic or who will be influenced by a focal user can yield valuable recommendations. In this thesis we address the problem of prediction in social media to select social media channels for monitoring and recommendation. Our analysis focuses on individual authors and linkers. We address a series of prediction problems including the future author prediction problem and the future link prediction problem in the blogosphere, as well as prediction in microblogs such as Twitter. For future author prediction in the blogosphere, where there are network properties and content properties, we develop prediction methods inspired by information retrieval approaches that use historical posts in the blog channel for prediction. We also train a ranking support vector machine (SVM) to solve the problem, considering both network properties and content properties. We identify a number of features which have an impact on prediction accuracy.
    For future link prediction in the blogosphere, we compare multiple link prediction methods, and show that our proposed solution, which combines the network properties of the blog with content properties, does better than methods which examine network properties or content properties in isolation. Most of the previous work has only looked at either one or the other. For prediction in microblogs, where there are follower, retweet, and mention networks, we propose a prediction model to utilize the hybrid network for prediction. In this model, we define a potential function that reflects the likelihood of a candidate user having a specific type of link to a focal user in the future and identify an optimization problem by the principle of maximum likelihood to determine the parameters in the model. We propose different approximate approaches based on the prediction model. Our approaches are demonstrated to outperform the baseline methods which only consider one network or utilize hybrid networks in a naive way. The prediction model can be applied to other similar problems where hybrid networks exist.

    BlogForever D5.1: Design and Specification of Case Studies

    This document presents the specification and design of six case studies for testing the BlogForever platform implementation process. The report explains the data collection plan, in which users of the repository will provide usability feedback through questionnaires, as well as details of the scalability analysis through the creation of specific log file analytics. The case studies will investigate the sustainability of the platform, whether it meets potential users’ needs, and whether it has an important long-term impact.

    Identification of Influential Social Networkers

    Online social networking is deeply interleaved in today's lifestyle. People come together and build communities to share thoughts, offer suggestions, exchange information, ideas, and opinions. Moreover, social networks often serve as platforms for information dissemination and product placement or promotion through viral marketing. The success rate in this type of marketing could be increased by targeting specific individuals, called 'influential users', having the largest possible reach within an online community. In this paper, we present a method aiming at identifying the influential users within an online social networking application. We introduce ProfileRank, a metric that uses popularity and activity characteristics of each user to rank them in terms of their influence. We then assess this algorithm's added value in identifying influential users compared to other commonly used social network analysis metrics, such as the betweenness centrality and the well-known PageRank, by performing an experimental evaluation on a synthetic and a real-life dataset. We also integrate all three metrics in a unified metric and measure its performance.

    Exploring the role of sentiments in identification of active and influential bloggers

    The social Web provides opportunities for the public to have social interactions and online discussions. The large number of online users of social web sites creates a high volume of data. This leads to the emergence of Big Data, which focuses on computational analysis of data to reveal patterns and associations relating to human interactions. Such analyses have vast applications in various fields such as understanding human behaviors, studying cultural influence, and promoting online marketing. Blogs are one of the social web channels that offer a way to discuss various topics. Finding the top bloggers has been a major research problem in the research domain of the social web and big data. Various models and metrics have been proposed to find important blog users in the blogosphere community. In this work, we first find the sentiment of blog posts, then we find the active and influential bloggers. Then, we compute various measures to explore the correlation between sentiment and active bloggers, as well as bloggers who have an impact on other bloggers in online communities. Results computed using real-world blog data reveal that sentiment is an important factor and should be considered as a feature for finding top bloggers. Sentiment analysis helps to understand how it affects human behaviors.


    Identifying Influential Bloggers: Time Does Matter

    Blogs have recently become one of the most favored services on the Web. Many users maintain a blog and write posts to express their opinion, experience and knowledge about a product, an event and every subject of general or specific interest. More users visit blogs to read these posts and comment on them. This "participatory journalism" of blogs has such an impact upon the masses that Keller and Berry argued that through blogging "one American in ten tells the other nine how to vote, where to eat and what to buy". Therefore, a significant issue is how to identify such influential bloggers. This problem is very new and the relevant literature lacks sophisticated solutions, but most importantly these solutions have not taken into account temporal aspects for identifying influential bloggers, even though time is the most critical aspect of the Blogosphere. This article investigates the issue of identifying influential bloggers by proposing two easily computed blogger ranking methods, which incorporate temporal aspects of the blogging activity. Each method is based on a specific metric to score the blogger's posts. The first metric, termed MEIBI, takes into consideration the number of the blog post's inlinks and its comments, along with the publication date of the post. The second metric, MEIBIX, is used to score a blog post according to the number and age of the blog post's inlinks and its comments. These methods are evaluated against the state-of-the-art influential blogger identification method utilizing data collected from a real-world community blog site. The obtained results attest that the new methods are able to better identify significant temporal patterns in the blogging behaviour.