    The Roles of Bloggers in Web 2.0

    This study examines the roles of bloggers in Web 2.0. Blogging is considered to be a fairly new phenomenon, and blog authors are increasingly seen as producers of user-generated content who wield influence on both their readers and their respective industries. Thus, it is argued that recognizing the different roles assumed by bloggers will shed light on the importance of bloggers and how this may affect marketing practices.

    Veracity Roadmap: Is Big Data Objective, Truthful and Credible?

    This paper argues that big data can possess different characteristics, which affect its quality. Depending on its origin, data processing technologies, and methodologies used for data collection and scientific discoveries, big data can have biases, ambiguities, and inaccuracies which need to be identified and accounted for to reduce inference errors and improve the accuracy of generated insights. Big data veracity is now being recognized as a necessary property for its utilization, complementing the three previously established quality dimensions (volume, variety, and velocity), but there has been little discussion of the concept of veracity thus far. This paper provides a roadmap for theoretical and empirical definitions of veracity along with its practical implications. We explore veracity across three main dimensions: 1) objectivity/subjectivity, 2) truthfulness/deception, and 3) credibility/implausibility – and propose to operationalize each of these dimensions with either existing computational tools or potential ones, relevant particularly to textual data analytics. We combine the measures of veracity dimensions into one composite index – the big data veracity index. This newly developed veracity index provides a useful way of assessing systematic variations in big data quality across datasets with textual information. The paper contributes to big data research by categorizing the range of existing tools to measure the suggested dimensions, and to Library and Information Science (LIS) by proposing to account for the heterogeneity of diverse big data and to identify the information quality dimensions important for each big data type.

    The Future of the Internet III

    Presents survey results on technology experts' predictions on the Internet's social, political, and economic impact as of 2020, including its effects on integrity and tolerance, intellectual property law, and the division between personal and work lives

    DIR 2011: Dutch-Belgian Information Retrieval Workshop Amsterdam

    This post is sponsored but all opinions are my own: does fashion blogging offer an authentic voice? An investigation into the credibility of fashion blogger sponsored content and blogger perspectives on the tensions between authenticity and commercialisation.

    This study investigates the impact of commercial sponsorship upon fashion blogging, a form of digital communication that has become important in influencing online consumer behaviour. Fashion companies appreciate the marketing value of blogs and have utilised them to their own advantage. As a result, the fashion blogosphere has become increasingly commercialised. Existing research into changes in fashion blogging has generally focused upon the attitudes and perspectives of blog readers. Relatively little research investigates the attitude of fashion bloggers themselves. This thesis therefore specifically examines the attitudes of UK fashion bloggers as regards the impact of commercial sponsorship upon their practice and on the credibility and authenticity of their blog output. This study takes an interpretative, qualitative research approach with a combination of an online questionnaire and in-depth interviews. It focuses upon over 300 fashion bloggers divided into three distinct groups: young and old active bloggers and significantly a third group of bloggers who have discontinued the activity. A review of existing literature identified a number of key areas for exploration: the effect of sponsorship upon blogger motivation, design and content of blog output, the pressures upon bloggers resulting from accepting sponsorship rewards, blogger perception of the impact that commercialisation may have had upon their practice values, the potential effect of sponsorship upon their relationship with readers, their views on the changing status of the fashion blogosphere and their role as fashion bloggers. 
The findings offer a number of new perspectives upon the evolving fashion-blog sector, especially with reference to the following themes: the personal pressures felt by some fashion bloggers as a consequence of their involvement with commercial partnerships and the negative impact that this can have upon their mental health; the increased discrepancy between the ways in which fashion bloggers talk about their practice and the reality of their actual online behaviour as regards disclosure of sponsored material, self-censorship and reluctance to be critical; the increased priority that many fashion bloggers now place upon commercial opportunities rather than their relationship with readers. This research is of significance as it has explored the tensions affecting fashion blogger attitudes and practice from their own point of view. It has specifically analysed the general decline of social community in the fashion blogosphere and the impact that this has had upon the authenticity and credibility of the fashion blogger voice

    On the Role of Social Identity and Cohesion in Characterizing Online Social Communities

    Two prevailing theories for explaining social group or community structure are cohesion and identity. The social cohesion approach posits that social groups arise out of an aggregation of individuals that have mutual interpersonal attraction as they share common characteristics. These characteristics can range from common interests to kinship ties and from social values to ethnic backgrounds. In contrast, the social identity approach posits that an individual is likely to join a group based on an intrinsic self-evaluation at a cognitive or perceptual level. In other words group members typically share an awareness of a common category membership. In this work we seek to understand the role of these two contrasting theories in explaining the behavior and stability of social communities in Twitter. A specific focal point of our work is to understand the role of these theories in disparate contexts ranging from disaster response to socio-political activism. We extract social identity and social cohesion features-of-interest for large scale datasets of five real-world events and examine the effectiveness of such features in capturing behavioral characteristics and the stability of groups. We also propose a novel measure of social group sustainability based on the divergence in group discussion. Our main findings are: 1) Sharing of social identities (especially physical location) among group members has a positive impact on group sustainability, 2) Structural cohesion (represented by high group density and low average shortest path length) is a strong indicator of group sustainability, and 3) Event characteristics play a role in shaping group sustainability, as social groups in transient events behave differently from groups in events that last longer
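The two structural cohesion indicators named in the abstract above, group density and average shortest path length, can be computed from a group's interaction graph. The following is a minimal sketch, not the authors' implementation: it assumes an undirected, unweighted graph stored as an adjacency dictionary, and the function names and toy groups are hypothetical.

```python
from collections import deque


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    # Each undirected edge appears in two adjacency lists.
    edges = sum(len(neighbours) for neighbours in adj.values()) / 2
    return edges / (n * (n - 1) / 2)


def average_shortest_path_length(adj):
    """Mean hop distance over all reachable ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target, d in dist.items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else float("inf")


# Hypothetical toy groups: a fully connected triangle and a four-node chain.
tight = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

for name, group in (("tight", tight), ("chain", chain)):
    print(name, density(group), average_shortest_path_length(group))
```

Under the abstract's second finding, the triangle (high density, short paths) would be read as the more cohesive of the two toy groups; how these values are combined with identity features is not specified here.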

    Effects of vlogger race on perceived credibility, self-efficacy and behavioral intentions towards weight loss

    Spokesperson race and expertise have exhibited an impact on audiences. This study examines the effect of race congruency between vlogger and audience with regard to weight loss. Using social psychology and communication theories, including the Elaboration Likelihood Model, spokesperson effects, the Theory of Planned Behavior, social cognitive theory, and self-efficacy theory, the current study features independently produced vlogs (video blogs) discussing weight loss strategies. The race and expertise of the vloggers in the videos were manipulated to test the effects on perceived message and source credibility, self-efficacy towards exercising and dieting, and behavioral intentions towards exercising and dieting. Results reveal that, although race congruency demonstrates limited effect on the outcome variables, it interacts with participant race, ethnic identity, and vlogger expertise to predict perceived message credibility, self-efficacy towards exercising and dieting, and behavioral intentions towards exercising and dieting. Asian American participants report greater perceived message credibility and behavioral intentions towards exercising and dieting after watching an Asian American vlogger compared to a White American vlogger. Asian American participants with low ethnic identity report greater self-efficacy towards exercising and dieting after watching an Asian American vlogger compared to a White American vlogger, while White American participants with low ethnic identity report greater self-efficacy towards exercising after watching an Asian American vlogger compared to a White American vlogger. Furthermore, perceived message credibility mediates the effect of the interaction of race congruency and participant race on participants’ behavioral intentions towards exercising and dieting. This study provides insights for understanding spokesperson effects and designing health campaigns in the interactive media environment.

    Ways of not reading Gertrude Stein

    I situate the controversial critical strategies of “distant reading” and “surface reading” in the reception history of Gertrude Stein, an author whose work was frequently declared “unreadable.” I argue that an early twentieth-century history of compromised forms of reading, including women’s reading and information work, subtends both the technology with which distant reading may be carried out and the ways in which an author’s work comes to be understood as a “corpus.”

    Methods for ranking user-generated text streams: a case study in blog feed retrieval

    User-generated content is one of the main sources of information on the Web nowadays. With the huge amount of this type of data being generated every day, having an efficient and effective retrieval system is essential. The goal of such a retrieval system is to enable users to search through this data and retrieve documents relevant to their information needs. Among the different retrieval tasks for user-generated content, retrieving and ranking streams is an important one with various applications. The goal of this task is to rank streams, as collections of documents in chronological order, in response to a user query. This differs from traditional retrieval tasks, where the goal is to rank single documents and temporal properties are less important in the ranking. In this thesis we investigate the problem of ranking user-generated streams with a case study in blog feed retrieval. Blogs, like all other user-generated streams, have specific properties and require new considerations in the retrieval methods. Blog feed retrieval can be defined as retrieving blogs with a recurrent interest in the topic of the given query. We define three different properties of blog feed retrieval, each of which introduces new challenges in the ranking task. These properties are: 1) term mismatch in blog retrieval, 2) evolution of topics in blogs, and 3) diversity of blog posts. For each of these properties, we investigate its corresponding challenges and propose solutions to overcome them. We further analyze the effect of our solutions on the performance of a retrieval system. We show that taking the new properties into account when developing the retrieval system can help us improve state-of-the-art retrieval methods. In all the proposed methods, we pay specific attention to temporal properties, which we believe are important information in any type of stream. 
We show that when combined with content-based information, temporal information can be useful in different situations. Although we apply our methods to blog feed retrieval, they are mostly general methods that are applicable to similar stream ranking problems such as ranking experts or ranking Twitter users.

    Combining granularity-based topic-dependent and topic-independent evidences for opinion detection

    Opinion mining is a sub-discipline within Information Retrieval (IR) and Computational Linguistics. It refers to the computational techniques for extracting, classifying, understanding, and assessing the opinions expressed in various online sources like news articles, social media comments, and other user-generated content. It is also known by many other terms like opinion finding, opinion detection, sentiment analysis, sentiment classification, polarity detection, etc. Defined in a more specific and simpler context, opinion mining is the task of retrieving opinions on an issue as expressed by the user in the form of a query. There are many problems and challenges associated with the field of opinion mining. In this thesis, we focus on some major problems of opinion mining. One of the major challenges of opinion mining is to find opinions that specifically concern the given topic (query). A document can contain information about many topics at once, and it may contain opinionated text about each of those topics or about only a few of them. It therefore becomes very important to select the topic-relevant segments of the document together with their corresponding opinions. We address this problem at two levels of granularity: sentences and passages. In our first, sentence-level approach, we use WordNet semantic relations to find this association between topic and opinion. In our second, passage-level approach, we use a more robust IR model, i.e. the language model, to address this problem. 
The basic idea behind both contributions for opinion-topic association is that if a document contains more textual segments (sentences or passages) that are opinionated and relevant to the topic, it is more opinionated than a document with fewer such segments. Most machine-learning-based approaches to opinion mining are domain-dependent, i.e. their performance varies from one domain to another. On the other hand, a domain- or topic-independent approach is more general and can maintain its effectiveness across different domains. However, domain-independent approaches generally suffer from poor performance. Developing an approach that is both effective and general is a major challenge in the field of opinion mining. Our contributions in this thesis include the development of an approach that uses simple heuristic functions to find opinionated documents. Entity-based opinion mining is becoming very popular among researchers in the IR community. It aims to identify the entities relevant to a given topic and to extract the opinions associated with them from a set of textual documents. However, identifying entities and determining their relevance is already a difficult task. We propose a system that takes into account both the information in the current news article and relevant earlier articles in order to detect the most important entities in the current news. In addition, we also present our framework for opinion analysis and related tasks. This framework is based on content evidence and social evidence from the blogosphere for the tasks of opinion finding, prediction, and multidimensional opinion ranking. This early-stage contribution lays the groundwork for our future work. 
The evaluation of our methods uses the TREC 2006 Blog collection and the TREC Novelty track 2004 collection. Most of the evaluations were carried out within the framework of the TREC Blog track.
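
The segment-counting idea in the abstract above (a document with more opinionated, topic-relevant segments is treated as more opinionated) can be illustrated with a small sketch. This is an assumption-laden toy, not the thesis's method: the word list, the `is_opinionated` and `is_relevant` helpers, and the example documents are all hypothetical stand-ins for the WordNet-based and language-model-based components the abstract mentions.

```python
# Hypothetical opinion lexicon; a real system would use a far richer resource.
OPINION_WORDS = {"great", "terrible", "love", "hate"}


def is_opinionated(segment):
    """Toy test: the segment contains at least one lexicon word."""
    return any(word in OPINION_WORDS for word in segment.lower().split())


def is_relevant(segment, query_terms):
    """Toy test: the segment mentions at least one query term."""
    return any(word in query_terms for word in segment.lower().split())


def opinion_score(segments, query_terms):
    """Count segments that are both opinionated and relevant to the query."""
    return sum(
        1
        for segment in segments
        if is_opinionated(segment) and is_relevant(segment, query_terms)
    )


query = {"phone"}
doc_a = [
    "I love the camera on this phone",
    "The phone battery is terrible",
    "It rained yesterday",
]
doc_b = [
    "The phone was released in May",
    "I love rainy days",
]

# Rank documents by how many opinionated, on-topic segments they contain.
ranked = sorted(
    (("doc_a", doc_a), ("doc_b", doc_b)),
    key=lambda item: opinion_score(item[1], query),
    reverse=True,
)
print([name for name, _ in ranked])
```

Note that `doc_b` contains both an on-topic sentence and an opinionated sentence, yet scores zero because no single segment is both; this is the topic-opinion association problem the abstract describes.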
