13 research outputs found

    Optimized classification predictions with a new index combining machine learning algorithms

    Voting is a commonly used ensemble method that aims to optimize classification predictions by combining the results of individual base classifiers. However, selecting appropriate classifiers to participate in a voting algorithm remains an open issue. In this study we developed a novel Dissimilarity-Performance (DP) index, which incorporates two important criteria for selecting base classifiers for voting: their differential response in classification (dissimilarity) when combined in triads, and their individual performance. To develop this empirical index, we first used a range of datasets to evaluate the relationship between voting results and measures of dissimilarity among classifiers of different types (rules, trees, lazy classifiers, functions and Bayes). Second, we computed the combined effect on voting performance of classifiers with different individual performance and/or diverse results. Our DP index was able to rank the classifier combinations according to their voting performance and thus to suggest the optimal combination. The proposed index is recommended for individual machine learning users as a preliminary tool for identifying which classifiers to combine in order to achieve more accurate classification predictions while avoiding a computationally intensive and time-consuming search.
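
The two ingredients named in this abstract, majority voting over a triad of base classifiers and a dissimilarity measure between them, can be illustrated with a minimal sketch. This is not the authors' DP index, whose formula the abstract does not give. The toy labels, the function names, and the choice of pairwise disagreement as the dissimilarity measure are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Return the per-sample majority label across classifiers.

    `predictions` is a list of equal-length label lists, one per classifier.
    """
    voted = []
    for labels in zip(*predictions):
        # most_common(1) yields the most frequent label for this sample.
        # With three voters and two classes a tie cannot occur.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


def accuracy(predicted, truth):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def disagreement(a, b):
    """Fraction of samples on which two classifiers give different labels."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_disagreement(predictions):
    """Average disagreement over all classifier pairs (assumed measure)."""
    pairs = list(combinations(predictions, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: true labels and the outputs of three classifiers,
# each of which is wrong on a different sample.
truth = [1, 0, 1, 1, 0, 1]
clf_a = [1, 0, 1, 0, 0, 1]
clf_b = [1, 0, 0, 1, 0, 1]
clf_c = [0, 0, 1, 1, 0, 1]
triad = [clf_a, clf_b, clf_c]

voted = majority_vote(triad)
individual = [accuracy(c, truth) for c in triad]
voting_accuracy = accuracy(voted, truth)
diversity = mean_pairwise_disagreement(triad)
```

In this toy case each classifier errs on a different sample, so the vote corrects every individual error. That is the intuition behind weighing dissimilarity alongside individual performance when choosing which classifiers to combine.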

    Popularity, face and voice: Predicting and interpreting livestreamers' retail performance using machine learning techniques

    Livestreaming commerce, a hybrid of e-commerce and self-media, has expanded the broad spectrum of traditional sales performance determinants. To investigate the factors that contribute to the success of livestreaming commerce, we construct a longitudinal firm-level database with 19,175 observations, covering an entire livestreaming subsector. By comparing the forecasting accuracy of eight machine learning models, we identify a random forest model that provides the best prediction of gross merchandise volume (GMV). Furthermore, we use explainable artificial intelligence to open the black box of the machine learning model, discovering four new facts: 1) variables representing the popularity of livestreaming events are crucial features in predicting GMV, and voice attributes are more important than appearance; 2) popularity is a major determinant of sales for female hosts, while vocal aesthetics is more decisive for their male counterparts; 3) merits and drawbacks of the voice are not equally valued in the livestreaming market; 4) based on changes in comments, page views and likes, sales growth can be divided into three stages. Finally, we propose a 3D-SHAP diagram that demonstrates the relationship between predicting feature importance, target variable, and its predictors. This diagram identifies bottlenecks for both beginner and top livestreamers, providing insights into ways to optimize their sales performance. Comment: 25 pages, 10 figures

    Blogging mastery: analyzing the key strategies behind successful blogs

    Bloggers in the digital landscape have the power to shape consumer behavior and influence their peers. However, successfully running a blog demands time and commitment, similar to operating a small business. Yet there is scant literature on the practices and strategies that bloggers use to build their blogs and remain successful. This study explores bloggers' most effective methods and strategies for establishing themselves in their respective niches. The qualitative research study uses transcendental phenomenology to examine the lived experiences of successful bloggers, aiming to provide insights into their successful strategies, best practices, and challenges, along with guidance for new bloggers. Twelve bloggers who met the criteria for inclusion were interviewed using 12 semi-structured open-ended questions. Thematic analysis was used to code and categorize the themes. The findings suggest that bloggers use various strategies to establish themselves in their respective niches and overcome challenges. The study results were integrated and used to develop the Blogger Success Framework to help established and aspiring bloggers navigate the digital landscape of blogging.

    Combining granularity-based topic-dependent and topic-independent evidences for opinion detection

    Opinion mining is a sub-discipline within Information Retrieval (IR) and Computational Linguistics. It refers to the computational techniques for extracting, classifying, understanding, and assessing the opinions expressed in various online sources such as news articles, social media comments, and other user-generated content. It is also known by many other terms, such as opinion finding, opinion detection, sentiment analysis, sentiment classification, and polarity detection. Defined in a more specific and simpler context, opinion mining is the task of retrieving opinions on an issue as expressed by the user in the form of a query. There are many problems and challenges associated with the field of opinion mining. In this thesis, we focus on some major problems of opinion mining. One of the major challenges is to find opinions that specifically concern the given topic (query). A document can contain information on many topics at once, and it may contain opinionated text on each of the topics or on only a few. It therefore becomes very important to select the document segments relevant to the topic together with their corresponding opinions. We address this problem at two levels of granularity, sentences and passages. In our first, sentence-level approach, we use WordNet semantic relations to find this association between topic and opinion. In our second, passage-level approach, we use a more robust IR model, namely the language model, to address this problem.
The basic idea behind both contributions to opinion-topic association is that a document containing more textual segments (sentences or passages) that are opinionated and relevant to the topic is more opinionated than a document with fewer such segments. Most machine-learning-based approaches to opinion mining are domain-dependent, i.e. their performance varies from one domain to another. A domain- or topic-independent approach, on the other hand, is more general and can maintain its effectiveness across domains. However, domain-independent approaches generally suffer from poor performance. Developing an approach that is both more effective and more general is a major challenge in opinion mining. Our contributions in this thesis include the development of an approach that uses simple heuristic functions to find opinionated documents. Entity-based opinion mining is becoming very popular among researchers in the IR community. It aims to identify the entities relevant to a given topic and to extract the opinions associated with them from a set of text documents. However, identifying entities and determining their relevance is already a difficult task. We propose a system that takes into account both the information in the current news article and relevant earlier articles in order to detect the most important entities in the current news. In addition, we present our framework for opinion analysis and related tasks. This framework is based on content evidence and social evidence from the blogosphere for the tasks of opinion finding, opinion prediction, and multidimensional review ranking. This early-stage contribution lays the groundwork for our future work.
The evaluation of our methods uses the TREC 2006 Blog collection and the TREC Novelty track 2004 collection. Most of the evaluations were carried out within the framework of the TREC Blog track.

    How users balance opportunity and risk : a conceptual exploration of social media literacy and measurement


    Ghostwriting as a Critical Lens: Authorship and Attribution in Professional and Academic Contexts

    This dissertation exposes the inherent deceit within the practice of ghostwriting, considers ways that business applications of writing de-value the labor of writing, and, finally, argues for a composition pedagogy that moves past the emphasis on single-author documents so that students can critically view corporate authorship as an alternative. This dissertation engages in mixed-methods research that included surveys of blog readers and interviews of professional ghostwriters to include voices too often excluded from discussions about the impacts of professional ghostwriting. After establishing the layers of silence placed around the practice of ghostwriting, I then argue that perpetuating this practice de-values the labor of writing despite the integral role writing plays in creating value in our current world. After discussing the ethical and professional implications of ghostwriting in corporate settings, this dissertation argues that students in First-Year Composition (FYC) programs occupy a role similar to the professional ghostwriter in terms of limited agency, pay-off, and potential. As with the context of professional writing, this study challenges the status quo of single-authored texts as assessments in FYC and argues for the benefits of students composing in digital genres such as wikis and social media to critique the benefits of single-authored, collaborative, and corporate writing in and out of the classroom

    The Exposure Economy Model: Navigating Visibility on Instagram

    To be seen on social media is a crucial concern for content creators, who have developed visibility practices to stand out in overcrowded online markets. ‘Exposure’, the state of being publicised to new audiences, has hence become increasingly valuable and is treated as a reward to be utilised as currency. This thesis shifts thinking around the social media landscape by offering a new model to view the participants and practices involved in the production, consumption, and trade of exposure. The Exposure Economy Model (EEM) compares the operations of the Instagram platform and its users, influencers, and agencies to respective economic stakeholders, namely retail institutions, consumers, brand manufacturers and distributors. This comparison is grounded in digital ethnography, consisting of participant observation, surveys, semi-structured interviews, and textual analysis. Through investigating the exposure-seeking practices within EEM, the research design examines algorithmic structures that lead to disproportionate visibility outcomes online. Subsequent research findings introduce new categories to segment social media users, namely engaged users, private participants, and need-centric consumers, and illustrate how variables such as aesthetics, aspiration and authenticity are crucial to the construction of influencer branding. By focusing on Instagram, this thesis explores the app’s specific use by key stakeholders, how they navigate capitalist systems and the social and cultural impacts of exposure inflation. Beyond the example of Instagram, however, these discussions build on existing research on influencers, micro-celebrity, and the creator economy by drawing attention to digital inequalities and providing suggestions for mitigating and adapting to social media change

    Blogging the hyperlocal : the disruption and renegotiation of hegemony in Malta

    This thesis examines how blogging is being deployed to disrupt institutional hegemony in Malta. The island state is an example of a hyperlocal context that includes strong political, ecclesiastical and media institutions, advanced take-up of social technologies and a popular culture adjusting to the promise of modernity represented by EU membership. Popular discourse is dominated by political partisanship and advocacy journalism, with Malta being the only European country that permits political parties to directly own broadcasting stations. The primary evidence in this study is derived from an analysis of online texts during an organic crisis that eventually led to a national referendum to consider the introduction of divorce legislation in Malta. Using netnography supplemented by critical discourse analysis, the research identifies a set of strategies bloggers used to resist, challenge and disrupt the discourse of a hegemonic alliance that included the ruling political party, the Roman Catholic Church and their media. The empirical results indicate that blogging in Malta is contributing to the erosion of the Church’s hegemony. Subjects that were previously marginalised as alternative are increasingly finding an online outlet in blog posts, social media networks and commentary on newspaper portals. Nevertheless, a culture of social surveillance together with the natural barriers of size and the permeability of the social web facilitates the appropriation of blogging by political blocs, who remain vigilant to the opportunity of extending their influence in new media to disrupt horizontal networks of information exchange. Blogging is increasingly operating as a component of a hybrid media ecosystem that thrives on reflexive cycles of entertainment: the independent newspaper media, for long an active partner in the hegemonic set up in Malta, are being transformed and rendered more permeable at the same time as their power and influence are being eroded.
The study concludes that a new episteme is more likely to emerge through the symbiosis of hybrid media and reflexive waves of networked individualism than systemic, organised attempts at online political disruption.

    The structural determinants of media contagion

    Thesis (Ph.D.), Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. By Cameron Alexander Marlow. Includes bibliographical references (p. 157-166).
    Informal exchanges between friends, family and acquaintances play a crucial role in the dissemination of news and opinion. These casual interactions are embedded in a network of communication that spans our society, allowing information to spread from any one person to another via some set of intermediary ties. Weblogs have recently emerged as a part of our media ecology and incidentally engender this process of media contagion; because weblog authors are tied by social networks of readership, contagious media events happen frequently, and in a form that is immediately measurable. The generally accepted notion of media diffusion is that it occurs through two channels: externally, as applied by a constant force such as the mass media, and internally through socio-structural means. Sitting between our traditional notions of mass media and the public, weblogs problematize this classical theory of mass media influence. This thesis aims to elucidate the role of weblogs in media contagion through a sociological study of this community in two parts: First, I will address the issues of modeling the social structure of weblogs as observed through their readership network, and the various media events that occur therein. Using a large weblog corpus collected over a one-month period, I have constructed a model describing the structure of popularity and influence from the extracted readership network, and will show that this model more accurately describes the weblog network. I will also derive a typology of media events from collected examples using features of structural and non-structural diffusion. Second, the extent to which these data are reflective of actual social processes as opposed to artifacts of data collection and aggregation will be explored.
To validate the models presented in part one, I have conducted a survey of randomly selected authors to examine their social behaviors, both in weblog use and otherwise. I will characterize the range of weblog uses and practices, presenting an analysis of personal influence in the blogging community.