12 research outputs found

    The big four: discrete choice modelling to predict the four major Oscar categories

    The present study formulates regression models that predict the four major Oscar categories (Picture, Director, Actor and Actress). A database was created from publicly available information covering 2005 to 2016. The approach taken was to apply discrete choice modelling. Remarkable predictive accuracy was achieved, as every single Oscar winner was correctly predicted. The study found evidence of the crucial role of directors, the predictive power of box office, gender discrepancies in the film industry, and the Academy's biases in the selection of winners related to film genre, nominees' body of work and the portrayal of actual events.
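As a rough illustration of what discrete choice modelling means here, the sketch below computes conditional-logit choice probabilities over one set of nominees and picks the most likely winner. It is a minimal sketch, not the study's model: the nominee names, the two features and the coefficient values are all hypothetical, and in practice the coefficients would be estimated from past ceremonies.

```python
import math

def utility(features, coefficients):
    """Linear utility of one nominee: dot product of features and coefficients."""
    return sum(x * b for x, b in zip(features, coefficients))

def choice_probabilities(utilities):
    """Conditional-logit probabilities within one choice set (softmax of utilities)."""
    top = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical nominees for one category in one year.
# Features (assumed for illustration): (won a precursor award, log box office).
nominees = {
    "Film A": (1.0, 4.2),
    "Film B": (0.0, 5.1),
    "Film C": (0.0, 3.0),
}
beta = (2.0, 0.3)  # hypothetical coefficients, not estimates from the study

probs = choice_probabilities([utility(f, beta) for f in nominees.values()])
predicted = max(zip(nominees, probs), key=lambda pair: pair[1])[0]
print(predicted, [round(p, 3) for p in probs])
```

Because the probabilities are normalised within each choice set, exactly one nominee per category and year is predicted as the winner, which matches how the award is decided.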

    SIWeb: understanding the Interests of the Society through Web data Analysis

    The high availability of user-generated content on the Web represents a tremendous asset for understanding various social phenomena. Methods and commercial products that exploit the widespread use of the Web as a way of conveying personal opinions have been proposed, but a key criticism is that these approaches may produce a partial, or distorted, understanding of society, because most of them focus on specific scenarios, use specific platforms, base their analysis on the sole magnitude of data, or treat different Web resources as equally important. In this paper, we present SIWeb (Social Interests through Web Analysis), a novel mechanism designed to measure the interest society has in a topic (e.g., a real-world phenomenon, an event, a person, a thing). SIWeb is general purpose (it can be applied to any decision-making process), cross-platform (it uses the entire Webspace, from social media to websites, from tags to reviews), and time effective (it measures the time correlation between the Web resources). It uses fractal analysis to detect the temporal relations behind all the Web resources (e.g., Web pages, RSS, newsgroups, etc.) that talk about a topic and combines this number with the temporal relations to give an insight into the interest society has in a topic. The evaluation of the proposal shows that SIWeb might be helpful in decision-making processes, as it reflects the interest society has in a specific topic.

    A New Term Representation Method for Gender and Age Prediction

    Author profiling is a text classification task that detects characteristics of authors, such as age, gender, educational background, place of origin, personality traits and native language, by processing their written texts. Several applications, such as forensic analysis, security and marketing, use author profiling techniques to find basic details about authors. The main problem in the domain of author profiling is the preparation of a suitable dataset for predicting the characteristics of authors. PAN is an organization that conducts competitions on various types of shared tasks. In 2013, the PAN organizers introduced the author profiling task in their series of competitions and continued it in subsequent years, providing different kinds of datasets in a variety of languages. From 2013 onwards, several researchers proposed author profiling solutions that predict different characteristics of authors using the datasets provided in the PAN competitions. Researchers used different kinds of features, such as character-based, lexical or word-based, structural, syntactic, content-based and style-based features, to distinguish authors' writing styles in their texts. Most researchers observed that content-based features, such as the words or phrases used in the text, are the most useful for detecting the characteristics of authors. In this work, the experiment was conducted with content-based features, namely the most important words or terms, for predicting age group and gender from the PAN competition datasets. Two datasets, the PAN 2014 and PAN 2016 author profiling datasets, are used in this experiment. The documents of each dataset are converted into a vector representation, which is a suitable format for training machine learning algorithms.
    The representation of terms in a document vector plays a crucial role in improving the performance of gender and age group prediction. Term Weight Measures (TWMs) are techniques used to represent the significance of a term's value in the document vector representation. In this work, we developed a new TWM for representing term values in the document vector representation. The efficiency of the proposed TWM is compared with that of existing TWMs. Two machine learning (ML) algorithms, SVM (Support Vector Machine) and RF (Random Forest), are used in this experiment to estimate the accuracy of the proposed approach. We found that the proposed TWM achieved the best accuracies for gender and age prediction on the two PAN datasets.
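To make the idea of a term weight measure concrete, the sketch below builds document vectors with plain TF-IDF, a standard existing TWM. This is an assumption chosen for illustration: the abstract does not specify the proposed measure, so the code shows only the general pattern of mapping each document to a vector of term weights. The function name and sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Map each document to a vector of TF-IDF weights over a shared vocabulary."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([
            (counts[term] / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors

# Hypothetical sample texts standing in for author documents.
docs = [
    "my homework is late",
    "my kids homework is late",
    "the meeting is late",
]
vocab, vecs = tfidf_vectors(docs)
for doc, vec in zip(docs, vecs):
    print(doc, [round(w, 3) for w in vec])
```

Terms that occur in every document receive a weight of zero, while terms concentrated in few documents receive larger weights; the resulting vectors are the kind of input a classifier such as an SVM or Random Forest would be trained on.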

    TRank: ranking Twitter users according to specific topics

    Twitter is the most popular real-time microblogging service, and it is a platform where users provide and obtain information at a rapid pace. In this scenario, one of the biggest challenges is to find a way to automatically identify the most influential users on a given topic. Currently, there are several approaches that try to address this challenge using different Twitter signals (e.g., number of followers, lists, metadata), but results are not clear and sometimes conflicting. In this paper, we propose TRank, a novel method designed to address the problem of identifying the most influential Twitter users on specific topics identified with hashtags. The novelty of our approach is that it combines different Twitter signals (that represent both the user and the user's tweets) to provide three different indicators that are intended to capture different aspects of being influential. The computation of these indicators is not based on the magnitude of the Twitter signals alone, but also takes human factors into consideration, for example the fact that a user who follows many active accounts might have a very noisy timeline and thus fail to read many tweets. The experimental assessment confirms that our approach provides results that are more reasonable than those obtained by mechanisms based on the sole magnitude of data.

    Applications of attention economics in studying equilibria in social networking

    Within social networking services, users construct their personal social networks by creating asymmetric or symmetric social links. They usually follow friends and selected professional users, such as celebrities and news agencies. On such platforms, attention is used as currency to consume information. The economic theory that deals with this situation of excessive information and scarce attention is called attention economics, and it parallels standard economic theory, although there are some interesting points of difference. In this dissertation, we use methods from attention economics to analyze interactions on social media. We statically and dynamically analyze a huge social graph with a manually classified set of professional users. The results show that the in-degree of professional users does not fit a power-law distribution. Conversely, the maximum number of professional users in one category for each user exhibits a power-law property. We analyze the reasons for these phenomena, considering questions of supply and demand, the game among professional users, the game among common and professional users, and the marginal utility of common users. The result of supply and demand determines the proportion of professional users in different subjects, and the games strongly influence the professional users' interaction patterns. Marginal utility is the direct reason for users to follow and unfollow others. Finally, game theory from economics is applied to analyze the malicious URL attack on social media. Unlike elsewhere in cyberspace, it is hard to publish malware or phishing pages directly on social media. Instead, attackers publish bad-content URLs on social media and lure users into clicking them, with the URLs leading users to the malicious page. These malicious URLs become the major gateway to further cyber-attacks on these platforms.
    We have shown that even with perfect and real-time detection algorithms, malicious URLs can easily snag many visitors if they are checked by the system only once. We propose some countermeasures. Our research on the use of attention economics has demonstrated its significance for the study of social networks.

    Social media and e-commerce: A scientometrics analysis

    The purpose of this research is to investigate the status and the evolution of scientific studies on the effect of social networks on e-commerce. The study seeks to address the status of a set of scientific productions by researchers worldwide indexed in Scopus, based on scientometric indicators. In total, 1926 articles were found, and the collected data were analyzed using quantitative and qualitative scientometric indicators with the bibliometrix R software package. The findings show that research has grown exponentially since 2009 and that the trend has continued at relatively stable rates. Thematic analysis shows that the subject is a significant but not well-developed research field. There is a high rate of cooperation, with a rich research network among institutions in the United States and in European and Asian countries. Studies also show that research interest in this area is prevalent in developed countries. In addition, the lack of studies in developing countries, especially in Africa, may be due to a lack of funds and of complex analytical tools. Studying the global trend of research through scientometrics helps managers and researchers identify the countries and institutions with the greatest potential for scientific production, which allows them to develop their professions.

    Identification and characterization of diseases on social web

    [no abstract]

    Using social media to predict future events with agent-based markets
