260 research outputs found

    Fame for sale: efficient detection of fake Twitter followers

    Fake followers are those Twitter accounts specifically created to inflate the number of followers of a target account. Fake followers are dangerous for the social platform and beyond, since they may alter concepts like popularity and influence in the Twittersphere - hence impacting on economy, politics, and society. In this paper, we contribute along different dimensions. First, we review some of the most relevant existing features and rules (proposed by Academia and Media) for anomalous Twitter accounts detection. Second, we create a baseline dataset of verified human and fake follower accounts. Such baseline dataset is publicly available to the scientific community. Then, we exploit the baseline dataset to train a set of machine-learning classifiers built over the reviewed rules and features. Our results show that most of the rules proposed by Media provide unsatisfactory performance in revealing fake followers, while features proposed in the past by Academia for spam detection provide good results. Building on the most promising features, we revise the classifiers both in terms of reduction of overfitting and cost for gathering the data needed to compute the features. The final result is a novel Class A classifier, general enough to thwart overfitting, lightweight thanks to the usage of the less costly features, and still able to correctly classify more than 95% of the accounts of the original training set. We ultimately perform an information fusion-based sensitivity analysis, to assess the global sensitivity of each of the features employed by the classifier. The findings reported in this paper, other than being supported by a thorough experimental methodology and interesting on their own, also pave the way for further investigation on the novel issue of fake Twitter followers.

    Leveraging Multi-level Dependency of Relational Sequences for Social Spammer Detection

    Much recent research has shed light on the development of relation-dependent but content-independent frameworks for social spammer detection. This is largely because the relations among users are difficult to alter when spammers attempt to conceal their malicious intent. Our study investigates the spammer detection problem in the context of multi-relation social networks, and attempts to fully exploit the sequences of heterogeneous relations to enhance detection accuracy. Specifically, we present the Multi-level Dependency Model (MDM). The MDM is able to exploit the long-term dependency hidden in users' relational sequences along with the short-term dependency. Moreover, because short-term sequences come in multiple types, MDM considers them from both the individual-level and the union-level perspective. Experimental results on a real-world multi-relational social network demonstrate the effectiveness of our proposed MDM on multi-relational social spammer detection.

    LSSL-SSD: Social spammer detection with Laplacian score and semi-supervised learning

    © Springer International Publishing AG 2016. The rapid development of social networks makes it easy for people to communicate online. However, because of their openness, social networks usually suffer from social spammers. Spammers deliver information for economic purposes, and they pose threats to the security of social networks. To keep online social networks running in the long term, many detection methods have been proposed. However, current methods normally use high-dimensional features with supervised learning algorithms to find spammers, resulting in low detection performance. To solve this problem, in this paper, we first apply the Laplacian score method, which is an unsupervised feature selection method, to obtain useful features. Based on the selected features, semi-supervised ensemble learning is then used to train the detection model. Experimental results on the Twitter dataset show the efficiency of our approach after feature selection. Moreover, the proposed method maintains high detection performance in the face of limited labeled data.
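    The Laplacian score mentioned in this abstract can be sketched as follows. This is a minimal pure-Python illustration of the standard Laplacian score (a k-nearest-neighbour graph with heat-kernel weights; lower scores indicate features that better preserve local structure), not the authors' implementation. The function name `laplacian_scores`, the parameters `k` and `t`, and the toy matrix are assumptions made for illustration only.

```python
import math


def laplacian_scores(X, k=2, t=1.0):
    """Return one Laplacian score per feature column of X (lower is better).

    X is a list of samples, each a list of numeric feature values.
    k (neighbours per sample) and t (heat-kernel width) are illustrative
    defaults, not values taken from the paper.
    """
    n, m = len(X), len(X[0])
    # Pairwise squared Euclidean distances between samples.
    d2 = [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
          for i in range(n)]
    # Symmetric k-nearest-neighbour graph with heat-kernel edge weights.
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: d2[i][j])[:k]
        for j in nearest:
            w = math.exp(-d2[i][j] / t)
            S[i][j] = w
            S[j][i] = w
    deg = [sum(row) for row in S]  # diagonal of the degree matrix D
    total = sum(deg)
    scores = []
    for r in range(m):
        f = [X[i][r] for i in range(n)]
        # Remove the degree-weighted mean of the feature.
        mean = sum(f[i] * deg[i] for i in range(n)) / total
        g = [v - mean for v in f]
        # g^T L g equals half the weighted sum of squared differences.
        num = 0.5 * sum(S[i][j] * (g[i] - g[j]) ** 2
                        for i in range(n) for j in range(n))
        den = sum(deg[i] * g[i] ** 2 for i in range(n))  # g^T D g
        scores.append(num / den if den > 0 else float("inf"))
    return scores


# Toy data (hypothetical): column 0 separates two groups, column 1 does not.
toy = [[0.0, 0.3], [0.1, -0.3], [0.2, 0.3],
       [5.0, -0.3], [5.1, 0.3], [5.2, -0.3]]
print(laplacian_scores(toy))
```

    Features would then be ranked by ascending score and the top ones passed to the downstream learner; how many to keep is a tuning choice that the abstract does not specify.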

    Efficient and Trustworthy Review/Opinion Spam Detection

    The most common way for consumers to express their level of satisfaction with their purchases is through online ratings, which we refer to as an Online Review System. Network analysis has recently gained a lot of attention because of the arrival and increasing attractiveness of social sites, such as blogs, social networking applications, microblogging, and customer review sites. Reviews are used by potential customers to find the opinions of existing users before purchasing products. Online review systems play an important part in affecting consumers' actions and decision making, and therefore attract many spammers who insert fake feedback or reviews in order to manipulate review content and ratings. Malicious users misuse the review website and post untrustworthy, low-quality, or sometimes fake opinions, which are referred to as Spam Reviews. In this study, we aim to provide an efficient method to identify spam reviews and to filter out the spam content, using a dataset collected from gsmarena.com. Experiments on this dataset show that the proposed system achieves higher accuracy than standard naïve Bayes.
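    For context, the naïve Bayes baseline that this abstract compares against can be sketched as a multinomial classifier with add-one smoothing. This is a minimal illustration only: the proposed system itself is not described in the abstract and is not reproduced here, the function names `train_nb` and `predict_nb` are hypothetical, and the example reviews are invented toy strings rather than gsmarena.com data.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count words per class for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior for the text."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n_docs in class_counts.items():
        logp = math.log(n_docs / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            logp += math.log((word_counts[label][tok] + 1)
                             / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Invented toy reviews, for illustration only.
docs = ["buy cheap phones now click here",
        "click here for free prize",
        "battery lasts two days great screen",
        "camera is good but battery drains"]
labels = ["spam", "spam", "ham", "ham"]
model = train_nb(docs, labels)
print(predict_nb(model, "free prize click here"))
```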

    Fake accounts detection system based on bidirectional gated recurrent unit neural network

    Online social networks have become the most widely used medium to interact with friends and family, share news and important events, or publish daily activities. However, this growing popularity has made social networks a target for suspicious exploitation such as the spreading of misleading or malicious information, making them less reliable and less trustworthy. In this paper, a fake account detection system based on the bidirectional gated recurrent unit (BiGRU) model is proposed. The focus is on the content of users' tweets to classify a Twitter user profile as legitimate or fake. Tweets are gathered in a single file and transformed into a vector space using the GloVe word embedding technique in order to preserve the semantic and syntactic context. Compared with baseline models such as long short-term memory (LSTM) and convolutional neural networks (CNN), the results are promising and confirm that GloVe with a BiGRU classifier performs best, reaching 99.44% accuracy and 99.25% precision. To demonstrate the efficiency of our approach, the results obtained with GloVe were compared to Word2vec under the same conditions. Results confirm that GloVe with a BiGRU classifier achieves the best results for the detection of fake Twitter accounts using only the tweet content feature.

    Social spammer detection: A multi-relational embedding approach

    © Springer International Publishing AG, part of Springer Nature 2018. Since relations are the main form of data in social networks, social spammer detection needs a relation-dependent but content-independent framework. Some recent detection methods transform the social relations into a set of topological features, such as degree, k-core, etc. However, the multiple heterogeneous relations and the direction within each relation have not been fully explored for identifying social spammers. In this paper, we attempt to adopt the Multi-Relational Embedding (MRE) approach for learning latent features of the social network. The MRE model is able to fuse multiple kinds of relations and also learns, for each relation, two latent vectors per user, indicating the user's sending role and receiving role respectively. Experimental results on a real-world multi-relational social network demonstrate that the latent features extracted by our MRE model can improve detection performance remarkably.
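    The topological-feature approach that this abstract contrasts with MRE can be illustrated with a small sketch that keeps relation type and direction separate when counting degrees. It does not implement the MRE model, whose training objective is not given in the abstract. The function name `degree_features`, the relation names `follow` and `mention`, and the toy edges are all hypothetical.

```python
from collections import defaultdict


def degree_features(edges, relations):
    """Build per-user [out, in] degree counts for each relation type.

    edges is an iterable of (relation, sender, receiver) triples.
    The returned vector for each user is ordered as
    [rel_1 out, rel_1 in, rel_2 out, rel_2 in, ...].
    """
    counts = defaultdict(lambda: defaultdict(int))
    for rel, src, dst in edges:
        counts[src][(rel, "out")] += 1  # sending role
        counts[dst][(rel, "in")] += 1   # receiving role
    return {
        user: [c[(rel, d)] for rel in relations for d in ("out", "in")]
        for user, c in counts.items()
    }


# Hypothetical toy network with two relation types.
edges = [("follow", "a", "b"), ("follow", "a", "c"),
         ("mention", "a", "b"), ("follow", "b", "a")]
features = degree_features(edges, ["follow", "mention"])
print(features["a"])  # [2, 1, 1, 0]
```

    Such hand-built counts treat each relation independently, which is the limitation the abstract says latent sending and receiving vectors are meant to address.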

    Enhancing data privacy and security related process through machine learning

    In this thesis, we exploit the advantages of machine learning (ML) in the domains of data security and data privacy. ML is one of the most exciting technologies being developed in the world today. The major advantages of ML technology are its prediction capability and its ability to reduce the need for human involvement in performing tasks. These benefits motivated us to exploit ML to improve users' data privacy and security. First, we use ML to try to predict the best privacy settings for users, since ML has a strong prediction ability and the average user might find it difficult to set up privacy settings properly due to a lack of knowledge and a resulting difficulty in making decisions about the privacy of their data. In addition, since the ML approach has the potential to considerably cut down on manual effort, our second task in this thesis is to exploit ML to redesign security mechanisms of social media environments that rely on human participation to provide such services. In particular, we use ML to train spam filters for identifying and removing creators of violent, insulting, aggressive, and harassing content (a.k.a. spammers) from a social media platform. This helps address the violent and aggressive behavior that has been growing in social media environments. The experimental results show that our proposals are efficient and effective.