554 research outputs found

    The Effect of Feature Reduction in Click Fraud Detection: Review

    Online activity is almost never free of fraud. Online ads face a major threat from fake clicks generated by bots or malicious users. Several studies have addressed the problem using machine learning algorithms. Some of them address only automated click fraud (carried out by bots), classifying each click as human or bot, while many recent studies detect click fraud regardless of click type. This paper presents a survey of methods used to detect fraudulent clicks on ads, along with the advantages and disadvantages of each method. In general, most recent studies in this field have focused on feature preprocessing before classification, because the problem involves many related features, which may lead to overfitting. The solution is therefore to apply dimensional reduction algorithms to get better results and avoid overfitting. Keywords: Click Fraud, dimensional reduction, features, Online advertising, pay_per_click. DOI: 10.7176/NCS/11-01 Publication date: July 31st 202
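    The idea of reducing features before classification can be illustrated with a minimal sketch. It is not taken from the survey: the function `reduce_features`, the thresholds, and the feature names are all hypothetical, and a simple variance and correlation filter stands in for whichever reduction algorithms the reviewed studies actually use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(var_x * var_y)


def reduce_features(rows, names, min_variance=1e-9, max_abs_corr=0.95):
    """Return the feature names kept after two simple filters.

    1. Drop near-constant features (variance <= min_variance).
    2. Drop a feature if it is almost perfectly correlated with one
       that has already been kept.
    Both thresholds are illustrative assumptions, not tuned values.
    """
    columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}
    kept = []
    for name in names:
        col = columns[name]
        mean = sum(col) / len(col)
        variance = sum((v - mean) ** 2 for v in col) / len(col)
        if variance <= min_variance:
            continue
        if any(abs(pearson(col, columns[k])) >= max_abs_corr for k in kept):
            continue
        kept.append(name)
    return kept


# Hypothetical click features: clicks_per_hour is just 60 * clicks_per_minute,
# and constant_flag never changes, so both are redundant for a classifier.
names = ["clicks_per_minute", "clicks_per_hour", "is_mobile", "constant_flag"]
rows = [(1, 60, 0, 1), (2, 120, 1, 1), (5, 300, 0, 1), (9, 540, 1, 1)]
kept = reduce_features(rows, names)
print(kept)
```

    A classifier would then be trained only on the kept columns, giving it fewer redundant inputs to overfit on.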

    REAL-TIME AD CLICK FRAUD DETECTION

    With the increase in Internet usage, it is now considered a very important platform for advertising and marketing. Digital marketing has become very important to the economy: some of the major Internet services available publicly to users are free, thanks to digital advertising. It has also allowed the publisher ecosystem to flourish, ensuring significant monetary incentives for creating quality public content, helping to usher in the information age. Digital advertising, however, comes with its own set of challenges. One of the biggest challenges is ad fraud. There is a proliferation of malicious parties and software seeking to undermine the ecosystem and causing monetary harm to digital advertisers and ad networks. Pay-per-click advertising is especially susceptible to click fraud, where each click is highly valuable. This leads advertisers to lose money and ad networks to lose their credibility, hurting the overall ecosystem. Much of the fraud detection is done in offline data pipelines, which compute fraud/non-fraud labels on clicks long after they happened. This is because click fraud detection usually depends on complex machine learning models using a large number of features on huge datasets, which can be very costly to train and lookup. In this thesis, the existence of low-cost ad click fraud classifiers with reasonable precision and recall is hypothesized. A set of simple heuristics as well as basic machine learning models (with associated simplified feature spaces) are compared with complex machine learning models, on performance and classification accuracy. Through research and experimentation, a performant classifier is discovered which can be deployed for real-time fraud detection
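    The thesis does not spell out its heuristics in this abstract, so the following is only a sketch of what a low-cost, real-time heuristic might look like. The class `RateHeuristic`, its thresholds, and the rule itself (too many clicks from one source within a sliding time window) are assumptions made for illustration.

```python
from collections import defaultdict, deque


class RateHeuristic:
    """Flag a click when its source exceeds a click budget in a time window.

    Assumes clicks arrive in non-decreasing timestamp order. Each call does
    a constant amount of amortised work, so it can sit in a real-time path.
    """

    def __init__(self, max_clicks=3, window_seconds=60.0):
        self.max_clicks = max_clicks
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # source id -> recent timestamps

    def is_suspicious(self, source_id, timestamp):
        recent = self._recent[source_id]
        # Forget clicks that have fallen out of the sliding window.
        while recent and timestamp - recent[0] >= self.window_seconds:
            recent.popleft()
        recent.append(timestamp)
        return len(recent) > self.max_clicks


# Hypothetical stream of (source, timestamp in seconds) pairs.
clicks = [("a", 0), ("a", 10), ("b", 12), ("a", 20), ("a", 30), ("a", 100)]
detector = RateHeuristic(max_clicks=3, window_seconds=60.0)
flags = [detector.is_suspicious(src, ts) for src, ts in clicks]
print(flags)
```

    Only the fourth click from source "a" inside one minute is flagged; by the time of the last click the window has emptied again.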

    Improving the robustness and privacy of HTTP cookie-based tracking systems within an affiliate marketing context : a thesis presented in fulfilment of the requirements for the degree of Doctor of Philosophy at Massey University, Albany, New Zealand

    E-commerce activities provide a global reach for enterprises large and small. Third parties generate visitor traffic for a fee through affiliate marketing, search engine marketing, keyword bidding and organic search, amongst others. Therefore, improving the robustness of the underlying tracking and state management techniques is a vital requirement for the growth and stability of e-commerce. In an inherently stateless ecosystem such as the Internet, HTTP cookies have been the de-facto tracking vector for decades. In a previous study, the thesis author exposed circumstances under which cookie-based tracking systems can fail, some due to technical glitches, others due to manipulations made for monetary gain by fraudulent actors. Following a design science research paradigm, this research explores alternative tracking vectors discussed in previous research studies within a cross-domain tracking environment. It evaluates their efficacy within the current context and demonstrates how to use them to improve the robustness of existing tracking techniques. Research outputs include methods, instantiations and a privacy model artefact based on the information seeking behaviour of different categories of tracking software, and their resulting privacy intrusion levels. This privacy model provides clarity and is useful for practitioners and regulators in creating regulatory frameworks that do not hinder technological advancement but instead curtail privacy-intrusive tracking practices on the Internet. The method artefacts are instantiated as functional prototypes, available publicly on the Internet, to demonstrate the efficacy and utility of the methods through live tests. The research contributes to the theoretical knowledge base through generalisation of empirical findings and to the industry through problem-solving design artefacts

    An edge-based strategy for smart advertising

    Smart advertising creates awareness about an offer with a more direct, personalized and interactive focus. In this area, AROUND is a social network aimed at providing smart advertising to suggest appealing businesses to customers and their friends. The AROUND system is supported by a sophisticated recommender system, which considers not only the customers' historical behaviours, but also their current mood and accurate location. In such smart recommendation systems, the response time for personalized advertising is critical to the users' quality of experience. In this research work we first evaluate the current performance of the AROUND system in terms of processing and communication times, considering that this social network now has more than 3 million users. The current implementation of the system relies on the deployment of a network of beacons, and uses a domestic cloud provider as the main infrastructure. We show that when the number of concurrent requests becomes too high, the response time faces some limitations. In order to address this issue, we discuss several alternatives, and propose the use of an edge-based strategy as a solution for fast response time. In the experimental section, we measure the performance of the AROUND system, both in our current infrastructure at the cloud and with an edge-based approach, and show the additional advantages of leveraging the edge-based strategy even in the case of overloading the cloud capacity. This work has been supported by the Spanish Ministry of Science, Innovation and Universities and by the European Regional Development Fund (FEDER) under contract RTI2018-094532-B-I00. Peer Reviewed. Postprint (author's final draft

    MadDroid: Characterising and Detecting Devious Ad Content for Android Apps

    Advertisement drives the economy of the mobile app ecosystem. As a key component in the mobile ad business model, mobile ad content has been overlooked by the research community, which poses a number of threats, e.g., propagating malware and undesirable contents. To understand the practice of these devious ad behaviors, we perform a large-scale study on the app contents harvested through automated app testing. In this work, we first provide a comprehensive categorization of devious ad contents, including five kinds of behaviors belonging to two categories: ad loading content and ad clicking content. Then, we propose MadDroid, a framework for automated detection of devious ad contents. MadDroid leverages an automated app testing framework with a sophisticated ad view exploration strategy for effectively collecting ad-related network traffic and subsequently extracting ad contents. We then integrate dedicated approaches into the framework to identify devious ad contents. We have applied MadDroid to 40,000 Android apps and found that roughly 6% of apps deliver devious ad contents, e.g., distributing malicious apps that cannot be downloaded via traditional app markets. Experiment results indicate that devious ad contents are prevalent, suggesting that our community should invest more effort into the detection and mitigation of devious ads towards building a trustworthy mobile advertising ecosystem. Comment: To be published in The Web Conference 2020 (WWW'20

    Click Fraud Detection in Online and In-app Advertisements: A Learning Based Approach

    Click Fraud is the fraudulent act of clicking on pay-per-click advertisements to increase a site’s revenue, to drain revenue from the advertiser, or to inflate the popularity of content on social media platforms. In-app advertisements on mobile platforms are among the most common targets for click fraud, which makes companies hesitant to advertise their products. Fraudulent clicks are supposed to be caught by ad providers as part of their service to advertisers, which is commonly done using machine learning methods. However: (1) there is a lack of research in current literature addressing and evaluating the different techniques of click fraud detection and prevention, (2) threat models composed of active learning systems (smart attackers) can mislead the training process of the fraud detection model by polluting the training data, (3) current deep learning models have significant computational overhead, (4) training data is often in an imbalanced state, and balancing it still results in noisy data that can train the classifier incorrectly, and (5) datasets with high dimensionality cause increased computational overhead and decreased classifier correctness -- while existing feature selection techniques address this issue, they have their own performance limitations. By extending the state-of-the-art techniques in the field of machine learning, this dissertation provides the following solutions: (i) To address (1) and (2), we propose a hybrid deep-learning-based model which consists of an artificial neural network, auto-encoder and semi-supervised generative adversarial network. (ii) As a solution for (3), we present Cascaded Forest and Extreme Gradient Boosting with less hyperparameter tuning. (iii) To overcome (4), we propose a row-wise data reduction method, KSMOTE, which filters out noisy data samples both in the raw data and the synthetically generated samples. (iv) For (5), we propose different column-reduction methods such as multi-time-scale Time Series analysis for fraud forecasting, using binary labeled imbalanced datasets and hybrid filter-wrapper feature selection approaches

    Web usage mining for click fraud detection

    Internship carried out at AuditMark and supervised by Eng. Pedro Fortuna. Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201

    Risk factors for social networking site scam victimisation amongst Malaysian students

    Prior evidence suggests that board independence may enhance financial performance, but this relationship has been tested almost exclusively for Anglo-American countries. To explore the boundary conditions of this prominent governance mechanism, we examine the impact of the formal and information institutions of 18 national business systems (Whitley, 1999) on the board independence-financial performance relationship. Our results show that while the direct effect of independence is weak, national-level institutions significantly moderate the independence-performance relationship. Our findings suggest that the efficacy of board structures is likely to be contingent on the specific national context, but the type of legal system is insignificant

    When does personalization work on social media? a posteriori segmentation of consumers

    The aim of this research is to identify segments of consumers of fashion products based on their perceptions of the personalization of shoppable ads on mobile social media. To meet this objective, three operational objectives are defined. First, a theoretical model is evaluated based on the stimulus-organism-response framework (S–O–R). This examines, with a PLS-SEM approach, how the stimulus of personalization affects consumers’ internal cognitive state (perceived usefulness) and consequently generates a behavioral response (intention to buy). Second, we look for fashion consumer segments based on their perception of personalization through prediction-oriented segmentation (PLS-POS). Third, the segments are explained based on three constructs that were considered important in fashion consumption through mobile social networks: purchase intention, concern for privacy, and perception of trend. The inclusion of personalization and the perceived usefulness of advertisements can greatly help explain the intention to purchase clothing. The application of a posteriori segmentation helps to better understand the different types of users exposed to shoppable ads on mobile social networks and their relationship with purchase intention, concern for privacy and trend. While the measures and scales were tested in a context of mobile clothing trade, the methodology can be applied to other types of products or services