1,599 research outputs found

    Understanding the Detection of View Fraud in Video Content Portals

    While substantial effort has been devoted to understanding fraudulent activity in traditional online advertising (search and banner), more recent forms such as video ads have received little attention. For advertisers, understanding and identifying fraudulent activity (i.e., fake views) in video ads is complicated, as they rely exclusively on the detection mechanisms deployed by video hosting portals. In this context, independent tools able to monitor and audit the fidelity of these systems are missing today and are needed by both industry and regulators. In this paper we present a first set of tools to serve this purpose. Using our tools, we evaluate the performance of the audit systems of five major online video portals. Our results reveal that YouTube's detection system significantly outperforms all the others. Despite this, a systematic evaluation indicates that it may still be susceptible to simple attacks. Furthermore, we find that YouTube penalizes its videos' public and monetized view counters differently, the former being more aggressive. This means that views identified as fake and discounted from the public view counter are still monetized. We speculate that even though YouTube's policy puts considerable effort into compensating users after an attack is discovered, this practice places the burden of the risk on the advertisers, who pay to get their ads displayed. Comment: To appear in WWW 2016, Montréal, Québec, Canada. Please cite the conference version of this paper

    Engineering Crowdsourced Stream Processing Systems

    A crowdsourced stream processing system (CSP) is a system that incorporates crowdsourced tasks in the processing of a data stream. This can be seen as enabling crowdsourcing work to be applied on a sample of large-scale data at high speed, or equivalently, enabling stream processing to employ human intelligence. It also leads to a substantial expansion of the capabilities of data processing systems. Engineering a CSP system requires the combination of human and machine computation elements. From a general systems theory perspective, this means taking into account inherited as well as emerging properties from both these elements. In this paper, we position CSP systems within a broader taxonomy, outline a series of design principles and evaluation metrics, present an extensible framework for their design, and describe several design patterns. We showcase the capabilities of CSP systems by performing a case study that applies our proposed framework to the design and analysis of a real system (AIDR) that classifies social media messages during time-critical crisis events. Results show that compared to a pure stream processing system, AIDR can achieve a higher data classification accuracy, while compared to a pure crowdsourcing solution, the system makes better use of human workers by requiring much less manual work effort

    Characterizing Key Stakeholders in an Online Black-Hat Marketplace

    Over the past few years, many black-hat marketplaces have emerged that facilitate access to reputation manipulation services such as fake Facebook likes, fraudulent search engine optimization (SEO), or bogus Amazon reviews. In order to deploy effective technical and legal countermeasures, it is important to understand how these black-hat marketplaces operate, shedding light on the services they offer, who is selling, who is buying, what they are buying, who is more successful, why they are successful, etc. Toward this goal, in this paper, we present a detailed micro-economic analysis of a popular online black-hat marketplace, namely, SEOClerks.com. As the site provides non-anonymized transaction information, we set out to analyze the selling and buying behavior of individual users, propose a strategy to identify key users, and study their tactics as compared to other (non-key) users. We find that key users: (1) are mostly located in Asian countries, (2) are focused more on selling black-hat SEO services, (3) tend to list more lower-priced services, and (4) sometimes buy services from other sellers and then sell at higher prices. Finally, we discuss the implications of our analysis with respect to devising effective economic and legal intervention strategies against marketplace operators and key users. Comment: 12th IEEE/APWG Symposium on Electronic Crime Research (eCrime 2017)

    Fake News Detection in Social Networks via Crowd Signals

    Our work considers leveraging crowd signals for detecting fake news and is motivated by tools recently introduced by Facebook that enable users to flag fake news. By aggregating users' flags, our goal is to select a small subset of news every day, send them to an expert (e.g., via a third-party fact-checking organization), and stop the spread of news identified as fake by an expert. The main objective of our work is to minimize the spread of misinformation by stopping the propagation of fake news in the network. It is especially challenging to achieve this objective as it requires detecting fake news with high-confidence as quickly as possible. We show that in order to leverage users' flags efficiently, it is crucial to learn about users' flagging accuracy. We develop a novel algorithm, DETECTIVE, that performs Bayesian inference for detecting fake news and jointly learns about users' flagging accuracy over time. Our algorithm employs posterior sampling to actively trade off exploitation (selecting news that maximize the objective value at a given epoch) and exploration (selecting news that maximize the value of information towards learning about users' flagging accuracy). We demonstrate the effectiveness of our approach via extensive experiments and show the power of leveraging community signals for fake news detection
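
The posterior-sampling idea in the abstract above can be illustrated with a small sketch. This is not the DETECTIVE algorithm itself; it is a simplified stand-in that assumes each user's flagging accuracy follows a Beta posterior and scores each news item by the summed log-odds of sampled accuracies. The names `FlagPosterior` and `select_for_review` are hypothetical.

```python
import math
import random


class FlagPosterior:
    """Beta posterior over the probability that a user's flag is correct.

    Assumption for illustration: one accuracy parameter per user.
    """

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha
        self.beta = beta

    def sample(self, rng):
        # Draw one plausible accuracy value from the current posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, flag_was_correct):
        # Expert feedback on a flagged item updates the posterior counts.
        if flag_was_correct:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def select_for_review(flags, posteriors, k, rng):
    """Pick k news items to send to the expert.

    flags: dict mapping news id -> list of user ids who flagged it.
    Sampling accuracies (rather than using their means) is what lets
    the selection explore users whose accuracy is still uncertain.
    """
    scores = {}
    for news_id, users in flags.items():
        score = 0.0
        for user in users:
            p = posteriors[user].sample(rng)
            p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep the log-odds finite
            score += math.log(p / (1.0 - p))
        scores[news_id] = score
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

After the expert labels a selected item, calling `update` for each user who flagged it closes the loop between detection and learning about users.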

    Predictive Policing

    The UAE is one of the safest countries to live in, but that does not mean the country is free of crime. During the COVID-19 pandemic, the country saw an increase in cyber and digital crimes. Apart from cybercrime, there are other types of crime, such as street crime and violent crime. Data analytics helps Dubai Police predict crimes. Criminal investigation is a field of wide interest and is taught in colleges and academies. Data analytics opens the door to studying the details of each crime. Data mining tools comprise a variety of techniques that can help solve a problem or indicate a cause or an effect. Data analysts apply these tools through software that lets the user analyze data easily. SAS (Statistical Analysis System) is a reputable software package used especially for visualizing and analyzing data. In this capstone, we use SAS because it is accredited by Dubai Police and we already use it in our workplace. Prediction techniques help Dubai Police interpret the data and develop strategies to reduce the crime rate, which in turn allows the UAE to sustain its position as the “safest” country. The capstone will help us improve what we do at work and stop or reduce crime, which is one of the main pillars of Dubai Police. The crime-related data will be collected from the CID in Dubai Police. Link analysis and predictive analysis will be performed in this project to forecast crime. We will build a predictive model using SAS to predict crime. The proposed project will help identify trends in historical crime data. A project timeline is provided to give a clearer outline. The first step is to collect the data from the source, which in our case is the criminal investigation department of Dubai Police. 
We met with the department, which agreed to give us datasets covering five years of specific crimes that Dubai Police considers critical and in need of further analysis. Thus, the data we analyze spans the years 2017 to 2021. After collecting the data, we processed it, which is the cleaning stage. Because the data is in Arabic and covers the past five years, there were some missing fields, some inconsistencies, and some redundant data. Cleaning the dataset took 70% of the time spent on this project, after which the dataset was ready to be analyzed in SAS. Importing the dataset into SAS was the first step. We then analyzed the criminals first, as we wanted to build a profile of the criminals and look for any patterns. The most common nationality among the criminals was Indian. We checked whether other nationalities ranked higher in particular years, but in all five years the analysis showed Indian as the most common nationality. We then examined the criminals’ education level; the most common category was “unemployed”, meaning they do not hold any degree that supports them. The education-level analysis was very interesting: although university degrees were not the most common education level, a portion of the criminals hold advanced degrees such as PhDs and Master’s degrees, which shows that the stereotype that uneducated people are bad, or are the only people who commit crimes, should be disregarded. Next, we analyzed the criminals’ age groups, and the outcome was that the 30–45 age group commits the most crimes in Dubai. Finally, we analyzed the criminals’ gender, and the outcome showed that men commit the most crimes in Dubai. 
After analyzing the criminals’ profiles, we moved on to analyzing the crimes of the past five years. We first analyzed the type of crime to find which crime was most commonly committed in Dubai over that period. Fraud was the most common crime, which was not a great surprise to us since Dubai is considered a business city and attracts people who come to do business. Dubai has always been interested in building the city financially in the best legal way possible; however, there will always be people who see it as a city in which to commit fraud, since it has a large population and many tourists. Next, we analyzed crime reporting per year. 2019, right before the pandemic, had the highest level of crime reporting in Dubai. We analyzed which police stations received the most reports in the past five years in order to identify the locations that appeal most to criminals. This analysis is very important since every area has a police station assigned to it; the outcome was that Bur Dubai police station had the highest number of incidents in the last five years. Lastly, we analyzed the time at which crimes were committed. The result was that most crimes were committed in the morning, between 9 AM and 11 AM, which we found very surprising and interesting because it is known globally that most crimes are committed at night, in the dark, where no one can see the criminal. However, this depends on the type of crime as well: since fraud is the most commonly committed crime, the morning is the best time to commit it, because people are awake and willing to do business with others, whether online or offline. 
Finally, the purpose of this whole project is to forecast crime rates; thus, we built a forecasting model in SAS, which showed that in the upcoming years crime rates in Dubai will decrease dramatically, based on the pattern of crimes in the historical data. This is a positive result; however, it does not mean that Dubai Police should neglect surveillance and monitoring of the city, as the forecast is not always accurate.

    Search Rank Fraud Prevention in Online Systems

    The survival of products in online services such as Google Play, Yelp, Facebook and Amazon, is contingent on their search rank. This, along with the social impact of such services, has also turned them into a lucrative medium for fraudulently influencing public opinion. Motivated by the need to aggressively promote products, communities that specialize in social network fraud (e.g., fake opinions and reviews, likes, followers, app installs) have emerged, to create a black market for fraudulent search optimization. Fraudulent product developers exploit these communities to hire teams of workers willing and able to commit fraud collectively, emulating realistic, spontaneous activities from unrelated people. We call this behavior “search rank fraud”. In this dissertation, we argue that fraud needs to be proactively discouraged and prevented, instead of only reactively detected and filtered. We introduce two novel approaches to discourage search rank fraud in online systems. First, we detect fraud in real-time, when it is posted, and impose resource consuming penalties on the devices that post activities. We introduce and leverage several novel concepts that include (i) stateless, verifiable computational puzzles that impose minimal performance overhead, but enable the efficient verification of their authenticity, (ii) a real-time, graph based solution to assign fraud scores to user activities, and (iii) mechanisms to dynamically adjust puzzle difficulty levels based on fraud scores and the computational capabilities of devices. In a second approach, we introduce the problem of fraud de-anonymization: reveal the crowdsourcing site accounts of the people who post large amounts of fraud, thus their bank accounts, and provide compelling evidence of fraud to the users of products that they promote. We investigate the ability of our solutions to ensure that fraud does not pay off
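
The combination of stateless puzzles and fraud-score-based difficulty described in the abstract above can be sketched as follows. This is not the dissertation's construction; it assumes a hashcash-style proof of work, an HMAC tag so the server need not store issued challenges, and a linear mapping from a fraud score in [0, 1] to a number of leading zero bits. `SERVER_KEY`, the bit range, and all function names are hypothetical.

```python
import hashlib
import hmac

SERVER_KEY = b"demo-key"  # hypothetical server secret, for illustration only


def difficulty_for(fraud_score, min_bits=4, max_bits=16):
    """Map a fraud score in [0, 1] to a required number of leading zero bits."""
    score = min(max(fraud_score, 0.0), 1.0)
    return min_bits + round(score * (max_bits - min_bits))


def issue(activity_id, fraud_score):
    """Create a challenge; the HMAC tag makes it verifiable without stored state."""
    bits = difficulty_for(fraud_score)
    msg = f"{activity_id}:{bits}".encode()
    tag = hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()
    return activity_id, bits, tag


def leading_zero_bits(digest):
    """Count the leading zero bits of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def solve(activity_id, bits):
    """Client side: brute-force a nonce; expected cost doubles per extra bit."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{activity_id}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return nonce
        nonce += 1


def verify(activity_id, bits, tag, nonce):
    """Server side: check the tag first, then the proof of work (one hash)."""
    msg = f"{activity_id}:{bits}".encode()
    expected = hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return False
    digest = hashlib.sha256(f"{activity_id}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= bits
```

Verification costs a single HMAC and a single hash, while solving costs roughly 2^bits hashes on average, which is the asymmetry that makes posting suspicious activities more expensive than checking them.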