1,310 research outputs found

    VEWS: A Wikipedia Vandal Early Warning System

    We study the problem of detecting vandals on Wikipedia before any human or known vandalism detection system reports flagging potential vandals so that such users can be presented early to Wikipedia administrators. We leverage multiple classical ML approaches, but develop 3 novel sets of features. Our Wikipedia Vandal Behavior (WVB) approach uses a novel set of user editing patterns as features to classify some users as vandals. Our Wikipedia Transition Probability Matrix (WTPM) approach uses a set of features derived from a transition probability matrix and then reduces it via a neural net auto-encoder to classify some users as vandals. The VEWS approach merges the previous two approaches. Without using any information (e.g. reverts) provided by other users, these algorithms each have over 85% classification accuracy. Moreover, when temporal recency is considered, accuracy goes to almost 90%. We carry out detailed experiments on a new data set we have created consisting of about 33K Wikipedia users (including both a black list and a white list of editors) and containing 770K edits. We describe specific behaviors that distinguish between vandals and non-vandals. We show that VEWS beats ClueBot NG and STiki, the best known algorithms today for vandalism detection. Moreover, VEWS detects far more vandals than ClueBot NG and on average, detects them 2.39 edits before ClueBot NG when both detect the vandal. However, we show that the combination of VEWS and ClueBot NG can give a fully automated vandal early warning system with even higher accuracy. Comment: To appear in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2015).
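    To illustrate the idea of deriving features from a transition probability matrix, here is a minimal sketch. The edit categories (`new_page`, `re_edit`, `meta_page`) and the function name are hypothetical and are not taken from the paper; the auto-encoder reduction step is not shown.

```python
from collections import Counter


def transition_matrix(sequence, states):
    """Return row-normalised transition frequencies between consecutive states."""
    # Count every adjacent pair (previous state, next state) in the sequence.
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        # A state that is never left gets an all-zero row.
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0) for b in states
        }
    return matrix


# Hypothetical edit categories for one user, in chronological order.
states = ["new_page", "re_edit", "meta_page"]
edits = ["new_page", "re_edit", "re_edit", "new_page", "meta_page"]

tpm = transition_matrix(edits, states)
# Flatten the matrix row by row into a fixed-length feature vector.
features = [tpm[a][b] for a in states for b in states]
```

    A flattened vector like `features` could then be passed to any standard classifier; which states and which classifier to use are design choices this sketch leaves open.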

    Deception Detection in Videos

    We present a system for covert automated deception detection in real-life courtroom trial videos. We study the importance of different modalities like vision, audio and text for this task. On the vision side, our system uses classifiers trained on low level video features which predict human micro-expressions. We show that predictions of high-level micro-expressions can be used as features for deception prediction. Surprisingly, IDT (Improved Dense Trajectory) features, which have been widely used for action recognition, are also very good at predicting deception in videos. We fuse the score of classifiers trained on IDT features and high-level micro-expressions to improve performance. MFCC (Mel-frequency Cepstral Coefficients) features from the audio domain also provide a significant boost in performance, while information from transcripts is not very beneficial for our system. Using various classifiers, our automated system obtains an AUC of 0.877 (10-fold cross-validation) when evaluated on subjects who were not part of the training set. Even though state-of-the-art methods use human annotations of micro-expressions for deception detection, our fully automated approach outperforms them by 5%. When combined with human annotations of micro-expressions, our AUC improves to 0.922. We also present results of a user study analyzing how well average humans perform on this task, what modalities they use for deception detection and how they perform if only one modality is accessible. Our project page can be found at https://doubaibai.github.io/DARE/. Comment: AAAI 2018.
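    Fusing classifier scores can be sketched as a weighted average of per-classifier outputs. This is only one simple form of late fusion, assumed here for illustration; the abstract does not state the fusion rule, and the scores and weights below are made up.

```python
def fuse_scores(scores, weights):
    """Combine per-classifier scores into one score by weighted averaging."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


# Hypothetical deception scores from three classifiers for one video,
# e.g. one per feature type, with the third trusted twice as much.
fused = fuse_scores([0.8, 0.6, 0.7], [1.0, 1.0, 2.0])
```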

    Abduction in Annotated Probabilistic Temporal Logic

    Annotated Probabilistic Temporal (APT) logic programs are a form of logic programs that allow users to state (or systems to automatically learn) rules of the form "formula G becomes true K time units after formula F became true with L to U% probability." In this paper, we develop a theory of abduction for APT logic programs. Specifically, given an APT logic program Π, a set of formulas H that can be "added" to Π, and a goal G, is there a subset S of H such that Π ∪ S is consistent and entails the goal G? We study the complexity of the Basic APT Abduction Problem (BAAP). We then leverage a geometric characterization of BAAP to suggest a set of pruning strategies when solving BAAP and use these intuitions to develop a sound and complete algorithm.
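    The shape of the abduction question (find a subset S of H that, added to the program, stays consistent and entails G) can be sketched as a brute-force search. This is not the paper's algorithm and uses none of its pruning strategies; the `is_consistent` and `entails` oracles are hypothetical, and the toy semantics below is a stand-in, not APT logic.

```python
from itertools import combinations


def abduce(program, hypotheses, goal, is_consistent, entails):
    """Return a smallest subset S of hypotheses such that program | S is
    consistent and entails goal, or None if no subset works."""
    ordered = sorted(hypotheses)
    # Try subsets in order of increasing size (exponential in len(hypotheses)).
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            candidate = set(program) | set(subset)
            if is_consistent(candidate) and entails(candidate, goal):
                return set(subset)
    return None


# Toy stand-in semantics: a set of facts is inconsistent if it contains both
# "x" and "not x", and it entails a goal only if the goal is one of the facts.
def is_consistent(facts):
    return not any(("not " + f) in facts for f in facts)


def entails(facts, goal):
    return goal in facts


solution = abduce({"a"}, {"g", "not a"}, "g", is_consistent, entails)
no_solution = abduce({"a"}, {"not a"}, "b", is_consistent, entails)
```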

    Using Generalized Annotated Programs to Solve Social Network Optimization Problems

    Reasoning about social networks (labeled, directed, weighted graphs) is becoming increasingly important and there are now models of how certain phenomena (e.g. adoption of products/services by consumers, spread of a given disease) "diffuse" through the network. Some of these diffusion models can be expressed via generalized annotated programs (GAPs). In this paper, we consider the following problem: suppose we have a given goal to achieve (e.g. maximize the expected number of adoptees of a product or minimize the spread of a disease) and suppose we have limited resources to use in trying to achieve the goal (e.g. give out a few free plans, provide medication to key people in the social network). How should these resources be used so that we optimize a given objective function related to the goal? We define a class of social network optimization problems (SNOPs) that supports this type of reasoning. We formalize and study the complexity of SNOPs and show how they can be used in conjunction with existing economic and disease diffusion models.