
    Interactive Search and Exploration in Online Discussion Forums Using Multimodal Embeddings

    In this paper, we present a novel interactive multimodal learning system that facilitates search and exploration in large networks of social multimedia users. It allows the analyst to identify and select users of interest, and to find similar users in an interactive learning setting. Our approach is based on novel multimodal representations of users, words and concepts, which we learn simultaneously by deploying a general-purpose neural embedding model. We show these representations to be useful not only for categorizing users, but also for automatically generating user and community profiles. Inspired by traditional summarization approaches, we create the profiles by selecting diverse and representative content from all available modalities, i.e., the text, image and user modalities. The usefulness of the approach is evaluated using artificial actors, which simulate user behavior in a relevance feedback scenario. Multiple experiments were conducted to evaluate the quality of our multimodal representations, to compare different embedding strategies, and to determine the importance of different modalities. We demonstrate the capabilities of the proposed approach on two different multimedia collections originating from the violent online extremism forum Stormfront and the microblogging platform Twitter, which are particularly interesting due to the high semantic level of the discussions they feature.

    Occupational Fraud Detection Through Visualization

    Occupational fraud affects many companies worldwide, causing them economic loss and liability issues towards their customers and other involved entities. Detecting internal fraud in a company requires significant effort, and unfortunately such fraud cannot be entirely prevented. Internal auditors have to process a huge amount of data produced by diverse systems, in most cases in textual form, with little automated support. In this paper, we exploit the advantages of information visualization and present a system that aims to detect occupational fraud in systems which involve a pair of entities (e.g., an employee and a client) and periodic activity. The main visualization is based on a spiral on which events are drawn according to their timestamps. Events are considered suspicious when they appear along the same radius or on close radii of the spiral. Before producing the visualization, the system ranks both involved entities according to the specifications of the internal auditor and generates a video file of the activity such that events with strong evidence of fraud appear first in the video. The system is also equipped with several other visualizations and mechanisms in order to meet the requirements of an internal fraud detection system.
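The spiral layout described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an Archimedean spiral with one turn per period, reads "same radius" as "same angular direction from the centre", and the names `spiral_position`, `angular_gap`, `aligned_pairs`, `spacing` and `tolerance` are hypothetical.

```python
import math


def spiral_position(t, period, spacing=1.0):
    """Map a timestamp to (angle, radius) on an Archimedean spiral.

    Assumption: one full turn of the spiral covers one period, so events
    that recur at the same phase of the period share the same angle.
    """
    turns = t / period
    angle = 2 * math.pi * (turns % 1.0)  # phase within the current turn
    radius = spacing * turns             # distance from centre grows with time
    return angle, radius


def angular_gap(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def aligned_pairs(timestamps, period, tolerance):
    """Return pairs of timestamps whose angles differ by at most `tolerance`."""
    angles = [spiral_position(t, period)[0] for t in timestamps]
    pairs = []
    for i in range(len(timestamps)):
        for j in range(i + 1, len(timestamps)):
            if angular_gap(angles[i], angles[j]) <= tolerance:
                pairs.append((timestamps[i], timestamps[j]))
    return pairs


if __name__ == "__main__":
    # Timestamps in days with a weekly period (illustrative values only).
    print(aligned_pairs([1, 8, 15, 4.5], period=7, tolerance=0.1))
```

With a weekly period, the events at days 1, 8 and 15 fall on the same ray of the spiral and are reported as aligned, while the event at day 4.5 is not.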

    TimeMachine: Timeline Generation for Knowledge-Base Entities

    We present a method called TIMEMACHINE to generate a timeline of events and relations for entities in a knowledge base. For example, for an actor, such a timeline should show the most important professional and personal milestones and relationships, such as works, awards, collaborations, and family relationships. We develop three orthogonal timeline quality criteria that an ideal timeline should satisfy: (1) it shows events that are relevant to the entity; (2) it shows events that are temporally diverse, so they distribute along the time axis, avoiding visual crowding and allowing for easy user interaction, such as zooming in and out; and (3) it shows events that are content diverse, so they contain many different types of events (e.g., for an actor, it should show movies and marriages and awards, not just movies). We present an algorithm to generate such timelines for a given time period and screen size, based on submodular optimization and web co-occurrence statistics, with provable performance guarantees. A series of user studies using Mechanical Turk shows that all three quality criteria are crucial to produce quality timelines and that our algorithm significantly outperforms various baseline and state-of-the-art methods.
    Comment: To appear at ACM SIGKDD KDD'15. 12pp, 7 fig. With appendix. Demo and other info available at http://cs.stanford.edu/~althoff/timemachine
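As a rough illustration of how submodular optimization can trade off relevance against temporal and content diversity, the sketch below runs the standard greedy algorithm on a simple objective. The objective is an assumption made for this example and is not the one defined in the TimeMachine paper: it sums the square root of the accumulated relevance per event type and per time bucket, so repeated picks of the same type or period yield diminishing returns. The names `objective`, `greedy_timeline` and `bucket_size`, and the sample events, are hypothetical.

```python
import math
from collections import defaultdict


def objective(events, bucket_size):
    """Assumed monotone submodular score for a set of (year, kind, relevance).

    The square root is concave, so adding more relevance to an already
    covered event type or time bucket helps less each time.
    """
    by_type = defaultdict(float)
    by_bucket = defaultdict(float)
    for year, kind, relevance in events:
        by_type[kind] += relevance
        by_bucket[year // bucket_size] += relevance
    return (sum(math.sqrt(v) for v in by_type.values())
            + sum(math.sqrt(v) for v in by_bucket.values()))


def greedy_timeline(candidates, k, bucket_size=5):
    """Greedily pick up to k events, each time taking the largest marginal gain."""
    chosen = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        base = objective(chosen, bucket_size)
        best = max(remaining,
                   key=lambda e: objective(chosen + [e], bucket_size) - base)
        chosen.append(best)
        remaining.remove(best)
    return sorted(chosen)  # chronological order for display


if __name__ == "__main__":
    # Made-up events for a fictional actor: (year, kind, relevance).
    events = [
        (2000, "movie", 9.0),
        (2001, "movie", 8.0),
        (2002, "movie", 7.0),
        (2010, "award", 4.0),
        (2015, "marriage", 3.0),
    ]
    for event in greedy_timeline(events, k=3):
        print(event)
```

On this toy input the greedy selection keeps the most relevant movie but then prefers the award and the marriage over the other two movies, even though those movies have higher individual relevance. For monotone submodular objectives under a cardinality constraint, greedy selection is classically known to achieve at least a (1 - 1/e) fraction of the optimum; the guarantee proved in the paper may take a different form.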