Interactive Search and Exploration in Online Discussion Forums Using Multimodal Embeddings
In this paper we present a novel interactive multimodal learning system,
which facilitates search and exploration in large networks of social multimedia
users. It allows the analyst to identify and select users of interest, and to
find similar users in an interactive learning setting. Our approach is based on
novel multimodal representations of users, words and concepts, which we
simultaneously learn by deploying a general-purpose neural embedding model. We
show these representations to be useful not only for categorizing users, but
also for automatically generating user and community profiles. Inspired by
traditional summarization approaches, we create the profiles by selecting
diverse and representative content from all available modalities, i.e., the
text, image, and user modalities. The usefulness of the approach is evaluated
using artificial actors, which simulate user behavior in a relevance feedback
scenario. Multiple experiments were conducted in order to evaluate the quality
of our multimodal representations, to compare different embedding strategies,
and to determine the importance of different modalities. We demonstrate the
capabilities of the proposed approach on two different multimedia collections
originating from the violent online extremism forum Stormfront and the
microblogging platform Twitter, which are particularly interesting due to the
high semantic level of the discussions they feature.
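The "find similar users" step described above can be illustrated with a minimal sketch. It assumes each user is already represented by an embedding vector and ranks the other users by cosine similarity. The names `cosine`, `most_similar`, and `toy_embeddings`, and the toy vectors themselves, are hypothetical and are not taken from the paper, whose actual embedding model and feedback loop are not specified here.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(query_id, embeddings, k=2):
    """Return the ids of the k users whose embeddings are closest to the query user."""
    query_vec = embeddings[query_id]
    scored = [
        (cosine(query_vec, vec), uid)
        for uid, vec in embeddings.items()
        if uid != query_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [uid for _, uid in scored[:k]]

# Hypothetical 3-dimensional user embeddings, for illustration only.
toy_embeddings = {
    "user_a": [1.0, 0.0, 0.2],
    "user_b": [0.9, 0.1, 0.3],
    "user_c": [0.0, 1.0, 0.0],
    "user_d": [0.1, 0.9, 0.1],
}

print(most_similar("user_a", toy_embeddings, k=2))
```

In an interactive setting, the analyst would mark some returned users as relevant and the ranking would be recomputed. That loop is omitted here.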
Occupational Fraud Detection Through Visualization
Occupational fraud affects many companies worldwide, causing them economic
loss and liability issues toward their customers and other involved entities.
Detecting internal fraud in a company requires significant effort, and,
unfortunately, such fraud cannot be entirely prevented. Internal auditors have to
process a huge amount of data produced by diverse systems, which is in most
cases in textual form, with little automated support. In this paper, we exploit
the advantages of information visualization and present a system that aims to
detect occupational fraud in systems which involve a pair of entities (e.g., an
employee and a client) and periodic activity. The main visualization is based
on a spiral system on which the events are drawn appropriately according to
their time-stamp. Events are considered suspicious when they appear along the
same radius or on close radii of the spiral. Before producing the
visualization, the system ranks both involved entities according to the
specifications of the internal auditor and generates a video file of the
activity such that events with strong evidence of fraud appear first in the
video. The system is also equipped with several different visualizations and
mechanisms in order to meet the requirements of an internal fraud detection
system.
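The spiral layout described above can be sketched as follows. This assumes one full turn of the spiral corresponds to one activity period, so that events recurring at the same phase of the period fall along the same radial line. That reading of "same radius", the Archimedean spiral form, the function names, and the tolerance value are all assumptions for illustration and do not come from the paper.

```python
import math

def spiral_position(t, period, start_radius=1.0, growth=0.5):
    """Map a timestamp to (x, y, angle) on an Archimedean spiral, one turn per period."""
    turns = t / period
    theta = 2 * math.pi * turns
    r = start_radius + growth * turns  # radius grows linearly with elapsed turns
    return r * math.cos(theta), r * math.sin(theta), theta % (2 * math.pi)

def phase_gap(t1, t2, period):
    """Circular distance between two timestamps, as a fraction of the period."""
    d = abs(t1 - t2) % period
    return min(d, period - d) / period

def near_same_ray(timestamps, period, tolerance=0.02):
    """Return pairs of events that land on (almost) the same radial line."""
    pairs = []
    for i in range(len(timestamps)):
        for j in range(i + 1, len(timestamps)):
            if phase_gap(timestamps[i], timestamps[j], period) <= tolerance:
                pairs.append((timestamps[i], timestamps[j]))
    return pairs

# Hypothetical event times in days, with an assumed weekly period.
events = [1.0, 8.0, 15.1, 4.0]
print(near_same_ray(events, period=7))
```

Here the events at days 1.0, 8.0, and 15.1 recur at nearly the same point of the week and are flagged as pairs, while the event at day 4.0 is not.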
TimeMachine: Timeline Generation for Knowledge-Base Entities
We present a method called TIMEMACHINE to generate a timeline of events and
relations for entities in a knowledge base. For example, for an actor, such a
timeline should show the most important professional and personal milestones
and relationships such as works, awards, collaborations, and family
relationships. We develop three orthogonal timeline quality criteria that an
ideal timeline should satisfy: (1) it shows events that are relevant to the
entity; (2) it shows events that are temporally diverse, so they distribute
along the time axis, avoiding visual crowding and allowing for easy user
interaction, such as zooming in and out; and (3) it shows events that are
content diverse, so they contain many different types of events (e.g., for an
actor, it should show movies and marriages and awards, not just movies). We
present an algorithm to generate such timelines for a given time period and
screen size, based on submodular optimization and web-co-occurrence statistics
with provable performance guarantees. A series of user studies using Mechanical
Turk shows that all three quality criteria are crucial to produce quality
timelines and that our algorithm significantly outperforms various baseline and
state-of-the-art methods.
Comment: To appear at ACM SIGKDD KDD'15. 12pp, 7 fig. With appendix. Demo and
other info available at http://cs.stanford.edu/~althoff/timemachine
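To make the three criteria concrete, the following is a minimal sketch of greedy selection under a simple monotone submodular score: summed relevance, plus the number of distinct time buckets covered, plus the number of distinct event types covered. This is not the TimeMachine objective or algorithm. The score, the weights, the decade-sized buckets, and the toy events are all assumptions chosen for illustration. For scores of this kind, greedy selection under a size limit is known to achieve a (1 - 1/e) approximation.

```python
def timeline_score(selected, bucket_size=10, w_time=1.0, w_type=1.0):
    """Relevance sum plus coverage of time buckets and event types (assumed objective)."""
    relevance = sum(e["relevance"] for e in selected)
    buckets = {e["year"] // bucket_size for e in selected}  # temporal diversity
    types = {e["type"] for e in selected}                   # content diversity
    return relevance + w_time * len(buckets) + w_type * len(types)

def greedy_timeline(events, k):
    """Pick up to k events, each time adding the one with the largest marginal gain."""
    selected = []
    remaining = list(events)
    while remaining and len(selected) < k:
        base = timeline_score(selected)
        best = max(remaining, key=lambda e: timeline_score(selected + [e]) - base)
        selected.append(best)
        remaining.remove(best)
    return sorted(selected, key=lambda e: e["year"])

# Hypothetical events for an actor; relevance values are made up.
toy_events = [
    {"name": "Film A", "type": "movie", "year": 1995, "relevance": 0.9},
    {"name": "Film B", "type": "movie", "year": 1996, "relevance": 0.8},
    {"name": "Film C", "type": "movie", "year": 1997, "relevance": 0.7},
    {"name": "Award X", "type": "award", "year": 2004, "relevance": 0.6},
    {"name": "Marriage", "type": "marriage", "year": 2010, "relevance": 0.5},
]

for event in greedy_timeline(toy_events, k=3):
    print(event["year"], event["name"])
```

With these toy values, ranking by relevance alone would return three movies from the 1990s, whereas the diversity terms lead the greedy pass to choose one movie, one award, and one marriage spread across three decades.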