493 research outputs found

    Interactive Search and Exploration in Online Discussion Forums Using Multimodal Embeddings

    In this paper we present a novel interactive multimodal learning system, which facilitates search and exploration in large networks of social multimedia users. It allows the analyst to identify and select users of interest, and to find similar users in an interactive learning setting. Our approach is based on novel multimodal representations of users, words and concepts, which we simultaneously learn by deploying a general-purpose neural embedding model. We show these representations to be useful not only for categorizing users, but also for automatically generating user and community profiles. Inspired by traditional summarization approaches, we create the profiles by selecting diverse and representative content from all available modalities, i.e., the text, image and user modalities. The usefulness of the approach is evaluated using artificial actors, which simulate user behavior in a relevance feedback scenario. Multiple experiments were conducted in order to evaluate the quality of our multimodal representations, to compare different embedding strategies, and to determine the importance of different modalities. We demonstrate the capabilities of the proposed approach on two different multimedia collections originating from the violent online extremism forum Stormfront and the microblogging platform Twitter, which are particularly interesting due to the high semantic level of the discussions they feature.

    Research on multi-modal sentiment feature learning of social media content

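    Social media has become a main platform for public discourse and information transmission in modern society. Sentiment analysis of social media therefore has great application value in areas such as public opinion monitoring, commercial product guidance, and stock market prediction. However, the multimodal nature of social media content (text, images, etc.) imposes many limitations on traditional single-modality sentiment analysis methods, and multimodal sentiment analysis has significant theoretical value for understanding and analyzing cross-media content. The key question that distinguishes multimodal sentiment analysis from single-modality methods is how to combine heterogeneous multimodal sentiment information to obtain the overall sentiment polarity while also accounting for how each individual modality expresses sentiment. To address this question, the thesis exploits the correlation and hierarchical abstraction that multimodal social media content exhibits in expressing sentiment, and proposes a set of multimodal sentiment feature learning and fusion methods for social media to perform multimodal sentiment analysis. The main contents and contributions... Degree: Master of Engineering. Department and major: School of Information Science and Technology, Pattern Recognition and Intelligent Systems. Student ID: 3152013115327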

    Context-Preserving Visual Analytics of Multi-Scale Spatial Aggregation.

    Spatial datasets (e.g., location-based social media, crime incident reports, and demographic data) often exhibit varied distribution patterns at multiple spatial scales. Examining these patterns across different scales enhances the understanding from global to local perspectives and offers new insights into the nature of various spatial phenomena. Conventional navigation techniques in such multi-scale data-rich spaces are often inefficient, require users to choose between an overview or detailed information, and do not support identifying spatial patterns at varying scales. In this work, we present a context-preserving visual analytics technique that aggregates spatial datasets into hierarchical clusters and visualizes the multi-scale aggregates in a single visual space. We design a boundary distortion algorithm to minimize the visual clutter caused by overlapping aggregates and explore visual encoding strategies including color, transparency, shading, and shapes, in order to illustrate the hierarchical and statistical patterns of the multi-scale aggregates. We also propose a transparency-based technique that maintains a smooth visual transition as the users navigate across adjacent scales. To further support effective semantic exploration in the multi-scale space, we design a set of text-based encoding and layout methods that place textual labels along the boundaries of the aggregates or fill their interiors. The text itself not only summarizes the semantics at each scale, but also indicates the spatial coverage of the aggregates and their hierarchical relationships. We demonstrate the effectiveness of the proposed approaches through real-world application examples and user studies.

    The role of geographic knowledge in sub-city level geolocation algorithms

    Geolocation of microblog messages has been widely investigated in the literature. Many solutions have been proposed that achieve good results at the city level. Existing approaches are mainly data-driven (i.e., they rely on a training phase). However, the development of algorithms for geolocation at sub-city level is still an open problem, partly due to the absence of good training datasets. In this thesis, we investigate the role that external geographic knowledge can play in geolocation approaches. We show how different geographical data sources can be combined with a semantic layer to achieve reasonably accurate sub-city level geolocation. Moreover, we propose a knowledge-based method, called Sherloc, to accurately geolocate messages at sub-city level by exploiting the presence in the message of toponyms possibly referring to specific places in the target geographical area. Sherloc exploits the semantics associated with toponyms contained in gazetteers and embeds them into a metric space that captures the semantic distance among them. This allows toponyms to be represented as points and indexed by a spatial access method, allowing us to identify the terms that are semantically closest to a microblog message and that also form a cluster with respect to their spatial locations. In contrast to state-of-the-art methods, Sherloc requires no prior training and is not limited to geolocating on a fixed spatial grid, and experiments demonstrate its ability to infer the location at sub-city level with higher accuracy.

    Efficient pruning of large knowledge graphs

    In this paper we present an efficient and highly accurate algorithm to prune noisy or over-ambiguous knowledge graphs, given as input an extensional definition of a domain of interest, namely a set of instances or concepts. Our method climbs the graph in a bottom-up fashion, iteratively layering the graph and pruning nodes and edges in each layer while not compromising the connectivity of the set of input nodes. Iterative layering and protection of pre-defined nodes allow us to extract semantically coherent DAG structures from noisy or over-ambiguous cyclic graphs, without loss of information and without incurring computational bottlenecks, which are the main problem of state-of-the-art methods for cleaning large, i.e., Web-scale, knowledge graphs. We apply our algorithm to the tasks of pruning automatically acquired taxonomies using benchmarking data from a SemEval evaluation exercise, as well as the extraction of a domain-adapted taxonomy from the Wikipedia category hierarchy. The results show the superiority of our approach over state-of-the-art algorithms in terms of both output quality and computational efficiency.