83 research outputs found

    Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

    Get PDF
    Where previous reviews on content-based image retrieval emphasize on what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems, i.e., image tag assignment, refinement, and tag-based image retrieval is presented. While existing works vary in terms of their targeted tasks and methodology, they rely on the key functionality of tag relevance, i.e. estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how such information is exploited, this paper introduces a taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and difference, and recognize their merits and limitations. For a head-to-head comparison between the state-of-the-art, a new experimental protocol is presented, with training sets containing 10k, 100k and 1m images and an evaluation on three test sets, contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and foster progress for the near future.Comment: to appear in ACM Computing Survey

    Recommending Structured Objects: Paths and Sets

    Get PDF
    Recommender systems have been widely adopted in industry to help people find the most appropriate items to purchase or consume from the increasingly large collection of available resources (e.g., books, songs and movies). Conventional recommendation techniques follow the approach of ``ranking all possible options and pick the top'', which can work effectively for single item recommendation but fall short when the item in question has internal structures. For example, a travel trajectory with a sequence of points-of-interest or a music playlist with a set of songs. Such structured objects pose critical challenges to recommender systems due to the intractability of ranking all possible candidates. This thesis study the problem of recommending structured objects, in particular, the recommendation of path (a sequence of unique elements) and set (a collection of distinct elements). We study the problem of recommending travel trajectories in a city, which is a typical instance of path recommendation. We propose methods that combine learning to rank and route planning techniques for efficient trajectory recommendation. Another contribution of this thesis is to develop the structured recommendation approach for path recommendation by substantially modifying the loss function, the learning and inference procedures of structured support vector machines. A novel application of path decoding techniques helps us achieve efficient learning and recommendation. Additionally, we investigate the problem of recommending a set of songs to form a playlist as an example of the set recommendation problem. We propose to jointly learn user representations by employing the multi-task learning paradigm, and a key result of equivalence between bipartite ranking and binary classification enables efficient learning of our set recommendation method. Extensive evaluations on real world datasets demonstrate the effectiveness of our proposed approaches for path and set recommendation

    Sensing Human Sentiment via Social Media Images: Methodologies and Applications

    Get PDF
    abstract: Social media refers computer-based technology that allows the sharing of information and building the virtual networks and communities. With the development of internet based services and applications, user can engage with social media via computer and smart mobile devices. In recent years, social media has taken the form of different activities such as social network, business network, text sharing, photo sharing, blogging, etc. With the increasing popularity of social media, it has accumulated a large amount of data which enables understanding the human behavior possible. Compared with traditional survey based methods, the analysis of social media provides us a golden opportunity to understand individuals at scale and in turn allows us to design better services that can tailor to individuals’ needs. From this perspective, we can view social media as sensors, which provides online signals from a virtual world that has no geographical boundaries for the real world individual's activity. One of the key features for social media is social, where social media users actively interact to each via generating content and expressing the opinions, such as post and comment in Facebook. As a result, sentiment analysis, which refers a computational model to identify, extract or characterize subjective information expressed in a given piece of text, has successfully employs user signals and brings many real world applications in different domains such as e-commerce, politics, marketing, etc. The goal of sentiment analysis is to classify a user’s attitude towards various topics into positive, negative or neutral categories based on textual data in social media. However, recently, there is an increasing number of people start to use photos to express their daily life on social media platforms like Flickr and Instagram. Therefore, analyzing the sentiment from visual data is poise to have great improvement for user understanding. In this dissertation, I study the problem of understanding human sentiments from large scale collection of social images based on both image features and contextual social network features. We show that neither visual features nor the textual features are by themselves sufficient for accurate sentiment prediction. Therefore, we provide a way of using both of them, and formulate sentiment prediction problem in two scenarios: supervised and unsupervised. We first show that the proposed framework has flexibility to incorporate multiple modalities of information and has the capability to learn from heterogeneous features jointly with sufficient training data. Secondly, we observe that negative sentiment may related to human mental health issues. Based on this observation, we aim to understand the negative social media posts, especially the post related to depression e.g., self-harm content. Our analysis, the first of its kind, reveals a number of important findings. Thirdly, we extend the proposed sentiment prediction task to a general multi-label visual recognition task to demonstrate the methodology flexibility behind our sentiment analysis model.Dissertation/ThesisDoctoral Dissertation Computer Science 201

    Image Understanding by Socializing the Semantic Gap

    Get PDF
    Several technological developments like the Internet, mobile devices and Social Networks have spurred the sharing of images in unprecedented volumes, making tagging and commenting a common habit. Despite the recent progress in image analysis, the problem of Semantic Gap still hinders machines in fully understand the rich semantic of a shared photo. In this book, we tackle this problem by exploiting social network contributions. A comprehensive treatise of three linked problems on image annotation is presented, with a novel experimental protocol used to test eleven state-of-the-art methods. Three novel approaches to annotate, under stand the sentiment and predict the popularity of an image are presented. We conclude with the many challenges and opportunities ahead for the multimedia community

    Classification with Large Sparse Datasets: Convergence Analysis and Scalable Algorithms

    Get PDF
    Large and sparse datasets, such as user ratings over a large collection of items, are common in the big data era. Many applications need to classify the users or items based on the high-dimensional and sparse data vectors, e.g., to predict the profitability of a product or the age group of a user, etc. Linear classifiers are popular choices for classifying such datasets because of their efficiency. In order to classify the large sparse data more effectively, the following important questions need to be answered. 1. Sparse data and convergence behavior. How different properties of a dataset, such as the sparsity rate and the mechanism of missing data systematically affect convergence behavior of classification? 2. Handling sparse data with non-linear model. How to efficiently learn non-linear data structures when classifying large sparse data? This thesis attempts to address these questions with empirical and theoretical analysis on large and sparse datasets. We begin by studying the convergence behavior of popular classifiers on large and sparse data. It is known that a classifier gains better generalization ability after learning more and more training examples. Eventually, it will converge to the best generalization performance with respect to a given data distribution. In this thesis, we focus on how the sparsity rate and the missing data mechanism systematically affect such convergence behavior. Our study covers different types of classification models, including generative classifier and discriminative linear classifiers. To systematically explore the convergence behaviors, we use synthetic data sampled from statistical models of real-world large sparse datasets. We consider different types of missing data mechanisms that are common in practice. From the experiments, we have several useful observations about the convergence behavior of classifying large sparse data. Based on these observations, we further investigate the theoretical reasons and come to a series of useful conclusions. For better applicability, we provide practical guidelines for applying our results in practice. Our study helps to answer whether obtaining more data or missing values in the data is worthwhile in different situations, which is useful for efficient data collection and preparation. Despite being efficient, linear classifiers cannot learn the non-linear structures such as the low-rankness in a dataset. As a result, its accuracy may suffer. Meanwhile, most non-linear methods such as the kernel machines cannot scale to very large and high-dimensional datasets. The third part of this thesis studies how to efficiently learn non-linear structures in large sparse data. Towards this goal, we develop novel scalable feature mappings that can achieve better accuracy than linear classification. We demonstrate that the proposed methods not only outperform linear classification but is also scalable to large and sparse datasets with moderate memory and computation requirement. The main contribution of this thesis is to answer important questions on classifying large and sparse datasets. On the one hand, we study the convergence behavior of widely used classifiers under different missing data mechanisms; on the other hand, we develop efficient methods to learn the non-linear structures in large sparse data and improve classification accuracy. Overall, the thesis not only provides practical guidance for the convergence behavior of classifying large sparse datasets, but also develops highly efficient algorithms for classifying large sparse datasets in practice

    Network Representation Learning: A Survey

    Full text link
    With the widespread use of information technologies, information networks are becoming increasingly popular to capture complex relationships across various disciplines, such as social networks, citation networks, telecommunication networks, and biological networks. Analyzing these networks sheds light on different aspects of social life such as the structure of societies, information diffusion, and communication patterns. In reality, however, the large scale of information networks often makes network analytic tasks computationally expensive or intractable. Network representation learning has been recently proposed as a new learning paradigm to embed network vertices into a low-dimensional vector space, by preserving network topology structure, vertex content, and other side information. This facilitates the original network to be easily handled in the new vector space for further analysis. In this survey, we perform a comprehensive review of the current literature on network representation learning in the data mining and machine learning field. We propose new taxonomies to categorize and summarize the state-of-the-art network representation learning techniques according to the underlying learning mechanisms, the network information intended to preserve, as well as the algorithmic designs and methodologies. We summarize evaluation protocols used for validating network representation learning including published benchmark datasets, evaluation methods, and open source algorithms. We also perform empirical studies to compare the performance of representative algorithms on common datasets, and analyze their computational complexity. Finally, we suggest promising research directions to facilitate future study.Comment: Accepted by IEEE transactions on Big Data; 25 pages, 10 tables, 6 figures and 127 reference

    Learning Explainable User Sentiment and Preferences for Information Filtering

    Get PDF
    In the last decade, online social networks have enabled people to interact in many ways with each other and with content. The digital traces of such actions reveal people's preferences towards online content such as news or products. These traces often result from interactions such as sharing or liking, but also from interactions in natural language. The continuous growth of the amount of content and of digital traces has led to information overload: surrounded by large volumes of information, people are facing difficulties when searching for information relevant to their interests. To improve user experience, information systems must be able to assist users in achieving their search goals, effectively and efficiently. This thesis is concerned with two important challenges that information systems need to address in order to significantly improve search experience and overcome information overload. First, these systems need to model accurately the variety of user traces, and second, they need to meaningfully explain search results and recommendations to users. To address these challenges, this thesis proposes novel methods based on machine learning to model user sentiment and preferences for information filtering systems, which are effective, scalable, and easily interpretable by humans. We focus on two prominent types of user traces in social networks: on the one hand, user comments accompanied by unary preferences such as likes, and on the other hand, user reviews accompanied by numerical preferences such as star ratings. In both cases, we advocate that by better understanding user text through mining its semantics and modeling its structure, we can not only improve information filtering, but also explain predictions to users. Within this context, we aim to answer three main research questions, namely: (i)~how do item semantics help to predict unary preferences; (ii)~how do sentiments of free-form user texts help to predict unary preferences; and (iii)~how to model fine-grained numerical preferences from user review texts. Our goal is to model and extract from user text the knowledge required to answer these questions, and to obtain insights on how to design better information filtering systems that are more effective and improve user experience. To answer the first question, we formulate the recommendation problem based on unary preferences as a top-N retrieval task and we define an appropriate dataset and metrics for measuring performance. Then, we propose and evaluate several content-based methods based on semantic similarities under presence or absence of preferences. To answer the second question, we propose a sentiment-aware neighborhood model which integrates the sentiment of user comments with unary preferences, either through fixed or through learned mapping functions. For the latter type, we propose a learning algorithm which adapts the sentiment of user comments to unary preferences at collective or individual levels. To answer the third question, we cast the problem of modeling user attitude toward aspects of items as a weakly supervised problem, and we propose a weighted multiple-instance learning method for solving it. Lastly, we show that the learned saliency weights, apart from being easily interpretable, are useful indicators for review segmentation and summarization

    Extracting and Harnessing Interpretation in Data Mining

    Get PDF
    Machine learning, especially the recent deep learning technique, has aroused significant development to various data mining applications, including recommender systems, misinformation detection, outlier detection, and health informatics. Unfortunately, while complex models have achieved unprecedented prediction capability, they are often criticized as ``black boxes'' due to multiple layers of non-linear transformation and the hardly understandable working mechanism. To tackle the opacity issue, interpretable machine learning has attracted increasing attentions. Traditional interpretation methods mainly focus on explaining predictions of classification models with gradient based methods or local approximation methods. However, the natural characteristics of data mining applications are not considered, and the internal mechanisms of models are not fully explored. Meanwhile, it is unknown how to utilize interpretation to improve models. To bridge the gap, I developed a series of interpretation methods that gradually increase the transparency of data mining models. First, a fundamental goal of interpretation is providing the attribution of input features to model outputs. To adapt feature attribution to explaining outlier detection, I propose Contextual Outlier Interpretation (COIN). Second, to overcome the limitation of attribution methods that do not explain internal information inside models, I further propose representation interpretation methods to extract knowledge as a taxonomy. However, these post-hoc methods may suffer from interpretation accuracy and the inability to directly control model training process. Therefore, I propose an interpretable network embedding framework to explicitly control the meaning of latent dimensions. Finally, besides obtaining explanation, I propose to use interpretation to discover the vulnerability of models in adversarial circumstances, and then actively prepare models using adversarial training to improve their robustness against potential threats. My research of interpretable machine learning enables data scientists to better understand their models and discover defects for further improvement, as well as improves the experiences of customers who benefit from data mining systems. It broadly impacts fields such as Information Retrieval, Information Security, Social Computing, and Health Informatics
    • …
    corecore