27 research outputs found

    Can we measure the beauty of an image?

    Get PDF

    Learning Explainable User Sentiment and Preferences for Information Filtering

    Get PDF
    In the last decade, online social networks have enabled people to interact in many ways with each other and with content. The digital traces of such actions reveal people's preferences towards online content such as news or products. These traces often result from interactions such as sharing or liking, but also from interactions in natural language. The continuous growth of the amount of content and of digital traces has led to information overload: surrounded by large volumes of information, people are facing difficulties when searching for information relevant to their interests. To improve user experience, information systems must be able to assist users in achieving their search goals, effectively and efficiently. This thesis is concerned with two important challenges that information systems need to address in order to significantly improve the search experience and overcome information overload. First, these systems need to model accurately the variety of user traces, and second, they need to meaningfully explain search results and recommendations to users. To address these challenges, this thesis proposes novel methods based on machine learning to model user sentiment and preferences for information filtering systems, which are effective, scalable, and easily interpretable by humans. We focus on two prominent types of user traces in social networks: on the one hand, user comments accompanied by unary preferences such as likes, and on the other hand, user reviews accompanied by numerical preferences such as star ratings. In both cases, we advocate that by better understanding user text through mining its semantics and modeling its structure, we can not only improve information filtering but also explain predictions to users. Within this context, we aim to answer three main research questions, namely: (i) how do item semantics help to predict unary preferences; (ii) how do sentiments of free-form user texts help to predict unary preferences; and (iii) how can fine-grained numerical preferences be modeled from user review texts. Our goal is to model and extract from user text the knowledge required to answer these questions, and to obtain insights on how to design better information filtering systems that are more effective and improve user experience. To answer the first question, we formulate the recommendation problem based on unary preferences as a top-N retrieval task, and we define an appropriate dataset and metrics for measuring performance. We then propose and evaluate several content-based methods based on semantic similarities in the presence or absence of preferences. To answer the second question, we propose a sentiment-aware neighborhood model which integrates the sentiment of user comments with unary preferences, either through fixed or through learned mapping functions. For the latter type, we propose a learning algorithm which adapts the sentiment of user comments to unary preferences at the collective or individual level. To answer the third question, we cast the problem of modeling user attitude toward aspects of items as a weakly supervised problem, and we propose a weighted multiple-instance learning method for solving it. Lastly, we show that the learned saliency weights, apart from being easily interpretable, are useful indicators for review segmentation and summarization.
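
    The sentiment-aware neighborhood model mentioned above is straightforward to sketch. The following Python fragment is a minimal illustration rather than the thesis's actual method: it assumes a precomputed item-item similarity matrix and a per-item comment-sentiment score, and uses a fixed linear blend (the alpha parameter, a placeholder here) in place of the fixed or learned mapping functions the abstract describes.

        import numpy as np

        def sentiment_aware_scores(likes, sentiment, item_sim, alpha=0.5):
            """Item-based neighborhood scoring for one user.

            likes:     numpy array (n_items,), 1 if the user liked the item, else 0
            sentiment: numpy array (n_items,) in [-1, 1], sentiment mined from the
                       user's comments (0 where no comment exists)
            item_sim:  numpy array (n_items, n_items) of item-item similarities
            alpha:     illustrative fixed weight blending likes with sentiment
            """
            # Map comment sentiment onto the same [0, 1] scale as unary preferences.
            mapped = (sentiment + 1.0) / 2.0
            # Blend unary preferences with sentiment-derived preferences.
            pref = alpha * likes + (1.0 - alpha) * mapped
            # Items similar to positively weighted items receive higher scores.
            scores = item_sim @ pref
            # Exclude items the user has already liked from the top-N list.
            scores[likes > 0] = -np.inf
            return scores

    Top-N recommendation then amounts to taking the N highest-scoring items, and because each score decomposes over neighboring items, the largest contributions double as a simple explanation of why an item was recommended.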

    Exploiting User Comments for Web Applications

    Get PDF
    Ph.D. (Doctor of Philosophy)

    A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4

    Full text link
    Large language models (LLMs) are a special class of pretrained language models obtained by scaling model size, pretraining corpus, and computation. Because of their large size and pretraining on large volumes of text data, LLMs exhibit special abilities which allow them to achieve remarkable performance on many natural language processing tasks without any task-specific training. The era of LLMs started with OpenAI's GPT-3 model, and the popularity of LLMs has grown rapidly since the introduction of models like ChatGPT and GPT-4. We refer to GPT-3 and its successor OpenAI models, including ChatGPT and GPT-4, as GPT-3 family large language models (GLLMs). With the ever-rising popularity of GLLMs, especially in the research community, there is a strong need for a comprehensive survey which summarizes recent research progress along multiple dimensions and can guide the research community with insightful future research directions. We start the survey with foundational concepts like transformers, transfer learning, self-supervised learning, pretrained language models and large language models. We then present a brief overview of GLLMs and discuss their performance in various downstream tasks, specific domains and multiple languages. We also discuss the data labelling and data augmentation abilities of GLLMs, the robustness of GLLMs, the effectiveness of GLLMs as evaluators, and finally conclude with multiple insightful future research directions. To summarize, this comprehensive survey will serve as a good resource for both academia and industry to stay updated on the latest research related to GPT-3 family large language models.
    Comment: Preprint under review, 58 pages
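
    One GLLM capability the survey examines, zero-shot data labelling, is easy to demonstrate. The snippet below is a minimal sketch using the OpenAI Python SDK (v1+); the model name, prompt wording and label set are illustrative assumptions, not details taken from the survey.

        from openai import OpenAI  # pip install openai; expects OPENAI_API_KEY in the environment

        client = OpenAI()

        def label_sentiment(text: str, model: str = "gpt-4") -> str:
            """Ask a GPT-3-family model for a zero-shot sentiment label (illustrative)."""
            prompt = (
                "Label the sentiment of the following review as exactly one of "
                "'positive', 'negative' or 'neutral'.\n\n"
                f"Review: {text}\nLabel:"
            )
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                temperature=0,  # deterministic output suits labelling
            )
            return resp.choices[0].message.content.strip().lower()

        print(label_sentiment("The battery died after two days. Very disappointed."))

    Labels obtained this way can serve as weak supervision for training smaller task-specific models.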

    Mining, analyzing and exploiting community feedback on the web

    Get PDF
    [no abstract]

    The role of context in image annotation and recommendation

    Get PDF
    With the rise of smartphones, lifelogging devices (e.g. Google Glass) and the popularity of image sharing websites (e.g. Flickr), users are capturing and sharing every aspect of their lives online, producing a wealth of visual content. Of these uploaded images, the majority are poorly annotated or exist in complete semantic isolation, making the process of building retrieval systems difficult, as one must first understand the meaning of an image in order to retrieve it. To alleviate this problem, many image sharing websites offer manual annotation tools which allow users to “tag” their photos; however, these techniques are laborious and as a result have been poorly adopted: Sigurbjörnsson and van Zwol (2008) showed that 64% of images uploaded to Flickr are annotated with fewer than 4 tags. Due to this, an entire body of research has focused on the automatic annotation of images (Hanbury, 2008; Smeulders et al., 2000; Zhang et al., 2012a), where one attempts to bridge the semantic gap between an image’s appearance and its meaning, e.g. the objects present. Despite two decades of research the semantic gap still largely exists, and as a result automatic annotation models often offer performance that is unsatisfactory for industrial deployment. Further, these techniques can only annotate what they see, thus ignoring the “bigger picture” surrounding an image (e.g. its location, the event, the people present, etc.). Much work has therefore focused on building photo tag recommendation (PTR) methods which aid the user in the annotation process by suggesting tags related to those already present. These works have mainly focused on computing relationships between tags based on historical images, e.g. that NY and timessquare co-occur in many images and are therefore highly correlated. However, tags are inherently noisy, sparse and ill-defined, often resulting in poor PTR accuracy: does NY refer to New York or New Year? This thesis proposes exploiting an image’s context, which, unlike textual evidence, is always present, in order to alleviate this ambiguity in the tag recommendation process. Specifically, we exploit the “what, who, where, when and how” of the image capture process to complement textual evidence in various photo tag recommendation and retrieval scenarios. In Part II, we combine textual, content-based (e.g. number of faces present) and contextual (e.g. day of the week taken) signals for tag recommendation, achieving up to a 75% improvement in precision@5 over a text-only TF-IDF baseline. We then consider external knowledge sources (i.e. Wikipedia and Twitter) as alternatives to the slower-moving Flickr on which to build recommendation models, showing that similar accuracy can be achieved on these faster-moving, yet entirely textual, datasets. In Part II we also highlight the merits of diversifying tag recommendation lists, before discussing at length various problems with existing automatic image annotation and photo tag recommendation evaluation collections. In Part III, we propose three new image retrieval scenarios, namely “visual event summarisation”, “image popularity prediction” and “lifelog summarisation”. In the first scenario, we attempt to produce a ranking of relevant and diverse images for various news events by (i) removing irrelevant images such as memes and visual duplicates, and (ii) semantically clustering images based on the tweets in which they were originally posted. Using this approach, we achieve over 50% precision for images in the top 5 ranks. In the second scenario, we show that by combining an image’s contextual and content-based features, an SVM classifier can predict whether it will become “popular” with 74% accuracy. Finally, in Chapter 9 we employ blur detection and perceptual-hash clustering to remove noisy images from lifelogs, before combining visual and geo-temporal signals to capture a user’s “key moments” within their day. We believe the results of this thesis represent an important step towards building effective image retrieval models when sufficient textual content is lacking (i.e. a cold start).
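
    The lifelog-cleaning step in the final scenario (blur detection followed by perceptual-hash de-duplication) can be sketched compactly. The Python fragment below is an illustrative reconstruction, not the thesis's code: it assumes the OpenCV and imagehash libraries, and both thresholds are arbitrary placeholders.

        import cv2                  # pip install opencv-python
        import imagehash            # pip install imagehash
        from PIL import Image

        BLUR_THRESHOLD = 100.0      # placeholder: Laplacian variance below this => blurry
        HASH_DISTANCE = 8           # placeholder: Hamming distance <= this => near-duplicate

        def is_blurry(path: str) -> bool:
            """Blur detection via variance of the Laplacian; sharp images score high."""
            gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
            return cv2.Laplacian(gray, cv2.CV_64F).var() < BLUR_THRESHOLD

        def clean_lifelog(paths):
            """Drop blurry frames, then keep one representative per perceptual-hash cluster."""
            kept, hashes = [], []
            for path in paths:
                if is_blurry(path):
                    continue
                h = imagehash.phash(Image.open(path))
                # Subtracting two image hashes yields their Hamming distance.
                if all(h - seen > HASH_DISTANCE for seen in hashes):
                    kept.append(path)
                    hashes.append(h)
            return kept

    The surviving frames can then be ranked with the visual and geo-temporal signals the abstract describes to surface a user’s “key moments”.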