13,183 research outputs found

    Extended Recommendation Framework: Generating the Text of a User Review as a Personalized Summary

    Full text link
    We propose to augment rating based recommender systems by providing the user with additional information which might help him in his choice or in the understanding of the recommendation. We consider here as a new task, the generation of personalized reviews associated to items. We use an extractive summary formulation for generating these reviews. We also show that the two information sources, ratings and items could be used both for estimating ratings and for generating summaries, leading to improved performance for each system compared to the use of a single source. Besides these two contributions, we show how a personalized polarity classifier can integrate the rating and textual aspects. Overall, the proposed system offers the user three personalized hints for a recommendation: rating, text and polarity. We evaluate these three components on two datasets using appropriate measures for each task

    From media crossing to media mining

    Get PDF
    This paper reviews how the concept of Media Crossing has contributed to the advancement of the application domain of information access and explores directions for a future research agenda. These will include themes that could help to broaden the scope and to incorporate the concept of medium-crossing in a more general approach that not only uses combinations of medium-specific processing, but that also exploits more abstract medium-independent representations, partly based on the foundational work on statistical language models for information retrieval. Three examples of successful applications of media crossing will be presented, with a focus on the aspects that could be considered a first step towards a generalized form of media mining

    Web Data Extraction, Applications and Techniques: A Survey

    Full text link
    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provided a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques allow to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users and this offers unprecedented opportunities to analyze human behavior at a very large scale. We discuss also the potential of cross-fertilization, i.e., on the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain, in other domains.Comment: Knowledge-based System

    "You Tube and I Find" - personalizing multimedia content access

    Full text link
    Recent growth in broadband access and proliferation of small personal devices that capture images and videos has led to explosive growth of multimedia content available everywhereVfrom personal disks to the Web. While digital media capture and upload has become nearly universal with newer device technology, there is still a need for better tools and technologies to search large collections of multimedia data and to find and deliver the right content to a user according to her current needs and preferences. A renewed focus on the subjective dimension in the multimedia lifecycle, fromcreation, distribution, to delivery and consumption, is required to address this need beyond what is feasible today. Integration of the subjective aspects of the media itselfVits affective, perceptual, and physiological potential (both intended and achieved), together with those of the users themselves will allow for personalizing the content access, beyond today’s facility. This integration, transforming the traditional multimedia information retrieval (MIR) indexes to more effectively answer specific user needs, will allow a richer degree of personalization predicated on user intention and mode of interaction, relationship to the producer, content of the media, and their history and lifestyle. In this paper, we identify the challenges in achieving this integration, current approaches to interpreting content creation processes, to user modelling and profiling, and to personalized content selection, and we detail future directions. The structure of the paper is as follows: In Section I, we introduce the problem and present some definitions. In Section II, we present a review of the aspects of personalized content and current approaches for the same. Section III discusses the problem of obtaining metadata that is required for personalized media creation and present eMediate as a case study of an integrated media capture environment. Section IV presents the MAGIC system as a case study of capturing effective descriptive data and putting users first in distributed learning delivery. The aspects of modelling the user are presented as a case study in using user’s personality as a way to personalize summaries in Section V. Finally, Section VI concludes the paper with a discussion on the emerging challenges and the open problems

    A novel concept-level approach for ultra-concise opinion summarization

    Get PDF
    The Web 2.0 has resulted in a shift as to how users consume and interact with the information, and has introduced a wide range of new textual genres, such as reviews or microblogs, through which users communicate, exchange, and share opinions. The exploitation of all this user-generated content is of great value both for users and companies, in order to assist them in their decision-making processes. Given this context, the analysis and development of automatic methods that can help manage online information in a quicker manner are needed. Therefore, this article proposes and evaluates a novel concept-level approach for ultra-concise opinion abstractive summarization. Our approach is characterized by the integration of syntactic sentence simplification, sentence regeneration and internal concept representation into the summarization process, thus being able to generate abstractive summaries, which is one the most challenging issues for this task. In order to be able to analyze different settings for our approach, the use of the sentence regeneration module was made optional, leading to two different versions of the system (one with sentence regeneration and one without). For testing them, a corpus of 400 English texts, gathered from reviews and tweets belonging to two different domains, was used. Although both versions were shown to be reliable methods for generating this type of summaries, the results obtained indicate that the version without sentence regeneration yielded to better results, improving the results of a number of state-of-the-art systems by 9%, whereas the version with sentence regeneration proved to be more robust to noisy data.This research work has been partially funded by the University of Alicante, Generalitat Valenciana, Spanish Government and the European Commission through the projects, “Tratamiento inteligente de la información para la ayuda a la toma de decisiones” (GRE12-44), “Explotación y tratamiento de la información disponible en Internet para la anotación y generación de textos adaptados al usuario” (GRE13-15), DIIM2.0 (PROMETEOII/2014/001), ATTOS (TIN2012-38536-C03-03), LEGOLANG-UAGE (TIN2012-31224), SAM (FP7-611312), and FIRST (FP7-287607)
    corecore