    Towards Succinct and Relevant Image Descriptions

    What does it mean to produce a good description of an image? Is a description good because it correctly identifies all of the objects in the image, because it describes the interesting attributes of the objects, or because it is short, yet informative? Grice’s Cooperative Principle, stated as “Make your contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged” (Grice, 1975), alongside other ideas of pragmatics in communication, has proven useful in thinking about language generation (Hovy, 1987; McKeown et al., 1995). The Cooperative Principle provides one possible framework for thinking about the generation and evaluation of image descriptions. The immediate question is whether automatic image description is within the scope of the Cooperative Principle. Consider the task of searching for images using natural language, where the purpose of the exchange is for the user to quickly and accurately find images that match their information needs. In this scenario, the user formulates a complete sentence query to express their needs, e.g. A sheepdog chasing sheep in a field, and initiates an exchange with the system in the form of a sequence of one-shot conversations. In this exchange, both participants can describe images in natural language, and a successful outcome relies on each participant succinctly and correctly expressing their beliefs about the images.

    Video Storytelling: Textual Summaries for Events

    Bridging vision and natural language is a longstanding goal in computer vision and multimedia research. While earlier works focus on generating a single-sentence description for visual content, recent works have studied paragraph generation. In this work, we introduce the problem of video storytelling, which aims at generating coherent and succinct stories for long videos. Video storytelling introduces new challenges, mainly due to the diversity of the story and the length and complexity of the video. We propose novel methods to address the challenges. First, we propose a context-aware framework for multimodal embedding learning, where we design a Residual Bidirectional Recurrent Neural Network to leverage contextual information from past and future. Second, we propose a Narrator model to discover the underlying storyline. The Narrator is formulated as a reinforcement learning agent which is trained by directly optimizing the textual metric of the generated story. We evaluate our method on the Video Story dataset, a new dataset that we have collected to enable the study. We compare our method with multiple state-of-the-art baselines, and show that our method achieves better performance, in terms of quantitative measures and user study. Comment: Published in IEEE Transactions on Multimedia

    Destination brand positioning slogans - towards the development of a set of accountability criteria

    A significant gap in the tourism and travel literature exists in the area of tourism destination branding. While interest in applications of brand theory to practice in tourism is increasing, there is a paucity of published research in the literature to guide destination marketing organisations (DMOs). In particular there have been few reported analyses of destination brand positioning slogans, which represent the interface between brand identity and brand image. Brand positioning is an inherently complex process, exacerbated for DMOs by the politics of decision making. DMOs must somehow capture the essence of a multi-attributed destination community in a succinct and focused positioning slogan, in a way that is both meaningful to the target audience and effectively differentiates the destination from the myriad of competitors offering the same features. Based on a review of the brand positioning literature and an examination of destination slogans used in the USA, Australia and New Zealand, the paper proposes a set of slogan criteria by which a DMO’s marketing manager, political appointees and advertising agency could be held accountable to stakeholders.

    A Company Profile as a Proposed Solution of Management Problems of “Widjaja Music” School Surabaya

    This project is a further step after taking an internship class in the previous semester. The writer did an internship at Widjaja Music, a music education school located at Jalan Raya Darmo Permai Timur, No. 19P, West Surabaya, for 122 hours and 25 minutes within two months. During the internship, the writer found three problems in the institution: (1) the system of the music school is not good, (2) the facility for the administration is not well-supported, and (3) the music school does not have a clear background and history. Despite the writer's position as an intern who does not have any power to make important decisions, she uses the company profile as the solution for improving the problems mentioned above. The company profile is specially designed for both customers and owners. For the customers, the writer emphasizes the good points and strengths of Widjaja Music, for example the various courses that Widjaja Music offers in the school. For the owner, on the other hand, the writer emphasizes evaluations that could build and strengthen Widjaja Music, for example the personnel structure of the music school, which is not good enough. By looking at the evaluation, the owner could also fix the weaknesses and keep the strengths of Widjaja Music. Finally, the writer learned a lot about the music school throughout the process of making the company profile for Widjaja Music. Thus, the writer has come to know the basic things regarding the music school, which she expects to find very useful in her future career.

    NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding

    Research on depth-based human activity analysis achieved outstanding performance and demonstrated the effectiveness of 3D representation for action recognition. The existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of large-scale training samples, realistic number of distinct class categories, diversity in camera views, varied environmental conditions, and variety of human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, which is collected from 106 distinct subjects and contains more than 114 thousand video samples and 8 million frames. This dataset contains 120 different action classes including daily, mutual, and health-related activities. We evaluate the performance of a series of existing 3D activity analysis methods on this dataset, and show the advantage of applying deep learning methods for 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset, and a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework is proposed for this task, which yields promising results for recognition of the novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding. [The dataset is available at: http://rose1.ntu.edu.sg/Datasets/actionRecognition.asp] Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

    Renormalization and Computation II: Time Cut-off and the Halting Problem

    This is the second installment to the project initiated in [Ma3]. In the first Part, I argued that both philosophy and technique of the perturbative renormalization in quantum field theory could be meaningfully transplanted to the theory of computation, and sketched several contexts supporting this view. In this second part, I address some of the issues raised in [Ma3] and provide their development in three contexts: a categorification of the algorithmic computations; time cut-off and Anytime Algorithms; and finally, a Hopf algebra renormalization of the Halting Problem. Comment: 28 pages

    Renormalisation and computation II: time cut-off and the Halting Problem
