    Físchlár-DiamondTouch: collaborative video searching on a table

    In this paper we present Físchlár-DT, the system we developed for our participation in the interactive search task of the annual TRECVid benchmarking activity, TRECVid 2005. Our back-end search engine combines a text search, which operates over the automatically recognised speech text, with an image search, which matches low-level image features against video keyframes. The two novel aspects of our work are that we evaluate collaborative, team-based search among groups of users working together, and that we support this collaborative search with a touch-sensitive tabletop interface and interaction device known as the DiamondTouch. The paper summarises the back-end search systems and presents in detail the interface we have developed.

    Position-dependent exact-exchange energy for slabs and semi-infinite jellium

    The position-dependent exact-exchange energy per particle $\varepsilon_x(z)$ (defined as the interaction between a given electron at $z$ and its exact-exchange hole) at metal surfaces is investigated using either jellium slabs or the semi-infinite (SI) jellium model. For jellium slabs, we prove analytically and numerically that in the vacuum region far away from the surface $\varepsilon_{x}^{\text{Slab}}(z \to \infty) \to -e^{2}/2z$, independent of the bulk electron density, which is exactly half the corresponding exact-exchange potential $V_{x}(z \to \infty) \to -e^{2}/z$ [Phys. Rev. Lett. 97, 026802 (2006)] of density-functional theory, as occurs in the case of finite systems. Fitting $\varepsilon_{x}^{\text{Slab}}(z)$ to a physically motivated image-like expression is feasible, but the resulting location of the image plane shows strong finite-size oscillations every time a slab discrete energy level becomes occupied. For a semi-infinite jellium, the asymptotic behavior of $\varepsilon_{x}^{\text{SI}}(z)$ is somewhat different. As in the case of jellium slabs, $\varepsilon_{x}^{\text{SI}}(z \to \infty)$ has an image-like behavior of the form $\propto -e^{2}/z$, but now with a density-dependent coefficient that in general differs from the slab universal coefficient 1/2. Our numerical estimates for this coefficient agree with two previous analytical estimates. For an arbitrary finite thickness of a jellium slab, we find that the asymptotic limits of $\varepsilon_{x}^{\text{Slab}}(z)$ and $\varepsilon_{x}^{\text{SI}}(z)$ only coincide in the low-density limit ($r_s \to \infty$), where the density-dependent coefficient of the semi-infinite jellium approaches the slab universal coefficient 1/2. Comment: 26 pages, 7 figures, to appear in Phys. Rev.

    Memory-Inspired Temporal Prompt Interaction for Text-Image Classification

    In recent years, large-scale pre-trained multimodal models (LMMs) have emerged to integrate the vision and language modalities, achieving considerable success in various natural language processing and computer vision tasks. The growing size of LMMs, however, results in a significant computational cost for fine-tuning these models for downstream tasks. Hence, prompt-based interaction strategies are studied to align modalities more efficiently. In this context, we propose a novel prompt-based multimodal interaction strategy inspired by human memory strategy, namely Memory-Inspired Temporal Prompt Interaction (MITP). Our proposed method involves two stages, as in human memory strategy: the acquiring stage, and the consolidation and activation stage. We utilize temporal prompts on intermediate layers to imitate the acquiring stage, leverage similarity-based prompt interaction to imitate memory consolidation, and employ a prompt generation strategy to imitate memory activation. The main strength of our paper is that we let the prompt vectors interact on intermediate layers to enable sufficient information exchange between modalities, with compressed trainable parameters and memory usage. We achieve competitive results on several datasets with relatively small memory usage and 2.0M trainable parameters (about 1% of the pre-trained foundation model).

    Intention and Attention in Image-Text Presentations: A Coherence Approach

    In image-text presentations from online discourse, pronouns can refer to entities depicted in images, even if these entities are not otherwise referred to in a text caption. While visual salience may be enough to allow a writer to use a pronoun to refer to a prominent entity in the image, coherence theory suggests that pronoun use is more restricted. Specifically, language users may need an appropriate coherence relation between text and imagery to license and resolve pronouns. To explore this hypothesis and better understand the relationship between image context and text interpretation, we annotated an image-text data set with coherence relations and pronoun information. We find that pronoun use reflects a complex interaction between the content of the pronoun, the grammar of the text, and the relation of text and image.

    The Author’s Modality and Stratificational Structure of a Literary Text in Modern English

    Any literary text, irrespective of its genre or trend, represents a unique and aesthetic image of the world, created by the author according to his communicative intention and his subjective modality. Hence, the subjective is the organizing axis of a literary work, for, in expressing his vision of the world, the author represents reality in the way that he considers to be most fitting. The interaction and co-existence of subjective and objective factors find their realization in the stratificational structure of the text, i.e. in its multi-layered constitution. The interdisciplinary research methodology employed in the article draws on essential data from the theory of literature, linguo-stylistics, text interpretation and linguo-pragmatics.

    Patterns of Intersemiotic Cohesion in the Moving Image Text

    This paper investigates intersemiotic cohesion in the moving image text popularly known as film discourse. The investigation aimed at identifying the interaction pattern of visual and verbal strands in the unfolding text of selected video films. The corpus of the study consists of six movie pictures and corresponding excerpts of conversation(s) from selected film scripts. Baumgarten's (2008) theory of visual-verbal cohesion and Ngamsa's (2012) methodological considerations were adopted for the annotation and matching of deictic devices as both explicit (in vitro) and implicit (in vivo). The findings show that interaction of visual and verbal semiotic strands is always achieved at the point of convergence and at the instance of reference items, pronouns, adjectives and other extra-linguistic features (signs and existents). The study finally recommends that film directors, screenplay writers and text linguists should explore the use of intersemiotic cohesive devices for explication of the moving image text. Key words: Intersemiotic, Cohesion, Moving Image Text and Film Discourse

    'Desiderio in search of a master': desire and the quest for recognition

    This essay examines the manner in which desire and Hegelian recognition intersect in Angela Carter’s 1972 novel, The Infernal Desire Machines of Doctor Hoffman. After providing a brief description of Hegel’s famous account of the interaction between the lord and the bondsman, the essay goes on to discuss the manner in which the novel invests the figure of the love-object with the potential to become an ideal master. The image of the reflecting eye, which recurs throughout Carter’s text, is then analyzed as an enactment of, and a commentary upon, the desiring gaze.

    A Taxonomy of Prompt Modifiers for Text-To-Image Generation

    Text-to-image generation has seen an explosion of interest since 2021. Today, beautiful and intriguing digital images and artworks can be synthesized from textual inputs ("prompts") with deep generative models. Online communities around text-to-image generation and AI generated art have quickly emerged. This paper identifies six types of prompt modifiers used by practitioners in the online community based on a 3-month ethnographic study. The novel taxonomy of prompt modifiers provides researchers with a conceptual starting point for investigating the practice of text-to-image generation, and may also help practitioners of AI generated art improve their images. We further outline how prompt modifiers are applied in the practice of "prompt engineering." We discuss research opportunities of this novel creative practice in the field of Human-Computer Interaction (HCI). The paper concludes with a discussion of broader implications of prompt engineering from the perspective of Human-AI Interaction (HAI) in future applications beyond the use case of text-to-image generation and AI generated art. Comment: 15 page