Físchlár-DiamondTouch: collaborative video searching on a table
In this paper we present Físchlár-DT, the system we developed for our participation in the interactive search task of the annual TRECVid benchmarking activity in 2005. Our back-end search engine combines a text search, which operates over text produced by automatic speech recognition, with an image search, which matches low-level image features against video keyframes. The two novel aspects of our work are that we evaluate collaborative, team-based search among groups of users working together, and that we use a touch-sensitive tabletop interface and interaction device known as the DiamondTouch to support this collaborative search. The paper summarises the back-end search systems and presents the interface we have developed in detail.
Position-dependent exact-exchange energy for slabs and semi-infinite jellium
The position-dependent exact-exchange energy per particle $\varepsilon_x(z)$
(defined as the interaction between a given electron at $z$ and its
exact-exchange hole) at metal surfaces is investigated, by using either jellium
slabs or the semi-infinite (SI) jellium model. For jellium slabs, we prove
analytically and numerically that in the vacuum region far away from the
surface $\varepsilon_x^{\text{Slab}}(z \to \infty) \to -e^2/(2z)$, {\it
independent} of the bulk electron density, which is exactly half the
corresponding exact-exchange potential $V_x(z \to \infty) \to -e^2/z$ [Phys.
Rev. Lett. {\bf 97}, 026802 (2006)] of density-functional theory, as occurs in
the case of finite systems. The fitting of $\varepsilon_x^{\text{Slab}}(z)$
to a physically motivated image-like expression is feasible, but the resulting
location of the image plane shows strong finite-size oscillations every time a
slab discrete energy level becomes occupied. For a semi-infinite jellium, the
asymptotic behavior of $\varepsilon_x^{\text{SI}}(z)$ is somewhat different.
As in the case of jellium slabs, $\varepsilon_x^{\text{SI}}(z \to \infty)$ has
an image-like behavior of the form $\propto -e^2/z$, but now with a
density-dependent coefficient that in general differs from the slab universal
coefficient 1/2. Our numerical estimates for this coefficient agree with two
previous analytical estimates. For an arbitrary finite thickness
of a jellium slab, we find that the asymptotic limits of
$\varepsilon_x^{\text{Slab}}(z)$ and $\varepsilon_x^{\text{SI}}(z)$ only
coincide in the low-density limit ($r_s \to \infty$), where the
density-dependent coefficient of the semi-infinite jellium approaches the slab
{\it universal} coefficient 1/2.
Comment: 26 pages, 7 figures, to appear in Phys. Rev.
Memory-Inspired Temporal Prompt Interaction for Text-Image Classification
In recent years, large-scale pre-trained multimodal models (LMMs) have
emerged to integrate the vision and language modalities, achieving considerable
success in various natural language processing and computer vision tasks. The
growing size of LMMs, however, results in a significant computational cost for
fine-tuning these models for downstream tasks. Hence, prompt-based interaction
strategies are studied to align modalities more efficiently. In this context, we
propose a novel prompt-based multimodal interaction strategy inspired by human
memory strategy, namely Memory-Inspired Temporal Prompt Interaction (MITP). Our
proposed method involves two stages, as in human memory strategy: the
acquiring stage, and the consolidation and activation stage. We utilize
temporal prompts on intermediate layers to imitate the acquiring stage,
leverage similarity-based prompt interaction to imitate memory consolidation,
and employ a prompt generation strategy to imitate memory activation. The main
strength of our paper is that the prompt vectors interact on intermediate
layers, allowing sufficient information exchange between modalities with a
reduced number of trainable parameters and reduced memory usage. We achieve
competitive results on several datasets with relatively small memory usage and
2.0M trainable parameters (about 1% of the pre-trained foundation model).
Intention and Attention in Image-Text Presentations: A Coherence Approach
In image-text presentations from online discourse, pronouns can refer to entities depicted in images, even if these entities are not otherwise referred to in a text caption. While visual salience may be enough to allow a writer to use a pronoun to refer to a prominent entity in the image, coherence theory suggests that pronoun use is more restricted. Specifically, language users may need an appropriate coherence relation between text and imagery to license and resolve pronouns. To explore this hypothesis and better understand the relationship between image context and text interpretation, we annotated an image-text data set with coherence relations and pronoun information. We find that pronoun use reflects a complex interaction between the content of the pronoun, the grammar of the text, and the relation of text and image
The Author's Modality and Stratificational Structure of a Literary Text in Modern English
Any literary text, irrespective of its genre or trend, represents a unique and aesthetic image of the world, created by the author according to his communicative intention and his subjective modality. Hence, the subjective is the organizing axis of a literary work, for, in expressing his vision of the world, the author represents reality in the way that he considers to be most fitting. The interaction and co-existence of subjective and objective factors find their realization in the stratificational structure of the text, i.e. in its multi-layered constitution. The interdisciplinary research methodology employed in the article draws on essential data from the theory of literature, linguo-stylistics, text interpretation and linguo-pragmatics.
Patterns of Intersemiotic Cohesion in the Moving Image Text
This paper investigates intersemiotic cohesion in the moving image text, popularly known as film discourse. The investigation aimed to identify the interaction pattern of visual and verbal strands in the unfolding text of selected video films. The corpus of the study consists of six movie pictures and corresponding excerpts of conversation(s) from selected film scripts. Baumgarten's (2008) theory of visual-verbal cohesion and Ngamsa's (2012) methodological considerations were adopted for the annotation and matching of deictic devices as both explicit (invitro) and implicit (invivo). The findings show that interaction of visual and verbal semiotic strands is always achieved at the point of convergence and at the instance of reference items, pronouns, adjectives and other extra-linguistic features (signs and existents). The study finally recommends that film directors, screenplay writers and text linguists should explore the use of intersemiotic cohesive devices for explication of the moving image text. Key words: Intersemiotic, Cohesion, Moving Image Text and Film Discourse
'Desiderio in search of a master': desire and the quest for recognition
This essay examines the manner in which desire and Hegelian recognition intersect in Angela Carter's 1972 novel, The Infernal Desire Machines of Doctor Hoffman. After providing a brief description of Hegel's famous account of the interaction between the lord and the bondsman, the essay goes on to discuss the manner in which the novel invests the figure of the love-object with the potential to become an ideal master. The image of the reflecting eye, which recurs throughout Carter's text, is then analyzed as an enactment of, and a commentary upon, the desiring gaze
A Taxonomy of Prompt Modifiers for Text-To-Image Generation
Text-to-image generation has seen an explosion of interest since 2021. Today,
beautiful and intriguing digital images and artworks can be synthesized from
textual inputs ("prompts") with deep generative models. Online communities
around text-to-image generation and AI generated art have quickly emerged. This
paper identifies six types of prompt modifiers used by practitioners in the
online community based on a 3-month ethnographic study. The novel taxonomy of
prompt modifiers provides researchers a conceptual starting point for
investigating the practice of text-to-image generation, but may also help
practitioners of AI generated art improve their images. We further outline how
prompt modifiers are applied in the practice of "prompt engineering." We
discuss research opportunities of this novel creative practice in the field of
Human-Computer Interaction (HCI). The paper concludes with a discussion of
broader implications of prompt engineering from the perspective of Human-AI
Interaction (HAI) in future applications beyond the use case of text-to-image
generation and AI generated art.
Comment: 15 page