How Foot Tracking Matters: The Impact of an Animated Self-Avatar on Interaction, Embodiment and Presence in Shared Virtual Environments
The use of a self-avatar representation in head-mounted displays has been shown to have important effects on user behavior. However, relatively few studies focus on the feet and legs. We implemented a shared virtual reality for consumer virtual reality systems in which each user could be represented by a gender-matched self-avatar controlled by multiple trackers. The self-avatar allowed users to see their feet, legs and part of their torso when they looked down. In an experiment, participants worked together to solve jigsaw puzzles while experiencing either no avatar, a self-avatar with floating feet, or a self-avatar with tracked feet, in a between-subjects manipulation. First, we found that participants solved the puzzle more quickly with a self-avatar than without one, although task completion time did not differ significantly between the two self-avatar conditions. Second, we found that participants with tracked feet placed their feet statistically significantly closer to obstacles than participants with floating feet, whereas participants without a self-avatar usually ignored obstacles. Our post-experience questionnaire results confirmed that the use of a self-avatar has important effects on presence and interaction. Together, the results show that although the impact of animated legs may be subtle, it does change how users behave around obstacles. This could have important implications for the design of virtual spaces for applications such as training or behavioral analysis.
Xi Xi’s Playful Image-texts: Ekphrasis, Parergon, and the Concept of Toy
What does the ludic have to do with Xi Xi’s writings and creative concerns? This essay addresses the question by examining Xi Xi’s “little prose pieces,” or xiaopin sanwen, which take the format of the intermedial image-text and exist as a third category of her writing besides fiction and poetry. I explore how Xi Xi’s image-texts offer new articulations of play by discussing these questions: how are these image-texts constructed in game-like ways? What modes of play do they engage with? And what playful experience do they offer to readers and viewers?
The author first considers the image-text’s intermediality in Xi Xi’s earlier works, including Scrapbook, Picture/Storybook, and Jigsaw Puzzles. Xi Xi draws on two key concepts from art history to augment the ludic dimension of her works: ekphrasis, generally denoting writing that represents and expounds images and artworks, and the parergon, understood as “supplement” to the artwork and as visual framing. The image-text in these earlier works functions as a toy. Second, the essay addresses Xi Xi’s engagement with material playthings in her recent image-texts that are explicitly about toys, such as The Teddy Bear Chronicles and My Toys. Here, the concept of the toy intersects with premodern Chinese leisure (xianqing) culture.
Tracing the evolution of Xi Xi’s relation to the ludic over time, the essay argues that Xi Xi’s image-texts are a site of ludic aesthetics that highlights two modes of play: the youxi mode encompasses the notion of literature as a game with particular techniques and rules, and of ludicity as a dynamic and liminal experience; the wanshang mode posits play as leisure and the cultivation of style and taste, as well as a space of temporary withdrawal from the world of obligations. Both ludic modes affirm play as an aesthetic experience, appreciated for its intrinsic value. Literary play, as found in Xi Xi’s image-texts, therefore produces aesthetically sophisticated and significant works that engage readers on multiple levels of creative reading and seeing.
Delving Deep into Fine-Grained Sketch-Based Image Retrieval.
PhD Thesis. To see is to sketch. Since prehistoric times, people have used sketch-like petroglyphs as an effective
communicative tool, a practice that predates written language by tens of thousands of years.
This is even more true today: with the ubiquitous proliferation of touchscreen devices, sketching is possibly the only rendering mechanism readily available to all for expressing visual intentions. The intriguing free-hand property of human sketches, however, becomes a major obstacle in practical applications: humans are not faithful artists, and the sketches they draw are iconic abstractions of mental images that can quickly fall off the visual manifold of natural objects. The problem of matching sketches discriminatively with their corresponding photos is known as fine-grained sketch-based image retrieval (FG-SBIR) and has drawn increasing interest due to its potential for commercial adoption. This thesis delves deep into FG-SBIR by analysing the intrinsically unique traits of human sketches and leveraging that understanding to strengthen sketch-photo matching under deep learning. More specifically, the thesis develops four methods for FG-SBIR, as follows:
Chapter 3 describes a discriminative-generative hybrid method to better bridge the domain gap between photo and sketch. Existing FG-SBIR models learn a deep joint embedding space with discriminative losses only, pulling matching pairs of photos and sketches close and pushing mismatched pairs apart, and thus align the two domains only indirectly. To address this, we introduce a generative task of cross-domain image synthesis. Concretely, when an input photo is embedded in the joint space, the embedding vector is used as input to a generative model that synthesises the corresponding sketch. This task forces the learned embedding space to preserve all the domain-invariant information useful for cross-domain reconstruction, thus explicitly reducing the domain gap, in contrast to existing models. This approach achieves the first near-human performance on Sketchy, the largest FG-SBIR dataset to date.
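The discriminative objective described above, pulling a matching sketch-photo pair together while pushing a mismatched pair apart, is commonly realised as a triplet loss. The following is a minimal sketch of that idea; the function names, embeddings and margin value are illustrative, not taken from the thesis.

```python
# Toy sketch of a triplet objective for sketch-photo embedding alignment:
# the matching photo (positive) should end up at least `margin` closer
# to the sketch (anchor) than a mismatched photo (negative).
# All names and values here are hypothetical illustrations.

def l2_dist(a, b):
    """Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on the distance gap: zero once the negative is at least
    `margin` farther from the anchor than the positive."""
    return max(0.0, l2_dist(anchor, positive) - l2_dist(anchor, negative) + margin)

# With the matched photo closer than the mismatched one, the loss vanishes.
sketch_emb  = [0.1, 0.9]
photo_match = [0.1, 0.8]   # close to the sketch embedding
photo_other = [0.9, 0.1]   # far from the sketch embedding
loss = triplet_loss(sketch_emb, photo_match, photo_other)   # -> 0.0
```

In practice such a loss is minimised over a deep Siamese network rather than fixed vectors; the hinge structure is the same.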
Chapter 4 presents a new way of modelling human sketches and shows how such modelling can be integrated into the existing FG-SBIR paradigm with promising performance. Instead of modelling the forward sketching pass, we attempt to invert it. We model this inversion by translating iconic free-hand sketches into contours that more closely resemble geometrically realistic projections of object boundaries, while separately factorising out the salient added details. This factorised re-representation enables more effective sketch-photo matching. Specifically, we propose a novel unsupervised image style transfer model based on enforcing a cyclic embedding consistency constraint. A deep four-way Siamese model is then formulated to exploit the synthesised contours by extracting distinct, complementary detail features for FG-SBIR.
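A cyclic consistency constraint of the kind mentioned above penalises a forward translator F (sketch to contour) and a backward translator G (contour to sketch) whenever the round trip G(F(x)) drifts from the input. The toy below illustrates only that constraint; the linear maps stand in for learned networks and are purely hypothetical.

```python
# Minimal illustration of a cyclic consistency loss. F and G are
# stand-ins for learned sketch->contour and contour->sketch translators;
# here they are exact inverses, so the round-trip loss is zero.

def F(x):
    """Hypothetical forward translator (sketch -> contour)."""
    return [2.0 * v for v in x]

def G(y):
    """Hypothetical backward translator (contour -> sketch)."""
    return [0.5 * v for v in y]

def cycle_loss(x):
    """L1 reconstruction error of the round trip G(F(x))."""
    x_rec = G(F(x))
    return sum(abs(a - b) for a, b in zip(x, x_rec))

sketch = [1.0, -2.0, 0.5]
loss = cycle_loss(sketch)   # exact inverse pair -> 0.0
```

In an actual unsupervised style-transfer model this loss is one term among several (e.g. adversarial terms), and the consistency may be enforced in an embedding space rather than pixel space.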
Chapter 5 extends the practical applicability of FG-SBIR well beyond its training categories. Existing models, while successful, require instance-level pairing within each coarse-grained category as annotated training data, leaving their ability to deal with out-of-sample data unknown. We identify cross-category generalisation for FG-SBIR as a domain generalisation problem and propose the first solution. Our key contribution is a novel unsupervised learning approach that models a universal manifold of prototypical visual sketch traits. This manifold can then be used to parameterise the learning of a sketch/photo representation. Adaptation to novel categories then becomes automatic: the novel sketch is embedded in the manifold, and the representation and retrieval function are updated accordingly.
Chapter 6 challenges the ImageNet pre-training that has long been considered crucial by the FG-SBIR community, given the lack of large sketch-photo paired datasets for FG-SBIR training, and proposes a self-supervised alternative for representation pre-training. Specifically, we consider the jigsaw puzzle game of recomposing images from shuffled parts. We identify two key facets of jigsaw task design that are required for effective performance. The first is formulating the puzzle in a mixed-modality fashion. Second, we show that framing the optimisation as permutation matrix inference via Sinkhorn iterations is more effective than existing classifier instantiations of the jigsaw idea. We show for the first time that ImageNet classification is unnecessary as a pre-training strategy for FG-SBIR, and we confirm the efficacy of our jigsaw approach.
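Sinkhorn iteration, the relaxation named above, turns a matrix of real-valued patch-matching scores into an approximately doubly stochastic matrix (a soft permutation) by alternately normalising rows and columns. A minimal sketch, not the thesis implementation:

```python
# Sinkhorn normalisation sketch: exponentiate a score matrix, then
# alternately normalise rows and columns. The result approaches a
# doubly stochastic matrix whose peaks indicate the inferred permutation.
import math

def sinkhorn(scores, n_iters=20):
    m = [[math.exp(s) for s in row] for row in scores]
    for _ in range(n_iters):
        # normalise each row to sum to 1
        m = [[v / sum(row) for v in row] for row in m]
        # normalise each column to sum to 1
        cols = [sum(m[i][j] for i in range(len(m))) for j in range(len(m[0]))]
        m = [[m[i][j] / cols[j] for j in range(len(m[0]))] for i in range(len(m))]
    return m

# Scores favouring the swap permutation (piece 0 -> slot 1, piece 1 -> slot 0):
# the off-diagonal entries dominate after normalisation.
p = sinkhorn([[0.0, 3.0], [3.0, 0.0]])
```

Because the normalisation steps are differentiable, this relaxation can be trained end-to-end, unlike a hard permutation-classification head.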
Computer-aided puzzle assembly based on shape and texture information
The importance of puzzle assembly lies in its applications in many areas, such as the restoration and reconstruction of archaeological findings, the repair of broken objects, the solving of jigsaw-type puzzles, and the molecular docking problem. Puzzle pieces usually carry not only geometric shape information but also visual information: texture, color, continuity of lines, and so on. Moreover, in some cases, such as classic jigsaw puzzles, textural information is the main cue for assembling pieces. This research presents a new approach in which pictorial assembly, in contrast to previous curve-matching methods, uses texture information as well as geometric shape. Assembly is performed using textural features and geometric constraints. First, the texture of a band outside the border of each piece is predicted by inpainting and texture synthesis methods. Feature values are derived from these original and predicted images of the pieces. A combination of the feature and confidence values is used to generate an affinity measure between corresponding pieces. Two new algorithms using Fourier-based image registration techniques are developed to optimise the affinity. The algorithms for inpainting, affinity computation and Fourier-based assembly are explained with experimental results on real and artificial data. The main contributions of this research are: the development of a performance measure that indicates the level of success of piece assembly based on textural features and geometric shape; and the solution of the assembly problem using Fourier-based methods.
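The Fourier-based registration family the abstract refers to typically rests on phase correlation: the peak of the inverse transform of the normalised cross-power spectrum gives the shift that best aligns two signals. A 1-D, pure-Python toy version (the abstract's own algorithms operate on 2-D piece images and are not reproduced here):

```python
# Phase correlation sketch: recover the circular shift between two
# signals from the phase of their cross-power spectrum. The naive
# O(n^2) DFT keeps the example dependency-free.
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
            for f in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[f] * cmath.exp(2j * cmath.pi * f * k / n) for f in range(n)) / n
            for k in range(n)]

def phase_correlate(a, b):
    """Return the circular shift that maps `a` onto `b`."""
    A, B = dft(a), dft(b)
    cross = [bf * af.conjugate() for af, bf in zip(A, B)]
    cross = [c / max(abs(c), 1e-12) for c in cross]   # keep phase only
    corr = [v.real for v in idft(cross)]
    return corr.index(max(corr))

signal  = [0, 1, 4, 1, 0, 0, 0, 0]
shifted = signal[-3:] + signal[:-3]        # circular shift by 3
offset = phase_correlate(signal, shifted)  # -> 3
```

Real implementations use an FFT and extend the same idea to 2-D translations; combining it with log-polar resampling also recovers rotation and scale, which matters for arbitrarily oriented puzzle pieces.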