
    Toward Efficient and Robust Large-Scale Structure-from-Motion Systems

    The ever-increasing number of images that are uploaded and shared on the Internet has recently been leveraged by computer vision researchers to extract 3D information about the content seen in these images. One key mechanism to extract this information is structure-from-motion, which is the process of recovering the 3D geometry (structure) of a scene via a set of images from different viewpoints (camera motion). However, when dealing with crowdsourced datasets comprising tens or hundreds of millions of images, the magnitude and diversity of the imagery pose challenges of robustness, scalability, completeness, and correctness for existing structure-from-motion systems. This dissertation focuses on these challenges and demonstrates practical methods to address the problems of data association and verification within structure-from-motion systems. Data association within structure-from-motion systems consists of the discovery of pairwise image overlap within the input dataset. To perform this discovery, previous systems assumed that information about every image in the input dataset could be stored in memory, which is prohibitive for large-scale photo collections. To address this issue, we propose a novel streaming-based framework for the discovery of related sets of images, and demonstrate our approach on a crowdsourced dataset containing 100 million images from all around the world. Results illustrate that our streaming-based approach does not compromise model completeness, but achieves unprecedented levels of efficiency and scalability. The verification of individual data associations is difficult to perform during the process of structure-from-motion, as standard methods have limited scope when determining image overlap. Therefore, it is possible for erroneous associations to form, especially when there are symmetric, repetitive, or duplicate structures which can be incorrectly associated with each other.
The consequences of these errors are incorrectly placed cameras and scene geometry within the 3D reconstruction. We present two methods that can detect these local inconsistencies and successfully resolve them into a globally consistent 3D model. In our evaluation, we show that our techniques are efficient, are robust to a variety of scenes, and outperform existing approaches.
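The core idea of streaming-based overlap discovery can be illustrated with a minimal sketch. The data structures and similarity function below are hypothetical simplifications, not the dissertation's actual pipeline: each incoming image is compared only against a bounded in-memory index of cluster representatives, so memory stays constant no matter how many images are streamed.

```python
from collections import OrderedDict

def stream_cluster(images, similarity, threshold=0.5, index_size=1000):
    """Greedy streaming discovery of related image sets.

    images: iterable of (image_id, descriptor) pairs, processed once.
    similarity: function(desc_a, desc_b) -> float in [0, 1].
    Only `index_size` cluster representatives are held in memory,
    evicted in least-recently-matched (LRU) order; an evicted
    representative's cluster is simply finalized as-is.
    """
    index = OrderedDict()   # representative_id -> descriptor
    clusters = {}           # representative_id -> list of member ids
    for image_id, desc in images:
        # Compare the incoming image against current representatives only.
        best_id, best_sim = None, threshold
        for rep_id, rep_desc in index.items():
            s = similarity(desc, rep_desc)
            if s >= best_sim:
                best_id, best_sim = rep_id, s
        if best_id is not None:
            clusters[best_id].append(image_id)
            index.move_to_end(best_id)  # mark as recently matched
        else:
            # Start a new cluster; evict the stalest representative if full.
            if len(index) >= index_size:
                index.popitem(last=False)
            index[image_id] = desc
            clusters[image_id] = [image_id]
    return clusters
```

With toy one-dimensional descriptors and `similarity = 1 - |a - b|`, streaming `("a", 0.0)`, `("b", 0.05)`, `("c", 0.9)` at `threshold=0.8` groups `a` and `b` together and starts a new cluster for `c`, without ever holding more than `index_size` descriptors in memory.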

    Predictive text-entry in immersive environments

    Virtual Reality (VR) has progressed significantly since its conception, enabling previously impossible applications such as virtual prototyping, telepresence, and augmented reality. However, text-entry remains a difficult problem for immersive environments (Bowman et al., 2001b; Mine et al., 1997). Wearing a head-mounted display (HMD) and datagloves affords a wealth of new interaction techniques. However, users no longer have access to traditional input devices such as a keyboard. Although VR allows for more natural interfaces, there is still a need for simple, yet effective, data-entry techniques. Examples include communicating in a collaborative environment, accessing system commands, or leaving an annotation for a designer in an architectural walkthrough (Bowman et al., 2001b). This thesis presents the design, implementation, and evaluation of a predictive text-entry technique for immersive environments which combines 5DT datagloves, a graphically represented keyboard, and a predictive spelling paradigm. It evaluates the fundamental factors affecting the use of such a technique, including keyboard layout, prediction accuracy, gesture recognition, and interaction techniques. Finally, it details the results of user experiments, and provides a set of recommendations for the future use of such a technique in immersive environments.
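The predictive-spelling component of such a technique can be approximated with a simple frequency-ranked prefix completion. This is an illustrative sketch only; the thesis's actual prediction model and lexicon are not specified in the abstract.

```python
def predict(prefix, lexicon, k=3):
    """Return the k most frequent lexicon words starting with prefix.

    lexicon: dict mapping word -> frequency count.
    A deployed predictive text-entry system would update counts from
    usage; here the frequencies are fixed for illustration.
    """
    matches = [w for w in lexicon if w.startswith(prefix)]
    # Highest frequency first; ties broken alphabetically.
    return sorted(matches, key=lambda w: (-lexicon[w], w))[:k]

lexicon = {"the": 50, "there": 20, "then": 30, "this": 40}
# After the user enters "th", offer the most likely completions first.
print(predict("th", lexicon))  # ['the', 'this', 'then']
```

Presenting a short ranked list like this on the graphical keyboard lets the user select a whole word with one glove gesture instead of spelling it out character by character.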

    Presuppositions are Fast, whether Hard or Soft - Evidence from the visual world

    One focus of work on the processing of linguistic meaning has been the relative processing speed of different aspects of meaning. While much early work has focused on implicatures in comparison to literal asserted content (e.g., Bott & Noveck 2004, Huang & Snedeker 2009, among many others), the present paper extends recent efforts to experimentally investigate another aspect of meaning, namely presuppositions. It investigates the triggers again and stop using the visual world eye tracking paradigm, and provides evidence for rapid processing of presupposed content. Our study finds no difference in timing for the two triggers, which is of theoretical relevance given proposals for distinguishing classes of triggers, such as hard vs. soft (Abusch 2010). Whatever differences there may be between these classes apparently do not affect the online processing time course. As a further comparison, again was also compared to twice, which expresses essentially the same meaning without a presupposition. Shifts in eye movements for these two cases also appear to be entirely on par, further supporting the notion that presupposed and asserted content are available in parallel early on in online processing.

    From seen to unseen: Designing keyboard-less interfaces for text entry on the constrained screen real estate of Augmented Reality headsets

    Text input is a very challenging task in the constrained screen real estate of Augmented Reality headsets. Typical keyboards spread over multiple lines and occupy a significant portion of the screen. In this article, we explore the feasibility of single-line text entry systems for smartglasses. We first design FITE, a dynamic keyboard where the characters are positioned depending on their probability within the current input. However, the dynamic layout leads to mediocre text input and low accuracy. We then introduce HIBEY, a fixed 1-line solution that further decreases the screen real-estate usage by hiding the layout. Despite its hidden layout, HIBEY surprisingly performs much better than FITE, and achieves a mean text entry rate of 9.95 words per minute (WPM) with 96.06% accuracy, which is comparable to other state-of-the-art approaches. After 8 days, participants achieve an average of 13.19 WPM. In addition, HIBEY only occupies 13.14% of the screen real estate at the edge region, which is 62.80% smaller than the default keyboard layout on Microsoft Hololens.
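FITE's dynamic-layout idea, ordering the characters of a single-line keyboard by their probability given the current input, can be sketched as follows. The character model here is a toy prefix-continuation count over a word list, an assumption for illustration rather than the paper's actual language model:

```python
from collections import Counter

def dynamic_layout(current_input, corpus_words,
                   alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Order the alphabet for a one-line keyboard so that characters
    most likely to follow `current_input` come first.

    Probability is estimated by counting how often each character
    continues the current prefix in a reference word list (toy model).
    """
    counts = Counter()
    for word in corpus_words:
        if word.startswith(current_input) and len(word) > len(current_input):
            counts[word[len(current_input)]] += 1
    # Likely continuations first; unseen characters keep alphabetical order.
    return sorted(alphabet, key=lambda c: (-counts[c], c))
```

For example, after typing "he" against the word list `["hello", "help", "hat"]`, the layout begins with "l", since both matching words continue with that letter. The abstract's finding that this constant reshuffling hurt accuracy is what motivated HIBEY's fixed (but hidden) layout.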

    Language, thought and literal meaning

    The notion of literalness in linguistics is based on the following assumptions: linguistic expressions are vehicle-meaning pairs (since literal meaning has to be the meaning of something); linguistic expressions have to be cognised and used (especially uttered) in order for their meanings to be regarded as literally theirs; "linguistic" vehicle-meaning relations are fixed and autonomous, that is, rather than having particular meanings in virtue of being used to express those meanings, "linguistic" vehicle-meaning pairs are used to express certain meanings in virtue of having the meanings that they have. This thesis criticises Chomsky's and Sperber and Wilson's attempts to establish the autonomy of "linguistic" vehicle-meaning pairs. I argue that both Chomsky and Sperber and Wilson fail to distinguish "linguistic" semantics from the "real" semantics of what "linguistic" vehicle-meaning pairs are used to express; that they persist in the idea that "linguistic" vehicles are specifically for being uttered (physically instantiated), thus defeating their own purpose of setting the linguistic absolutely apart from what it is used for; and that neither Chomsky's internalist conception of language nor Sperber and Wilson's relevance framework is able to account for the phenomenon of "language misuse", i.e. the use of a "linguistic" vehicle to express the "wrong" meaning. Burton-Roberts' representational conjecture is applied and developed in the presentation of an alternative non/extra-linguistic account of "literal meaning" and "language use/misuse". This account has it that neither "linguistic" vehicles nor "linguistic" vehicle-meaning relations are actually linguistic. It avoids the problems attending the notion of linguistic expressions as objects with sortally disjoint and arbitrarily conjoint properties (i.e. physically instantiable "vehicle" and mentally constituted "meaning"), and resolves the unease within Chomsky's Minimalist Program about the inclusion of phonology in I-language. Finally, by way of this resolution, I address some seemingly unrelated issues concerning vehicle-less "meanings" and the relations between language, thought and consciousness.

    An exploration of the integration of speech with co-speech gesture with non-invasive brain stimulation

    The current PhD project focuses on the integration of gestures with their co-occurring speech using non-invasive brain stimulation. The project investigated ‘where’ and ‘when’ gesture-speech integration takes place. Building on the paradigm of Kelly et al. (2010), which provides a reaction-time index of automatic gesture-speech integration, it was tested whether the left posterior middle temporal gyrus (pMTG) as well as the left inferior frontal gyrus (LIFG) are causally involved in gesture-speech integration. A follow-up study investigated the time window for this integration of gesture and speech in pMTG. This study found that gesture has a priming effect on the semantic retrieval of speech. This effect only manifested itself after gesture had been clearly understood and before the semantic analysis of speech. Based on the common coding hypothesis, this finding was interpreted in terms of gesture and speech originating from a common coding system, with both LIFG and pMTG as its neural underpinning, enabling bi-directional influences between both domains.