
    Video anatomy : spatial-temporal video profile

    Indiana University-Purdue University Indianapolis (IUPUI)
    A massive number of videos are uploaded to video websites, and smooth video browsing, editing, retrieval, and summarization are in demand. Most videos employ several types of camera operations to expand the field of view, emphasize events, and create cinematic effects. To digest heterogeneous videos on video websites and in databases, video clips are profiled into a 2D image scroll containing both spatial and temporal information for video preview. The video profile is visually continuous, compact, scalable, and indexed to each frame. This work analyzes camera kinematics, including zoom, translation, and rotation, and categorizes camera actions as combinations of these. An automatic video summarization framework is proposed and developed. After conventional video clip segmentation and further segmentation by smooth camera operation, the global flow field under all camera actions is investigated for profiling various types of video. A new algorithm is designed to extract the major flow direction and convergence factor using condensed images. The work then proposes a uniform scheme to segment video clips and sections, sample the video volume across the major flow, and compute the flow convergence factor, in order to obtain an intrinsic scene space less influenced by camera ego-motion. A motion blur technique is also used to render dynamic targets in the profile. The resulting video profile can be displayed in a video track to guide access to video frames, assist video editing, and facilitate applications such as surveillance, visual archiving of environments, video retrieval, and online video preview.
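    The profiling step hinges on two quantities per frame pair: a major flow direction and a flow convergence factor. The sketch below is a minimal illustration of how such quantities could be estimated from dense optical flow; the Farnebäck estimator, the function name flow_statistics, and the divergence-based convergence measure are assumptions, not the exact algorithm described in the thesis.

```python
# Hypothetical sketch: estimate a dominant flow direction and a simple
# convergence factor from dense optical flow between consecutive frames.
import cv2
import numpy as np

def flow_statistics(prev_gray, next_gray):
    """prev_gray, next_gray: single-channel 8-bit frames.
    Returns (major flow direction in radians, mean flow magnitude, convergence factor)."""
    # Positional arguments: prev, next, flow, pyr_scale, levels, winsize,
    # iterations, poly_n, poly_sigma, flags.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    u, v = flow[..., 0], flow[..., 1]

    # Major flow direction: angle of the mean flow vector.
    direction = np.arctan2(v.mean(), u.mean())
    magnitude = np.hypot(u, v).mean()

    # Crude convergence factor: negative mean divergence of the flow field.
    # Zoom-in style motion gives diverging flow (negative value here),
    # zoom-out gives converging flow (positive value).
    du_dx = np.gradient(u, axis=1)
    dv_dy = np.gradient(v, axis=0)
    convergence = -(du_dx + dv_dy).mean()

    return direction, magnitude, convergence
```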

    Multiperspective mosaics and layered representation for scene visualization

    This thesis documents the efforts made to implement multiperspective mosaicking for undervehicle and roadside video sequences. For the undervehicle sequences, the goal is to create a large, high-resolution mosaic that may be used to quickly inspect the entire scene shot by a camera making a single pass underneath the vehicle. Several constraints are placed on the video data in order to support the assumption that the entire scene in the sequence lies on a single plane; a single mosaic is therefore used to represent a single video sequence, and phase correlation is used to perform motion analysis. For the roadside video sequences, the scene is assumed to be composed of several planar layers rather than a single plane, and layer extraction techniques are implemented to perform this decomposition. Instead of phase correlation, the Lucas-Kanade motion tracking algorithm is used to create dense motion maps. Using these motion maps, spatial support for each layer is determined from a pre-initialized layer model. By separating the pixels in the scene into motion-specific layers, it is possible to sample each element of the scene correctly while performing multiperspective mosaicking, and to fill in many gaps caused by occlusions, creating more complete representations of the objects of interest. The results are several mosaics, each representing a single planar layer of the scene.
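    Phase correlation estimates the dominant translation between two overlapping frames from the peak of the normalised cross-power spectrum. A minimal sketch, assuming equally sized single-channel NumPy arrays (the helper name is illustrative, not the thesis implementation):

```python
import numpy as np

def phase_correlation(img_a, img_b):
    """Estimate the (dy, dx) shift that aligns img_b to img_a."""
    F_a = np.fft.fft2(img_a)
    F_b = np.fft.fft2(img_b)
    # Normalised cross-power spectrum.
    cross = F_a * np.conj(F_b)
    cross /= np.abs(cross) + 1e-8
    corr = np.fft.ifft2(cross).real
    # The correlation peak gives the shift, wrapped to a signed range.
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = img_a.shape
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return dy, dx
```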

    Retracing the 1910 Carruthers Royal Geographical Society Expedition to the Turgen Mountains of Mongolia – Reconstruction of a Century of Glacial Change

    The Turgen Mountains lie in northwestern Mongolia, roughly 80 kilometers south of the Russian border. The area was visited in 1910 by a Royal Geographical Society (RGS) expedition led by Douglas Carruthers, which undertook an extensive survey of the range, produced a detailed topographic map, and documented the extent of the glaciers with photographs. This modern study consisted of three phases. The first was to procure the historical documents from the RGS in London, including copies of the photos, journal entries, and the map. Field work in Mongolia entailed traveling to the remote study site and retracing portions of the 1910 expedition; camera locations were matched to the historical photographs and repeat images were taken, and the termini of the two main glacial lobes were surveyed by GPS. Finally, spatial analysis was conducted in the computer laboratory using a GIS to generate a 'historic' elevation model from the 1910 map and compare it to a modern DEM generated from SRTM data. Map analysis software was employed to evaluate the cartometric accuracy of the 1910 map against modern Russian topographic sheets. The results of the DEM and map analysis were then validated using the field GPS data and remotely sensed imagery to quantitatively describe the changes in the glacial system. The repeat photography was analyzed using photogrammetric techniques to measure glacier changes, and a custom cartographic product was produced in the style of the 1910 Carruthers map, displaying the extent of the glaciers in 2010 and the locations of repeat photography stations for future expeditions. Placing the results of this study alongside previous work paints a clear picture of the Turgen glacial regime over the last century: while the snow and ice volume on the summits appears to be intact, lower-elevation glaciers show significant ablation. This study successfully demonstrates the utility of using historic expedition documents to extend the modern record of glacial change.
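    The DEM comparison step amounts to differencing two co-registered elevation grids. The sketch below is a minimal illustration under assumed inputs (the file names dem_1910.tif and dem_srtm.tif, the nodata handling, and prior co-registration are all hypothetical); it is not the GIS workflow used in the study.

```python
import numpy as np
import rasterio

# Read the two elevation grids (assumed to be aligned on the same grid).
with rasterio.open("dem_1910.tif") as old, rasterio.open("dem_srtm.tif") as new:
    z_old = old.read(1).astype(float)
    z_new = new.read(1).astype(float)

# Crude nodata mask, then per-cell elevation change in metres.
valid = (z_old > -1000) & (z_new > -1000)
dz = np.where(valid, z_new - z_old, np.nan)

print("mean elevation change (m):", np.nanmean(dz))
print("cells lowered by more than 5 m:", int(np.nansum(dz < -5)))
```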

    Deep Learning Localization for Self-driving Cars

    Identifying the location of an autonomous car with the help of visual sensors can be a good alternative to traditional approaches such as the Global Positioning System (GPS), which is often inaccurate or unavailable due to insufficient signal coverage. Recent research in deep learning has produced excellent results across domains, motivating this thesis, which uses deep learning on visual data to solve the localization problem for smart cars. Deep Convolutional Neural Networks (CNNs) were trained on visual data corresponding to unique locations throughout a geographic area. To evaluate the performance of these models, multiple datasets were created from Google Street View as well as manually, by driving a golf cart around the campus while collecting GPS-tagged frames. The efficacy of the CNN models was also investigated across different weather and lighting conditions. Validation accuracies as high as 98% were obtained from some of these models, demonstrating that this novel method has the potential to act as an alternative or aid to traditional GPS-based localization for cars. The root mean square (RMS) precision of Google Maps is often between 2 and 10 m, whereas the precision required for navigating self-driving cars is between 2 and 10 cm; empirically, this precision has been achieved with the help of different error-correction systems applied to GPS feedback. The proposed method achieved an approximate localization precision of 25 cm without the help of any external error-correction system.
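    A minimal sketch of the kind of CNN classifier described here, assuming one folder of GPS-tagged frames per discrete location class; the ResNet-18 backbone, directory layout, and hyper-parameters are illustrative assumptions rather than the thesis configuration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
# Hypothetical layout: one sub-folder per location class,
# e.g. data/train/loc_017/frame_0001.jpg
train_set = datasets.ImageFolder("data/train", transform=transform)
loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Fine-tune an ImageNet-pretrained backbone to predict the location class.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))

optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimiser.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimiser.step()
```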

    Plenoptic Modelling and Rendering of Complex Rigid Scenes

    Image-Based Rendering is the task of generating novel views from existing images. This thesis presents several new methods to solve this problem, designed to fulfil specific goals such as scalability and interactive rendering performance. First, the theory of the Plenoptic Function is introduced as the mathematical foundation of image formation. A new taxonomy is then introduced to categorise existing methods, and an extensive overview of known approaches is given. This is followed by a detailed analysis of the design goals and the requirements with regard to input data. It is concluded that, for perspectively correct image generation from sparse spatial sampling, geometry information about the scene is necessary. This leads to the design of three different Image-Based Rendering methods. The rendering results are analysed on different data sets; for this analysis, error metrics are defined to evaluate different aspects.
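    For reference, the Plenoptic Function is commonly written in its seven-parameter form, with the two-plane light field as a familiar reduction; the notation below follows the standard formulation in the literature rather than the thesis itself.

```latex
% Radiance observed from viewpoint (V_x, V_y, V_z) in direction (\theta, \phi),
% at wavelength \lambda and time t (the 7D plenoptic function):
\[ P = P(V_x, V_y, V_z, \theta, \phi, \lambda, t) \]

% For a static, fixed-wavelength scene observed outside its convex hull,
% this reduces to the 4D two-plane light field parameterisation:
\[ L = L(u, v, s, t) \]
```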

    Appearance Modelling and Reconstruction for Navigation in Minimally Invasive Surgery

    Minimally invasive surgery is playing an increasingly important role in patient care. Whilst its direct patient benefits in terms of reduced trauma, improved recovery and shortened hospitalisation are well established, there is a sustained need for improved training in the existing procedures and for new smart instruments that tackle the issues of visualisation, ergonomic control, and haptic and tactile feedback. For endoscopic intervention, the small field of view within a complex anatomy can easily disorient the operator, as the tortuous access pathway is not always easy to predict and control with standard endoscopes. Effective training through simulation devices, based on either virtual-reality or mixed-reality simulators, can help to improve the spatial awareness, consistency and safety of these procedures. This thesis examines the use of endoscopic videos for both simulation and navigation purposes. More specifically, it addresses the challenging problem of building high-fidelity, subject-specific simulation environments for improved training and skills assessment, investigating issues related to mesh parameterisation and texture blending. With the maturity of computer vision in terms of both 3D shape reconstruction and localisation and mapping, vision-based techniques have attracted significant interest in recent years for surgical navigation. The thesis therefore also tackles the problem of using vision-based techniques to provide a detailed 3D map and a dynamically expanded field of view, improving spatial awareness and avoiding operator disorientation. The key advantage of this approach is that it requires no additional hardware and thus introduces minimal interference to the existing surgical workflow. The derived 3D map can be effectively integrated with pre-operative data, allowing both global and local 3D navigation that takes tissue structural and appearance changes into account. Both simulation and laboratory-based experiments are conducted throughout this research to assess the practical value of the proposed methods.
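    As a rough illustration of a vision-only front end for expanding the field of view, the sketch below matches ORB features between two endoscopic frames and estimates a planar homography with RANSAC; the function name, parameter values, and the homography model are assumptions and do not reproduce the reconstruction pipeline developed in the thesis.

```python
import cv2
import numpy as np

def relative_homography(frame_a, frame_b):
    """frame_a, frame_b: grayscale frames. Returns (3x3 homography, inlier count)."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp_a, des_a = orb.detectAndCompute(frame_a, None)
    kp_b, des_b = orb.detectAndCompute(frame_b, None)

    # Brute-force Hamming matching with cross-checking, keep the best matches.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)[:200]

    src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # Robust planar homography; the inlier mask indicates geometric agreement.
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H, int(inliers.sum())
```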

    Ways of expression: the impact of VFX technology on modern storytelling in film and interactive media production.


    The Eye in Motion: Mid-Victorian Fiction and Moving-Image Technologies

    This thesis reads selected works of fiction by three mid-Victorian writers (Charlotte Brontë, Charles Dickens, and George Eliot) alongside contemporaneous innovations and developments in moving-image technologies, or what have been referred to by historians of film as ‘pre-cinematic devices’. It looks specifically at the moving panorama, diorama, dissolving magic lantern slides, the kaleidoscope, and persistence of vision devices such as the phenakistiscope and zoetrope, and ranges across scientific writing, journalism, letters, and paintings to demonstrate the scope and popularity of visual motion devices. By exploring this history of optical technologies I show how their display, mechanism, and manual operation contributed to a broader cultural and literary interest in the phenomenological experience of animation, decades before the establishment of cinematography as an industry, technology, and viewing practice. Through a close reading of a range of mid-Victorian novels, this thesis identifies and analyses the literary use of language closely associated with moving-image technologies to argue that the Victorian literary imagination reflected upon, drew from, and incorporated reference to visual and technological animation many decades earlier than critics, focusing usually on early twentieth-century cinema and modernist literature, have allowed. It develops current scholarship on Victorian visual culture and optical technologies through a close reading of the language of moving-image devices, found in advertisements, reviews, and descriptions of their physiological operation and spectacle, alongside the choices Victorian authors made to describe precisely how their characters perceived, how they imagined, remembered, and mentally relived particular scenes and images, and how the readers of their texts were encouraged to imaginatively ‘see’ the animated unfolding of the plot and the material dimensionality of its world through a shared understanding of this language of moving images.

    Visual sequence-based place recognition for changing conditions and varied viewpoints

    Correctly identifying previously visited locations is essential for robotic place recognition and localisation. This thesis presents training-free solutions to vision-based place recognition under changing environmental conditions and camera viewpoints. Using vision as the primary sensor, the proposed approaches combine image segmentation and rescaling techniques over sequences of visual imagery to enable successful place recognition across a range of challenging environments where prior techniques have failed.
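    A minimal sketch in the spirit of training-free, sequence-based matching (comparable to SeqSLAM-style approaches, and not necessarily the method proposed here): frames are assumed to be already rescaled to low resolution, a frame-to-frame difference matrix is built, and matches are scored over short sequences rather than single images. All names and parameters are illustrative.

```python
import numpy as np

def difference_matrix(query, reference):
    """query, reference: arrays of shape (N, H, W), already rescaled/normalised.
    Returns an (N_query, N_reference) matrix of mean absolute differences."""
    q = query.reshape(len(query), -1)
    r = reference.reshape(len(reference), -1)
    return np.abs(q[:, None, :] - r[None, :, :]).mean(axis=2)

def best_sequence_match(diff, seq_len=10):
    """For each query index, pick the reference index minimising the cost of a
    straight, velocity-1 sequence of length seq_len through the matrix."""
    n_q, n_r = diff.shape
    matches = np.full(n_q, -1)
    for qi in range(seq_len - 1, n_q):
        # trace() sums the aligned diagonal of each seq_len x seq_len block.
        costs = [diff[qi - seq_len + 1:qi + 1, ri - seq_len + 1:ri + 1].trace()
                 for ri in range(seq_len - 1, n_r)]
        matches[qi] = int(np.argmin(costs)) + seq_len - 1
    return matches
```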