58 research outputs found

    Mobile graphics: SIGGRAPH Asia 2017 course

    Get PDF
    Peer ReviewedPostprint (published version

    KOLAM : human computer interfaces fro visual analytics in big data imagery

    Get PDF
    In the present day, we are faced with a deluge of disparate and dynamic information from multiple heterogeneous sources. Among these are the big data imagery datasets that are rapidly being generated via mature acquisition methods in the geospatial, surveillance (specifically, Wide Area Motion Imagery or WAMI) and biomedical domains. The need to interactively visualize these imagery datasets by using multiple types of views (as needed) into the data is common to these domains. Furthermore, researchers in each domain have additional needs: users of WAMI datasets also need to interactively track objects of interest using algorithms of their choice, visualize the resulting object trajectories and interactively edit these results as needed. While software tools that fulfill each of these requirements individually are available and well-used at present, there is still a need for tools that can combine the desired aspects of visualization, human computer interaction (HCI), data analysis, data management, and (geo-)spatial and temporal data processing into a single flexible and extensible system. KOLAM is an open, cross-platform, interoperable, scalable and extensible framework for visualization and analysis that we have developed to fulfil the above needs. The novel contributions in this thesis are the following: 1) Spatio-temporal caching for animating both giga-pixel and Full Motion Video (FMV) imagery, 2) Human computer interfaces purposefully designed to accommodate big data visualization, 3) Human-in-the-loop interactive video object tracking - ground-truthing of moving objects in wide area imagery using algorithm assisted human-in-the-loop coupled tracking, 4) Coordinated visualization using stacked layers, side-by-side layers/video sub-windows and embedded imagery, 5) Efficient one-click manual tracking, editing and data management of trajectories, 6) Efficient labeling of image segmentation regions and passing these results to desired modules, 7) Visualization of image processing results generated by non-interactive operators using layers, 8) Extension of interactive imagery and trajectory visualization to multi-monitor wall display environments, 9) Geospatial applications: Providing rapid roam, zoom and hyper-jump spatial operations, interactive blending, colormap and histogram enhancement, spherical projection and terrain maps, 10) Biomedical applications: Visualization and target tracking of cell motility in time-lapse cell imagery, collecting ground-truth from experts on whole-slide imagery (WSI) for developing histopathology analytic algorithms and computer-aided diagnosis for cancer grading, and easy-to-use tissue annotation features.Includes bibliographical reference

    Multiscale Analysis for Characterization of Remotely Sensed Images.

    Get PDF
    In this study we addressed fundamental characteristics of image analysis in remote sensing, enumerated unavoidable problems in spectral analysis, and highlighted the spatial structure and features that increase information amount and measurement accuracy. We addressed the relationship between scale and spatial structure and the difficulties in characterizing them in complex remotely sensed images. We suggested that it is necessary to employ multiscale analysis techniques for analyzing and extracting information from remotely sensed images. We developed a multiscale characterization software system based on an existing software called ICAMS (Image Characterization And Modeling System), and applied the system to various test data sets including both simulated and real remote sensing data in order to evaluate the performance of these methods. In particular, we analyzed the fractal and wavelet methods. For the fractal methods, the results from using a set of simulated surfaces suggested that the triangular prism surface area method was the best technique for estimating the fractal dimension of remote sensing images. Through examining Landsat TM images of four different land covers, we found that fractal dimension and energy signatures derived from wavelets can measure some interesting aspects of the spatial content of remote sensing data, such as spatial complexity, spatial frequency, and textural orientation. Forest areas displayed the highest fractal dimension values, followed by coastal, urban, and agriculture respectively. However, fractal dimension by itself is insufficient for accurate classification of TM images. Wavelet analysis is more accurate for characterizing spatial structures. A longer wavelet was shown to be more accurate in the representation and discrimination of land-cover classes than a similar function of shorter length, and the combination of energy signatures from multiple decomposition levels and multispectral bands led to better characterization results than a single resolution and single band decomposition. Significant improvements in classification accuracy were achieved by using fractal dimensions in conjunction with the energy signature. This study has shown that multiscale analysis techniques are very useful to complement spectral classification techniques to extract information from remotely sensed images

    Scalable exploration of highly detailed and annotated 3D models

    Get PDF
    With the widespread availability of mobile graphics terminals andWebGL-enabled browsers, 3D graphics over the Internet is thriving. Thanks to recent advances in 3D acquisition and modeling systems, high-quality 3D models are becoming increasingly common, and are now potentially available for ubiquitous exploration. In current 3D repositories, such as Blend Swap, 3D Café or Archive3D, 3D models available for download are mostly presented through a few user-selected static images. Online exploration is limited to simple orbiting and/or low-fidelity explorations of simplified models, since photorealistic rendering quality of complex synthetic environments is still hardly achievable within the real-time constraints of interactive applications, especially on on low-powered mobile devices or script-based Internet browsers. Moreover, navigating inside 3D environments, especially on the now pervasive touch devices, is a non-trivial task, and usability is consistently improved by employing assisted navigation controls. In addition, 3D annotations are often used in order to integrate and enhance the visual information by providing spatially coherent contextual information, typically at the expense of introducing visual cluttering. In this thesis, we focus on efficient representations for interactive exploration and understanding of highly detailed 3D meshes on common 3D platforms. For this purpose, we present several approaches exploiting constraints on the data representation for improving the streaming and rendering performance, and camera movement constraints in order to provide scalable navigation methods for interactive exploration of complex 3D environments. Furthermore, we study visualization and interaction techniques to improve the exploration and understanding of complex 3D models by exploiting guided motion control techniques to aid the user in discovering contextual information while avoiding cluttering the visualization. We demonstrate the effectiveness and scalability of our approaches both in large screen museum installations and in mobile devices, by performing interactive exploration of models ranging from 9Mtriangles to 940Mtriangles

    Real-Time Computational Gigapixel Multi-Camera Systems

    Get PDF
    The standard cameras are designed to truthfully mimic the human eye and the visual system. In recent years, commercially available cameras are becoming more complex, and offer higher image resolutions than ever before. However, the quality of conventional imaging methods is limited by several parameters, such as the pixel size, lens system, the diffraction limit, etc. The rapid technological advancements, increase in the available computing power, and introduction of Graphics Processing Units (GPU) and Field-Programmable-Gate-Arrays (FPGA) open new possibilities in the computer vision and computer graphics communities. The researchers are now focusing on utilizing the immense computational power offered on the modern processing platforms, to create imaging systems with novel or significantly enhanced capabilities compared to the standard ones. One popular type of the computational imaging systems offering new possibilities is a multi-camera system. This thesis will focus on FPGA-based multi-camera systems that operate in real-time. The aim of themulti-camera systems presented in this thesis is to offer a wide field-of-view (FOV) video coverage at high frame rates. The wide FOV is achieved by constructing a panoramic image from the images acquired by the multi-camera system. Two new real-time computational imaging systems that provide new functionalities and better performance compared to conventional cameras are presented in this thesis. Each camera system design and implementation are analyzed in detail, built and tested in real-time conditions. Panoptic is a miniaturized low-cost multi-camera system that reconstructs a 360 degrees view in real-time. Since it is an easily portable system, it provides means to capture the complete surrounding light field in dynamic environment, such as when mounted on a vehicle or a flying drone. The second presented system, GigaEye II , is a modular high-resolution imaging system that introduces the concept of distributed image processing in the real-time camera systems. This thesis explains in detail howsuch concept can be efficiently used in real-time computational imaging systems. The purpose of computational imaging systems in the form of multi-camera systems does not end with real-time panoramas. The application scope of these cameras is vast. They can be used in 3D cinematography, for broadcasting live events, or for immersive telepresence experience. The final chapter of this thesis presents three potential applications of these systems: object detection and tracking, high dynamic range (HDR) imaging, and observation of multiple regions of interest. Object detection and tracking, and observation of multiple regions of interest are extremely useful and desired capabilities of surveillance systems, in security and defense industry, or in the fast-growing industry of autonomous vehicles. On the other hand, high dynamic range imaging is becoming a common option in the consumer market cameras, and the presented method allows instantaneous capture of HDR videos. Finally, this thesis concludes with the discussion of the real-time multi-camera systems, their advantages, their limitations, and the future predictions

    Digital Image Access & Retrieval

    Get PDF
    The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio

    Towards Predictive Rendering in Virtual Reality

    Get PDF
    The strive for generating predictive images, i.e., images representing radiometrically correct renditions of reality, has been a longstanding problem in computer graphics. The exactness of such images is extremely important for Virtual Reality applications like Virtual Prototyping, where users need to make decisions impacting large investments based on the simulated images. Unfortunately, generation of predictive imagery is still an unsolved problem due to manifold reasons, especially if real-time restrictions apply. First, existing scenes used for rendering are not modeled accurately enough to create predictive images. Second, even with huge computational efforts existing rendering algorithms are not able to produce radiometrically correct images. Third, current display devices need to convert rendered images into some low-dimensional color space, which prohibits display of radiometrically correct images. Overcoming these limitations is the focus of current state-of-the-art research. This thesis also contributes to this task. First, it briefly introduces the necessary background and identifies the steps required for real-time predictive image generation. Then, existing techniques targeting these steps are presented and their limitations are pointed out. To solve some of the remaining problems, novel techniques are proposed. They cover various steps in the predictive image generation process, ranging from accurate scene modeling over efficient data representation to high-quality, real-time rendering. A special focus of this thesis lays on real-time generation of predictive images using bidirectional texture functions (BTFs), i.e., very accurate representations for spatially varying surface materials. The techniques proposed by this thesis enable efficient handling of BTFs by compressing the huge amount of data contained in this material representation, applying them to geometric surfaces using texture and BTF synthesis techniques, and rendering BTF covered objects in real-time. Further approaches proposed in this thesis target inclusion of real-time global illumination effects or more efficient rendering using novel level-of-detail representations for geometric objects. Finally, this thesis assesses the rendering quality achievable with BTF materials, indicating a significant increase in realism but also confirming the remainder of problems to be solved to achieve truly predictive image generation

    Omnidirectional Light Field Analysis and Reconstruction

    Get PDF
    Digital photography exists since 1975, when Steven Sasson attempted to build the first digital camera. Since then the concept of digital camera did not evolve much: an optical lens concentrates light rays onto a focal plane where a planar photosensitive array transforms the light intensity into an electric signal. During the last decade a new way of conceiving digital photography emerged: a photography is the acquisition of the entire light ray field in a confined region of space. The main implication of this new concept is that a digital camera does not acquire a 2-D signal anymore, but a 5-D signal in general. Acquiring an image becomes more demanding in terms of memory and processing power; at the same time, it offers the users a new set of possibilities, like choosing dynamically the focal plane and the depth of field of the final digital photo. In this thesis we develop a complete mathematical framework to acquire and then reconstruct the omnidirectional light field around an observer. We also propose the design of a digital light field camera system, which is composed by several pinhole cameras distributed around a sphere. The choice is not casual, as we take inspiration from something already seen in nature: the compound eyes of common terrestrial and flying insects like the house fly. In the first part of the thesis we analyze the optimal sampling conditions that permit an efficient discrete representation of the continuous light field. In other words, we will give an answer to the question: how many cameras and what resolution are needed to have a good representation of the 4-D light field? Since we are dealing with an omnidirectional light field we use a spherical parametrization. The results of our analysis is that we need an irregular (i.e., not rectangular) sampling scheme to represent efficiently the light field. Then, to store the samples we use a graph structure, where each node represents a light ray and the edges encode the topology of the light field. When compared to other existing approaches our scheme has the favorable property of having a number of samples that scales smoothly for a given output resolution. The next step after the acquisition of the light field is to reconstruct a digital picture, which can be seen as a 2-D slice of the 4-D acquired light field. We interpret the reconstruction as a regularized inverse problem defined on the light field graph and obtain a solution based on a diffusion process. The proposed scheme has three main advantages when compared to the classic linear interpolation: it is robust to noise, it is computationally efficient and can be implemented in a distributed fashion. In the second part of the thesis we investigate the problem of extracting geometric information about the scene in the form of a depth map. We show that the depth information is encoded inside the light field derivatives and set up a TV-regularized inverse problem, which efficiently calculates a dense depth map of the scene while respecting the discontinuities at the boundaries of objects. The extracted depth map is used to remove visual and geometrical artifacts from the reconstruction when the light field is under-sampled. In other words, it can be used to help the reconstruction process in challenging situations. Furthermore, when the light field camera is moving temporally, we show how the depth map can be used to estimate the motion parameters between two consecutive acquisitions with a simple and effective algorithm, which does not require the computation nor the matching of features and performs only simple arithmetic operations directly in the pixel space. In the last part of the thesis, we introduce a novel omnidirectional light field camera that we call Panoptic. We obtain it by layering miniature CMOS imagers onto an hemispherical surface, which are then connected to a network of FPGAs. We show that the proposed mathematical framework is well suited to be embedded in hardware by demonstrating a real time reconstruction of an omnidirectional video stream at 25 frames per second

    Advancements in multi-view processing for reconstruction, registration and visualization.

    Get PDF
    The ever-increasing diffusion of digital cameras and the advancements in computer vision, image processing and storage capabilities have lead, in the latest years, to the wide diffusion of digital image collections. A set of digital images is usually referred as a multi-view images set when the pictures cover different views of the same physical object or location. In multi-view datasets, correlations between images are exploited in many different ways to increase our capability to gather enhanced understanding and information on a scene. For example, a collection can be enhanced leveraging on the camera position and orientation, or with information about the 3D structure of the scene. The range of applications of multi-view data is really wide, encompassing diverse fields such as image-based reconstruction, image-based localization, navigation of virtual environments, collective photographic retouching, computational photography, object recognition, etc. For all these reasons, the development of new algorithms to effectively create, process, and visualize this type of data is an active research trend. The thesis will present four different advancements related to different aspects of the multi-view data processing: - Image-based 3D reconstruction: we present a pre-processing algorithm, that is a special color-to-gray conversion. This was developed with the aim to improve the accuracy of image-based reconstruction algorithms. In particular, we show how different dense stereo matching results can be enhanced by application of a domain separation approach that pre-computes a single optimized numerical value for each image location. - Image-based appearance reconstruction: we present a multi-view processing algorithm, this can enhance the quality of the color transfer from multi-view images to a geo-referenced 3D model of a location of interest. The proposed approach computes virtual shadows and allows to automatically segment shadowed regions from the input images preventing to use those pixels in subsequent texture synthesis. - 2D to 3D registration: we present an unsupervised localization and registration system. This system can recognize a site that has been framed in a multi-view data and calibrate it on a pre-existing 3D representation. The system has a very high accuracy and it can validate the result in a completely unsupervised manner. The system accuracy is enough to seamlessly view input images correctly super-imposed on the 3D location of interest. - Visualization: we present PhotoCloud, a real-time client-server system for interactive exploration of high resolution 3D models and up to several thousand photographs aligned over this 3D data. PhotoCloud supports any 3D models that can be rendered in a depth-coherent way and arbitrary multi-view image collections. Moreover, it tolerates 2D-to-2D and 2D-to-3D misalignments, and it provides scalable visualization of generic integrated 2D and 3D datasets by exploiting data duality. A set of effective 3D navigation controls, tightly integrated with innovative thumbnail bars, enhances the user navigation. These advancements have been developed in tourism and cultural heritage application contexts, but they are not limited to these
    • …
    corecore