
    Video Shot Boundary Detection using the Scale Invariant Feature Transform and RGB Color Channels

    Segmentation of a video sequence by detecting shot changes is essential for video analysis, indexing and retrieval. In this context, this paper proposes a shot boundary detection algorithm based on the scale invariant feature transform (SIFT). The first step of our method is a top-down search scheme that detects the locations of transitions by comparing the ratio of matched SIFT features extracted from each RGB channel of the video frames. This overview step provides the locations of the boundaries. Secondly, a moving average calculation is performed to determine the type of each transition. The proposed method can detect both gradual transitions and abrupt changes without requiring any prior training on the video content. Experiments conducted on a multi-type video database show that the algorithm achieves good performance
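    A minimal sketch of the idea described above, assuming OpenCV's SIFT implementation and a standard Lowe ratio test (the thresholds and decision rule are illustrative assumptions, not the paper's exact procedure): the fraction of features in one frame that find a match in the next frame is computed per RGB channel, and a candidate boundary is flagged when that fraction collapses.

```python
# Hedged illustration: per-channel SIFT match ratio between consecutive frames.
import cv2
import numpy as np

sift = cv2.SIFT_create()
matcher = cv2.BFMatcher(cv2.NORM_L2)

def match_ratio(chan_a, chan_b, lowe=0.75):
    """Fraction of keypoints in chan_a with a good SIFT match in chan_b."""
    kp_a, des_a = sift.detectAndCompute(chan_a, None)
    kp_b, des_b = sift.detectAndCompute(chan_b, None)
    if des_a is None or des_b is None or len(kp_a) == 0:
        return 0.0
    pairs = matcher.knnMatch(des_a, des_b, k=2)
    good = [p for p in pairs
            if len(p) == 2 and p[0].distance < lowe * p[1].distance]
    return len(good) / len(kp_a)

def detect_boundaries(frames, threshold=0.2):
    """frames: list of H x W x 3 uint8 images; returns candidate cut indices."""
    boundaries = []
    for i in range(1, len(frames)):
        ratios = [match_ratio(frames[i - 1][:, :, c], frames[i][:, :, c])
                  for c in range(3)]            # one ratio per color channel
        if np.mean(ratios) < threshold:         # abrupt drop -> candidate boundary
            boundaries.append(i)
    return boundaries
```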

    Toward Large Scale Semantic Image Understanding and Retrieval

    Semantic image retrieval is a multifaceted, highly complex problem. Not only does its solution require advanced image processing and computer vision techniques, but it also requires knowledge beyond what can be inferred from the image content alone. In contrast, traditional image retrieval systems are based upon keyword searches over filenames or metadata tags, e.g. Google image search, Flickr search, etc. These conventional systems do not analyze the image content, and their keywords are not guaranteed to represent the image. Thus, there is a significant need for a semantic image retrieval system that can analyze and retrieve images based upon the content and relationships that exist in the real world.
    In this thesis, I present a framework that moves towards advancing semantic image retrieval in large-scale datasets. At a conceptual level, semantic image retrieval requires the following steps: viewing an image, understanding the content of the image, indexing the important aspects of the image, connecting the image concepts to the real world, and finally retrieving images based upon the indexed concepts or related concepts. My proposed framework addresses each of these components toward my ultimate goal of improving image retrieval. The first task is the essential one of understanding the content of an image. Unfortunately, the only data typically available to a computer algorithm when analyzing images is the low-level pixel data. To achieve human-level comprehension, a machine must overcome the semantic gap, the disparity that exists between the image data and human understanding. This translation of low-level information into a high-level representation is an extremely difficult problem that requires more than the image pixel information. I describe my solution to this problem through the use of an online knowledge acquisition and storage system, which combines the extensible, visual, and interactive properties of Scalable Vector Graphics (SVG) with online crowdsourcing tools to collect high-level knowledge about visual content.
    I further describe the utilization of knowledge and semantic data for image understanding. Specifically, I seek to incorporate into various algorithms knowledge that cannot be inferred from the image pixels alone. This information comes from related images or structured data (in the form of hierarchies and ontologies) and improves the performance of object detection and image segmentation tasks. These understanding tasks are crucial intermediate steps towards retrieval and semantic understanding. However, typical object detection and segmentation tasks require an abundance of training data for machine learning algorithms, and this prior training data specifies which patterns and visual features the algorithm should look for when processing an image. In contrast, my algorithm utilizes related semantic images to extract the visual properties of an object and also to decrease the search space of my detection algorithm. Furthermore, I demonstrate the use of related images in the image segmentation process: again without prior training data, I present a method for foreground object segmentation that finds the shared area present in a set of images. I demonstrate the effectiveness of my method on structured image datasets that have defined relationships between classes, such as parent-child or sibling classes.
    Finally, I introduce my framework for semantic image retrieval. I enhance the proposed knowledge acquisition and image understanding techniques with semantic knowledge through linked data and semantic web languages. This is an essential step in semantic image retrieval. For example, an image processing algorithm that classifies a car but is not enhanced by external knowledge has no notion that a car is a type of vehicle, that it is highly related to a truck, and that it is less related to other transportation methods such as a train; yet a query for modes of human transportation should return all of these classes. Thus, I demonstrate how to integrate information from both image processing algorithms and semantic knowledge bases to perform interesting queries that would otherwise be impossible. The key component of this system is a novel property reasoner that is able to translate low-level image features into semantically relevant object properties. I use a combination of XML-based languages such as SVG, RDF, and OWL in order to link to existing ontologies available on the web. My experiments demonstrate an efficient data collection framework and a novel utilization of semantic data for image analysis and retrieval on datasets of people and landmarks collected from sources such as IMDB and Flickr. Ultimately, my thesis presents improvements to the state of the art in visual knowledge representation/acquisition and in computer vision algorithms such as detection and segmentation, toward the goal of enhanced semantic image retrieval
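    The transportation example above can be made concrete with a small linked-data sketch. The namespace, class names, properties, and detections below are hypothetical placeholders rather than the thesis's actual ontology or property reasoner; the point is only that, once detections are typed against an ontology, a single query over the class hierarchy returns cars, trucks, and trains alike.

```python
# Hedged illustration of ontology-backed retrieval with rdflib (hypothetical URIs).
from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("http://example.org/vision#")   # assumed namespace
g = Graph()

# Ontology fragment: a small class hierarchy (assumed for illustration).
for cls in ("Car", "Truck", "Train"):
    g.add((EX[cls], RDFS.subClassOf, EX.Transportation))

# Output of an (assumed) detection pipeline: detections typed by class.
g.add((EX.detection_42, RDF.type, EX.Car))
g.add((EX.detection_42, EX.foundInImage, EX.image_7))
g.add((EX.detection_99, RDF.type, EX.Train))
g.add((EX.detection_99, EX.foundInImage, EX.image_13))

# Retrieve every image that contains any mode of transportation.
query = """
SELECT DISTINCT ?img WHERE {
  ?det a ?cls .
  ?cls rdfs:subClassOf* ex:Transportation .
  ?det ex:foundInImage ?img .
}
"""
for row in g.query(query, initNs={"ex": EX, "rdfs": RDFS}):
    print(row.img)    # image_7 and image_13
```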

    Fitting and tracking of a scene model in very low bit rate video coding


    ISMCR 1994: Topical Workshop on Virtual Reality. Proceedings of the Fourth International Symposium on Measurement and Control in Robotics

    This symposium on measurement and control in robotics included sessions on: (1) rendering, including tactile perception and applied virtual reality; (2) applications in simulated medical procedures and telerobotics; (3) tracking sensors in a virtual environment; (4) displays for virtual reality applications; (5) sensory feedback including a virtual environment application with partial gravity simulation; and (6) applications in education, entertainment, technical writing, and animation

    Advanced Image Acquisition, Processing Techniques and Applications

    "Advanced Image Acquisition, Processing Techniques and Applications" is the first book of a series that provides image processing principles and practical software implementation on a broad range of applications. The book integrates material from leading researchers on Applied Digital Image Acquisition and Processing. An important feature of the book is its emphasis on software tools and scientific computing in order to enhance results and arrive at problem solution

    Probabilistic modeling for single-photon lidar

    Lidar is an increasingly prevalent technology for depth sensing, with applications including scientific measurement and autonomous navigation systems. While conventional systems require hundreds or thousands of photon detections per pixel to form accurate depth and reflectivity images, recent results for single-photon lidar (SPL) systems using single-photon avalanche diode (SPAD) detectors have shown accurate images formed from as little as one photon detection per pixel, even when half of those detections are due to uninformative ambient light. The keys to such photon-efficient image formation are two-fold: (i) a precise model of the probability distribution of photon detection times, and (ii) prior beliefs about the structure of natural scenes. Reducing the number of photons needed for accurate image formation enables faster, farther, and safer acquisition. Still, such photon-efficient systems are often limited to laboratory conditions more favorable than the real-world settings in which they would be deployed. This thesis focuses on expanding the photon detection time models to address challenging imaging scenarios and the effects of non-ideal acquisition equipment. The processing derived from these enhanced models, sometimes modified jointly with the acquisition hardware, surpasses the performance of state-of-the-art photon counting systems. We first address the problem of high levels of ambient light, which causes traditional depth and reflectivity estimators to fail. We achieve robustness to strong ambient light through a rigorously derived window-based censoring method that separates signal and background light detections. Spatial correlations both within and between depth and reflectivity images are encoded in superpixel constructions, which fill in holes caused by the censoring. Accurate depth and reflectivity images can then be formed with an average of 2 signal photons and 50 background photons per pixel, outperforming methods previously demonstrated at a signal-to-background ratio of 1. We next approach the problem of coarse temporal resolution for photon detection time measurements, which limits the precision of depth estimates. To achieve sub-bin depth precision, we propose a subtractively-dithered lidar implementation, which uses changing synchronization delays to shift the time-quantization bin edges. We examine the generic noise model resulting from dithering Gaussian-distributed signals and introduce a generalized Gaussian approximation to the noise distribution and simple order statistics-based depth estimators that take advantage of this model. Additional analysis of the generalized Gaussian approximation yields rules of thumb for determining when and how to apply dither to quantized measurements. We implement a dithered SPL system and propose a modification for non-Gaussian pulse shapes that outperforms the Gaussian assumption in practical experiments. The resulting dithered-lidar architecture could be used to design SPAD array detectors that can form precise depth estimates despite relaxed temporal quantization constraints. Finally, SPAD dead time effects have been considered a major limitation for fast data acquisition in SPL, since a commonly adopted approach for dead time mitigation is to operate in the low-flux regime where dead time effects can be ignored. 
We show that the empirical distribution of detection times converges to the stationary distribution of a Markov chain and demonstrate improvements in depth estimation and histogram correction using our Markov chain model. An example simulation shows that correctly compensating for dead times in a high-flux measurement can yield a 20-fold speed-up in data acquisition. The resulting accuracy at high photon flux could enable real-time applications such as autonomous navigation
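    A simplified sketch of the window-based censoring idea (not the thesis's rigorously derived estimator; the histogram resolution, window rule, and median-based depth estimate are assumptions made for illustration): detections falling in the densest window of roughly one pulse width are treated as signal, and the rest as ambient background.

```python
# Hedged illustration: censor background detections, then estimate depth.
import numpy as np

def censor_and_estimate(times, t_rep, pulse_width, n_bins=1000):
    """times: photon detection times (s) within one repetition period t_rep."""
    times = np.asarray(times, dtype=float)
    counts, edges = np.histogram(times, bins=n_bins, range=(0.0, t_rep))
    win = max(1, int(np.ceil(pulse_width / (t_rep / n_bins))))
    # Sliding-window sum over the histogram; the densest window is assumed
    # to contain the back-reflected signal, everything outside is background.
    window_sums = np.convolve(counts, np.ones(win), mode="valid")
    start = int(np.argmax(window_sums))
    lo, hi = edges[start], edges[start + win]
    signal = times[(times >= lo) & (times <= hi)]
    if signal.size == 0:
        return np.nan
    c = 299_792_458.0                   # speed of light (m/s)
    return c * np.median(signal) / 2.0  # round-trip time -> depth
```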

    Development of geobiophysical models for cartographic representation of wetlands in Yellow Creek Basin, West Virginia

    In the Appalachian Mountains of Canaan Valley, warmer temperatures and fading native species are conducive to invasion by foreign species. Localized relic communities of red spruce, sphagnum and polytrichum are sensitive to climatic change and are potential indicators of global warming. The development of a baseline assessment and further research are therefore necessary to observe and model changes. Influencing factors in wetland ecology include slope, aspect, biologically rich and diverse vegetation associations, micro-topography, hydrology, underlying soils, and geology. Three uniquely independent study sites were established along a single transect of the Yellow Creek stream terraces, in Tucker County, West Virginia. Vegetation physiognomic association, micro-topography, hydrology, and soils data were collected using a variety of technologies. A plane table with polar coordinate paper, a K&E® alidade, and a Sonin® sonic ranging device were originally employed for vegetation physiognomic association mapping. This setup was replaced by a Magellan® ProMarkX-CP GPS, which provided more efficient vegetation association delineation and registration to UTM mapping coordinates. A detailed survey was performed with a Nikon® total station theodolite to accurately determine the micro-topography. Collected field data were imported into ER Mapper® 5.5 software, the geobiophysical modeling system, via an iterative registration process. Digitized 1995 color infrared one-meter resolution aerial imagery supplied by the United States Forest Service formed the base map for all registration. A combination of image processing techniques, including principal component analysis and cluster analysis, was applied to extract features for pattern recognition. The processed spectral, spatial, and multi-temporal components were geobiophysically modeled to characterize vegetation physiognomic associations and other identified features. The resulting three-dimensional cartographic representations illustrate the subtle relationships between sphagnum, eriophorum and polytrichum physiognomic associations and surface hydrology
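    The principal component analysis and cluster analysis step can be sketched generically as shown below, using scikit-learn as a stand-in for the ER Mapper® workflow actually employed in the study; the numbers of components and spectral classes are illustrative assumptions.

```python
# Hedged illustration: PCA followed by k-means clustering of image pixels.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def classify_pixels(image, n_components=3, n_classes=6):
    """image: H x W x B array of B co-registered bands; returns an H x W map
    of spectral-class labels approximating vegetation/feature associations."""
    h, w, b = image.shape
    pixels = image.reshape(-1, b).astype(float)
    scores = PCA(n_components=n_components).fit_transform(pixels)
    labels = KMeans(n_clusters=n_classes, n_init=10).fit_predict(scores)
    return labels.reshape(h, w)
```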

    Design Considerations for an Electron Energy Loss Spectroscopy Parallel Recording System

    This thesis describes the results of an investigation into the design of a parallel recording system for electron energy loss spectroscopy (EELS). The motivation behind the construction of such a system is the greatly enhanced detection efficiency that can be achieved compared to conventional serial recording systems. This is of great benefit in experimental situations where specimen drift, radiation damage, or signal-to-noise ratio are limiting factors. Chapter 1 provides a brief introduction to the method of EELS analysis in the transmission electron microscope (TEM) and discusses the instrumentation required to generate and record EELS spectra. Chapter 2 contains a detailed review of the theory of homogeneous-field magnetic sector spectrometers, following the work of Enge, Brown, and Heighway. The matrix method used to calculate the optical properties of such spectrometers is introduced, and the focussing coefficients for an arbitrary magnetic sector are derived to second order. A spectrometer analysis program based on the theory of chapter 2 is described in chapter 3. The program is used to calculate the aberration coefficients of two well-known second-order corrected spectrometer designs [Shuman 1983, Scheinfein and Isaacson 1984] and hence determine the nature of the electron intensity distribution at their dispersion planes. Post-spectrometer magnification of the dispersion plane is required in parallel EELS in order to overcome the resolution-limiting effects of electron scatter within the detector. The requirement that the magnifications in the dispersive and non-dispersive planes be independent indicates the use of quadrupole lenses as the magnifying elements. Chapter 4 reviews the theory of quadrupoles and extends the matrix transfer method of chapters 2 and 3 to quadrupole lenses. The design of a four-lens quadrupole system suitable for post-spectrometer magnification in EELS is described in chapter 5. The system can vary the magnification in the dispersive direction from 5x to 97x (at 100 keV) while maintaining an almost constant magnification in the non-dispersive direction. Chapter 6 considers the types of multielement detectors which could be applied to parallel EELS, and discusses the advantages of using wide-aperture linear photodiode arrays operating in the indirect mode as detection elements. The design and construction of the instrumentation required to operate two such arrays, manufactured by Reticon and Hamamatsu, is also reported in this chapter. Experiments on the electrical and optical performance of both of these arrays are described in chapter 7. The results of these experiments indicate that the Hamamatsu device is the more suitable for detection of EELS spectra. Chapter 8 contains experimental results on the evaluation of various scintillator screens laid on fibre-optic plates directly coupled to the fibre-optic input window of the Hamamatsu array. The most suitable of the scintillators tested was a screen made from a single crystal of yttrium aluminium garnet (YAG) polished down to a thickness of 30 µm. The detective quantum efficiency of a prototype detector consisting of the Hamamatsu photodiode array fibre-optically coupled to such a screen is shown to be greater than 0.25 for a range of input electron doses varying from 40 electrons / channel-second to greater than 10
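    The first-order flavour of the matrix transfer method used in chapters 2-5 can be illustrated with standard textbook matrices for drift spaces and quadrupole lenses; the lens strengths and lengths below are arbitrary assumptions, and none of the thesis's second-order sector or aberration coefficients are reproduced here.

```python
# Hedged illustration: first-order transfer matrices for one transverse plane.
import numpy as np

def drift(L):
    """Field-free drift of length L (metres)."""
    return np.array([[1.0, L],
                     [0.0, 1.0]])

def quad_focusing(k, L):
    """Quadrupole of strength k (m^-2) and length L, focusing plane."""
    w = np.sqrt(k)
    return np.array([[np.cos(w * L),      np.sin(w * L) / w],
                     [-w * np.sin(w * L), np.cos(w * L)]])

def quad_defocusing(k, L):
    """The same quadrupole seen in its defocusing plane."""
    w = np.sqrt(k)
    return np.array([[np.cosh(w * L),     np.sinh(w * L) / w],
                     [w * np.sinh(w * L), np.cosh(w * L)]])

# Matrices act right-to-left, so the element traversed first appears rightmost.
system = drift(0.30) @ quad_focusing(25.0, 0.05) @ drift(0.10)
ray_in = np.array([1e-3, 0.0])     # ray entering 1 mm off axis, parallel to it
print(system @ ray_in)             # exit position (m) and angle (rad)
```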