
    Computer Vision from Spatial-Multiplexing Cameras at Low Measurement Rates

    In applications such as UAVs and parking lots, it is typical to first collect an enormous number of pixels using conventional imagers, then apply expensive compression methods that throw away redundant data before transmitting the compressed data to a ground station. The past decade has seen the emergence of novel imagers called spatial-multiplexing cameras, which offer compression at the sensing level itself by providing arbitrary linear measurements of the scene instead of pixel-based sampling. In this dissertation, I discuss various approaches for effective information extraction from spatial-multiplexing measurements and present the trade-offs between reliability of performance and the computational/storage load of the system. In the first part, I present a reconstruction-free approach to high-level inference in computer vision: considering the specific case of activity analysis, I show that correlation filters enable effective action recognition and localization directly from a class of spatial-multiplexing cameras called compressive cameras, even at measurement rates as low as 1%. In the second part, I outline a deep-learning-based, non-iterative, real-time algorithm to reconstruct images from compressively sensed (CS) measurements, which can outperform traditional iterative CS reconstruction algorithms in both reconstruction quality and time complexity, especially at low measurement rates. Compressive cameras operate with random measurements and are not tuned to any particular task; to overcome this limitation, in the third part of the dissertation I propose a method to design spatial-multiplexing measurements that facilitate easy extraction of features useful in computer vision tasks such as object tracking.
The work presented in the dissertation provides sufficient evidence that high-level inference in computer vision is feasible at extremely low measurement rates, and hence allows us to consider revamping present-day computer systems.
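The compressive-camera model described above replaces pixel-wise sampling with a small number of linear measurements. A minimal NumPy sketch of that measurement model, with hypothetical scene size and the 1% rate mentioned in the abstract (the random Gaussian measurement matrix is an illustrative assumption, not the dissertation's design):

```python
import numpy as np

# Illustrative sketch (not the dissertation's code): a spatial-multiplexing
# camera records linear measurements y = Phi @ x instead of the pixels x.
rng = np.random.default_rng(0)

n = 64 * 64                      # pixels in a hypothetical 64x64 scene
measurement_rate = 0.01          # 1% measurement rate, as in the abstract
m = int(n * measurement_rate)    # number of compressive measurements

x = rng.random(n)                # vectorized scene (placeholder data)
Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # random measurement matrix

y = Phi @ x                      # all the camera records: m values, not n pixels
print(m, y.shape)                # 40 measurements instead of 4096 pixels
```

Inference "directly from measurements" then means operating on `y` (e.g. with correlation filters) without ever reconstructing `x`.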

    Converting Optical Videos to Infrared Videos Using Attention GAN and Its Impact on Target Detection and Classification Performance

    To apply powerful deep-learning-based algorithms for object detection and classification in infrared (IR) videos, more training data are needed to build high-performance models. However, in many surveillance applications, far more optical videos are available than infrared videos. This lack of IR video datasets can be mitigated if optical-to-infrared video conversion is possible. In this paper, we present a new deep-learning approach for converting optical videos to infrared videos. The basic idea is to focus on target areas using an attention generative adversarial network (attention GAN), which preserves the fidelity of target areas. The approach does not require paired images. The performance of the proposed attention GAN has been demonstrated using objective and subjective evaluations. Most importantly, the impact of attention GAN has been demonstrated through improved target detection and classification performance on real infrared videos.

    Model-Based Acquisition for Compressive Sensing & Imaging

    Compressive sensing (CS) is a novel imaging technology based on the inherent redundancy of natural scenes. The minimum number of required measurements, which defines the maximum image compression rate, is lower-bounded by the sparsity of the image but depends on the type of acquisition patterns employed. Increasing the number of measurements taken by the Rice single-pixel camera (SPC) slows down the acquisition process, which may make image recovery more susceptible to background noise and thus limit CS's application to imaging, detecting, or classifying moving targets. In this study, two methods, hybrid-subspace sparse sampling (HSS) for imaging and secant projection on a manifold for classification, are applied to this problem. For the HSS method, new image patterns are designed via robust principal component analysis (rPCA) on prior knowledge from a library of images to sense a common structure. After measuring coarse-scale commonalities, the residual image becomes sparser, so fewer measurements are needed. For the secant-projection case, patterns that preserve the pairwise distances between data points, based on manifold learning, are designed via semidefinite programming. These secant patterns turn out to be better for object classification than those learned from PCA. Both methods considerably decrease the number of measurements required for each task when compared with the purely random patterns of a more universal CS imaging system.
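The secant-preservation criterion mentioned above can be made concrete with a small sketch: form the normalized pairwise differences ("secants") of a dataset and measure how much a projection distorts their lengths. This is only the evaluation step, with a random Gaussian baseline standing in for the paper's SDP-designed patterns; all sizes are hypothetical.

```python
import numpy as np

# Hedged sketch of the secant-preservation idea (not the paper's SDP solver):
# good measurement patterns P should keep projected secants close to unit length.
rng = np.random.default_rng(1)

n, d, m = 20, 100, 30            # points, ambient dim, measurement count (hypothetical)
X = rng.standard_normal((n, d))  # placeholder dataset

# Unit-norm secants between all pairs of points.
secants = np.array([X[i] - X[j] for i in range(n) for j in range(i + 1, n)])
secants /= np.linalg.norm(secants, axis=1, keepdims=True)

P = rng.standard_normal((m, d)) / np.sqrt(m)   # random patterns as a baseline

# Distortion: how far each projected secant's length is from 1.
lengths = np.linalg.norm(secants @ P.T, axis=1)
worst = np.max(np.abs(lengths - 1.0))
print(worst)  # SDP-designed secant patterns aim to make this small
```

The semidefinite program in the paper would choose `P` to minimize this worst-case distortion rather than relying on random concentration.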

    Remote Sensing and Geovisualization of Rock Slopes and Landslides

    Over the past two decades, advances in remote sensing methods and technology have enabled larger and more sophisticated datasets to be collected. Because of these advances, the need to effectively and efficiently communicate and visualize data is becoming increasingly important. We demonstrate that the use of mixed-reality (MR) and virtual-reality (VR) systems has produced very promising results, allowing the visualization of complex datasets with unprecedented levels of detail and user experience. However, as of today, such visualization techniques have been used largely for communication purposes, and few applications have been developed to allow for data processing and collection, particularly within the engineering-geology field. In this paper, we demonstrate the potential use of MR and VR not only for the visualization of multi-sensor remote sensing data but also for the collection and analysis of geological data. We present a conceptual workflow showing the approach used for processing remote sensing datasets and the subsequent visualization using MR and VR headsets. We demonstrate the use of computer applications built in-house to visualize datasets and numerical modelling results, and to perform rock core logging (XRCoreShack) and rock mass characterization (EasyMineXR). While important limitations still exist in terms of hardware capabilities, portability, and accessibility, the expected technological advances and cost reduction should make this technology a standard mapping and data analysis tool for future engineers and geoscientists.

    VGC 2023 - Unveiling the dynamic Earth with digital methods: 5th Virtual Geoscience Conference: Book of Abstracts

    Conference proceedings of the 5th Virtual Geoscience Conference (VGC 2023), 21-22 September 2023, held in Dresden. The VGC is a multidisciplinary forum for researchers in geoscience, geomatics and related disciplines to share their latest developments and applications. Contents: Short Courses; Workshop Streams 1-3; Session 1 – Point Cloud Processing: Workflows, Geometry & Semantics; Session 2 – Visualisation, Communication & Teaching; Session 3 – Applying Machine Learning in Geosciences; Session 4 – Digital Outcrop Characterisation & Analysis; Session 5 – Airborne & Remote Mapping; Session 6 – Recent Developments in Geomorphic Process and Hazard Monitoring; Session 7 – Applications in Hydrology & Ecology; Poster Contributions.

    Machine Learning in Sensors and Imaging

    Machine learning is extending its applications in various fields, such as image processing, the Internet of Things, user interfaces, big data, manufacturing, and management. Because data are required to build machine learning networks, sensors are one of the most important enabling technologies. In addition, machine learning networks can contribute to improved sensor performance and the creation of new sensor applications. This Special Issue addresses all types of machine learning applications related to sensors and imaging. It covers computer-vision-based control, activity recognition, fuzzy label classification, failure classification, motor temperature estimation, camera calibration for intelligent vehicles, error detection, color prior models, compressive sensing, wildfire risk assessment, shelf auditing, forest-growing stem volume estimation, road management, image denoising, and touchscreens.

    Application of mixed and virtual reality in geoscience and engineering geology

    Visual learning and efficient communication in mining and geotechnical practice are crucial, yet often challenging. With the advancement of Virtual Reality (VR) and Mixed Reality (MR), a new era of geovisualization has emerged. This thesis demonstrates the capabilities of a virtual continuum approach across varying scales of geoscience applications. An application that aids small-scale geological investigation was constructed using a 3D holographic drill core model. A virtual core logger was also developed to assist logging in the field and subsequent communication by visualizing the core in a complementary holographic environment. Enriched logging practices enhance interpretation, with potential economic and safety benefits for mining and geotechnical infrastructure projects. A mine-scale model of the LKAB mine in Sweden was developed to improve communication about mining-induced subsidence between geologists, engineers and the public. GPS, InSAR and micro-seismicity data were hosted in a single database, which was geovisualized through Virtual and Mixed Reality. The wide array of applications presented in this thesis illustrates the potential of Mixed and Virtual Reality and the improvements gained over conventional geological and geotechnical data collection, interpretation and communication at all scales, from the micro-scale (e.g. thin section) to the macro-scale (e.g. mine).

    Object detection in dual-band infrared

    Dual-Band Infrared (DBIR) offers the advantage of combining Mid-Wave Infrared (MWIR) and Long-Wave Infrared (LWIR) within a single field-of-view (FoV), providing additional information for each spectral band. DBIR camera systems find applications in both military and civilian contexts. This work introduces a novel labeled DBIR dataset that includes civilian vehicles, aircraft, birds, and people. The dataset is designed for use in object detection and tracking algorithms. It comprises 233 objects with tracks spanning up to 1,300 frames, encompassing images in both MW and LW. This research reviews pertinent literature on object detection, object detection in the infrared spectrum, and data fusion. Two sets of experiments were conducted using this DBIR dataset: motion detection and CNN-based object detection. For motion detection, a parallel implementation of the Visual Background Extractor (ViBe) was developed, employing connected-components analysis to generate bounding boxes. To assess these bounding boxes, Intersection-over-Union (IoU) calculations were performed. The results demonstrate that DBIR enhances the IoU of bounding boxes in 6.11% of cases within sequences where the camera's field of view remains stationary. A size analysis reveals ViBe's effectiveness in detecting small and dim objects in this dataset. A subsequent experiment employed You Only Look Once (YOLO) versions 4 and 7 to run inference on this dataset after image preprocessing. The inference models were trained on visible-spectrum MS COCO data. The findings confirm that YOLOv4/7 effectively detect objects in the infrared spectrum in this dataset. An assessment of these CNNs' performance relative to the size of the detected object highlights the significance of object size in detection capability.
Notably, DBIR substantially enhances detection capabilities in both YOLOv4 and YOLOv7; however, in the latter case, the number of false-positive detections increases. Consequently, while DBIR improves the recall of YOLOv4/7, the introduction of DBIR information reduces the precision of YOLOv7. This study also demonstrates the complementary nature of ViBe and YOLO in their detection capabilities based on object size in this dataset: ViBe excels at detecting small, distant objects, while YOLO excels at detecting larger, closer objects. Though such combinations are known prior art, an approach using the two detectors in a hybridized configuration is discussed. The research underscores that DBIR offers multiple advantages over MW or LW alone in modern computer vision algorithms, warranting further research investment.
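The IoU evaluation described above compares a detected bounding box against a labeled one. A minimal sketch of that metric for axis-aligned boxes; the `(x1, y1, x2, y2)` corner format is an assumption, not necessarily the dataset's annotation format:

```python
# Minimal Intersection-over-Union sketch for axis-aligned boxes given as
# (x1, y1, x2, y2) corners; box format is an illustrative assumption.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])  # intersection top-left
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])  # intersection bottom-right
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.1429
```

A statement like "DBIR enhances the IoU in 6.11% of cases" then means the DBIR-derived box scored a higher value of this metric than the single-band box for that fraction of evaluated detections.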