22 research outputs found

    Visual Analysis Algorithms for Embedded Systems

    Visual search systems are very popular applications, but on-line versions in 3G wireless environments suffer from network constraints, such as unstable or limited bandwidth, that introduce latency in query delivery and significantly degrade the user's experience. An alternative is to exploit the ability of the newest mobile devices to perform heterogeneous activities: not only capturing images but also processing them. Visual feature extraction and compression can be performed on on-board Graphics Processing Units (GPUs), making smartphones capable of exact matching of a generic object or of performing a classification task. The latest trends in visual search have resulted in dedicated standardization efforts in MPEG, namely the MPEG CDVS (Compact Descriptors for Visual Search) standard, an ISO/IEC standard that specifies the extraction of a compressed descriptor. As regards classification, neural networks have gained considerable importance in recent years and have been applied to several domains. This thesis focuses on the use of deep neural networks to classify images by means of deep learning. Implementing visual search algorithms and deep learning-based classification in embedded environments is not a mere code-porting activity. Recent embedded devices, such as development boards with GPGPUs, are powerful but offer a limited amount of resources. GPU architectures fit particularly well because they allow many operations to be executed in parallel, following the SIMD (Single Instruction, Multiple Data) paradigm. Nonetheless, good design choices are necessary to make the best use of the available hardware and memory. For visual search, following the MPEG CDVS standard, the contribution of this thesis is an efficient feature computation phase and a parallel CDVS detector, completely implemented on embedded devices supporting the OpenCL framework.
    Algorithmic choices and implementation details targeting the intrinsic characteristics of the selected embedded platforms are presented and discussed. Experimental results on several GPUs show that the GPU-based solution is up to 7× faster than the CPU-based one. This speed-up opens new visual search scenarios that exploit fully on-board, real-time computation with no data transfer. As regards the use of deep convolutional neural networks for off-line image classification, their computational and memory requirements are huge, which is an issue on embedded devices. Most of the complexity derives from the convolutional layers, and in particular from the matrix multiplications they entail. The contribution of this thesis is a self-contained implementation for image classification providing the common layers used in neural networks. The approach relies on a heterogeneous CPU-GPU scheme for performing convolutions in the transform domain. Experimental results show that the heterogeneous scheme described in this thesis achieves a 50× speedup over the CPU-only reference and outperforms a GPU-based reference by 2×, while reducing power consumption by nearly 30%.
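The transform-domain convolution at the heart of the heterogeneous scheme can be illustrated with a minimal NumPy sketch; the function below is an illustration of the general technique, not the thesis implementation:

```python
import numpy as np

def conv2d_fft(image, kernel):
    """Full 2-D linear convolution computed in the frequency domain.

    Pointwise multiplication of the zero-padded spectra replaces the
    sliding-window sum of the spatial definition with two forward FFTs
    and one inverse FFT, which maps well onto GPU-style parallelism.
    """
    out_h = image.shape[0] + kernel.shape[0] - 1
    out_w = image.shape[1] + kernel.shape[1] - 1
    # Convolution in space == multiplication in frequency, with padding
    # to the full output size to avoid circular wrap-around.
    spectrum = (np.fft.rfft2(image, (out_h, out_w)) *
                np.fft.rfft2(kernel, (out_h, out_w)))
    return np.fft.irfft2(spectrum, (out_h, out_w))
```

For a K×K kernel over an N×N image this trades the O(N²K²) multiply-adds of direct convolution for FFTs at O(N² log N), which is why transform-domain schemes pay off as convolutional layers grow.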

    HPatches: A benchmark and evaluation of handcrafted and learned local descriptors

    In this paper, we propose a novel benchmark for evaluating local image descriptors. We demonstrate that the existing datasets and evaluation protocols do not specify unambiguously all aspects of evaluation, leading to ambiguities and inconsistencies in the results reported in the literature. Furthermore, these datasets are nearly saturated due to the recent improvements in local descriptors obtained by learning them from large annotated datasets. Therefore, we introduce a new large dataset suitable for training and testing modern descriptors, together with strictly defined evaluation protocols for several tasks, such as matching, retrieval and classification. This allows for more realistic, and thus more reliable, comparisons in different application scenarios. We evaluate the performance of several state-of-the-art descriptors and analyse their properties. We show that a simple normalisation of traditional hand-crafted descriptors can boost their performance to the level of deep learning-based descriptors within a realistic benchmark evaluation.
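The abstract does not name the normalisation it refers to; one widely used transform of this kind for hand-crafted descriptors is RootSIFT (L1-normalise, then take the elementwise square root), sketched here in NumPy purely as an illustration of how cheap such a boost can be:

```python
import numpy as np

def rootsift(descriptors, eps=1e-12):
    """RootSIFT-style normalisation of non-negative descriptors.

    L1-normalise each row, then take the elementwise square root.
    The result is automatically L2-normalised, and Euclidean distance
    on it equals the Hellinger distance on the original descriptors.
    """
    d = np.asarray(descriptors, dtype=np.float64)
    d = d / (np.abs(d).sum(axis=-1, keepdims=True) + eps)
    return np.sqrt(d)
```

The appeal is that this is a drop-in change: downstream matching code keeps using plain Euclidean distance, only the stored descriptors change.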

    Seamless Positioning and Navigation in Urban Environment

    The abstract is provided in the attached document.

    Robust Content Identification and De-Duplication with Scalable Fisher Vector In video with Temporal Sampling

    Title from PDF of title page, viewed August 29, 2017. Thesis advisor: Zhu Li. Vita. Includes bibliographical references (pages 41-43). Thesis (M.S.), School of Computing and Engineering, University of Missouri--Kansas City, 2017.
    Robust content identification and de-duplication of video content in networks and caches have many important applications in content delivery networks. In this work, we propose a scalable hashing scheme based on Fisher Vector aggregation of selected key point features, together with a non-uniform temporal sampling scheme over the video segments driven by a frame significance function, to create a very compact binary representation of content fragments that is agnostic to typical coding and transcoding variations. The key innovations are a key point repeatability model that selects the best key point features, a non-uniform sampling scheme that significantly reduces the bits required to represent a segment, and scalability obtained from PCA feature dimension reduction and Fisher Vector features. Simulations with video content at various frame sizes and bit rates for DASH streaming show that the proposed solution achieves very good precision-recall performance, reaching 100% precision in duplication detection with recall at 98% and above.
    Contents: Introduction -- Software description -- Image processing -- SIFT feature extraction -- Principal component analysis -- Fisher vector aggregation -- Simulation results and discussions -- Conclusion and future work -- Appendix
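Of the building blocks listed above, the PCA dimension reduction applied to the key point features before Fisher Vector aggregation is the simplest to sketch; this is a generic SVD-based PCA, not the thesis code:

```python
import numpy as np

def pca_reduce(features, k):
    """Project row-wise feature vectors onto their top-k principal axes.

    Returns the reduced features plus the components and mean needed to
    apply the same projection to new descriptors at query time.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Right singular vectors of the centered data matrix are the
    # principal axes, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T, vt[:k], mean
```

In a pipeline like the one described, the components and mean are fitted once on training descriptors and reused to project every query descriptor before aggregation.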

    A combined multiple action recognition and summarization for surveillance video sequences

    Human action recognition and video summarization represent challenging tasks for several computer vision applications, including video surveillance, criminal investigations, and sports applications. For long videos, it is difficult to search within a video for a specific action and/or person. Human action recognition approaches presented in the literature usually deal with videos that contain only a single person, whose action they are able to recognize. This paper proposes an effective approach to multiple human action detection, recognition, and summarization. The multiple action detection stage extracts human body silhouettes and then generates a specific sequence for each of them using a motion detection and tracking method. Each extracted sequence is then divided into shots that represent homogeneous actions, using the similarity between each pair of frames. Using the histogram of oriented gradients (HOG) of the Temporal Difference Map (TDMap) of the frames of each shot, we recognize the action by comparing the generated HOG against the HOGs computed in the training phase, which represent the actions in a set of training videos. We also recognize the action from the TDMap images using a proposed CNN model. Action summarization is then performed for each detected person. The efficiency of the proposed approach is shown through the results obtained for multi-action detection and recognition.
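The abstract does not detail how the Temporal Difference Map is built; a plausible minimal sketch, offered only as an assumption about the general idea, accumulates absolute inter-frame differences over a shot before HOG extraction:

```python
import numpy as np

def temporal_difference_map(frames):
    """Accumulate absolute differences between consecutive frames.

    A plausible sketch of a Temporal Difference Map (TDMap): pixels that
    move accumulate large values, static background stays near zero, so
    the map summarises where motion happened across the whole shot.
    """
    frames = np.asarray(frames, dtype=np.float64)
    diffs = np.abs(np.diff(frames, axis=0))  # (T-1, H, W) per-step motion
    return diffs.sum(axis=0)                  # (H, W) accumulated motion
```

A HOG descriptor computed on this single map then characterises the motion pattern of the shot with one fixed-length vector, which is what makes the nearest-HOG comparison against the training set cheap.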

    Movement correction in DCE-MRI through windowed and reconstruction dynamic mode decomposition

    Images of the kidneys acquired with dynamic contrast enhanced magnetic resonance renography (DCE-MRR) contain unwanted complex organ motion due to respiration. This gives rise to motion artefacts that hinder the clinical assessment of kidney function. Moreover, due to the rapid change in contrast agent within the DCE-MR image sequence, commonly used intensity-based image registration techniques are likely to fail. While semi-automated approaches involving human experts are a possible alternative, they pose significant drawbacks, including inter-observer variability and the bottleneck introduced by manual inspection of the multiplicity of images produced during a DCE-MRR study. To address this issue, we present a novel automated, registration-free movement correction approach based on windowed and reconstruction variants of dynamic mode decomposition (WR-DMD). Our proposed method is validated on kidney DCE-MRI data sets from ten different healthy volunteers. The results, using block-matching evaluation on the image sequence produced by WR-DMD, show the elimination of 99% of the mean motion magnitude when compared to the original data sets, thereby demonstrating the viability of automatic movement correction using WR-DMD.
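The core of WR-DMD, exact dynamic mode decomposition of a snapshot matrix, fits in a few lines of NumPy; the windowing and reconstruction variants add bookkeeping on top of this step, which is sketched here generically rather than as the paper's code:

```python
import numpy as np

def dmd(X, r):
    """Exact dynamic mode decomposition of a snapshot matrix X.

    Columns of X are successive snapshots (e.g. vectorised image frames).
    Fits a rank-r linear operator A with X[:, k+1] ~= A @ X[:, k] and
    returns its eigenvalues (temporal dynamics) and modes (spatial maps).
    """
    X1, X2 = X[:, :-1], X[:, 1:]
    U, s, Vt = np.linalg.svd(X1, full_matrices=False)
    U, s, Vt = U[:, :r], s[:r], Vt[:r]
    # Reduced-order operator: A_tilde = U* X2 V diag(1/s).
    A_tilde = (U.conj().T @ X2 @ Vt.conj().T) / s
    eigvals, W = np.linalg.eig(A_tilde)
    # Exact DMD modes: Phi = X2 V diag(1/s) W.
    modes = X2 @ Vt.conj().T @ (W / s[:, None])
    return eigvals, modes
```

Separating the sequence into such modes is what lets a DMD-based method isolate slowly varying contrast dynamics from quasi-periodic respiratory motion without explicit registration.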

    Aerodynamic Analysis and Optimization of Gliding Locust Wing Using Nash Genetic Algorithm

    Natural fliers glide and minimize wing articulation to conserve energy for enduring, long-range flights. Elucidating the underlying physiology of such a capability could potentially address numerous challenging problems in flight engineering. This study investigates the aerodynamic characteristics of an insect species, the desert locust (Schistocerca gregaria), with extraordinary gliding skills at low Reynolds number. Here, locust tandem wings are subjected to a computational fluid dynamics (CFD) simulation using 2D and 3D Navier-Stokes equations, revealing fore-hindwing interactions and the influence of their corrugations on aerodynamic performance. Furthermore, the obtained CFD results are mathematically parameterized using the PARSEC method and optimized based on a novel fusion of genetic algorithms and Nash game theory, with the Nash equilibrium yielding the optimized wings. The lift-drag (gliding) ratios of the optimized profiles were improved by at least 77% and 150% compared to the original wing and the published literature, respectively. Ultimately, the profiles are integrated and analyzed using 3D CFD simulations, which demonstrated a 14% performance improvement, validating the proposed wing models for the fabrication and rapid prototyping presented in a future study.
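The Nash-game structure of such an optimization can be illustrated with a toy alternating best-response loop: each player (e.g. one per wing) improves only its own variable while the opponent's is frozen, until neither can improve, which is the Nash equilibrium. The cost functions below are hypothetical, and plain random search stands in for the per-player genetic algorithm:

```python
import numpy as np

def nash_best_response(f1, f2, x0, y0, iters=50, samples=200, sigma=0.5, seed=0):
    """Alternating best-response search for a two-player Nash equilibrium.

    Player 1 minimises f1 over x with y frozen; player 2 minimises f2
    over y with x frozen. Random search around the current point stands
    in for each player's genetic algorithm.
    """
    rng = np.random.default_rng(seed)
    x, y = x0, y0
    for _ in range(iters):
        cand = x + sigma * rng.standard_normal(samples)  # player 1's move
        x = cand[np.argmin(f1(cand, y))]
        cand = y + sigma * rng.standard_normal(samples)  # player 2's move
        y = cand[np.argmin(f2(x, cand))]
    return x, y
```

At the fixed point neither player can lower its own cost unilaterally, which is exactly the equilibrium property the wing-optimization scheme exploits.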