
    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, used to propose regions of interest in which to find objects, and recursive Bayesian filtering, used to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves the recall and the F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction (58.8%) of the object categorization entropy when compared to a two-stage video object detection method used as baseline, at the cost of a small time overhead (120 ms) and some precision loss (0.92).
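The homography-based region proposal described above can be sketched in a few lines: warping the corners of a previous detection through the inter-frame homography yields a region of interest in the next frame. This is a minimal illustration under assumed conventions (a 3×3 homography `H`, boxes as `(x1, y1, x2, y2)`), not the authors' implementation:

```python
import numpy as np

def propagate_box(box, H):
    """Warp an axis-aligned box (x1, y1, x2, y2) through a planar
    homography H (3x3) and return the axis-aligned bounding box of the
    warped corners -- a region of interest for the next frame."""
    x1, y1, x2, y2 = box
    corners = np.array([[x1, y1], [x2, y1], [x2, y2], [x1, y2]], dtype=float)
    # Homogeneous coordinates: p' ~ H @ p, followed by perspective divide
    pts = np.hstack([corners, np.ones((4, 1))]) @ H.T
    pts = pts[:, :2] / pts[:, 2:3]
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return (x_min, y_min, x_max, y_max)

# A pure-translation homography: camera motion shifts the image 5 px right, 3 px down
H = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])
roi = propagate_box((10, 20, 50, 60), H)
```

In a real system `H` would be estimated from the known camera motion, and the detector would then be run only inside the proposed region before the Bayesian update.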

    Development of Deep Learning Techniques for Image Retrieval

    Images are used in many real-world applications, ranging from personal photo repositories to medical imaging systems. Image retrieval is a process in which the images in a database are first ranked in terms of their similarity to a query image, after which a certain number of the most similar images are retrieved from the ranked list. The performance of an image retrieval algorithm is measured in terms of mean average precision. There are numerous applications of image retrieval. For example, face retrieval can help identify a person for security purposes, medical image retrieval can help doctors make more informed medical diagnoses, and commodity image retrieval can help customers find desired commodities. In recent years, image retrieval has gained more popularity in view of the emergence of large-capacity storage devices and the availability of low-cost image acquisition equipment. On the other hand, with the size and diversity of image databases continuously growing, the task of image retrieval has become increasingly complex. Recent image retrieval techniques have focused on deep learning because of its exceptional feature extraction capability. However, deep image retrieval networks often employ very complex architectures to achieve a desired performance, thus limiting their practicability in applications with limited storage and power capacity. The objective of this thesis is to design high-performance, low-complexity deep networks for the task of image retrieval. This objective is achieved by developing three different low-complexity strategies for generating rich sets of discriminating features. Spatial information contained in images is crucial for providing detailed information about the positioning and interrelation of various elements within an image, and thus it plays an important role in distinguishing different images.
As a result, designing a network to extract features that characterize this spatial information within an image is beneficial for the task of image retrieval. In light of the importance of spatial information, in our first strategy, we develop two deep convolutional neural networks capable of extracting features with a focus on spatial information. In the design of the first network, multi-scale dilated convolution operations are used to extract spatial information, whereas in the design of the second network, fusion of feature maps obtained from different hierarchical levels is employed to extract spatial information. Textural, structural, and edge information is very important for distinguishing images, and therefore a network capable of extracting features characterizing this type of information could be very useful for the task of image retrieval. Hence, in our second strategy, we develop a deep convolutional neural network that is guided to extract the textural, structural, and edge information contained in an image. Since morphological operations process the texture and structure of the objects within an image based on their geometrical properties, and edges are fundamental features of an image, we use morphological operations to guide the network in extracting textural and structural information, and a novel pooling operation for extracting the edge information in an image. Most researchers in the area of image retrieval have focused on developing algorithms that yield good retrieval performance at low computational complexity by outputting a list of a certain number of images ranked in decreasing order of similarity to the query image. However, other researchers have instead sought to improve the results of an existing image retrieval algorithm through a re-ranking technique.
A re-ranking scheme for image retrieval takes the list of images retrieved by an image retrieval algorithm and re-ranks them so that the re-ranked list at the output of the scheme has a mean average precision value higher than that of the originally retrieved list. A re-ranking scheme is an overhead to the process of image retrieval, and therefore its complexity should be as small as possible. Most re-ranking schemes in the literature aim to boost retrieval performance at the expense of a very high computational complexity. Therefore, in our third strategy, we develop a computationally efficient re-ranking scheme for image retrieval whose performance is superior to that of existing re-ranking schemes. Since image hashing offers the dual benefits of computational efficiency and the ability to generate versatile image representations, we adopt it in the proposed re-ranking scheme. Extensive experiments are performed in this thesis, using benchmark datasets, to demonstrate the effectiveness of the proposed new strategies in designing low-complexity deep networks for image retrieval.
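Since mean average precision (mAP) is the figure of merit used throughout the abstract, a minimal sketch of its computation may help; the function names and toy relevance lists are illustrative, not from the thesis:

```python
def average_precision(ranked_relevance):
    """Average precision of one ranked list: the mean of precision@k
    taken over the positions k at which a relevant image appears."""
    hits, precisions = 0, []
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(all_rankings):
    """mAP: average precision averaged over all query images."""
    return sum(average_precision(r) for r in all_rankings) / len(all_rankings)

# Two toy queries: 1 marks a retrieved image relevant to the query, 0 irrelevant
queries = [[1, 0, 1, 0], [0, 1, 1, 0]]
score = mean_average_precision(queries)   # (5/6 + 7/12) / 2 = 17/24
```

A re-ranking scheme succeeds exactly when it permutes each list so that this score increases.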

    Auto Signature Verification Using Line Projection Features Combined With Different Classifiers and Selection Methods

    Signature verification plays an important role in the commercial, legal and financial fields. The signature continues to be one of the most preferred types of authentication for many documents such as checks, credit card transaction receipts, and other legal documents. In this study, we propose a system for validating handwritten bank check signatures to determine whether a signature is original or forged. The proposed system includes several steps, including improving the signature image quality, noise reduction, feature extraction, and analysis. The extracted features depend on the signature line and projection features. To verify signatures, different classification methods are used. The system is then trained with a set of signatures to demonstrate the validity of the proposed signature verification system. The experimental results show that the best accuracy of 100% was obtained by combining several classification methods.
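One common reading of "projection features" — not necessarily the exact features used in this paper — is the pair of horizontal and vertical projection profiles of the binarised signature, resampled to a fixed length so that signatures of different sizes yield comparable vectors:

```python
import numpy as np

def projection_features(binary_img, n_bins=16):
    """Concatenated horizontal and vertical projection profiles of a
    binarised signature image: ink-pixel counts per row and per column,
    resampled to n_bins each and normalised to sum to ~1."""
    rows = binary_img.sum(axis=1).astype(float)   # horizontal projection
    cols = binary_img.sum(axis=0).astype(float)   # vertical projection

    def resample(profile, n):
        # Linear interpolation onto n evenly spaced positions
        idx = np.linspace(0, len(profile) - 1, n)
        return np.interp(idx, np.arange(len(profile)), profile)

    feat = np.concatenate([resample(rows, n_bins), resample(cols, n_bins)])
    return feat / (feat.sum() + 1e-9)

# Toy 8x8 "signature": a single diagonal stroke
img = np.eye(8, dtype=int)
f = projection_features(img, n_bins=8)
```

The resulting fixed-length vector can then be fed to any of the classifiers the abstract mentions (e.g. nearest-neighbour, SVM) for the original-vs-forged decision.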

    Texture-boundary detection in real-time

    Boundary detection is an essential first step for many computer vision applications. In practice, boundary detection is difficult because most images contain texture. Texture-boundary detectors are normally complex, and so cannot run in real-time. On the other hand, the few texture-boundary detectors that do run in real-time leave much to be desired in terms of quality. This thesis proposes two real-time texture-boundary detectors – the Variance Ridge Detector and the Texton Ridge Detector – both of which can detect high-quality texture boundaries in real-time. The Variance Ridge Detector is able to run at 47 frames per second on 320 by 240 images, while scoring an F-measure of 0.62 (out of a theoretical maximum of 0.79) on the Berkeley segmentation dataset. The Texton Ridge Detector runs at 10 frames per second but produces slightly better results, with an F-measure score of 0.63. These objective measurements show that the two proposed texture-boundary detectors outperform all other texture-boundary detectors on either quality or speed. As boundary detection is so widely used, this development could lead to improvements in many real-time computer vision applications.
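The name "Variance Ridge Detector" suggests taking ridges of a local-variance map; the thesis's actual detector is more involved, but the underlying local-variance computation can be sketched with the standard E[x²] − E[x]² identity over box filters, which is cheap enough for real-time use:

```python
import numpy as np

def local_variance(img, radius=1):
    """Per-pixel variance over a (2*radius+1)^2 window, computed as
    E[x^2] - E[x]^2 with box filters built from integral images.
    High values mark intensity/texture boundaries; ridges of this map
    approximate boundary curves."""
    img = img.astype(float)
    k = 2 * radius + 1
    pad = np.pad(img, radius, mode="reflect")

    def box_mean(a):
        # Integral image with a leading zero row/column, then window sums
        ii = np.pad(a, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
        return (ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]) / (k * k)

    mean = box_mean(pad)
    mean_sq = box_mean(pad ** 2)
    return np.clip(mean_sq - mean ** 2, 0.0, None)   # clip float noise

# Two flat regions separated by a vertical step edge: variance peaks at the edge
img = np.zeros((8, 8))
img[:, 4:] = 10.0
v = local_variance(img, radius=1)
```

On this toy image the map is zero inside each flat region and large only in the two columns adjacent to the step, which is exactly the ridge a boundary detector would trace.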

    A Smart and Robust Automatic Inspection of Printed Labels Using an Image Hashing Technique

    This work is focused on the development of a smart and automatic inspection system for printed labels. This is a challenging problem to solve, since the collected labels are typically subjected to a variety of geometric and non-geometric distortions. Even though these distortions do not affect the content of a label, they have a substantial impact on the pixel values of the label image. Moreover, the faulty area may be extremely small compared to the overall size of the label. A further requirement is the ability to locate and isolate faults. To overcome these issues, a robust image hashing approach for the detection of erroneous labels has been developed. Image hashing techniques are generally used in image authentication, social event detection and image copy detection. Most image hashing methods are computationally expensive and also misjudge images that have been processed through geometric transformations. In this paper, we present a novel idea to detect faults in labels by incorporating image hashing along with traditional computer vision algorithms to reduce the processing time. Speeded Up Robust Features (SURF) are applied to acquire alignment parameters so that the scheme is resistant to geometric and other distortions. The statistical mean is employed to generate the hash value. Even though this feature is quite simple, it has been found to be extremely effective in terms of computational complexity and the precision with which faults are detected, as proven by the experimental findings. Experimental results show that the proposed technique achieved an accuracy of 90.12%.
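A block-mean hash of the kind hinted at ("the statistical mean is employed to generate the hash value") can be sketched as follows; the grid size and thresholding rule here are assumptions for illustration, not the paper's exact scheme, and the input is presumed already aligned (e.g. via SURF):

```python
import numpy as np

def block_mean_hash(img, grid=(8, 8)):
    """Perceptual hash from block means: split the aligned label image
    into a grid, take each block's mean intensity, and threshold against
    the global mean to get one bit per block."""
    h, w = img.shape
    gh, gw = grid
    img = img[: h - h % gh, : w - w % gw].astype(float)  # crop to divide evenly
    blocks = img.reshape(gh, img.shape[0] // gh, gw, img.shape[1] // gw)
    means = blocks.mean(axis=(1, 3))
    return (means > means.mean()).astype(np.uint8).ravel()

def hamming(h1, h2):
    """Bits that differ; small distance -> scanned label matches the
    reference, large distance -> a likely printing fault."""
    return int(np.count_nonzero(h1 != h2))

ref = np.arange(64, dtype=float).reshape(8, 8)   # stand-in reference label
scan = ref.copy()                                # an identical, fault-free scan
dist = hamming(block_mean_hash(ref), block_mean_hash(scan))
```

Comparing per-block bits rather than raw pixels is what makes the check cheap and tolerant of mild photometric variation, while a per-block comparison also localises which region of the label is faulty.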

    Iris Identification using Keypoint Descriptors and Geometric Hashing

    The iris is one of the most reliable biometric traits due to its stability and randomness. Conventional recognition systems transform the iris to polar coordinates and perform well for cooperative databases. However, the problem becomes considerably harder for recognizing non-cooperative irises. In addition, the transformation of the iris to the polar domain introduces an aliasing effect. In this thesis, the aforementioned issues are addressed by considering the Noise Independent Annular Iris for feature extraction. Global feature extraction approaches are unsuitable for the annular iris due to change in scale, as they cannot achieve invariance to transformation and illumination. On the contrary, local features are invariant to image scaling and rotation, and partially invariant to change in illumination and viewpoint. To extract local features, Harris corner points are detected from the iris and matched using a novel dual-stage approach. The Harris corner detector improves accuracy but fails to achieve scale invariance. Further, the Scale Invariant Feature Transform (SIFT) has been applied to the annular iris, and the results are found to be very promising. However, SIFT is computationally expensive for recognition due to its higher-dimensional descriptor. Thus, a recently evolved keypoint descriptor called Speeded Up Robust Features (SURF) is applied to mark a performance improvement in terms of time as well as accuracy. For identification, retrieval time plays a significant role in addition to accuracy. Traditional indexing approaches cannot be applied to biometrics as the data are unstructured. In this thesis, two novel approaches have been developed for indexing an iris database. In the first approach, an energy histogram of DCT coefficients is used to form a B-tree. This approach performs well for cooperative databases. In the second approach, indexing is done using geometric hashing of SIFT keypoints.
The latter indexing approach achieves invariance to similarity transformations, illumination and occlusion, and performs with an accuracy of more than 98% for cooperative as well as non-cooperative databases.
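Geometric hashing over keypoint locations can be sketched as follows. This minimal version uses only 2D point coordinates (ignoring the SIFT descriptors the thesis pairs with them) and a similarity-invariant basis frame; the bin step and voting rule are illustrative assumptions:

```python
import numpy as np
from collections import defaultdict
from itertools import permutations

def basis_coords(points, i, j):
    """Express all points in the similarity-invariant frame defined by
    the ordered basis pair (points[i], points[j]): origin at the pair's
    midpoint, x-axis along the pair, scaled by the pair's length."""
    p, q = points[i], points[j]
    origin = (p + q) / 2.0
    vx = q - p
    vy = np.array([-vx[1], vx[0]])            # perpendicular axis
    scale = np.dot(vx, vx)
    rel = points - origin
    return np.stack([rel @ vx, rel @ vy], axis=1) / scale

def build_table(points, step=0.25):
    """Offline stage: hash table mapping quantised invariant coordinates
    to the basis pairs under which some point lands there."""
    table = defaultdict(list)
    for i, j in permutations(range(len(points)), 2):
        for u, v in basis_coords(points, i, j):
            table[(round(u / step), round(v / step))].append((i, j))
    return table

def vote(table, query, step=0.25):
    """Query stage: under one query basis, count how many query points
    fall into bins stored for each model basis -- a crude match score."""
    votes = defaultdict(int)
    for u, v in basis_coords(query, 0, 1):
        for basis in table[(round(u / step), round(v / step))]:
            votes[basis] += 1
    return votes

model = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0], [1.0, 1.0]])
table = build_table(model)
# A scaled-and-translated copy of the model should vote for basis (0, 1)
votes = vote(table, model * 2.0 + 5.0)
```

Because the stored coordinates are invariant to translation and uniform scaling, the transformed query accumulates a full vote count for the correct basis, which is what makes the index robust to similarity transformations.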

    Report on shape analysis and matching and on semantic matching

    In GRAVITATE, two disparate specialities will come together in one working platform for the archaeologist: the fields of shape analysis and of metadata search. These fields are relatively disjoint at the moment, and the research and development challenge of GRAVITATE is precisely to merge them for our chosen tasks. As shown in chapter 7, the small amount of literature that already attempts to join 3D geometry and semantics is not related to the cultural heritage domain. Therefore, after the project is done, there should be a clear ‘before-GRAVITATE’ and ‘after-GRAVITATE’ split in how these two aspects of a cultural heritage artefact are treated. This state of the art report (SOTA) is ‘before-GRAVITATE’. Shape analysis and metadata description are described separately, as they currently stand in the literature, and we end the report with common recommendations in chapter 8 on possible or plausible cross-connections that suggest themselves. These considerations will be refined for the Roadmap for Research deliverable. Within the project, a jargon is developing in which ‘geometry’ stands for the physical properties of an artefact (not only its shape, but also its colour and material) and ‘metadata’ is used as a general shorthand for the semantic description of the provenance, location, ownership, classification, use etc. of the artefact. As we proceed in the project, we will find a need to refine those broad divisions and find intermediate classes (such as a semantic description of certain colour patterns), but for now the terminology is convenient – not least because it highlights the interesting area where both aspects meet. On the ‘geometry’ side, the GRAVITATE partners are UVA, Technion, and CNR/IMATI; on the metadata side, IT Innovation, the British Museum and the Cyprus Institute; the latter two of course also play the role of internal users and representatives of the Cultural Heritage (CH) data and target user group.
CNR/IMATI’s experience in shape analysis and similarity will be an important bridge between the two worlds of geometry and metadata. The authorship and styles of this SOTA reflect these specialisms: the first part (chapters 3 and 4) is purely by the geometry partners (mostly IMATI and UVA), and the second part (chapters 5 and 6) by the metadata partners, especially IT Innovation, while the joint overview on 3D geometry and semantics is mainly by IT Innovation and IMATI. The common section on Perspectives was written with the contribution of all.