131 research outputs found

    An Approach for Segmentation of Colored Images with Seeded Spatial Enhancement

    In image analysis, image segmentation is the operation that divides an image into a set of distinct segments. This work covers common color image segmentation techniques and methods. Image enhancement is performed using a four-connected approach to seed selection. An algorithm is implemented on the basis of manual seed selection: it selects a seed point in the image, checks the four neighboring pixels connected to that seed point, and segments the image into foreground and background. Finally, an evaluation criterion is introduced and applied to the algorithms' results. Five widely used image segmentation algorithms, namely efficient graph-based segmentation, k-means, mean shift, expectation maximization, and a hybrid method, are compared with the implemented algorithm
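
The seeded, four-connected procedure described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `grow_region`, the Euclidean RGB similarity test, and the `tolerance` parameter are all assumptions, since the abstract does not state the similarity criterion.

```python
import math
from collections import deque

def grow_region(image, seed, tolerance):
    """Grow a foreground region from a manually chosen seed pixel.

    image: 2D list of (R, G, B) tuples. seed: (row, col).
    A pixel joins the region if it is 4-connected to a region pixel and its
    color lies within `tolerance` (Euclidean RGB distance, an assumed
    criterion) of the seed color.
    """
    rows, cols = len(image), len(image[0])
    seed_color = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit only the four edge-adjacent neighbors (up, down, left, right).
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if (nr, nc) in region:
                continue
            if math.dist(image[nr][nc], seed_color) <= tolerance:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

RED, BLUE = (200, 0, 0), (0, 0, 200)
img = [
    [RED,  RED,  BLUE],
    [BLUE, RED,  BLUE],
    [RED,  BLUE, BLUE],
]
foreground = grow_region(img, (0, 0), 10.0)
```

Pixels in `foreground` form the foreground and all others the background. The red pixel at (2, 0) is left out because it touches the region only diagonally, which is the practical consequence of choosing four-connectivity over eight-connectivity.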

    Color Image Edge Detection and Segmentation: A Comparison of the Vector Angle and the Euclidean Distance Color Similarity Measures

    This work is based on Shafer's Dichromatic Reflection Model as applied to color image formation. The color spaces RGB, XYZ, CIELAB, CIELUV, rgb, l1l2l3, and the new h1h2h3 color space are discussed from this perspective. Two color similarity measures are studied: the Euclidean distance and the vector angle. The work in this thesis is motivated, from a practical point of view, by several shortcomings of current methods. The first is the inability of all known methods to properly segment objects from the background without interference from object shadows and highlights. The second is that the vector angle has not been examined as a distance measure capable of directly evaluating hue similarity without considering intensity, especially in RGB. Finally, there is inadequate research on combining hue- and intensity-based similarity measures to improve color similarity calculations, given the advantages of each color distance measure. These distance measures were used for two image understanding tasks: edge detection, and one strategy for color image segmentation, namely color clustering. Edge detection algorithms using the Euclidean distance and vector angle similarity measures, as well as their combinations, were examined. The algorithms comprise the modified Roberts operator, the Sobel operator, the Canny operator, the vector gradient operator, and the 3x3 difference vector operator. Pratt's Figure of Merit is used for a quantitative comparison of edge detection results. Color clustering was examined using the k-means (based on the Euclidean distance) and Mixture of Principal Components (based on the vector angle) algorithms. A new quantitative image segmentation evaluation procedure is introduced to assess the performance of both algorithms.
Quantitative and qualitative results on many color images (artificial, staged scenes and natural scene images) indicate good edge detection performance using a vector version of the Sobel operator on the h1h2h3 color space. The results using combined hue- and intensity-based difference measures show a slight improvement, qualitatively, over using each measure independently in RGB. Quantitative and qualitative results for image segmentation on the same set of images suggest that the best image segmentation results are obtained using the Mixture of Principal Components algorithm on the RGB, XYZ and rgb color spaces. Finally, poor color clustering results in the h1h2h3 color space suggest that some assumptions in deriving a simplified version of the Dichromatic Reflection Model might have been violated
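
To make the contrast between the two similarity measures concrete, the sketch below computes both for a pair of RGB vectors. The function names are illustrative, and the thesis may use a different function of the angle (for example its sine) as the actual measure; only the general idea is shown here.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two color vectors; sensitive to intensity."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def vector_angle(a, b):
    """Angle in radians between two color vectors; ignores overall intensity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("the angle is undefined for a zero (black) vector")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)

dark = (100, 50, 25)
bright = (200, 100, 50)   # same chromaticity, twice the intensity

d = euclidean_distance(dark, bright)
theta = vector_angle(dark, bright)
```

Here `d` is large (about 114.6) while `theta` is essentially zero, which illustrates why an angle-based measure can treat a shadowed and a brightly lit patch of the same surface as similar. The undefined angle for black pixels also hints at why a combination with an intensity-based measure is worth studying.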

    A New High-Speed Foreign Fiber Detection System with Machine Vision

    A new high-speed foreign fiber detection system based on machine vision is proposed for removing foreign fibers from raw cotton, using carefully chosen hardware components and appropriately designed algorithms. Starting from a specialized lens on a three charge-coupled device (3-CCD) camera, the system uses a digital signal processor (DSP) and a field-programmable gate array (FPGA) for image acquisition and processing under ultraviolet illumination, so as to identify transparent objects such as polyethylene and polypropylene fabric in the cotton tuft flow by virtue of the fluorescent effect, so that the foreign fibers can then be blown away safely by compressed air. An image segmentation algorithm based on the fast wavelet transform is proposed to identify block-like foreign fibers, and an improved Canny detector is developed to segment wire-like foreign fibers from raw cotton. The procedure also provides a color image segmentation method based on a region-growing algorithm for better adaptability. Experiments on a variety of images show that the proposed algorithms can effectively segment foreign fibers from test images under various circumstances

    On discovering and learning structure under limited supervision

    Shapes, surfaces, events, and objects (living and non-living) constitute the world. The intelligence of natural agents, such as humans, goes beyond pattern recognition. We excel at building representations and distilling knowledge to understand and infer the structure of the world. Critically, the development of such reasoning capabilities can occur even with limited supervision. On the other hand, despite its phenomenal development, the major successes of machine learning, in particular deep learning models, are primarily in tasks that have access to large annotated datasets. In this dissertation, we propose novel solutions to help address this gap by enabling machine learning models to learn structure and reason effectively in weakly supervised settings. The recurring theme of the thesis revolves around the question "How can a perceptual system learn to organize sensory information into useful knowledge under limited supervision?" It discusses the themes of geometry, composition, and associations in four separate articles with applications to computer vision (CV) and reinforcement learning (RL). Our first contribution, Pix2Shape, presents an analysis-by-synthesis approach (also referred to as inverse graphics) for perception. Pix2Shape leverages probabilistic generative models to learn 3D-aware representations from single 2D images. The resulting formalism allows us to perform novel view synthesis of a scene and produce powerful representations of images. We achieve this by augmenting unsupervised learning with physically based inductive biases to decompose a scene structure into geometry, pose, reflectance and lighting. Our second contribution, MILe, addresses the ambiguity issues in single-labeled datasets such as ImageNet.
It is often inappropriate to describe an image with a single label when it is composed of more than one prominent object. We show that integrating ideas from the cognitive linguistics literature and imposing appropriate inductive biases helps in distilling multiple possible descriptions from such weakly labeled datasets. Next, moving into the RL setting, we consider an agent interacting with its environment without a reward signal. Our third contribution, HaC, is a curiosity-based unsupervised approach to learning associations between visual and tactile modalities. This helps the agent explore the environment in a self-guided fashion and then use this knowledge to adapt to downstream tasks. In the absence of reward supervision, intrinsic motivation is useful for generating meaningful behavior in a self-supervised manner. In our final contribution, we address the representation learning bottleneck in unsupervised RL agents, which has a detrimental effect on performance with high-dimensional pixel-based inputs. Our model-based approach combines reward-free exploration and planning to efficiently fine-tune unsupervised pre-trained models, achieving results comparable to task-specific baselines. This is a step towards building agents that can generalize quickly to more than a single task using image inputs alone

    3D Morphable Face Models -- Past, Present and Future

    In this paper, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely capture, modeling, image formation, and image analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing directions for future research and highlighting the broad range of current and future applications

    Real-time landing place assessment in man-made environments

    We propose a novel approach to real-time landing site detection and assessment in unconstrained man-made environments using passive sensors. Because this task must be performed in a few seconds or less, existing methods are often limited to simple local intensity and edge variation cues. By contrast, we show how to efficiently take into account the global shape of potential sites, which is a critical cue in man-made scenes. Our method relies on a new segmentation algorithm and a shape regularity measure to look for polygonal regions in video sequences. In this way, we enforce both temporal consistency and geometric regularity, resulting in very reliable and consistent detections. We demonstrate our approach for the detection of landable sites such as rural fields, building rooftops and runways from color and infrared monocular sequences, significantly outperforming the state of the art

    Designing content-based adversarial perturbations and distributed one-class learning for images.

    This thesis covers two privacy-related problems for images: designing adversarial perturbations that can be added to input images to protect the private content of images that a user shares with other users from undesirable automatic inference by classifiers, and training privacy-preserving classifiers on images that are distributed among their owners (image holders) and contain their private information. Adversarial images can be easily detected using denoising algorithms when high-frequency spatial perturbations are used, or can be noticed by humans when perturbations are large and irrelevant to the content of the images. In addition, adversarial images are not transferable to unseen classifiers when perturbations are small (in terms of the lp norm). In the first part of the thesis, we propose content-based adversarial perturbations that account for the content of the images (objects, colour, structure and details), human perception and the semantics of the class labels to address the above-mentioned limitations of perturbations. Our adversarial colour perturbations selectively modify the colours of objects within chosen ranges that are perceived as natural by humans. In addition to these natural-looking adversarial images, our structure-aware perturbations exploit traditional image processing filters, such as the detail enhancement filter and the Gamma correction filter, to generate enhanced adversarial images. We validate the proposed perturbations against three classifiers trained on ImageNet. Experiments show that, compared with seven state-of-the-art perturbations, the proposed perturbations are more robust and transferable and cause misclassification with a label that is semantically different from the label of the original image. Classifiers are often trained by relying on centralised collection and aggregation of images, which could lead to significant privacy concerns by disclosing the sensitive information of image holders.
In the second part of the thesis, we propose a privacy-preserving technique, called distributed one-class learning, that enables training to take place on edge devices, so image holders do not need to centralise their images. Each image holder can independently use their images to locally train a reconstructive adversarial network as their one-class classifier. As sending the model parameters to the service provider would reveal sensitive information, we secret-share the parameters among two non-colluding service providers. Then, we provide cryptographically private prediction services through a mixture of multi-party computation protocols to achieve substantial gains in complexity and speed. A major advantage of the proposed technique is that none of the image holders and service providers can access the parameters and images of other image holders. We quantify the benefits of the proposed technique and compare its performance with centralised training on three privacy-sensitive image-based tasks. Experiments show that the proposed technique achieves classification performance similar to that of non-private centralised training, while not violating the privacy of the image holders
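
The idea of secret-sharing parameters between two non-colluding providers can be illustrated with additive secret sharing. The abstract does not say which scheme the thesis uses, so the scheme below, along with the modulus, the fixed-point scale and all function names, is an assumption made purely for illustration.

```python
import secrets

MODULUS = 2**61 - 1   # assumed modulus; any sufficiently large value works here
SCALE = 10**6         # assumed fixed-point precision for real-valued parameters

def encode(value):
    """Map a float parameter to a residue mod MODULUS (fixed-point encoding)."""
    return round(value * SCALE) % MODULUS

def decode(residue):
    """Invert encode(), treating the upper half of the range as negatives."""
    if residue > MODULUS // 2:
        residue -= MODULUS
    return residue / SCALE

def share(residue):
    """Split a residue into two shares, one per service provider.

    Each share on its own is uniformly random and reveals nothing
    about the underlying parameter.
    """
    share_a = secrets.randbelow(MODULUS)
    share_b = (residue - share_a) % MODULUS
    return share_a, share_b

def reconstruct(share_a, share_b):
    """Recombine the two shares; requires cooperation of both providers."""
    return (share_a + share_b) % MODULUS

weight = -0.5
a, b = share(encode(weight))
recovered = decode(reconstruct(a, b))

# Shares can be added locally by each provider without revealing the inputs.
x1, x2 = share(encode(1.25))
y1, y2 = share(encode(2.5))
summed = decode(reconstruct((x1 + y1) % MODULUS, (x2 + y2) % MODULUS))
```

In this sketch `recovered` equals the original weight and `summed` equals 3.75, even though neither provider ever sees a parameter in the clear. Private prediction additionally needs multiplications and nonlinear operations on shares, which require the multi-party computation protocols mentioned in the abstract and are not shown here.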

    Intelligent Transportation Related Complex Systems and Sensors

    Built around innovative services related to different modes of transport and traffic management, intelligent transport systems (ITS) are being widely adopted worldwide to improve the efficiency and safety of the transportation system. They enable users to be better informed and to make safer, more coordinated, and smarter decisions on the use of transport networks. Current ITSs are complex systems, made up of several components and sub-systems characterized by time-dependent interactions among themselves. Examples of these transportation-related complex systems include road traffic sensors, autonomous/automated cars, smart cities, smart sensors, virtual sensors, traffic control systems, smart roads, logistics systems, smart mobility systems, and many others that are emerging from niche areas. The efficient operation of these complex systems requires: i) efficient solutions to the issues of sensors/actuators used to capture and control the physical parameters of these systems, as well as the quality of data collected from these systems; ii) tackling complexities using simulations and analytical modelling techniques; and iii) applying optimization techniques to improve the performance of these systems. This collection includes twenty-four papers, which cover scientific concepts, frameworks, architectures and various other ideas on analytics, trends and applications of transportation-related data