1,676 research outputs found

    Automatic Color Inspection for Colored Wires in Electric Cables

    In this paper, an automatic optical inspection system for checking the sequence of colored wires in electric cables is presented. The system is able to inspect cables with flat connectors differing in the type and number of wires. This variability is managed automatically by a self-learning subsystem and does not require manual input from the operator or loading new data onto the machine. The system is coupled to a connector crimping machine and, once the model of a correct cable is learned, it can automatically inspect each cable assembled by the machine. The main contributions of this paper are: (i) the self-learning system; (ii) a robust segmentation algorithm for extracting wires from images even if they are strongly bent and partially overlapped; (iii) a color recognition algorithm able to cope with highlights and different finishes of the wire insulation. We report the system evaluation over a period of several months during the actual production of large batches of different cables; tests demonstrated a high level of accuracy and the absence of false negatives, which is key to guaranteeing defect-free production.

    Recognition of 2-D occluded objects and their manipulation by PUMA 560 robot

    Journal Article: A new method based on a cluster-structure paradigm is presented for the recognition of 2-D partially occluded objects. This method uses the line segments that make up the boundary of an object in the recognition process. The length of each of these segments and the angle between successive segments are the only information needed by the program to find an object's position. The technique is applied in several steps, which include segment clustering, finding all sequences in one pass over the data, and final clustering of sequences to obtain the desired rotational and translational information. The amount of computational effort decreases as the recognition algorithm progresses. Compared with earlier methods, which identify an object based on only one sequence of matched segments, the new technique allows the identification of all parts of the model that match the apparent image. These parts need not be adjacent to each other. The method is also able to tolerate a moderate change in scale and a significant amount of shape distortion arising from segmentation or the polygonal approximation of the boundary of the object. The method has been evaluated on a large number of examples in which several objects partially occlude one another. A summary of the results is presented.
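
The abstract above says that segment lengths and the angles between successive segments are the only information the method needs. The following is a minimal sketch of such a boundary descriptor, not the authors' implementation: it assumes the boundary is already a closed polygon, interprets "angle between successive segments" as the signed turning angle, and uses hypothetical helper names (`boundary_descriptor`, `transform`). It only illustrates why this representation is unaffected by rotation and translation; the clustering and sequence-matching steps of the paper are not reproduced.

```python
import math


def boundary_descriptor(vertices):
    """Return one (length, turning_angle) pair per edge of a closed polygon.

    The turning angle is the signed angle from an edge to the next edge.
    (Assumption: the paper's exact angle convention is not stated in the abstract.)
    """
    n = len(vertices)
    descriptor = []
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        x2, y2 = vertices[(i + 2) % n]
        ax, ay = x1 - x0, y1 - y0          # current edge
        bx, by = x2 - x1, y2 - y1          # next edge
        length = math.hypot(ax, ay)
        angle = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        descriptor.append((length, angle))
    return descriptor


def transform(vertices, theta, tx, ty):
    """Rotate a polygon by theta (radians) and translate it by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in vertices]


# Hypothetical L-shaped model and a rotated, shifted copy of it.
model = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
moved = transform(model, 0.7, 5.0, -2.0)

model_desc = boundary_descriptor(model)
moved_desc = boundary_descriptor(moved)
```

Because both descriptors agree edge by edge, a matcher can compare runs of `(length, angle)` pairs without first knowing the object's pose, and recover rotation and translation afterwards from the matched segments.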

    A PCA based method for image and video pose sequencing

    Some problems in image sequence processing require an ordered set of object views. In some cases, images are acquired from multiple angles in random order and the angle-of-view information is not available. When this occurs, the poses have to be put into proper order. For example, in databases containing images of an object or scene taken over a period of time, the pose or angle of view of each image with respect to the camera or scene is unknown. Recovering this order is important for achieving a complete or partial three-dimensional reconstruction. Other applications exist in photogrammetry, machine vision, computer-aided design, and military intelligence. The main contribution of this thesis is an automated method for ordering images of random object views. This method uses Principal Component Analysis (PCA) and a confidence metric in eigenspace. The confidence measure is based on local curvature and correlation of the estimated pose trajectory in a multidimensional manifold. The confidence metric is used to detect areas in the manifold where poses appear similar and ordering becomes difficult. It has been extended for use with synchronized dual- and multiple-camera systems by providing a basis for camera selection, choosing the most salient camera view for pose ordering. By adding multiple cameras, a high pose estimation accuracy can be achieved. This thesis also compares the method with other classification and recognition methods, such as the Scale Invariant Feature Transform (SIFT) and Laplacian Eigenmaps. The SIFT algorithm struggles with pose sequencing because it computes local feature spaces for each image and does not consider the entire set of images. Laplacian Eigenmaps show better results for ordering, but close analysis shows that they are better suited for clustering poses than for sequencing. Results for ordering many sets of objects, theoretical development, and a comparison of methods are presented in this research.
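
To make the idea of ordering views in eigenspace concrete, here is a minimal sketch under strong assumptions; it is not the thesis's method and omits its confidence metric entirely. It assumes the views come from a single full rotation, so that their projections onto the first two principal components form a closed loop, and it orders them by angle around that loop. The principal directions are found with plain power iteration, and all names (`top_components`, `order_views`, `make_view`) and the synthetic six-element "images" are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_components(rows, k=2, iters=100):
    """Mean and top-k principal directions via power iteration with deflation."""
    d = len(rows[0])
    mean = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
    centred = [[r[j] - mean[j] for j in range(d)] for r in rows]
    rng = random.Random(0)
    comps = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            for c in comps:                      # remove directions already found
                p = dot(v, c)
                v = [a - p * b for a, b in zip(v, c)]
            proj = [dot(row, v) for row in centred]
            # Multiply by the (unnormalised) covariance: X^T (X v)
            v = [sum(proj[i] * centred[i][j] for i in range(len(centred)))
                 for j in range(d)]
            norm = math.sqrt(dot(v, v))
            v = [a / norm for a in v]
        comps.append(v)
    return mean, comps


def order_views(rows):
    """Indices of rows sorted by angle in the plane of the first two components."""
    mean, (c1, c2) = top_components(rows, k=2)
    angles = []
    for r in rows:
        centred = [a - m for a, m in zip(r, mean)]
        angles.append(math.atan2(dot(centred, c2), dot(centred, c1)))
    return sorted(range(len(rows)), key=lambda i: angles[i])


def make_view(theta):
    """Synthetic stand-in for an image of an object seen from angle theta."""
    return [math.cos(theta), math.sin(theta),
            0.5 * math.cos(theta), 0.5 * math.sin(theta), 0.2, -0.1]


n_views = 12
truth = list(range(n_views))          # true pose index of each shuffled view
random.Random(1).shuffle(truth)
views = [make_view(2 * math.pi * t / n_views) for t in truth]

order = order_views(views)
recovered = [truth[i] for i in order]  # true pose indices in recovered order
```

The recovered sequence is only defined up to its starting view and direction of travel, since PCA does not fix either. Real image sets rarely trace such a clean loop, which is presumably where a confidence measure like the one described in the abstract becomes necessary.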

    IMPROVING EFFICIENCY AND SCALABILITY IN VISUAL SURVEILLANCE APPLICATIONS

    We present four contributions to visual surveillance: (a) an action recognition method based on the characteristics of human motion in image space; (b) a study of the strengths of five regression techniques for monocular pose estimation that highlights the advantages of kernel PLS; (c) a learning-based method for detecting objects carried by humans requiring minimal annotation; (d) an interactive video segmentation system that reduces supervision by using occlusion and long term spatio-temporal structure information. We propose a representation for human actions that is based solely on motion information and that leverages the characteristics of human movement in the image space. The representation is best suited to visual surveillance settings in which the actions of interest are highly constrained, but also works on more general problems if the actions are ballistic in nature. Our computationally efficient representation achieves good recognition performance on both a commonly used action recognition dataset and on a dataset we collected to simulate a checkout counter. We study discriminative methods for 3D human pose estimation from single images, which build a map from image features to pose. The main difficulty with these methods is the insufficiency of training data due to the high dimensionality of the pose space. However, real datasets can be augmented with data from character animation software, so the scalability of existing approaches becomes important. We argue that Kernel Partial Least Squares approximates Gaussian Process regression robustly, enabling the use of larger datasets, and we show in experiments that kPLS outperforms two state-of-the-art methods based on GP. The high variability in the appearance of carried objects suggests using their relation to the human silhouette to detect them. 
We adopt a generate-and-test approach that produces candidate regions from protrusion, color contrast and occlusion boundary cues and then filters them with a kernel SVM classifier on context features. Our method exceeds state-of-the-art accuracy and has good generalization capability. We also propose a Multiple Instance Learning framework for the classifier that reduces annotation effort by two orders of magnitude while maintaining comparable accuracy. Finally, we present an interactive video segmentation system that trades off a small amount of segmentation quality for significantly less supervision than is necessary in systems in the literature. While applications like video editing could not directly use the output of our system, reasoning about the trajectories of objects in a scene or learning coarse appearance models is still possible. The unsupervised segmentation component at the base of our system effectively employs occlusion boundary cues and achieves competitive results on an unsupervised segmentation dataset. On videos used to evaluate interactive methods, our system requires less interaction time than others, does not rely on appearance information and can extract multiple objects at the same time.

    Robust recognition and segmentation of human actions using HMMs with missing observations

    This paper describes the integration of missing observation data with hidden Markov models to create a framework that is able to segment and classify individual actions from a stream of human motion using an incomplete 3D human pose estimate. Based on this framework, a model is trained to automatically segment and classify an activity sequence into its constituent subactions during inference. This is achieved by introducing action labels into the observation vector and setting these labels as missing data during inference, thus forcing the system to infer the probability of each action label. Additionally, missing data provides recognition-level support for occlusions and imperfect silhouette segmentation, permitting the use of a fast (real-time) pose estimation that delegates the burden of handling undetected limbs to the action recognition system. Findings show that the use of missing data to segment activities is an accurate and elegant approach. Furthermore, action recognition can be accurate even when almost half of the pose feature data is missing due to occlusions, since not all of the pose data is important all of the time.
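
The mechanism described above, treating action labels and occluded pose features alike as missing components of the observation vector, can be sketched as follows. This is an illustrative toy, not the paper's model: it assumes discrete symbols, emissions that factorize over the components of the observation vector, and simple forward filtering; every name and probability below is hypothetical. A missing component (`None`) is marginalized out, which amounts to multiplying the likelihood by 1.

```python
def forward_filter(obs, start, trans, emit):
    """Filtered state posteriors for an HMM whose observations may be partly missing.

    obs   : list of tuples; each entry is a symbol index or None (missing)
    start : start[s]      = P(state_0 = s)
    trans : trans[r][s]   = P(state_t = s | state_{t-1} = r)
    emit  : emit[k][s][v] = P(component k takes value v | state = s)
    """
    n = len(start)
    beliefs = []
    prev = None
    for vec in obs:
        if prev is None:
            prior = list(start)
        else:
            prior = [sum(prev[r] * trans[r][s] for r in range(n)) for s in range(n)]
        post = []
        for s in range(n):
            like = 1.0
            for k, v in enumerate(vec):
                if v is not None:          # missing components contribute a factor of 1
                    like *= emit[k][s][v]
            post.append(prior[s] * like)
        total = sum(post)
        prev = [p / total for p in post]
        beliefs.append(prev)
    return beliefs


def label_posterior(belief, label_emit):
    """P(label = v) at one time step, given the filtered state belief."""
    n_labels = len(label_emit[0])
    return [sum(belief[s] * label_emit[s][v] for s in range(len(belief)))
            for v in range(n_labels)]


# Hypothetical two-state model: component 0 is a pose symbol, component 1 the action label.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
pose_emit = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
label_emit = [[0.95, 0.05], [0.05, 0.95]]
emit = [pose_emit, label_emit]

# At inference the label is always missing; at the fourth step the pose is occluded too.
stream = [(0, None), (0, None), (2, None), (None, None), (2, None)]
beliefs = forward_filter(stream, start, trans, emit)
labels = [label_posterior(b, label_emit) for b in beliefs]
```

During training the label component would be observed, so the same emission tables tie labels to states; at inference, reading off `labels` gives a per-frame action estimate and hence a segmentation. A fully occluded frame simply propagates the previous belief through the transition model.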

    Towards a Unified Theory of Neocortex: Laminar Cortical Circuits for Vision and Cognition

    A key goal of computational neuroscience is to link brain mechanisms to behavioral functions. The present article describes recent progress towards explaining how laminar neocortical circuits give rise to biological intelligence. These circuits embody two new and revolutionary computational paradigms: Complementary Computing and Laminar Computing. Circuit properties include a novel synthesis of feedforward and feedback processing, of digital and analog processing, and of pre-attentive and attentive processing. This synthesis clarifies the appeal of Bayesian approaches but has a far greater predictive range that naturally extends to self-organizing processes. Examples from vision and cognition are summarized. A LAMINART architecture unifies properties of visual development, learning, perceptual grouping, attention, and 3D vision. A key modeling theme is that the mechanisms which enable development and learning to occur in a stable way imply properties of adult behavior. It is noted how higher-order attentional constraints can influence multiple cortical regions, and how spatial and object attention work together to learn view-invariant object categories. In particular, a form-fitting spatial attentional shroud can allow an emerging view-invariant object category to remain active while multiple view categories are associated with it during sequences of saccadic eye movements. Finally, the chapter summarizes recent work on the LIST PARSE model of cognitive information processing by the laminar circuits of prefrontal cortex. LIST PARSE models the short-term storage of event sequences in working memory, their unitization through learning into sequence, or list, chunks, and their read-out in planned sequential performance that is under volitional control. LIST PARSE provides a laminar embodiment of Item and Order working memories, also called Competitive Queuing models, that have been supported by both psychophysical and neurobiological data. 
These examples show how variations of a common laminar cortical design can embody properties of visual and cognitive intelligence that seem, at least on the surface, to be mechanistically unrelated. National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)