278,738 research outputs found

    Feature-based three-dimensional object recognition using disparity maps

    The human vision system can recognize objects it has seen before, even when viewing them from orientations it has not previously encountered. This is due to the ability of the human brain to categorize objects by distinct features. The features and experience used in human recognition are also applicable to a computer recognition system. Recognition of three-dimensional objects has been a popular area of computer vision research in recent years, as computer and machine vision become more common in areas such as surveillance and product inspection. The purpose of this study is to explore and develop an adaptive computer-vision-based recognition system that can recognize 3D information about an object from a limited amount of training data in the form of disparity maps. Using this system, it should be possible to recognize an object in many different orientations, even if the specific orientation has not been seen before, as well as to distinguish between different objects.
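    The disparity maps this study trains on can be illustrated with the simplest possible stereo matcher. The sketch below is a generic block-matching baseline, not the paper's method; the function name and synthetic image pair are illustrative.

```python
import numpy as np

def disparity_map(left, right, max_disp=8, block=3):
    """Naive block-matching stereo: for each left-image pixel, search
    horizontal shifts of the right image and keep the shift (disparity)
    with the lowest sum-of-absolute-differences (SAD) cost."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1].astype(int)
            best_cost, best_d = np.inf, 0
            for d in range(min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1].astype(int)
                cost = np.abs(patch - cand).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Synthetic stereo pair: the right image is the left image shifted left by
# 2 px, so the true disparity is 2 wherever the search window can reach it.
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(10, 16)).astype(np.uint8)
right = np.roll(left, -2, axis=1)
disp = disparity_map(left, right)
```

    Real systems refine this with sub-pixel interpolation and smoothness constraints; the SAD search above is only the core idea behind producing a disparity map from a calibrated pair.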

    UTILIZATION OF AUGMENTED REALITY IN A GEOMETRY CATALOG

    With the rapid development of Information Technology (IT), almost every area of life expects things to be interesting, easy, and instant. Early and foundational education, which prepares the next generation, is expected to keep pace with IT, yet in practice many teachers do not innovate by bringing IT into their teaching methods. Based on prior observations, one teacher had difficulty introducing students to objects in three-dimensional geometry, because the geometry material demands relatively strong visualization skills from students. Augmented Reality (AR) is a technology that combines two- and three-dimensional virtual objects with a real environment, projecting three-dimensional virtual objects into that environment. The objective of this research is to create web-based learning media that uses AR technology for recognizing geometric objects. The method used is the Microsoft Solution Framework (MSF) with the waterfall system-development method and Object-Oriented Development (OOD) as the approach. The stages of this study include problem identification, initial planning, design, testing, and implementation. It is concluded that AR can display both plane and solid geometric objects as three-dimensional shapes that can be viewed as a whole, that it can be used effectively for online geometry learning, and that markers embedded in the catalog (Geometry Catalog) are more engaging than plain black-and-white markers.

    2D GEOMETRIC SHAPE AND COLOR RECOGNITION USING DIGITAL IMAGE PROCESSING

    ABSTRACT: The paper discusses an approach involving digital image processing and geometric logic for recognizing two-dimensional shapes of objects such as squares, circles, rectangles, and triangles, as well as the color of the object. This approach can be extended to applications such as robotic vision and computer intelligence. The methods involved are conversion of a three-dimensional RGB image to a two-dimensional black-and-white image, color-pixel classification for object-background separation, area-based filtering, and use of the bounding box and its properties for calculating object metrics. The object metrics are compared with predetermined values characteristic of a particular object's shape. Recognition of object shape is made invariant to rotation. Further, the colors of the objects are recognized by analyzing the RGB information of all pixels within each object. The algorithm was developed and simulated using MATLAB. A set of 180 images of the four basic 2D geometric shapes in the three primary colors (red, green, and blue) was used for analysis, and the results were 99% accurate.
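    The bounding-box metric the abstract describes can be sketched with the classic "extent" ratio (object area over bounding-box area): filled squares and rectangles give roughly 1.0, circles roughly pi/4, and triangles roughly 0.5. This is a minimal Python stand-in for the MATLAB pipeline, ignoring the rotation-invariance and color steps; all names and thresholds are illustrative.

```python
import numpy as np

def classify_shape(mask):
    """Classify a binary object mask by its 'extent': object area divided
    by the area of its axis-aligned bounding box.  Filled rectangles and
    squares give ~1.0, circles ~pi/4 (~0.785), triangles ~0.5; the
    bounding-box aspect ratio then separates squares from rectangles."""
    ys, xs = np.nonzero(mask)
    h = ys.max() - ys.min() + 1
    w = xs.max() - xs.min() + 1
    extent = mask.sum() / (h * w)
    if extent > 0.9:
        return "square" if 0.95 < w / h < 1.05 else "rectangle"
    if extent > 0.65:
        return "circle"
    return "triangle"

# Synthetic 100x100 test masks (stand-ins for thresholded object images)
sq = np.zeros((100, 100), dtype=bool)
sq[20:60, 20:60] = True
rect = np.zeros((100, 100), dtype=bool)
rect[20:40, 10:90] = True
yy, xx = np.mgrid[:100, :100]
circ = (yy - 50) ** 2 + (xx - 50) ** 2 <= 30 ** 2
tri = np.zeros((100, 100), dtype=bool)
for r in range(40):            # rows of growing width -> right triangle
    tri[20 + r, 20:20 + r + 1] = True
```

    The thresholds work for axis-aligned shapes; handling rotation, as the paper claims to, requires either rotating to a canonical orientation or using rotation-invariant metrics.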

    The contribution of nonrigid motion and shape information to object perception in pigeons and humans

    The ability to perceive and recognize objects is essential to many animals, including humans. Until recently, models of object recognition have primarily focused on static cues, such as shape, but more recent research is beginning to show that motion plays an important role in object perception. Most studies have focused on rigid motion, a type of motion most often associated with inanimate objects. In contrast, nonrigid motion is often associated with biological motion and is therefore ecologically important to visually dependent animals. In this study, we examined the relative contribution of nonrigid motion and shape to object perception in humans and pigeons, two species that rely extensively on vision. Using a parametric morphing technique to systematically vary nonrigid motion and three-dimensional shape information, we found that both humans and pigeons were able to rely solely on either shape or nonrigid motion information to identify complex objects when one of the two cues was degraded. Humans and pigeons also showed similar 80% accuracy thresholds when the information from both shape and motion cues was degraded. We argue that the use of nonrigid motion for object perception is evolutionarily important and should be considered in general theories of vision, at least with respect to visually sophisticated animals.

    Dynamic Construction of Reduced Representations in the Brain for Perceptual Decision Behavior

    Summary: Over the past decade, extensive studies of the brain regions that support face, object, and scene recognition suggest that these regions have a hierarchically organized architecture that spans the occipital and temporal lobes [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14], where visual categorizations unfold over the first 250 ms of processing [15, 16, 17, 18, 19]. This same architecture is flexibly involved in multiple tasks that require task-specific representations, e.g., categorizing the same object as "a car" or "a Porsche." While we partly understand where and when these categorizations happen in the occipito-ventral pathway, the next challenge is to unravel how these categorizations happen. That is, how does high-dimensional input collapse in the occipito-ventral pathway to become low-dimensional representations that guide behavior? To address this, we investigated what information the brain processes in a visual perception task and visualized the dynamic representation of this information in brain activity. To do so, we developed stimulus information representation (SIR), an information theoretic framework, to tease apart stimulus information that supports behavior from that which does not. We then tracked the dynamic representations of both in magneto-encephalographic (MEG) activity. Using SIR, we demonstrate that a rapid (~170 ms) reduction of behaviorally irrelevant information occurs in the occipital cortex and that representations of the information that supports distinct behaviors are constructed in the right fusiform gyrus (rFG). Our results thus highlight how SIR can be used to investigate the component processes of the brain by considering interactions between three variables (stimulus information, brain activity, behavior), rather than just two, as is the current norm.
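    The abstract does not spell out SIR's mathematics, but its core move, separating stimulus information that predicts behavior from information that does not, can be illustrated with a plain plug-in mutual-information estimate. The sketch below is a generic illustration of that idea, not the authors' framework; all names and the toy data are assumptions.

```python
import numpy as np
from collections import Counter

def mutual_info(x, y):
    """Plug-in estimate of discrete mutual information I(X;Y) in bits,
    computed from paired samples via empirical joint/marginal counts."""
    n = len(x)
    px, py = Counter(x), Counter(y)
    pxy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) p(b)) ), with counts: c*n/(px*py)
        mi += (c / n) * np.log2(c * n / (px[a] * py[b]))
    return mi

# Toy 'experiment': behaviour copies stimulus feature A and ignores B,
# so A is behaviourally relevant information and B is not.
rng = np.random.default_rng(1)
feat_a = rng.integers(0, 2, 10000)
feat_b = rng.integers(0, 2, 10000)
behaviour = feat_a.copy()
```

    In this toy setup I(A; behaviour) is near 1 bit while I(B; behaviour) is near zero, which is the kind of contrast a three-variable analysis of stimulus, brain activity, and behavior builds on.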

    Comparative Analysis of Segmentation Results Using the K-Means and Fuzzy C-Means Methods on Compressed Input Images

    In pattern recognition, image processing plays a role in automatically separating objects from the background, after which the objects are processed by a pattern classifier. In the medical world, image processing plays a very important role. CT (Computed Tomography), or CAT (Computed Axial Tomography), scanning is an example of an image-processing application that can be used to view cross sections of parts of the human body. Tomography is the process of producing two-dimensional images of a three-dimensional object through several one-dimensional scans. Magnetic resonance imaging (MRI) is the imaging modality most often used in radiology. MRI images can display the anatomical details of objects clearly in multiple sections (multiplanar) without changing the patient's position. In this study, two methods, K-Means and Fuzzy C-Means, were compared in a segmentation process aimed at separating normal areas from areas with disturbances (lesions). The images used are brain and chest MRI images, 10 in total. The quality of the segmented images is compared using the Variation of Information (VOI), Global Consistency Error (GCE), MSE (Mean Square Error), and PSNR (Peak Signal-to-Noise Ratio) metrics, along with segmentation time.
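    The K-Means side of the comparison can be sketched in a few lines: cluster pixel intensities, replace each pixel with its cluster centroid, and score the result with one of the quality metrics the study lists (PSNR, which is derived from MSE). This is a minimal 1-D illustration on a synthetic "lesion" image, not the study's pipeline or data; the Fuzzy C-Means variant and the VOI/GCE metrics are omitted, and all names are illustrative.

```python
import numpy as np

def kmeans_segment(img, k=2, iters=20):
    """Cluster pixel intensities with 1-D k-means and return an image in
    which every pixel is replaced by its cluster centroid."""
    pix = img.reshape(-1).astype(float)
    centers = np.quantile(pix, np.linspace(0, 1, k))  # spread initial centers
    for _ in range(iters):
        labels = np.argmin(np.abs(pix[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pix[labels == j].mean()
    return centers[labels].reshape(img.shape)

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two images; higher is better."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

# Toy image: a bright 'lesion' square on a dark background, plus noise
rng = np.random.default_rng(0)
clean = np.full((64, 64), 40.0)
clean[20:40, 20:40] = 200.0
noisy = clean + rng.normal(0, 10, clean.shape)
seg = kmeans_segment(noisy, k=2)
```

    With two well-separated intensity modes, the two cluster centroids land near the background and lesion levels, and PSNR against the clean image quantifies how faithfully the segmentation denoised the regions.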

    A Visual Sensor for Domestic Service Robots

    In this study, we present a visual sensor for domestic service robots that captures both color information and three-dimensional information in real time by calibrating a time-of-flight camera with two CCD cameras. The problem of occlusion is solved by the proposed occlusion-detection algorithm. Since the proposed sensor uses two CCD cameras, color information missing at pixels occluded in one camera is compensated by the other. We conduct several evaluations to validate the proposed sensor, including an investigation of an object recognition task under occluded scenes. The results reveal the effectiveness of the proposed visual sensor.
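    The abstract does not describe the occlusion-detection algorithm itself, but the underlying geometric idea, a depth point is occluded in a displaced camera when a nearer point projects to the same pixel, and a second camera on the other side can then supply the missing color, can be shown in a toy 1-D model. Everything below (the projection model, names, and numbers) is an assumption for illustration, not the paper's method.

```python
import numpy as np

def occluded_in_view(xs, zs, shift):
    """Toy 1-D occlusion test.  Each scene point (x, z) projects into a
    camera displaced by `shift` at column round(x + shift / z) (the
    displacement shrinks with depth).  When several points land on the
    same column, only the nearest is visible; the rest are occluded."""
    cols = np.round(xs + shift / zs).astype(int)
    occluded = np.zeros(len(xs), dtype=bool)
    for c in np.unique(cols):
        idx = np.nonzero(cols == c)[0]
        if len(idx) > 1:
            occluded[idx] = True
            occluded[idx[np.argmin(zs[idx])]] = False  # nearest stays visible
    return occluded

# A near point (z=1) hides a far point (z=4) in the left view only,
# so the right view can supply the missing colour for that point.
xs = np.array([4.0, 6.0, 7.0])
zs = np.array([4.0, 4.0, 1.0])
left_occ = occluded_in_view(xs, zs, shift=-4.0)
right_occ = occluded_in_view(xs, zs, shift=4.0)
```

    With two cameras on opposite sides of the depth sensor, a point occluded in one view is typically visible in the other, which is the compensation the sensor exploits.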

    Single camera pose estimation using Bayesian filtering and Kinect motion priors

    Traditional approaches to upper body pose estimation using monocular vision rely on complex body models and a large variety of geometric constraints. We argue that this is not ideal and somewhat inelegant as it results in large processing burdens, and instead attempt to incorporate these constraints through priors obtained directly from training data. A prior distribution covering the probability of a human pose occurring is used to incorporate likely human poses. This distribution is obtained offline, by fitting a Gaussian mixture model to a large dataset of recorded human body poses, tracked using a Kinect sensor. We combine this prior information with a random walk transition model to obtain an upper body model, suitable for use within a recursive Bayesian filtering framework. Our model can be viewed as a mixture of discrete Ornstein-Uhlenbeck processes, in that states behave as random walks, but drift towards a set of typically observed poses. This model is combined with measurements of the human head and hand positions, using recursive Bayesian estimation to incorporate temporal information. Measurements are obtained using face detection and a simple skin colour hand detector, trained using the detected face. The suggested model is designed with analytical tractability in mind and we show that the pose tracking can be Rao-Blackwellised using the mixture Kalman filter, allowing for computational efficiency while still incorporating bio-mechanical properties of the upper body. In addition, the use of the proposed upper body model allows reliable three-dimensional pose estimates to be obtained indirectly for a number of joints that are often difficult to detect using traditional object recognition strategies. Comparisons with Kinect sensor results and the state of the art in 2D pose estimation highlight the efficacy of the proposed approach.
    Comment: 25 pages, technical report; related to the Burke and Lasenby AMDO 2014 conference paper. Code sample: https://github.com/mgb45/SignerBodyPose Video: https://www.youtube.com/watch?v=dJMTSo7-uF
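    The "mixture of discrete Ornstein-Uhlenbeck processes" transition model can be sketched in a few lines: the state takes a random step but drifts toward the mean of whichever mixture component best explains it. This is a minimal illustration of that dynamic, not the paper's full Rao-Blackwellised filter; the spherical components, parameters, and names are all assumptions.

```python
import numpy as np

def ou_mixture_step(x, means, weights, alpha=0.2, sigma=0.05, rng=None):
    """One step of a discrete Ornstein-Uhlenbeck-style walk: the state
    takes a Gaussian random step but drifts toward the mean of the
    mixture component currently most responsible for it (spherical
    unit-variance components assumed for simplicity)."""
    if rng is None:
        rng = np.random.default_rng()
    d2 = ((means - x) ** 2).sum(axis=1)
    k = int(np.argmax(weights * np.exp(-0.5 * d2)))  # responsible component
    return x + alpha * (means[k] - x) + sigma * rng.standard_normal(x.shape)

# Two 'typical pose' modes; a state started near one mode settles there,
# behaving as a random walk that drifts toward commonly observed poses.
means = np.array([[0.0, 0.0], [5.0, 5.0]])
weights = np.array([0.5, 0.5])
rng = np.random.default_rng(0)
x = np.array([4.0, 4.5])
for _ in range(100):
    x = ou_mixture_step(x, means, weights, rng=rng)
```

    Because each component's dynamics are linear-Gaussian, a filter conditioned on the active component stays analytically tractable, which is what makes the Rao-Blackwellisation via a mixture Kalman filter possible.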