63 research outputs found

    Connected Attribute Filtering Based on Contour Smoothness

    Object Discovery with a Mobile Robot

    The world is full of objects: cups, phones, computers, books, and countless other things. For many tasks, robots need to understand that this object is a stapler, that object is a textbook, and this other object is a gallon of milk. The classic approach to this problem is object recognition, which classifies each observation into one of several previously-defined classes. While modern object recognition algorithms perform well, they require extensive supervised training: in a standard benchmark, the training data average more than four hundred images of each object class.
    The cost of manually labeling the training data prohibits these techniques from scaling to general environments. Homes and workplaces can contain hundreds of unique objects, and the objects in one environment may not appear in another.
    We propose a different approach: object discovery. Rather than rely on manual labeling, we describe unsupervised algorithms that leverage the unique capabilities of a mobile robot to discover the objects (and classes of objects) in an environment. Because our algorithms are unsupervised, they scale gracefully to large, general environments over long periods of time. To validate our results, we collected 67 robotic runs through a large office environment. This dataset, which we have made available to the community, is the largest of its kind.
    At each step, we treat the problem as one of robotics, not disembodied computer vision. The scale and quality of our results demonstrate the merit of this perspective, and prove the practicality of long-term large-scale object discovery.
    Dissertatio

    Seventh Biennial Report : June 2003 - March 2005

    Single View Modeling and View Synthesis

    This thesis develops new algorithms to produce 3D content from a single camera. Today, amateurs can use hand-held camcorders to capture and display the 3D world in 2D, using mature technologies. However, there is a strong desire to record and re-explore the 3D world in 3D. To achieve this goal, current approaches usually use a camera array, which suffers from tedious setup and calibration as well as a lack of portability, limiting its application to lab experiments. In this thesis, I try to produce 3D content using a single camera, making it as simple as shooting pictures. This requires a new front-end capture device rather than a regular camcorder, as well as more sophisticated algorithms. First, in order to capture highly detailed object surfaces, I designed and developed a depth camera based on a novel technique called light fall-off stereo (LFS). The LFS depth camera outputs color+depth image sequences and achieves 30 fps, which is necessary for capturing dynamic scenes. Based on the output color+depth images, I developed a new approach that builds 3D models of dynamic and deformable objects. While the camera can capture only part of an object at any instant, the partial surfaces are assembled into a complete 3D model by a novel warping algorithm. Inspired by the success of single-view 3D modeling, I extended my exploration to 2D-to-3D video conversion that does not use a depth camera. I developed a semi-automatic system that converts monocular videos into stereoscopic videos via view synthesis. It combines motion analysis with user interaction, aiming to transfer as much of the depth-inference work as possible from the user to the computer. I developed two new methods that analyze the optical flow to provide additional qualitative depth constraints. The automatically extracted depth information is presented in the user interface to assist the user with labeling.
    In this thesis, I developed new algorithms to produce 3D content from a single camera. Depending on the input data, my algorithm can build high-fidelity 3D models of dynamic and deformable objects if depth maps are provided. Otherwise, it can turn the video clips into stereoscopic video

    Geometric correction of historical Arabic documents

    Geometric deformations in historical documents significantly affect both the success of Optical Character Recognition (OCR) techniques and human readability. They may have been introduced at any time during the life cycle of a document, from when it was first printed to the time it was digitised by an imaging device. This thesis focuses on the challenging domain of geometric correction of Arabic historical documents, where background research has shown that existing approaches for geometric correction of Latin-script historical documents are not sensitive to the characteristics of text in Arabic documents and therefore cannot be applied successfully. Text line segmentation and baseline detection algorithms have been investigated, and a new, more suitable one is proposed for warped Arabic historical document images. Advanced ideas for performing dewarping and geometric restoration on historical Arabic documents, as dictated by the specific characteristics of the problem, have been implemented. In addition to developing an algorithm to detect accurate baselines of historical printed Arabic documents, the research also contributes a new dataset consisting of historical Arabic documents with different degrees of warping severity. Overall, a new dewarping system, the first for historical Arabic documents, has been developed, taking into account both global and local features of the text image and the patterns of the smooth distortion between text lines. By using the results of the proposed line segmentation and baseline detection methods, it can cope with a variety of distortions, such as page curl, arbitrary warping and fold