3,864 research outputs found

    Learning Manipulation under Physics Constraints with Visual Perception

    Full text link
    Understanding physical phenomena is a key competence that enables humans and animals to act and interact under uncertain perception in previously unseen environments containing novel objects and their configurations. In this work, we consider the problem of autonomous block stacking and explore solutions to learning manipulation under physics constraints with visual perception inherent to the task. Inspired by the intuitive physics in humans, we first present an end-to-end learning-based approach to predict stability directly from appearance, contrasting a more traditional model-based approach with explicit 3D representations and physical simulation. We study the model's behavior together with an accompanied human subject test. It is then integrated into a real-world robotic system to guide the placement of a single wood block into the scene without collapsing existing tower structure. To further automate the process of consecutive blocks stacking, we present an alternative approach where the model learns the physics constraint through the interaction with the environment, bypassing the dedicated physics learning as in the former part of this work. In particular, we are interested in the type of tasks that require the agent to reach a given goal state that may be different for every new trial. Thereby we propose a deep reinforcement learning framework that learns policies for stacking tasks which are parametrized by a target structure.Comment: arXiv admin note: substantial text overlap with arXiv:1609.04861, arXiv:1711.00267, arXiv:1604.0006

    Learning Manipulation under Physics Constraints with Visual Perception

    No full text
    Understanding physical phenomena is a key competence that enables humans and animals to act and interact under uncertain perception in previously unseen environments containing novel objects and their configurations. In this work, we consider the problem of autonomous block stacking and explore solutions to learning manipulation under physics constraints with visual perception inherent to the task. Inspired by the intuitive physics in humans, we first present an end-to-end learning-based approach to predict stability directly from appearance, contrasting a more traditional model-based approach with explicit 3D representations and physical simulation. We study the model's behavior together with an accompanied human subject test. It is then integrated into a real-world robotic system to guide the placement of a single wood block into the scene without collapsing existing tower structure. To further automate the process of consecutive blocks stacking, we present an alternative approach where the model learns the physics constraint through the interaction with the environment, bypassing the dedicated physics learning as in the former part of this work. In particular, we are interested in the type of tasks that require the agent to reach a given goal state that may be different for every new trial. Thereby we propose a deep reinforcement learning framework that learns policies for stacking tasks which are parametrized by a target structure

    Rehan's Zone System

    Get PDF
    Contemporary digital photographic tools, technologies and techniques such as high dynamic range (HDR) photography, focus stacking and digitally-stitched panoramas have enabled photographers to have an unprecedented level of creative control over the final image. Rehan’s Zone System provides a systematic, adaptive workflow that makes use of these techniques to maximise tonal latitude, extend creative control, and, provide a greater opportunity to capture the decisive moment using faster shutter speeds

    Modeling and applications of the focus cue in conventional digital cameras

    Get PDF
    El enfoque en cámaras digitales juega un papel fundamental tanto en la calidad de la imagen como en la percepción del entorno. Esta tesis estudia el enfoque en cámaras digitales convencionales, tales como cámaras de móviles, fotográficas, webcams y similares. Una revisión rigurosa de los conceptos teóricos detras del enfoque en cámaras convencionales muestra que, a pasar de su utilidad, el modelo clásico del thin lens presenta muchas limitaciones para aplicación en diferentes problemas relacionados con el foco. En esta tesis, el focus profile es propuesto como una alternativa a conceptos clásicos como la profundidad de campo. Los nuevos conceptos introducidos en esta tesis son aplicados a diferentes problemas relacionados con el foco, tales como la adquisición eficiente de imágenes, estimación de profundidad, integración de elementos perceptuales y fusión de imágenes. Los resultados experimentales muestran la aplicación exitosa de los modelos propuestos.The focus of digital cameras plays a fundamental role in both the quality of the acquired images and the perception of the imaged scene. This thesis studies the focus cue in conventional cameras with focus control, such as cellphone cameras, photography cameras, webcams and the like. A deep review of the theoretical concepts behind focus in conventional cameras reveals that, despite its usefulness, the widely known thin lens model has several limitations for solving different focus-related problems in computer vision. In order to overcome these limitations, the focus profile model is introduced as an alternative to classic concepts, such as the near and far limits of the depth-of-field. The new concepts introduced in this dissertation are exploited for solving diverse focus-related problems, such as efficient image capture, depth estimation, visual cue integration and image fusion. The results obtained through an exhaustive experimental validation demonstrate the applicability of the proposed models

    Reconstruction from Periodic Nonlinearities, With Applications to HDR Imaging

    Full text link
    We consider the problem of reconstructing signals and images from periodic nonlinearities. For such problems, we design a measurement scheme that supports efficient reconstruction; moreover, our method can be adapted to extend to compressive sensing-based signal and image acquisition systems. Our techniques can be potentially useful for reducing the measurement complexity of high dynamic range (HDR) imaging systems, with little loss in reconstruction quality. Several numerical experiments on real data demonstrate the effectiveness of our approach

    Plant Seed Identification

    Get PDF
    Plant seed identification is routinely performed for seed certification in seed trade, phytosanitary certification for the import and export of agricultural commodities, and regulatory monitoring, surveillance, and enforcement. Current identification is performed manually by seed analysts with limited aiding tools. Extensive expertise and time is required, especially for small, morphologically similar seeds. Computers are, however, especially good at recognizing subtle differences that humans find difficult to perceive. In this thesis, a 2D, image-based computer-assisted approach is proposed. The size of plant seeds is extremely small compared with daily objects. The microscopic images of plant seeds are usually degraded by defocus blur due to the high magnification of the imaging equipment. It is necessary and beneficial to differentiate the in-focus and blurred regions given that only sharp regions carry distinctive information usually for identification. If the object of interest, the plant seed in this case, is in- focus under a single image frame, the amount of defocus blur can be employed as a cue to separate the object and the cluttered background. If the defocus blur is too strong to obscure the object itself, sharp regions of multiple image frames acquired at different focal distance can be merged together to make an all-in-focus image. This thesis describes a novel non-reference sharpness metric which exploits the distribution difference of uniform LBP patterns in blurred and non-blurred image regions. It runs in realtime on a single core cpu and responses much better on low contrast sharp regions than the competitor metrics. Its benefits are shown both in defocus segmentation and focal stacking. With the obtained all-in-focus seed image, a scale-wise pooling method is proposed to construct its feature representation. Since the imaging settings in lab testing are well constrained, the seed objects in the acquired image can be assumed to have measureable scale and controllable scale variance. The proposed method utilizes real pixel scale information and allows for accurate comparison of seeds across scales. By cross-validation on our high quality seed image dataset, better identification rate (95%) was achieved compared with pre- trained convolutional-neural-network-based models (93.6%). It offers an alternative method for image based identification with all-in-focus object images of limited scale variance. The very first digital seed identification tool of its kind was built and deployed for test in the seed laboratory of Canadian food inspection agency (CFIA). The proposed focal stacking algorithm was employed to create all-in-focus images, whereas scale-wise pooling feature representation was used as the image signature. Throughput, workload, and identification rate were evaluated and seed analysts reported significantly lower mental demand (p = 0.00245) when using the provided tool compared with manual identification. Although the identification rate in practical test is only around 50%, I have demonstrated common mistakes that have been made in the imaging process and possible ways to deploy the tool to improve the recognition rate
    corecore