
    REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction

    The ability to automatically detect and analyze failed executions is crucial for an explainable and robust robotic system. Recently, Large Language Models (LLMs) have demonstrated strong common-sense reasoning skills on textual inputs. To leverage the power of LLMs for robot failure explanation, we propose REFLECT, a framework that converts multi-sensory data into a hierarchical summary of the robot's past experiences and queries the LLM with a progressive failure explanation algorithm. Conditioned on the explanation, a failure correction planner generates an executable plan for the robot to correct the failure and complete the task. To systematically evaluate the framework, we create the RoboFail dataset and show that our LLM-based framework is able to generate informative failure explanations that assist successful correction planning. Project website: https://roboreflect.github.io
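    The abstract describes a progressive querying loop. Below is a minimal sketch of that idea, assuming a text summarizer and an LLM client; `summarize_events`, `explain_failure`, and the prompt wording are hypothetical placeholders, not the authors' API:

```python
def summarize_events(log, detail):
    """Stub summarizer: higher detail levels include more events."""
    return "\n".join(log[: (detail + 1) * 5])

def explain_failure(sensory_log, query_llm, max_depth=3):
    """Query the LLM with progressively more detailed summaries until it
    localizes a failure; the explanation then feeds a correction planner."""
    for level in range(max_depth):
        summary = summarize_events(sensory_log, detail=level)
        answer = query_llm(
            "Robot experience summary:\n" + summary
            + "\nDid the task fail? If so, explain the most likely cause."
        )
        if "fail" in answer.lower():
            return answer
    return "No failure detected."
```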

    Hatching Egg Image Segmentation Based on Dense Blocks and the Hierarchical Sampling of Pixels

    Fertility detection of hatching eggs is crucial in the manufacturing of vaccines. For hatching egg images, the segmentation results for blood vessels, cracks, and air chambers are important for detecting the fertility of hatching eggs. In this paper, we propose an image segmentation method based on dense blocks and the hierarchical sampling of pixels. Dense blocks are used instead of the traditional layer-by-layer structure to improve efficiency and model robustness. The hierarchical sampling of pixels uses small-batch sampling to add diversity during batch updates, which can accelerate learning. The sampled features are sparsely arranged and classified using a Multi-Layer Perceptron (MLP), which can introduce complex nonlinear predictors and improve accuracy. The experimental results show that the mIoU reaches 90.5%. The proposed method can significantly improve segmentation performance.
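    As an illustration of the dense-block idea, here is a minimal PyTorch sketch (layer count, growth rate, and kernel size are assumptions, not the paper's exact architecture): each layer receives the concatenation of all earlier feature maps, improving gradient flow and feature reuse.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer sees the concatenation of all previous feature maps."""

    def __init__(self, in_ch, growth=12, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.BatchNorm2d(in_ch + i * growth),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_ch + i * growth, growth, kernel_size=3, padding=1),
            )
            for i in range(n_layers)
        ])

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            # Dense connectivity: concatenate every earlier feature map.
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)  # in_ch + n_layers * growth channels
```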

    Crack Detection in Single- and Multi-Light Images of Painted Surfaces using Convolutional Neural Networks

    Cracks represent an imminent danger for painted surfaces: they must be detected before they degenerate into more severe aging effects, such as color loss. Automatic detection of cracks from images of painted surfaces would therefore be extremely useful for art conservators; however, classical image processing solutions are not effective at detecting them or at distinguishing them from other lines or surface characteristics. A possible solution to improve the quality of crack detection exploits Multi-Light Image Collections (MLIC), which are often acquired in the Cultural Heritage domain thanks to the diffusion of the Reflectance Transformation Imaging (RTI) technique, allowing a low-cost and rich digitization of artwork surfaces. In this paper, we propose a pipeline for the detection of cracks on egg-tempera paintings from multi-light image acquisitions that can also be used on single images. The method is based on single- or multi-light edge detection and on a custom Convolutional Neural Network, trained on RTI data, that classifies image patches around edge points as crack or non-crack. The pipeline classifies regions with cracks with good accuracy when applied to MLIC; used on single images, it still gives reasonable results. The analysis of performance for different lighting directions also reveals the optimal lighting directions.
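    A rough sketch of the patch-based stage described above, assuming OpenCV for edge detection; the patch size, Canny thresholds, and subsampling stride are illustrative choices, not values from the paper:

```python
import cv2
import numpy as np

def candidate_patches(gray, patch=32, stride=8):
    """Yield (y, x, window) patches centered on detected edge pixels."""
    edges = cv2.Canny(gray, 50, 150)       # gray must be a uint8 image
    half = patch // 2
    ys, xs = np.nonzero(edges)
    for y, x in zip(ys[::stride], xs[::stride]):   # subsample edge pixels
        if half <= y < gray.shape[0] - half and half <= x < gray.shape[1] - half:
            yield y, x, gray[y - half:y + half, x - half:x + half]

# Each patch would then be scored crack / non-crack by the trained CNN, e.g.:
# score = cnn(patch[None, None].astype(np.float32) / 255.0)
```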

    A Closer Look at Temporal Ordering in the Segmentation of Instructional Videos

    Understanding the steps required to perform a task is an important skill for AI systems. Learning these steps from instructional videos involves two subproblems: (i) identifying the temporal boundary of sequentially occurring segments and (ii) summarizing these steps in natural language. We refer to this task as Procedure Segmentation and Summarization (PSS). In this paper, we take a closer look at PSS and propose three fundamental improvements over current methods. The segmentation task is critical, as generating a correct summary requires each step of the procedure to be correctly identified. However, current segmentation metrics often overestimate the segmentation quality because they do not consider the temporal order of segments. In our first contribution, we propose a new segmentation metric that takes into account the order of segments, giving a more reliable measure of the accuracy of a given predicted segmentation. Current PSS methods are typically trained by proposing segments, matching them with the ground truth, and computing a loss. However, much like segmentation metrics, existing matching algorithms do not consider the temporal order of the mapping between candidate segments and the ground truth. In our second contribution, we propose a matching algorithm that constrains the temporal order of segment mapping and is also differentiable. Lastly, we introduce multi-modal feature training for PSS, which further improves segmentation. We evaluate our approach on two instructional video datasets (YouCook2 and Tasty) and observe an improvement over the state of the art of ~7% and ~2.5% for procedure segmentation and summarization, respectively.
    Comment: Accepted at BMVC 202
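    To make the order-sensitivity concrete, here is a hedged sketch of one possible order-aware score (not the paper's exact metric): predictions are matched to ground-truth segments by IoU, but only matches whose ground-truth order follows the prediction order, found via a longest increasing subsequence, are credited, so swapped segments are penalized even when their boundaries overlap well.

```python
def iou(a, b):
    """IoU of two 1-D segments given as (start, end)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def ordered_segment_score(pred, gt, thr=0.5):
    if not gt:
        return 0.0
    # Best ground-truth index per prediction, kept only if IoU clears thr.
    matches = []
    for p in pred:
        j = max(range(len(gt)), key=lambda k: iou(p, gt[k]))
        if iou(p, gt[j]) >= thr:
            matches.append(j)
    # Longest strictly increasing subsequence = order-consistent matches.
    tails = []
    for m in matches:
        pos = next((i for i, v in enumerate(tails) if v >= m), len(tails))
        if pos == len(tails):
            tails.append(m)
        else:
            tails[pos] = m
    return len(tails) / len(gt)
```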

    Prediction of fish quality level with machine learning

    In this study, sea bream, sea bass, anchovy, and trout were captured and recorded with a digital camera during refrigerated storage for 7 days. In addition, their total viable counts (TVC) were determined on a daily basis. Based on the TVC, each fish was classified as ‘fresh’ or not depending on whether its count was below or above 7 log cfu per g. The images were uploaded to a web-based machine learning tool called Teachable Machine (TM), which was trained on the pupils and heads of the fish. In addition, images of each species from different angles were uploaded to the software in order to ensure the recognition of fish species by TM. The data of the study indicated that TM was able to distinguish fish species with high accuracy and achieved over 86% success in estimating the freshness of the tested fish species. © 2022 Institute of Food Science and Technology
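    A toy illustration of the labeling rule as reconstructed above (the 7 log cfu per g spoilage threshold and the label names are assumptions):

```python
import math

def freshness_label(tvc_cfu_per_g):
    """Label a fish 'fresh' below 7 log cfu/g, 'not fresh' otherwise."""
    return "fresh" if math.log10(tvc_cfu_per_g) < 7 else "not fresh"
```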

    Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments

    First-person object-interaction tasks in high-fidelity 3D simulated environments such as the AI2Thor virtual home environment pose significant sample-efficiency challenges for reinforcement learning (RL) agents learning from sparse task rewards. To alleviate these challenges, prior work has provided extensive supervision via a combination of reward-shaping, ground-truth object-information, and expert demonstrations. In this work, we show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task during task learning with an object-centric relational RL agent. Our key insight is that learning an object-model that incorporates object-attention into forward prediction provides a dense learning signal for unsupervised representation learning of both objects and their relationships. This, in turn, enables faster policy learning for an object-centric relational RL agent. We demonstrate our agent by introducing a set of challenging object-interaction tasks in the AI2Thor environment where learning with our attentive object-model is key to strong performance. Specifically, we compare our agent and relational RL agents with alternative auxiliary tasks to a relational RL agent equipped with ground-truth object-information, and show that learning with our object-model best closes the performance gap in terms of both learning speed and maximum success rate. Additionally, we find that incorporating object-attention into an object-model's forward predictions is key to learning representations which capture object-category and object-state.
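    A conceptual sketch of the attentive object-model (dimensions, attention layout, and loss are assumptions, not the authors' implementation): object slots attend to each other, and the model predicts next-step object features from the attended slots and the action; its prediction error is the dense auxiliary signal trained alongside the RL loss.

```python
import torch
import torch.nn as nn

class AttentiveObjectModel(nn.Module):
    def __init__(self, obj_dim=64, act_dim=8, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(obj_dim, heads, batch_first=True)
        self.pred = nn.Linear(obj_dim + act_dim, obj_dim)

    def forward(self, objs, action):
        # objs: (batch, n_objects, obj_dim); action: (batch, act_dim)
        attended, _ = self.attn(objs, objs, objs)  # object-object attention
        act = action[:, None].expand(-1, objs.size(1), -1)
        return self.pred(torch.cat([attended, act], dim=-1))

def object_model_loss(model, objs_t, action_t, objs_t1):
    """Forward-prediction error used as an auxiliary learning signal."""
    return nn.functional.mse_loss(model(objs_t, action_t), objs_t1)
```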

    Pretrained Convolutional Neural Networks As Feature Extractor Of Eggshell Mottling Pattern For Quality Inspection

    Several technologies exist for inspecting macro- and micro-cracks on eggshells. However, inspecting the translucent areas that precede micro-cracks remains difficult. Transfer learning with pre-trained neural networks minimizes computational cost while classifying eggs into three classes (good, bad, and unknown) with high efficiency. AlexNet, ResNet, and Inception architectures are compared on classification accuracy. AlexNet gives the highest predictive accuracy (96.80%), followed by ResNet (93.15%) and Inception (90.16%). The AlexNet results are analyzed statistically with ANOVA and Student's t-test to measure significant differences between the mean accuracies on the training and testing image sets. Channel visualizations with activation strengths, summarized in a Pareto chart, show how the network learns to classify an egg. Deep-dream images are also generated to show inputs that produce the desired activations.
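    A minimal transfer-learning sketch with torchvision, assuming the three classes from the abstract; freezing everything but the final layer is one common choice, not necessarily the study's exact setup:

```python
import torch.nn as nn
from torchvision import models

model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
for p in model.parameters():              # freeze the pretrained extractor
    p.requires_grad = False
model.classifier[6] = nn.Linear(4096, 3)  # good / bad / unknown
# Train only the new head: pass model.classifier[6].parameters() to the optimizer.
```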

    FeetBack – Redirecting touch sensation from a prosthetic hand to the human foot

    Introduction: Adding sensory feedback to myoelectric prosthetic hands has been shown to enhance the user experience in terms of controllability and device embodiment. Often this is realized non-invasively by adding devices, such as actuators or electrodes, within the prosthetic shaft to deliver the desired feedback. However, adding a feedback system in the socket adds weight, takes up valuable space, and may interfere with myoelectric signals. To circumvent these drawbacks, we tested for the first time whether force feedback from a prosthetic hand could be redirected to another similarly sensitive part of the body: the foot. Methods: We developed a vibrotactile insole that vibrates depending on the sensed force on the prosthetic fingers. This self-controlled clinical pilot trial included four experienced users of myoelectric prostheses. The participants solved two types of tasks with the artificial hands: 1) sorting objects depending on their plasticity with the feedback insole but without audio-visual feedback, and 2) manipulating fragile, heavy, and delicate objects with and without the feedback insole. The sorting task was evaluated with Goodman-Kruskal's gamma for rank correlation. The manipulation tasks were assessed by success rate. Results: The sorting task with vibrotactile feedback showed a substantial positive effect. The success rates for manipulation tasks with fragile and heavy objects were high under both conditions (feedback on or off). The manipulation task with delicate objects revealed inferior success with feedback in three of four participants. Conclusion: We introduced a novel approach to touch sensation in myoelectric prostheses. The results for the sorting and manipulation tasks diverged, which is likely linked to the availability of multiple feedback sources. Our results for feedback redirected to the feet are in line with previous similar studies that applied feedback to the residual arm.
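    A hypothetical sketch of the core force-to-vibration mapping (the force range, dead zone, and linear mapping are assumptions; the abstract does not specify the device's transfer function):

```python
def vibration_intensity(force_n, f_min=0.5, f_max=20.0):
    """Map fingertip force [N] to a 0..1 vibration intensity for the insole."""
    if force_n <= f_min:        # dead zone: ignore sensor noise
        return 0.0
    return min((force_n - f_min) / (f_max - f_min), 1.0)
```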