
    Local Motion Planner for Autonomous Navigation in Vineyards with a RGB-D Camera-Based Algorithm and Deep Learning Synergy

    With the advent of agriculture 3.0 and 4.0, researchers are increasingly focusing on the development of innovative smart farming and precision agriculture technologies by introducing automation and robotics into agricultural processes. Autonomous agricultural field machines have been gaining significant attention from farmers and industries as a way to reduce costs, human workload, and required resources. Nevertheless, achieving sufficient autonomous navigation capabilities requires the simultaneous cooperation of different processes: localization, mapping, and path planning are just some of the steps that aim at providing the machine with the right set of skills to operate in semi-structured and unstructured environments. In this context, this study presents a low-cost local motion planner for autonomous navigation in vineyards based only on an RGB-D camera, low-range hardware, and a dual-layer control algorithm. The first algorithm exploits the disparity map and its depth representation to generate a proportional control for the robotic platform. Concurrently, a second back-up algorithm, based on representation learning and resilient to illumination variations, can take control of the machine in case of a momentary failure of the first block. Moreover, due to the dual nature of the system, after the deep learning model is trained on an initial dataset, the strict synergy between the two algorithms opens the possibility of exploiting new automatically labeled data coming from the field to extend the existing model knowledge. The machine learning algorithm has been trained and tested, using transfer learning, with images acquired during different field surveys in northern Italy and then optimized for on-device inference with model pruning and quantization. Finally, the overall system has been validated with a customized robot platform in the relevant environment.
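    The abstract describes the first control layer only at a high level. A minimal sketch of a depth-based proportional controller of this kind might look like the following; the left/right split of the depth image, the clipping range, and the gain k_p are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def proportional_steering(depth_map, max_range=5.0, k_p=0.8):
    """Steer toward the side of the vine row with more free space.

    depth_map: HxW array of depths (metres) from the RGB-D camera.
    max_range and k_p are illustrative values, not the paper's tuning.
    """
    depth = np.clip(np.nan_to_num(depth_map, nan=0.0), 0.0, max_range)
    w = depth.shape[1]
    left = depth[:, : w // 2].mean()
    right = depth[:, w // 2:].mean()
    # A positive command turns toward the side with the larger mean depth,
    # i.e. away from the closer vine row.
    yaw_rate = k_p * (right - left) / max_range
    return float(np.clip(yaw_rate, -1.0, 1.0))
```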

    Viewfinder: final activity report

    The VIEW-FINDER project (2006-2009) is an 'Advanced Robotics' project that seeks to apply a semi-autonomous robotic system to inspect ground safety in the event of a fire. Its primary aim is to gather data (visual and chemical) in order to assist rescue personnel. A base station combines the gathered information with information retrieved from off-site sources. The project addresses key issues related to map building and reconstruction, interfacing local command information with external sources, human-robot interfaces, and semi-autonomous robot navigation. The VIEW-FINDER system is semi-autonomous: the individual robot-sensors operate autonomously within the limits of the task assigned to them, that is, they will autonomously navigate through and inspect an area. Human operators monitor their operations and send high-level task requests as well as low-level commands through the interface to any nodes in the entire system. The human interface has to ensure that the human supervisor and human interveners are provided with a reduced but relevant overview of the ground and of the robots and human rescue workers operating therein.

    Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

    This paper introduces Diffusion Policy, a new way of generating robot behavior by representing a robot's visuomotor policy as a conditional denoising diffusion process. We benchmark Diffusion Policy across 11 different tasks from 4 different robot manipulation benchmarks and find that it consistently outperforms existing state-of-the-art robot learning methods with an average improvement of 46.9%. Diffusion Policy learns the gradient of the action-distribution score function and iteratively optimizes with respect to this gradient field during inference via a series of stochastic Langevin dynamics steps. We find that the diffusion formulation yields powerful advantages when used for robot policies, including gracefully handling multimodal action distributions, being suitable for high-dimensional action spaces, and exhibiting impressive training stability. To fully unlock the potential of diffusion models for visuomotor policy learning on physical robots, this paper presents a set of key technical contributions including the incorporation of receding horizon control, visual conditioning, and the time-series diffusion transformer. We hope this work will help motivate a new generation of policy learning techniques that are able to leverage the powerful generative modeling capabilities of diffusion models. Code, data, and training details will be publicly available.
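    The sampler is only summarized in the abstract. The sketch below shows a generic DDPM-style reverse diffusion loop over an action sequence conditioned on an observation embedding; the noise-prediction network eps_model, the obs_embedding, and the 1-D noise schedule betas are placeholders for illustration and do not reproduce the paper's time-series diffusion transformer or its receding-horizon scheme.

```python
import torch

@torch.no_grad()
def sample_action_sequence(eps_model, obs_embedding, horizon, action_dim, betas):
    """Start from Gaussian noise and iteratively denoise a (horizon x action_dim)
    action sequence conditioned on the observation embedding."""
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    a = torch.randn(1, horizon, action_dim)                    # pure noise
    for t in reversed(range(len(betas))):
        eps = eps_model(a, torch.tensor([t]), obs_embedding)   # predicted noise
        # Posterior mean of the previous (less noisy) action sequence.
        a = (a - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:                                              # re-inject noise
            a = a + torch.sqrt(betas[t]) * torch.randn_like(a)
    return a
```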

    DoseDiff: Distance-aware Diffusion Model for Dose Prediction in Radiotherapy

    Treatment planning is a critical component of the radiotherapy workflow, typically carried out by a medical physicist in a time-consuming trial-and-error manner. Previous studies have proposed knowledge-based or deep learning-based methods for predicting dose distribution maps to assist medical physicists in improving the efficiency of treatment planning. However, these dose prediction methods usually lack the effective utilization of distance information between surrounding tissues and targets or organs-at-risk (OARs). Moreover, they are poor at maintaining the distribution characteristics of ray paths in the predicted dose distribution maps, resulting in a loss of valuable information for medical physicists. In this paper, we propose a distance-aware diffusion model (DoseDiff) for precise prediction of dose distribution. We define dose prediction as a sequence of denoising steps, wherein the predicted dose distribution map is generated with the conditions of the CT image and signed distance maps (SDMs). The SDMs are obtained by a distance transformation from the masks of targets or OARs, which provide the distance from each pixel in the image to the outline of the targets or OARs. Besides, we propose a multi-encoder and multi-scale fusion network (MMFNet) that incorporates multi-scale fusion and a transformer-based fusion module to enhance information fusion between the CT image and SDMs at the feature level. Our model was evaluated on two datasets collected from patients with breast cancer and nasopharyngeal cancer, respectively. The results demonstrate that our DoseDiff outperforms state-of-the-art dose prediction methods in terms of both quantitative and visual quality.
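    A minimal sketch of one common way to compute such a signed distance map from a binary target/OAR mask with SciPy is shown below; the sign convention (positive outside, negative inside) and the spacing handling are assumptions, not necessarily the paper's exact definition.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask, spacing=(1.0, 1.0)):
    """Signed Euclidean distance from every pixel to the mask outline.

    mask: binary array for a target or OAR; spacing: pixel size, so the
    distances come out in physical units (e.g. millimetres).
    """
    mask = mask.astype(bool)
    outside = distance_transform_edt(~mask, sampling=spacing)  # > 0 outside
    inside = distance_transform_edt(mask, sampling=spacing)    # > 0 inside
    return outside - inside
```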

    LMBiS-Net: A Lightweight Multipath Bidirectional Skip Connection based CNN for Retinal Blood Vessel Segmentation

    Blinding eye diseases are often correlated with altered retinal morphology, which can be clinically identified by segmenting retinal structures in fundus images. However, current methodologies often fall short in accurately segmenting delicate vessels. Although deep learning has shown promise in medical image segmentation, its reliance on repeated convolution and pooling operations can hinder the representation of edge information, ultimately limiting overall segmentation accuracy. In this paper, we propose a lightweight pixel-level CNN named LMBiS-Net for the segmentation of retinal vessels with an exceptionally low number of learnable parameters (only 0.172 M). The network uses multipath feature extraction blocks and incorporates bidirectional skip connections for the information flow between the encoder and decoder. Additionally, we have optimized the efficiency of the model by carefully selecting the number of filters to avoid filter overlap. This optimization significantly reduces training time and enhances computational efficiency. To assess the robustness and generalizability of LMBiS-Net, we performed comprehensive evaluations on various aspects of retinal images. Specifically, the model was subjected to rigorous tests to accurately segment retinal vessels, which play a vital role in ophthalmological diagnosis and treatment. By focusing on the retinal blood vessels, we were able to thoroughly analyze the performance and effectiveness of the LMBiS-Net model. The results of our tests demonstrate that LMBiS-Net is not only robust and generalizable but also capable of maintaining high levels of segmentation accuracy. These characteristics highlight the potential of LMBiS-Net as an efficient tool for high-speed and accurate segmentation of retinal images in various clinical applications.
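    As a rough illustration of how skip connections carry encoder features to the decoder in this kind of segmentation network, the toy PyTorch model below maps a fundus image to a vessel probability map; it does not reproduce LMBiS-Net's multipath feature extraction blocks or bidirectional skips, and the channel counts are arbitrary.

```python
import torch
import torch.nn as nn

class TinySkipSegNet(nn.Module):
    """Toy encoder-decoder with one skip connection for vessel segmentation.
    Assumes input height/width are even so the upsampled features match."""
    def __init__(self, ch=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        # The decoder sees upsampled deep features concatenated with the
        # encoder features carried over by the skip connection.
        self.dec = nn.Sequential(nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(ch, 1, 1))

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        d = torch.cat([self.up(m), e], dim=1)   # skip connection
        return torch.sigmoid(self.dec(d))
```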

    Guidelines for Best Practice and Quality Checking of Ortho Imagery

    For almost 10 years, JRC's "Guidelines for Best Practice and Quality Control of Ortho Imagery" has served as a reference document for the production of orthoimagery, not only for the purposes of CAP but also for many medium-to-large scale photogrammetric applications. The aim is to provide the European Commission and the remote sensing user community with a general framework of the best approaches for quality checking of orthorectified remotely sensed imagery, and the best practice expected to achieve good results. Since the last major revision (2003), the document has been regularly updated to include state-of-the-art technologies. The major revision of the document was initiated last year in order to consolidate the information introduced into the document over the last five years. Following the internal discussion and the outcomes of the meeting with an expert panel, it was decided to adopt, as far as possible, a process-based structure instead of the more sensor-based one used before, and also to keep the document as generic as possible by focusing on the core aspects of the photogrammetric process. In addition to the structural changes, new information was introduced, mainly concerning image resolution and radiometry, digital airborne sensors, data fusion, mosaicking, and data compression. The Guidelines for Best Practice are used as the basis for our work on the definition of technical specifications for the orthoimagery. The scope is to establish a core set of measures to ensure sufficient image quality for the purposes of CAP, and particularly for the Land Parcel Identification System (LPIS), and also to define the set of metadata necessary for data documentation and overall job tracking.

    TLS MODELS GENERATION ASSISTED BY UAV SURVEY

    By now, the documentation and 3D modelling of built heritage routinely involve terrestrial Lidar techniques (TLS, Terrestrial Laser Scanning) and large-scale mapping derived from UAV (Unmanned Aerial Vehicle) surveys. This paper reports an example of 3D survey and reality-based modelling applied to landscape and architectural assets. The choice of documentation methods, in terms of survey techniques, depends primarily on the issues and features of the area. The experience showed that the easy handling of TLS enabled its use in confined spaces among buildings and collapsed roofs, whereas the topographic measurement of GCPs (Ground Control Points), whether by total station or by GPS/RTK, was not easily feasible. Beyond proving that the integration of TLS and UAV photogrammetry can produce a multi-source, multi-scale model of an entire village, the experience served as a test of registering terrestrial clouds with the support of control points derived from the UAV survey; finally, a comparison among different strategies of cloud registration is reported. For each approach, a number of parameters (number of cloud registrations, number of required points, processing time, overall accuracy) were analysed to carry out the comparison. The test revealed that it is possible to reduce the large number of terrestrial control points when their determination by topographic measurement is difficult, and that the techniques can be combined not only for the integration of the final 3D model, but also to make the initial stage of the drafting process more effective.
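    One common way to register a terrestrial cloud onto control points derived from a UAV survey is a least-squares rigid alignment over corresponding GCPs. The sketch below (Kabsch/Umeyama without scale) is a generic solver given purely as an illustration, not the registration workflow actually used in the paper.

```python
import numpy as np

def rigid_alignment(src_pts, dst_pts):
    """Least-squares rotation R and translation t mapping src_pts onto
    dst_pts (both Nx3 arrays of corresponding control point coordinates)."""
    src_c, dst_c = src_pts.mean(axis=0), dst_pts.mean(axis=0)
    H = (src_pts - src_c).T @ (dst_pts - dst_c)           # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))                # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t   # apply as: aligned = (R @ cloud.T).T + t
```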