2,489 research outputs found

    Impact of Imaging and Distance Perception in VR Immersive Visual Experience

    Virtual reality (VR) headsets have evolved to offer unprecedented viewing quality. Meanwhile, they have become lightweight, wireless, and low-cost, which has opened them up to new applications and a much wider audience. VR headsets can now provide users with greater understanding of events and accuracy of observation, making decision-making faster and more effective. However, immersive technologies have seen slow take-up, with the adoption of virtual reality limited to a few applications, typically related to entertainment. This reluctance appears to be due to the often-necessary change of operating paradigm and some scepticism towards the "VR advantage". The need therefore arises to evaluate the contribution that a VR system can make to user performance, for example in monitoring and decision-making. This will help system designers understand when immersive technologies can be proposed to replace or complement standard display systems such as a desktop monitor. In parallel with the evolution of VR headsets, 360° cameras have also advanced and can now instantly acquire photographs and videos in stereoscopic 3D (S3D) modality at very high resolutions. 360° images are innately suited to VR headsets, where the captured view can be observed and explored through the natural rotation of the head. Acquired views can even be experienced and navigated from the inside as they are captured. The combination of omnidirectional images and VR headsets has opened up a new way of creating immersive visual representations. We call it photo-based VR. This represents a new methodology that combines traditional model-based rendering with high-quality omnidirectional texture-mapping. Photo-based VR is particularly suitable for applications related to remote visits and realistic scene reconstruction, useful for monitoring and surveillance systems, control panels and operator training. 
The presented PhD study investigates the potential of photo-based VR representations. It starts by evaluating the role of immersion and user performance in today's graphical visual experience, and then uses this as a reference to develop and evaluate new photo-based VR solutions. With the current literature on photo-based VR experience and associated user performance being very limited, this study builds new knowledge from the proposed assessments. We conduct five user studies on a few representative applications, examining how visual representations can be affected by system factors (camera and display related) and how they can influence human factors (such as realism, presence, and emotions). Particular attention is paid to realistic depth perception, in support of which we develop target solutions for photo-based VR. They are intended to provide users with a correct perception of spatial dimensions and object size. We call it true-dimensional visualization. The presented work contributes to unexplored fields including photo-based VR and true-dimensional visualization, offering immersive system designers a thorough comprehension of the benefits, potential, and type of applications in which these new methods can make a difference. This thesis manuscript and its findings have been partly presented in scientific publications. In particular, five conference papers at Springer and IEEE symposia [1], [2], [3], [4], [5] and one journal article in an IEEE periodical [6] have been published.

    Self-supervised learning for transferable representations

    Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that solely leverage raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. Our focus thenceforth is on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models against many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation. 
Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.

    A Benchmark Comparison of Visual Place Recognition Techniques for Resource-Constrained Embedded Platforms

    Autonomous navigation has become a widely researched area over the past few years, gaining a massive following due to its necessity in creating a fully autonomous robotic system. Autonomous navigation is an exceedingly difficult task to accomplish in and of itself. Successful navigation relies heavily on the ability to self-localise within a given environment. Without this awareness of one’s own location, it is impossible to navigate autonomously. Since its inception, Simultaneous Localization and Mapping (SLAM) has become one of the most widely researched areas of autonomous navigation. SLAM focuses on self-localization within a mapped or unmapped environment, and on constructing or updating the map of one’s surroundings. Visual Place Recognition (VPR) is an essential part of any SLAM system. VPR relies on visual cues to determine one’s location within a mapped environment. This thesis presents two main topics within the field of VPR. First, this thesis presents a benchmark analysis of several popular embedded platforms when performing VPR. The presented benchmark analyses six different VPR techniques across three different datasets, and investigates accuracy, CPU usage, memory usage, processing time and power consumption. The benchmark demonstrated a clear relationship between platform architecture and the metrics measured, with platforms of the same architecture achieving comparable accuracy and algorithm efficiency. Additionally, the Raspberry Pi platform was noted as a standout in terms of algorithm efficiency and power consumption. Secondly, this thesis proposes an evaluation framework intended to provide information about a VPR technique’s usability within a real-time application. The approach makes use of the incoming frame rate of an image stream and the VPR frame rate, the rate at which the technique can perform VPR, to determine how efficient VPR techniques would be in a real-time environment. 
This evaluation framework determined that CoHOG would be the most effective algorithm to be deployed in a real-time environment as it had the best ratio between computation time and accuracy.
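
The comparison between the incoming frame rate and the VPR frame rate described in this abstract can be illustrated with a minimal Python sketch. The function name `processed_fraction` and the formula itself (the ratio of VPR frame rate to incoming frame rate, capped at 1) are assumptions made here for illustration only; the abstract does not state the thesis's actual metric.

```python
def processed_fraction(incoming_fps: float, vpr_fps: float) -> float:
    """Fraction of incoming frames a VPR technique could keep up with.

    Hypothetical illustration: assumes the technique simply drops the
    frames it cannot process in time, so the fraction is vpr_fps divided
    by incoming_fps, capped at 1.0 when the technique is faster than
    the stream.
    """
    if incoming_fps <= 0 or vpr_fps <= 0:
        raise ValueError("frame rates must be positive")
    return min(1.0, vpr_fps / incoming_fps)


if __name__ == "__main__":
    # A technique running at 10 frames/s on a 30 frames/s stream
    # would handle roughly one frame in three under this assumption.
    print(processed_fraction(incoming_fps=30.0, vpr_fps=10.0))
```

Under this assumption, a value of 1.0 means the technique keeps pace with the stream, while lower values indicate how many frames would be skipped.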

    Design, Integration, and Field Evaluation of a Robotic Blossom Thinning System for Tree Fruit Crops

    The US apple industry relies heavily on a semi-skilled manual labor force for essential field operations such as training, pruning, blossom and green fruit thinning, and harvesting. Blossom thinning is one of the crucial crop load management practices to achieve the desired crop load, fruit quality, and return bloom. While several techniques, such as chemical and mechanical thinning, are available for large-scale blossom thinning, such approaches often yield unpredictable thinning results and may damage the canopy, spurs, and leaf tissue. Hence, growers still depend on labor-intensive and expensive manual blossom thinning for the desired thinning outcomes. This research presents a robotic solution for blossom thinning in apple orchards using a computer vision system with artificial intelligence, a six-degrees-of-freedom robotic manipulator, and an electrically actuated miniature end-effector. The integrated robotic system was evaluated in a commercial apple orchard and showed promising results for targeted and selective blossom thinning. Two thinning approaches, center and boundary thinning, were investigated to evaluate the system's ability to remove varying proportions of flowers from apple flower clusters. During boundary thinning the end-effector was actuated around the cluster boundary, while center thinning involved end-effector actuation only at the cluster centroid for a fixed duration of 2 seconds. The boundary thinning approach thinned 67.2% of flowers from the targeted clusters with a cycle time of 9.0 seconds per cluster, whereas the center thinning approach thinned 59.4% of flowers with a cycle time of 7.2 seconds per cluster. When commercially adopted, the proposed system could help address problems faced by apple growers with current hand, chemical, and mechanical blossom thinning approaches.

    Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5

    This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open access. The collected contributions of this volume have either been published or presented in international conferences, seminars, workshops and journals since the fourth volume was disseminated in 2015, or they are new. The contributions of each part of this volume are chronologically ordered. The first part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution Rules (PCR) of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignment in the fusion of sources of evidence with their Matlab codes. 
Because more applications of DSmT have emerged since the appearance of the fourth book of DSmT in 2015, the second part of this volume is about selected applications of DSmT mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender system, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), Silx-Furtif RUST code library for information fusion including PCR rules, and network for ship classification. Finally, the third part presents interesting contributions related to belief functions in general, published or presented over the years since 2015. These contributions are related to decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, negator of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions.

    OmniLRS: A Photorealistic Simulator for Lunar Robotics

    Developing algorithms for extra-terrestrial robotic exploration has always been challenging. Along with the complexity associated with these environments, one of the main issues remains the evaluation of said algorithms. With the regained interest in lunar exploration, there is also a demand for quality simulators that will enable the development of lunar robots. In this paper, we propose Omniverse Lunar Robotic-Sim (OmniLRS), a photorealistic lunar simulator based on Isaac Sim, Nvidia's robotic simulator. This simulation provides fast procedural environment generation, multi-robot capabilities, and a synthetic data pipeline for machine-learning applications. It comes with ROS1 and ROS2 bindings to control not only the robots, but also the environments. This work also performs sim-to-real rock instance segmentation to show the effectiveness of our simulator for image-based perception. Trained on our synthetic data, a YOLOv8 model achieves performance close to a model trained on real-world data, with a 5% performance gap. When finetuned with real data, the model achieves 14% higher average precision than the model trained on real-world data, demonstrating our simulator's photorealism. The code is fully open-source, accessible here: https://github.com/AntoineRichard/LunarSim, and comes with demonstrations. Comment: 7 pages, 4 figures.

    Deep learning in crowd counting: A survey

    Counting high-density objects quickly and accurately is a popular area of research. Crowd counting has significant social and economic value and is a major focus in artificial intelligence. Despite many advancements in this field, many of them are not widely known, especially in terms of research data. The authors proposed a three-tier standardised dataset taxonomy (TSDT). The taxonomy divides datasets into small-scale, large-scale and hyper-scale, according to different application scenarios. This theory can help researchers make more efficient use of datasets and improve the performance of AI algorithms in specific fields. Additionally, the authors proposed a new evaluation index for the clarity of the dataset: average pixel occupied by each object (APO). This new evaluation index is more suitable than image resolution for evaluating the clarity of a dataset in the object counting task. Moreover, the authors classified the crowd counting methods from a data-driven perspective (multi-scale networks, single-column networks, multi-column networks, multi-task networks, attention networks and weakly supervised networks) and introduced the classic crowd counting methods of each class. The authors classified the existing 36 datasets according to the three-tier standardised dataset taxonomy and discussed and evaluated these datasets. The authors evaluated the performance of more than 100 methods from the past five years on different levels of popular datasets. Recently, progress in research on small-scale datasets has slowed down. There are few new datasets and algorithms on small-scale datasets. The studies focused on large or hyper-scale datasets appear to be reaching a saturation point. The combined use of multiple approaches has become a major research direction. The authors discussed the theoretical and practical challenges of crowd counting from the perspective of data, algorithms and computing resources. 
The field of crowd counting is moving towards combining multiple methods and requires fresh, targeted datasets. Despite advancements, the field still faces challenges such as handling real-world scenarios and processing large crowds in real-time. Researchers are exploring transfer learning to overcome the limitations of small datasets. The development of effective algorithms for crowd counting remains a challenging and important task in computer vision and AI, with many opportunities for future research. Funding: BHF, AA/18/3/34220; Hope Foundation for Cancer Research, RM60G0680; GCRF, P202PF11; Sino‐UK Industrial Fund, RP202G0289; LIAS, P202ED10, P202RE969; Data Science Enhancement Fund, P202RE237; Sino‐UK Education Fund, OP202006; Fight for Sight, 24NN201; Royal Society International Exchanges Cost Share Award, RP202G0230; MRC, MC_PC_17171; BBSRC, RM32G0178B.
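
The APO index named in this abstract (average pixel occupied by each object) can be illustrated with a short Python sketch. The abstract does not give the formula, so the definition used here, total image pixels divided by total annotated objects across a dataset, is an assumption for illustration, as are the function name `average_pixels_per_object` and the `(width, height, object_count)` input layout.

```python
from typing import Iterable, Tuple


def average_pixels_per_object(images: Iterable[Tuple[int, int, int]]) -> float:
    """Assumed APO: total pixels divided by total annotated objects.

    `images` yields hypothetical (width, height, object_count) triples,
    one per image in the dataset.
    """
    total_pixels = 0
    total_objects = 0
    for width, height, count in images:
        total_pixels += width * height
        total_objects += count
    if total_objects == 0:
        raise ValueError("dataset contains no annotated objects")
    return total_pixels / total_objects


if __name__ == "__main__":
    # Two made-up images: 100x100 with 10 objects, 200x100 with 20 objects.
    print(average_pixels_per_object([(100, 100, 10), (200, 100, 20)]))
```

Under this assumed definition, a high-resolution image packed with a very dense crowd can still have a low APO, which is the sense in which such an index would say more about clarity for counting than resolution alone.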

    Perception Intelligence Integrated Vehicle-to-Vehicle Optical Camera Communication.

    Ubiquitous usage of cameras and LEDs in modern road and aerial vehicles opens up endless opportunities for novel applications in intelligent machine navigation, communication, and networking. To this end, in this thesis work, we hypothesize the benefit of dual-mode usage of vehicular built-in cameras through novel machine perception capabilities combined with optical camera communication (OCC). The current key conception of understanding a line-of-sight (LOS) scene is from the aspect of object, event, and road situation detection. However, the idea of blending non-line-of-sight (NLOS) information with LOS information to achieve a virtual see-through vision is new. This improves assistive driving performance by enabling a machine to see beyond occlusion. Another aspect of OCC in the vehicular setup is to understand the nature of mobility and its impact on the optical communication channel quality. The research questions gathered from both car-to-car mobility modelling and the evaluation of a working OCC communication channel setup can also be carried over to aerial vehicular situations such as drone-to-drone OCC. The aim of this thesis is to answer the research questions along these new application domains, in particular: (i) how to enable a virtual see-through perception in the car assisting system that alerts the human driver about visible and invisible critical driving events to help drive more safely, (ii) how transmitter-receiver cars behave while in motion and the overall channel performance of OCC in the motion modality, (iii) how to help rescue lost Unmanned Aerial Vehicles (UAVs) through coordinated localization with fusion of OCC and WiFi, (iv) how to model and simulate an in-field drone swarm operation experience to design and validate UAV coordinated localization for a group of positioning-distressed drones. 
In this regard, in this thesis, we present the end-to-end system design, proposed novel algorithms to solve the challenges in applying such a system, and evaluation results through experimentation and/or simulation.

    ABC: Adaptive, Biomimetic, Configurable Robots for Smart Farms - From Cereal Phenotyping to Soft Fruit Harvesting

    Currently, numerous factors, such as demographics, migration patterns, and economics, are leading to a critical labour shortage in low-skilled and physically demanding parts of agriculture. Thus, robotics can be developed for the agricultural sector to address these shortages. This study aims to develop an adaptive, biomimetic, and configurable modular robotics architecture that can be applied to multiple tasks (e.g., phenotyping, cutting, and picking), various crop varieties (e.g., wheat, strawberry, and tomato) and growing conditions. These robotic solutions cover the entire perception–action–decision-making loop targeting the phenotyping of cereals and harvesting fruits in a natural environment. The primary contributions of this thesis are as follows. a) A high-throughput method for imaging field-grown wheat in three dimensions, along with an accompanying unsupervised measuring method for obtaining individual wheat spike data, is presented. The unsupervised method analyses the 3D point cloud of each trial plot, containing hundreds of wheat spikes, and calculates the average size of the wheat spike and total spike volume per plot. Experimental results reveal that the proposed algorithm can effectively identify spikes from wheat crops and individual spikes. b) Unlike cereal, soft fruit is typically harvested by manual selection and picking. To enable robotic harvesting, the initial perception system uses conditional generative adversarial networks to identify ripe fruits using synthetic data. To determine whether the strawberry is surrounded by obstacles, a cluster complexity-based perception system is further developed to classify the harvesting complexity of ripe strawberries. c) Once the harvest-ready fruit is localised using point cloud data generated by a stereo camera, the platform’s action system can coordinate the arm to reach/cut the stem using the passive motion paradigm framework, as inspired by studies on neural control of movement in the brain. 
Results from field trials for strawberry detection, reaching/cutting the stem of the fruit with a mean error of less than 3 mm, and extension to analysing complex canopy structures/bimanual coordination (searching/picking) are presented. Although this thesis focuses on strawberry harvesting, ongoing research is heading toward adapting the architecture to other crops. The agricultural food industry remains a labour-intensive sector with low margins and a cost- and time-efficient business model. The concepts presented herein can serve as a reference for future agricultural robots that are adaptive, biomimetic, and configurable.

    A robotic platform for precision agriculture and applications

    Agricultural techniques have been improved over the centuries to meet the growing demand of an increasing global population. Farming applications are facing new challenges to satisfy global needs, and recent technological advancements in robotic platforms can be exploited. As orchard management is one of the most challenging applications because of its tree structure and the required interaction with the environment, it was also targeted by the University of Bologna research group to provide a customized solution addressing a new concept for agricultural vehicles. The result of this research is a new lightweight tracked vehicle capable of performing autonomous navigation both in the open-field scenario and while travelling inside orchards, in what has been called in-row navigation. The mechanical design concept, together with the customized software implementation, has been detailed to highlight the strengths of the platform, along with some further improvements envisioned to increase the overall performance. Static stability testing has proved that the vehicle can withstand steep-slope scenarios. Some improvements have also been investigated to refine the estimation of the slippage that occurs during turning maneuvers and that is typical of skid-steering tracked vehicles. The software architecture has been implemented using the Robot Operating System (ROS) framework, so as to exploit community-available packages related to common and basic functions, such as sensor interfaces, while allowing a dedicated custom implementation of the navigation algorithm developed. Real-world testing inside the university’s experimental orchards has proven the robustness and stability of the solution with more than 800 hours of fieldwork. The vehicle has also enabled a wide range of autonomous tasks such as spraying, mowing, and on-the-field data collection. 
The latter can be exploited to automatically estimate relevant orchard properties such as fruit counting and sizing, canopy properties estimation, and autonomous fruit harvesting with post-harvesting estimations.