8,170 research outputs found

    Deep saliency detection-based pedestrian detection with multispectral multi-scale features fusion network

    Get PDF
    In recent years, there has been increased interest in multispectral pedestrian detection using visible and infrared image pairs, owing to the complementary visual information provided by these modalities, which enhances the robustness and reliability of pedestrian detection systems. However, current research in multispectral pedestrian detection faces the challenge of effectively integrating the different modalities to reduce the system's miss rate. This article presents an improved method for multispectral pedestrian detection. The method utilises a saliency detection technique to modify the infrared image and obtain an infrared-enhanced map with clear pedestrian features. Subsequently, a multi-scale image feature fusion network is designed to efficiently fuse the visible image and the infrared-enhanced map. Finally, in conjunction with a light-perception sub-network, the fusion network is supervised by three loss functions covering illumination perception, light intensity, and texture information. The experimental results show that, under the "reasonable" setting, the proposed method improves the logarithmic mean miss rate for the three main subgroups (all-day, day, and night) to 3.12%, 3.06%, and 4.13%, respectively, compared with 3.11%, 2.77%, and 2.56% for the traditional method, demonstrating the effectiveness of the proposed method.
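    The abstract does not spell out how the three supervision terms are combined; the sketch below is one plausible PyTorch formulation, shown purely for illustration. The gradient-based texture term, the illumination gate, and all weight values are assumptions rather than the authors' implementation.

        import torch
        import torch.nn.functional as F

        def gradient_magnitude(img):
            # Finite-difference gradient magnitude, used as a simple texture proxy.
            gx = (img[..., :, 1:] - img[..., :, :-1]).abs().mean()
            gy = (img[..., 1:, :] - img[..., :-1, :]).abs().mean()
            return gx + gy

        def fusion_loss(fused, visible, ir_enhanced, illum_pred, day_label,
                        w_illum=1.0, w_intensity=1.0, w_texture=0.1):
            """Three-term supervision sketched from the abstract; weights are placeholders."""
            # 1) Illumination perception: supervise the light-perception sub-network's
            #    prediction (a value in [0, 1]) against a day/night label.
            l_illum = F.binary_cross_entropy(illum_pred, day_label)

            # 2) Light intensity: favour the visible image in bright scenes and the
            #    infrared-enhanced map in dark ones, gated by the predicted illumination.
            gate = illum_pred.detach().mean()
            l_intensity = gate * F.l1_loss(fused, visible) \
                          + (1.0 - gate) * F.l1_loss(fused, ir_enhanced)

            # 3) Texture: keep the stronger gradients of the two source modalities.
            target = torch.maximum(gradient_magnitude(visible),
                                   gradient_magnitude(ir_enhanced))
            l_texture = (gradient_magnitude(fused) - target).abs()

            return w_illum * l_illum + w_intensity * l_intensity + w_texture * l_texture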

    Self-supervised learning for transferable representations

    Get PDF
    Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process: it is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that leverage only raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. We then focus on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models on many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks move beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalises to real-world transformations. This begins to explain the differing empirical performance achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation. Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with the downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.
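    The trade-off between spatial and appearance-based invariances mentioned above can be made concrete with two contrasting augmentation policies. The torchvision pipelines below are a rough, hypothetical illustration of that idea; the specific transforms and parameter values are placeholders, not the ones studied in the thesis.

        from torchvision import transforms

        # Appearance-heavy policy: strong colour/blur perturbations, geometry mostly intact,
        # encouraging invariance to appearance changes.
        appearance_aug = transforms.Compose([
            transforms.RandomResizedCrop(224, scale=(0.9, 1.0)),  # only mild cropping
            transforms.ColorJitter(0.8, 0.8, 0.8, 0.2),
            transforms.RandomGrayscale(p=0.2),
            transforms.GaussianBlur(kernel_size=23),
            transforms.ToTensor(),
        ])

        # Spatial-heavy policy: aggressive cropping and flipping, appearance untouched,
        # encouraging invariance to spatial changes instead.
        spatial_aug = transforms.Compose([
            transforms.RandomResizedCrop(224, scale=(0.2, 1.0)),  # aggressive crops
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
        ])

    Training two contrastive encoders, one per policy, and comparing their transfer scores on classification-style versus localisation-style tasks is one way to surface the trade-off the thesis describes.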

    Multidisciplinary perspectives on Artificial Intelligence and the law

    Get PDF
    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics, and the law.

    A human motion measurement method for programming by demonstration

    Full text link
    Programming by demonstration (PbD) is an intuitive approach for imparting a task to a robot from one or several demonstrations given by a human teacher. Acquiring the demonstrations involves solving the correspondence problem when the teacher and the learner differ in sensing and actuation. Kinesthetic guidance is widely used to perform demonstrations: the robot is manipulated by the teacher and the demonstrations are recorded by the robot's encoders. In this way, the correspondence problem is trivial, but the teacher's dexterity is hampered, which may impact the PbD process. Methods that are more practical for the teacher usually require the identification of some mappings to solve the correspondence problem. The choice of demonstration acquisition method is therefore a compromise between the difficulty of identifying these mappings, the accuracy of the recorded elements, and the user-friendliness and convenience for the teacher. This thesis proposes an inertial human motion tracking method based on inertial measurement units (IMUs) for PbD of pick-and-place tasks. Compared to kinesthetic guidance, IMUs are convenient and easy to use but offer limited accuracy; their potential for PbD applications is investigated. To estimate the trajectory of the teacher's hand, three IMUs are placed on the arm segments (arm, forearm, and hand) to estimate their orientations. A specific method is proposed to partially compensate for the well-known drift of the sensor orientation estimate around the gravity direction by exploiting the particular configuration of the demonstration. This method, called heading reset, is based on the assumption that the sensor passes through its original heading, with stationary phases, several times during the demonstration. The heading reset is implemented in an integration and vector observation algorithm, and several experiments illustrate its advantages. A comprehensive inertial human hand motion tracking (IHMT) method for PbD is then developed. It includes an initialization procedure to estimate the orientation of each sensor with respect to the corresponding arm segment and the initial orientation of the sensor with respect to the teacher-attached frame. The procedure involves a rotation and a static position of the extended arm, making the measurement system robust to the positioning of the sensors on the segments. A procedure for estimating the position of the human teacher relative to the robot and a calibration procedure for the parameters of the method are also proposed. Finally, the error of the human hand trajectory is measured experimentally and found to lie between 28.5 mm and 61.8 mm. The mappings to solve the correspondence problem are identified; unfortunately, the observed accuracy of the IHMT method is not sufficient for a PbD process. To reach the necessary level of accuracy, a method is proposed to correct the hand trajectory obtained by IHMT using vision data, which complement the inertial sensors. For the sake of simplicity and robustness, the vision system tracks only the objects, not the teacher. The correction is based on so-called Positions Of Interest (POIs) and involves three steps: identifying the POIs in the inertial and vision data, pairing hand POIs with object POIs that correspond to the same action in the task, and finally correcting the hand trajectory based on the POI pairs. The complete method for demonstration acquisition is experimentally evaluated in a full PbD process. This experiment reveals the advantages of the proposed method over kinesthetic guidance in the context of this work.
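    The abstract only states the assumptions behind the heading reset; the sketch below shows one simple way such a correction could be realised, given a yaw estimate and gyroscope magnitudes. The thresholds, the stationarity test, and the function itself are illustrative assumptions, not the algorithm developed in the thesis.

        import numpy as np

        def heading_reset(yaw, gyro_norm, initial_yaw=0.0,
                          still_thresh=0.05, yaw_window=np.deg2rad(10)):
            """Remove accumulated yaw drift whenever the sensor is (a) roughly stationary
            and (b) close to its initial heading (the two assumptions behind the reset)."""
            corrected = np.copy(yaw)
            drift = 0.0
            for k in range(len(yaw)):
                stationary = gyro_norm[k] < still_thresh               # rad/s, placeholder threshold
                near_start = abs(yaw[k] - drift - initial_yaw) < yaw_window
                if stationary and near_start:
                    # Whatever heading offset remains here is attributed to drift.
                    drift = yaw[k] - initial_yaw
                corrected[k] = yaw[k] - drift
            return corrected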

    Neural Architecture Search for Image Segmentation and Classification

    Get PDF
    Deep learning (DL) is a class of machine learning algorithms that relies on deep neural networks (DNNs) for computation. Unlike traditional machine learning algorithms, DL can learn directly and effectively from raw data. Hence, DL has been successfully applied to tackle many real-world problems. When applying DL to a given problem, the primary task is designing the optimum DNN. This task relies heavily on human expertise, is time-consuming, and requires many trial-and-error experiments. This thesis aims to automate the laborious task of designing the optimum DNN by exploring the neural architecture search (NAS) approach. Here, we propose two new NAS algorithms for two real-world problems: pedestrian lane detection for assistive navigation and hyperspectral image segmentation for biosecurity scanning. Additionally, we introduce a new dataset-agnostic predictor of neural network performance, which can be used to speed up NAS algorithms that require the evaluation of candidate DNNs.
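    The predictor is only mentioned, not described; as a rough illustration of how a performance predictor can cut down NAS cost, the sketch below fits a regressor on already-evaluated architectures and ranks unseen ones without training them. The 16-dimensional encoding and the synthetic scores are placeholders, not the thesis' design.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        rng = np.random.default_rng(0)
        # Synthetic stand-ins: each architecture is encoded as a fixed-length vector
        # (layer types, widths, kernel sizes, ...) with a measured validation score.
        encodings = rng.random((200, 16))
        scores = rng.random(200)

        predictor = RandomForestRegressor(n_estimators=100).fit(encodings, scores)

        # Rank a fresh batch of candidate architectures without training any of them.
        candidates = rng.random((1000, 16))
        top10 = np.argsort(predictor.predict(candidates))[::-1][:10]
        print("indices of the most promising candidates:", top10)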

    Robot-Enabled Construction Assembly with Automated Sequence Planning based on ChatGPT: RoboGPT

    Full text link
    Robot-based assembly in construction has emerged as a promising solution to numerous challenges such as increasing costs, labor shortages, and the demand for safe and efficient construction processes. One of the main obstacles to realizing the full potential of these robotic systems is the need for effective and efficient sequence planning for construction tasks. Current approaches, including mathematical and heuristic techniques and machine learning methods, face limitations in their adaptability and scalability to dynamic construction environments. To expand the sequential-understanding ability of current robot systems, this paper introduces RoboGPT, a novel system that leverages the advanced reasoning capabilities of ChatGPT, a large language model, for automated sequence planning in robot-based assembly applied to construction tasks. The proposed system adapts ChatGPT for construction sequence planning and demonstrates its feasibility and effectiveness through experimental evaluation, including two case studies and 80 trials involving real construction tasks. The results show that RoboGPT-driven robots can handle complex construction operations and adapt to changes on the fly. This paper contributes to the ongoing efforts to enhance the capabilities and performance of robot-based assembly systems in the construction industry, and it paves the way for further integration of large language model technologies in the field of construction robotics. Comment: 14 pages, 20 figures, submitted to IEEE Access
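    The paper's prompts are not reproduced here, so the sketch below only gestures at how a chat model could be queried for an assembly ordering. The prompt wording, the model name, and the JSON output contract are all assumptions; it uses the OpenAI Python SDK and expects an API key in the environment.

        import json
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment

        def plan_assembly(elements, constraints):
            """Ask a chat model for an ordered assembly sequence (illustrative prompt only)."""
            prompt = (
                "You are planning a construction assembly task for a robot arm.\n"
                f"Elements to place: {elements}\n"
                f"Constraints: {constraints}\n"
                "Reply with only a JSON list of element names in assembly order."
            )
            reply = client.chat.completions.create(
                model="gpt-4o",  # model choice is a placeholder
                messages=[{"role": "user", "content": prompt}],
            )
            # A robust system would validate the returned sequence against the
            # constraints before sending it to the robot; json.loads alone is not enough.
            return json.loads(reply.choices[0].message.content)

        sequence = plan_assembly(
            elements=["base_block", "left_column", "beam"],
            constraints=["left_column sits on base_block", "beam rests on left_column"],
        )
        print(sequence)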

    Pre-Deployment Testing of Low Speed, Urban Road Autonomous Driving in a Simulated Environment

    Full text link
    Low-speed autonomous shuttles emulating SAE Level 4 automated driving with human-driver-assisted autonomy have been operating in geo-fenced areas in several cities in the US and around the world. These autonomous vehicles (AVs) are operated by small to mid-sized technology companies that do not have the resources of automotive OEMs to carry out exhaustive, comprehensive testing of their AV technology before public road deployment. Because they operate at low speed and therefore do not use highways, the base vehicles of these AV shuttles are not required to go through rigorous certification tests. The way driver-assisted AV technology is tested and approved for public road deployment is continuously evolving, is not standardized, and differs between the states where these vehicles operate. Currently, AVs and AV shuttles deployed on public roads are using these deployments to test and improve their technology. However, this is not the right approach. Safe and extensive testing in a lab and controlled test environment, including Model-in-the-Loop (MiL), Hardware-in-the-Loop (HiL), and Autonomous-Vehicle-in-the-Loop (AViL) testing, should be the prerequisite to such public road deployments. This paper presents three-dimensional virtual modeling of an AV shuttle deployment site and simulation testing in this virtual environment. These AV shuttles have two deployment sites in Columbus through the Department of Transportation-funded Smart City Challenge project, Smart Columbus. The Linden residential area AV shuttle deployment site of Smart Columbus is used as the specific example to illustrate the AV testing method proposed in this paper.

    High-statistics pedestrian dynamics on stairways and their probabilistic fundamental diagrams

    Full text link
    Staircases play an essential role in crowd dynamics, allowing pedestrians to flow across large multi-level public facilities such as transportation hubs and office buildings. Achieving a robust understanding of pedestrian behavior in these facilities is a key societal necessity. What makes this an outstanding scientific challenge is the extreme randomness intrinsic to pedestrian behavior: any quantitative understanding necessarily needs to be probabilistic, covering both average dynamics and fluctuations. In this work, we analyze data from an unprecedentedly high-statistics, year-long pedestrian tracking campaign in which we anonymously collected millions of trajectories across a staircase within Eindhoven train station (NL). This was made possible by a state-of-the-art, faster-than-real-time computer vision approach hinged on 3D depth imaging and YOLOv7-based depth localization. We consider both free-stream conditions, i.e. pedestrians walking undisturbed, and trafficked conditions with uni- and bidirectional flows. We report position versus density, considering the crowd as a 'compressible' physical medium, and show how pedestrians willingly opt to occupy less space than available, accepting a certain degree of compressibility. This is a non-trivial physical feature of pedestrian dynamics, and we introduce a novel way to quantify this effect. As density increases, pedestrians strive to keep a minimum distance d = 0.6 m from the person in front of them. Finally, we establish first-of-their-kind, fully resolved probabilistic fundamental diagrams, in which we model the pedestrian walking velocity as a mixture of a slow-paced and a fast-paced component. Notably, averages and modes of the velocity distribution turn out to be substantially different. Our results, including probabilistic parametrizations based on a few variables, are key to improved facility design and realistic simulation of pedestrians on staircases.
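    The two-component velocity mixture lends itself to a short worked example. The sketch below fits a Gaussian mixture to synthetic walking speeds and shows how the overall mean and the component modes can disagree, which is the point the abstract makes; the component parameters are invented placeholders, not the paper's fitted values.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(1)
        # Synthetic stairway walking speeds (m/s): a slow-paced and a fast-paced population.
        speeds = np.concatenate([rng.normal(0.45, 0.08, 3000),   # slow component (assumed)
                                 rng.normal(0.75, 0.10, 7000)])  # fast component (assumed)

        gmm = GaussianMixture(n_components=2, random_state=0).fit(speeds.reshape(-1, 1))
        print("component means (m/s):", gmm.means_.ravel())
        print("component weights:   ", gmm.weights_.ravel())
        # With a skewed mixture, the overall average differs from the dominant mode.
        print("overall mean (m/s):  ", speeds.mean())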

    Specialized translation at work for a small, expanding business: my experience with the Chinese internationalization of Bioretics© S.r.l.

    Get PDF
    Global markets are currently immersed in two all-encompassing and unstoppable processes: internationalization and globalization. While the former pushes companies to look beyond the borders of their country of origin to forge relationships with foreign trading partners, the latter fosters standardization across countries by reducing spatiotemporal distances and breaking down geographical, political, economic, and socio-cultural barriers. In recent decades, another force has emerged to propel these unifying drives: Artificial Intelligence, with technologies that aim to reproduce human cognitive abilities in machines. The “Language Toolkit – Le lingue straniere al servizio dell’internazionalizzazione dell’impresa” project, promoted by the Department of Interpreting and Translation (Forlì Campus) in collaboration with the Romagna Chamber of Commerce (Forlì-Cesena and Rimini), seeks to help Italian SMEs make their way into the global market. It is precisely within this project that this dissertation was conceived. Its purpose is to present the English-to-Chinese translation and localization of a series of texts produced by Bioretics© S.r.l.: an investor deck, the company website, and part of the installation and use manual of the Aliquis© framework software, its flagship product. This dissertation is structured as follows: Chapter 1 presents the project and the company in detail; Chapter 2 outlines the internationalization and globalization processes and the Artificial Intelligence market in both Italy and China; Chapter 3 provides the theoretical foundations for every aspect related to Specialized Translation, including website localization; Chapter 4 describes the resources and tools used to perform the translations; Chapter 5 proposes an analysis of the source texts; Chapter 6 is a commentary on translation strategies and choices.