3,245 research outputs found

    ImageNet Large Scale Visual Recognition Challenge

    The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the five years of the challenge, and propose future directions and improvements. Comment: 43 pages, 16 figures. v3 includes additional comparisons with PASCAL VOC (per-category comparisons in Table 3, distribution of localization difficulty in Fig 16), a list of queries used for obtaining object detection images (Appendix C), and some additional references.

    Autonomous Vehicles an overview on system, cyber security, risks, issues, and a way forward

    This chapter examines autonomous cars, analyzing their fundamental components and operational characteristics. It begins by explaining the internal mechanics of these vehicles, including the roles of sensors, artificial intelligence (AI) identification systems, and control mechanisms, and their integration with cloud-based servers within the Internet of Things (IoT). It then covers practical applications of autonomous cars, emphasizing their use in forecasting traffic patterns and transforming transportation. The chapter also discusses Robotic Process Automation (RPA), illustrating how autonomous cars affect different businesses through the automation of tasks. The primary focus is cybersecurity in the context of autonomous vehicles: the chapter analyzes risk management solutions aimed at protecting these vehicles from potential threats, and considers ethical, environmental, legal, professional, and social dimensions to offer a comprehensive perspective on their societal implications. It closes with a strategic plan for addressing these challenges and for navigating autonomous car systems, cybersecurity, hazards, and related concerns, supported by a compilation of resources for further investigation. Keywords: RPA, Cyber Security, AV, Risk, Smart Car

    Assinatura de objectos em rádio frequência

    Master's in Electronics and Telecommunications Engineering. The RF signature can be considered a fingerprint of an object when it is subjected to electromagnetic radiation. Based on this concept, the initial goal of this work was a comparative analysis of the radio frequency signatures of different materials. Using a prototype based on an adapted Wi-Fi network, an innovative system was developed that distinguishes materials by analyzing their interference in the propagation channel. To refine this distinction and improve the performance of the initial prototype, the received signal was processed with the Wavelet Transform. This technique served as a support tool for the system, giving a clearer differentiation of the studied targets. The versatility of the concept was demonstrated by analyzing the signatures of static targets such as metal, wood, and plastic, as well as moving targets, for example a moving human. Given the promising results obtained, the initial objective of the work was expanded, and this document also presents the concept of intruder detection through a Wi-Fi network by analysis of the Wavelet coefficients.

    Learned perception systems for self-driving vehicles

    2022 Spring. Includes bibliographical references. Building self-driving vehicles is one of the most impactful technological challenges of modern artificial intelligence. Self-driving vehicles are widely anticipated to revolutionize the way people and freight move. In this dissertation, we present a collection of work that aims to improve the capability of the perception module, an essential module for safe and reliable autonomous driving. Specifically, it focuses on two perception topics: 1) Geo-localization (mapping) of spatially-compact static objects, and 2) Multi-target object detection and tracking of moving objects in the scene. Accurately estimating the position of static objects, such as traffic lights, from the moving camera of a self-driving car is a challenging problem. In this dissertation, we present a system that improves the localization of static objects by jointly optimizing the components of the system via learning. Our system is comprised of networks that perform: 1) 5DoF object pose estimation from a single image, 2) association of objects between pairs of frames, and 3) multi-object tracking to produce the final geo-localization of the static objects within the scene. We evaluate our approach using a publicly available data set, focusing on traffic lights due to data availability. For each component, we compare against contemporary alternatives and show significantly improved performance. We also show that the end-to-end system performance is further improved via joint training of the constituent models. Next, we propose an efficient joint detection and tracking model named DEFT, or "Detection Embeddings for Tracking." The proposed approach relies on an appearance-based object matching network jointly learned with an underlying object detection network. An LSTM is also added to capture motion constraints.
DEFT has comparable accuracy and speed to the top methods on 2D online tracking leaderboards while having significant advantages in robustness when applied to more challenging tracking data. DEFT raises the bar on the nuScenes monocular 3D tracking challenge, more than doubling the performance of the previous top method (3.8x on AMOTA, 2.1x on MOTAR). We analyze the difference in performance between DEFT and the next-best published method on nuScenes and find that DEFT is more robust to occlusions and large inter-frame displacements, making it a superior choice for many use-cases. Third, we present an end-to-end model to solve the tasks of detection, tracking, and sequence modeling from raw sensor data, called Attention-based DEFT. Attention-based DEFT extends the original DEFT by adding an attentional encoder module that computes a tracklet embedding that 1) jointly reasons about the tracklet's dependencies and interactions with other objects present in the scene and 2) captures the context and temporal information of the tracklet's past observations. The experimental results show that Attention-based DEFT performs favorably against, or comparably to, state-of-the-art trackers. Reasoning about the interactions between the actors in the scene allows Attention-based DEFT to boost tracking performance in heavily crowded and complex interactive scenes. We validate the sequence modeling effectiveness of the proposed approach by showing its superiority over other baseline methods on the velocity estimation task in both simple and complex scenes. The experiments demonstrate the effectiveness of Attention-based DEFT at capturing the spatio-temporal interactions of the crowd for velocity estimation, which makes it more robust to the complexities of densely crowded scenes. The experimental results show that all the joint models in this dissertation perform better than solving each problem independently.