ImageNet Large Scale Visual Recognition Challenge
The ImageNet Large Scale Visual Recognition Challenge is a benchmark in
object category classification and detection on hundreds of object categories
and millions of images. The challenge has been run annually from 2010 to
present, attracting participation from more than fifty institutions.
This paper describes the creation of this benchmark dataset and the advances
in object recognition that have been possible as a result. We discuss the
challenges of collecting large-scale ground truth annotation, highlight key
breakthroughs in categorical object recognition, provide a detailed analysis of
the current state of the field of large-scale image classification and object
detection, and compare the state-of-the-art computer vision accuracy with human
accuracy. We conclude with lessons learned in the five years of the challenge,
and propose future directions and improvements.
Comment: 43 pages, 16 figures. v3 includes additional comparisons with PASCAL
VOC (per-category comparisons in Table 3, distribution of localization
difficulty in Fig 16), a list of queries used for obtaining object detection
images (Appendix C), and some additional reference
Autonomous Vehicles an overview on system, cyber security, risks, issues, and a way forward
This chapter explores autonomous cars, analyzing their fundamental components
and operational characteristics. The discussion begins by explaining the
internal mechanics of these vehicles, including the role of sensors,
artificial intelligence (AI) identification systems, control mechanisms, and
their integration with cloud-based servers within the framework of the
Internet of Things (IoT). It then covers practical applications of autonomous
cars, emphasizing their use in forecasting traffic patterns and transforming
transportation. The text also explores Robotic Process Automation (RPA),
illustrating the impact of autonomous cars on different businesses through the
automation of tasks. The primary focus is cybersecurity in the context of
autonomous vehicles. The chapter analyzes risk management solutions aimed at
protecting these vehicles from potential threats, and considers ethical,
environmental, legal, professional, and social dimensions to offer a broad
perspective on their societal implications. It closes with a strategic plan
for addressing the challenges of autonomous car systems, cybersecurity,
hazards, and related concerns, supported by a compilation of resources for
additional investigation.
Keywords: RPA, Cyber Security, AV, Risk, Smart Car
Explainable and Advisable Learning for Self-driving Vehicles
Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be triggered by the neural controller, namely introspective explanations, or informed by the neural controller's output, namely rationalizations. Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles. In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolution network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of controller and explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment, strong- and weak-alignment. These explainable systems represent an externalization of tacit knowledge. The network's opaque reasoning is simplified to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match training data. 
In Chapter 5, we propose to address this issue by augmenting training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts the way it attends to the scene (visual attention) and the control (steering and speed). Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls accordingly.
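The causal filtering step described in this abstract can be illustrated with a minimal sketch. It assumes that "influence" is tested by masking each attended region and checking whether the controller output changes. That masking test, the flat pixel list, and the names `causal_filter` and `toy_controller` are all hypothetical choices made for illustration, not the thesis's implementation.

```python
def causal_filter(model, image, regions, threshold=0.05, fill=0.0):
    """Keep only the regions whose masking noticeably changes the model output.

    Assumption: `image` is a flat list of pixel values and each region is a
    tuple of pixel indices highlighted by an attention model.
    """
    baseline = model(image)
    kept = []
    for region in regions:
        masked = list(image)
        for idx in region:
            masked[idx] = fill  # blank out the candidate region
        if abs(model(masked) - baseline) > threshold:
            kept.append(region)  # output moved, so the region is a true influence
    return kept


def toy_controller(pixels):
    """Hypothetical stand-in for a steering network: only pixels 0 and 1 matter."""
    weights = [0.5, 0.5, 0.0, 0.0]
    return sum(w * p for w, p in zip(weights, pixels))


image = [1.0, 1.0, 1.0, 1.0]
attended_regions = [(0, 1), (2, 3)]  # the second region is a spurious highlight
filtered = causal_filter(toy_controller, image, attended_regions)
print(filtered)  # [(0, 1)]
```

In this toy setup the second region is attended but has no effect on the output, so it is dropped. This mirrors the idea that filtering yields more succinct visual explanations.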
Assinatura de objectos em rádio frequência (Radio-frequency signature of objects)
Master's degree in Electronics and Telecommunications Engineering.
The RF signature can be considered the fingerprint an object exhibits when exposed to electromagnetic radiation. Based on this concept, the initial goal of this work was a comparative analysis of the radio-frequency signatures of different materials.
Using a prototype based on an adapted Wi-Fi network, an innovative system was developed that distinguishes materials by analyzing their interference in the propagation channel.
To refine this distinction, the received signal was processed with the Wavelet Transform. This technique serves as a support tool for the system, yielding clearer differentiation of the studied targets.
The versatility of the concept was demonstrated by analyzing the signatures of static targets such as metal, wood, and plastic, as well as moving targets, for example a person in motion.
Given the promising results, the initial objective of the work was expanded, and this document also presents the concept of intruder detection through a Wi-Fi network by analyzing the Wavelet coefficients.
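As a rough illustration of detection from wavelet coefficients, the sketch below computes a single-level Haar transform of a received-signal trace and flags a disturbance when the detail-coefficient energy rises well above a quiet baseline. The Haar wavelet, the energy threshold, the sample values, and the names `haar_dwt`, `detail_energy`, and `is_disturbed` are assumptions made for illustration. The abstract does not say which wavelet or decision rule the thesis uses.

```python
import math


def haar_dwt(signal):
    """Single-level Haar transform: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def detail_energy(signal):
    """Sum of squared detail coefficients, a simple measure of rapid variation."""
    _, detail = haar_dwt(signal)
    return sum(d * d for d in detail)


def is_disturbed(signal, baseline_energy, factor=5.0):
    """Hypothetical rule: flag when detail energy exceeds `factor` times baseline."""
    return detail_energy(signal) > factor * baseline_energy


# Made-up signal-strength samples: an undisturbed channel and a disturbed one.
quiet = [1.0, 1.1, 1.0, 1.1, 1.0, 1.1, 1.0, 1.1]
moving = [1.0, 1.1, 1.0, 2.0, 0.2, 1.1, 1.0, 1.1]

baseline = detail_energy(quiet)
print(is_disturbed(quiet, baseline))   # False
print(is_disturbed(moving, baseline))  # True
```

A practical system would more likely use a multi-level decomposition from an established wavelet library and calibrate the threshold on measured data.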
Learned perception systems for self-driving vehicles
2022 Spring. Includes bibliographical references.
Building self-driving vehicles is one of the most impactful technological challenges of modern artificial intelligence. Self-driving vehicles are widely anticipated to revolutionize the way people and freight move. In this dissertation, we present a collection of work that aims to improve the capability of the perception module, an essential module for safe and reliable autonomous driving. Specifically, it focuses on two perception topics: 1) Geo-localization (mapping) of spatially-compact static objects, and 2) Multi-target object detection and tracking of moving objects in the scene. Accurately estimating the position of static objects, such as traffic lights, from the moving camera of a self-driving car is a challenging problem. In this dissertation, we present a system that improves the localization of static objects by jointly optimizing the components of the system via learning. Our system is comprised of networks that perform: 1) 5DoF object pose estimation from a single image, 2) association of objects between pairs of frames, and 3) multi-object tracking to produce the final geo-localization of the static objects within the scene. We evaluate our approach using a publicly available data set, focusing on traffic lights due to data availability. For each component, we compare against contemporary alternatives and show significantly improved performance. We also show that the end-to-end system performance is further improved via joint training of the constituent models. Next, we propose an efficient joint detection and tracking model named DEFT, or "Detection Embeddings for Tracking." The proposed approach relies on an appearance-based object matching network jointly learned with an underlying object detection network. An LSTM is also added to capture motion constraints.
DEFT has comparable accuracy and speed to the top methods on 2D online tracking leaderboards while having significant advantages in robustness when applied to more challenging tracking data. DEFT raises the bar on the nuScenes monocular 3D tracking challenge, more than doubling the performance of the previous top method (3.8x on AMOTA, 2.1x on MOTAR). We analyze the difference in performance between DEFT and the next-best published method on nuScenes and find that DEFT is more robust to occlusions and large inter-frame displacements, making it a superior choice for many use cases. Third, we present an end-to-end model, called Attention-based DEFT, that solves the tasks of detection, tracking, and sequence modeling from raw sensor data. Attention-based DEFT extends the original DEFT by adding an attentional encoder module that computes a tracklet embedding that 1) jointly reasons about the tracklet's dependencies on and interactions with other objects present in the scene and 2) captures the context and temporal information of the tracklet's past observations. The experimental results show that Attention-based DEFT performs favorably against, or comparably to, state-of-the-art trackers. Reasoning about the interactions between the actors in the scene allows Attention-based DEFT to boost tracking performance in heavily crowded and complex interactive scenes. We validate the sequence modeling effectiveness of the proposed approach by showing its superiority over other baseline methods on the velocity estimation task in both simple and complex scenes. The experiments demonstrate the effectiveness of Attention-based DEFT at capturing the spatio-temporal interactions of the crowd for velocity estimation, which makes it more robust to the complexities of densely crowded scenes. The experimental results show that all the joint models in this dissertation perform better than solving each problem independently.
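The appearance-based matching mentioned in this abstract can be pictured with a small sketch that associates existing tracks with new detections by embedding similarity. It is only an illustration under stated assumptions. Cosine similarity, greedy assignment, the `min_sim` cutoff, and the names `cosine` and `greedy_match` are hypothetical, and DEFT itself learns its matching network jointly with the detector rather than using a fixed similarity rule.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def greedy_match(track_embs, det_embs, min_sim=0.5):
    """Greedily pair tracks with detections, most similar pairs first.

    Returns a dict mapping track index to detection index. Detections left
    unmatched would typically start new tracks.
    """
    pairs = sorted(
        (
            (cosine(t, d), ti, di)
            for ti, t in enumerate(track_embs)
            for di, d in enumerate(det_embs)
        ),
        reverse=True,
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for sim, ti, di in pairs:
        if sim < min_sim:
            break  # remaining pairs are too dissimilar to associate
        if ti in used_tracks or di in used_dets:
            continue
        matches[ti] = di
        used_tracks.add(ti)
        used_dets.add(di)
    return matches


# Made-up 2-D embeddings: two tracks, three detections in the next frame.
tracks = [[1.0, 0.0], [0.0, 1.0]]
detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
matches = greedy_match(tracks, detections)
print(matches)  # {0: 1, 1: 0}
```

Greedy assignment keeps the sketch short. An optimal assignment solver is a common alternative when many objects compete for the same detections.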