End to End Learning in Autonomous Driving Systems
Convolutional neural networks have advanced visual perception significantly in recent years. Two major ingredients that enable this success are the composition of simple modules into a complex network and end-to-end optimization. However, this success has not yet revolutionized robotics as much as vision, even though robotics suffers from similar problems to traditional computer vision, i.e. the imperfection of manually designed system pipelines. This thesis investigates end-to-end learning for the autonomous driving system, a concrete robotic application. End-to-end learning can produce reasonable driving behaviors, even in complex urban driving scenarios. Representation learning in end-to-end driving models is crucial, and auxiliary vision tasks such as semantic segmentation can help to form a more informative driving representation, especially when training data is limited. Naive convolutional neural networks are usually only capable of reactive control and cannot perform complex reasoning about a particular scenario. This thesis also studies how to handle scene-conditioned driving behavior, which goes beyond the capability of reactive control. Alongside the end-to-end structure, learning methods also play a critical role. Imitation learning methods acquire meaningful behaviors, but the robot usually cannot master the skill. Reinforcement learning, by contrast, barely learns anything if the environment is too complex, but can master the skill otherwise. To get the best of both worlds, this thesis proposes an algorithmically unified method to learn from both demonstration data and the environment
Kartezio: Evolutionary Design of Explainable Pipelines for Biomedical Image Analysis
An unresolved issue in contemporary biomedicine is the overwhelming number
and diversity of complex images that require annotation, analysis and
interpretation. Recent advances in Deep Learning have revolutionized the field
of computer vision, creating algorithms that compete with human experts in
image segmentation tasks. Crucially, however, these frameworks require large
human-annotated datasets for training and the resulting models are difficult to
interpret. In this study, we introduce Kartezio, a modular Cartesian Genetic
Programming based computational strategy that generates transparent and easily
interpretable image processing pipelines by iteratively assembling and
parameterizing computer vision functions. The pipelines thus generated exhibit
comparable precision to state-of-the-art Deep Learning approaches on instance
segmentation tasks, while requiring drastically smaller training datasets, a
feature which confers tremendous flexibility, speed, and functionality to this
approach. We also deployed Kartezio to solve semantic and instance segmentation
problems in four real-world use cases, and showcased its utility in imaging
contexts ranging from high-resolution microscopy to clinical pathology. By
successfully implementing Kartezio on a portfolio of images ranging from
subcellular structures to tumoral tissue, we demonstrated the flexibility,
robustness and practical utility of this fully explicable evolutionary designer
for semantic and instance segmentation.
Comment: 36 pages, 6 main Figures. The Extended Data Movie is available at the
following link: https://www.youtube.com/watch?v=r74gdzb6hdA. The source code
is available on GitHub: https://github.com/KevinCortacero/Kartezi
Explainability in Deep Reinforcement Learning
A large part of the explainable Artificial Intelligence (XAI) literature is
emerging on feature relevance techniques that explain a deep neural network (DNN)
output, or on explaining models that ingest image source data. However, assessing
how XAI techniques can help understand models beyond classification tasks, e.g.
for reinforcement learning (RL), has not been extensively studied. We review
recent works aimed at attaining Explainable Reinforcement Learning
(XRL), a relatively new subfield of Explainable Artificial Intelligence,
intended to be used in general public applications, with diverse audiences,
requiring ethical, responsible and trustable algorithms. In critical situations
where it is essential to justify and explain the agent's behaviour, better
explainability and interpretability of RL models could help gain scientific
insight on the inner workings of what is still considered a black box. We
evaluate mainly studies directly linking explainability to RL, and split these
into two categories according to the way the explanations are generated:
transparent algorithms and post-hoc explainability. We also review the most
prominent XAI works through the lens of how they could potentially inform the
further deployment of the latest advances in RL, in the demanding present and
future of everyday problems.
Comment: Article accepted at Knowledge-Based System
Mixed-input second-hand car price estimation model based on scraped data
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
The number of second-hand cars is growing year by year. More and more people prefer to buy a
second-hand car rather than a new one due to the increasing cost of new cars and their fast
devaluation in price. Consequently, there has also been an increase in online marketplaces for
peer-to-peer (P2P) second-hand car trades. A robust price estimation is needed for both dealers, to have a
good idea of how to price their cars, and buyers, to understand whether a listing is overpriced or not.
Price estimation for second-hand cars has been, to my knowledge, so far only explored with numerical
and categorical features such as mileage driven, brand or production year. An approach that also uses
image data has yet to be developed.
This work aims to investigate the use of a multi-input price estimation model for second-hand cars
taking advantage of a convolutional neural network (CNN), to extract features from car images,
combined with an artificial neural network (ANN), dealing with the categorical-numerical features, and
assess whether this method improves accuracy in price estimation over more traditional single-input
methods.
To train and evaluate the model, a dataset of second-hand car images and textual features is scraped
from a marketplace and curated such that more than 700 images can be used for the training
Scene Understanding For Real Time Processing Of Queries Over Big Data Streaming Video
With heightened security concerns across the globe and the increasing need to monitor, preserve and protect infrastructure and public spaces to ensure proper operation, quality assurance and safety, numerous video cameras have been deployed. Accordingly, they also need to be monitored effectively and efficiently. However, relying on human operators to constantly monitor all the video streams is not scalable or cost-effective. Humans can become subjective or fatigued, and can even exhibit bias, and it is difficult to maintain high levels of vigilance when capturing, searching and recognizing events that occur infrequently or in isolation. These limitations are addressed in the Live Video Database Management System (LVDBMS), a framework for managing and processing live motion imagery data. It enables rapid development of video surveillance software, much as traditional database applications are developed today. Video stream processing applications and ad hoc queries developed in this way can reuse existing advanced image processing techniques. This results in lower software development and maintenance costs. Furthermore, the LVDBMS can be intensively tested to ensure consistent quality across all associated video database applications. Its intrinsic privacy framework facilitates a formalized approach to the specification and enforcement of verifiable privacy policies. This is an important step towards enabling a general privacy certification for video surveillance systems by leveraging a standardized privacy specification language. With the potential to impact many important fields ranging from security and assembly line monitoring to wildlife studies and the environment, the broader impact of this work is clear. The privacy framework protects the general public from abusive use of surveillance technology; success in addressing the trust issue will enable many new surveillance-related applications.
Although this research focuses on video surveillance, the proposed framework has the potential to support many video-based analytical applications
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM) consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM? and
Is SLAM solved