A deep representation for depth images from synthetic data
Convolutional Neural Networks (CNNs) trained on large scale RGB databases
have become the secret sauce in the majority of recent approaches for object
categorization from RGB-D data. Thanks to colorization techniques, these
methods exploit the filters learned from 2D images to extract meaningful
representations in 2.5D. Still, the perceptual signature of these two kinds of
images is very different, with the first usually strongly characterized by
textures, and the second mostly by silhouettes of objects. Ideally, one would
like to have two CNNs, one for RGB and one for depth, each trained on a
suitable data collection, able to capture the perceptual properties of each
channel for the task at hand. This has not been possible so far, due to the
lack of a suitable depth database. This paper addresses this issue, proposing
to opt for synthetically generated images rather than collecting by hand a 2.5D
large scale database. While being clearly a proxy for real data, synthetic
images allow one to trade quality for quantity, making it possible to generate a
virtually infinite amount of data. We show that training on such a
data collection, using the very same architecture typically used on visual
data, yields very different filters, resulting in depth features (a) able to
better characterize the different facets of depth images, and (b) complementary
with respect to those derived from CNNs pre-trained on 2D datasets. Experiments
on two publicly available databases show the power of our approach.
Learning to see across domains and modalities
Deep learning has recently raised hopes and expectations as a general solution for many applications (computer vision, natural language processing, speech recognition, etc.); indeed it has proven effective, but it also showed a strong dependence on large quantities of data. Generally speaking, deep learning models are especially susceptible to overfitting, due to their large number of internal parameters.
Luckily, it has also been shown that, even when data is scarce, a successful model can be trained by reusing prior knowledge. Thus, developing techniques for transfer learning (as this process is known), in its broadest definition, is a crucial element towards the deployment of effective and accurate intelligent systems into the real world.
This thesis will focus on a family of transfer learning methods applied to the task of visual object recognition, specifically image classification. The visual recognition problem is central to computer vision research: many desired applications, from robotics to information retrieval, demand the ability to correctly identify categories, places, and objects.
Transfer learning is a general term, and specific settings have been given specific names: when the learner has access to only unlabeled data from the target domain (where the model should perform) and labeled data from a different domain (the source), the problem is called unsupervised domain adaptation (DA). The first part of this thesis will focus on three methods for this setting.
The three presented techniques for domain adaptation are fully distinct. The first one proposes the use of Domain Alignment layers to structurally align the distributions of the source and target domains in feature space. While the general idea of aligning feature distributions is not novel, our method stands out as one of the very few that do so without adding losses. The second method is based on GANs: we propose a bidirectional architecture that jointly learns how to map the source images into the target visual style and vice versa, thus alleviating the domain shift at the pixel level. The third method features an adversarial learning process that transforms both the images and the features of both domains in order to map them to a common, domain-agnostic space.
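As a rough illustration of what aligning feature distributions without an extra loss can look like, the sketch below standardises each feature dimension with the statistics of the sample's own domain. This is an assumption made for illustration (a per-domain normalisation in the spirit of batch normalisation), not the thesis's actual layer; the function name `domain_align` and the `eps` parameter are hypothetical.

```python
import numpy as np

def domain_align(features, domain_ids, eps=1e-5):
    """Standardise every feature dimension using the mean and variance
    computed over samples of the same domain (hypothetical helper)."""
    features = np.asarray(features, dtype=float)
    domain_ids = np.asarray(domain_ids)
    aligned = np.empty_like(features)
    for d in np.unique(domain_ids):
        mask = domain_ids == d
        mean = features[mask].mean(axis=0)
        var = features[mask].var(axis=0)
        # After this step each domain has roughly zero mean and unit
        # variance per dimension, so the two distributions overlap.
        aligned[mask] = (features[mask] - mean) / np.sqrt(var + eps)
    return aligned

# Toy usage: source and target features with different location and scale.
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(200, 4))
target = rng.normal(5.0, 3.0, size=(200, 4))
feats = np.vstack([source, target])
ids = np.array([0] * 200 + [1] * 200)
aligned = domain_align(feats, ids)
```

Because the alignment happens inside the forward pass, no additional term is needed in the training objective, which is one way to read the "without adding losses" remark above.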
While the first part of the thesis presents general-purpose DA methods, the second part will focus on the real-life issues of robotic perception, specifically RGB-D recognition.
Robotic platforms are usually not limited to color perception; very often they also carry a depth camera.
Unfortunately, the depth modality is rarely used for visual recognition due to the lack of pretrained models from which to transfer and little data to train one on from scratch.
We will first explore the use of synthetic data as a proxy for real images by training a Convolutional Neural Network (CNN) on virtual depth maps, rendered from 3D CAD models, and then testing it on real robotic datasets. The second approach leverages the existence of RGB pretrained models, by learning how to map the depth data into the most discriminative RGB representation and then using existing models for recognition. This second technique is in fact a fairly generic transfer learning method that can be applied to share knowledge across modalities.
From source to target and back: symmetric bi-directional adaptive GAN
The effectiveness of generative adversarial approaches in producing images
according to a specific style or visual domain has recently opened new
directions to solve the unsupervised domain adaptation problem. It has been
shown that labeled source images can be modified to mimic target samples, making
it possible to train a classifier directly in the target domain, despite the
original lack of annotated data. Inverse mappings from the target to the source
domain have also been evaluated but only passing through adapted feature
spaces, thus without new image generation. In this paper we propose to better
exploit the potential of generative adversarial networks for adaptation by
introducing a novel symmetric mapping among domains. We jointly optimize
bi-directional image transformations, combining them with target self-labeling.
Moreover, we define a new class consistency loss that aligns the generators in
the two directions by requiring that the class identity of an image be preserved
when it passes through both domain mappings. A detailed qualitative and quantitative
analysis of the reconstructed images confirms the power of our approach. By integrating
the two domain specific classifiers obtained with our bi-directional network we
exceed previous state-of-the-art unsupervised adaptation results on four
different benchmark datasets.
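One possible reading of the class consistency idea is sketched below: a labeled image is mapped to the other domain and back, and a classifier is asked to recover the original label from the result. The abstract does not give the exact formulation, so the cross-entropy form, the function name `class_consistency_loss`, and the toy generators and classifier are all assumptions for illustration only.

```python
import numpy as np

def class_consistency_loss(images, labels, g_st, g_ts, classifier):
    """Cross-entropy between the original labels and the predictions on
    images mapped source -> target -> source (hypothetical formulation)."""
    cycled = g_ts(g_st(images))
    probs = classifier(cycled)  # shape (N, num_classes), rows sum to 1
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())

# Toy stand-ins, not real networks: the two "generators" shift pixel
# values in opposite directions, the "classifier" is a softmax over the
# first three values of each flattened image.
def g_st(x):
    return x + 1.0

def g_ts(x):
    return x - 1.0

def classifier(x):
    logits = x.reshape(len(x), -1)[:, :3]
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

images = np.eye(3) * 10.0
good = class_consistency_loss(images, np.array([0, 1, 2]), g_st, g_ts, classifier)
bad = class_consistency_loss(images, np.array([1, 2, 0]), g_st, g_ts, classifier)
```

The loss is small when the round trip keeps the class recognisable (`good`) and large when it does not (`bad`), which is the behaviour a consistency term of this kind is meant to reward.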
Bridging Between Computer and Robot Vision Through Data Augmentation: A Case Study on Object Recognition
Despite the impressive progress brought by deep networks in visual object recognition, robot vision is still far from being a solved problem. The most successful convolutional architectures are developed starting from ImageNet, a large-scale collection of images of object categories downloaded from the Web. These images are very different from the situated and embodied visual experience of robots deployed in unconstrained settings. To reduce the gap between these two visual experiences, this paper proposes a simple yet effective data augmentation layer that zooms in on the object of interest and simulates the object detection outcome of a robot vision system. The layer, which can be used with any convolutional deep architecture, increases object recognition performance by up to 7% in experiments performed on three different benchmark databases. An implementation of our robot data augmentation layer has been made publicly available.
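The zooming step described above can be sketched as a crop around an object bounding box followed by a resize. This is not the authors' released layer: the function names, the `(x0, y0, x1, y1)` box convention, the nearest-neighbour resize, and the 0.8 to 1.2 scale range are assumptions chosen to keep the example small.

```python
import numpy as np

def zoom_on_object(image, box, scale=1.0, out_size=224):
    """Crop a window `scale` times the size of `box` around its centre,
    clip it to the image, and resize it with nearest-neighbour sampling."""
    h, w = image.shape[:2]
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    half_w = max(1.0, (x1 - x0) * scale / 2.0)
    half_h = max(1.0, (y1 - y0) * scale / 2.0)
    left, right = int(max(0, round(cx - half_w))), int(min(w, round(cx + half_w)))
    top, bottom = int(max(0, round(cy - half_h))), int(min(h, round(cy + half_h)))
    crop = image[top:bottom, left:right]
    # Nearest-neighbour resize: pick evenly spaced source rows and columns.
    rows = np.linspace(0, crop.shape[0] - 1, out_size).round().astype(int)
    cols = np.linspace(0, crop.shape[1] - 1, out_size).round().astype(int)
    return crop[rows][:, cols]

def random_zoom(image, box, rng, low=0.8, high=1.2, out_size=224):
    """Sample a zoom factor to imitate a loose or tight detection box."""
    return zoom_on_object(image, box, rng.uniform(low, high), out_size)

image = np.arange(100).reshape(10, 10)
exact = zoom_on_object(image, (2, 2, 8, 8), scale=1.0, out_size=6)
jittered = random_zoom(image, (2, 2, 8, 8), np.random.default_rng(0), out_size=4)
```

Sampling the scale at training time exposes the network to both tight and loose framings of the object, which is one simple way to imitate imperfect detections.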
Compagnie aeree low cost, competitività dei sistemi aeroportuali e ricadute sui territori locali [Low-cost airlines, competitiveness of airport systems, and effects on local territories]
The aim of this work is to analyze the role of air transport in the economic development of peripheral areas. Liberalization and deregulation processes have involved all aspects of air transport activity, producing significant changes on the supply side of the air transport market. Airlines have tried to expand their market share and efficiency and, above all, to specialise their activity by focusing on specific demand targets. As a consequence, airports too have had to change their management activity, increasingly modifying their strategy to attract new carriers and to plan new goals, such as increasing accessibility. This paper focuses on a dynamic approach for analyzing airport choice factors for airlines and for airport management. Its main advantage is the ability to linearly depict the several relationships occurring among the different subjects involved, offering advantages over more traditional approaches such as the “Costs-Benefits” model or “Multi-criteria” techniques. Keywords: air transport, territory, system dynamics
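Competitività ed efficienza delle infrastrutture terminali del trasporto marittimo: analisi del sistema dei porti nel Mediterraneo e livello di integrazione logistica [Competitiveness and efficiency of maritime transport terminal infrastructure: analysis of the Mediterranean port system and level of logistics integration]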
The processes by which maritime terminals are chosen for inclusion in networks have recently gained significant importance, owing to the general improvement in accessibility that has characterized most infrastructure. The high degree of interconnection among terminals, achieved thanks to logistics innovation processes, has revealed attributes of terminal activity, other than the location of facilities, that may be decisive for users. In this context, the strategies adopted by operators of shipping terminals take on considerable importance, since their level of competitiveness and efficiency is the key element in the port selection process. Therefore, the organizational and functional complexity of most ports may prove extremely useful for analyzing the characteristics of a terminal and positioning it correctly within the network. Keywords: port, Mediterranean, logistics, value-chain
Domain Generalization by Solving Jigsaw Puzzles
Human adaptability relies crucially on the ability to learn and merge
knowledge both from supervised and unsupervised learning: the parents point out
a few important concepts, but then the children fill in the gaps on their own.
This is particularly effective, because supervised learning can never be
exhaustive and thus learning autonomously allows one to discover invariances and
regularities that help to generalize. In this paper we propose to apply a
similar approach to the task of object recognition across domains: our model
learns the semantic labels in a supervised fashion, and broadens its
understanding of the data by learning from self-supervised signals how to solve
a jigsaw puzzle on the same images. This secondary task helps the network to
learn the concepts of spatial correlation while acting as a regularizer for the
classification task. Multiple experiments on the PACS, VLCS, Office-Home and
digits datasets confirm our intuition and show that this simple method
outperforms previous domain generalization and adaptation solutions. An
ablation study further illustrates the inner workings of our approach. Comment: Accepted at CVPR 2019 (oral)
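The self-supervised jigsaw task can be pictured as follows: an image is cut into a grid of tiles, the tiles are reordered by one permutation from a fixed set, and the index of that permutation becomes the label to predict. The sketch below is a minimal illustration under assumptions (a 3x3 grid, a tiny hypothetical permutation set `PERMS`, hypothetical function names), not the paper's implementation.

```python
import numpy as np

def make_jigsaw(image, perm, grid=3):
    """Split `image` (H, W, C) into grid x grid tiles and reorder them by
    `perm`. H and W are assumed to be divisible by `grid`; any remainder
    pixels are dropped."""
    h, w = image.shape[0] // grid, image.shape[1] // grid
    tiles = [image[r * h:(r + 1) * h, c * w:(c + 1) * w]
             for r in range(grid) for c in range(grid)]
    shuffled = [tiles[i] for i in perm]
    rows = [np.concatenate(shuffled[r * grid:(r + 1) * grid], axis=1)
            for r in range(grid)]
    return np.concatenate(rows, axis=0)

# Hypothetical permutation set: the identity plus a few random orders.
_rng = np.random.default_rng(0)
PERMS = [tuple(range(9))] + [tuple(_rng.permutation(9)) for _ in range(4)]

def sample_jigsaw(image, rng):
    """Return a shuffled image and the index of the permutation used,
    which serves as the self-supervised label."""
    label = int(rng.integers(len(PERMS)))
    return make_jigsaw(image, PERMS[label]), label

image = np.arange(6 * 6 * 3).reshape(6, 6, 3)
puzzle, label = sample_jigsaw(image, np.random.default_rng(1))
```

In a training loop, the same network would receive both ordered images with their semantic labels and shuffled images with their permutation index, so that the puzzle objective acts as the secondary task described in the abstract.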
Effects of political-economic integration and trade liberalization on exports of Italian Quality Wines Produced in Determined Regions (QWPDR)
The aim of this work is to explain the magnitude of the trade flows for high quality wine from Italy to its main importing countries. This objective has been reached by establishing an appropriate econometric model derived from an extended form of the “Gravity Model”. This model has been broadly applied to the analysis of international trade because it provides robust estimates. The results obtained and the model itself are useful in forecasting potential trends in the exportation of high quality Italian wines. In particular, these estimates give a quantitative evaluation of the export gains that could result from the enlargement of the EU and from an increasing liberalization in international trade. Moreover, it is possible to identify the growing markets where Italian ventures could exploit certain promotional and communication strategies. Keywords: Italy, Exports, QWPDR, Integration, Gravity Model
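For orientation, a generic log-linear gravity equation of the kind commonly used in trade analysis is shown below. The abstract does not state the exact specification, so the symbols and the choice of additional regressors are assumptions: $X_{ij}$ denotes exports from country $i$ to country $j$, $Y_i$ and $Y_j$ their economic sizes, $D_{ij}$ the distance between them, and $Z_{ij}$ a vector of extra variables such as integration or trade agreement dummies.

```latex
\ln X_{ij} = \beta_0 + \beta_1 \ln Y_i + \beta_2 \ln Y_j
           + \beta_3 \ln D_{ij} + \gamma^{\top} Z_{ij} + \varepsilon_{ij}
```

In this form $\beta_3$ is typically expected to be negative, since trade tends to fall with distance, while the coefficients in $\gamma$ capture effects of the kind the abstract attributes to EU enlargement and trade liberalization.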