134 research outputs found

    Visual Similarity Using Limited Supervision

    The visual world is a conglomeration of objects, scenes, motion, and much more. As humans, we look at the world through our eyes, but we understand it by using our brains. From a young age, humans learn to recognize objects by association, meaning that we link an object or action to the most similar one in our memory to make sense of it. Within the field of Artificial Intelligence, Computer Vision gives machines the ability to see. While digital cameras provide eyes to the machine, Computer Vision develops its brain. To that end, Deep Learning has emerged as a very successful tool. This method allows machines to learn solutions to problems directly from the data. On the basis of Deep Learning, computers nowadays can also learn to interpret the visual world. However, the process of learning in machines is very different from ours. In Deep Learning, images and videos are grouped into predefined, artificial categories. However, describing a group of objects, or actions, with a single integer (category) disregards most of its characteristics and pair-wise relationships. To circumvent this, we propose to expand the categorical model by using visual similarity, which better mirrors the human approach. Deep Learning requires a large set of manually annotated samples that form the training set. Retrieving training samples is easy given the endless amount of images and videos available on the internet. However, this also requires manual annotations, which are very costly and laborious to obtain and thus a major bottleneck in modern computer vision. In this thesis, we investigate visual similarity methods to solve image and video classification. In particular, we search for a solution where human supervision is marginal. We focus on Zero-Shot Learning (ZSL), where only a subset of categories is manually annotated. After studying existing methods in the field, we identify common limitations and propose methods to tackle them.
In particular, ZSL image classification is trained using only discriminative supervision, i.e. predefined categories, while ignoring other descriptive characteristics. To tackle this, we propose a new approach to learn shared features, i.e. non-discriminative, thus descriptive characteristics, which improves existing methods by a large margin. However, while ZSL has shown great potential for the task of image classification, for example in the case of face recognition, it has performed poorly for video classification. We identify the reasons for the lack of growth in the field and provide a new, powerful baseline. Unfortunately, even if ZSL requires only partially labeled data, it still needs supervision during training. For that reason, we also investigate purely unsupervised methods. A successful paradigm is self-supervision: the model is trained using a surrogate task where supervision is automatically provided. The key to self-supervision is the ability of deep learning to transfer the knowledge learned from one task to a new task. The more similar the two tasks are, the more effective the transfer is. Similar to our work on ZSL, we also studied the common limitations of existing self-supervision approaches and proposed a method to overcome them. To improve self-supervised learning, we propose a policy network which controls the parameters of the surrogate task and is trained through reinforcement learning. Finally, we present a real-life application where utilizing visual similarity with limited supervision provides a better solution compared to existing parametric approaches. We analyze the behavior of motor-impaired rodents during a single repeating action for which our method provides an objective similarity of behavior, facilitating comparisons across animal subjects and time during recovery.

    Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications

    Trained on large datasets, deep learning (DL) can accurately classify videos into hundreds of diverse classes. However, video data is expensive to annotate. Zero-shot learning (ZSL) proposes one solution to this problem. ZSL trains a model once, and generalizes to new tasks whose classes are not present in the training dataset. We propose the first end-to-end algorithm for ZSL in video classification. Our training procedure builds on insights from recent video classification literature and uses a trainable 3D CNN to learn the visual features. This is in contrast to previous video ZSL methods, which use pretrained feature extractors. We also extend the current benchmarking paradigm: Previous techniques aim to make the test task unknown at training time but fall short of this goal. We encourage domain shift across training and test data and disallow tailoring a ZSL model to a specific test dataset. We outperform the state-of-the-art by a wide margin. Our code, evaluation procedure and model weights are available at this http URL
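As a rough illustration of how a zero-shot classifier can label classes it never saw during training, the sketch below assigns a video to the unseen class whose semantic embedding is closest to the video's embedding. This is one common ZSL formulation and is not taken from the abstract: the function names, the cosine-similarity rule, and the three-dimensional toy vectors are all assumptions made for illustration, and a real system would obtain the video embedding from a trained network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_predict(video_embedding, class_embeddings):
    """Return the class name whose embedding is most similar to the video's.

    `class_embeddings` maps class names to semantic vectors; the classes
    need not have appeared in the training data.
    """
    return max(
        class_embeddings,
        key=lambda name: cosine_similarity(video_embedding, class_embeddings[name]),
    )


# Hypothetical embeddings for two classes unseen at training time.
unseen_classes = {
    "juggling": [0.9, 0.1, 0.0],
    "surfing": [0.0, 0.2, 0.9],
}
# Hypothetical embedding a trained visual model might produce for a test video.
predicted = zero_shot_predict([0.1, 0.3, 0.8], unseen_classes)
```

Because new classes enter only through the `class_embeddings` dictionary, the same trained model can in principle be reused on a different label set without retraining.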

    Senior Recital: Sophia Brattoli, Trombone; Lu Witzig, Piano; October 28, 2023

    Kemp Recital Hall, October 28, 2023, Saturday Evening, 6:00 p.m.

    Analisi e gestione della comunicazione: le nuove frontiere (Analysis and management of communication: the new frontiers)

    The difficulties of communicating for a large company. Crisis communication, the importance of sources, and the need to be perceived by journalists as a credible, reliable, and fair interlocutor. A dialogue that is often not easy, conducted with respect for each party's role. A rewarding profession, but at the same time an all-consuming and often tiring one. In recent years, on the one hand, a clearer definition and development of the regulatory framework have strengthened and outlined ever more clearly the obligations of listed companies regarding communications to the market and to stakeholders; on the other hand, awareness has grown that good communication is a key company activity that can be a tool for creating value for the company itself. Current legislation has appropriately distinguished between “communication” activities, carried out broadly by public relations offices (URP), and “information” activities, assigning the latter to press offices headed by journalists. On this latter front, the new legislation has spurred the creation of offices expressly dedicated to relations with the mass media, particularly in local administrations and in the main Italian research centres, science parks, and universities. This holds, for example, for science parks such as AREA Science Park in Trieste, which, in their role as a bridge between scientific research and the business world, find in communication a key factor of success

    Sophomore Recital: Sophia Brattoli, Trombone; Lu Witzig, Piano; April 10, 2022

    Kemp Recital Hall, April 10, 2022, Sunday Afternoon, 4:30 p.m.

    GIS-based modelling of odour emitted from the waste processing plant: case study

    The emission of odours into the atmosphere from the municipal economy and industrial plants, especially in urbanized areas, is a serious problem that mankind has been struggling with for years. Excessive exposure to odours may have many negative health effects, including, for example, headaches and vomiting. Many different methods are used to evaluate odour nuisance. The results obtained with those methods can then be used to visualize and analyse the distribution of odour concentrations in a given area using a GIS (Geographic Information System). Applying them to the spatial analysis of odour impact enables assessment of the magnitude and likelihood of odour nuisance. Modelling with GIS tools and spatial interpolation methods such as IDW and kriging can provide an alternative to the standard modelling tools, which generally use emission values from sources identified as major emitters of odours. Based on odour measurement data from a waste processing plant, this work presents the results of an attempt to combine two different tools, the reference model OPERAT FB and GIS-based dispersion modelling performed with the IDW method and ordinary kriging, in order to analyse their behaviour when observation values are limited.
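For readers unfamiliar with the IDW (inverse distance weighting) method named above, the following is a minimal sketch of the standard technique: the estimate at an unsampled location is a weighted mean of the measured values, with weights proportional to one over distance raised to a power. The function name, the default power of 2, and the sample coordinates and concentrations are illustrative assumptions, not data or code from the study.

```python
import math


def idw_interpolate(samples, target, power=2.0):
    """Estimate the value at `target` (x, y) from (x, y, value) samples.

    Each sample is weighted by 1 / distance**power, so nearby
    measurements influence the estimate more than distant ones.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            # The target coincides with a measurement point.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical odour concentrations (arbitrary units) at four measurement points.
measurements = [
    (0.0, 0.0, 10.0),
    (1.0, 0.0, 20.0),
    (0.0, 1.0, 30.0),
    (1.0, 1.0, 40.0),
]
# The centre is equidistant from all four points, so the estimate is their mean.
centre_estimate = idw_interpolate(measurements, (0.5, 0.5))
```

Unlike kriging, this interpolator uses no model of spatial correlation, so its behaviour with few observation points depends entirely on where the measurements happen to lie.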