226 research outputs found
A Comprehensive Survey on Deep-Learning-based Vehicle Re-Identification: Models, Data Sets and Challenges
Vehicle re-identification (ReID) aims to associate vehicle images
collected from a distributed network of cameras spanning diverse traffic
environments. The task is central to vehicle-centric technologies, playing a
pivotal role in deploying Intelligent Transportation Systems (ITS) and
advancing smart city initiatives. Rapid advances in deep learning have
significantly propelled vehicle ReID in recent years, making a comprehensive
survey of deep-learning-based methods for vehicle re-identification both
timely and necessary. This paper extensively reviews deep learning techniques
applied to vehicle ReID. It categorizes these methods into supervised and
unsupervised approaches, examines existing research within each category,
introduces datasets and evaluation criteria, and outlines open challenges and
potential research directions. This comprehensive assessment maps the
landscape of deep learning in vehicle ReID and establishes a foundation and
starting point for future work. It aims to serve as a complete reference by
highlighting challenges and emerging trends, fostering advancements and
applications of deep learning models in vehicle ReID.
Automatically Discovering and Learning New Visual Categories with Ranking Statistics
We tackle the problem of discovering novel classes in an image collection
given labelled examples of other classes. This setting is similar to
semi-supervised learning, but significantly harder because there are no
labelled examples for the new classes. The challenge, then, is to leverage the
information contained in the labelled images in order to learn a
general-purpose clustering model and use the latter to identify the new classes
in the unlabelled data. In this work we address this problem by combining three
ideas: (1) we suggest that the common approach of bootstrapping an image
representation using the labelled data only introduces an unwanted bias, and
that this can be avoided by using self-supervised learning to train the
representation from scratch on the union of labelled and unlabelled data; (2)
we use rank statistics to transfer the model's knowledge of the labelled
classes to the problem of clustering the unlabelled images; and, (3) we train
the data representation by optimizing a joint objective function on the
labelled and unlabelled subsets of the data, improving both the supervised
classification of the labelled data, and the clustering of the unlabelled data.
We evaluate our approach on standard classification benchmarks and outperform
current methods for novel category discovery by a significant margin.
Comment: ICLR 2020, code: http://www.robots.ox.ac.uk/~vgg/research/auto_nove
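A common instantiation of idea (2), sketched below for illustration only (the function name and the choice k=5 are placeholders, not the authors' implementation), deems two unlabelled images to belong to the same class when the sets of their top-k most activated embedding dimensions coincide:

```python
def topk_rank_match(f_i, f_j, k=5):
    """Pairwise pseudo-label from rank statistics: two feature vectors
    are treated as a 'same class' pair if the sets of their k most
    activated dimensions (by magnitude) coincide."""
    top_i = set(sorted(range(len(f_i)), key=lambda d: -abs(f_i[d]))[:k])
    top_j = set(sorted(range(len(f_j)), key=lambda d: -abs(f_j[d]))[:k])
    return int(top_i == top_j)

# Toy usage: a and b share the same dominant dimensions, c does not.
a = [9.0, 8.0, 7.0, 6.0, 5.0, 0.1, 0.2]
b = [8.5, 9.1, 6.8, 5.9, 5.2, 0.3, 0.1]
c = [0.1, 0.2, 0.3, 9.0, 8.0, 7.0, 6.0]
print(topk_rank_match(a, b))  # 1: positive pair
print(topk_rank_match(a, c))  # 0: negative pair
```

The resulting binary pairwise labels can then supervise a clustering head on the unlabelled data.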
Promotional Campaigns in the Era of Social Platforms
The rise of social media has made it far easier for information to reach millions of users. While some users connect with friends and organically share information and opinions on social media, others have exploited these platforms to gain influence and profit through promotional campaigns and advertising. The existence of promotional campaigns contributes to the spread of misleading information, spam, and fake news. These campaigns thus undermine the trustworthiness and reliability of social media and render it a crowd advertising platform.
This dissertation studies the existence of promotional campaigns in social media and explores the different ways users and bots (i.e., automated accounts) engage in such campaigns. We design a suite of detection, ranking, and mining techniques. We study user-generated reviews on online e-commerce sites, such as Google Play, to extract campaigns. We identify cooperating sets of bots in social networks such as Twitter, classify their interactions, and rank the bots by the degree of their malevolence.
Our study shows that modern online social interactions are largely modulated by promotional campaigns, including political campaigns, advertisement campaigns, and incentive-driven campaigns. We measure how these campaigns can potentially impact the information consumption of millions of social media users.
Towards generalizable machine learning models for computer-aided diagnosis in medicine
Hidden stratification is a phenomenon in which a training dataset contains unlabeled (hidden) subsets of cases that may affect machine learning model performance. Machine learning models that ignore hidden stratification, despite promising overall performance measured as accuracy and sensitivity, often fail at predicting low-prevalence cases, yet those cases remain important. In the medical domain, patients with diseases are often less common than healthy patients, and a misdiagnosis of a patient with a disease can have significant clinical impacts. Therefore, to build a robust and trustworthy computer-aided diagnosis (CAD) system and a reliable treatment effect prediction model, we cannot only pursue machine learning models with high overall accuracy; we also need to discover any hidden stratification in the data and evaluate the proposed machine learning models with respect to both overall performance and the performance on certain subsets (groups) of the data, such as the ‘worst group’.
In this study, I investigated three approaches to data stratification: a novel algorithmic deep learning (DL) approach that learns similarities among cases, and two schema completion approaches that utilize domain expert knowledge. I further proposed an innovative way to integrate the discovered latent groups into the loss functions of DL models, allowing for better model generalizability under domain shift caused by data heterogeneity.
My results on lung nodule Computed Tomography (CT) images and breast cancer histopathology images demonstrate that learning homogeneous groups within heterogeneous data significantly improves the performance of the computer-aided diagnosis (CAD) system, particularly for low-prevalence or worst-performing cases. This study emphasizes the importance of discovering and learning the latent stratification within the data, as it is a critical step towards building ML models that are generalizable and reliable. Ultimately, this discovery can have a profound impact on clinical decision-making, particularly for low-prevalence cases.
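One common way to integrate discovered latent groups into a training loss, in the spirit described above, is to optimize the worst group-average loss (as in group distributionally robust optimization). The sketch below is a minimal illustration under that assumption, not the dissertation's actual formulation; the group assignments and loss values are toy placeholders:

```python
def worst_group_loss(losses, groups):
    """Aggregate per-example losses by latent group and return the
    worst (highest) group-average loss. Minimizing this objective
    pushes the model to improve its weakest subgroup rather than
    only its overall average."""
    sums, counts = {}, {}
    for loss, g in zip(losses, groups):
        sums[g] = sums.get(g, 0.0) + loss
        counts[g] = counts.get(g, 0) + 1
    group_means = {g: sums[g] / counts[g] for g in sums}
    return max(group_means.values())

# Toy usage: group 1 (e.g. a low-prevalence nodule subtype) has the
# higher average loss, so it dominates the objective.
print(worst_group_loss([0.2, 0.4, 1.0, 2.0], [0, 0, 1, 1]))  # 1.5
```

In practice this scalar would replace (or be interpolated with) the average loss inside the DL training loop.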
An application of user segmentation and predictive modelling at a telecom company
Internship report presented as partial requirement for obtaining the Master’s degree in Advanced Analytics
“The squeaky wheel gets the grease” is an American proverb used to convey the notion that only those
who speak up tend to be heard. This was believed to be the case at the telecom company I interned at
– they believed that while those who complain about an issue (in particular, an issue of no access to
the service) get their problem resolved, there are others who have an issue but do not complain about
it. The latter are likely to be dissatisfied customers, and must be identified. This report describes the
approach taken to address this problem using machine learning. Unsupervised learning was used to
segment the customer base into user profiles based on their viewing behaviour, to better understand
their needs; and supervised learning was used to develop a predictive model to identify customers
who have no access to the TV service, and to explore what factors (or combination of factors) are
indicative of this issue.
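The two-stage approach described above can be sketched as follows. This is an illustrative pipeline only, assuming scikit-learn; the behavioural features, cluster count, and model choices are placeholders, not the report's actual setup:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic viewing-behaviour features: [daily_hours_watched, distinct_channels]
X_behaviour = np.vstack([
    rng.normal([1.0, 2.0], 0.3, (50, 2)),    # light viewers
    rng.normal([4.0, 10.0], 0.5, (50, 2)),   # heavy viewers
    rng.normal([0.1, 0.5], 0.1, (50, 2)),    # near-zero usage (possible no-access)
])

# Stage 1 (unsupervised): segment the customer base into user profiles.
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_behaviour)

# Stage 2 (supervised): predict the no-access issue from behaviour
# plus segment membership. Labels here are toy stand-ins.
y_no_access = np.array([0] * 100 + [1] * 50)
X_full = np.column_stack([X_behaviour, segments])
clf = LogisticRegression(max_iter=1000).fit(X_full, y_no_access)
print(clf.score(X_full, y_no_access))
```

Feeding the segment id into the classifier is one simple way to let the unsupervised profiles inform the supervised prediction.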
Advancing Aircraft Operations in a Net-Centric Environment with the Incorporation of Increasingly Autonomous Systems and Human Teaming
NextGen has begun the modernization of the nation's air transportation system, with goals to improve system safety, increase operational efficiency and capacity, and provide enhanced predictability, resilience, and robustness. With these improvements, NextGen is poised to handle significant increases in air traffic operations, more than twice the number recorded in 2016, by 2025. NextGen is evolving toward collaborative decision-making across many agents, including automation, through a Net-Centric architecture, which in itself creates a very complex environment in which aircraft must navigate and operate. Such an intricate environment, coupled with the expected upsurge in air traffic operations, raises concern about the ability of the human-agent to both fly and manage aircraft within it. Therefore, it is both necessary and practical to begin introducing increasingly autonomous systems into the cockpit that will act independently to assist the human-agent in achieving the overall goal of NextGen. However, the straightforward technological development and implementation of intelligent machines in the cockpit is only part of what is necessary to maintain, at minimum, or improve, as desired, human-agent functionality while operating in NextGen. The full integration of Increasingly Autonomous Systems (IAS) within the cockpit can only be accomplished when the IAS works in concert with the human, building trust between the two and thereby establishing a team atmosphere. Imperative to cockpit implementation is ensuring the proper performance of the IAS, by the development team and the human-agent with which it will be paired, when given a specific piloting, navigation, or observational task.
Described in this paper are the steps taken at NASA Langley Research Center during the second and third phases of the development of an IAS, the Traffic Data Manager (TDM); its verification and validation by human-agents; and the foundational development of Human Autonomy Teaming (HAT) between the two.