Distributionally Robust Deep Learning using Hardness Weighted Sampling
Limiting failures of machine learning systems is vital for safety-critical
applications. In order to improve the robustness of machine learning systems,
Distributionally Robust Optimization (DRO) has been proposed as a
generalization of Empirical Risk Minimization (ERM) aiming to address this
need. However, its use in deep learning has been severely restricted due to the
relative inefficiency of the optimizers available for DRO compared to the
widespread variants of Stochastic Gradient Descent (SGD) optimizers for ERM.
We propose SGD with hardness weighted sampling, a principled and efficient
optimization method for DRO in machine learning that is particularly suited in
the context of deep learning. Similar to a hard example mining strategy in
essence and in practice, the proposed algorithm is straightforward to implement
and computationally as efficient as SGD-based optimizers used for deep
learning, requiring minimal computational overhead. In contrast to typical ad hoc
hard mining approaches, and exploiting recent theoretical results in deep
learning optimization, we prove the convergence of our DRO algorithm for
over-parameterized deep learning networks with ReLU activations and a finite
number of layers and parameters. Our experiments on brain tumor segmentation in
MRI demonstrate the feasibility and the usefulness of our approach. Using our
hardness weighted sampling leads to a 2% decrease in the interquartile range
of the Dice scores for the enhanced tumor and tumor core regions. The code
for the proposed hardness weighted sampler will be made publicly available.
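The sampling rule at the heart of hardness weighted sampling can be sketched as a softmax over running per-example losses, so that harder examples are drawn more often. The snippet below is a minimal numpy illustration of that general idea only; the function names and the `beta` temperature are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def hardness_weighted_probs(losses, beta=1.0):
    """Softmax over per-example losses: harder examples receive higher
    sampling probability; beta controls how strongly hardness is emphasized."""
    z = beta * (losses - losses.max())  # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

# Toy usage: keep a running loss per training example and sample a minibatch.
rng = np.random.default_rng(0)
losses = np.array([0.1, 0.2, 2.0, 0.15])  # example 2 is the "hard" one
p = hardness_weighted_probs(losses, beta=2.0)
batch = rng.choice(len(losses), size=2, replace=False, p=p)
```

In practice the per-example losses would be updated cheaply from the forward passes SGD already performs, which is what keeps the overhead minimal.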
Good Features to Correlate for Visual Tracking
In recent years, correlation filters have shown dominant and
spectacular results for visual object tracking. The types of features
employed in this family of trackers significantly affect the performance
of visual tracking. The ultimate goal is to utilize robust features invariant
to any kind of appearance change of the object, while predicting the object
location as accurately as in the case of no appearance change. As deep
learning based methods have emerged, the study of learning features for
specific tasks has accelerated. For instance, discriminative visual tracking
methods based on deep architectures have been studied with promising
performance. Nevertheless, correlation filter based (CFB) trackers confine
themselves to pre-trained networks trained for the object classification
problem. To address this, in this manuscript the problem of learning deep
fully convolutional features for CFB visual tracking is formulated. In
order to learn the proposed model, a novel and efficient backpropagation
algorithm is presented based on the loss function of the network. The proposed
learning framework enables the network model to be flexible for a custom
design. Moreover, it alleviates the dependency on the network trained for
classification. Extensive performance analysis shows the efficacy of the
proposed custom design in the CFB tracking framework. By fine-tuning the
convolutional parts of a state-of-the-art network and integrating this model
into a CFB tracker (the top-performing tracker of VOT2016), an 18% increase is
achieved in terms of expected average overlap, and tracking failures are
decreased by 25%, while maintaining superiority over the state-of-the-art
methods on the OTB-2013 and OTB-2015 tracking datasets.
Comment: Accepted version, IEEE Transactions on Image Processing
Towards Robust Deep Reinforcement Learning for Traffic Signal Control: Demand Surges, Incidents and Sensor Failures
Reinforcement learning (RL) constitutes a promising solution for alleviating
the problem of traffic congestion. In particular, deep RL algorithms have been
shown to produce adaptive traffic signal controllers that outperform
conventional systems. However, in order to be reliable in highly dynamic urban
areas, such controllers need to be robust with respect to a series of
exogenous sources of uncertainty. In this paper, we develop an open-source
callback-based framework for promoting the flexible evaluation of different
deep RL configurations under a traffic simulation environment. With this
framework, we investigate how deep RL-based adaptive traffic controllers
perform under different scenarios, namely under demand surges caused by special
events, capacity reductions from incidents and sensor failures. We extract
several key insights for the development of robust deep RL algorithms for
traffic control and propose concrete designs to mitigate the impact of the
considered exogenous uncertainties.
Comment: 8 pages
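A callback-based evaluation framework of the kind described can be as simple as a simulation loop that invokes registered perturbation hooks at every step. The sketch below is a hypothetical minimal harness, not the paper's open-source framework; all names (`EvalHarness`, `demand_surge`) are invented for illustration:

```python
class EvalHarness:
    """Tiny callback-based harness: registered callbacks perturb the
    simulation state at each step (demand surge, sensor dropout, etc.)."""
    def __init__(self):
        self.callbacks = []

    def register(self, fn):
        self.callbacks.append(fn)

    def run(self, state, steps):
        trace = []
        for t in range(steps):
            for cb in self.callbacks:
                cb(t, state)
            # A traffic-signal controller would act on `state` here.
            trace.append(state["demand"])
        return trace

def demand_surge(start, end, factor):
    """Callback scaling demand by `factor` during the window [start, end)."""
    def cb(t, state):
        state["demand"] = state["base_demand"] * (factor if start <= t < end else 1.0)
    return cb

harness = EvalHarness()
harness.register(demand_surge(start=3, end=6, factor=2.5))
trace = harness.run({"base_demand": 100.0, "demand": 100.0}, steps=8)
# trace: [100.0, 100.0, 100.0, 250.0, 250.0, 250.0, 100.0, 100.0]
```

Incidents and sensor failures would be further callbacks mutating capacity or masking observations, which keeps the RL configuration itself untouched across scenarios.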
Deep Anomaly Detection for Time-series Data in Industrial IoT: A Communication-Efficient On-device Federated Learning Approach
Since edge device failures (i.e., anomalies) seriously affect the production
of industrial products in Industrial IoT (IIoT), accurately and promptly
detecting anomalies is becoming increasingly important. Furthermore, data
collected by edge devices may contain users' private data, which challenges
current detection approaches, as user privacy has become a growing public
concern in recent years. With this focus, this paper proposes a new
communication-efficient on-device federated learning (FL)-based deep anomaly
detection framework for sensing time-series data in IIoT. Specifically, we
first introduce an FL framework to enable decentralized edge devices to
collaboratively train an anomaly detection model, which can improve its
generalization ability. Second, we propose an Attention Mechanism-based
Convolutional Neural Network-Long Short Term Memory (AMCNN-LSTM) model to
accurately detect anomalies. The AMCNN-LSTM model uses attention
mechanism-based CNN units to capture important fine-grained features, thereby
preventing memory loss and gradient dispersion problems. Furthermore, this
model retains the advantages of the LSTM unit in predicting time-series data.
Third, to adapt the proposed framework to the timeliness of industrial anomaly
detection, we propose a gradient compression mechanism based on Top-k
selection to improve communication efficiency. Extensive experiment studies on
four real-world datasets demonstrate that the proposed framework can accurately
and promptly detect anomalies and also reduce the communication overhead by 50%
compared to a federated learning framework that does not use a gradient
compression scheme.
Comment: IEEE Internet of Things Journal
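Top-k gradient sparsification of the kind named above keeps only the k largest-magnitude gradient entries and transmits just their indices and values. A minimal numpy sketch of that standard technique (illustrative names, not the paper's code):

```python
import numpy as np

def topk_compress(grad, k):
    """Keep only the k largest-magnitude entries; send (indices, values)."""
    flat = grad.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx], grad.shape

def topk_decompress(idx, vals, shape):
    """Rebuild a dense gradient with zeros everywhere except the kept entries."""
    out = np.zeros(np.prod(shape))
    out[idx] = vals
    return out.reshape(shape)

g = np.array([0.05, -1.2, 0.3, 0.01, 0.9, -0.02])
idx, vals, shape = topk_compress(g, k=2)
g_hat = topk_decompress(idx, vals, shape)
# g_hat keeps only -1.2 and 0.9; all other entries are zero
```

Practical FL systems often pair this with local error accumulation of the dropped entries so that small gradients are not lost permanently, a detail omitted here.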