MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
The inability to interpret the model prediction in semantically and visually
meaningful ways is a well-known shortcoming of most existing computer-aided
diagnosis methods. In this paper, we propose MDNet to establish a direct
multimodal mapping between medical images and diagnostic reports that can read
images, generate diagnostic reports, retrieve images by symptom descriptions,
and visualize attention, to provide justifications of the network diagnosis
process. MDNet includes an image model and a language model. The image model is
proposed to enhance multi-scale feature ensembles and utilization efficiency.
The language model, integrated with our improved attention mechanism, aims to
read and explore discriminative image feature descriptions from reports to
learn a direct mapping from sentence words to image pixels. The overall network
is trained end-to-end using our developed optimization strategy. On a dataset of
pathology bladder cancer images paired with their diagnostic reports (BCIDR), we
conduct extensive experiments to demonstrate that MDNet outperforms
comparative baselines. The proposed image model obtains state-of-the-art
performance on two CIFAR datasets as well.
Comment: CVPR 2017 Oral
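The mapping from sentence words to image pixels described above can be sketched as soft attention over spatial feature locations. This is a minimal illustration, not MDNet's actual architecture: the function name, shapes (a 7x7 feature grid, 8-dim features), and dot-product scoring are all assumptions for exposition.

```python
import numpy as np

def attend(word_embedding, image_features):
    """Soft attention of one sentence word over spatial image features.

    word_embedding: (d,) query vector for a single word (illustrative shape)
    image_features: (n, d) flattened spatial grid of image feature vectors
    Returns the attended context vector (d,) and the attention map (n,).
    """
    scores = image_features @ word_embedding          # (n,) similarity per location
    scores = scores - scores.max()                    # numerical stability for softmax
    weights = np.exp(scores) / np.exp(scores).sum()   # softmax over locations
    context = weights @ image_features                # attention-weighted feature sum
    return context, weights

rng = np.random.default_rng(0)
feats = rng.normal(size=(49, 8))   # e.g. a 7x7 feature map with 8-dim features
word = rng.normal(size=8)          # one word's embedding
context, attn = attend(word, feats)
```

Reshaping the (49,) attention map back to 7x7 gives the kind of spatial visualization an attention-based diagnosis network can show as justification.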
Privacy-Aware Recommender Systems Challenge on Twitter's Home Timeline
Recommender systems constitute the core engine of most social network
platforms nowadays, aiming to maximize user satisfaction along with other key
business objectives. Twitter is no exception. Although Twitter data has been
extensively used to study socioeconomic and political phenomena and user
behaviour, the implicit feedback provided by users on Tweets
through their engagements on the Home Timeline has only been explored to a
limited extent. At the same time, there is a lack of large-scale public social
network datasets that would enable the scientific community to both benchmark
and build more powerful and comprehensive models that tailor content to user
interests. By releasing an original dataset of 160 million Tweets along with
engagement information, Twitter aims to address exactly that gap. In preparing
this release, special attention was paid to maintaining compliance with existing
privacy laws. Apart from user privacy, this paper touches on the key challenges
faced by researchers and professionals striving to predict user engagements. It
further describes the key aspects of the RecSys 2020 Challenge that was
organized by ACM RecSys in partnership with Twitter using this dataset.
Comment: 16 pages, 2 tables
Loss-Based Attention for Deep Multiple Instance Learning
Although attention mechanisms have been widely used in deep learning for many tasks, they are rarely applied to multiple instance learning (MIL) problems, where only a general category label is given for the multiple instances contained in one bag. Moreover, previous deep MIL methods first use the attention mechanism to learn instance weights and then employ a fully connected layer to predict the bag label, so the bag prediction is largely determined by the quality of the learned instance weights. To alleviate this issue, in this paper we propose a novel loss-based attention mechanism, which simultaneously learns instance weights, instance predictions, and bag predictions for deep multiple instance learning. Specifically, it calculates instance weights based on the loss function, e.g. softmax+cross-entropy, and shares parameters with the fully connected layer that produces both instance and bag predictions. Additionally, a regularization term combining the learned weights with cross-entropy functions is used to boost instance recall, and a consistency cost smooths the training of the neural network to improve generalization. Extensive experiments on multiple types of benchmark databases demonstrate that the proposed attention mechanism is a general, effective, and efficient framework that achieves superior bag and image classification performance over other state-of-the-art MIL methods, while obtaining higher instance precision and recall than previous attention mechanisms. Source code is available at https://github.com/xsshi2015/Loss-Attention
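One way to read the loss-based weighting idea is the following minimal numpy sketch. The function name, shapes, and the exact weighting rule (softmax over negated per-instance losses) are illustrative assumptions, not the paper's implementation: each instance is scored by a shared linear classifier, its cross-entropy loss against the bag label determines its weight (lower loss means higher weight), and the bag prediction is the weighted average of the instance predictions, so weights and predictions come from the same shared parameters.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def loss_based_mil(instances, W, b, bag_label):
    """Illustrative loss-based attention for one bag (hypothetical names/shapes).

    instances: (n, d) feature vectors of the instances in the bag
    W: (d, c), b: (c,) shared classifier parameters, reused for attention
    bag_label: int class index assigned to the whole bag
    """
    logits = instances @ W + b                 # (n, c) instance-level predictions
    probs = softmax(logits, axis=-1)
    # per-instance cross-entropy loss against the bag label
    losses = -np.log(probs[:, bag_label] + 1e-12)
    weights = softmax(-losses)                 # low-loss instances get high weight
    bag_probs = weights @ probs                # (c,) weighted bag prediction
    return bag_probs, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))                    # a bag of 5 instances, 4-dim features
W = rng.normal(size=(4, 3))                    # 3 classes
b = np.zeros(3)
bag_probs, weights = loss_based_mil(X, W, b, bag_label=1)
```

Because the same `W` and `b` produce both the instance predictions and (via the losses) the attention weights, the bag prediction does not hinge on a separately learned attention module, which is the coupling the abstract argues for.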