Context-aware Captions from Context-agnostic Supervision
We introduce an inference technique to produce discriminative context-aware
image captions (captions that describe differences between images or visual
concepts) using only generic context-agnostic training data (captions that
describe a concept or an image in isolation). For example, given images and
captions of "siamese cat" and "tiger cat", we generate language that describes
the "siamese cat" in a way that distinguishes it from "tiger cat". Our key
novelty is that we show how to do joint inference over a language model that is
context-agnostic and a listener which distinguishes closely-related concepts.
We first apply our technique to a justification task, namely to describe why an
image contains a particular fine-grained category as opposed to another
closely-related category of the CUB-200-2011 dataset. We then study
discriminative image captioning to generate language that uniquely refers to
one of two semantically-similar images in the COCO dataset. Evaluations with
discriminative ground truth for justification and human studies for
discriminative image captioning reveal that our approach outperforms baseline
generative and speaker-listener approaches for discrimination. Comment: Accepted to CVPR 2017 (Spotlight)
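One way to picture the joint inference described in this abstract is as a rescoring of candidate captions: a candidate earns credit for being likely under the target concept and loses credit for being likely under the distractor. The sketch below is illustrative only. The function names, the trade-off weight `lam`, and the toy log-probabilities are assumptions made for this example, not the paper's model, data, or exact objective.

```python
import math


def discriminative_score(logp_target, logp_distractor, lam=0.5):
    """Blend caption likelihood with a log-likelihood ratio against a distractor.

    lam = 1.0 ignores the distractor (plain context-agnostic scoring);
    smaller lam rewards captions that fit the target but not the distractor.
    """
    return lam * logp_target + (1.0 - lam) * (logp_target - logp_distractor)


# Toy log-probabilities (invented for illustration) of each caption under a
# context-agnostic language model conditioned on each concept.
candidates = {
    "a cat sitting on a couch": {
        "siamese": math.log(0.30),
        "tiger": math.log(0.30),
    },
    "a cat with blue eyes and a dark face": {
        "siamese": math.log(0.20),
        "tiger": math.log(0.02),
    },
}


def best_caption(cands, target, distractor, lam=0.5):
    """Return the candidate caption with the highest discriminative score."""
    return max(
        cands,
        key=lambda c: discriminative_score(
            cands[c][target], cands[c][distractor], lam
        ),
    )
```

With `lam=1.0` the generic caption wins because it is the more likely sentence; with `lam=0.5` the caption that separates the two concepts wins, since it is far less likely under the distractor.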
Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes
PURPOSE: The medical literature relevant to germline genetics is growing
exponentially. Clinicians need tools for monitoring and prioritizing the literature
to understand the clinical implications of the pathogenic genetic variants. We
developed and evaluated two machine learning models to classify abstracts as
relevant to the penetrance (risk of cancer for germline mutation carriers) or
prevalence of germline genetic mutations. METHODS: We conducted literature
searches in PubMed and retrieved paper titles and abstracts to create an
annotated dataset for training and evaluating the two machine learning
classification models. Our first model is a support vector machine (SVM) which
learns a linear decision rule based on the bag-of-ngrams representation of each
title and abstract. Our second model is a convolutional neural network (CNN)
which learns a complex nonlinear decision rule based on the raw title and
abstract. We evaluated the performance of the two models on the classification
of papers as relevant to penetrance or prevalence. RESULTS: For penetrance
classification, we annotated 3740 paper titles and abstracts and used 60% for
training the model, 20% for tuning the model, and 20% for evaluating the model.
The SVM model achieves 89.53% accuracy (percentage of papers that were
correctly classified) while the CNN model achieves 88.95% accuracy. For
prevalence classification, we annotated 3753 paper titles and abstracts. The
SVM model achieves 89.14% accuracy while the CNN model achieves 89.13%
accuracy. CONCLUSION: Our models achieve high accuracy in classifying abstracts
as relevant to penetrance or prevalence. By facilitating literature review,
this tool could help clinicians and researchers keep abreast of the burgeoning
knowledge of gene-cancer associations and keep the knowledge bases for clinical
decision support tools up to date.
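The bag-of-ngrams representation and the 60/20/20 split mentioned in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions: the abstract does not give the maximum n-gram length, the tokenization, or how the split was drawn, so `max_n=2`, whitespace tokenization, the seeded shuffle, and both function names are hypothetical choices.

```python
import random
from collections import Counter


def bag_of_ngrams(text, max_n=2):
    """Count word n-grams (n = 1..max_n) in a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    bag = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            bag[" ".join(tokens[i:i + n])] += 1
    return bag


def split_60_20_20(items, seed=0):
    """Shuffle items and split them into train (60%), tune (20%), and test (rest)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the split sizes.
    n_train = len(shuffled) * 6 // 10
    n_tune = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    tune = shuffled[n_train:n_train + n_tune]
    test = shuffled[n_train + n_tune:]
    return train, tune, test
```

For the 3740 annotated penetrance abstracts, this split yields 2244 training, 748 tuning, and 748 evaluation examples. A linear classifier such as an SVM would then be trained on the n-gram counts.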
Impact of Bt Cotton on the Farmer's Livelihood System in China
In order to analyze the impacts of Bt cotton on the farmers' livelihood system, we interviewed 169 farmers and extension personnel in the main cotton production areas of Hebei province in 2002 and 2003. We used an integrative, multidisciplinary approach combining agronomy, economics and sociology. The results showed that the use of Bt cotton increased both the cotton growing area and farmers' income. For 67% of the farmers interviewed, cotton area has increased continuously since 1997. The cotton net margin in one cropping cycle was higher than the combined net margins of wheat and corn in two cropping cycles. Income from cotton played a significant role in investment in education, leisure and health care. The socio-economic impacts of cotton production are nevertheless not yet optimal because many limiting factors remain. Lack of labor and land were the main limiting factors. Productivity is restrained by the high price of Bt cotton seeds, which pushed farmers to keep seeds from their own cotton production (42% of the farmers in 2002 and 2003). Farmers still lack technical command of Bt cotton: 78% of the farmers admitted this, while more than 94% complained of not getting information from local extension and technical services. Greater success with Bt cotton requires going beyond providing seeds and calls for continuous assistance from research and extension departments, notably to achieve full knowledge of Bt cotton's characteristics so as to integrate it optimally into the farmers' system.
Keywords: China; Bt Cotton; biotechnologies; impact evaluation; Livelihood
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework
Multimodal reasoning is a critical component in the pursuit of artificial
intelligence systems that exhibit human-like intelligence, especially when
tackling complex tasks. While the chain-of-thought (CoT) technique has gained
considerable attention, the existing ScienceQA dataset, which focuses on
multimodal scientific questions and explanations from elementary and high
school textbooks, lacks a comprehensive evaluation of diverse approaches. To
address this gap, we present COCO Multi-Modal Reasoning Dataset (COCO-MMRD), a
novel dataset that encompasses an extensive collection of open-ended questions,
rationales, and answers derived from the large object dataset COCO. Unlike
previous datasets that rely on multiple-choice questions, our dataset pioneers
the use of open-ended questions in the context of multimodal CoT, introducing a
more challenging problem that effectively assesses the reasoning capability of
CoT models. Through comprehensive evaluations and detailed analyses, we provide
valuable insights and propose innovative techniques, including multi-hop
cross-modal attention and sentence-level contrastive learning, to enhance the
image and text encoders. Extensive experiments demonstrate the efficacy of the
proposed dataset and techniques, offering novel perspectives for advancing
multimodal reasoning.
Enhancing Attention’s Explanation Using Interpretable Tsetlin Machine
Explainability is one of the key factors in Natural Language Processing (NLP), especially for legal documents, medical diagnosis, and clinical text. The attention mechanism has recently been a popular choice for such explainability because it estimates the relative importance of input units. Recent research has revealed, however, that such mechanisms tend to misidentify irrelevant input units as important in their explanations. This is because language representation layers are initialized with pretrained word embeddings that are not context-dependent. This lack of context-dependent knowledge in the initial layer makes it difficult for the model to concentrate on the important aspects of the input. Usually, this does not affect the performance of the model, but the resulting explanations differ from human understanding. Hence, in this paper, we propose an ensemble method that embeds logic-based information from the Tsetlin Machine into the initial representation layer of the neural network to enhance the model's explainability. We obtain the global clause score for each word in the vocabulary and feed it into the neural network layer as context-dependent information. Our experiments show that the ensemble method enhances the explainability of the attention layer without sacrificing any model performance, and even improves performance on some datasets.