Deep Probabilistic Logic: A Unifying Framework for Indirect Supervision
Deep learning has emerged as a versatile tool for a wide range of NLP tasks,
due to its superior capacity in representation learning. But its applicability
is limited by the reliance on annotated examples, which are difficult to
produce at scale. Indirect supervision has emerged as a promising direction to
address this bottleneck, either by introducing labeling functions to
automatically generate noisy examples from unlabeled text, or by imposing
constraints over interdependent label decisions. A plethora of methods have
been proposed, each with respective strengths and limitations. Probabilistic
logic offers a unifying language to represent indirect supervision, but
end-to-end modeling with probabilistic logic is often infeasible due to
intractable inference and learning. In this paper, we propose deep
probabilistic logic (DPL) as a general framework for indirect supervision, by
composing probabilistic logic with deep learning. DPL models label decisions as
latent variables, represents prior knowledge on their relations using weighted
first-order logical formulas, and alternates between learning a deep neural
network for the end task and refining uncertain formula weights for indirect
supervision, using variational EM. This framework subsumes prior indirect
supervision methods as special cases and enables novel combinations through the
infusion of rich domain and linguistic knowledge. Experiments on biomedical
machine reading demonstrate the promise of this approach. (EMNLP 2018, final version)
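The alternation the abstract describes — labels as latent variables scored by weighted formulas, looping between fitting a model and re-estimating formula weights with variational EM — can be illustrated with a deliberately tiny sketch. Everything below (the synthetic data, the two labeling functions, the logistic-regression stand-in for the deep network, and the agreement-based re-weighting) is a hypothetical simplification for intuition, not the authors' implementation:

```python
# Toy sketch of a DPL-style alternation (NOT the paper's code): labeling
# functions vote on a latent label; a simple classifier is trained on the
# resulting soft labels (E-step -> M-step), and each function's weight is
# then re-estimated from its agreement with the classifier.
import math
import random

random.seed(0)

# Hypothetical unlabeled data: one feature, "true" concept is feature > 0.
data = [(random.uniform(-1, 1),) for _ in range(200)]

# Two hypothetical labeling functions with different reliability.
def lf_good(x):  return 1 if x[0] > -0.1 else 0   # mostly right
def lf_noisy(x): return 1 if x[0] > 0.5 else 0    # misses many positives

lfs = [lf_good, lf_noisy]
weights = [1.0, 1.0]            # per-function reliability (log-odds)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = 0.0, 0.0                 # logistic stand-in for the deep network
for _ in range(30):
    # E-step: soft posterior over the latent label from weighted LF votes.
    q = [sigmoid(sum(wt * (2 * lf(x) - 1) for wt, lf in zip(weights, lfs)))
         for x in data]
    # M-step (network): gradient steps on cross-entropy against q.
    for x, qi in zip(data, q):
        p = sigmoid(w * x[0] + b)
        g = p - qi
        w -= 0.5 * g * x[0]
        b -= 0.5 * g
    # M-step (supervision): re-weight each LF by agreement with the model.
    for j, lf in enumerate(lfs):
        agree = sum(1 for x in data
                    if lf(x) == (sigmoid(w * x[0] + b) > 0.5)) / len(data)
        weights[j] = math.log(max(agree, 1e-3) / max(1 - agree, 1e-3))

print(round(sigmoid(w * 0.9 + b), 2), round(sigmoid(-0.9 * w + b), 2))
```

Despite never seeing a gold label, the model ends up assigning high probability to clearly positive inputs and low probability to clearly negative ones, because the two noisy supervision signals are reconciled through the latent-variable posterior.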
Characteristics of Useful Code Reviews: An Empirical Study at Microsoft
Abstract: Over the past decade, both open source and commercial software projects have adopted contemporary peer code review practices as a quality control mechanism. Prior research has shown that developers spend a large amount of time and effort performing code reviews. Therefore, identifying factors that lead to useful code reviews can benefit projects by increasing code review effectiveness and quality. In a three-stage mixed research study, we qualitatively investigated what aspects of code reviews make them useful to developers, used our findings to build and verify a classification model that can distinguish between useful and not useful code review feedback, and finally used this classifier to classify review comments, enabling us to empirically investigate factors that lead to more effective code review feedback. In total, we analyzed 1.5 million review comments from five Microsoft projects and uncovered many factors that affect the usefulness of review feedback. For example, we found that the proportion of useful comments made by a reviewer increases dramatically in the first year that he or she is at Microsoft but tends to plateau afterwards. In contrast, we found that the more files there are in a change, the lower the proportion of comments in the code review that will be of value to the author of the change. Based on our findings, we provide recommendations for practitioners to improve the effectiveness of code reviews.
Automatic Question Generation Using Semantic Role Labeling for Morphologically Rich Languages
In this paper, a novel approach to automatic question generation (AQG) using semantic role labeling (SRL) for morphologically rich languages is presented. A model for AQG is developed for our native language, Croatian. Croatian is a highly inflected language that belongs to the Balto-Slavic family of languages. This article can be divided into two stages. In the first stage, we present a novel approach to SRL of texts written in Croatian that uses Conditional Random Fields (CRF). SRL traditionally consists of predicate disambiguation, argument identification, and argument classification. After these steps, most approaches use beam search to find the optimal sequence of arguments for a given predicate. We propose an architecture for predicate identification and argument classification in which finding the best sequence of arguments is handled by Viterbi decoding. We enrich the SRL features with attributes custom-made for this language. Our SRL system achieves an F1 score of 78% in the argument classification step on the Croatian hr500k corpus. In the second stage, the proposed SRL model is used to develop an AQG system for question generation from texts written in Croatian. We proposed custom templates for AQG and used them to generate a total of 628 questions, which were evaluated by experts scoring every question on a Likert scale. Expert evaluation showed that our AQG achieved good results: 68% of the generated questions could be used for educational purposes. With these results, the proposed AQG system could be implemented inside educational systems such as Intelligent Tutoring Systems.
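The decoding step mentioned in the abstract — choosing the best sequence of argument labels with Viterbi rather than beam search — can be sketched generically. The labels, emission scores, and the duplicate-argument transition penalty below are all invented for illustration; they are not the paper's model or feature set:

```python
# Toy Viterbi decoding over SRL argument labels (hypothetical scores):
# pick the highest-scoring label sequence under per-token emission scores
# plus a transition score that discourages repeating a core argument.
ARG_LABELS = ["O", "A0", "A1"]

def viterbi(emissions, transition):
    """emissions: list of {label: score}; transition: f(prev, cur) -> score."""
    # best[label] = (score, path) for the best sequence ending in label
    best = {lab: (emissions[0][lab], [lab]) for lab in ARG_LABELS}
    for em in emissions[1:]:
        new = {}
        for cur in ARG_LABELS:
            score, path = max(
                ((s + transition(prev, cur) + em[cur], p)
                 for prev, (s, p) in best.items()),
                key=lambda t: t[0])
            new[cur] = (score, path + [cur])
        best = new
    return max(best.values(), key=lambda t: t[0])[1]

def trans(prev, cur):
    # Hypothetical penalty: discourage assigning the same core argument
    # label to consecutive tokens.
    return -2.0 if prev == cur != "O" else 0.0

# Hypothetical emission scores for a three-token sentence.
ems = [{"O": 0.1, "A0": 1.0, "A1": 0.2},
       {"O": 0.8, "A0": 0.3, "A1": 0.1},
       {"O": 0.2, "A0": 0.1, "A1": 1.1}]
print(viterbi(ems, trans))   # -> ['A0', 'O', 'A1']
```

Unlike beam search, this exact dynamic program is guaranteed to find the globally optimal sequence under first-order transition scores, which is presumably why the authors prefer it for argument sequencing.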
Don't Just Listen, Use Your Imagination: Leveraging Visual Common Sense for Non-Visual Tasks
Artificial agents today can answer factual questions. But they fall short on
questions that require common sense reasoning. Perhaps this is because most
existing common sense databases rely on text to learn and represent knowledge.
But much of common sense knowledge is unwritten - partly because it tends not
to be interesting enough to talk about, and partly because some common sense is
unnatural to articulate in text. While unwritten, it is not unseen. In this
paper we leverage semantic common sense knowledge learned from images - i.e.
visual common sense - in two textual tasks: fill-in-the-blank and visual
paraphrasing. We propose to "imagine" the scene behind the text, and leverage
visual cues from the "imagined" scenes in addition to textual cues while
answering these questions. We imagine the scenes as a visual abstraction. Our
approach outperforms a strong text-only baseline on these tasks. Our proposed
tasks can serve as benchmarks to quantitatively evaluate progress in solving
tasks that go "beyond recognition". Our code and datasets are publicly
available.
Explaining Explanation: An Empirical Study on Explanation in Code Reviews
Code review is an important process for quality assurance in software
development. For an effective code review, the reviewers must explain their
feedback to enable the authors of the code change to act on them. However, the
explanation needs may differ among developers, who may require different types
of explanations. It is therefore crucial to understand what kind of
explanations reviewers usually use in code reviews. To the best of our
knowledge, no study published to date has analyzed the types of explanations
used in code review. In this study, we present the first analysis of
explanations in useful code reviews. We extracted a set of code reviews based
on their usefulness and labeled them based on whether they contained an
explanation, a solution, or both a proposed solution and an explanation
thereof.
Based on our analysis, we found that a significant portion of the code review
comments (46%) includes only a solution without providing an explanation. We
further investigated the remaining 54% of code review comments containing an
explanation and conducted an open card sorting to categorize the reviewers'
explanations. We distilled seven distinct categories of explanations based on
the expression forms developers used. Then, we utilized large language models,
specifically ChatGPT, to assist developers in getting a code review explanation
that suits their preferences. Specifically, we created prompts to transform a
code review explanation into a specific type of explanation. Our evaluation
results show that ChatGPT correctly generated the specified type of explanation
in 88/90 cases and that 89/90 of the cases had the correct explanation.
Overall, our study provides insights into the types of explanations that
developers use in code review and showcases how ChatGPT can be leveraged during
the code review process to generate a specific type of explanation.
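The transformation idea — wrapping a review comment in an instruction that asks an LLM to restate its explanation in a requested category — can be sketched as a plain prompt template. The category names and the template wording below are hypothetical stand-ins; the paper's seven distilled categories and actual prompts are not reproduced here:

```python
# Hypothetical prompt-template sketch (not the paper's prompts): build an
# instruction asking an LLM to rewrite a review comment's explanation in a
# requested explanation style.
EXPLANATION_TYPES = ["rule statement", "consequence", "example", "question"]

PROMPT_TEMPLATE = (
    "Rewrite the explanation in the following code review comment as "
    "a {etype}, keeping the technical content unchanged.\n\n"
    "Comment: {comment}"
)

def build_prompt(comment, etype):
    if etype not in EXPLANATION_TYPES:
        raise ValueError(f"unknown explanation type: {etype}")
    return PROMPT_TEMPLATE.format(etype=etype, comment=comment)

prompt = build_prompt(
    "Avoid string concatenation in this loop; it is O(n^2).",
    "consequence")
print(prompt.splitlines()[0])
```

The resulting string would then be sent to the model of choice; keeping the template separate from the transport code makes it easy to evaluate each explanation type against reviewer preferences, as the study does.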
Semantic Neighborhoods as Hypergraphs
Ambiguity-preserving representations such as lattices are very useful in a number of NLP tasks, including paraphrase generation, paraphrase recognition, and machine translation evaluation. Lattices compactly represent lexical variation, but word order variation leads to a combinatorial explosion of states. We advocate hypergraphs as compact representations for sets of utterances describing the same event or object. We present a method to construct hypergraphs from sets of utterances, and evaluate this method on a simple recognition task. Given a set of utterances that describe a single object or event, we construct such a hypergraph, and demonstrate that it can recognize novel descriptions of the same event with high accuracy.
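A drastically simplified version of the idea — our own toy reduction, not the paper's construction — treats each utterance's word set as a hyperedge over word nodes and scores a novel description by its overlap with the closest hyperedge:

```python
# Minimal hypergraph sketch (a simplification for intuition, not the
# paper's method): nodes are words, hyperedges are the word sets of the
# utterances describing one event; a new description is scored by its
# Jaccard overlap with the best-matching hyperedge.
def build_hypergraph(utterances):
    edges = [frozenset(u.lower().split()) for u in utterances]
    nodes = set().union(*edges)
    return nodes, edges

def match_score(description, hypergraph):
    nodes, edges = hypergraph
    words = set(description.lower().split())
    return max(len(words & e) / len(words | e) for e in edges)

# Hypothetical set of utterances describing a single event.
event = ["a man rides a bike",
         "a man is riding a bicycle",
         "someone cycles down the road"]
hg = build_hypergraph(event)
print(match_score("a man rides a bicycle", hg))   # high overlap
print(match_score("a cat sleeps on a mat", hg))   # low overlap
```

The real method has to handle word-order variation and shared substructure across utterances, which is exactly what the hypergraph encoding buys over a flat lattice; the sketch only conveys the set-of-hyperedges intuition.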