Flud: a hybrid crowd-algorithm approach for visualizing biological networks
Modern experiments in many disciplines generate large quantities of network
(graph) data. Researchers require aesthetic layouts of these networks that
clearly convey the domain knowledge and meaning. However, the problem remains
challenging due to multiple conflicting aesthetic criteria and complex
domain-specific constraints. In this paper, we present a strategy for
generating visualizations that can help network biologists understand the
protein interactions that underlie processes that take place in the cell.
Specifically, we have developed Flud, an online game with a purpose (GWAP) that
allows humans with no expertise to design biologically meaningful graph layouts
with the help of algorithmically generated suggestions. Further, we propose a
novel hybrid approach for graph layout wherein crowdworkers and a simulated
annealing algorithm build on each other's progress. To showcase the
effectiveness of Flud, we recruited crowdworkers on Amazon Mechanical Turk to
lay out complex networks that represent signaling pathways. Our results show
that the proposed hybrid approach outperforms state-of-the-art techniques for
graphs with a large number of feedback loops. We also found that the
algorithmically generated suggestions guided players when they were stuck and
helped them improve their scores. Finally, we discuss broader implications for
mixed-initiative interactions in human computation games.
Comment: This manuscript is currently under review
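The algorithmic half of this hybrid loop can be sketched in a few lines of simulated annealing. The energy function, move size, and cooling schedule below are simplified placeholders invented for illustration, not Flud's actual multi-criteria score, which combines several aesthetic and domain constraints:

```python
import math
import random

def layout_energy(pos, edges, ideal=1.0):
    """Toy aesthetic score: penalise edges whose drawn length deviates
    from an ideal length. Lower is better."""
    total = 0.0
    for u, v in edges:
        (x1, y1), (x2, y2) = pos[u], pos[v]
        d = math.hypot(x1 - x2, y1 - y2)
        total += (d - ideal) ** 2
    return total

def anneal_layout(pos, edges, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Simulated annealing: propose small random node moves and accept a
    worse layout with probability exp(-delta/T), so the search can climb
    out of local minima while the temperature T is still high."""
    rng = random.Random(seed)
    pos = dict(pos)  # do not mutate the caller's layout
    energy = layout_energy(pos, edges)
    t = t0
    for _ in range(steps):
        node = rng.choice(list(pos))
        old = pos[node]
        pos[node] = (old[0] + rng.uniform(-0.1, 0.1),
                     old[1] + rng.uniform(-0.1, 0.1))
        new_energy = layout_energy(pos, edges)
        if new_energy < energy or rng.random() < math.exp((energy - new_energy) / t):
            energy = new_energy   # accept the move
        else:
            pos[node] = old       # reject: revert the move
        t *= cooling
    return pos, energy
```

Accepting occasional worse layouts early on is what lets the search escape local minima that pure hill climbing gets stuck in; in the hybrid setting, crowdworker moves play a similar role.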
Virtual Movement from Natural Language Text
It is a challenging task for machines to follow textual instructions. Properly understanding and using the meaning of a textual instruction in application areas such as robotics and animation is very difficult for machines. Interpreting textual instructions to automatically generate the corresponding motions (e.g. exercises), and validating those movements, are difficult tasks. To achieve our initial goal of having machines properly understand textual instructions and generate motions accordingly, we recorded five different exercises, performed in random order by seven amateur performers, using a Microsoft Kinect device. During the recording, we found that each human performer interpreted the same exercise differently even though they were given identical textual instructions. We performed a quality assessment study on the derived data using a crowdsourcing approach. We then tested inter-rater agreement for different types of visualization and found that the RGB-based visualization showed the best agreement among the annotators, with an animation of a virtual character in second place.
In the next phase we worked with physical exercise instructions. Physical exercise is an everyday activity domain in which textual exercise descriptions usually focus on body movements. Body movements are a common element across a broad range of activities that are of interest for robotic automation. Our main goal is to develop a text-to-animation system that can be used in different application areas, including the development of multi-purpose robots whose operations are based on textual instructions. The system could also be used in other text-to-scene and text-to-animation systems. Generating animation from textual physical exercise instructions requires natural language understanding (NLU), including the understanding of non-declarative sentences.
It also requires the extraction of semantic information from complex syntactic structures with a large number of potential interpretations. Despite a comparatively high density of semantic references to body movements, exercise instructions still contain large amounts of underspecified information. Detecting and bridging or filling such underspecified elements is extremely challenging when relying on NLU methods alone. However, humans can often supply such implicit information with ease because of its embodied nature. We present a process that combines a semantic parser with a Bayesian network. The semantic parser extracts all the information explicitly present in the instruction for generating the animation. The Bayesian network adds some brain to the system, inferring the information that is implicit in the instruction. This information is essential for correctly generating the animation and is easy for a human to supply but very difficult for machines. Using crowdsourcing, with the help of human brains, we updated the Bayesian network. The combination of the semantic parser and the Bayesian network explicates the information contained in textual movement instructions so that an animation of the motion sequences, performed by a virtual humanoid character, can be rendered. To generate the animation from this information we used two markup languages: Behaviour Markup Language for 2D animation, and Humanoid Animation, which uses the Virtual Reality Modeling Language (VRML), for 3D animation.
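To illustrate how a network like this can fill in underspecified elements, here is a deliberately tiny stand-in: a single conditional probability table mapping an exercise verb to the most probable body part when none is stated. The verbs, body parts, and probabilities are invented for the example, not taken from the work described above:

```python
# Hypothetical conditional probability table P(body_part | verb), of the
# kind a crowd-updated Bayesian network would encode. Values are invented.
CPT = {
    "raise": {"arm": 0.7, "leg": 0.3},
    "bend":  {"knee": 0.6, "elbow": 0.4},
}

def fill_implicit(verb, stated_part=None):
    """Return the body part stated in the instruction if there is one;
    otherwise fall back to the most probable part under the CPT."""
    if stated_part is not None:
        return stated_part
    dist = CPT[verb]
    return max(dist, key=dist.get)
```

A crowdsourcing pass then amounts to re-estimating the table entries from human judgments, so the fallback reflects what people find most natural.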
Enhancing the Reasoning Capabilities of Natural Language Inference Models with Attention Mechanisms and External Knowledge
Natural Language Inference (NLI) is fundamental to natural language understanding. The task summarises the natural language understanding capabilities within a simple formulation of determining whether a natural language hypothesis can be inferred from a given natural language premise. NLI requires an inference system to address the full complexity of linguistic as well as real-world commonsense knowledge and, hence, the inferencing and reasoning capabilities of an NLI system are utilised in other complex language applications such as summarisation and machine comprehension. Consequently, NLI has received significant recent attention from both academia and industry. Despite extensive research, contemporary neural NLI models face challenges arising from the sole reliance on training data to comprehend all the linguistic and real-world commonsense knowledge. Further, different attention mechanisms, crucial to the success of neural NLI models, present the prospects of better utilisation when employed in combination. In addition, the NLI research field lacks a coherent set of guidelines for the application of one of the most crucial regularisation hyper-parameters in the RNN-based NLI models -- dropout.
In this thesis, we present neural models capable of leveraging attention mechanisms and models that utilise external knowledge to reason about inference. First, a combined attention model that leverages different attention mechanisms is proposed. Experimentation demonstrates that the proposed model is capable of better modelling the semantics of long and complex sentences. Second, to address the limitation of sole reliance on the training data, two novel neural frameworks utilising real-world commonsense and domain-specific external knowledge are introduced. Employing rule-based external knowledge retrieval from knowledge graphs, the first model takes advantage of convolutional encoders and factorised bilinear pooling to augment the reasoning capabilities of state-of-the-art NLI models. Utilising the significant advances in research on contextual word representations, the second model addresses the crucial open challenges of external knowledge retrieval, learning the encoding of the retrieved knowledge, and fusing the learned encodings into the NLI representations, in unique ways. Experimentation demonstrates the efficacy and superiority of the proposed models over previous state-of-the-art approaches. Third, to address the lack of guidance on dropout, a coherent set of guidelines is introduced, formulated through exhaustive evaluation, analysis and validation on the proposed RNN-based NLI models.
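For readers unfamiliar with the attention mechanisms being combined, a minimal scaled dot-product attention over toy vectors looks like the sketch below; this is generic background, not the thesis's combined attention model:

```python
import math

def dot_attention(query, keys, values):
    """Scaled dot-product attention: weight each value vector by the
    softmax of its key's similarity to the query."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)                        # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # weighted sum of the value vectors
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]
```

Combining mechanisms then amounts to computing several such weighted summaries (e.g. with different queries or similarity functions) and merging them, which is the design space the thesis explores.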
Making the Most of Crowd Information: Learning and Evaluation in AI tasks with Disagreements.
There is plenty of evidence that humans disagree on the interpretation of many
tasks in Natural Language Processing (NLP) and Computer Vision (CV), from objective
tasks rooted in linguistics such as part-of-speech tagging to more subjective (observer-dependent)
tasks such as classifying an image or deciding whether a proposition follows
from a certain premise. While most learning in Artificial Intelligence (AI) still relies
on the assumption that a single interpretation, captured by the gold label, exists for
each item, a growing research body in recent years has focused on learning methods
that do not rely on this assumption. Rather, they aim to learn ranges of truth amidst
disagreement. This PhD research makes a contribution to this field of study.
Firstly, we analytically review the evidence for disagreement on NLP and CV tasks,
focusing on tasks where substantial datasets with such information have been created.
As part of this review, we also discuss the most popular approaches to training
models from datasets containing multiple judgments and group these methods
together according to their handling of disagreement. Secondly, we make three proposals
for learning with disagreement: soft-loss, multi-task learning from gold and
crowds, and automatic temperature-scaled soft-loss. Thirdly, we address one gap in
this field of study – the prevalence of hard metrics for model evaluation even when
the gold assumption is shown to be an idealization – by adopting several previously
existing metrics and proposing novel soft metrics that do not make this assumption, and
by analyzing the merits and assumptions of all the metrics, hard and soft. Finally, we carry
out a systematic investigation of the key proposals in learning with disagreement by
training them across several tasks, considering several ways to evaluate the resulting
models and assessing the conditions under which each approach is effective. This is
a key contribution of this research, as work in learning with disagreement does not
often test proposals across tasks, compare proposals with a variety of approaches, or
evaluate using both soft metrics and hard metrics.
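A minimal sketch of the first and third proposals, under the assumption that soft labels are relative judgment frequencies and that temperature scaling is applied to the label distribution in log space (in the thesis the temperature is derived automatically; here it is a fixed parameter):

```python
import math

def soft_label(judgments, labels, temperature=1.0):
    """Turn raw crowd judgments into a soft label distribution.
    temperature > 1 flattens the distribution, < 1 sharpens it."""
    total = len(judgments)
    probs = [judgments.count(l) / total for l in labels]
    # temperature scaling in log space; zero counts get a tiny floor
    logits = [math.log(max(p, 1e-9)) / temperature for p in probs]
    z = sum(math.exp(x) for x in logits)
    return [math.exp(x) / z for x in logits]

def soft_cross_entropy(pred, target):
    """Soft loss: cross-entropy of the model's predicted distribution
    against the crowd distribution, instead of a one-hot gold label."""
    return -sum(t * math.log(max(p, 1e-9)) for p, t in zip(pred, target))
```

Training with `soft_cross_entropy` against `soft_label(...)` targets preserves the disagreement signal that aggregation to a single gold label would discard; the temperature controls how much of that disagreement the target distribution retains.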
The results obtained suggest, first of all, that it is essential to reach a consensus
on how to evaluate models. This is because the relative performance of the various
training methods is critically affected by the chosen form of evaluation. Secondly,
we observed a strong dataset effect. With substantial datasets, providing many judgments
by high-quality coders for each item, training directly with soft labels achieved
better results than training from aggregated or even gold labels. This result holds for
both hard and soft evaluation. But when the above conditions do not hold, leveraging
both gold and soft labels generally achieved the best results in the hard evaluation.
All datasets and models employed in this work are freely available as supplementary
materials.
Making digital history: The impact of digitality on public participation and scholarly practices in historical research
This thesis investigates two key questions: firstly, how do two broad groups - academic, family and local historians, and the public - evaluate, use, and contribute to digital history resources? And consequently, what impact have digital technologies had on public participation and scholarly practices in historical research?
It analyses the impact of design on participant experiences and the reception of digital historiography, demonstrating the value of methods drawn from human-computer interaction, including heuristic evaluation, trace ethnography and semi-structured interviews. This thesis also investigates the relationship between heritage crowdsourcing projects (which ask the public to help with meaningful, inherently rewarding tasks that contribute to a shared, significant goal or research interest related to cultural heritage collections or knowledge) and the development of historical skills and interests. It situates crowdsourcing and citizen history within the broader field of participatory digital history and then focuses on the impact of digitality on the research practices of faculty and community historians.
Chapter 1 provides an overview of over 400 digital history projects aimed at engaging the public or collecting, creating or enhancing records about historical materials for scholarly and general audiences. Chapter 2 discusses design factors that may influence the success of crowdsourcing projects. Following this, Chapter 3 explores the ways in which some crowdsourcing projects encourage deeper engagement with history or science, and the role of communities of practice in citizen history. Chapter 4 shifts our focus from public participation to scholarly practices in historical research, presenting the results of interviews conducted with 29 faculty and community historians. Finally, the Conclusion draws together the threads that link public participation and scholarly practices, teasing out the ways in which the practices of discovering, gathering, creating and sharing historical materials and knowledge have been affected by digital methods, tools and resources.