371 research outputs found
Verifying baselines for crisis event information classification on Twitter
Social media are rich information sources during and in the aftermath of crisis events such as earthquakes and terrorist attacks. Despite myriad challenges, with the right tools significant insight can be gained that can assist emergency responders and related applications. However, most extant approaches are incomparable, using bespoke definitions, models, datasets and even evaluation metrics. Furthermore, it is rare that code, trained models, or exhaustive parametrisation details are made openly available. Thus, even confirmation of self-reported performance is problematic; authoritatively determining the state of the art (SOTA) is essentially impossible. Consequently, to begin addressing such endemic ambiguity, this paper seeks to make three contributions: 1) the replication and results confirmation of a leading (and generalisable) technique; 2) testing straightforward modifications of the technique likely to improve performance; and 3) the extension of the technique to a novel and complementary type of crisis-relevant information to demonstrate its generalisability.
An Attention-Based Model for Predicting Contextual Informativeness and Curriculum Learning Applications
Both humans and machines learn the meaning of unknown words through contextual information in a sentence, but not all contexts are equally helpful for learning. We introduce an effective method for capturing the level of contextual informativeness with respect to a given target word. Our study makes three main contributions. First, we develop models for estimating contextual informativeness, focusing on the instructional aspect of sentences. Our attention-based approach using pre-trained embeddings demonstrates state-of-the-art performance on our single-context dataset and an existing multi-sentence context dataset. Second, we show how our model identifies key contextual elements in a sentence that are likely to contribute most to a reader's understanding of the target word. Third, we examine how our contextual informativeness model, originally developed for vocabulary learning applications for students, can be used to develop better training curricula for word embedding models in batch learning and few-shot machine learning settings. We believe our results open new possibilities for applications that support language learning for both human and machine learners.
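The gist of attention-based informativeness scoring can be sketched in a few lines. This is a toy illustration only, not the authors' model: the embeddings are random stand-ins for pre-trained vectors, and the scoring function (attention-pooled context compared to the target) is an assumption about the general shape of such an approach.

```python
import numpy as np

def contextual_informativeness(target_vec, context_vecs):
    """Toy attention-based score of how informative a context is
    for a target word: softmax dot-product attention over the context
    word vectors, then cosine similarity between the target and the
    attention-pooled context representation."""
    logits = context_vecs @ target_vec            # (n,) attention logits
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                      # softmax attention weights
    pooled = weights @ context_vecs               # attention-weighted context vector
    cos = pooled @ target_vec / (np.linalg.norm(pooled) * np.linalg.norm(target_vec))
    return float(cos), weights

# Random stand-ins for pre-trained embeddings of a target word
# and five context words.
rng = np.random.default_rng(0)
target = rng.normal(size=8)
context = rng.normal(size=(5, 8))
score, attn = contextual_informativeness(target, context)
```

The attention weights indicate which context words the score attends to, mirroring the paper's idea of identifying the contextual elements that contribute most to understanding the target word.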
The Benefits of Word Embeddings Features for Active Learning in Clinical Information Extraction
This study investigates the use of unsupervised word embeddings and sequence features for sample representation in an active learning framework built to extract clinical concepts from clinical free text. The objective is to further reduce the manual annotation effort while achieving higher effectiveness compared to a set of baseline features. Unsupervised features are derived from skip-gram word embeddings and a sequence representation approach. The comparative performance of unsupervised features and baseline hand-crafted features in an active learning framework is investigated using a wide range of selection criteria, including least confidence, information diversity, information density and diversity, and domain knowledge informativeness. Two clinical datasets are used for evaluation: the i2b2/VA 2010 NLP challenge and the ShARe/CLEF 2013 eHealth Evaluation Lab. Our results demonstrate significant improvements in effectiveness as well as annotation effort savings across both datasets. Using unsupervised features along with baseline features for sample representation leads to further savings of up to 9% and 10% of the token and concept annotation rates, respectively.
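Of the selection criteria listed above, least confidence is the simplest to sketch. A minimal illustration (the probability matrix and function name are hypothetical, not from the study):

```python
import numpy as np

def least_confidence_query(probs, k):
    """Pick the k unlabelled samples the model is least confident about.

    `probs` is an (n_samples, n_classes) array of predicted class
    probabilities; least-confidence uncertainty = 1 - max class probability.
    """
    uncertainty = 1.0 - probs.max(axis=1)
    return np.argsort(-uncertainty)[:k]   # indices of the k most uncertain samples

probs = np.array([[0.9, 0.1],    # confident prediction
                  [0.5, 0.5],    # maximally uncertain
                  [0.7, 0.3]])
picked = least_confidence_query(probs, 2)
# → [1, 2]: the two least-confident samples are queried for annotation
```

The queried samples are then sent to a human annotator, which is where the feature-representation choice above affects how quickly annotation effort pays off.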
Neural models of language use: Studies of language comprehension and production in context
Artificial neural network models of language are mostly known and appreciated today for providing a backbone for formidable AI technologies. This thesis takes a different perspective. Through a series of studies on language comprehension and production, it investigates whether artificial neural networks—beyond being useful in countless AI applications—can serve as accurate computational simulations of human language use, and thus as a new core methodology for the language sciences.
A framework for developing requirements engineering tools for computational business intelligence
Ritsumeikan University doctoral thesis (Doctor of Engineering)
Uncertainty Measures and Transfer Learning in Active Learning for Text Classification
Deep learning has become a prominent and popular tool in a wide range of applications that process complex data. However, to train a sufficient model for supervised tasks, deep learning relies on vast amounts of labelled data. Even when the data itself is easily attainable, acquiring labels can be tedious, expensive, and dependent on an expert annotator. Active learning (AL) aims to lower the data requirement of deep learning, and machine learning in general, and consequently to reduce labelling cost. By letting the learner actively choose the data it wants to learn from, active learning aspires to label only the most valuable data and to train a classifier with only a small labelled training set. The idea is that the model can single out examples of high informativeness from a pool of unlabelled data, i.e. instances from which the model will gain the most information, which is often linked to model uncertainty. This thesis explores several aspects of pool-based active learning in text classification by combining ideas that have shown good results individually. To ensure diverse actively queried samples, two approaches are investigated: adding randomness to the active selection, and clustering the unlabelled pool. Further, since deep models rarely represent model uncertainty, a Bayesian approximation is computed by sampling sub-models through dropout at test time and averaging over their predictions. Lastly, active learning is studied in a transfer learning setting, combined with the previously explored ideas.
The experiments clearly show how active learning depends on the data and model, as the two models and datasets produced quite dissimilar results. The models in question are a simple CNN for sentence classification and an AWD LSTM with pre-training, both tested on the binary IMDB movie review sentiment dataset and the multi-class AG news corpus. While no AL strategy had any effect on AG, with or without the proposed additions, all variations improved results on IMDB with the CNN. Although clustering appeared to be the preferred choice for the CNN, it had a negative effect when combined with transfer learning and the AWD LSTM. The combination of clustering and Bayesian approximations added nothing beyond increased computational cost, even though each individually improved validation accuracy and loss with the CNN. All in all, no method was markedly better than random sampling; however, many of the results introduced interesting ideas for further work.
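The Monte Carlo dropout approximation described above (dropout kept active at test time, predictions averaged over stochastic forward passes) can be sketched with a toy model. This is an illustrative single-layer model with made-up weights, not the thesis's CNN or AWD LSTM:

```python
import numpy as np

def mc_dropout_predict(x, W, b, n_samples=100, p_drop=0.5, seed=0):
    """Monte Carlo dropout: run several stochastic forward passes with
    dropout active, average the softmax outputs, and use the spread
    across samples as a rough model-uncertainty estimate."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_samples):
        mask = rng.random(x.shape) > p_drop        # dropout mask (keep prob 1 - p_drop)
        h = (x * mask) / (1.0 - p_drop)            # inverted-dropout scaling
        logits = h @ W + b
        e = np.exp(logits - logits.max())
        preds.append(e / e.sum())                  # softmax over classes
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.std(axis=0)   # mean prediction, per-class spread

# Toy 4-feature input and a hand-made 2-class weight matrix.
x = np.array([1.0, -2.0, 0.5, 3.0])
W = np.ones((4, 2)); W[:, 1] = -1.0
mean, std = mc_dropout_predict(x, W, np.zeros(2))
```

The per-class standard deviation plays the role of the uncertainty signal that an acquisition function would rank the unlabelled pool by.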
Understanding and Supporting Vocabulary Learners via Machine Learning on Behavioral and Linguistic Data
This dissertation presents various machine learning applications for predicting different cognitive states of students while they are using a vocabulary tutoring system, DSCoVAR. We conduct four studies, each of which includes a comprehensive analysis of behavioral and linguistic data and provides data-driven evidence for designing personalized features for the system.
The first study presents how behavioral and linguistic interactions from the vocabulary tutoring system can be used to predict students' off-task states. The study identifies which predictive features from interaction signals are more important and examines different types of off-task behaviors. The second study investigates how to automatically evaluate students' partial word knowledge from open-ended responses to definition questions. We present a technique that augments modern word-embedding techniques with a classic semantic differential scaling method from cognitive psychology. We then use this interpretable semantic scale method for predicting students' short- and long-term learning.
The third and fourth studies show how to develop a model that can generate more efficient training curricula for both human and machine vocabulary learners. The third study illustrates a deep-learning model to score sentences for a contextual vocabulary learning curriculum. We use pre-trained language models, such as ELMo or BERT, and an additional attention layer to capture how the context words are more or less important with respect to the meaning of the target word. The fourth study examines how the contextual informativeness model, originally designed to develop curricula for human vocabulary learning, can also be used for developing curricula for various word embedding models. We find that sentences predicted to be less informative for human learners are also less helpful for machine learning algorithms.
Having a rich understanding of user behaviors, responses, and learning stimuli is imperative to develop an intelligent online system. Our studies demonstrate interpretable methods with cross-disciplinary approaches to understand various cognitive states of students during learning. The analysis results provide data-driven evidence for designing personalized features that can maximize learning outcomes. The datasets collected in these studies will be shared publicly to promote future research on online tutoring systems. These findings can also be applied to represent different user states observed in other online systems. In the future, we believe our findings can help to implement a more personalized vocabulary learning system, to develop a system that uses non-English texts or different types of inputs, and to investigate how the machine learning outputs interact with students.
PhD dissertation, Information, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/162999/1/sjnam_1.pd
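The second study's core idea, scoring open-ended responses on a semantic differential scale by projecting embeddings onto an axis between two pole words, can be sketched minimally. Everything here is illustrative: the two-dimensional vectors stand in for real word embeddings, and the function is an assumption about the general technique, not the dissertation's implementation.

```python
import numpy as np

def semantic_scale_score(response_vec, pole_a, pole_b):
    """Project a response embedding onto a semantic-differential axis
    defined by two pole-word embeddings. Returns a score in [-1, 1]:
    positive means the response lies closer to pole_a."""
    axis = pole_a - pole_b
    return float(response_vec @ axis /
                 (np.linalg.norm(response_vec) * np.linalg.norm(axis)))

pole_a = np.array([1.0, 0.0])   # stand-in embedding for one pole word
pole_b = np.array([-1.0, 0.0])  # stand-in embedding for the opposite pole
score = semantic_scale_score(np.array([0.8, 0.2]), pole_a, pole_b)
```

A response embedding leaning toward `pole_a` yields a positive score, giving a continuous, interpretable measure of partial word knowledge.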