Look at the First Sentence: Position Bias in Question Answering
Many extractive question answering models are trained to predict start and
end positions of answers. The choice of predicting answers as positions is
mainly due to its simplicity and effectiveness. In this study, we hypothesize
that when the distribution of the answer positions is highly skewed in the
training set (e.g., answers lie only in the k-th sentence of each passage), QA
models predicting answers as positions can learn spurious positional cues and
fail to give answers in different positions. We first illustrate this position
bias in popular extractive QA models such as BiDAF and BERT and thoroughly
examine how position bias propagates through each layer of BERT. To safely
deliver position information without position bias, we train models with
various de-biasing methods including entropy regularization and bias
ensembling. Among them, we found that using the prior distribution of answer
positions as a bias model is very effective at reducing position bias,
recovering the performance of BERT from 37.48% to 81.64% when trained on a
biased SQuAD dataset.
Comment: 13 pages, EMNLP 2020
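As a rough illustration of the bias-ensembling idea, the sketch below combines a QA model's start-position logits with the log prior over answer positions during training (a product-of-experts variant); the function and variable names are illustrative assumptions, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def bias_ensemble_loss(start_logits, position_prior, target_start):
        # Product-of-experts debiasing: add the log prior over answer
        # positions to the model's log-probabilities during training, so
        # the positional cue is "explained away" and the model is pushed
        # to rely on position-independent evidence.
        log_model = F.log_softmax(start_logits, dim=-1)
        combined = log_model + torch.log(position_prior + 1e-12)
        return F.nll_loss(F.log_softmax(combined, dim=-1), target_start)

    # position_prior: empirical distribution of answer start positions,
    # estimated once from the (biased) training set; at test time the
    # plain model logits are used on their own.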
Adaptive Accelerated Failure Time modeling with a Semiparametric Skewed Error Distribution
The accelerated failure time (AFT) model is widely used to analyze
relationships between variables in the presence of censored observations.
However, this model relies on assumptions such as the form of the error
distribution, and violations of these assumptions can lead to biased or
inefficient estimates. To overcome this challenge, we propose a novel approach that
incorporates a semiparametric skew-normal scale mixture distribution for the
error term in the AFT model. By allowing for more flexibility and robustness,
this approach reduces the risk of misspecification and improves the accuracy of
parameter estimation. We investigate the identifiability and consistency of the
proposed model and develop a practical estimation algorithm. To evaluate the
performance of our approach, we conduct extensive simulation studies and real
data analyses. The results demonstrate the effectiveness of our method in
providing robust and accurate estimates in various scenarios.
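For concreteness, here is a simplified sketch of AFT estimation with a skewed error; it substitutes a plain skew-normal error (scipy's skewnorm) for the paper's semiparametric skew-normal scale mixture and handles right-censoring through the survival function.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import skewnorm

    def aft_negloglik(params, X, log_t, delta):
        # AFT model: log(T) = X @ beta + scale * eps, eps ~ skew-normal.
        # delta[i] = 1 for an observed event, 0 for right-censoring,
        # which contributes through the log survival function instead
        # of the log density.
        k = X.shape[1]
        beta, log_scale, shape = params[:k], params[k], params[k + 1]
        resid = (log_t - X @ beta) / np.exp(log_scale)
        loglik = np.where(delta == 1,
                          skewnorm.logpdf(resid, shape) - log_scale,
                          skewnorm.logsf(resid, shape))
        return -loglik.sum()

    # fit = minimize(aft_negloglik, x0, args=(X, np.log(t), delta))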
Automatic Creation of Named Entity Recognition Datasets by Querying Phrase Representations
Most weakly supervised named entity recognition (NER) models rely on
domain-specific dictionaries provided by experts. This approach is infeasible
in many domains where dictionaries do not exist. While a recent study used a
phrase retrieval model to automatically construct pseudo-dictionaries from
entities retrieved from Wikipedia, these dictionaries often have
limited coverage because the retriever is likely to retrieve popular entities
rather than rare ones. In this study, we present a novel framework, HighGEN,
that generates NER datasets with high-coverage pseudo-dictionaries.
Specifically, we create entity-rich dictionaries with a novel search method,
called phrase embedding search, which encourages the retriever to search a
space densely populated with various entities. In addition, we use a new
verification process based on the embedding distance between candidate entity
mentions and entity types to reduce the false-positive noise in weak labels
generated by high-coverage dictionaries. We demonstrate that HighGEN
outperforms the previous best model by an average F1 score of 4.7 across five
NER benchmark datasets.
Comment: ACL 2023
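The verification step can be pictured as a cosine-similarity filter between mention embeddings and entity-type embeddings, as in the sketch below; the encoders, vectors, and threshold are illustrative assumptions rather than HighGEN's exact procedure.

    import numpy as np

    def verify_weak_labels(mention_vecs, type_vecs, type_ids, threshold=0.5):
        # Keep a weak label only if the candidate mention's embedding is
        # close (cosine similarity) to the embedding of its assigned
        # entity type; distant pairs are treated as false-positive noise.
        def unit(v):
            return v / np.linalg.norm(v, axis=-1, keepdims=True)
        sims = (unit(mention_vecs) * unit(type_vecs[type_ids])).sum(-1)
        return sims >= threshold  # boolean mask over candidate mentions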
Simple Questions Generate Named Entity Recognition Datasets
Recent named entity recognition (NER) models often rely on human-annotated
datasets, which require extensive professional knowledge of the target domain
and entities. This work introduces an ask-to-generate approach, which
automatically generates NER datasets by asking simple natural language
questions to an open-domain question answering system (e.g., "Which disease?").
Despite using fewer training resources, our models, trained solely on the
generated datasets, substantially outperform strong low-resource models by 20.8 F1
score on average across six popular NER benchmarks. Our models also show
competitive performance with rich-resource models that additionally leverage
in-domain dictionaries provided by domain experts. In few-shot NER, we
outperform the previous best model by 5.2 F1 score on three benchmarks and
achieve new state-of-the-art performance.
Comment: Code available at https://github.com/dmis-lab/GeNER
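The ask-to-generate recipe can be sketched as follows; qa_search is a hypothetical stand-in for the open-domain QA system, and the span-tagging logic is a simplification of the paper's pipeline.

    def generate_ner_examples(qa_search, question, entity_type, top_k=100):
        # Ask a simple question (e.g., "Which disease?") to an open-domain
        # QA system and turn each (answer, passage) pair into a weakly
        # labeled NER example by tagging the answer span in the passage.
        examples = []
        for answer, passage in qa_search(question, top_k=top_k):
            start = passage.find(answer)
            if start != -1:
                spans = [(start, start + len(answer), entity_type)]
                examples.append((passage, spans))
        return examples

    # examples = generate_ner_examples(qa_search, "Which disease?", "Disease")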
Fast frequency discrimination and phoneme recognition using a biomimetic membrane coupled to a neural network
In the human ear, the basilar membrane plays a central role in sound
recognition. When excited by sound, this membrane responds with a
frequency-dependent displacement pattern that is detected and identified by the
auditory hair cells combined with the human neural system. Inspired by this
structure, we designed and fabricated an artificial membrane that produces a
spatial displacement pattern in response to an audible signal, which we used to
train a convolutional neural network (CNN). When trained with single frequency
tones, this system can unambiguously distinguish tones closely spaced in
frequency. When instead trained to recognize spoken vowels, this system
outperforms existing methods for phoneme recognition, including the discrete
Fourier transform (DFT), zoom FFT and chirp z-transform, especially when tested
in short time windows. This sound recognition scheme therefore promises
significant benefits in fast and accurate sound identification compared to
existing methods.
Comment: 7 pages, 4 figures
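A CNN over the membrane's displacement patterns could be as small as the sketch below; the layer sizes, one-channel input shape, and class count are assumptions for illustration, not the paper's architecture.

    import torch.nn as nn

    class DisplacementCNN(nn.Module):
        # Classifies the membrane's spatial displacement pattern, treated
        # here as a one-channel image, into tone or phoneme classes.
        def __init__(self, n_classes):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Flatten(), nn.LazyLinear(n_classes),
            )

        def forward(self, x):  # x: (batch, 1, H, W) displacement map
            return self.net(x)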