1,251 research outputs found
AutoSense Model for Word Sense Induction
Word sense induction (WSI), or the task of automatically discovering multiple
senses or meanings of a word, has three main challenges: domain adaptability,
novel sense detection, and sense granularity flexibility. While current latent
variable models are known to solve the first two challenges, they are not
flexible to different word sense granularities, which differ very much among
words, from aardvark with one sense, to play with over 50 senses. Current
models either require hyperparameter tuning or nonparametric induction of the
number of senses, both of which we find to be ineffective. Thus, we aim to
eliminate these requirements and solve the sense granularity problem by
proposing AutoSense, a latent variable model based on two observations: (1)
senses are represented as a distribution over topics, and (2) senses generate
pairings between the target word and its neighboring word. These observations
alleviate the problem by (a) discarding garbage senses and (b) additionally
inducing fine-grained word senses. Results show great improvements over the
state-of-the-art models on popular WSI datasets. We also show that AutoSense is
able to learn the appropriate sense granularity of a word. Finally, we apply
AutoSense to the unsupervised author name disambiguation task where the sense
granularity problem is more evident, and show that AutoSense is clearly better
than competing models. We share our data and code here:
https://github.com/rktamplayo/AutoSense.
Comment: AAAI 201
Evidentiality-aware Retrieval for Overcoming Abstractiveness in Open-Domain Question Answering
The long-standing goal of dense retrievers in abstractive open-domain question
answering (ODQA) tasks is to learn to capture evidence passages among relevant
passages for any given query, such that the reader produces factually correct
outputs from evidence passages. One of the key challenges is the insufficient
amount of training data with the supervision of the answerability of the
passages. Recent studies rely on iterative pipelines to annotate answerability
using signals from the reader, but their high computational costs hamper
practical applications. In this paper, we instead focus on a data-centric
approach and propose Evidentiality-Aware Dense Passage Retrieval (EADPR), which
leverages synthetic distractor samples to learn to discriminate evidence
passages from distractors. We conduct extensive experiments to validate the
effectiveness of our proposed method on multiple abstractive ODQA tasks.
Comment: Findings of EACL 202
Myotonic Dystrophy Type 1 Presenting as Male Infertility
Myotonic dystrophy 1 (DM1) is a multi-system disorder characterized by endocrine defects that include testicular and tubular atrophy, oligospermia and azoospermia, and increased follicle-stimulating hormone levels. We describe a rare case of DM1 presenting as infertility in a 29-year-old man
A High-Yield Fabrication Process for Silicon Neural Probes
There is a great need for silicon microelectrodes that can simultaneously
monitor the activity of many neurons in the brain. However,
one of the existing processes for fabricating silicon microelectrodes, reactive-ion
etching in combination with anisotropic KOH etching, breaks
down at the wet-etching step for device release. Here we describe a modified
wet-etching sidewall-protection technique for the high-yield fabrication of
well-defined silicon probe structures, using a Teflon® shield and low-pressure
chemical vapor deposition (LPCVD) silicon nitride. In the proposed
method, a micro-tab holds each individual probe to the central scaffold, allowing
uniform anisotropic KOH etching. Using this approach, we obtained
a well-defined probe structure without device loss during the wet-etching
process. This simple method yielded more accurate fabrication and an improved
mechanical profile.
This work was supported in part by the Korea Science and Engineering Foundation (KOSEF) through the Nano-Bioelectronics and Systems Research Center, Seoul National University.
Coffee: Boost Your Code LLMs by Fixing Bugs with Feedback
Code editing is an essential step towards reliable program synthesis to
automatically correct critical errors generated from code LLMs. Recent studies
have demonstrated that closed-source LLMs (e.g., ChatGPT and GPT-4) are capable
of generating corrective feedback to edit erroneous inputs. However, it remains
challenging for open-source code LLMs to generate feedback for code editing,
since these models tend to adhere to the superficial formats of feedback and
provide feedback with misleading information. Hence, the focus of our work is
to leverage open-source code LLMs to generate helpful feedback with correct
guidance for code editing. To this end, we present Coffee, a collected dataset
specifically designed for code fixing with feedback. Using this dataset, we
construct CoffeePots, a framework for COde Fixing with FEEdback via
Preference-Optimized Tuning and Selection. The proposed framework aims to
automatically generate helpful feedback for code editing while minimizing the
potential risk of superficial feedback. The combination of Coffee and
CoffeePots marks a significant advancement, achieving state-of-the-art
performance on the HumanEvalFix benchmark. Code and model checkpoints are publicly
available at https://github.com/Lune-Blue/COFFEE.
Comment: Work in progress
Challenges in Diagnosing Narcolepsy and Idiopathic Hypersomnia
Narcolepsy and idiopathic hypersomnia are central disorders of hypersomnolence accompanied by excessive daytime sleepiness, which are not caused by nocturnal sleep disturbance, sleep deficiency, or circadian rhythm sleep disorders. Several studies have questioned the repeatability of the Multiple Sleep Latency Test (MSLT) in type 2 narcolepsy (NT2) patients. After two or more repeated MSLTs, the diagnosis of type 1 narcolepsy (NT1) is maintained in more than 90% of cases, while only half of the NT2 patients retain their original diagnosis. The diagnosis of NT2 may shift to idiopathic hypersomnia based on the MSLT results, making the differential diagnosis of NT2 and idiopathic hypersomnia particularly challenging. Therefore, this study suggests the need for new tests in addition to the MSLT for diagnostic consistency in NT2 and idiopathic hypersomnia