Improving Negative Sampling for Word Representation using Self-embedded Features
Although the word-popularity based negative sampler has shown superb
performance in the skip-gram model, the theoretical motivation behind
oversampling popular (non-observed) words as negative samples is still not well
understood. In this paper, we start from an investigation of the gradient
vanishing issue in the skip-gram model without a proper negative sampler. By
performing an insightful analysis from the stochastic gradient descent (SGD)
learning perspective, we demonstrate that, both theoretically and intuitively,
negative samples with larger inner product scores are more informative than
those with lower scores for the SGD learner in terms of both convergence rate
and accuracy. Understanding this, we propose an alternative sampling algorithm
that dynamically selects informative negative samples during each SGD update.
More importantly, the proposed sampler accounts for multi-dimensional
self-embedded features during the sampling process, which essentially makes it
more effective than the original popularity-based (one-dimensional) sampler.
Empirical experiments further verify our observations, and show that our
fine-grained samplers gain significant improvement over the existing ones
without increasing computational complexity. Comment: Accepted in WSDM 201
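The claim that negatives with larger inner-product scores are more informative can be illustrated with a minimal sketch. This is not the authors' sampler: the function names, the uniform candidate pool, and `pool_size` are assumptions made for illustration. It relies only on the standard skip-gram negative-sampling loss, -log sigmoid(-u.v), whose gradient with respect to u is sigmoid(u.v) * v, so a negative with a very low score contributes an almost vanishing update.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def negative_gradient_scale(center_vec, negative_vec):
    """Scale factor sigmoid(u.v) of the gradient contributed by one negative sample."""
    return sigmoid(dot(center_vec, negative_vec))


def pick_informative_negative(center_vec, candidate_vecs, observed,
                              pool_size=5, rng=random):
    """Hypothetical dynamic sampler: draw a small uniform pool of non-observed
    candidates and return the index with the largest inner product."""
    candidates = [i for i in range(len(candidate_vecs)) if i not in observed]
    pool = rng.sample(candidates, min(pool_size, len(candidates)))
    return max(pool, key=lambda i: dot(center_vec, candidate_vecs[i]))


if __name__ == "__main__":
    center = [1.0, 0.0]
    vecs = [[2.0, 0.0], [-3.0, 0.0], [0.5, 0.0], [1.0, 0.0]]
    chosen = pick_informative_negative(center, vecs, observed={0}, pool_size=10)
    print(chosen, negative_gradient_scale(center, vecs[chosen]))
```

Scoring only a small pool keeps the per-update cost bounded, which is one plausible way to avoid scanning the whole vocabulary; the paper's actual mechanism may differ.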
Conduction of Ultracold Fermions Through a Mesoscopic Channel
In a mesoscopic conductor, electric resistance is detected even if the device
is defect-free. We engineer and study a cold-atom analog of a mesoscopic
conductor. It consists of a narrow channel connecting two macroscopic
reservoirs of fermions that can be switched from ballistic to diffusive. We
induce a current through the channel and find ohmic conduction, even for a
ballistic channel. An analysis of in-situ density distributions shows that in
the ballistic case the chemical potential drop occurs at the entrance and exit
of the channel, revealing the presence of contact resistance. In contrast, a
diffusive channel with disorder displays a chemical potential drop spread over
the whole channel. Our approach opens the way towards quantum simulation of
mesoscopic devices with quantum gases.
Development and external validation of an automated computer-aided risk score for predicting sepsis in emergency medical admissions using the patient’s first electronically recorded vital signs and blood test results
Objectives: To develop a logistic regression model to predict the risk of sepsis following emergency medical admission using the patient’s first, routinely collected, electronically recorded vital signs and blood test results and to validate this novel computer-aided risk of sepsis model, using data from another hospital.
Design: Cross-sectional model development and external validation study reporting the C-statistic based on a validated optimized algorithm to identify sepsis and severe sepsis (including septic shock) from administrative hospital databases using International Classification of Diseases, 10th Edition, codes.
Setting: Two acute hospitals (York Hospital - development data; Northern Lincolnshire and Goole Hospital - external validation data).
Patients: Adult emergency medical admissions discharged over a 24-month period with vital signs and blood test results recorded at admission.
Interventions: None.
Main Results: The prevalence of sepsis and severe sepsis was lower in York Hospital (18.5% = 4,861/26,247; 5.3% = 1,387/26,247) than in Northern Lincolnshire and Goole Hospital (25.1% = 7,773/30,996; 9.2% = 2,864/30,996). The mortality for sepsis (York Hospital: 14.5% = 704/4,861; Northern Lincolnshire and Goole Hospital: 11.6% = 899/7,773) was lower than the mortality for severe sepsis (York Hospital: 29.0% = 402/1,387; Northern Lincolnshire and Goole Hospital: 21.4% = 612/2,864). The C-statistic for computer-aided risk of sepsis in York Hospital (all sepsis 0.78; sepsis: 0.73; severe sepsis: 0.80) was similar in an external hospital setting (Northern Lincolnshire and Goole Hospital: all sepsis 0.79; sepsis: 0.70; severe sepsis: 0.81). A cutoff value of 0.2 gives reasonable performance.
Conclusions: We have developed a novel, externally validated computer-aided risk of sepsis, with reasonably good performance for estimating the risk of sepsis for emergency medical admissions using the patient’s first, electronically recorded, vital signs and blood test results. Since computer-aided risk of sepsis places no additional data collection burden on clinicians and is automated, it may now be carefully introduced and evaluated in hospitals with sufficient informatics infrastructure. Health Foundatio
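For readers unfamiliar with the C-statistic reported above, the following is a minimal sketch of how it can be computed from predicted risks and observed outcomes. It is not the study's code: the function names are hypothetical, the toy data are invented for illustration, and treating a risk equal to the cutoff as flagged is an assumption. The C-statistic is the proportion of (case, non-case) pairs in which the case received the higher predicted risk, with ties counted as one half.

```python
def c_statistic(risks, outcomes):
    """Concordance (C-statistic) of predicted risks against binary outcomes (1 = case)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(cases) * len(non_cases))


def flag_at_cutoff(risks, cutoff=0.2):
    """Flag admissions whose predicted risk reaches the cutoff (>= is an assumption)."""
    return [r >= cutoff for r in risks]


if __name__ == "__main__":
    # Toy values only; these are not data from the study.
    risks = [0.9, 0.3, 0.25, 0.1]
    outcomes = [1, 0, 1, 0]
    print(c_statistic(risks, outcomes))
    print(flag_at_cutoff(risks))
```

The pairwise loop is quadratic in the number of admissions; for cohorts of the size reported, a rank-based computation would be the more practical choice.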
Energy efficiency of information transmission by electrically coupled neurons
The generation of spikes by neurons is energetically a costly process. This
paper studies the consumption of energy and the information entropy in the
signalling activity of a model neuron both when it is supposed isolated and
when it is coupled to another neuron by an electrical synapse. The neuron has
been modelled by a four-dimensional Hindmarsh-Rose type kinetic model for which
an energy function has been deduced. For the isolated neuron, values of energy
consumption and information entropy at different signalling regimes have been
computed. For two neurons coupled by a gap junction, we have analyzed the roles
of the membrane and synapse in the contribution of the energy that is required
for their organized signalling. Computational results are provided for cases of
identical and nonidentical neurons coupled by unidirectional and bidirectional
gap junctions. One relevant result is that there are values of the coupling
strength at which the organized signalling of two neurons induced by the gap
junction takes place at relatively low values of energy consumption and the
ratio of mutual information to energy consumption is relatively high.
Therefore, communicating at these coupling values could be energetically the
most efficient option.
Recognizing Treelike k-Dissimilarities
A k-dissimilarity D on a finite set X, |X| >= k, is a map from the set of
size k subsets of X to the real numbers. Such maps naturally arise from
edge-weighted trees T with leaf-set X: Given a subset Y of X of size k, D(Y) is
defined to be the total length of the smallest subtree of T with leaf-set Y.
In case k = 2, it is well known that 2-dissimilarities arising in this way can
be characterized by the so-called "4-point condition". However, in case k > 2,
Pachter and Speyer recently posed the following question: Given an arbitrary
k-dissimilarity, how do we test whether this map comes from a tree? In this
paper, we provide an answer to this question, showing that for k >= 3 a
k-dissimilarity on a set X arises from a tree if and only if its restriction to
every 2k-element subset of X arises from some tree, and that 2k is the least
possible subset size to ensure that this is the case. As a corollary, we show
that there exists a polynomial-time algorithm to determine when a
k-dissimilarity arises from a tree. We also give a 6-point condition for
determining when a 3-dissimilarity arises from a tree, that is similar to the
aforementioned 4-point condition. Comment: 18 pages, 4 figure
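The classical 4-point condition for the case k = 2 mentioned in this abstract can be sketched as follows. This is only the well-known k = 2 test, not the paper's 6-point condition or its algorithm for k >= 3; the function name, the matrix representation, and the tolerance are assumptions. For every four elements x, y, z, w (not necessarily distinct), the two largest of D(x,y) + D(z,w), D(x,z) + D(y,w), and D(x,w) + D(y,z) must be equal.

```python
from itertools import combinations_with_replacement


def satisfies_four_point(D, tol=1e-9):
    """Check the 4-point condition on a symmetric matrix D with zero diagonal.

    Quadruples may repeat elements, so the check also covers the
    triangle inequality.
    """
    n = len(D)
    for x, y, z, w in combinations_with_replacement(range(n), 4):
        sums = sorted([
            D[x][y] + D[z][w],
            D[x][z] + D[y][w],
            D[x][w] + D[y][z],
        ])
        if sums[2] - sums[1] > tol:  # the two largest sums must coincide
            return False
    return True


if __name__ == "__main__":
    # Quartet tree 01|23 with all edge lengths equal to 1.
    tree_like = [
        [0, 2, 3, 3],
        [2, 0, 3, 3],
        [3, 3, 0, 2],
        [3, 3, 2, 0],
    ]
    # Shortest-path distances on a 4-cycle, which do not come from a tree.
    cycle = [
        [0, 1, 2, 1],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [1, 2, 1, 0],
    ]
    print(satisfies_four_point(tree_like), satisfies_four_point(cycle))
```

The check enumerates all quadruples, so it runs in time proportional to the fourth power of |X|, which is consistent with the polynomial-time flavour of the result described above.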
SPECTRE: a Suite of PhylogEnetiC Tools for Reticulate Evolution
Split-networks are a generalization of phylogenetic trees that have proven to be a powerful tool in phylogenetics. Various ways have been developed for computing such networks, including split-decomposition, NeighborNet, QNet and FlatNJ. Some of these approaches are implemented in the user-friendly SplitsTree software package. However, to give the user the option to adjust and extend these approaches and to facilitate their integration into analysis pipelines, there is a need for robust, open-source implementations of associated data structures and algorithms. Here we present SPECTRE, a readily available, open-source library of data structures written in Java, that comes complete with new implementations of several pre-published algorithms and a basic interactive graphical interface for visualizing planar split networks. SPECTRE also supports the use of longer running algorithms by providing command line interfaces, which can be executed on servers or in High Performance Computing (HPC) environments.
Parsing Arabic Dialects
The Arabic language is a collection of spoken dialects with important phonological, morphological, lexical, and syntactic differences, along with a standard written language, Modern Standard Arabic (MSA). Since the spoken dialects are not officially written, it is very costly to obtain adequate corpora to use for training dialect NLP tools such as parsers. In this paper, we address the problem of parsing transcribed spoken Levantine Arabic (LA). We do not assume the existence of any annotated LA corpus (except for development and testing), nor of a parallel corpus LA-MSA. Instead, we use explicit knowledge about the relation between LA and MSA.
Separating the influences of prereading skills on early word and nonword reading
The essential first step for a beginning reader is to learn to match printed forms to phonological representations. For a new word, this is an effortful process where each grapheme must be translated individually (serial decoding). The role of phonological awareness in developing a decoding strategy is well known. We examined whether beginning readers recruit different skills depending on the nature of the words being read (familiar words vs. nonwords). Print knowledge, phoneme and rhyme awareness, rapid automatized naming (RAN), phonological short-term memory (STM), nonverbal reasoning, vocabulary, auditory skills, and visual attention were measured in 392 prereaders 4 and 5 years of age. Word and nonword reading were measured 9 months later. We used structural equation modeling to examine the skills–reading relationship and modeled correlations between our two reading outcomes and among all prereading skills. We found that a broad range of skills were associated with reading outcomes: early print knowledge, phonological STM, phoneme awareness and RAN. Whereas all of these skills were directly predictive of nonword reading, early print knowledge was the only direct predictor of word reading. Our findings suggest that beginning readers draw most heavily on their existing print knowledge to read familiar words.