On Using Active Learning and Self-Training when Mining Performance Discussions on Stack Overflow
Abundant data is the key to successful machine learning. However, supervised
learning requires annotated data that are often hard to obtain. In a
classification task with limited resources, Active Learning (AL) promises to
guide annotators to examples that bring the most value for a classifier. AL can
be successfully combined with self-training, i.e., extending a training set
with the unlabelled examples for which a classifier is the most certain. We
report our experiences on using AL in a systematic manner to train an SVM
classifier for Stack Overflow posts discussing performance of software
components. We show that the training examples deemed as the most valuable to
the classifier are also the most difficult for humans to annotate. Despite
carefully evolved annotation criteria, we report low inter-rater agreement, but
we also propose mitigation strategies. Finally, based on one annotator's work,
we show that self-training can improve the classification accuracy. We conclude
the paper by discussing implications for future text miners aspiring to use AL
and self-training.
Comment: Preprint of paper accepted for the Proc. of the 21st International
Conference on Evaluation and Assessment in Software Engineering, 201
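The two selection rules described in this abstract can be illustrated with a minimal sketch. It assumes the classifier exposes a signed margin per unlabelled post (as an SVM decision function would); the function names, the threshold, and the sample margins are hypothetical and are not taken from the paper.

```python
def select_for_annotation(margins, k):
    """Active learning step: indices of the k unlabelled examples
    closest to the decision boundary (smallest absolute margin)."""
    ranked = sorted(range(len(margins)), key=lambda i: abs(margins[i]))
    return ranked[:k]


def select_for_self_training(margins, threshold):
    """Self-training step: (index, pseudo_label) pairs for examples whose
    absolute margin is at least `threshold` (a hypothetical cut-off)."""
    return [
        (i, 1 if m > 0 else 0)
        for i, m in enumerate(margins)
        if abs(m) >= threshold
    ]


# Hypothetical signed margins for five unlabelled posts.
margins = [0.05, -1.8, 0.9, -0.1, 2.3]

to_annotate = select_for_annotation(margins, k=2)
pseudo_labelled = select_for_self_training(margins, threshold=1.5)
```

Here the uncertain posts go to human annotators, while the confidently classified ones are added to the training set with the classifier's own label.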
Compressible turbulent boundary layer interaction experiments
Four phases of research results are reported: (1) experiments on the compressible turbulent boundary layer flow in a streamwise corner; (2) the two dimensional (2D) interaction of incident shock waves with a compressible turbulent boundary layer; (3) three dimensional (3D) shock/boundary layer interactions; and (4) cooperative experiments at Princeton and numerical computations at NASA-Ames
The Perfect Lion: The Life and Death of Confederate Artillerist John Pelham
Fresh Biographical Sketch of Confederate Artillerist
Four books have now been written on Confederate artillerist John Pelham who, at the time of his death on March 18, 1863, following the battle of Kelly's Ford, Virginia, was but a major, a twenty-four-year-old major. Thousands of Civil W...
Opportunities and challenges grow from Arabidopsis genome sequencing
A recent Cold Spring Harbor Laboratory meeting in December 1997 provided the first meeting on the Arabidopsis genome featuring a unique combination of functional studies and sequencing efforts; it included a broad range of talks covering genome sequencing and analysis efforts, mapping and defining genes, and gene expression patterns and function. Significant points to come out of the meeting were that a number of international consortiums have completed substantial portions of sequence on all five chromosomes with 17 Mb of sequence currently available through various web pages and 8 Mb of annotated sequence available through GenBank. Although physical maps of three of the five chromosomes have not yet been completed, David Bouchez (INRA, Versailles, France) reported that >90% of the clones in the CIC (CNRS, INRA, CEPH) Arabidopsis YAC library have been anchored via hybridization to genetically mapped markers. This should greatly facilitate the construction of physical maps. Michael Mindrinos from the Ausubel laboratory (Massachusetts General Hospital, Boston, MA) reported the development of a new class of PCR-based marker, the SNAPs (single nucleotide amplified polymorphisms), which should greatly assist positional cloning efforts. Daphne Preuss (University of Chicago, IL) reported the use of tetrad analysis to place the centromeres on the genetic map (Fig. 1), taking advantage of the pollen mutant quartet1 (Preuss et al. 1994; Copenhaver et al. 1998). Interestingly, this analysis placed the centromeres very close to, but not necessarily within, the centromeric repeat blocks mapped recently by Round et al. (1997)
Substance Abuse via Legally Prescribed Drugs: The Case of Vicodin in the United States
Vicodin is the most commonly prescribed pain reliever in the United States.
Research indicates that there are two million people who are currently abusing
Vicodin, and the majority of those who abuse Vicodin were initially exposed to
it via prescription. Our goal is to determine the most effective strategies for
reducing the overall population of Vicodin abusers. More specifically, we focus
on whether prevention methods aimed at educating doctors and patients on the
potential for drug abuse or treatment methods implemented after a person abuses
Vicodin will have a greater overall impact. We consider one linear and two
non-linear compartmental models in which medical users of Vicodin can
transition into the abuser compartment or leave the population by no longer
taking the drug. Once Vicodin abusers, people can transition into a treatment
compartment, with the possibility of leaving the population through successful
completion of treatment or of relapsing and re-entering the abusive
compartment. The linear model assumes no social interaction, while both
non-linear models consider interaction. One considers interaction with abusers
affecting the relapse rate, while the other assumes both this and an additional
interaction between the number of abusers and the number of new prescriptions.
Sensitivity analyses are conducted varying the rates of success of these
intervention methods measured by the parameters to determine which strategy has
the greatest impact on controlling the population of Vicodin abusers. From
these models and analyses, we determine that manipulating parameters tied to
prevention measures has a greater impact on reducing the population of abusers
than manipulating parameters associated with treatment. We also note that
increasing the rate at which abusers seek treatment affects the population of
abusers more than the success rate of treatment itself
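As a rough illustration of the linear (no social interaction) compartmental structure described in this abstract, the sketch below integrates three compartments with forward Euler. The abstract does not give the equations, so the constant prescription inflow and every rate value here are assumptions chosen only for illustration.

```python
def simulate(t_end, dt, inflow=100.0, alpha=0.1, delta=0.4,
             gamma=0.2, sigma=0.3, rho=0.2):
    """Forward-Euler integration of a linear three-compartment sketch.

    M: medical users, A: abusers, T: people in treatment.
    All parameters are hypothetical:
      inflow - new prescriptions per unit time
      alpha  - rate at which medical users become abusers
      delta  - rate at which medical users stop taking the drug
      gamma  - rate at which abusers enter treatment
      sigma  - rate of successful treatment (leaving the population)
      rho    - relapse rate from treatment back to abuse
    """
    M, A, T = 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        dM = inflow - (alpha + delta) * M
        dA = alpha * M + rho * T - gamma * A
        dT = gamma * A - (sigma + rho) * T
        M, A, T = M + dt * dM, A + dt * dA, T + dt * dT
    return M, A, T


def steady_state(inflow=100.0, alpha=0.1, delta=0.4,
                 gamma=0.2, sigma=0.3, rho=0.2):
    """Closed-form equilibrium of the same linear system."""
    M = inflow / (alpha + delta)
    A = alpha * M * (sigma + rho) / (gamma * sigma)
    T = gamma * A / (sigma + rho)
    return M, A, T


final = simulate(t_end=200.0, dt=0.01)
equilibrium = steady_state()
```

A sensitivity analysis in this setting would rerun `simulate` or `steady_state` while varying one rate at a time (for example `alpha` for prevention, or `gamma` and `sigma` for treatment) and compare the resulting abuser population.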
Non-exponential kinetic behavior of confined water
We present the results of molecular dynamics simulations of SPC/E water
confined in a realistic model of a silica pore. The single-particle dynamics
have been studied at ambient temperature for different hydration levels. The
confinement near the hydrophilic surface makes the dynamic behaviour of the
liquid strongly dependent on the hydration level. Upon decrease of the number
of water molecules in the pore we observe the onset of a slow dynamics due to
the "cage effect". The conventional picture of a stochastic single-particle
diffusion process thus loses its validity
Draft genome sequence of Pseudomonas moraviensis R28-S
We report the draft genome sequence of Pseudomonas moraviensis R28-S, isolated from the municipal wastewater treatment plant of Moscow, ID. The strain carries a native mercury resistance plasmid, poorly maintains introduced IncP-1 antibiotic resistance plasmids, and has been useful for studying the evolution of plasmid host range and stability
Active Sampling-based Binary Verification of Dynamical Systems
Nonlinear, adaptive, or otherwise complex control techniques are increasingly
relied upon to ensure the safety of systems operating in uncertain
environments. However, the nonlinearity of the resulting closed-loop system
complicates verification that the system does in fact satisfy those
requirements at all possible operating conditions. While analytical proof-based
techniques and finite abstractions can be used to provably verify the
closed-loop system's response at different operating conditions, they often
produce conservative approximations due to restrictive assumptions and are
difficult to construct in many applications. In contrast, popular statistical
verification techniques relax the restrictions and instead rely upon
simulations to construct statistical or probabilistic guarantees. This work
presents a data-driven statistical verification procedure that instead
constructs statistical learning models from simulated training data to separate
the set of possible perturbations into "safe" and "unsafe" subsets. Binary
evaluations of closed-loop system requirement satisfaction at various
realizations of the uncertainties are obtained through temporal logic
robustness metrics, which are then used to construct predictive models of
requirement satisfaction over the full set of possible uncertainties. As the
accuracy of these predictive statistical models is inherently coupled to the
quality of the training data, an active learning algorithm selects additional
sample points in order to maximize the expected change in the data-driven model
and thus, indirectly, minimize the prediction error. Various case studies
demonstrate the closed-loop verification procedure and highlight improvements
in prediction error over both existing analytical and statistical verification
techniques.
Comment: 23 pages
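The binary labelling step mentioned in this abstract can be sketched for a very simple requirement. The example assumes a toy scalar system and the requirement "the state always stays below a limit", for which a standard robustness value is the smallest margin to the limit along the trajectory. The system, the limit, and the parameter values are hypothetical, and the active selection of new sample points is not shown.

```python
def simulate(theta, steps=20, x0=0.0):
    """Trajectory of the toy closed-loop system x[k+1] = theta * x[k] + 1.
    `theta` plays the role of an uncertain parameter (hypothetical)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(theta * xs[-1] + 1.0)
    return xs


def robustness_always_below(trajectory, limit):
    """Robustness of 'always x < limit': the minimum margin limit - x
    over the whole trajectory. Positive means the requirement holds."""
    return min(limit - x for x in trajectory)


def is_safe(theta, limit=5.0):
    """Binary label derived from the sign of the robustness value."""
    return robustness_always_below(simulate(theta), limit) > 0.0


# Label a few sampled realizations of the uncertain parameter.
samples = [0.5, 0.7, 0.9]
labels = {theta: is_safe(theta) for theta in samples}
```

Labels of this kind would form the training data for a classifier that separates the parameter set into "safe" and "unsafe" regions.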
A Probe of New Physics in Top Quark Pair Production at Colliders
We describe how to probe new physics through examination of the form factors
describing the Ztt couplings via the scattering process e^-e^+->t+tbar. We
focus on experimental methods on how the top quark momentum can be determined
and show how this can be applied to select polarized samples of
pairs through the angular correlations in the final state leptons. We also
study the dependence on the energy and luminosity of an e^-e^+ collider to probe
a CP violating asymmetry at the level.
Comment: 24 pages in TeXsis (figures available upon request) (revised July
1993)
Multiple-scattering effects on incoherent neutron scattering in glasses and viscous liquids
Incoherent neutron scattering experiments are simulated for simple dynamic
models: a glass (with a smooth distribution of harmonic vibrations) and a
viscous liquid (described by schematic mode-coupling equations). In most
situations multiple scattering has little influence upon spectral
distributions, but it completely distorts the wavenumber-dependent amplitudes.
This explains an anomaly observed in recent experiments