Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction
For large, real-world inductive learning problems, the number of training
examples often must be limited due to the costs associated with procuring,
preparing, and storing the training examples and/or the computational costs
associated with learning from them. In such circumstances, one question of
practical importance is: if only n training examples can be selected, in what
proportion should the classes be represented? In this article we help to answer
this question by analyzing, for a fixed training-set size, the relationship
between the class distribution of the training data and the performance of
classification trees induced from these data. We study twenty-six data sets
and, for each, determine the best class distribution for learning. The
naturally occurring class distribution is shown to generally perform well when
classifier performance is evaluated using undifferentiated error rate (0/1
loss). However, when the area under the ROC curve is used to evaluate
classifier performance, a balanced distribution is shown to perform well. Since
neither of these choices for class distribution always generates the
best-performing classifier, we introduce a budget-sensitive progressive
sampling algorithm for selecting training examples based on the class
associated with each example. An empirical analysis of this algorithm shows
that the class distribution of the resulting training set yields classifiers
with good (nearly optimal) classification performance.
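The question posed in the abstract above (how to split a fixed budget of n training examples between the classes) can be illustrated with a minimal sketch. This is not the authors' budget-sensitive progressive sampling algorithm, which the abstract does not specify; it only shows how a training set of fixed size with a chosen class distribution might be drawn. The function name, its parameters, and the two-class setting are assumptions made for illustration.

```python
import random


def sample_with_class_distribution(examples, labels, n, minority_fraction,
                                   minority_label, seed=0):
    """Draw n examples so that a chosen fraction belongs to one class.

    Hypothetical helper: assumes a two-class problem in which every label
    other than minority_label counts as the majority class.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]

    # Fixed budget n, split according to the requested class distribution.
    n_minority = round(n * minority_fraction)
    n_majority = n - n_minority
    if n_minority > len(minority) or n_majority > len(majority):
        raise ValueError("not enough examples of one class for this distribution")

    # Sample without replacement within each class, then shuffle the result.
    chosen = rng.sample(minority, n_minority) + rng.sample(majority, n_majority)
    rng.shuffle(chosen)
    return [examples[i] for i in chosen], [labels[i] for i in chosen]
```

Calling the helper with `minority_fraction` set to the natural class proportion or to 0.5 gives the two distributions the abstract compares; a learner would then be trained on each sample and scored by error rate or by area under the ROC curve.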
Statistics of statisticians: Critical mass of statistics and operational research groups in the UK
Using a recently developed model, inspired by mean field theory in
statistical physics, and data from the UK's Research Assessment Exercise, we
analyse the relationship between the quality of statistics and operational
research groups and the quantity of researchers in them. Similar to other academic
disciplines, we provide evidence for a linear dependency of quality on quantity
up to an upper critical mass, which is interpreted as the average maximum
number of colleagues with whom a researcher can communicate meaningfully within
a research group. The model also predicts a lower critical mass, which research
groups should strive to achieve to avoid extinction. For statistics and
operational research, the lower critical mass is estimated to be 9 ± 3. The
upper critical mass, beyond which research quality does not significantly
depend on group size, is about twice this value.
Evaluation of the micro-carburetor
A prototype sonic, variable-venturi automotive carburetor was evaluated for its effects on vehicle performance, fuel economy, and exhaust emissions. A 350 CID Chevrolet Impala vehicle was tested on a chassis dynamometer over the 1975 Federal Test Procedure, urban driving cycle. The Micro-carburetor was tested and compared with stock and modified-stock engine configurations. Subsequently, the test vehicle's performance characteristics were examined with the stock carburetor and again with the Micro-carburetor in a series of on-road driveability tests. The test engine was then removed from the vehicle and installed on an engine dynamometer. Engine tests were conducted to compare the fuel economy, thermal efficiency, and cylinder-to-cylinder mixture distribution of the Micro-carburetor with those of the stock configuration. Test results show increases in thermal efficiency and improvements in fuel economy at all test conditions. Improved fuel/air mixture preparation is implied from the information presented. Further improvements in fuel economy and exhaust emissions are possible through a detailed recalibration of the Micro-carburetor.
Relevance of multiple-quasiparticle tunneling between edge states at ν = p/(2np+1)
We present an explanation for the anomalous behavior in tunneling conductance
and noise through a point contact between edge states in the Jain series
ν = p/(2np+1), for extremely weak backscattering and low temperatures [Y.C.
Chung, M. Heiblum, and V. Umansky, Phys. Rev. Lett. 91, 216804 (2003)].
We consider edge states with neutral modes propagating at finite velocity, and
we show that the activation of their dynamics causes the unexpected change in
the temperature power-law of the conductance. Even more importantly, we
demonstrate that multiple-quasiparticle tunneling at low energies becomes the
most relevant process. This result will be used to explain the experimental
data on current noise where tunneling particles have a charge that can reach
times the single quasiparticle charge. In this paper we analyze the
conductance and the shot noise to substantiate quantitatively the proposed
scenario.
Comment: 4 pages, 2 figures
Deformation of grain boundaries in polar ice
The ice microstructure (grain boundaries) is a key feature used to study ice
evolution and to investigate past climatic changes. We studied a deep ice core,
in Dome Concordia, Antarctica, which records past mechanical deformations. We
measured a "texture tensor" which characterizes the pattern geometry and
reveals local heterogeneities of deformation along the core. These results
question key assumptions of the current models used for dating.
QCD NLO with Powheg matching and top threshold matching in WHIZARD
We present the status of the automation of NLO processes within the event
generator WHIZARD. The program provides an automated FKS subtraction and phase
space integration over the FKS regions, while the (QCD) NLO matrix element is
accessed via the Binoth Les Houches Interface from an externally linked
one-loop program. Massless and massive test cases and validation are shown for
several e+e- processes. Furthermore, we discuss work in progress and future
plans. The second part covers the matching of the NRQCD prediction with NLL
threshold resummation to the NLO continuum top pair production at lepton
colliders. Both the S-wave and P-wave production of the top pair are taken into
account in the resummation. The inclusion in WHIZARD allows the study of more
exclusive observables than just the total cross section and automatically
accounts for important electroweak and relativistic corrections in the
threshold region.
Comment: 9 pages, 3 figures, talk given at the 12th International Symposium on
Radiative Corrections (Radcor 2015) and LoopFest XIV (Radiative Corrections
for the LHC and Future Colliders); v2: reference added
Theory of Nonlinear Dispersive Waves and Selection of the Ground State
A theory of time dependent nonlinear dispersive equations of the Schroedinger
/ Gross-Pitaevskii and Hartree type is developed. The short, intermediate and
large time behavior is found, by deriving nonlinear Master equations (NLME),
governing the evolution of the mode powers, and by a novel multi-time scale
analysis of these equations. The scattering theory is developed and coherent
resonance phenomena and associated lifetimes are derived. Applications include
BEC large time dynamics and nonlinear optical systems. The theory reveals a
nonlinear transition phenomenon, ``selection of the ground state'', and NLME
predicts the decay of the excited state, with half its energy transferred to the
ground state and half to radiation modes. Our results predict the recent
experimental observations of Mandelik et al. in nonlinear optical waveguides.
Diffusive behavior of a greedy traveling salesman
Using Monte Carlo simulations we examine the diffusive properties of the
greedy algorithm in the d-dimensional traveling salesman problem. Our results
show that for d=3 and 4 the average squared distance from the origin is
proportional to the number of steps t. In the d=2 case such a scaling is
modified with some logarithmic corrections, which might suggest that d=2 is the
critical dimension of the problem. The distribution of lengths also shows
marked differences between d=2 and d>2 versions. A simple strategy adopted by
the salesman might resemble strategies chosen by some foraging and hunting
animals, for which anomalous diffusive behavior has recently been reported and
interpreted in terms of Levy flights. Our results suggest that broad and
Levy-like distributions in such systems might appear due to dimension-dependent
properties of a search space.
Comment: accepted in Phys. Rev.
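The kind of Monte Carlo experiment described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes cities drawn uniformly from the cube [-1, 1]^d, a salesman starting at the origin, and hypothetical function names. To limit boundary effects, the number of steps should be kept well below the number of cities.

```python
import random


def greedy_walk(num_cities, dim, steps, rng):
    """Greedy nearest-neighbour walk over uniformly random cities.

    Returns the squared distance from the origin after each step.
    """
    if steps > num_cities:
        raise ValueError("steps must not exceed num_cities")
    # Cities uniform in the cube [-1, 1]^dim (an assumed geometry).
    cities = [tuple(rng.uniform(-1.0, 1.0) for _ in range(dim))
              for _ in range(num_cities)]
    position = (0.0,) * dim  # the salesman starts at the origin
    unvisited = set(range(num_cities))
    squared_distances = []
    for _ in range(steps):
        # Greedy rule: move to the nearest city not yet visited.
        nearest = min(
            unvisited,
            key=lambda i: sum((c - p) ** 2
                              for c, p in zip(cities[i], position)),
        )
        unvisited.remove(nearest)
        position = cities[nearest]
        squared_distances.append(sum(x * x for x in position))
    return squared_distances


def mean_squared_displacement(num_cities, dim, steps, runs, seed=0):
    """Average the squared distance from the origin over independent runs."""
    rng = random.Random(seed)
    totals = [0.0] * steps
    for _ in range(runs):
        for t, r2 in enumerate(greedy_walk(num_cities, dim, steps, rng)):
            totals[t] += r2
    return [total / runs for total in totals]
```

Plotting the returned averages against the step number t for d = 2, 3 and 4 would be one way to look for the linear scaling, and the d = 2 corrections, that the abstract reports. The brute-force nearest-city search is quadratic in the number of cities, so a spatial index would be needed for large systems.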
Imaging Pauli repulsion in scanning tunneling microscopy
A scanning tunneling microscope (STM) has been equipped with a nanoscale
force sensor and signal transducer composed of a single D2 molecule that is
confined in the STM junction. The uncalibrated sensor is used to obtain
ultra-high geometric image resolution of a complex organic molecule adsorbed on
a noble metal surface. By means of conductance-distance spectroscopy and
corresponding density functional calculations the mechanism of the
sensor/transducer is identified. It probes the short-range Pauli repulsion and
converts this signal into variations of the junction conductance.
Comment: 4 pages, 4 figures, accepted to Phys. Rev. Lett.