
    Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction

    For large, real-world inductive learning problems, the number of training examples often must be limited due to the costs associated with procuring, preparing, and storing the training examples and/or the computational costs associated with learning from them. In such circumstances, one question of practical importance is: if only n training examples can be selected, in what proportion should the classes be represented? In this article we help to answer this question by analyzing, for a fixed training-set size, the relationship between the class distribution of the training data and the performance of classification trees induced from these data. We study twenty-six data sets and, for each, determine the best class distribution for learning. The naturally occurring class distribution is shown to generally perform well when classifier performance is evaluated using undifferentiated error rate (0/1 loss). However, when the area under the ROC curve is used to evaluate classifier performance, a balanced distribution is shown to perform well. Since neither of these choices for class distribution always generates the best-performing classifier, we introduce a budget-sensitive progressive sampling algorithm for selecting training examples based on the class associated with each example. An empirical analysis of this algorithm shows that the class distribution of the resulting training set yields classifiers with good (nearly optimal) classification performance.

    Statistics of statisticians: Critical mass of statistics and operational research groups in the UK

    Using a recently developed model, inspired by mean field theory in statistical physics, and data from the UK's Research Assessment Exercise, we analyse the relationship between the quality of statistics and operational research groups and the quantity of researchers in them. Similar to other academic disciplines, we provide evidence for a linear dependency of quality on quantity up to an upper critical mass, which is interpreted as the average maximum number of colleagues with whom a researcher can communicate meaningfully within a research group. The model also predicts a lower critical mass, which research groups should strive to achieve to avoid extinction. For statistics and operational research, the lower critical mass is estimated to be 9 ± 3. The upper critical mass, beyond which research quality does not significantly depend on group size, is about twice this value.

    Evaluation of the micro-carburetor

    A prototype sonic, variable-venturi automotive carburetor was evaluated for its effects on vehicle performance, fuel economy, and exhaust emissions. A 350 CID Chevrolet Impala vehicle was tested on a chassis dynamometer over the 1975 Federal Test Procedure, urban driving cycle. The Micro-carburetor was tested and compared with stock and modified-stock engine configurations. Subsequently, the test vehicle's performance characteristics were examined with the stock carburetor and again with the Micro-carburetor in a series of on-road driveability tests. The test engine was then removed from the vehicle and installed on an engine dynamometer. Engine tests were conducted to compare the fuel economy, thermal efficiency, and cylinder-to-cylinder mixture distribution of the Micro-carburetor to those of the stock configuration. Test results show increases in thermal efficiency and improvements in fuel economy at all test conditions. Improved fuel/air mixture preparation is implied by the information presented. Further improvements in fuel economy and exhaust emissions are possible through a detailed recalibration of the Micro-carburetor.

    Relevance of multiple-quasiparticle tunneling between edge states at ν = p/(2np+1)

    We present an explanation for the anomalous behavior in tunneling conductance and noise through a point contact between edge states in the Jain series ν = p/(2np+1), for extremely weak backscattering and low temperatures [Y.C. Chung, M. Heiblum, and V. Umansky, Phys. Rev. Lett. 91, 216804 (2003)]. We consider edge states with neutral modes propagating at finite velocity, and we show that the activation of their dynamics causes the unexpected change in the temperature power-law of the conductance. Even more importantly, we demonstrate that multiple-quasiparticle tunneling at low energies becomes the most relevant process. This result will be used to explain the experimental data on current noise where tunneling particles have a charge that can reach p times the single quasiparticle charge. In this paper we analyze the conductance and the shot noise to substantiate quantitatively the proposed scenario. Comment: 4 pages, 2 figure

    Deformation of grain boundaries in polar ice

    The ice microstructure (grain boundaries) is a key feature used to study ice evolution and to investigate past climatic changes. We studied a deep ice core from Dome Concordia, Antarctica, which records past mechanical deformations. We measured a "texture tensor" which characterizes the pattern geometry and reveals local heterogeneities of deformation along the core. These results question key assumptions of the current models used for dating.

    QCD NLO with Powheg matching and top threshold matching in WHIZARD

    We present the status of the automation of NLO processes within the event generator WHIZARD. The program provides an automated FKS subtraction and phase space integration over the FKS regions, while the (QCD) NLO matrix element is accessed via the Binoth Les Houches Interface from an externally linked one-loop program. Massless and massive test cases and validation are shown for several e+e- processes. Furthermore, we discuss work in progress and future plans. The second part covers the matching of the NRQCD prediction with NLL threshold resummation to the NLO continuum top pair production at lepton colliders. Both the S-wave and P-wave production of the top pair are taken into account in the resummation. The inclusion in WHIZARD allows the study of more exclusive observables than just the total cross section and automatically accounts for important electroweak and relativistic corrections in the threshold region. Comment: 9 pages, 3 figures, Talk given at 12th International Symposium on Radiative Corrections (Radcor 2015) and LoopFest XIV (Radiative Corrections for the LHC and Future Colliders); v2: reference adde

    Theory of Nonlinear Dispersive Waves and Selection of the Ground State

    A theory of time-dependent nonlinear dispersive equations of the Schroedinger / Gross-Pitaevskii and Hartree type is developed. The short, intermediate and large time behavior is found by deriving nonlinear Master equations (NLME), governing the evolution of the mode powers, and by a novel multi-time scale analysis of these equations. The scattering theory is developed and coherent resonance phenomena and associated lifetimes are derived. Applications include BEC large time dynamics and nonlinear optical systems. The theory reveals a nonlinear transition phenomenon, "selection of the ground state", and NLME predicts the decay of the excited state, with half its energy transferred to the ground state and half to radiation modes. Our results predict the recent experimental observations of Mandelik et al. in nonlinear optical waveguides.

    Diffusive behavior of a greedy traveling salesman

    Using Monte Carlo simulations we examine the diffusive properties of the greedy algorithm in the d-dimensional traveling salesman problem. Our results show that for d=3 and 4 the average squared distance from the origin is proportional to the number of steps t. In the d=2 case such a scaling is modified with some logarithmic corrections, which might suggest that d=2 is the critical dimension of the problem. The distribution of lengths also shows marked differences between d=2 and d>2 versions. A simple strategy adopted by the salesman might resemble strategies chosen by some foraging and hunting animals, for which anomalous diffusive behavior has recently been reported and interpreted in terms of Lévy flights. Our results suggest that broad and Lévy-like distributions in such systems might appear due to dimension-dependent properties of a search space. Comment: accepted in Phys. Rev.

    Imaging Pauli repulsion in scanning tunneling microscopy

    A scanning tunneling microscope (STM) has been equipped with a nanoscale force sensor and signal transducer composed of a single D2 molecule that is confined in the STM junction. The uncalibrated sensor is used to obtain ultra-high geometric image resolution of a complex organic molecule adsorbed on a noble metal surface. By means of conductance-distance spectroscopy and corresponding density functional calculations the mechanism of the sensor/transducer is identified. It probes the short-range Pauli repulsion and converts this signal into variations of the junction conductance. Comment: 4 pages, 4 figures, accepted to Phys. Rev. Let