    Inducing safer oblique trees without costs

    Decision tree induction has been widely studied and applied. In safety applications, such as determining whether a chemical process is safe or whether a person has a medical condition, the cost of misclassification in one of the classes is significantly higher than in the other class. Several authors have tackled this problem by developing cost-sensitive decision tree learning algorithms or have suggested ways of changing the distribution of training examples to bias the decision tree learning process so as to take account of costs. A prerequisite for applying such algorithms is the availability of costs of misclassification. Although this may be possible for some applications, obtaining reasonable estimates of costs of misclassification is not easy in the area of safety. This paper presents a new algorithm for applications where the cost of misclassifications cannot be quantified, although the cost of misclassification in one class is known to be significantly higher than in another class. The algorithm utilizes linear discriminant analysis to identify oblique relationships between continuous attributes and then carries out an appropriate modification to ensure that the resulting tree errs on the side of safety. The algorithm is evaluated with respect to one of the best known cost-sensitive algorithms (ICET), a well-known oblique decision tree algorithm (OC1) and an algorithm that utilizes robust linear programming
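
    The abstract does not say which modification makes the tree err on the side of safety. As a minimal sketch of the general idea only, the following assumes two continuous attributes, a two-class Fisher linear discriminant for the oblique split, and one hypothetical safety rule: lower the split threshold until no unsafe training example falls on the safe side. The function names and this particular rule are illustrative assumptions, not the paper's algorithm.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, m):
    """2x2 scatter matrix of the points around the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def lda_direction(safe, unsafe):
    """Fisher discriminant direction w = Sw^-1 (mean_unsafe - mean_safe)."""
    m_s, m_u = mean(safe), mean(unsafe)
    s_s, s_u = scatter(safe, m_s), scatter(unsafe, m_u)
    sw = [[s_s[i][j] + s_u[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m_u[0] - m_s[0], m_u[1] - m_s[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def project(w, p):
    """Position of point p along the oblique direction w."""
    return w[0] * p[0] + w[1] * p[1]


def safe_biased_split(safe, unsafe):
    """Oblique split whose threshold is shifted towards the safe class."""
    w = lda_direction(safe, unsafe)
    # Plain LDA threshold: midpoint of the projected class means.
    mid = (project(w, mean(safe)) + project(w, mean(unsafe))) / 2
    # Assumed safety rule: never place the threshold above the
    # lowest-projected unsafe training example.
    threshold = min(mid, min(project(w, p) for p in unsafe))
    return w, threshold


def predict_unsafe(w, threshold, p):
    """True if p lies on the unsafe side of the oblique split."""
    return project(w, p) >= threshold


safe_pts = [(1, 1), (2, 1), (1, 2), (2, 3)]
unsafe_pts = [(4, 4), (5, 4), (4, 5), (2.5, 2.5)]
w, t = safe_biased_split(safe_pts, unsafe_pts)
```

    With this rule every unsafe training example is flagged as unsafe by construction, at the price of possibly flagging more safe examples than the unshifted midpoint threshold would.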

    Automated Retraining Methods for Document Classification and Their Parameter Tuning

    This paper addresses the problem of semi-supervised classification on document collections using retraining (also called self-training). A possible application is focused Web crawling, which may start with very few manually selected training documents but can be enhanced by automatically adding initially unlabeled, positively classified Web pages for retraining. Such an approach is by itself not robust and faces tuning problems regarding parameters like the number of selected documents, the number of retraining iterations, and the ratio of positively and negatively classified samples used for retraining. The paper develops methods for automatically tuning these parameters, based on predicting the leave-one-out error of a retrained classifier and on preventing the classifier from being diluted by the selection of too many or too weak documents for retraining. Our experiments with three different datasets confirm the practical viability of the approach.
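
    To make the retraining loop and its parameters concrete, here is a minimal self-training sketch. It is an illustration under stated assumptions, not the paper's method: documents are stand-in 2-D points, the classifier is a nearest-centroid rule, and the parameters `per_iter`, `iterations` and `min_margin` are hypothetical fixed values, whereas the paper tunes such parameters automatically from a predicted leave-one-out error.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def centroid(points: List[Point]) -> Point:
    """Mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def self_train(pos, neg, unlabeled, per_iter=2, iterations=3, min_margin=1.0):
    """Grow the training sets with confidently classified unlabeled points."""
    pos, neg, pool = list(pos), list(neg), list(unlabeled)
    for _ in range(iterations):
        if not pool:
            break
        c_pos, c_neg = centroid(pos), centroid(neg)

        def margin(p: Point) -> float:
            # Positive margin: p is closer to the positive centroid.
            return dist(p, c_neg) - dist(p, c_pos)

        # Most confident candidates first; weak ones are filtered out so
        # that they do not dilute the retrained classifier.
        ranked = sorted(pool, key=lambda p: abs(margin(p)), reverse=True)
        picked = [p for p in ranked[:per_iter] if abs(margin(p)) >= min_margin]
        if not picked:
            break
        for p in picked:
            (pos if margin(p) > 0 else neg).append(p)
            pool.remove(p)
    return pos, neg, pool


new_pos, new_neg, rest = self_train(
    pos=[(0.0, 0.0)],
    neg=[(10.0, 10.0)],
    unlabeled=[(1.0, 1.0), (9.0, 9.0), (5.0, 5.0)],
)
```

    In this toy run the two points near the seed documents are absorbed in the first iteration, while the ambiguous point at (5, 5) stays unlabeled because its margin is below `min_margin`.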

    Radio Location of Partial Discharge Sources: A Support Vector Regression Approach

    Partial discharge (PD) can provide a useful forewarning of asset failure in electricity substations. A significant proportion of assets are susceptible to PD due to incipient weakness in their dielectrics. This paper examines a low-cost approach for uninterrupted monitoring of PD using a network of inexpensive radio sensors to sample the spatial patterns of PD received signal strength. Machine learning techniques are proposed for localisation of PD sources. Specifically, two models based on Support Vector Machines (SVMs) are developed: Support Vector Regression (SVR) and Least-Squares Support Vector Regression (LSSVR). These models construct an explicit regression surface in a high-dimensional feature space for function estimation. Their performance is compared to that of artificial neural network (ANN) models. The results show that both the SVR and LSSVR methods are superior to ANNs in accuracy. The LSSVR approach is particularly recommended as a practical alternative for PD source localisation due to its low complexity.

    The Intentional Use of Service Recovery Strategies to Influence Consumer Emotion, Cognition and Behaviour

    Service recovery strategies have been identified as a critical factor in the success of service organizations. This study develops a conceptual framework to investigate how specific service recovery strategies influence the emotional, cognitive and negative behavioural responses of consumers, as well as how emotion and cognition influence negative behaviour. Understanding the impact of specific service recovery strategies will allow service providers to engage more deliberately and intentionally in strategies that result in positive organizational outcomes. This study was conducted using a 2 x 2 between-subjects quasi-experimental design. The results suggest that service recovery has a significant impact on emotion, cognition and negative behaviour. Similarly, satisfaction, negative emotion and positive emotion all influence negative behaviour, but distributive justice has no effect.

    Tensor Correlations Measured in 3He(e,e'pp)n

    We have measured the 3He(e,e'pp)n reaction at an incident energy of 4.7 GeV over a wide kinematic range. We identified spectator correlated pp and pn nucleon pairs using kinematic cuts and measured their relative and total momentum distributions. This is the first measurement of the ratio of pp to pn pairs as a function of pair total momentum, $p_{tot}$. For pair relative momenta between 0.3 and 0.5 GeV/c, the ratio is very small at low $p_{tot}$ and rises to approximately 0.5 at large $p_{tot}$. This shows the dominance of tensor over central correlations at this relative momentum. Comment: 4 pages, 4 figures, submitted to PR

    Measurement of the nuclear multiplicity ratio for $K^0_s$ hadronization at CLAS

    The influence of cold nuclear matter on lepto-production of hadrons in semi-inclusive deep inelastic scattering is measured using the CLAS detector in Hall B at Jefferson Lab and a 5.014 GeV electron beam. We report the $K^0_s$ multiplicity ratios for targets of C, Fe, and Pb relative to deuterium as a function of the fractional virtual photon energy $z$ transferred to the $K^0_s$ and the transverse momentum squared $p_T^2$ of the $K^0_s$. We find that the multiplicity ratios for $K^0_s$ are reduced in the nuclear medium at high $z$ and low $p_T^2$, with a trend for the $K^0_s$ transverse momentum to be broadened in the nucleus for large $p_T^2$. Comment: Submitted to Phys. Lett.

    Demonstration of a novel technique to measure two-photon exchange effects in elastic $e^\pm p$ scattering

    The discrepancy between proton electromagnetic form factors extracted using unpolarized and polarized scattering data is believed to be a consequence of two-photon exchange (TPE) effects. However, the calculations of TPE corrections have significant model dependence, and there is limited direct experimental evidence for such corrections. We present the results of a new experimental technique for making direct $e^\pm p$ comparisons, which has the potential to make precise measurements over a broad range in $Q^2$ and scattering angles. We use the Jefferson Lab electron beam and the Hall B photon tagger to generate a clean but untagged photon beam. The photon beam impinges on a converter foil to generate a mixed beam of electrons, positrons, and photons. A chicane is used to separate and recombine the electron and positron beams while the photon beam is stopped by a photon blocker. This provides a combined electron and positron beam, with energies from 0.5 to 3.2 GeV, which impinges on a liquid hydrogen target. The large acceptance CLAS detector is used to identify and reconstruct elastic scattering events, determining both the initial lepton energy and the sign of the scattered lepton. The data were collected in two days with a primary electron beam energy of only 3.3 GeV, limiting the data from this run to smaller values of $Q^2$ and scattering angle. Nonetheless, this measurement yields a data sample for $e^\pm p$ with statistics comparable to those of the best previous measurements. We have shown that we can cleanly identify elastic scattering events and correct for the difference in acceptance for electron and positron scattering. The final ratio of positron to electron scattering: $R = 1.027 \pm 0.005 \pm 0.05$ for $\langle Q^2 \rangle = 0.206$ GeV$^2$ and $0.830 \leq \epsilon \leq 0.943$

    Multiwavelength studies of MHD waves in the solar chromosphere: An overview of recent results

    The chromosphere is a thin layer of the solar atmosphere that bridges the relatively cool photosphere and the intensely heated transition region and corona. Compressible and incompressible waves propagating through the chromosphere can supply significant amounts of energy to the interface region and corona. In recent years an abundance of high-resolution observations from state-of-the-art facilities have provided new and exciting ways of disentangling the characteristics of oscillatory phenomena propagating through the dynamic chromosphere. Coupled with rapid advancements in magnetohydrodynamic wave theory, we are now in an ideal position to thoroughly investigate the role waves play in supplying energy to sustain chromospheric and coronal heating. Here, we review the recent progress made in characterising, categorising and interpreting oscillations manifesting in the solar chromosphere, with an impetus placed on their intrinsic energetics. Comment: 48 pages, 25 figures, accepted into Space Science Review

    Transductive Learning for Spatial Data Classification

    Learning classifiers of spatial data presents several issues, such as the heterogeneity of spatial objects, the implicit definition of spatial relationships among objects, spatial autocorrelation, and the abundance of unlabelled data, which potentially convey a large amount of information. The first three issues are due to the inherent structure of spatial units of analysis, which can be easily accommodated if a (multi-)relational data mining approach is considered. The fourth issue demands the adoption of a transductive setting, which aims to make predictions for a given set of unlabelled data. Transduction is also motivated by the contiguity of the concept of positive autocorrelation, which typically affects spatial phenomena, with the smoothness assumption that characterizes the transductive setting. In this work, we investigate a relational approach to spatial classification in a transductive setting. Computational solutions to the main difficulties met in this approach are presented. In particular, a relational upgrade of the naïve Bayes classifier is proposed as a discriminative model, an iterative algorithm is designed for the transductive classification of unlabelled data, and a distance measure between relational descriptions of spatial objects is defined in order to determine the k-nearest neighbors of each example in the dataset. The computational solutions have been tested on two real-world spatial datasets. The transformation of spatial data into a multi-relational representation and the experimental results are reported and discussed.
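
    The interplay of an initial classifier, a distance measure and iterative k-nearest-neighbour relabelling can be sketched as follows. This is a simplified illustration under assumptions that are not from the paper: examples are plain numbers, the distance is the absolute difference (standing in for the relational distance measure), the initial guess copies the label of the nearest labelled example (standing in for the relational naïve Bayes model), and `transductive_knn`, `k` and `max_iter` are hypothetical names.

```python
from collections import Counter


def transductive_knn(labelled, unlabelled, k=3, max_iter=10):
    """Iteratively relabel unlabelled examples by k-NN majority vote.

    labelled:   dict mapping example -> class label (kept fixed)
    unlabelled: list of examples whose labels are to be predicted
    """
    def dist(a, b):
        # Placeholder distance; a relational distance would go here.
        return abs(a - b)

    # Initial guess: label of the nearest labelled example.
    guess = {
        u: labelled[min(labelled, key=lambda l: dist(u, l))]
        for u in unlabelled
    }
    for _ in range(max_iter):
        current = {**labelled, **guess}
        updated = {}
        for u in unlabelled:
            # Neighbours are drawn from labelled AND unlabelled examples,
            # which is what makes the procedure transductive.
            neighbours = sorted(
                (p for p in current if p != u), key=lambda p: dist(u, p)
            )[:k]
            votes = Counter(current[p] for p in neighbours)
            updated[u] = votes.most_common(1)[0][0]
        if updated == guess:
            break  # labels are stable
        guess = updated
    return guess


predicted = transductive_knn({0: "a", 10: "b"}, [1, 2, 8, 9], k=3)
```

    Labels of the labelled examples never change; only the predictions for the unlabelled examples are revised until they stop changing or `max_iter` is reached.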