
    The Bayesian Decision Tree Technique with a Sweeping Strategy

    The uncertainty of classification outcomes is of crucial importance for many safety-critical applications, including, for example, medical diagnostics. In such applications, the uncertainty of classification can be reliably estimated within a Bayesian model averaging technique that allows the use of prior information. Decision Tree (DT) classification models used within such a technique give experts additional information by making the classification scheme observable. The use of the Markov Chain Monte Carlo (MCMC) methodology of stochastic sampling makes the Bayesian DT technique feasible to perform. However, in practice, the MCMC technique may become stuck in a particular DT that is far from a region of maximal posterior. Sampling such DTs biases the posterior estimates, and as a result the evaluation of classification uncertainty may be incorrect. In particular cases, the negative effect of such sampling may be reduced by giving additional prior information on the shape of DTs. In this paper we describe a new approach based on sweeping the DTs without additional priors on the preferred shape of DTs. The performance of Bayesian DT techniques with the standard and sweeping strategies is compared on synthetic data as well as on real datasets. Quantitatively evaluating the uncertainty in terms of the entropy of class posterior probabilities, we found that the sweeping strategy is superior to the standard strategy.
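
    The abstract evaluates uncertainty through the entropy of class posterior probabilities. The sketch below is one plausible reading, not the authors' code: it assumes the class posterior for a case is the equal-weight average of the class probabilities predicted by each sampled tree, and that the entropy is Shannon entropy in bits. The function names and the sample values are hypothetical.

```python
import math

def average_posteriors(per_model_probs):
    """Average class-probability vectors over sampled models.

    Assumption: every sampled tree gets equal weight, as in a plain
    Monte Carlo average over MCMC samples.
    """
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [
        sum(probs[c] for probs in per_model_probs) / n_models
        for c in range(n_classes)
    ]

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical class probabilities from three sampled trees for one case.
samples = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
averaged = average_posteriors(samples)
uncertainty = entropy_bits(averaged)
```

    Under these assumptions, an entropy near 0 indicates a confident prediction, while for two classes an entropy of 1 bit indicates maximal uncertainty.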

    Biclustering models for structured microarray data

    ©2005 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. Microarrays have become a standard tool for investigating gene function and more complex microarray experiments are increasingly being conducted. For example, an experiment may involve samples from several groups or may investigate changes in gene expression over time for several subjects, leading to large three-way data sets. In response to this increase in data complexity, we propose some extensions to the plaid model, a biclustering method developed for the analysis of gene expression data. This model-based method lends itself to the incorporation of any additional structure such as external grouping or repeated measures. We describe how the extended models may be fitted and illustrate their use on real data.

    Transthoracic echocardiography for imaging of the different coronary artery segments: a feasibility study

    Background: Transthoracic echocardiography (TTE) may be used for direct inspection of various parts of the main coronary arteries for detection of coronary stenoses and occlusions. We aimed to assess the feasibility of TTE to visualise the complete segments of the left main (LM), left descending (LAD), circumflex (Cx) and right (RCA) coronary arteries.
    Methods: One hundred and eleven patients scheduled for diagnostic coronary angiography because of chest pain or acute coronary syndrome had a TTE study to map the passage of the main coronary arteries. LAD, Cx and RCA were each divided into proximal, middle and distal segments. If any part of the individual segment of a coronary artery with antegrade blood flow was not visualised, the segment was labeled as not satisfactorily seen.
    Results: Complete imaging of the LM was achieved in 98% of the patients. With antegrade directed coronary artery flow, the proximal, middle and distal segments of LAD were completely seen in 96%, 95% and 91% of patients, respectively. Adding the completely seen segments with antegrade coronary flow and segments with retrograde coronary flow, the proximal, middle and distal segments of LAD were adequately visualised in 96%, 96% and 93% of patients, respectively. With antegrade directed coronary artery flow, the proximal, middle and distal segments of Cx were completely seen in 88%, 61% and 3% and in RCA in 40%, 28% and 54% of patients. Retrograde coronary artery flow was correctly identified as verified by coronary angiography in seven coronary segments, mainly in the posterior descending artery (labeled as the distal segment of RCA) and distal LAD.
    Conclusions: TTE is a feasible method for complete demonstration of coronary flow in the LM, the proximal Cx and the different segments of LAD, but less suitable for the RCA and mid and distal segments of the Cx. (ClinicalTrials.gov number NTC00281346.)

    Strategies for non-parametric smoothing of the location model in mixed-variable discriminant analysis

    The non-parametric smoothing of the location model proposed by Asparoukhov and Krzanowski (2000) for allocating objects with mixtures of variables into two groups is studied. The strategy for selecting the smoothing parameter through the maximisation of the pseudo-likelihood function is reviewed. Problems with previous methods are highlighted, and two alternative strategies are proposed. Some investigations into other possible smoothing procedures for estimating cell probabilities are discussed. A leave-one-out method is proposed for constructing the allocation rule and evaluating its performance by estimating the true error rate. Results of a numerical study on simulated data highlight the feasibility of the proposed allocation rule as well as its advantages over previous methods, and an example using real data is presented.

    Reduced coronary flow reserve in Anderson-Fabry disease measured by transthoracic Doppler echocardiography

    Coronary flow reserve was assessed in a patient with Anderson-Fabry disease complicated by symmetric left ventricular hypertrophy. Coronary flow reserve was measurable in all three major coronary arteries, providing an opportunity to compare regional coronary flow reserve from different vascular beds. In this patient, all three vascular beds supplied diffusely hypertrophied myocardium. Coronary flow disturbances in small intramyocardial perforating arteries were visible. The coronary flow reserve was reduced to a similar level (around 2.0) in all three major arteries. In our patient with Anderson-Fabry disease, the coronary vasodilatation was blunted in a diffuse pattern corresponding to the myocardial hypertrophy distribution. In small intramyocardial arteries coronary flow was also disturbed. Accordingly, retrograde systolic flow and accelerated anterograde diastolic flow were documented.

    Recipes for sparse LDA of horizontal data

    Many important modern applications require analyzing data with more variables than observations, called horizontal data for short. In such situations, the classical Fisher’s linear discriminant analysis (LDA) has no solution because the within-group scatter matrix is singular. Moreover, the number of variables is usually huge and the classical solutions (discriminant functions) are difficult to interpret because they involve all available variables. Nowadays, the aim is to develop fast and reliable algorithms for sparse LDA of horizontal data. The resulting discriminant functions depend on very few original variables, which facilitates their interpretation. The main theoretical and numerical challenge is how to cope with the singularity of the within-group scatter matrix. This work aims at classifying the existing approaches according to the way they tackle this singularity issue, and at suggesting new ones.

    Standard survey methods for estimating colony losses and explanatory risk factors in Apis mellifera

    This chapter addresses survey methodology and questionnaire design for the collection of data pertaining to estimation of honey bee colony loss rates and identification of risk factors for colony loss. Sources of error in surveys are described. Advantages and disadvantages of different random and non-random sampling strategies and different modes of data collection are presented to enable the researcher to make an informed choice. We discuss survey and questionnaire methodology in some detail, for the purpose of raising awareness of issues to be considered during the survey design stage in order to minimise error and bias in the results. Aspects of survey design are illustrated using surveys in Scotland. Part of a standardized questionnaire, developed by the COLOSS working group for Monitoring and Diagnosis, is given as a further example. Approaches to data analysis are described, focussing on estimation of loss rates. Dutch monitoring data from 2012 were used for an example of a statistical analysis with the public domain R software. We demonstrate the estimation of the overall proportion of losses and corresponding confidence interval using a quasi-binomial model to account for extra-binomial variation. We also illustrate generalized linear model fitting when incorporating a single risk factor, and derivation of relevant confidence intervals.
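
    The quasi-binomial estimate of an overall loss proportion can be illustrated with a minimal sketch. The chapter itself uses R; the Python below is only an illustration under stated assumptions: an intercept-only model with logit link, a dispersion estimated from the Pearson chi-square statistic, and a Wald interval on the logit scale using the normal quantile 1.96 (the chapter may derive its interval differently). The function name and the records are hypothetical, not the Dutch monitoring data.

```python
import math

def quasi_binomial_loss_rate(records, z=1.96):
    """Estimate an overall loss proportion with an overdispersion-adjusted CI.

    records: list of (colonies_at_start, colonies_lost), one per beekeeper.
    Requires at least two records and an overall proportion strictly
    between 0 and 1.
    """
    total = sum(n for n, _ in records)
    lost = sum(y for _, y in records)
    p = lost / total  # intercept-only estimate of the loss proportion

    # Pearson chi-square divided by residual degrees of freedom gives the
    # dispersion; values above 1 indicate extra-binomial variation.
    pearson = sum((y - n * p) ** 2 / (n * p * (1 - p)) for n, y in records)
    dispersion = pearson / (len(records) - 1)

    # Wald interval on the logit scale, inflated by the dispersion,
    # then transformed back to the probability scale.
    se_logit = math.sqrt(dispersion / (total * p * (1 - p)))
    logit = math.log(p / (1 - p))
    lower = 1 / (1 + math.exp(-(logit - z * se_logit)))
    upper = 1 / (1 + math.exp(-(logit + z * se_logit)))
    return p, dispersion, (lower, upper)

# Hypothetical survey returns: (colonies at start of winter, colonies lost).
records = [(10, 1), (20, 8), (5, 0), (40, 6), (25, 10)]
rate, dispersion, interval = quasi_binomial_loss_rate(records)
```

    With a dispersion above 1, the interval is wider than the plain binomial interval, which is the point of allowing for extra-binomial variation.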

    Random matrix theory for portfolio optimization: a stability approach

    We apply Random Matrix Theory (RMT) to an empirically measured financial correlation matrix, C, and show that this matrix contains a large amount of noise. In order to determine the sensitivity of the spectral properties of a random matrix to noise, we simulate a set of data and add different volumes of random noise. Having ascertained that the eigenspectrum is independent of the standard deviation of added noise, we use RMT to determine the noise percentage in a correlation matrix based on real data from S&P500. Eigenvalue and eigenvector analyses are applied, and the experimental results for each are presented to identify, qualitatively and quantitatively, how the spectral properties of the empirical correlation matrix differ from those of a random counterpart. Finally, we attempt to separate the noisy part from the non-noisy part of C. We apply an existing technique to clean C and then discuss its associated problems. We propose a technique for filtering C which, from a stability point of view, has many advantages over the existing cleaning method.
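
    The abstract does not say how the noisy part of the spectrum is identified. A common RMT approach, assumed here rather than taken from the paper, compares the eigenvalues of C with the Marchenko-Pastur bounds for a purely random correlation matrix built from N series of T observations each. The sketch below shows that comparison; the function names and the eigenvalues are hypothetical.

```python
import math

def marchenko_pastur_bounds(n_assets, n_obs, sigma2=1.0):
    """Eigenvalue band expected for a purely random correlation matrix.

    Assumes n_obs >= n_assets and unit variance (sigma2 = 1) for
    standardized returns.
    """
    ratio = n_assets / n_obs
    lower = sigma2 * (1 - math.sqrt(ratio)) ** 2
    upper = sigma2 * (1 + math.sqrt(ratio)) ** 2
    return lower, upper

def split_spectrum(eigenvalues, n_assets, n_obs):
    """Separate eigenvalues at or below the upper bound (treated as noise)
    from those above it (treated as carrying information)."""
    _, upper = marchenko_pastur_bounds(n_assets, n_obs)
    noise = [ev for ev in eigenvalues if ev <= upper]
    signal = [ev for ev in eigenvalues if ev > upper]
    return noise, signal

# Hypothetical eigenvalues of a correlation matrix for 100 assets
# observed over 400 periods.
eigenvalues = [0.4, 0.9, 1.3, 2.0, 5.5, 20.1]
bounds = marchenko_pastur_bounds(100, 400)
noise, signal = split_spectrum(eigenvalues, 100, 400)
noise_share = len(noise) / len(eigenvalues)
```

    In this reading, the share of eigenvalues inside the random band is a rough proxy for the noise percentage; how the paper defines that percentage, and how it filters C afterwards, is not given in the abstract.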

    High connectivity among locally adapted populations of a marine fish (Menidia menidia)

    Author Posting. © Ecological Society of America, 2010. This article is posted here by permission of Ecological Society of America for personal use, not for redistribution. The definitive version was published in Ecology 91 (2010): 3526–3537, doi:10.1890/09-0548.1. Patterns of connectivity are important in understanding the geographic scale of local adaptation in marine populations. While natural selection can lead to local adaptation, high connectivity can diminish the potential for such adaptation to occur. Connectivity, defined as the exchange of individuals among subpopulations, is presumed to be significant in most marine species due to life histories that include widely dispersive stages. However, evidence of local adaptation in marine species, such as the Atlantic silverside, Menidia menidia, raises questions concerning the degree of connectivity. We examined geochemical signatures in the otoliths, or ear bones, of adult Atlantic silversides collected in 11 locations along the northeastern coast of the United States from New Jersey to Maine in 2004 and eight locations in 2005 using laser ablation inductively coupled plasma mass spectrometry (ICP-MS) and isotope ratio monitoring mass spectrometry (irm-MS). These signatures were then compared to baseline signatures of juvenile fish of known origin to determine the natal origin of these adult fish. We then estimated migration distances and the degree of mixing from these data. In both years, fish generally had the highest probability of originating from the same location in which they were captured (0.01–0.80), but evidence of mixing throughout the sample area was present. Furthermore, adult M. menidia exhibit highly dispersive behavior, with some fish migrating over 700 km. The probability of adult fish returning to natal areas differed between years, with the probability being, on average, 0.2 higher in the second year. These findings demonstrate that marine species with largely open populations are capable of local adaptation despite apparently high gene flow. This work was funded by the National Science Foundation (grant OCE-0425830 to D. O. Conover and grant OCE-0134998 to S. R. Thorrold) and the New York State Department of Environmental Conservation.