1,425 research outputs found

    Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data

    BACKGROUND: Designing appropriate machine learning methods for identifying genes that have a significant discriminating power for disease outcomes has become increasingly important for our understanding of diseases at the genomic level. Although many machine learning methods have been developed and applied to microarray gene expression data analysis, the majority of them are based on linear models, which are not necessarily appropriate for the underlying connection between the target disease and its associated explanatory genes. Linear model based methods also tend to admit false positive significant features more easily. Furthermore, linear model based algorithms often involve calculating the inverse of a matrix that may be singular when the number of potentially important genes is relatively large, which leads to numerical instability. To overcome these limitations, a few non-linear methods have recently been introduced to the area. Many of the existing non-linear methods have two critical problems, the model selection problem and the model parameter tuning problem, that remain unsolved or even untouched. In general, a unified framework that allows model parameters of both linear and non-linear models to be easily tuned is preferred in real-world applications. Kernel-induced learning methods form a class of approaches that show promising potential to achieve this goal. RESULTS: A hierarchical statistical model named kernel-imbedded Gaussian process (KIGP) is developed under a unified Bayesian framework for binary disease classification problems using microarray gene expression data. In particular, based on a probit regression setting, an adaptive algorithm with a cascading structure is designed to find the appropriate kernel, to discover the potentially significant genes, and to make the optimal class prediction accordingly. A Gibbs sampler is built as the core of the algorithm to make Bayesian inferences. 
Simulation studies showed that, even without any knowledge of the underlying generative model, the KIGP performed very close to the theoretical Bayesian bound, not only in the case with a linear Bayesian classifier but also in the case with a highly non-linear Bayesian classifier. This suggests broader applicability to microarray data analysis problems, especially those on which linear methods work poorly. The KIGP was also applied to four published microarray datasets, and the results showed that it performed better than or at least as well as any of the referenced state-of-the-art methods in all of these cases. CONCLUSION: Mathematically built on the kernel-induced feature space concept under a Bayesian framework, the KIGP method presented in this paper provides a unified machine learning approach to explore both the linear and the possibly non-linear underlying relationship between the target features of a given binary disease classification problem and the related explanatory gene expression data. More importantly, it incorporates the model parameter tuning into the framework. The model selection problem is addressed in the form of selecting a proper kernel type. The KIGP method also gives Bayesian probabilistic predictions for disease classification. These properties and features are beneficial to most real-world applications. The algorithm is naturally robust in numerical computation. The simulation studies and the published data studies demonstrated that the proposed KIGP performs satisfactorily and consistently.

    Haplotype inference in crossbred populations without pedigree information

    Background: Current methods for haplotype inference without pedigree information assume random mating populations. In animal and plant breeding, however, mating is often not random. A particular form of nonrandom mating occurs when parental individuals of opposite sex originate from distinct populations. In animal breeding this is called crossbreeding, and in plant breeding hybridization. In these situations, the association between marker and putative gene alleles might differ between the founding populations, and the origin of alleles should be accounted for in studies that estimate breeding values with marker data. The sequence of alleles from one parent constitutes one haplotype of an individual. Haplotypes thus reveal allele origin in data of crossbred individuals. Results: We introduce a new method for haplotype inference without pedigree that allows nonrandom mating and that can use genotype data of the parental populations and of a crossbred population. The aim of the method is to estimate line origin of alleles. The method has a Bayesian setup with a Dirichlet Process (DP) as prior for the haplotypes in the two parental populations. The basic idea is that only a subset of the complete set of possible haplotypes is present in the population. Conclusion: Line origin of approximately 95% of the alleles at heterozygous sites was assessed correctly in both simulated and real data. Comparing the accuracy of haplotype frequencies inferred with the new algorithm to the accuracy of haplotype frequencies inferred with PHASE, an existing algorithm for haplotype inference, showed that the DP algorithm outperformed PHASE in situations of crossbreeding and that PHASE performed better in situations of random mating.

    Search for time-dependent B0s - B0s-bar oscillations using a vertex charge dipole technique

    We report a search for B0s - B0s-bar oscillations using a sample of 400,000 hadronic Z0 decays collected by the SLD experiment. The analysis takes advantage of the electron beam polarization as well as information from the hemisphere opposite that of the reconstructed B decay to tag the B production flavor. The excellent resolution provided by the pixel CCD vertex detector is exploited to cleanly reconstruct both B and cascade D decay vertices, and tag the B decay flavor from the charge difference between them. We exclude the following values of the B0s - B0s-bar oscillation frequency: Delta m_s < 4.9 ps^-1 and 7.9 < Delta m_s < 10.3 ps^-1 at the 95% confidence level. Comment: 18 pages, 3 figures, replaced by version accepted for publication in Phys.Rev.D; results differ slightly from first version.

    Measurement of the running of the QED coupling in small-angle Bhabha scattering at LEP

    Using the OPAL detector at LEP, the running of the effective QED coupling alpha(t) is measured for space-like momentum transfer from the angular distribution of small-angle Bhabha scattering. In an almost ideal QED framework, with very favourable experimental conditions, we obtain: Delta alpha(-6.07 GeV^2) - Delta alpha(-1.81 GeV^2) = (440 ± 58 ± 43 ± 30) × 10^-5, where the first error is statistical, the second is the experimental systematic and the third is the theoretical uncertainty. This agrees with current evaluations of alpha(t). The null hypothesis that alpha remains constant within the above interval of -t is excluded with a significance above 5 sigma. Similarly, our results are inconsistent at the level of 3 sigma with the hypothesis that only leptonic loops contribute to the running. This is currently the most significant direct measurement where the running alpha(t) is probed differentially within the measured t range. Comment: 43 pages, 12 figures, Submitted to Euro. Phys. J.
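
    As a rough cross-check of the quoted significance, the three uncertainties can be combined in quadrature and compared with the central value. This is only a sketch: it assumes the three errors are independent and Gaussian, which the abstract does not state, and the helper name `combine_in_quadrature` is illustrative rather than taken from the paper.

```python
import math


def combine_in_quadrature(*errors):
    """Combine uncertainties assumed to be independent by adding them in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


# Values quoted in the abstract, in units of 10^-5.
central = 440.0
stat, syst, theory = 58.0, 43.0, 30.0

# Assumption: independent Gaussian errors; the paper's own treatment may differ.
total = combine_in_quadrature(stat, syst, theory)
significance = central / total

print(f"total uncertainty  = {total:.1f} x 10^-5")
print(f"naive significance = {significance:.1f} sigma")
```

    Under these assumptions the naive ratio comes out above 5, in line with the exclusion of a constant alpha reported in the abstract.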

    Selected static foot assessments do not predict medial longitudinal arch motion during running

    Background: Static assessments of the foot are commonly advocated within the running community to classify the foot with a view to recommending the appropriate type of running shoe. The aim of this work was to determine whether selected static foot assessments could predict medial longitudinal arch (MLA) motion during running. Methods: Fifteen physically active males (27 ± 5 years, 1.77 ± 0.04 m, 80 ± 10 kg) participated in the study. Foot Posture Index (FPI-6), MLA angle and rearfoot angle were measured in a relaxed standing position. MLA motion was calculated using the position of retro-reflective markers tracked by a VICON motion analysis system, while participants ran barefoot on a treadmill at a self-selected pace (2.8 ± 0.5 m.s^-1). Bivariate linear regression was used to determine whether the static measures predicted MLA deformation and MLA angles at initial contact, midsupport and toe off. Results: All three foot classification measures were significant predictors of MLA angle at initial contact, midsupport and toe off (p < .05), explaining 41-90% of the variance. None of the static foot classification measures were significant predictors of MLA deformation during the stance phase of running. Conclusion: Selected static foot measures did not predict dynamic MLA deformation during running. Given that MLA deformation has theoretically been linked to running injuries, the clinical relevance of predicting MLA angle at discrete time points during the stance phase of running is questioned. These findings also question the validity of the selected static foot classification measures when looking to characterise the foot during running. This indicates that alternative means of assessing the foot to inform footwear selection are required.

    A Measurement of Rb using a Double Tagging Method

    The fraction of Z to bbbar events in hadronic Z decays has been measured by the OPAL experiment using the data collected at LEP between 1992 and 1995. The Z to bbbar decays were tagged using displaced secondary vertices, and high momentum electrons and muons. Systematic uncertainties were reduced by measuring the b-tagging efficiency using a double tagging technique. Efficiency correlations between opposite hemispheres of an event are small, and are well understood through comparisons between real and simulated data samples. A value of Rb = 0.2178 ± 0.0011 ± 0.0013 was obtained, where the first error is statistical and the second systematic. The uncertainty on Rc, the fraction of Z to ccbar events in hadronic Z decays, is not included in the errors. The dependence on Rc is Delta(Rb)/Rb = -0.056*Delta(Rc)/Rc, where Delta(Rc) is the deviation of Rc from the value 0.172 predicted by the Standard Model. The result for Rb agrees with the value of 0.2155 ± 0.0003 predicted by the Standard Model. Comment: 42 pages, LaTeX, 14 eps figures included, submitted to European Physical Journal
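
    The quoted dependence on Rc can be applied numerically, as in the sketch below. The function name `rb_shifted` is hypothetical, the statistical and systematic errors are combined in quadrature on the assumption that they are independent, and the 0.0003 uncertainty on the Standard Model prediction is ignored in the pull; none of these choices come from the paper itself.

```python
import math

# Numbers quoted in the abstract.
RB_MEASURED = 0.2178
RB_STAT, RB_SYST = 0.0011, 0.0013
RB_SM = 0.2155   # Standard Model prediction for Rb
RC_SM = 0.172    # Standard Model prediction for Rc
SLOPE = -0.056   # Delta(Rb)/Rb = SLOPE * Delta(Rc)/Rc


def rb_shifted(rc):
    """Return the measured Rb adjusted for an assumed value of Rc (hypothetical helper)."""
    rel_rc = (rc - RC_SM) / RC_SM
    return RB_MEASURED * (1.0 + SLOPE * rel_rc)


# Assumption: independent errors combined in quadrature; SM uncertainty ignored.
total_err = math.hypot(RB_STAT, RB_SYST)
pull = (RB_MEASURED - RB_SM) / total_err

print(f"Rb if Rc is 5% above its SM value: {rb_shifted(RC_SM * 1.05):.4f}")
print(f"total uncertainty on Rb: {total_err:.4f}")
print(f"pull with respect to the SM value: {pull:.2f} sigma")
```

    With these inputs the measurement sits between one and two combined standard deviations above the prediction, which fits the abstract's statement that the two agree.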

    Measurement of the B+ and B0 lifetimes and search for CP(T) violation using reconstructed secondary vertices

    The lifetimes of the B+ and B0 mesons, and their ratio, have been measured in the OPAL experiment using 2.4 million hadronic Z0 decays recorded at LEP. Z0 -> b bbar decays were tagged using displaced secondary vertices and high momentum electrons and muons. The lifetimes were then measured using well-reconstructed charged and neutral secondary vertices selected in this tagged data sample. The results are tau(B+) = 1.643 ± 0.037 ± 0.025 ps, tau(B0) = 1.523 ± 0.057 ± 0.053 ps, and tau(B+)/tau(B0) = 1.079 ± 0.064 ± 0.041, where in each case the first error is statistical and the second systematic. A larger data sample of 3.1 million hadronic Z0 decays has been used to search for CP and CPT violating effects by comparison of inclusive b and bbar hadron decays. No evidence for such effects is seen. The CP violation parameter Re(epsilon_B) is measured to be Re(epsilon_B) = 0.001 ± 0.014 ± 0.003, and the fractional difference between b and bbar hadron lifetimes is measured to be (Delta tau/tau)_b = [tau(b hadron) - tau(bbar hadron)]/tau(average) = -0.001 ± 0.012 ± 0.008.
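
    A quick consistency check on the quoted lifetimes is to divide the two central values and compare the result with the quoted ratio. The sketch below does only that, plus a quadrature sum of each lifetime's two errors under the assumption that they are independent. It does not attempt to reproduce the uncertainties quoted on the ratio, which come from the paper's own analysis.

```python
import math

# Central values and (statistical, systematic) errors from the abstract, in ps.
tau_bplus, err_bplus = 1.643, (0.037, 0.025)
tau_bzero, err_bzero = 1.523, (0.057, 0.053)

# Ratio of central values, to compare with the quoted 1.079.
ratio = tau_bplus / tau_bzero

# Assumption: statistical and systematic errors are independent.
total_bplus = math.hypot(*err_bplus)
total_bzero = math.hypot(*err_bzero)

print(f"tau(B+)/tau(B0) from central values: {ratio:.3f}")
print(f"tau(B+) total uncertainty: {total_bplus:.3f} ps")
print(f"tau(B0) total uncertainty: {total_bzero:.3f} ps")
```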

    Metal-enhanced fluorescence of colloidal nanocrystals with nanoscale control

    Engineering the spectral properties of fluorophores, such as the enhancement of luminescence intensity, can be achieved through coupling with surface plasmons in metallic nanostructures. This process, referred to as metal-enhanced fluorescence, offers promise for a range of applications, including LEDs, sensor technology, microarrays and single-molecule studies. It becomes even more appealing when applied to colloidal semiconductor nanocrystals, which exhibit size-dependent optical properties, have high photochemical stability, and are characterized by broad excitation spectra and narrow emission bands. Other approaches have relied upon the coupling of fluorophores (typically organic dyes) to random distributions of metallic nanoparticles or nanoscale roughness in metallic films. Here, we develop a new strategy based on the highly reproducible fabrication of ordered arrays of gold nanostructures coupled to CdSe/ZnS nanocrystals dispersed in a polymer blend. We demonstrate the possibility of obtaining precise control and a high spatial selectivity of the fluorescence enhancement process.

    Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015

    Summary. Background: The Global Burden of Diseases, Injuries, and Risk Factors Study 2015 provides an up-to-date synthesis of the evidence for risk factor exposure and the attributable burden of disease. By providing national and subnational assessments spanning the past 25 years, this study can inform debates on the importance of addressing risks in context. Methods: We used the comparative risk assessment framework developed for previous iterations of the Global Burden of Disease Study to estimate attributable deaths, disability-adjusted life-years (DALYs), and trends in exposure by age group, sex, year, and geography for 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks from 1990 to 2015. This study included 388 risk-outcome pairs that met World Cancer Research Fund-defined criteria for convincing or probable evidence. We extracted relative risk and exposure estimates from randomised controlled trials, cohorts, pooled cohorts, household surveys, census data, satellite data, and other sources. We used statistical models to pool data, adjust for bias, and incorporate covariates. We developed a metric that allows comparisons of exposure across risk factors: the summary exposure value. Using the counterfactual scenario of theoretical minimum risk level, we estimated the portion of deaths and DALYs that could be attributed to a given risk. We decomposed trends in attributable burden into contributions from population growth, population age structure, risk exposure, and risk-deleted cause-specific DALY rates. We characterised risk exposure in relation to a Socio-demographic Index (SDI). Findings: Between 1990 and 2015, global exposure to unsafe sanitation, household air pollution, childhood underweight, childhood stunting, and smoking each decreased by more than 25%. Global exposure for several occupational risks, high body-mass index (BMI), and drug use increased by more than 25% over the same period. 
All risks jointly evaluated in 2015 accounted for 57·8% (95% CI 56·6–58·8) of global deaths and 41·2% (39·8–42·8) of DALYs. In 2015, the ten largest contributors to global DALYs among Level 3 risks were high systolic blood pressure (211·8 million [192·7 million to 231·1 million] global DALYs), smoking (148·6 million [134·2 million to 163·1 million]), high fasting plasma glucose (143·1 million [125·1 million to 163·5 million]), high BMI (120·1 million [83·8 million to 158·4 million]), childhood undernutrition (113·3 million [103·9 million to 123·4 million]), ambient particulate matter (103·1 million [90·8 million to 115·1 million]), high total cholesterol (88·7 million [74·6 million to 105·7 million]), household air pollution (85·6 million [66·7 million to 106·1 million]), alcohol use (85·0 million [77·2 million to 93·0 million]), and diets high in sodium (83·0 million [49·3 million to 127·5 million]). From 1990 to 2015, attributable DALYs declined for micronutrient deficiencies, childhood undernutrition, unsafe sanitation and water, and household air pollution; reductions in risk-deleted DALY rates rather than reductions in exposure drove these declines. Rising exposure contributed to notable increases in attributable DALYs from high BMI, high fasting plasma glucose, occupational carcinogens, and drug use. Environmental risks and childhood undernutrition declined steadily with SDI; low physical activity, high BMI, and high fasting plasma glucose increased with SDI. In 119 countries, metabolic risks, such as high BMI and fasting plasma glucose, contributed the most attributable DALYs in 2015. Regionally, smoking still ranked among the leading five risk factors for attributable DALYs in 109 countries; childhood underweight and unsafe sex remained primary drivers of early death and disability in much of sub-Saharan Africa. Interpretation: Declines in some key environmental risks have contributed to declines in critical infectious diseases. 
Some risks appear to be invariant to SDI. Increasing risks, including high BMI, high fasting plasma glucose, drug use, and some occupational exposures, contribute to rising burden from some conditions, but also provide opportunities for intervention. Some highly preventable risks, such as smoking, remain major causes of attributable DALYs, even as exposure is declining. Public policy makers need to pay attention to the risks that are increasingly major contributors to global burden. Funding: Bill & Melinda Gates Foundation.