
    Entropy-based gene ranking without selection bias for the predictive classification of microarray data

    BACKGROUND: We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). RESULTS: With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. CONCLUSIONS: Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.
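The chunked elimination step mentioned in the abstract above can be illustrated with a rough sketch. The abstract does not give the actual E-RFE rule, so everything here is an assumption made for illustration: the helper names `weight_entropy` and `features_to_drop`, the 10-bin histogram, the entropy threshold of 1 bit, and the rule "drop the whole lowest bin when entropy is low, otherwise drop a single feature". It only shows how an entropy measure of a weight distribution could decide between removing one feature and removing many.

```python
import math

def weight_entropy(weights, n_bins=10):
    """Shannon entropy (in bits) of a histogram of the given weights."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return 0.0  # all weights identical: a single occupied bin
    counts = [0] * n_bins
    for w in weights:
        # Rescale to [0, 1] and clamp the maximum into the last bin.
        idx = min(int((w - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(weights)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def features_to_drop(weights, entropy_threshold=1.0, n_bins=10):
    """Indices to eliminate in one step (hypothetical rule, not the paper's).

    Low entropy means the absolute weights are concentrated in few bins,
    so the whole lowest bin is dropped at once; otherwise only the single
    smallest-weight feature is dropped, as in one-at-a-time RFE.
    """
    abs_w = [abs(w) for w in weights]
    lo, hi = min(abs_w), max(abs_w)
    if hi > lo and weight_entropy(abs_w, n_bins) < entropy_threshold:
        cutoff = lo + (hi - lo) / n_bins
        return [i for i, w in enumerate(abs_w) if w < cutoff]
    return [min(range(len(abs_w)), key=abs_w.__getitem__)]
```

In a full pipeline the weights would come from a linear SVM refitted after each elimination step; that part is omitted here.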

    Integrating gene expression profiling and clinical data

    We propose a combination of machine learning techniques to integrate predictive profiling from gene expression with clinical and epidemiological data. Starting from BioDCV, a complete software setup for predictive classification and feature ranking without selection bias, we apply semisupervised profiling for detecting outliers and deriving informative subtypes of patients. During the profiling process, sampletracking curves are extracted, and then clustered according to a distance derived from dynamic time warping. Sampletracking also allows the identification of outlier cases, whose removal is shown to improve predictive accuracy and stability of derived gene profiles. Here we propose to employ clinical features to validate the semisupervising procedure. The procedure is demonstrated in the analysis of a liver cancer dataset of 213 samples described by 1993 genes and by pathological features.
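For readers unfamiliar with dynamic time warping, the distance used to cluster curves in the abstract above can be sketched as follows. This is the textbook dynamic-programming form with an absolute-difference local cost; the function name `dtw_distance` is hypothetical, and the exact variant applied to sampletracking curves is not stated in the abstract.

```python
import math

def dtw_distance(a, b):
    """Textbook dynamic-time-warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]
```

Two curves with the same shape but different pacing, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, get distance 0, which is why the measure suits curves that evolve at different speeds.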

    Transmission dynamics of the ongoing chikungunya outbreak in Central Italy. From coastal areas to the metropolitan city of Rome, summer 2017

    A large chikungunya outbreak is ongoing in Italy, with a main cluster in the Anzio coastal municipality. With preliminary epidemiological data, and a transmission model using mosquito abundance and biting rates, we estimated the basic reproduction number R0 at 2.07 (95% credible interval: 1.47–2.59) and the first case importation between 21 May and 18 June 2017. Outbreak risk was higher in coastal/rural sites than urban ones. Novel transmission foci could occur up to mid-November.

    Constructing Metropolis-Hastings proposals using damped BFGS updates

    The computation of Bayesian estimates of system parameters, and of functions of them, on the basis of observed system performance data is a common problem within system identification. Stochastic simulation approaches to this problem have previously been examined using the popular Metropolis-Hastings (MH) algorithm. That prior work identified a recognised difficulty: tuning the proposal distribution so that the MH method provides realisations with sufficient mixing to deliver efficient convergence. This paper proposes and empirically examines a method of tuning the proposal using ideas borrowed from the numerical optimisation literature around efficient computation of Hessians, so that gradient and curvature information of the target posterior can be incorporated in the proposal. Comment: 16 pages, 2 figures. Accepted for publication in the Proceedings of the 18th IFAC Symposium on System Identification (SYSID
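To make the tuning difficulty concrete, here is a minimal sketch of plain random-walk Metropolis-Hastings in one dimension. It is not the damped BFGS proposal studied in the paper; the function name `random_walk_mh` and the hand-set `step` parameter are assumptions for illustration. The `step` value is exactly the kind of proposal scale that the paper aims to replace with gradient and curvature information.

```python
import math
import random

def random_walk_mh(log_target, x0, step, n_samples, seed=0):
    """Random-walk Metropolis-Hastings with a Gaussian proposal.

    `step` is a hand-tuned proposal standard deviation (an assumption here).
    Returns the chain and the fraction of accepted proposals.
    """
    rng = random.Random(seed)
    x, logp = x0, log_target(x0)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        candidate = x + rng.gauss(0.0, step)
        logp_candidate = log_target(candidate)
        # The proposal is symmetric, so the acceptance probability
        # reduces to the ratio of target densities, capped at 1.
        accept_prob = math.exp(min(0.0, logp_candidate - logp))
        if rng.random() < accept_prob:
            x, logp = candidate, logp_candidate
            accepted += 1
        samples.append(x)  # a rejected proposal repeats the current state
    return samples, accepted / n_samples
```

Calling `random_walk_mh(lambda x: -0.5 * x * x, 0.0, 2.4, 20000)` samples a standard normal target. A much smaller or much larger `step` still yields a valid chain, but one that mixes slowly, which is the problem a curvature-informed proposal is meant to address.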

    A combinatorial model of malware diffusion via Bluetooth connections

    We outline here the mathematical expression of a diffusion model for cellphone malware transmitted through Bluetooth channels. In particular, we provide the deterministic formula underlying the proposed infection model, in its equivalent recursive (simple but computationally heavy) and closed-form (more complex but efficiently computable) expressions. Comment: In press on PlosON

    Estimating measles transmission potential in Italy over the period 2010-2011

    Background. The recent history of measles epidemiology in Italy is characterized by the recurrence of spatially localized epidemics. Aim. In this study we investigate the three major outbreaks that occurred in Italy over the period 2010-2011 and estimate the measles transmission potential. The epidemics mainly involved individuals aged 10-28 years, and the transmission potential, measured as the effective reproduction number (i.e. the number of new infections generated by a primary infector), was estimated to be 1.9-5.9. Results. Despite such high values, we found that, in all investigated outbreaks, the reproduction number remained above the epidemic threshold for no more than twelve weeks, suggesting that measles is unlikely to give rise to new nationwide epidemics. Conclusion. The analysis highlights the need to plan additional vaccination programs targeting those age classes currently showing a higher susceptibility to infection, in order not to compromise the elimination goal by 201

    The 2014 Ebola virus disease outbreak in Pujehun, Sierra Leone: epidemiology and impact of interventions

    BACKGROUND: In July 2014, an outbreak of Ebola virus disease (EVD) started in Pujehun district, Sierra Leone. On January 10th, 2015, the district was the first to be declared Ebola-free by local authorities after 49 cases and a case fatality rate of 85.7 %. The Pujehun outbreak represents a valuable opportunity for improving the body of work on the transmission characteristics and effects of control interventions during the 2014–2015 EVD epidemic in West Africa. METHODS: By integrating hospital registers and contact tracing form data with healthcare worker and local population interviews, we reconstructed the transmission chain and investigated the key time periods of EVD transmission. The impact of intervention measures has been assessed using a microsimulation transmission model calibrated with the collected data. RESULTS: The mean incubation period was 9.7 days (range, 6–15). Hospitalization rate was 89 %. The mean time from the onset of symptoms to hospitalization was 4.5 days (range, 1–9). The mean serial interval was 13.7 days (range, 2–18). The distribution of the number of secondary cases (R0 = 1.63) was well fitted by a negative binomial distribution with dispersion parameter k = 0.45 (95 % CI, 0.19–1.32). Overall, 74.3 % of transmission events occurred between members of the same family or extended family, 17.9 % in the community, mainly between friends, and 7.7 % in hospital. The mean number of contacts investigated per EVD case rose from 11.5 in July to 25 in September 2014. In total, 43.0 % of cases were detected through contact investigation. Model simulations suggest that the most important factors determining the probability of disease elimination are the number of EVD beds, the mean time from symptom onset to isolation, and the mean number of contacts traced per case. By assuming levels and timing of interventions performed in Pujehun, the estimated probability of eliminating an otherwise large EVD outbreak is close to 100 %. CONCLUSIONS: Containment of EVD in Pujehun district is ascribable to both the natural history of the disease (mainly transmitted through physical contacts, long generation time, overdispersed distribution of secondary cases per single primary case) and intervention measures (isolation of cases and contact tracing), which in turn strongly depend on preparedness, population awareness, and compliance. Our findings are also essential to determine a successful ring vaccination strategy. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12916-015-0524-z) contains supplementary material, which is available to authorized users.
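The overdispersion reported in the abstract above can be made concrete with a small calculation. The sketch assumes the standard mean-and-dispersion parameterisation of the negative binomial, with mean R0 = 1.63 and dispersion k = 0.45 taken from the abstract; the function name `neg_binom_pmf` is hypothetical and the paper's own fitting code is not shown in the source.

```python
import math

def neg_binom_pmf(x, mean, k):
    """P(X = x) for a negative binomial with the given mean and dispersion k."""
    log_p = (
        math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
        + k * math.log(k / (k + mean))
        + x * math.log(mean / (k + mean))
    )
    return math.exp(log_p)

R0, K = 1.63, 0.45  # point estimates quoted in the abstract

# Probability that a case generates no secondary cases at all.
p_zero = neg_binom_pmf(0, R0, K)
```

Under these point estimates `p_zero` comes out at roughly one half, so about half of all cases would be expected to infect nobody while a minority account for most transmission. That matches the abstract's description of an overdispersed distribution of secondary cases.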