Supervised classification of combined copy number and gene expression data
Summary: In this paper we apply a predictive profiling method to genome copy number aberrations (CNA) in combination with gene expression and clinical data to identify molecular patterns of cancer pathophysiology. Predictive models and optimal feature lists for the platforms are developed by a complete validation SVM-based machine learning system. Ranked lists of genome CNA sites (assessed by comparative genomic hybridization arrays – aCGH) and of differentially expressed genes (assessed by microarray profiling with Affy HG-U133A chips) are computed and combined on a breast cancer dataset for the discrimination of Luminal/ER+ (Lum/ER+) and Basal-like/ER- classes. Different encodings are developed and applied to the CNA data, and predictive variable selection is discussed. We analyze the combination of profiling information between the platforms, also considering the pathophysiological data. A specific subset of patients is identified that responds differently to classification by chromosomal gains and losses than to classification by differentially expressed genes, corroborating the idea that genomic CNA can represent an independent source for tumor classification.
Two apples a day lower serum cholesterol and improve cardiometabolic biomarkers in mildly hypercholesterolemic adults: a randomized, controlled, crossover trial
Background: Apples are rich in bioactive polyphenols and fiber. Evidence suggests that consumption of apples, or their bioactive components, is associated with beneficial effects on lipid metabolism and other markers of cardiovascular disease (CVD). However, adequately powered randomized controlled trials are necessary to confirm these data and explore the mechanisms.
Objective: To determine the effects of apple consumption on circulating lipids, vascular function and other CVD risk markers.
Design: The trial was a randomized, controlled, crossover intervention study. Healthy, mildly hypercholesterolemic volunteers (23 women, 17 men), with a mean BMI (± SD) of 25.3 (± 3.7) kg/m² and age (± SD) of 51.4 (± 11) years, consumed 2 apples/day (Renetta Canada, rich in proanthocyanidins) or a sugar- and energy-matched apple control beverage (CB) for 8 weeks, with the two treatments separated by a 4-week washout period. Fasted blood was collected before and after each treatment. Serum lipids, glucose, insulin, bile acids, and endothelial and inflammation biomarkers were measured, in addition to microvascular reactivity (using laser Doppler imaging with iontophoresis) and arterial stiffness (using pulse wave analysis).
Results: Whole apple (WA) consumption decreased serum total cholesterol (WA: 5.89 mmol/l, CB: 6.11 mmol/l; P=0.006), LDL cholesterol (WA: 3.72 mmol/l, CB: 3.86 mmol/l; P=0.031), triacylglycerol (WA: 1.17 mmol/l, CB: 1.30 mmol/l; P=0.021) and intercellular cell adhesion molecule-1 (WA: 153.9 ng/ml, CB: 159.4 ng/ml; P=0.028), and increased serum uric acid (WA: 341.4 μmol/l, CB: 330 μmol/l; P=0.020) compared with the CB. The response to endothelium-dependent microvascular vasodilation was greater after the apples (WA: 853 perfusion units (PU), CB: 760 PU; P=0.037) compared with the CB. Apples had no effect on blood pressure or other CVD markers.
Conclusions: These data support beneficial hypocholesterolemic and vascular effects of the daily consumption of proanthocyanidin-rich apples by mildly hypercholesterolemic individuals.
Two-omics data revealed commonalities and differences between Rpv12- and Rpv3-mediated resistance in grapevine
Plasmopara viticola is the causal agent of grapevine downy mildew (DM). DM-resistant varieties deploy effector-triggered immunity (ETI) to inhibit pathogen growth, which is activated by major resistance loci, the most common of which are Rpv3 and Rpv12. We previously showed that a quick metabolome response lies behind the ETI conferred by Rpv3 TIR-NB-LRR genes. Here we used a grape variety operating Rpv12-mediated ETI, which is conferred by an independent locus containing CC-NB-LRR genes, to investigate the defence response using GC/MS, UPLC, UHPLC and RNA-Seq analyses. Eighty-eight metabolites showed significantly different concentrations and 432 genes showed differential expression between inoculated resistant leaves and controls. Most metabolite changes in sugars, fatty acids and phenols were similar in timing and direction to those observed in Rpv3-mediated ETI, but some of them were stronger or more persistent. Activators, elicitors and signal transducers for the formation of reactive oxygen species were observed early in samples undergoing Rpv12-mediated ETI and were paralleled and followed by the upregulation of genes belonging to ontology categories associated with salicylic acid signalling, signal transduction, WRKY transcription factors and synthesis of PR-1, PR-2 and PR-5 pathogenesis-related proteins.
Algebraic Comparison of Partial Lists in Bioinformatics
The outcome of a functional genomics pipeline is usually a partial list of
genomic features, ranked by their relevance in modelling biological phenotype
in terms of a classification or regression model. Due to resampling protocols
or just within a meta-analysis comparison, instead of one list it is often the
case that sets of alternative feature lists (possibly of different lengths) are
obtained. Here we introduce a method, based on the algebraic theory of
symmetric groups, for studying the variability between lists ("list stability")
in the case of lists of unequal length. We provide algorithms evaluating
stability for lists embedded in the full feature set or just limited to the
features occurring in the partial lists. The method is demonstrated first on
synthetic data in a gene filtering task and then for finding gene profiles on a
recent prostate cancer dataset.
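The idea of comparing ranked lists of unequal length can be illustrated with a small sketch. This is not the algebraic method of the paper: it is a simplified stand-in that uses a Canberra-type rank distance restricted to the features occurring in the lists, and it assumes that a feature missing from a list is given the rank just below that list's last entry. The function names and the gene labels are hypothetical.

```python
from itertools import combinations


def canberra_partial(list_a, list_b):
    """Canberra-type distance between two ranked lists of unequal length.

    Assumption (not from the paper): a feature absent from a list is
    assigned rank len(list) + 1, i.e. just below its last ranked feature.
    Only features occurring in at least one of the two lists are compared.
    """
    rank_a = {f: i + 1 for i, f in enumerate(list_a)}
    rank_b = {f: i + 1 for i, f in enumerate(list_b)}
    total = 0.0
    for feature in set(list_a) | set(list_b):
        ra = rank_a.get(feature, len(list_a) + 1)
        rb = rank_b.get(feature, len(list_b) + 1)
        total += abs(ra - rb) / (ra + rb)
    return total


def list_stability(lists):
    """Mean pairwise distance over a set of lists (lower = more stable)."""
    pairs = list(combinations(lists, 2))
    return sum(canberra_partial(a, b) for a, b in pairs) / len(pairs)


# Hypothetical feature lists, e.g. from three resampling runs.
runs = [
    ["g1", "g2", "g3", "g4"],
    ["g1", "g3", "g2"],
    ["g2", "g1"],
]
print(f"stability indicator: {list_stability(runs):.3f}")
```

Identical lists give a value of 0, and the value grows as the lists disagree on ranks or membership.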
Effect of Size and Heterogeneity of Samples on Biomarker Discovery: Synthetic and Real Data Assessment
MOTIVATION:
The identification of robust lists of molecular biomarkers related to a disease is a fundamental step for early diagnosis and treatment. However, methodologies for the discovery of biomarkers using microarray data often provide results with limited overlap. These differences are attributable to 1) dataset size (few subjects with respect to the number of features); 2) heterogeneity of the disease; 3) heterogeneity of experimental protocols and computational pipelines employed in the analysis. In this paper, we focus on the first two issues and assess, both on simulated (through an in silico regulation network model) and real clinical datasets, the consistency of candidate biomarkers provided by a number of different methods.
METHODS:
We extensively simulated the effect of heterogeneity characteristic of complex diseases on different sets of microarray data. Heterogeneity was reproduced by simulating both intrinsic variability of the population and the alteration of regulatory mechanisms. Population variability was simulated by modeling evolution of a pool of subjects; then, a subset of them underwent alterations in regulatory mechanisms so as to mimic the disease state.
RESULTS:
The simulated data allowed us to outline advantages and drawbacks of different methods across multiple studies and varying numbers of samples, and to evaluate the precision of feature selection on a benchmark with known biomarkers. Although comparable classification accuracy was reached by different methods, the use of external cross-validation loops is helpful in finding features with a higher degree of precision and stability. Application to real data confirmed these results.
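What an external cross-validation loop means in practice can be sketched as follows: feature ranking is repeated inside each training split, so the held-out samples never influence which features are chosen. Everything in this sketch is an assumption made for illustration and is not taken from the paper: the signal-to-noise score, the nearest-centroid classifier, the toy data generator and all function names.

```python
import random
from statistics import mean, stdev


def s2n_score(pos, neg):
    """Signal-to-noise score: |mean difference| over summed class spreads."""
    return abs(mean(pos) - mean(neg)) / (stdev(pos) + stdev(neg) + 1e-12)


def rank_features(X, y, k):
    """Return the indices of the k top-scoring features, best first."""
    scored = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scored.append((s2n_score(pos, neg), j))
    scored.sort(reverse=True)
    return [j for _, j in scored[:k]]


def nearest_centroid(X_tr, y_tr, X_te, feats):
    """Predict each test row's label from the closer class centroid."""
    centroids = {}
    for label in (0, 1):
        rows = [r for r, l in zip(X_tr, y_tr) if l == label]
        centroids[label] = [mean(r[j] for r in rows) for j in feats]
    preds = []
    for r in X_te:
        dist = {l: sum((r[j] - c) ** 2 for j, c in zip(feats, cen))
                for l, cen in centroids.items()}
        preds.append(min(dist, key=dist.get))
    return preds


def external_cv(X, y, k, n_folds=5, seed=0):
    """Outer CV loop with feature ranking redone on each training split."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    accuracies, selected = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        feats = rank_features(X_tr, y_tr, k)  # held-out rows are not used
        preds = nearest_centroid(X_tr, y_tr, [X[i] for i in fold], feats)
        hits = sum(p == y[i] for p, i in zip(preds, fold))
        accuracies.append(hits / len(fold))
        selected.append(feats)
    return mean(accuracies), selected


def make_toy_data(n_per_class=20, n_features=30, shift=4.0, seed=1):
    """Hypothetical data: only features 0 and 1 differ between classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                row[0] += shift
                row[1] += shift
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
accuracy, selected = external_cv(X, y, k=2)
print(f"external CV accuracy: {accuracy:.2f}")
print("features chosen per fold:", selected)
```

Comparing the per-fold feature lists gives a rough view of selection stability, while the averaged accuracy is not inflated by selecting features on the full dataset beforehand.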
Performance of the ATLAS electromagnetic calorimeter end-cap module 0
The construction and beam test results of the ATLAS electromagnetic end-cap calorimeter pre-production module 0 are presented. The stochastic term of the energy resolution is between 10% GeV^1/2 and 12.5% GeV^1/2 over the full pseudorapidity range. Position and angular resolutions are found to be in agreement with simulation. A global constant term of 0.6% is obtained in the pseudorapidity range 2.5 < η < 3.2 (inner wheel).
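For orientation, the stochastic and constant terms quoted above are usually read against the standard calorimeter resolution parametrization shown below. The abstract does not state the formula, so this form, the symbols a, b and c, and the 100 GeV example energy are assumptions made for illustration.

```latex
% Standard parametrization (assumed; not stated in the abstract).
% a: stochastic term, b: noise term, c: constant term,
% \oplus: sum in quadrature.
\frac{\sigma_E}{E} = \frac{a}{\sqrt{E}} \oplus \frac{b}{E} \oplus c

% Illustrative arithmetic, neglecting the noise term b, with
% a = 10\%\,\mathrm{GeV}^{1/2}, c = 0.6\% and an assumed E = 100\,\mathrm{GeV}:
\frac{\sigma_E}{E} \approx \sqrt{\left(\frac{10\%}{\sqrt{100}}\right)^{2} + (0.6\%)^{2}}
                   = \sqrt{(1.0\%)^{2} + (0.6\%)^{2}} \approx 1.17\%
```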
Experimental shear testing of timber-masonry dry connections for the seismic retrofit of unreinforced masonry shear walls
The mechanical coupling of timber products to the masonry walls of unreinforced masonry (URM) buildings
is generating considerable interest in terms of seismic vulnerability mitigation. An extensive experimental
investigation on timber panel to masonry wall connections realised with screw anchor fasteners
is presented. A total of 64 shear tests under monotonic, cyclic and semi-cyclic loading conditions were
performed on site in a historic URM building. The examined parameters were: masonry type, timber
panel product and material, load-to-grain direction, fastener geometry and steel grade. The outcomes
of the campaign are then reported and discussed focusing on the strength and stiffness properties and
on the dissipation capacity and residual strength of the connection under cyclic load. Moreover, a lognormal
distribution fitting is proposed for the maximum load and slip modulus measurements of all
the cyclic test configurations analysed. Finally, the principal experimental observations are listed along
with recommendations for future work or use in practice.
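A lognormal fit of the kind mentioned above can be sketched by estimating the mean and standard deviation of the log-transformed measurements. The load values below are hypothetical placeholders, not data from the test campaign, and reading off a 5th percentile is an illustrative choice that the abstract does not mention. The function names are also hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal(samples):
    """Estimate (mu, sigma) of ln(x) from positive samples.

    Uses the sample standard deviation (n - 1 denominator), which differs
    slightly from the maximum-likelihood estimate for small samples.
    """
    logs = [math.log(x) for x in samples]
    return mean(logs), stdev(logs)


def lognormal_quantile(mu, sigma, p):
    """Quantile of the fitted lognormal at probability p."""
    return math.exp(NormalDist(mu, sigma).inv_cdf(p))


# Hypothetical maximum loads in kN (placeholders for illustration only).
max_loads = [4.1, 5.3, 4.8, 6.0, 5.1, 4.5]

mu, sigma = fit_lognormal(max_loads)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
print(f"median load: {lognormal_quantile(mu, sigma, 0.5):.2f} kN")
print(f"5th percentile: {lognormal_quantile(mu, sigma, 0.05):.2f} kN")
```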
Machine learning methods for predictive proteomics
The search for predictive biomarkers of disease from high-throughput mass spectrometry (MS) data requires a complex analysis path. Preprocessing and machine-learning modules are pipelined, starting from raw spectra, to set up a predictive classifier based on a shortlist of candidate features. As a machine-learning problem, proteomic profiling on MS data requires the same caution as the microarray case. The risk of overfitting and of selection bias effects is pervasive: not only do potential features easily outnumber samples by a factor of 10^3, but it is also easy to neglect information-leakage effects during preprocessing from spectra to peaks. The aim of this review is to explain how to build a general purpose design analysis protocol (DAP) for predictive proteomic profiling: we show how to limit leakage due to parameter tuning and how to organize classification and ranking on large numbers of replicate versions of the original data to avoid selection bias. The DAP can be used with alternative components, i.e. with different preprocessing methods (peak clustering or wavelet based), classifiers (e.g. Support Vector Machines, SVM) or feature ranking methods (recursive feature elimination, RFE, or I-Relief). A procedure for assessing the stability and predictive value of the resulting biomarker list is also provided. The approach is exemplified with experiments on synthetic datasets (from the Cromwell MS simulator) and with publicly available datasets from cancer studies.