
    Critical evaluation of assessor difference correction approaches in sensory analysis

    In sensory data analysis, assessor-dependent scaling effects may hinder the analysis of product differences. Romano et al. (2008) compared several approaches for reducing scaling differences between assessors by their ability to maximise the product-effect F-values in a mixed ANOVA. Their study on a sensory dataset of 14 cheese samples assessed by 12 assessors on a continuous scale showed that some of these approaches apparently improved the F-value of the product effect. However, this direct comparison is only legitimate if the F-values originate from the same null distribution. To obtain the null distributions of the different correction methods, we employed a permutation approach on the same cheese dataset used by Romano et al. (2008), as well as a random-noise simulation approach. Based on the empirically obtained null distributions, we calculated the corrected product-effect significance to directly compare the performance of the preprocessing methods. Our results show that the null distributions of some preprocessing methods do not correspond to the expected F-distribution. In particular, for the ten Berge method, the null distribution is shifted towards higher F-values. An observed increase of the product-effect F-value relative to the F-value on raw data therefore does not necessarily imply increased product-effect significance, and p-values calculated from such inflated F-values may overestimate significance. In contrast, calculating p-values directly from the empirical null distributions obtained by permutation provides a common ground for properly comparing method performance. Moreover, we show that differences in reproducibility between assessors, as they exist in real-world sensory datasets, may lead to overestimation of product-effect significance by the mixed assessor model (MAM).
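    The core of this permutation approach is easy to sketch. The Python fragment below is a minimal sketch, not the authors' implementation: it assumes a long-format pandas DataFrame with columns assessor, product and score, and uses a plain one-way ANOVA F-value as a stand-in for the mixed-ANOVA product-effect F. Product labels are permuted within each assessor, and the preprocessing is re-applied to every permuted dataset, so any F-inflation caused by the method shows up in the empirical null distribution.

```python
import numpy as np
import pandas as pd
from scipy.stats import f_oneway

def product_f(df):
    """F-value for the product effect (one-way ANOVA across products)."""
    groups = [g["score"].to_numpy() for _, g in df.groupby("product")]
    return f_oneway(*groups).statistic

def permutation_p_value(df, preprocess, n_perm=1000, seed=0):
    """Empirical p-value of the product effect after a given preprocessing.

    `preprocess` maps the raw DataFrame to a corrected one. Product labels
    are permuted within each assessor, preserving the assessor structure
    while destroying the product effect.
    """
    rng = np.random.default_rng(seed)
    f_obs = product_f(preprocess(df))
    null_f = np.empty(n_perm)
    for i in range(n_perm):
        perm = df.copy()
        perm["product"] = (
            perm.groupby("assessor")["product"]
                .transform(lambda s: rng.permutation(s.to_numpy()))
        )
        # Re-applying the preprocessing here is what lets the null
        # distribution reflect method-specific F-inflation.
        null_f[i] = product_f(preprocess(perm))
    return f_obs, (np.sum(null_f >= f_obs) + 1) / (n_perm + 1)

# Example correction method: assessor-wise standardization.
standardize = lambda d: d.assign(
    score=d.groupby("assessor")["score"].transform(
        lambda s: (s - s.mean()) / s.std(ddof=1)
    )
)
```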

    Metabolic network discovery through reverse engineering of metabolome data

    Reverse engineering of high-throughput omics data to infer underlying biological networks is one of the challenges in systems biology. However, applications in the field of metabolomics are rather limited. We have focused on a systematic analysis of metabolic network inference from in silico metabolome data based on statistical similarity measures. Three data types based on biological/environmental variability around steady state were analyzed to compare their relative information content for inferring the network. Comparing the inference power of different similarity scores indicated the clear superiority of conditioning- or pruning-based scores, as they are able to eliminate indirect interactions. We also show that a mathematical measure based on the Fisher information matrix gives clues about how well different data types represent the underlying metabolic network topology. Results on several datasets of increasing complexity consistently show that metabolic variations observed at steady state, the simplest experimental analysis, are already informative enough to reveal the connectivity of the underlying metabolic network with a low false-positive rate, provided that proper similarity-score approaches are employed. For experimental situations, this implies that a single organism under slightly varying conditions may already generate enough information to correctly infer networks. Detailed examination of the interaction strengths in the underlying metabolic networks demonstrates that the edges that cannot be captured by similarity scores mainly belong to metabolites connected with weak interaction strengths.
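    To make the distinction between plain and conditioning-based similarity scores concrete, here is a small Python sketch contrasting Pearson correlation with partial correlation derived from the precision matrix; the latter can condition away indirect metabolite-metabolite associations. The data layout (samples by metabolites) and the thresholds are illustrative assumptions, not the scores evaluated in the study.

```python
import numpy as np

def correlation_network(X, threshold=0.5):
    """Edges from absolute Pearson correlation above a threshold."""
    r = np.corrcoef(X, rowvar=False)
    np.fill_diagonal(r, 0.0)
    return np.abs(r) > threshold

def partial_correlation_network(X, threshold=0.2):
    """Edges from partial correlations, conditioning on all other metabolites."""
    prec = np.linalg.pinv(np.cov(X, rowvar=False))  # precision matrix
    d = np.sqrt(np.diag(prec))
    pcor = -prec / np.outer(d, d)                   # standardize to [-1, 1]
    np.fill_diagonal(pcor, 0.0)
    return np.abs(pcor) > threshold

# If A-B and B-C are true interactions, plain correlation often also links
# A and C (an indirect edge); partial correlation conditions it away.
```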

    Metabolomics variable selection and classification in the presence of observations below the detection limit using an extension of ERp

    A compressed folder (XERp Software.zip) containing the Matlab scripts to perform XERp, as well as an example application (ZIP, 11 kb).

    Divide et impera: How disentangling common and distinctive variability in multiset data analysis can aid industrial process troubleshooting and understanding

    The possibility of addressing the problem of process troubleshooting and understanding by modelling common and distinctive sources of variation (factors or components) underlying two sets of measurements was explored in a real-world industrial case study. The strategy used includes a novel approach to systematically detect the number of common and distinctive components. An extension of this strategy to the analysis of a larger number of data blocks, which allows the comparison of data from multiple processing units, is also discussed.

    Funding: Spanish Ministry of Economy and Competitiveness, Grant/Award Number DPI2017-82896-C2-1-R.

    Citation: Vitale, R.; Noord, O. E. D.; Westerhuis, J. A.; Smilde, A. K.; Ferrer, A. (2021). Divide et impera: How disentangling common and distinctive variability in multiset data analysis can aid industrial process troubleshooting and understanding. Journal of Chemometrics, 35(2), 1-12. https://doi.org/10.1002/cem.3266
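    One simple way to see what "common versus distinctive" means in practice is to compare the dominant subspaces of the two measurement blocks. The sketch below computes principal angles between block score subspaces as a rough diagnostic; it is a hypothetical stand-in for the detection approach proposed in the paper, assuming two blocks measured on the same observations.

```python
import numpy as np

def principal_angles(X1, X2, n_comp=3):
    """Principal angles (degrees) between rank-n_comp subspaces of two blocks.

    Both blocks must share the same rows (observations); each block is
    column-centered before the SVD.
    """
    U1 = np.linalg.svd(X1 - X1.mean(axis=0), full_matrices=False)[0][:, :n_comp]
    U2 = np.linalg.svd(X2 - X2.mean(axis=0), full_matrices=False)[0][:, :n_comp]
    sv = np.linalg.svd(U1.T @ U2, compute_uv=False)
    return np.degrees(np.arccos(np.clip(sv, -1.0, 1.0)))

# Angles near 0 deg suggest common components shared by the two blocks;
# angles near 90 deg suggest distinctive variation in each block.
```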

    Lipidomic Response to Coffee Consumption

    Coffee is widely consumed and contains many bioactive compounds, any of which may impact pathways related to disease development. Our objective was to identify individual lipid changes in response to coffee drinking. We profiled the lipidome of fasting serum samples collected from a previously reported single-blind, three-stage clinical trial. Forty-seven habitual coffee consumers refrained from drinking coffee for 1 month, consumed 4 cups of coffee/day in the second month and 8 cups/day in the third month. Samples collected after each coffee stage were subjected to quantitative lipidomic profiling using ion-mobility spectrometry-mass spectrometry. A total of 853 lipid species mapping to 14 lipid classes were included for univariate analysis. Three lysophosphatidylcholine (LPC) species, including LPC (20:4), LPC (22:1) and LPC (22:2), significantly decreased after coffee intake (p < 0.05); 58 of these decreased after coffee intake. In conclusion, coffee intake leads to lower levels of specific LPC species, with potential impacts on glycerophospholipid metabolism more generally.
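    The univariate step can be sketched as follows: each of the 853 lipid species is tested for a change between stages, with multiplicity control across species. The data layout (subjects-by-lipids arrays for two stages), the paired t-test and the Benjamini-Hochberg correction are illustrative assumptions rather than the trial's exact statistical model.

```python
import numpy as np
from scipy.stats import ttest_rel
from statsmodels.stats.multitest import multipletests

def lipid_changes(no_coffee, coffee, alpha=0.05):
    """Per-lipid paired tests; rows are subjects, columns are lipid species."""
    res = ttest_rel(coffee, no_coffee, axis=0)
    reject, p_adj, _, _ = multipletests(res.pvalue, alpha=alpha, method="fdr_bh")
    direction = np.sign(np.mean(coffee - no_coffee, axis=0))
    return reject, p_adj, direction  # significant lipids and change direction
```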

    Variable selection and validation in multivariate modelling

    Motivation: Validation of variable selection and predictive performance is crucial in the construction of robust multivariate models that generalize well, minimize overfitting and facilitate interpretation of results. Inappropriate variable selection instead leads to selection bias, thereby increasing the risk of model overfitting and false-positive discoveries. Although several algorithms exist to identify a minimal set of the most informative variables (i.e. the minimal-optimal problem), few can select all variables related to the research question (i.e. the all-relevant problem). Robust algorithms combining identification of both minimal-optimal and all-relevant variables with proper cross-validation are urgently needed.

    Results: We developed the MUVR algorithm to improve predictive performance and minimize overfitting and false positives in multivariate analysis. In the MUVR algorithm, minimal variable selection is achieved by performing recursive variable elimination in a repeated double cross-validation (rdCV) procedure. The algorithm supports partial least squares and random forest modelling, and simultaneously identifies minimal-optimal and all-relevant variable sets for regression, classification and multilevel analyses. Using three authentic omics datasets, MUVR yielded parsimonious models with minimal overfitting and improved model performance compared with state-of-the-art rdCV. Moreover, MUVR showed advantages over other variable selection algorithms, i.e. Boruta and VSURF, including a simultaneous variable selection and validation scheme and wider applicability.
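    The elimination logic at the heart of MUVR can be caricatured in a few lines. The sketch below (Python with scikit-learn, whereas MUVR itself is an R package) uses a single cross-validation loop instead of the full repeated double cross-validation, and recovers both a minimal-optimal and an all-relevant variable set from the resulting performance curve; the tolerance and drop fraction are illustrative parameters.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def recursive_elimination(X, y, drop_frac=0.2, cv=5, tol=0.01, seed=0):
    """Repeatedly drop the least important variables; track CV performance."""
    keep = np.arange(X.shape[1])
    curve = []  # (variable indices, mean CV accuracy) per elimination step
    while len(keep) >= 2:
        model = RandomForestClassifier(n_estimators=200, random_state=seed)
        score = cross_val_score(model, X[:, keep], y, cv=cv).mean()
        model.fit(X[:, keep], y)
        curve.append((keep.copy(), score))
        n_drop = max(1, int(drop_frac * len(keep)))
        order = np.argsort(model.feature_importances_)
        keep = keep[order[n_drop:]]  # discard the least important variables
    # Minimal-optimal: fewest variables within tolerance of the best score;
    # all-relevant: most variables within the same tolerance.
    best = max(s for _, s in curve)
    good = [(v, s) for v, s in curve if s >= best - tol]
    minimal = min(good, key=lambda t: len(t[0]))[0]
    all_relevant = max(good, key=lambda t: len(t[0]))[0]
    return minimal, all_relevant
```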

    Heterofusion: Fusing genomics data of different measurement scales

    In systems biology, it is becoming increasingly common to measure biochemical entities at different levels of the same biological system, so data fusion problems are abundant in the life sciences. With the availability of a multitude of measurement techniques, one of the central problems is the heterogeneity of the data. In this paper, we discuss a specific form of heterogeneity, namely measurements obtained at different measurement scales, such as binary, ordinal, interval, and ratio-scaled variables. Three generic fusion approaches are presented, of which two are new to the systems biology community. The methods are presented, put in context, and illustrated with a real-life genomics example.
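    As a flavour of what such fusion involves, the sketch below encodes each block according to its measurement scale, gives the blocks equal weight, and factors the concatenated matrix jointly. The per-scale encodings (ranks for ordinal variables, centering only for binary ones) are simple assumptions for illustration and are not necessarily the three approaches discussed in the paper.

```python
import numpy as np
from scipy.stats import rankdata

def encode_block(X, scale_type):
    """Encode one data block according to its measurement scale."""
    X = np.asarray(X, dtype=float)
    if scale_type == "ordinal":
        X = np.apply_along_axis(rankdata, 0, X)   # replace levels by ranks
    X = X - X.mean(axis=0)                        # center all variables
    if scale_type in ("interval", "ratio", "ordinal"):
        sd = X.std(axis=0, ddof=1)
        X = X / np.where(sd == 0, 1.0, sd)        # unit variance per variable
    return X / np.linalg.norm(X)                  # equal weight per block

def fuse(blocks, scale_types, n_comp=2):
    """Joint factorization of heterogeneous blocks sharing the same rows."""
    Z = np.hstack([encode_block(X, t) for X, t in zip(blocks, scale_types)])
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_comp] * s[:n_comp]             # common sample scores
```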