Biometrical tools for heterosis research
Molecular biological technologies are frequently applied in heterosis research. They generate large datasets, which are usually analyzed with linear models or linear mixed models. Both types of model make a number of assumptions, and it is important to ensure that the underlying theory applies to the dataset at hand. Simultaneous violation of the normality and homoscedasticity assumptions in the linear model can produce highly misleading results for the associated t- and F-tests. Linear mixed models assume multivariate normality of the random effects and errors; these distributional assumptions enable (restricted) maximum likelihood procedures for estimating variance components, and their violation yields unreliable and thus potentially misleading results. A simulation-based approach for the residual analysis of linear models is introduced and then extended to linear mixed models. Based on the simulation results, the concept of simultaneous tolerance bounds is developed, which facilitates the assessment of various diagnostic plots. This is exemplified by applying the approach to the residual analysis of several datasets and comparing the results to those of other authors. The approach is also shown to be beneficial when applied to formal significance tests, which may likewise be used to assess model assumptions; this is supported by a simulation study in which various non-normal distributions were used to generate data for experimental designs of varying complexity. For linear mixed models, where studentized residuals are not pivotal quantities as they are for linear models, a simulation study assesses whether the empirical error rate under the null hypothesis complies with the nominal error rate.
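The core idea behind simulation-based diagnostic bounds can be sketched in a few lines: simulate many residual vectors under the assumed model, record the order statistics, and use their quantiles as an envelope for the observed QQ-plot. This is an illustrative sketch only (pointwise envelopes under an assumed standard-normal null); the thesis constructs *simultaneous* tolerance bounds, which additionally calibrate the pointwise level so that the joint coverage is controlled. All names and parameters here are hypothetical.

```python
import numpy as np

def qq_envelope(residuals, n_sim=2000, alpha=0.05, seed=None):
    """Build pointwise simulation envelopes for a normal QQ-plot.

    Each simulated dataset is sorted, so row k of `sims` holds n_sim
    draws of the k-th order statistic; its quantiles bound where an
    ordered residual should fall if the normality assumption holds.
    """
    rng = np.random.default_rng(seed)
    n = len(residuals)
    sims = np.sort(rng.standard_normal((n_sim, n)), axis=1)
    lower = np.quantile(sims, alpha / 2, axis=0)
    upper = np.quantile(sims, 1 - alpha / 2, axis=0)
    return np.sort(residuals), lower, upper

# Hypothetical usage: studentized residuals from some fitted model.
obs, lo, hi = qq_envelope(np.random.default_rng(1).standard_normal(50))
outside = int(np.sum((obs < lo) | (obs > hi)))  # points breaching the envelope
```

Points falling outside the envelope flag a potential violation of the distributional assumption; simultaneous bounds make that call with a controlled overall error rate.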
Furthermore, a novel step within the preprocessing pipeline for two-color cDNA microarray data is introduced: spatial smoothing of microarray background intensities. It is investigated whether anisotropic correlation models need to be employed or whether isotropic models are sufficient. A self-versus-self dataset with superimposed sets of simulated, differentially expressed genes is used to demonstrate several beneficial features of background smoothing. In combination with background correction algorithms that avoid negative intensities and have already been shown to be superior, this additional step increases the power to detect differentially expressed genes, lowers the number of false positive results, and increases the accuracy of estimated fold changes.
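The effect of spatial background smoothing can be caricatured with an isotropic Gaussian filter on a grid of per-spot background values. The grid, noise level, and bandwidth below are invented for illustration; the actual work fits correlation models (isotropic versus anisotropic) rather than applying a fixed filter.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Hypothetical grid of per-spot background intensities (rows x columns
# of the array); a real slide has thousands of spots.
rng = np.random.default_rng(0)
true_bg = np.outer(np.linspace(100, 200, 40), np.linspace(1.0, 1.2, 40))
observed_bg = true_bg + rng.normal(0, 25, size=true_bg.shape)

# Isotropic spatial smoothing: one bandwidth for both directions.
# An anisotropic model would use sigma=(s_row, s_col) with s_row != s_col.
smoothed_bg = gaussian_filter(observed_bg, sigma=2.0)

# Smoothing should bring the estimate closer to the true spatial trend.
rmse_raw = float(np.sqrt(np.mean((observed_bg - true_bg) ** 2)))
rmse_smooth = float(np.sqrt(np.mean((smoothed_bg - true_bg) ** 2)))
```

Because the true background varies smoothly across the slide while the measurement noise does not, averaging over neighboring spots reduces the error of the background estimate, which is what makes the downstream background correction more accurate.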
refineR: A Novel Algorithm for Reference Interval Estimation from Real-World Data
Reference intervals are essential for the interpretation of laboratory test results in medicine. We propose a novel indirect approach to estimate reference intervals from real-world data as an alternative to direct methods, which require samples from healthy individuals. The presented refineR algorithm separates the non-pathological distribution from the pathological distribution of observed test results using an inverse approach and identifies the model that best explains the non-pathological distribution. To evaluate its performance, we simulated test results for six common laboratory analytes with varying locations and fractions of pathological test results. Estimated reference intervals were compared to the ground truth, an alternative indirect method (kosmic), and the direct method (N = 120 and N = 400 samples). Overall, refineR achieved the lowest mean percentage error of all methods (2.77%). Considering the proportion of reference intervals within ±1 total error deviation from the ground truth, refineR (82.5%) was inferior to the direct method with N = 400 samples (90.1%), but outperformed kosmic (70.8%) and the direct method with N = 120 samples (67.4%). Additionally, reference intervals estimated from pediatric data were comparable to published direct method studies. In conclusion, the refineR algorithm enables precise estimation of reference intervals from real-world data and represents a viable complement to the direct method.
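The "inverse" idea of indirect estimation can be illustrated very crudely: in routine data where most results are non-pathological, robust location and scale estimates resist the pathological tail, and the central 95% of the fitted non-pathological distribution yields the reference interval. The analyte, mixture proportions, and estimator below are all hypothetical; refineR itself is far more elaborate (Box-Cox transformed normal models and a search for the best-fitting model), so this is a sketch of the concept, not the algorithm.

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical routine data: 90% non-pathological, 10% pathological (shifted high).
healthy = rng.normal(140, 10, size=9000)
pathological = rng.normal(180, 15, size=1000)
data = np.concatenate([healthy, pathological])

# Robust bulk fit: median and MAD are barely moved by the pathological tail.
loc = float(np.median(data))
scale = 1.4826 * float(np.median(np.abs(data - loc)))  # MAD -> sigma under normality

# Central 95% of the fitted normal as the estimated reference interval.
ri_lower, ri_upper = loc - 1.96 * scale, loc + 1.96 * scale
# True healthy 95% interval is roughly (120.4, 159.6).
```

Even this naive estimator lands near the true healthy interval despite 10% contamination, which conveys why estimation from mixed real-world data is feasible at all; the quality of the separation model is what distinguishes methods like refineR and kosmic.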
Clinical risk assessment of biotin interference with a high-sensitivity cardiac troponin T assay.
Objectives: Biotin >20.0 ng/mL (81.8 nmol/L) can reduce Elecsys® Troponin T Gen 5 (TnT Gen 5; Roche Diagnostics) assay recovery, potentially leading to false-negative results in patients with suspected acute myocardial infarction (AMI). We aimed to determine the prevalence of elevated biotin and the AMI misclassification risk from biotin interference with the TnT Gen 5 assay. Methods: Biotin was measured using an Elecsys assay in two cohorts: (i) 797 0-h and 646 3-h samples from 850 US emergency department patients with suspected acute coronary syndrome (ACS); (ii) 2023 random samples from a US laboratory network, in which biotin distributions were extrapolated for higher values using pharmacokinetic modeling. Biotin >20.0 ng/mL (81.8 nmol/L) prevalence and biotin 99th percentile values were calculated. AMI misclassification risk due to biotin interference with the TnT Gen 5 assay was modeled using different assay cutoffs and test timepoints. Results: ACS cohort: 1/797 (0.13%) 0-h and 1/646 (0.15%) 3-h samples had biotin >20.0 ng/mL (81.8 nmol/L); 99th percentile biotin was 2.62 ng/mL (10.7 nmol/L; 0-h) and 2.38 ng/mL (9.74 nmol/L; 3-h). Using conservative assumptions, the likelihood of a false-negative AMI prediction due to biotin interference was 0.026% (0-h result; 19 ng/L TnT Gen 5 assay cutoff). US laboratory cohort: 15/2023 (0.74%) samples had biotin >20.0 ng/mL (81.8 nmol/L); 99th percentile biotin was 16.6 ng/mL (68.0 nmol/L). Misclassification risk due to biotin interference (19 ng/L TnT Gen 5 assay cutoff) was 0.025% (0-h), 0.0064% (1-h), 0.00048% (3-h), and <0.00001% (6-h). Conclusions: Biotin interference has minimal impact on the TnT Gen 5 assay's clinical utility, and the likelihood of a false-negative AMI prediction is extremely low.
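The style of risk calculation described can be sketched as a simple product of probabilities: a false-negative call requires both interfering biotin levels and a troponin result close enough to the cutoff to be flipped. The second factor below is invented for illustration; it is not one of the study's actual inputs, and the study's model (multiple cutoffs and timepoints, pharmacokinetic extrapolation) is considerably richer.

```python
# Observed in the 0-h ACS cohort: 1/797 samples with biotin > 20 ng/mL.
p_biotin_high = 1 / 797            # ~0.13%

# Hypothetical share of true-AMI results near enough to the 19 ng/L
# cutoff that assay suppression could push them below it.
p_result_near_cutoff = 0.20

# Joint probability of a false-negative AMI prediction from interference.
p_false_negative = p_biotin_high * p_result_near_cutoff
# On the order of 0.025%, i.e. a few in ten thousand.
```

The product structure explains why the risk falls so steeply at later timepoints: biotin is cleared quickly, so the first factor shrinks by orders of magnitude between 0 h and 6 h.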
A pipeline for the fully automated estimation of continuous reference intervals using real-world data
Reference intervals are essential for interpreting laboratory test results. Continuous reference intervals precisely capture the physiological age-specific dynamics that occur throughout life, and thus have the potential to improve clinical decision-making. However, established approaches for estimating continuous reference intervals require samples from healthy individuals, and are therefore substantially restricted. Indirect methods operating on routine measurements enable the estimation of one-dimensional reference intervals; however, no automated approach exists that integrates the dependency on a continuous covariate like age. We propose an integrated pipeline for the fully automated estimation of continuous reference intervals, expressed as a generalized additive model for location, scale and shape, based on discrete model estimates from an indirect method (refineR). The results are free of subjective user input, enable the conversion of test results into z-scores, and can be integrated into laboratory information systems. Comparison of our results to established and validated reference intervals from the CALIPER and PEDREF studies and manufacturers' package inserts shows good agreement of reference limits, indicating that the proposed pipeline generates high-quality results. In conclusion, the developed pipeline enables the generation of high-precision percentile charts and continuous reference intervals. It represents the first parameter-less and fully automated solution for the indirect estimation of continuous reference intervals.
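The step from discrete to continuous estimates can be illustrated in miniature: run an indirect method per age bin, then smooth the bin-wise limits into a curve that can be evaluated at any age. The analyte values, bins, and polynomial smoother below are hypothetical; the published pipeline fits a GAMLSS, in which location, scale and shape are each smooth functions of age, so this only caricatures the final smoothing step.

```python
import numpy as np

# Hypothetical discrete estimates: upper reference limits obtained by
# running an indirect method (like refineR) separately per age bin.
ages = np.array([1, 3, 5, 8, 11, 14, 17], dtype=float)      # bin midpoints, years
upper_limits = np.array([320, 300, 290, 300, 330, 280, 220], dtype=float)  # e.g. U/L

# Smooth the bin-wise estimates into a continuous curve; a cubic
# polynomial stands in for the GAMLSS smoother.
coeffs = np.polyfit(ages, upper_limits, deg=3)
continuous_upper = np.poly1d(coeffs)

# The curve can now be evaluated at any age, enabling z-scores and
# percentile charts instead of step-shaped, bin-wise limits.
limit_at_10 = float(continuous_upper(10.0))
```

Replacing step-shaped limits with a smooth curve avoids the artificial jumps a patient's flagged/not-flagged status would otherwise show at bin boundaries.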
Specification of Cortical Parenchyma and Stele of Maize Primary Roots by Asymmetric Levels of Auxin, Cytokinin, and Cytokinin-Regulated Proteins
In transverse orientation, maize (Zea mays) roots are composed of a central stele that is embedded in multiple layers of cortical parenchyma. The stele functions in the transport of water, nutrients, and photosynthates, while the cortical parenchyma fulfills metabolic functions that are not very well characterized. To better understand the molecular functions of these root tissues, protein- and phytohormone-profiling experiments were conducted. Two-dimensional gel electrophoresis combined with electrospray ionization tandem mass spectrometry identified 59 proteins that were preferentially accumulated in the cortical parenchyma and 11 stele-specific proteins. Hormone profiling revealed preferential accumulation of indole acetic acid and its conjugate indole acetic acid-aspartate in the stele, and predominant localization of the cytokinin cis-zeatin, its precursor cis-zeatin riboside, and its conjugate cis-zeatin O-glucoside in the cortical parenchyma. A root-specific β-glucosidase that functions in the hydrolysis of cis-zeatin O-glucoside was preferentially accumulated in the cortical parenchyma. Similarly, four cytokinin-regulated enzymes involved in ammonium assimilation were preferentially accumulated in the cortical parenchyma. The antagonistic distribution of auxin and cytokinin between the stele and cortical parenchyma, together with the cortical parenchyma-specific accumulation of cytokinin-regulated proteins, suggests a molecular framework that specifies the function of these root tissues, which also play a role in the formation of lateral roots from pericycle and endodermis cells.
Influence of land-use intensity on the spatial distribution of N-cycling microorganisms in grassland soils
A geostatistical approach using replicated grassland sites (10 m × 10 m) was applied to investigate the influence of grassland management, i.e. unfertilized pastures and fertilized mown meadows representing low and high land-use intensity (LUI), on soil biogeochemical properties and the spatial distributions of ammonia-oxidizing and denitrifying microorganisms in soil. Spatial autocorrelations of the different N-cycling communities ranged between 1.4 and 7.6 m for ammonia oxidizers and from 0.3 m for nosZ-type denitrifiers to scales >14 m for nirK-type denitrifiers. The spatial heterogeneity of ammonia oxidizers and nirS-type denitrifiers increased under high LUI, but decreased for biogeochemical properties, suggesting that biotic and/or abiotic factors other than those measured are driving the spatial distribution of these microorganisms at the plot scale. Furthermore, ammonia oxidizers (amoA ammonia-oxidizing archaea and amoA ammonia-oxidizing bacteria) and nitrate reducers (napA and narG) showed spatial coexistence, whereas niche partitioning was found between nirK- and nirS-type denitrifiers. Together, our results indicate that spatial analysis is a useful tool to characterize the distribution of different functional microbial guilds with respect to soil biogeochemical properties and land-use management. In addition, spatial analyses allowed us to identify distinct distribution ranges indicating the coexistence or niche partitioning of N-cycling communities in grassland soil.
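The spatial autocorrelation ranges reported above are typically read off an empirical semivariogram: semivariance between pairs of samples rises with their separation distance until the autocorrelation range is reached. The one-dimensional transect, signal, and lags below are hypothetical stand-ins for the 10 m × 10 m plot data, just to show the mechanics of the estimator.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical 1-D transect sampled every 0.5 m across a 10 m plot;
# the signal is spatially autocorrelated with added measurement noise.
x = np.arange(0, 10, 0.5)
field = np.sin(x / 2.0) + rng.normal(0, 0.1, size=x.size)

def empirical_semivariogram(coords, z, lags):
    """gamma(h) = mean of 0.5*(z_i - z_j)^2 over pairs at separation ~h."""
    d = np.abs(coords[:, None] - coords[None, :])
    gamma = []
    for h in lags:
        diffs = (z[:, None] - z[None, :])[np.isclose(d, h)]
        gamma.append(0.5 * np.mean(diffs ** 2))
    return np.array(gamma)

lags = np.array([0.5, 1.0, 2.0, 4.0])
gamma = empirical_semivariogram(x, field, lags)
# Semivariance grows with lag distance up to the autocorrelation range.
```

Fitting a variogram model (e.g. spherical or exponential) to such points yields the range parameter, which is how scales like 0.3 m for nosZ-type versus >14 m for nirK-type denitrifiers are obtained and compared.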