SE139:/DS1
DS Description Multivariate statistical analyses, comprising principal component analysis (PCA) and orthogonal projections to latent structures–discriminant analysis (OPLS-DA) (Bylesjo et al. 2006; Trygg et al. 2007), were performed using SIMCA-P 12.0 software (Umetrics AB, Umea, Sweden) with log10 transformation and unit variance scaling. The models were used to visualize the high-dimensional data and to determine the metabolomic variation between the control WT and the mto mutants. PCA captures the main sources of variation in an unsupervised manner, whereas OPLS-DA extracts as much of the class-separating (mutant vs. WT) variation as possible. With OPLS-DA, the variance in X (the MS profiles) is decomposed into three parts: Y-predictive, Y-orthogonal and the unmodeled residual. The Y-predictive components capture the class-separating variation and can be used to interpret the differences between the genotypes. The Y-orthogonal components, on the other hand, model variation that is strictly unrelated to genotypic differences. With this distinction, OPLS-DA provides a convenient way to inspect the genotype-related differences without confounding them with other systematic variance, for example variance arising from residual analytical bias. All OPLS-DA models were validated using leave-one-out cross-validation and diagnosed using the prediction error rate, defined as the number of inaccurate predictions of left-out samples divided by the total number of samples, N. Since N was fairly low, we also computed an empirical p value, p_CV, by randomizing the class labels 100 times and counting the number of times we obtained a lower error rate than with the original labels. Statistical tests were performed using the R statistical environment (http://www.r-project.org/) on the basis of log10-transformed data.
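The leave-one-out validation and label-randomization step described above can be sketched as follows. This is a minimal illustration in Python, not the SIMCA-P or R workflow used for the dataset. A simple nearest-centroid classifier stands in for OPLS-DA, the function names are hypothetical, and dividing the count by the number of randomizations to obtain p_CV is an assumption (the description only states that the count was taken).

```python
import random


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT OPLS-DA): pick the class with the closest mean."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label in sorted(sums):
        centroid = [s / counts[label] for s in sums[label]]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_error_rate(X, y):
    """Number of wrong predictions of left-out samples divided by N."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) != y[i]:
            errors += 1
    return errors / len(X)


def permutation_p_value(X, y, n_perm=100, seed=0):
    """Shuffle class labels n_perm times and count strictly lower error rates.

    Returns (observed error rate, count / n_perm). The division is an
    assumption made for this sketch.
    """
    rng = random.Random(seed)
    observed = loo_error_rate(X, y)
    lower = 0
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        if loo_error_rate(X, shuffled) < observed:
            lower += 1
    return observed, lower / n_perm
```

With `X` as a list of samples (each a list of log10-transformed, scaled intensities) and `y` as a list of genotype labels, `permutation_p_value(X, y)` returns the observed error rate together with the empirical p value under these assumptions.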
The resulting p values were adjusted for multiple testing using the false discovery rate (FDR) procedure (Benjamini and Hochberg 1995). To facilitate biological interpretation of the metabolite profiles in mto mutants, we performed metabolite set enrichment analysis (MSEA, Redestig et al. unpublished method), which is similar to gene set enrichment analysis (GSEA) (Mootha et al. 2003). The procedure was as follows. First, metabolite identifiers were unified using MetMask (http://metmask.sourceforge.net, Redestig et al. submitted), a tool for chemical identifier linking. Metabolites were then classified into 47 groups based on the classification scheme provided by the PlantCyc compound classes database version 3.0 (Zhang et al. 2005). Detected metabolites that were not covered by PlantCyc were classified manually. Each group consists of metabolites that share biological function and/or biochemical pathways. Because of the rather crude grouping, we refer to the groups as “metabolite bins”, analogous to the widely used MapMan gene bins (Thimm et al. 2004). Genotype differences were estimated using the t-statistics obtained when comparing mutants and the WT. For each metabolite bin, the t-statistic distribution was compared with that of all other metabolites using the Kolmogorov–Smirnov test. The analysis identified bins of metabolites that were particularly affected in the corresponding comparison. The p values were FDR corrected to adjust for multiple testing.
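The bin-wise comparison and the FDR adjustment can be illustrated with a short Python sketch. It is not the unpublished MSEA implementation. The names `ks_statistic`, `bin_ks_statistics` and `bh_adjust` are hypothetical, and only the two-sample Kolmogorov–Smirnov statistic D is computed here. In practice the matching p value would come from a statistics package such as R's `ks.test` or SciPy's `scipy.stats.ks_2samp`.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for v in sorted(set(a) | set(b)):
        fa = bisect_right(a, v) / len(a)
        fb = bisect_right(b, v) / len(b)
        d = max(d, abs(fa - fb))
    return d


def bin_ks_statistics(t_stats, bins):
    """Compare the t-statistics in each bin with those of all other metabolites.

    t_stats: dict metabolite -> t-statistic (mutant vs. WT)
    bins:    dict bin name -> list of metabolites (hypothetical layout)
    """
    result = {}
    for name, members in bins.items():
        member_set = set(members)
        inside = [t for m, t in t_stats.items() if m in member_set]
        outside = [t for m, t in t_stats.items() if m not in member_set]
        if inside and outside:
            result[name] = ks_statistic(inside, outside)
    return result


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A bin whose members all shift in the same direction gives a D close to 1, while a bin that behaves like the rest of the metabolome gives a D close to 0. `bh_adjust` would then be applied to the list of per-bin p values.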
DS ID DS1
DS Title Statistical data analysis
Modification date 23 March 2018 01:20:47