23 research outputs found
Image_1_Fast and Accurate Approaches for Large-Scale, Automated Mapping of Food Diaries on Food Composition Tables.PDF
<p>Aim of Study: The use of weighed food diaries in nutritional studies provides a powerful method to quantify food and nutrient intakes. Yet, mapping these records onto food composition tables (FCTs) is a challenging, time-consuming and error-prone process. Experts currently perform this mapping manually, and no automated approach has previously been proposed. Our study aimed to assess automated approaches to map food items onto FCTs.</p><p>Methods: We used food diaries (~170,000 records pertaining to 4,200 unique food items) from the DiOGenes randomized clinical trial. We attempted to map these items onto six FCTs available from the EuroFIR resource. Two approaches were tested. The first was based solely on food name similarity (fuzzy matching). The second used a machine learning approach (C5.0 classifier) combining fuzzy matching and food energy. We tested mapping food items using both their original names and an English translation. Top matching pairs were reviewed manually to derive performance metrics: precision (the percentage of correctly mapped items) and recall (the percentage of mapped items).</p><p>Results: The simpler approach, fuzzy matching, performed very well. Under a relaxed threshold (score > 50%), it remapped 99.49% of the items with a precision of 88.75%. With a slightly more stringent threshold (score > 63%), precision improved significantly to 96.81% while recall remained > 95% (i.e., fewer than 5% of the queried items would not be mapped). The machine learning approach did not improve on fuzzy matching overall. However, it could substantially increase the recall rate for food items without any clear equivalent in the FCTs (+7 and +20% when mapping items using their original or English-translated names, respectively).
Our approaches have been implemented as R packages and are freely available from GitHub.</p><p>Conclusion: This study is the first to provide automated approaches for large-scale food item mapping onto FCTs. We demonstrate that both high precision and high recall can be achieved. Our solutions can be used with any FCT and do not require any programming background. These methodologies and findings are useful to any nutritional study, small or large, observational or interventional.</p>
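The thresholded fuzzy matching and the two metrics described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R implementation: the similarity score here comes from the standard-library `difflib.SequenceMatcher`, which is an assumed stand-in for whatever scoring the published packages use, and the function names `similarity`, `best_match` and `precision_recall` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive name similarity in percent (0-100).

    Assumption: difflib's ratio stands in for the study's fuzzy-matching score.
    """
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(item: str, fct_names: list[str], threshold: float):
    """Return (best FCT name, score); the name is None if score <= threshold."""
    score, name = max((similarity(item, name), name) for name in fct_names)
    return (name if score > threshold else None), score


def precision_recall(mapping: dict, truth: dict):
    """Metrics as defined in the abstract.

    recall    = share of queried items that received a mapping
    precision = share of mapped items whose mapping is correct
    """
    mapped = {k: v for k, v in mapping.items() if v is not None}
    recall = len(mapped) / len(mapping)
    correct = sum(truth.get(k) == v for k, v in mapped.items())
    precision = correct / len(mapped) if mapped else 0.0
    return precision, recall
```

Raising the `threshold` argument (for example from 50 to 63, the two values reported above) leaves more items unmapped, which lowers recall, and is expected to raise precision because weaker matches are discarded.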
Additional file 1: of Analysis of circulating angiopoietin-like protein 3 and genetic variants in lipid metabolism and liver health: the DiOGenes study
Figure S1. QQ plot of the relationship between expected and observed distribution at baseline. Quantile-quantile plot of baseline data. The relationship between observed (y-axis) and expected (x-axis) distribution. The statistical significance is measured by the negative log of the corresponding p-value for each SNP. (JPEG 92 kb)
Additional file 3: of Analysis of circulating angiopoietin-like protein 3 and genetic variants in lipid metabolism and liver health: the DiOGenes study
Table S1. Effect of rs4360730 on BMI, Lipid Profile and Liver Markers. Table S2. Effect of rs9994520 on BMI, Lipid Profile and Liver Markers. (DOCX 21 kb)
Additional file 2: of Analysis of circulating angiopoietin-like protein 3 and genetic variants in lipid metabolism and liver health: the DiOGenes study
Figure S2. QQ plot of the relationship between expected and observed distribution during weight loss period. Quantile-quantile plot for the analysis of the weight loss period. The relationship between observed (y-axis) and expected (x-axis) distribution. The statistical significance is measured by the negative log of the corresponding p-value for each SNP. (JPEG 94 kb)
<i>CPS1</i> gene expression and weight maintenance.
<p>Box plots for <i>CPS1</i> gene expression at PWL, for weight maintainers (WMs) and weight maintenance resistors (WRs), defined as a BMI increase of < 4% and ≥ 4%, respectively (p-value = 0.04, corrected for age and sex).</p>
Manhattan plot of Glycine.
<p>(a) Genome-wide association of glycine and (b) zoom on chromosome 2. The two green dots represent the SNPs significantly associated with glycine levels: rs10206976 (chr 2:210749914) and rs12613336 (chr 2:210704675), in the <i>CPS1</i> gene. Genome-wide significance (p-value = 5.0e-8) and suggestive (p-value = 1.0e-5) thresholds are shown as red and blue lines, respectively.</p>
Gene expression and weight maintenance.
<p>Gene expression and weight maintenance.</p>
<i>CPS1</i> and rs3856348.
<p>Correlation between genotypes at rs3856348 and <i>CPS1</i> transcription level.</p>
<i>CPS1</i>–Glycine network.
<p>Networks for <i>CPS1</i> and Glycine (77 and 383 nodes, respectively) were built from publicly available databases and subsequently merged. Purple nodes belong to the <i>CPS1</i> network; green nodes belong to the Glycine network; orange nodes are shared between the two networks (33). The larger nodes represent <i>CPS1</i> and Glycine. Nodes with larger labels represent the genes in the "One carbon pool by folate" KEGG pathway highlighted by the gene enrichment analysis.</p>