MultiQC: summarize analysis results for multiple tools and samples in a single report

Abstract

<h2>Highlights</h2>
<h3>Better configs</h3>
<p>As of this release, you can set all of your config variables via environment variables (see <a href="https://multiqc.info/docs/getting_started/config/#config-with-environment-variables">docs</a>).</p>
<p>Better still, YAML config files can now use string interpolation to parse environment variables within strings (see <a href="https://multiqc.info/docs/getting_started/config/#referencing-environment-variables-in-yaml-configs">docs</a>), e.g.:</p>
<pre><code class="language-yaml">report_header_info:
  - Contact E-mail: !ENV "${NAME:info}@${DOMAIN:example.com}"
</code></pre>
<h3>Picard refactoring</h3>
<p>This release includes a significant refactoring of the Picard module. It has been generalized for better code sharing with other Picard-based software, like Sentieon and Parabricks. As a result, the standalone Sentieon module was removed: Sentieon QC files are now interpreted directly as Picard QC files.</p>
<p>If you were using the Sentieon module in your pipelines, make sure to update any places that reference the module name:</p>
<ul>
<li>MultiQC command line (e.g. replace <code>--module sentieon</code> with <code>--module picard</code>).</li>
<li>MultiQC configs (e.g. replace <code>sentieon</code> with <code>picard</code> in options like <code>run_modules</code>, <code>exclude_modules</code>, <code>module_order</code>).</li>
<li>Downstream code that relies on the names of files saved in <code>multiqc_data</code> or <code>multiqc_plots</code> (e.g. <code>multiqc_data/multiqc_sentieon_AlignmentSummaryMetrics.txt</code> becomes <code>multiqc_data/multiqc_picard_AlignmentSummaryMetrics.txt</code>).</li>
<li>Code that parses data files like <code>multiqc_data/multiqc_data.json</code>.</li>
<li>Custom plugins and templates that rely on HTML anchors (e.g.
<code>#sentieon_aligned_reads</code> becomes <code>#picard_AlignmentSummaryMetrics</code>).</li>
<li>Also, note that Picard fetches sample names from the commands it finds inside the QC headers (e.g. <code># net.sf.picard.analysis.CollectMultipleMetrics INPUT=Szabo_160930_SN583_0215_AC9H20ACXX.bam ...</code> yields <code>Szabo_160930_SN583_0215_AC9H20ACXX</code>), whereas the removed Sentieon module prioritized the QC file names. To revert to the old Sentieon approach, use the <a href="https://multiqc.info/docs/getting_started/config/#using-log-filenames-as-sample-names"><code>use_filename_as_sample_name</code> config flag</a>.</li>
</ul>
<h2>MultiQC updates</h2>
<ul>
<li>Config can be set with environment variables, including env var interpolation (<a href="https://github.com/ewels/MultiQC/pull/2178">#2178</a>)</li>
<li>Look for config files in <code>~/.config</code> or <code>$XDG_CONFIG_HOME</code> (<a href="https://github.com/ewels/MultiQC/pull/2183">#2183</a>)</li>
<li>Better sample name cleaning with pairs of input filenames (<a href="https://github.com/ewels/MultiQC/pull/2181">#2181</a>)</li>
<li>Software versions: allow any string as a version tag (<a href="https://github.com/ewels/MultiQC/pull/2166">#2166</a>)</li>
<li>Table columns with non-numeric values now trigger a linting error if <code>scale</code> is set (<a href="https://github.com/ewels/MultiQC/pull/2176">#2176</a>)</li>
<li>Stricter config variable typing (<a href="https://github.com/ewels/MultiQC/pull/2178">#2178</a>)</li>
<li>Remove <code>position:absolute</code> CSS from table values (<a href="https://github.com/ewels/MultiQC/pull/2169">#2169</a>)</li>
<li>Fix column sorting in exported TSV files from a matplotlib linegraph plot (<a href="https://github.com/ewels/MultiQC/pull/2143">#2143</a>)</li>
<li>Fix custom anchors for kraken (<a href="https://github.com/ewels/MultiQC/pull/2170">#2170</a>)</li>
<li>Fix logging spillover bug (<a href="https://github.com/ewels/MultiQC/pull/2174">#2174</a>)</li>
</ul> <h2>New Modules</h2> <ul> <li><a href="https://github.com/seqeralabs/tower-cli"><strong>Seqera Platform CLI</strong></a> (<a href="https://github.com/ewels/MultiQC/pull/2151">#2151</a>)<ul> <li>Seqera Platform CLI reports statistics generated by the Seqera Platform CLI.</li> </ul> </li> <li><a href="https://github.com/data61/gossamer/blob/master/docs/xenome.md"><strong>Xenome</strong></a> (<a href="https://github.com/ewels/MultiQC/pull/1860">#1860</a>)<ul> <li>A tool for classifying reads from xenograft sources.</li> </ul> </li> <li><a href="https://gitlab.com/genomeinformatics/xengsort"><strong>xengsort</strong></a> (<a href="https://github.com/ewels/MultiQC/pull/2168">#2168</a>)<ul> <li>xengsort is a fast xenograft read sorter based on space-efficient k-mer hashing</li> </ul> </li> </ul> <h2>Module updates</h2> <ul> <li><strong>fastp</strong>: add version parsing (<a href="https://github.com/ewels/MultiQC/pull/2159">#2159</a>)</li> <li><strong>fastp</strong>: correctly parse sample name from <code>--in1</code>/<code>--in2</code> in bash command. 
Prefer the file name if it is not <code>fastp.json</code>; fall back to the file name on error (<a href="https://github.com/ewels/MultiQC/pull/2139">#2139</a>)</li>
<li><strong>Kaiju</strong>: fix <code>division by zero</code> error (<a href="https://github.com/ewels/MultiQC/pull/2179">#2179</a>)</li>
<li><strong>Nanostat</strong>: account for both tabs and spaces in <code>v1.41+</code> search pattern (<a href="https://github.com/ewels/MultiQC/pull/2155">#2155</a>)</li>
<li><strong>Pangolin</strong>: update for v4: add QC Note, update tool versions columns (<a href="https://github.com/ewels/MultiQC/pull/2157">#2157</a>)</li>
<li><strong>Picard</strong>: Generalize to directly support Sentieon and Parabricks outputs (<a href="https://github.com/ewels/MultiQC/pull/2110">#2110</a>)</li>
<li><strong>Sentieon</strong>: Removed the module in favour of directly supporting parsing by the <strong>Picard</strong> module (<a href="https://github.com/ewels/MultiQC/pull/2110">#2110</a>)<ul>
<li>Note that any code that relies on the module name needs to be updated, e.g. <code>-m sentieon</code> will no longer work</li>
<li>The exported plot and data files will now be prefixed with <code>picard</code> instead of <code>sentieon</code>, etc.</li>
<li>Note that the Sentieon module used to fetch sample names from the file names by default; Sentieon outputs now follow the Picard module's logic, which prioritizes the commands recorded in the logs. To override, use the <code>use_filename_as_sample_name</code> config flag</li>
</ul>
</li>
</ul>Please consider citing MultiQC if you use it in your analysis.
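
The Sentieon-to-Picard renames listed above could be applied to an existing pipeline with a small helper along these lines. This is only a sketch, not part of MultiQC: the function names `rename_module` and `rename_data_file` are hypothetical, the config is assumed to be already parsed into a Python dict, and only plain string entries in the module lists are rewritten.

```python
import re

# The three list-valued options named in the release notes above.
MODULE_LIST_KEYS = ("run_modules", "exclude_modules", "module_order")


def rename_module(config, old="sentieon", new="picard"):
    """Return a copy of a parsed config dict with `old` replaced by `new`
    in the module list options. Hypothetical helper, not a MultiQC API."""
    updated = dict(config)
    for key in MODULE_LIST_KEYS:
        if key not in config:
            continue
        seen = []
        for entry in config[key]:
            # Only plain string entries are renamed; anything else is kept as-is.
            entry = new if entry == old else entry
            # Skip duplicates, e.g. when both modules were already listed.
            if entry not in seen:
                seen.append(entry)
        updated[key] = seen
    return updated


def rename_data_file(filename, old="sentieon", new="picard"):
    """Map multiqc_sentieon_<suffix> to multiqc_picard_<suffix>;
    other file names are returned unchanged."""
    return re.sub(rf"^multiqc_{re.escape(old)}_", f"multiqc_{new}_", filename)
```

Downstream code that opens files under `multiqc_data` could pass each expected file name through `rename_data_file` once, rather than hard-coding both spellings. HTML anchors are not covered here, because the example in the notes shows that they do not follow a simple prefix swap.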

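To illustrate the string interpolation described under "Better configs", the following sketch resolves placeholders written as `${NAME:default}` against a mapping of environment variables. It assumes that placeholder syntax and is not MultiQC's own implementation; the `interpolate` function is hypothetical, and raising an error for an unset variable without a default is a choice made for this example only.

```python
import os
import re

# Matches ${NAME} or ${NAME:default}; the default part may be empty.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def interpolate(text, environ=None):
    """Replace each placeholder in `text` with the variable's value,
    falling back to the inline default when the variable is unset."""
    environ = os.environ if environ is None else environ

    def _resolve(match):
        name, default = match.group(1), match.group(2)
        if name in environ:
            return environ[name]
        if default is not None:
            return default
        # Assumption of this sketch: fail loudly instead of guessing a value.
        raise KeyError(f"environment variable {name!r} is not set and has no default")

    return _PLACEHOLDER.sub(_resolve, text)
```

With neither variable set, `interpolate("${NAME:info}@${DOMAIN:example.com}", {})` gives `info@example.com`; setting `NAME` changes only the part before the `@`.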

    Last time updated on 10/05/2024