Running Summary Report
- Double click the Summary Report icon on the Desktop.
- Select the samples to be added to the Summary Report. This can be achieved by three methods:
  - Select samples while holding Shift or Ctrl, then move them across to the right panel using the right-facing arrow button.
  - Use the search function to subset the available samples and select as above.
  - Paste a newline-delimited list of sample names, exactly matching the names as they appear in the results directory, into the Paste Bin (Feature 7) and click 'Add Samples'.
- Decide the output format for the report data.
- In the 'Output File' section, select 'Browse' and choose the destination and appropriate file extension for the output.
- Click on 'Generate Summary' to build the Summary Report. The bottom-left corner has a status indicator. View the terminal output for further information and debugging.
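The paste method needs names that match the results directory exactly, so it can help to check a list before pasting it. The following is a minimal sketch only. It assumes that each sample appears as a sub-directory of the results directory, which this guide does not confirm, and the function names `split_samples` and `to_paste_text` are hypothetical helpers, not part of the Summary Report tool.

```python
from pathlib import Path
import tempfile


def split_samples(results_dir, wanted):
    """Split wanted names into (found, missing) by exact name match.

    Assumption: every sample is a sub-directory of results_dir.
    """
    available = {p.name for p in Path(results_dir).iterdir() if p.is_dir()}
    found = [name for name in wanted if name in available]
    missing = [name for name in wanted if name not in available]
    return found, missing


def to_paste_text(names):
    """Join names into the newline-delimited text expected by the Paste Bin."""
    return "\n".join(names)


if __name__ == "__main__":
    # Demonstration against a throwaway directory with made-up sample names.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("sample_A", "sample_B"):
            (Path(tmp) / name).mkdir()
        found, missing = split_samples(tmp, ["sample_A", "sample_C"])
        print(to_paste_text(found))
        print("Missing:", missing)
```

Names reported as missing would be left out of the Summary Report, so correcting them before pasting avoids having to search the terminal output afterwards.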
Quick Tip
Note: The tools save files by default to directories on the storage medium or in the analysis environment file structure. Navigate to the top (root) level in any 'save' dialogue box and choose home to save to the host machine's local storage.
Description of features
See the diagram below and the associated table for explanations on Summary Report features.
| Number | Feature | Description |
|---|---|---|
| 1 | Results directory | Path selection for the Metagenomics Workflow's results directory. Useful if users move/organise outputs for archiving.* |
| 2 | Output file | The output file path for Summary Report. |
| 3 | Sample search bar | Enter key phrases to subset available samples. |
| 4 | Available Samples panel | Populated automatically from the provided results directory. |
| 5 | Sample selection controls | Move selected or all samples to and from the Selected Samples panel. Clear and refresh to start again. |
| 6 | Selected Samples panel | A list of the samples included in the summary. |
| 7 | Sample paste bin | Paste a list (newline delimited) of sample names from an external source to quickly load on to Selected Samples.** |
| 8 | Relative abundance | Set a threshold of relative abundance for the non-viral taxa. Taxa falling below will be excluded entirely from the report. |
| 9 | Format output | Change the default XLSX output to CSV. Change the newline delimitation of nested lists to ';' for easier parsing. |
| 10 | Generate summary | Run the script. |
| 11 | Status indicator | Data on available samples and selections. |
: Summary report diagram legend
*The path selection will not be able to see outside of the mounted directories of the container (default: the NHS RMg platform SSD). Modify the launch script to mount additional host directories.

**No path checking is performed on pasted samples. They will be excluded from the Summary Report if missing. See the terminal output for a list of missing samples.
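The relative abundance threshold (Feature 8) can be pictured with a short sketch. This is an illustration under stated assumptions, not the tool's actual code: it assumes relative abundance is a taxon's read count as a percentage of all non-viral classified reads, and that a taxon exactly at the threshold is kept. Neither detail is stated in this guide. The function name and taxon labels are hypothetical.

```python
def filter_by_relative_abundance(read_counts, threshold_percent):
    """Return {taxon: percent abundance} for taxa at or above the threshold.

    Assumption: abundance = taxon reads / total non-viral reads * 100.
    """
    total = sum(read_counts.values())
    if total == 0:
        return {}
    kept = {}
    for taxon, count in read_counts.items():
        abundance = 100.0 * count / total
        if abundance >= threshold_percent:
            kept[taxon] = round(abundance, 2)
    return kept


# Illustrative counts only: Taxon C is 0.5% of 1000 reads, below a 1% threshold.
counts = {"Taxon A": 900, "Taxon B": 95, "Taxon C": 5}
print(filter_by_relative_abundance(counts, 1.0))
# → {'Taxon A': 90.0, 'Taxon B': 9.5}
```

Under these assumptions, a taxon below the threshold disappears from the organism, read count and percentage columns alike, which matches the "excluded entirely" behaviour described in the table.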
Description of outputs
| Column Name | Explanation |
|---|---|
| Sample | LabID provided on launching the Metagenomics Workflow |
| Experiment | The exact name matching the experiment name on MinKNOW entered by the user when initiating a sequencing run. This is populated automatically from the /data directory. |
| SampleID | The exact name matching the Sample name on MinKNOW entered by the user when initiating a sequencing run. This is populated automatically from the /data/{experiment_id}/ directory. |
| Barcode | The ONT library index/barcode used. Green colour indicates the barcode directory has been validated. |
| LabID | LabID provided on launching the Metagenomics Workflow |
| biosample_id | The Sample Accession (if provided). May also be an anonymised study number derived from the Sample Accession. |
| biosample_source_id | The Hospital number provided to the Metagenomics Launcher. May also be an anonymised study number derived from the Hospital Number. |
| Collection Date | Provided to the Metagenomics Launcher |
| SampleClass | Specimen, positive control, standard etc. Provided to the Launcher. |
| SampleType | Sampling site: BAL, SPT, NDL, ETT, NPA, PFL etc. |
| Operator | Operator initials. Sourced from Launcher. |
| Notes | Additional notes. Sourced from Launcher, shown on reports. |
| RunID | Identifier assigned to the experiment by the sequencing device. Derived from FASTQ. |
| Flow_Cell_ID | Flow cell ID - derived from FASTQ |
| Total reads X hrs | Total reads - pre-human scrubbing |
| Human reads X hrs | Human reads removed |
| Human reads (%) X hrs | Proportion of total reads identified as human and removed |
| Total classified reads X hrs | Total reads post-human scrubbing |
| Sequencing N50 (bp) X hrs | Post human-scrubbing (microbial reads) read length metric |
| Proportion >Q15 quality (%) X hrs | Proportion of microbial reads with a PHRED score >15 |
| Median read quality (PHRED score) N | Median PHRED score for microbial reads |
| Total bases (bp) X hrs | Total bases sequenced including human |
| Organisms (excluding viruses) X hrs | A list of organisms identified, excluding viruses |
| Organisms (excluding viruses) read counts X hrs | Counts of reads for each non-viral taxon identified |
| Organism (excluding viruses) percentage abundance X hrs | Percentage abundance of each non-viral taxon identified |
| Viral organisms X hrs | Virus taxa list |
| Viral read counts X hrs | Counts of viruses identified |
| Auto Query top taxon X hrs | Auto Query's most supported taxon. Note that, by default, taxa below the threshold will not be subject to Auto Query and will therefore be shown as 'missing' here. See the 'configuration' section for more info. |
| Auto Query top percent X hrs | The percentage of top alignments supporting the top taxon |
| Auto Query 2nd taxon X hrs | Auto Query's second most supported taxon |
| Auto Query 2nd percent X hrs | The percentage of top alignments supporting the second taxon |
| AvgLength 0.5 hrs | Average length of Auto Query alignments for the top hit |
| AvgPID 0.5 hrs | Average percent identity of Auto Query alignments for the top hit |
| IsMatched50 0.5 hrs | Whether a 'green light' would be shown on the report |
: Summary report output fields
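For downstream analysis, the CSV output with ';' delimitation can be read with standard tools. The sketch below is hedged throughout. It assumes the ';' format option was selected, that the "X" in the column names is replaced by a number of hours (the value `2` is only an example), and that the organism and read count columns list taxa in the same order. The sample row is invented for illustration, and `taxa_with_counts` and `human_read_percent` are hypothetical helper names.

```python
import csv
import io


def taxa_with_counts(row, hours):
    """Pair each non-viral taxon with its read count for one report row.

    Assumption: both columns are ';'-delimited and in matching order.
    """
    names_col = f"Organisms (excluding viruses) {hours} hrs"
    counts_col = f"Organisms (excluding viruses) read counts {hours} hrs"
    names = [n.strip() for n in row[names_col].split(";") if n.strip()]
    counts = [int(c) for c in row[counts_col].split(";") if c.strip()]
    if len(names) != len(counts):
        raise ValueError("taxon list and read count list differ in length")
    return list(zip(names, counts))


def human_read_percent(total_reads, human_reads):
    """Proportion of total reads identified as human, as a percentage."""
    return 100.0 * human_reads / total_reads if total_reads else 0.0


# Invented example data standing in for a real Summary Report CSV.
example_csv = (
    "Sample,Organisms (excluding viruses) 2 hrs,"
    "Organisms (excluding viruses) read counts 2 hrs\n"
    "sample_A,Taxon A;Taxon B,900;95\n"
)
for row in csv.DictReader(io.StringIO(example_csv)):
    print(row["Sample"], taxa_with_counts(row, 2))

print(human_read_percent(1000, 250))
```

To read a real report, `io.StringIO(example_csv)` would be replaced with an open file handle, and the hours value with whatever appears in the actual column headers.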

