QuaC
🦆🦆 Don't duck that QC thingy 🦆🦆
Note
In its past life, the QuaC repo lived on UAB GitLab. It was migrated to GitHub in January 2023, and the GitLab version has been archived.
What is QuaC?
QuaC is a snakemake-based pipeline that runs several QC tools for WGS/WES samples and then summarizes their results using pre-defined, configurable QC thresholds.
In summary, QuaC performs the following:
- Runs several QC tools using BAM and VCF files as input. At our center, CGDS, these files are produced as part of the small variant caller pipeline.
- Using the QuaC-Watch tool, performs a QC checkup against the expected thresholds for certain QC metrics and summarizes the results for easier human consumption.
- Aggregates QC output as well as QuaC-Watch output using MultiQC, both at the sample level and at the project level.
- Optionally, the QuaC-Watch and QC aggregation steps above can accept pre-run results from a few QC tools (fastqc, fastq-screen, Picard's MarkDuplicates) when run with the flag `--include_prior_qc`.
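The threshold checkup described in the list above can be illustrated with a minimal Python sketch. This is not QuaC-Watch's actual code or config format: the `Threshold` class, the `check_metrics` function, the metric names, and the numeric limits are all hypothetical and chosen only to show the idea of comparing each metric against an expected range and reporting a status.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Threshold:
    """Hypothetical acceptable range for one QC metric (bounds are inclusive)."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def passes(self, value: float) -> bool:
        # A bound set to None means "no limit on this side".
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


def check_metrics(metrics: Dict[str, float],
                  thresholds: Dict[str, Threshold]) -> Dict[str, str]:
    """Return "pass", "fail", or "missing" for every configured threshold."""
    results = {}
    for name, threshold in thresholds.items():
        if name not in metrics:
            results[name] = "missing"
        else:
            results[name] = "pass" if threshold.passes(metrics[name]) else "fail"
    return results


# Illustrative values only; these are not QuaC's real metric names or thresholds.
thresholds = {
    "mean_coverage": Threshold(minimum=30.0),
    "contamination_fraction": Threshold(maximum=0.02),
    "percent_duplication": Threshold(maximum=20.0),
}
sample_metrics = {"mean_coverage": 34.5, "contamination_fraction": 0.05}

summary = check_metrics(sample_metrics, thresholds)
for metric, status in summary.items():
    print(f"{metric}: {status}")
```

Reporting a separate "missing" status, rather than silently skipping a metric, keeps the summary honest when a tool's output was not available, for example when prior QC results are not included.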
CGDS users only
- At CGDS, BAM and VCF files produced by the small variant caller pipeline are used as input to QuaC.
- The outputs of fastqc, fastq-screen, and Picard's MarkDuplicates, which QuaC accepts when run with the flag `--include_prior_qc`, are produced by this small_variant_caller_pipeline.
Info
QuaC is built for use with human WGS/WES data. If you would like to use it with non-human data, please modify the pipeline as needed, especially the thresholds used in the QuaC-Watch configs.
QC tools
Tools run by QuaC
QuaC quacks using the tools listed below:
Tool | Use | QC Type |
---|---|---|
Qualimap | Summarizes several alignment metrics using BAM file | BAM quality |
Picard-CollectMultipleMetrics | Summarizes alignment metrics from BAM file using several modules | BAM quality |
Picard-CollectWgsMetrics | Collects metrics about coverage and performance using BAM file | BAM quality |
mosdepth | Fast alignment depth calculation using BAM file | BAM quality |
indexcov | Estimate coverage from BAM index for GS (Skipped in exome mode) | BAM quality |
covviz | Identifies large, coverage-based anomalies for GS using Indexcov output (Skipped in exome mode) | BAM quality |
bcftools stats | Summarizes VCF file stats | VCF quality |
verifybamid | Estimates within-species (i.e., cross-sample) contamination using BAM file | Within-species contamination |
somalier | Estimation of sex, ancestry and relatedness using BAM file | Sex, ancestry and relatedness estimation |
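
As the table above notes, indexcov and covviz are skipped in exome mode. The selection logic can be sketched in Python as follows. This is only an illustration and not QuaC's actual Snakemake rules: the `QC_TOOLS` list layout, the `tools_for_mode` function, and the `exome_mode` parameter name are assumptions, while the tool names and skip behavior are taken from the table.

```python
# Each entry is (tool name, skipped_in_exome_mode).
# Tool names and skip flags mirror the table above; the data layout is hypothetical.
QC_TOOLS = [
    ("Qualimap", False),
    ("Picard-CollectMultipleMetrics", False),
    ("Picard-CollectWgsMetrics", False),
    ("mosdepth", False),
    ("indexcov", True),
    ("covviz", True),
    ("bcftools stats", False),
    ("verifybamid", False),
    ("somalier", False),
]


def tools_for_mode(exome_mode):
    """Return the tool names to run, dropping genome-only tools in exome mode."""
    return [
        name
        for name, skipped_in_exome in QC_TOOLS
        if not (exome_mode and skipped_in_exome)
    ]


genome_tools = tools_for_mode(exome_mode=False)
exome_tools = tools_for_mode(exome_mode=True)

# Show which tools run only on genome sequencing data.
print(sorted(set(genome_tools) - set(exome_tools)))
```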
Optional QC output consumed by QuaC
Optionally, QuaC can also use QC results produced by the tools listed below when run with the flag `--include_prior_qc`.
Tool | Use | QC Type |
---|---|---|
fastqc | Performs QC on raw sequence reads data (FASTQ) | FASTQ quality |
FastQ Screen | Screens FASTQ for other-species contamination | FASTQ quality |
Picard's MarkDuplicates | Determines level of read duplication on BAM files | BAM quality |
CGDS users only
- At CGDS, these optional tools were run by our small_variant_caller_pipeline.
Documentation
Full documentation, including installation and how to run QuaC, is available at https://quac.readthedocs.io.
Repo owner
- Manavalan Gajapathy