nextflow run pgscatalog/pgsc_calc -profile docker --input samplesheet.csv --target_build GRCh38 --pgs_id PGS002785,PGS002786,PGS002789,PGS002787,PGS002788,PGS002790 --parallel --max_memory 30.GB --max_cpus 14 --min_overlap 0.4
Additional documentation is available that explains some of the terms used in this report in more detail.
## keep_multiallelic: false
## keep_ambiguous: false
## min_overlap: 0.4
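The `min_overlap` setting can be pictured as a simple threshold check. The sketch below assumes that `min_overlap` is the minimum proportion of a scoring file's variants that must be matched in the target genotypes for that score to be calculated, and that the comparison is inclusive. The function name and its arguments are hypothetical and are not part of pgsc_calc.

```python
def passes_min_overlap(n_matched: int, n_total: int, min_overlap: float = 0.4) -> bool:
    """Return True when the matched proportion reaches the threshold.

    n_matched: variants from the scoring file found in the target genotypes.
    n_total: all variants in the scoring file.
    min_overlap: threshold taken from the report settings (0.4 in this run).
    The inclusive comparison (>=) is an assumption, not confirmed pipeline behaviour.
    """
    if n_total <= 0:
        raise ValueError("a scoring file must contain at least one variant")
    return n_matched / n_total >= min_overlap


# Hypothetical counts, for illustration only.
print(passes_min_overlap(50, 100))  # half of the variants matched
print(passes_min_overlap(30, 100))  # under the 0.4 threshold
```

Under this reading, a lower threshold lets more scores through at the cost of calculating them from fewer of their variants.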
In the future, mean imputation will be supported for small samplesets, using ancestry-matched reference samplesets (e.g. 1000 Genomes) to ensure consistent calculation of score sums.
6 scores for 1 sample processed
Below is a summary of the aggregated scores, which might be useful for debugging.
## # A tibble: 1 × 8
## sampleset IID PGS002790_hmPO…¹ PGS00…² PGS00…³ PGS00…⁴ PGS00…⁵ PGS00…⁶
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 NG1RDRPK1V NG1RDRPK1V 0.183 0.263 0.301 0.788 0.842 0.195
## # … with abbreviated variable names ¹PGS002790_hmPOS_GRCh38_SUM,
## # ²PGS002785_hmPOS_GRCh38_SUM, ³PGS002786_hmPOS_GRCh38_SUM,
## # ⁴PGS002788_hmPOS_GRCh38_SUM, ⁵PGS002787_hmPOS_GRCh38_SUM,
## # ⁶PGS002789_hmPOS_GRCh38_SUM
See here for an explanation of plink2 column names
The summary density plots show up to six scoring files
All scores can be found in “aggregated_scores.txt.gz”, in the results folder output by the pipeline.
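For downstream analysis, the aggregated file can be read with the Python standard library. This is a minimal sketch that assumes the file is tab-delimited with a header row containing `sampleset`, `IID`, and one `*_SUM` column per score, as the summary table in this report suggests. The delimiter and the `load_scores` helper are assumptions, not documented pipeline behaviour. The demo writes a two-score excerpt of the values shown above to a temporary file so that the example runs without the real results folder.

```python
import csv
import gzip
import tempfile
from pathlib import Path


def load_scores(path):
    """Return one dict per sample: sampleset, IID, then each score column as a float."""
    rows = []
    with gzip.open(path, "rt", newline="") as handle:
        # Assumption: tab-delimited text with a single header row.
        for record in csv.DictReader(handle, delimiter="\t"):
            row = {"sampleset": record.pop("sampleset"), "IID": record.pop("IID")}
            row.update({name: float(value) for name, value in record.items()})
            rows.append(row)
    return rows


# Demo: write a small excerpt of the summary table, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "aggregated_scores.txt.gz"
    with gzip.open(demo, "wt", newline="") as out:
        out.write("sampleset\tIID\tPGS002790_hmPOS_GRCh38_SUM\tPGS002785_hmPOS_GRCh38_SUM\n")
        out.write("NG1RDRPK1V\tNG1RDRPK1V\t0.183\t0.263\n")
    scores = load_scores(demo)

print(scores[0]["PGS002790_hmPOS_GRCh38_SUM"])
```

To read real output, pass the path of `aggregated_scores.txt.gz` in the results folder to `load_scores` instead of the temporary file.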
For scores from the PGS Catalog, please remember to cite the original publications from which they came (these are listed in the metadata table).
PGS Catalog Calculator (in development). PGS Catalog Team. https://github.com/PGScatalog/pgsc_calc
Lambert et al. (2021) The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation. Nature Genetics. 53:420–425. doi:10.1038/s41588-021-00783-5.