Description of sub-scripts¶
Script in the bin folder¶
Compiling QC metrics, Generating QC-Report and running copynumber analysis¶
dmp_compile_qc_metrics.pl [options]
–bamList | -i S File of files having list of all bam files (required)
—titleFile | -t S tab-delimited title file for the samples (required and submit with full path)
–AllMetrics | -am S Path to AllMetrics script (required)
—LoessNorm | -ln S Path to Loess Normalization Script (required)
–BestCN | -cn S Path to Best Copy Number Script (required)
—GCBias | -gcb S Path to GC bias file (required)
–HistNorm | -his S Path to Directory with all historical normal files (required)
—queue | -q S Name of the Sun Grd Engine Queue where the pipeline needs to run (required)
--qsub S Path to qsub executable for SGE(default:None,optional) —bsub S Path to bsub executable for LSF(default:None,required)
–metricsScript | -ms S Name of the script used to generate .html and .pdf files
—outdir | -o S Path where all the output files will be written (optional) [default:cwd]
Genotyping Variants across multiple samples¶
dmp_genotype_allele.pl [options]
–FilteredMutationVcfFile | -fmv S vcf file describing details about the mutations (required)
—BamFile | -bam S bam file to be used for genotyping (required)
–RefFile | -rf S Path to genome reference file (required)
—samtools | -s S Path to samtools (required)
–bedtools | -b S Path to bedtools (required)
—MinBaseQualit | -mbq I Min. Base Quality Threshold (optional;default:5)
–MinMappingQuality | -mmq I Min. Mapping Quality Threshold (optional;default:5)
—deleteUnwantedFiles | -d I 2=>To delete files 1=> To keep files (default:2,optional)
–outdir | -o S Path where all the output files will be written (optional;default:current working directory)
—outFile | -of S Name of the allele depth output file (optional;default:BamFame-.bam+_mpileup.alleledepth)
–bamId | -bi S Bam Id to be used (optional;default:bamfile name)
—queue | -q S Name of the SGE / LSF Queue where the pipeline needs to run (required)
--qsub S Path to qsub executable for SGE(default:None,optional) —bsub S Path to bsub executable for LSF(default:None,required)
–mpileUpOutFile | -mof S Name of samtools mpileup output file (optional;default:BamFile-.bam+.mpileup)
—typeOfSample | -tos S Type of Sample (optional;default:Tumor;canbe Tumor or Normal)
Running Indel Realignment using ABRA¶
usage: Run_AbraRealignment.py [options]
Run ABRA Indel Realignment
- arguments:
-h, --help show this help message and exit - -i BamFile.list, –bamList BamFile.list
- Full path to the tumor bam files as a fof.
-p PatientID, --patientId PatientID Id of the Patient for which the bam files are to be realigned -v, --verbose make lots of noise [default] -t 5, –threads 5 Number of Threads to be used to run ABRA -d, –mdp Threshold for downsampling depth to run ABRA -k [43 [43 ...]], –kmers [43 [43 ...]]
Number of k-mers to be used to run ABRA; Multiple k-mers are separated by space- -temp /somepath/tmpdir, –temporaryDirectory /somepath/tmpdir
- Full Path to temporary directory
- -r /somepath/Homo_Sapeins_hg19.fasta, –referenceFile /somepath/Homo_Sapeins_hg19.fasta
- Full Path to the reference file with the bwa index.
- -a /somepath/ABRA.jar, –abraJar /somepath/ABRA.jar
- Full Path to the ABRA jar file.
- -tr /somepath/targetRegion.bed, –targetRegion /somepath/targetRegion.bed
- Full Path to the target region bed file
- -j /somepath/java, –javaPATH /somepath/java
- Path to java executable.
- -b /somepath/bin, –bwaPATH /somepath/bin
- Path to the bin of bwa executable.
- -q all.q or clin.q, –queue all.q or clin.q
- Name of the SGE queue
- -o /somepath/output, –outDir /somepath/output
- Full Path to the output dir.
- -qsub /somepath/qsub, –qsubPath /somepath/qsub
- Full Path to the qsub executables of SGE.
- -bsub /somepath/bsub, –bsubPath /somepath/bsub
- Full Path to the bsub executables of LSF.
Call Indels > 25bp using Pindel¶
usage: Run_Pindel.py [options]
Run Pindel for Long Indels & MNPS (32bp-350bp)
- optional arguments:
-h, --help show this help message and exit - -i pindel.conf, –pindelConfig pindel.conf
- Full path to the pindel configuration
- -pId PatientID, –patientId PatientID
- Id of the Patient for which the bam files are to be realigned
-v, --verbose make lots of noise [default] -t 5, –threads 5 Number of Threads to be used to run Pindel -r /somepath/Homo_Sapeins_hg19.fasta, –referenceFile /somepath/Homo_Sapeins_hg19.fasta
Full Path to the reference file with the bwa index.- -p /somepath/pindel/bin, –pindelDir /somepath/pindel/bin
- Full Path to the Pindel executables.
- -chr ALL, –chromosomes ALL
- Which chr/fragment. Pindel will process reads for one chromosome each time. ChrName must be the same as in reference sequence and in read file.
- -q all.q or clin.q, –queue all.q or clin.q
- Name of the SGE queue
- -o /somepath/output, –outDir /somepath/output
- Full Path to the output dir.
- -op TumorID, –outPrefix TumorID
- Id of the Tumor bam file which will be used as the prefix for Pindel output files
- -qsub /somepath/qsub, –qsubPath /somepath/qsub
- Full Path to the qsub executables of SGE.
- -bsub /somepath/bsub, –bsubPath /somepath/bsub
- Full Path to the bsub executables of LSF.
Script in the support-scripts folder¶
Calculate intervals from bam file that have some minimum coverage¶
usage: Run_FindCoveredInterval.py [options]
This will run find covered interval program from GATK.
- optional arguments:
-h, --help show this help message and exit - -i BamFile.list, –bamList BamFile.list
- Full path to the tumor bam files as a fof.
- -of OutFilePrefix, –outFilePrefix OutFilePrefix
- Output Covered Interval File Prefix for the bam files.
-v, --verbose make lots of noise [default] - -t 5, –threads 5 Number of Threads to be used to run
- FindCoveredIntervals
- -dp 20, –totaldepth 20
- Total depth threshold
- -mbq 20, –minbasequality 20
- Threshold for minimum base quality for Running Find Covered Interval
- -mmq 20, –minmappingquality 20
- Threshold for minimum mapping quality for Running Find Covered Interval
- -r /somepath/Homo_Sapeins_hg19.fasta, –referenceFile /somepath/Homo_Sapeins_hg19.fasta
- Full Path to the reference file with the bwa index.
- -g /somepath/GenomeAnalysisTK.jar, –gatkJar /somepath/GenomeAnalysisTK.jar
- Full Path to the GATK jar file.
- -j /somepath/java, –javaPATH /somepath/java
- Path to java executable.
- -q all.q or clin.q, –queue all.q or clin.q
- Name of the SGE queue
- -o /somepath/output, –outDir /somepath/output
- Full Path to the output dir.
- -qsub /somepath/qsub, –qsubPath /somepath/qsub
- Full Path to the qsub executables of SGE.
- -bsub /somepath/bsub, –bsubPath /somepath/bsub
- Full Path to the bsub executables of LSF.
Annotating variants in merged variant file¶
dmp_annotate_variants.pl [options]
–SomaticMutIndelFile | -si S File containing mutations (required and submit with full path,Ex:/SomePath/Some_SomaticMutIndel.txt)
—ConfigurationFile | -c S Configuration file that contains the locations for the programs and the databases (required and submit with full path)
–titleFile | -t S tab-delimited title file for the samples (required and submit with full path)
—outdir | -o S Path where all the output files will be written (optional;default:cwd)
–exonCoverageFile | -ec S Path where the all exon coverage file is located (full path]
—geneCoverageFile | -gc S Path where the gene coverage file is located (full path)]
—deleteUnwantedFiles | -d I 2=>To delete files 1=> To keep files (default:2,optional)
Filter variants after annotation¶
dmp_filter_genotyped_variants.pl [options]
–input | -i S File containing mutations with genotype information (required)
—hotspots | -h S File containing the list of hotspots (required)
–clinicalExons | -ce S File containing the list of clinical exons (required)
—titleFile | -t S Title file (required)
–minimumDPforSNVs | -dp_snv I Minumum accepted DP for novel SNVs (default: 20)
—minimumADforSNVs | -ad_snv I Minimum accepted AD for novel SNVs (default: 10)
–minimumVFforSNVs | -vf_snv F Minimum accepted VF for novel SNVs (default: 0.05)
—minimumDPforSNVhotspot | -dp_snvHS I Minumum accepted DP for Hotspot SNVs (default: 20)
–minimumADforSNVhotspot | -ad_snvHS I Minimum accepted AD for Hotspot SNVs (default: 8)
—minimumVFforSNVhotspot | -vf_snvHS F Minimum accepted VF for Hotspot SNVs (default: 0.02)
–minimumDPforINDELs | -dp_indel I Minumum accepted DP for novel INDELs (default: 20)
—minimumADforINDELs | -ad_indel I Minimum accepted AD for novel INDELs (default: 10)
–minimumVFforINDELs | -vf_indel I Minimum accepted VF for novel INDELs (default: 0.05)
—minimumDPforINDELhotspot | -dp_indelHS F Minumum accepted DP for Hotspot INDELs (default: 20)
–minimumADforINDELhotspot | -ad_indelHS I Minimum accepted AD for Hotspot INDELs (default: 8)
—minimumVFforINDELhotspot | -vf_indelHS F Minimum accepted VF for Hotspot INDELs (default: 0.02)
–minimumOccurrencePercent | -occurrence S Minimum accepted value of occurrence in other normals, in percent (default: 20)
—TNfreqRatioThreshold | -tn_ratio S Minimum value for VFt/VFn value (default: 5)
–MAFthreshold | -mt F Minimum accepted MAF values for unmatched variant calls (default : 0.01)
Filter indels from SomaticIndelDetector before genotyping¶
dmp_filter_indel.pl [options]
–IndelTxtFile|t S tab-delimted Indel file describing details about the mutations (required)
—IndelVcfFile|v S VCF format Indel file describing details about the mutations (required)
–sampleName|s S Name of the sample (required)
—totaldepth|dp I Tumor total depth threshold for Somatic Indel Detector(default:0,optional)
–alleledepth|ad I Tumor Allele depth threshold for Somatic Indel Detector(default:3,optional)
—variantfreq|vf F Tumor variant frequency threshold for Somatic Indel Detector(default:0.01,optional)
–TNratio|tnr I Tumor-Normal variant frequency ratio threshold for Somatic Indel Detector(default:5,optional)
—outdir|o S Path where all the output files will be written (optional) default:current working directory
Filter snv from MuTect before genotyping¶
dmp_filter_mutect.pl [options]
–MutationTxtFile|t S tab-delimted Mutect file describing details about the mutations (required)
–MutationVcfFile|v S VCF format Mutect file describing details about the mutations (required)
—sampleName|s S Name of the sample (required)
–totaldepth|dp I Tumor total depth threshold for Mutect(default:0,optional).
—alleledepth|ad I Tumor Allele depth threshold for Mutect(default:3,optional).
–variantfreq|vf F Tumor variant frequency threshold for Mutect(default:0.01,optional).
—TNratio|tnr I Tumor-Normal variant frequency ratio threshold for Mutect(default:5,optional).
–outdir|o S Path where all the output files will be written (optional) default:current working directory
Filter indels from PINDEL before genotyping¶
usage: dmp_filter_pindel.py [options]
Filter Indels from the output of pindel
- optional arguments:
-h, --help show this help message and exit -v, --verbose make lots of noise [default] - -i SomeID.vcf, -inputVcf SomeID.vcf
- Input vcf freebayes file which needs to be filtered
- -tsn SomeName, –tsampleName SomeName
- Name of the tumor Sample
- -dp 0, –totaldepth 0
- Tumor total depth threshold
- -ad 3, –alleledepth 3
- Tumor allele depth threshold
-tnr 5, –tnRatio 5 Tumor-Normal variant frequency ratio threshold -vf 0.01, –variantfrequency 0.01
Tumor variant frequency threshold- -o /somepath/output, –outDir /somepath/output
- Full Path to the output dir.
- -min 25, –min_var_len 25
- Minimum length of the Indels
- -max 500, –max_var_len 500
- Max length of the Indels