==========================
Description of sub-scripts
==========================

--------------------------
Script in the bin folder
--------------------------

Compiling QC metrics, Generating QC-Report and running copynumber analysis
==========================================================================

**dmp_compile_qc_metrics.pl [options]**

    --bamList | -i             S File of files having list of all bam files (required)
    --titleFile | -t           S tab-delimited title file for the samples (required and submit with full path)
    --AllMetrics | -am         S Path to AllMetrics script (required)
    --LoessNorm | -ln          S Path to Loess Normalization Script (required)
    --BestCN | -cn             S Path to Best Copy Number Script (required)
    --GCBias | -gcb            S Path to GC bias file (required)
    --HistNorm | -his          S Path to Directory with all historical normal files (required)
    --queue | -q               S Name of the Sun Grid Engine Queue where the pipeline needs to run (required)
    --qsub                     S Path to qsub executable for SGE (default:None,optional)
    --bsub                     S Path to bsub executable for LSF (default:None,required)
    --metricsScript | -ms      S Name of the script used to generate .html and .pdf files
    --outdir | -o              S Path where all the output files will be written (optional) [default:cwd]

Genotyping Variants across multiple samples
===========================================

**dmp_genotype_allele.pl [options]**

    --FilteredMutationVcfFile | -fmv   S vcf file describing details about the mutations (required)
    --BamFile | -bam                   S bam file to be used for genotyping (required)
    --RefFile | -rf                    S Path to genome reference file (required)
    --samtools | -s                    S Path to samtools (required)
    --bedtools | -b                    S Path to bedtools (required)
    --MinBaseQuality | -mbq            I Min. Base Quality Threshold (optional;default:5)
    --MinMappingQuality | -mmq         I Min. Mapping Quality Threshold (optional;default:5)
    --deleteUnwantedFiles | -d         I 2=>To delete files 1=> To keep files (default:2,optional)
    --outdir | -o                      S Path where all the output files will be written (optional;default:current working directory)
    --outFile | -of                    S Name of the allele depth output file (optional;default:BamFile-.bam+_mpileup.alleledepth)
    --bamId | -bi                      S Bam Id to be used (optional;default:bamfile name)
    --queue | -q                       S Name of the SGE / LSF Queue where the pipeline needs to run (required)
    --qsub                             S Path to qsub executable for SGE (default:None,optional)
    --bsub                             S Path to bsub executable for LSF (default:None,required)
    --mpileUpOutFile | -mof            S Name of samtools mpileup output file (optional;default:BamFile-.bam+.mpileup)
    --typeOfSample | -tos              S Type of Sample (optional;default:Tumor;can be Tumor or Normal)

Running Indel Realignment using ABRA
====================================

**usage: Run_AbraRealignment.py [options]**

Run ABRA Indel Realignment

arguments:

    -h, --help            show this help message and exit
    -i BamFile.list, --bamList BamFile.list
                          Full path to the tumor bam files as a fof.
    -p PatientID, --patientId PatientID
                          Id of the Patient for which the bam files are to be
                          realigned
    -v, --verbose         make lots of noise [default]
    -t 5, --threads 5     Number of Threads to be used to run ABRA
    -d, --mdp             Threshold for downsampling depth to run ABRA
    -k [43 [43 ...]], --kmers [43 [43 ...]]
                          Number of k-mers to be used to run ABRA; Multiple
                          k-mers are separated by space
    -temp /somepath/tmpdir, --temporaryDirectory /somepath/tmpdir
                          Full Path to temporary directory
    -r /somepath/Homo_Sapeins_hg19.fasta, --referenceFile /somepath/Homo_Sapeins_hg19.fasta
                          Full Path to the reference file with the bwa index.
    -a /somepath/ABRA.jar, --abraJar /somepath/ABRA.jar
                          Full Path to the ABRA jar file.
    -tr /somepath/targetRegion.bed, --targetRegion /somepath/targetRegion.bed
                          Full Path to the target region bed file
    -j /somepath/java, --javaPATH /somepath/java
                          Path to java executable.
    -b /somepath/bin, --bwaPATH /somepath/bin
                          Path to the bin of bwa executable.
    -q all.q or clin.q, --queue all.q or clin.q
                          Name of the SGE queue
    -o /somepath/output, --outDir /somepath/output
                          Full Path to the output dir.
    -qsub /somepath/qsub, --qsubPath /somepath/qsub
                          Full Path to the qsub executables of SGE.
    -bsub /somepath/bsub, --bsubPath /somepath/bsub
                          Full Path to the bsub executables of LSF.

Call Indels > 25bp using Pindel
===============================

**usage: Run_Pindel.py [options]**

Run Pindel for Long Indels & MNPS (32bp-350bp)

optional arguments:

    -h, --help            show this help message and exit
    -i pindel.conf, --pindelConfig pindel.conf
                          Full path to the pindel configuration
    -pId PatientID, --patientId PatientID
                          Id of the Patient for which the bam files are to be
                          realigned
    -v, --verbose         make lots of noise [default]
    -t 5, --threads 5     Number of Threads to be used to run Pindel
    -r /somepath/Homo_Sapeins_hg19.fasta, --referenceFile /somepath/Homo_Sapeins_hg19.fasta
                          Full Path to the reference file with the bwa index.
    -p /somepath/pindel/bin, --pindelDir /somepath/pindel/bin
                          Full Path to the Pindel executables.
    -chr ALL, --chromosomes ALL
                          Which chr/fragment. Pindel will process reads for one
                          chromosome each time. ChrName must be the same as in
                          reference sequence and in read file.
    -q all.q or clin.q, --queue all.q or clin.q
                          Name of the SGE queue
    -o /somepath/output, --outDir /somepath/output
                          Full Path to the output dir.
    -op TumorID, --outPrefix TumorID
                          Id of the Tumor bam file which will be used as the
                          prefix for Pindel output files
    -qsub /somepath/qsub, --qsubPath /somepath/qsub
                          Full Path to the qsub executables of SGE.
    -bsub /somepath/bsub, --bsubPath /somepath/bsub
                          Full Path to the bsub executables of LSF.
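
To illustrate how the ``Run_Pindel.py`` options listed above combine into one call, here is a minimal Python sketch that assembles an argument list from the documented short flags. The helper name ``build_pindel_command``, the ``python`` launcher, and the rule that exactly one of ``-qsub`` / ``-bsub`` is passed are assumptions made for this example; they are not taken from the pipeline itself. All paths are the placeholder values from the help text.

```python
import shlex


def build_pindel_command(config, patient_id, reference, pindel_dir, out_dir,
                         out_prefix, queue, threads=5, chromosomes="ALL",
                         qsub=None, bsub=None):
    """Assemble an argv list for Run_Pindel.py (hypothetical helper)."""
    # Assumption for this sketch: a job goes to SGE or to LSF, not both.
    if (qsub is None) == (bsub is None):
        raise ValueError("pass exactly one of qsub or bsub")
    cmd = [
        "python", "Run_Pindel.py",
        "-i", config,            # pindel configuration file
        "-pId", patient_id,      # patient id
        "-t", str(threads),      # number of threads
        "-r", reference,         # reference fasta with bwa index
        "-p", pindel_dir,        # directory with the Pindel executables
        "-chr", chromosomes,     # chromosome name, or ALL
        "-q", queue,             # scheduler queue name
        "-o", out_dir,           # output directory
        "-op", out_prefix,       # prefix for Pindel output files
    ]
    cmd += ["-qsub", qsub] if qsub is not None else ["-bsub", bsub]
    return cmd


example = build_pindel_command(
    config="/somepath/pindel.conf",
    patient_id="PatientID",
    reference="/somepath/Homo_Sapeins_hg19.fasta",
    pindel_dir="/somepath/pindel/bin",
    out_dir="/somepath/output",
    out_prefix="TumorID",
    queue="all.q",
    qsub="/somepath/qsub",
)
# Print a shell-quoted form that can be pasted into a terminal.
print(" ".join(shlex.quote(arg) for arg in example))
```

Keeping the command as a list rather than a single string means it can be handed to ``subprocess.run`` without a shell, so paths containing spaces need no extra quoting.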

------------------------------------
Script in the support-scripts folder
------------------------------------

Calculate intervals from bam file that have some minimum coverage
=================================================================

**usage: Run_FindCoveredInterval.py [options]**

This will run find covered interval program from GATK.

optional arguments:

    -h, --help            show this help message and exit
    -i BamFile.list, --bamList BamFile.list
                          Full path to the tumor bam files as a fof.
    -of OutFilePrefix, --outFilePrefix OutFilePrefix
                          Output Covered Interval File Prefix for the bam files.
    -v, --verbose         make lots of noise [default]
    -t 5, --threads 5     Number of Threads to be used to run
                          FindCoveredIntervals
    -dp 20, --totaldepth 20
                          Total depth threshold
    -mbq 20, --minbasequality 20
                          Threshold for minimum base quality for Running Find
                          Covered Interval
    -mmq 20, --minmappingquality 20
                          Threshold for minimum mapping quality for Running
                          Find Covered Interval
    -r /somepath/Homo_Sapeins_hg19.fasta, --referenceFile /somepath/Homo_Sapeins_hg19.fasta
                          Full Path to the reference file with the bwa index.
    -g /somepath/GenomeAnalysisTK.jar, --gatkJar /somepath/GenomeAnalysisTK.jar
                          Full Path to the GATK jar file.
    -j /somepath/java, --javaPATH /somepath/java
                          Path to java executable.
    -q all.q or clin.q, --queue all.q or clin.q
                          Name of the SGE queue
    -o /somepath/output, --outDir /somepath/output
                          Full Path to the output dir.
    -qsub /somepath/qsub, --qsubPath /somepath/qsub
                          Full Path to the qsub executables of SGE.
    -bsub /somepath/bsub, --bsubPath /somepath/bsub
                          Full Path to the bsub executables of LSF.
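
The ``-i`` / ``--bamList`` option above expects a "fof" (file of files) holding full paths to the bam files. The exact layout of that file is not spelled out here, so the following sketch assumes the common convention of one absolute path per line. The helper name ``write_bam_fof`` is hypothetical and not part of the pipeline.

```python
from pathlib import Path


def write_bam_fof(bam_paths, fof_path):
    """Write BAM paths to a file of files, one absolute path per line.

    Assumption: the pipeline reads the fof line by line; this layout is
    not confirmed by the help text above.
    """
    lines = []
    for bam in bam_paths:
        bam = Path(bam)
        if bam.suffix != ".bam":
            raise ValueError(f"not a .bam file: {bam}")
        # resolve() turns relative paths into full paths, which the
        # option description asks for.
        lines.append(str(bam.resolve()))
    Path(fof_path).write_text("\n".join(lines) + "\n")
    return len(lines)
```

A call such as ``write_bam_fof(["tumor1.bam", "tumor2.bam"], "BamFile.list")`` would then produce a list file whose path can be passed to ``-i``.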

Annotating variants in merged variant file
============================================

**dmp_annotate_variants.pl [options]**

    --SomaticMutIndelFile | -si    S File containing mutations (required and submit with full path,Ex:/SomePath/Some_SomaticMutIndel.txt)
    --ConfigurationFile | -c       S Configuration file that contains the locations for the programs and the databases (required and submit with full path)
    --titleFile | -t               S tab-delimited title file for the samples (required and submit with full path)
    --outdir | -o                  S Path where all the output files will be written (optional;default:cwd)
    --exonCoverageFile | -ec       S Path where the all exon coverage file is located (full path)
    --geneCoverageFile | -gc       S Path where the gene coverage file is located (full path)
    --deleteUnwantedFiles | -d     I 2=>To delete files 1=> To keep files (default:2,optional)

Filter variants after annotation
=================================

**dmp_filter_genotyped_variants.pl [options]**

    --input | -i                               S File containing mutations with genotype information (required)
    --hotspots | -h                            S File containing the list of hotspots (required)
    --clinicalExons | -ce                      S File containing the list of clinical exons (required)
    --titleFile | -t                           S Title file (required)
    --minimumDPforSNVs | -dp_snv               I Minimum accepted DP for novel SNVs (default: 20)
    --minimumADforSNVs | -ad_snv               I Minimum accepted AD for novel SNVs (default: 10)
    --minimumVFforSNVs | -vf_snv               F Minimum accepted VF for novel SNVs (default: 0.05)
    --minimumDPforSNVhotspot | -dp_snvHS       I Minimum accepted DP for Hotspot SNVs (default: 20)
    --minimumADforSNVhotspot | -ad_snvHS       I Minimum accepted AD for Hotspot SNVs (default: 8)
    --minimumVFforSNVhotspot | -vf_snvHS       F Minimum accepted VF for Hotspot SNVs (default: 0.02)
    --minimumDPforINDELs | -dp_indel           I Minimum accepted DP for novel INDELs (default: 20)
    --minimumADforINDELs | -ad_indel           I Minimum accepted AD for novel INDELs (default: 10)
    --minimumVFforINDELs | -vf_indel           F Minimum accepted VF for novel INDELs (default: 0.05)
    --minimumDPforINDELhotspot | -dp_indelHS   I Minimum accepted DP for Hotspot INDELs (default: 20)
    --minimumADforINDELhotspot | -ad_indelHS   I Minimum accepted AD for Hotspot INDELs (default: 8)
    --minimumVFforINDELhotspot | -vf_indelHS   F Minimum accepted VF for Hotspot INDELs (default: 0.02)
    --minimumOccurrencePercent | -occurrence   S Minimum accepted value of occurrence in other normals, in percent (default: 20)
    --TNfreqRatioThreshold | -tn_ratio         S Minimum value for VFt/VFn value (default: 5)
    --MAFthreshold | -mt                       F Minimum accepted MAF values for unmatched variant calls (default: 0.01)

Filter indels from SomaticIndelDetector before genotyping
=========================================================

**dmp_filter_indel.pl [options]**

    --IndelTxtFile|t     S tab-delimited Indel file describing details about the mutations (required)
    --IndelVcfFile|v     S VCF format Indel file describing details about the mutations (required)
    --sampleName|s       S Name of the sample (required)
    --totaldepth|dp      I Tumor total depth threshold for Somatic Indel Detector (default:0,optional)
    --alleledepth|ad     I Tumor Allele depth threshold for Somatic Indel Detector (default:3,optional)
    --variantfreq|vf     F Tumor variant frequency threshold for Somatic Indel Detector (default:0.01,optional)
    --TNratio|tnr        I Tumor-Normal variant frequency ratio threshold for Somatic Indel Detector (default:5,optional)
    --outdir|o           S Path where all the output files will be written (optional;default:current working directory)

Filter snv from MuTect before genotyping
=========================================

**dmp_filter_mutect.pl [options]**

    --MutationTxtFile|t  S tab-delimited Mutect file describing details about the mutations (required)
    --MutationVcfFile|v  S VCF format Mutect file describing details about the mutations (required)
    --sampleName|s       S Name of the sample (required)
    --totaldepth|dp      I Tumor total depth threshold for Mutect (default:0,optional).
    --alleledepth|ad     I Tumor Allele depth threshold for Mutect (default:3,optional).
    --variantfreq|vf     F Tumor variant frequency threshold for Mutect (default:0.01,optional).
    --TNratio|tnr        I Tumor-Normal variant frequency ratio threshold for Mutect (default:5,optional).
    --outdir|o           S Path where all the output files will be written (optional;default:current working directory)

Filter indels from PINDEL before genotyping
===========================================

**usage: dmp_filter_pindel.py [options]**

Filter Indels from the output of pindel

optional arguments:

    -h, --help            show this help message and exit
    -v, --verbose         make lots of noise [default]
    -i SomeID.vcf, --inputVcf SomeID.vcf
                          Input vcf freebayes file which needs to be filtered
    -tsn SomeName, --tsampleName SomeName
                          Name of the tumor Sample
    -dp 0, --totaldepth 0
                          Tumor total depth threshold
    -ad 3, --alleledepth 3
                          Tumor allele depth threshold
    -tnr 5, --tnRatio 5   Tumor-Normal variant frequency ratio threshold
    -vf 0.01, --variantfrequency 0.01
                          Tumor variant frequency threshold
    -o /somepath/output, --outDir /somepath/output
                          Full Path to the output dir.
    -min 25, --min_var_len 25
                          Minimum length of the Indels
    -max 500, --max_var_len 500
                          Max length of the Indels
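
The thresholds documented for ``dmp_filter_pindel.py`` (total depth, allele depth, variant frequency, tumor-normal ratio, and indel length) suggest a simple pass/fail check per variant. The Python sketch below shows one plausible way such a check could be combined, using the default values from the help text. The function name, the inclusive comparisons, and the way the ratio is tested are assumptions for illustration; the actual script may apply the thresholds differently.

```python
def passes_tumor_filters(tumor_dp, tumor_ad, tumor_vf, normal_vf, var_len,
                         dp=0, ad=3, vf=0.01, tnr=5, min_len=25, max_len=500):
    """Return True if a call clears every threshold (hypothetical logic).

    Defaults mirror the documented -dp, -ad, -vf, -tnr, -min and -max
    values. Whether the real script uses >= or > is not documented.
    """
    # Length window for the indel.
    if not (min_len <= var_len <= max_len):
        return False
    # Tumor depth, allele depth and variant frequency thresholds.
    if tumor_dp < dp or tumor_ad < ad or tumor_vf < vf:
        return False
    # Tumor-normal ratio, written as a product so that a normal
    # variant frequency of zero does not cause a division by zero.
    return tumor_vf >= tnr * normal_vf
```

For example, a 40 bp indel with tumor depth 100, allele depth 10, tumor frequency 0.10 and no support in the normal passes under these assumptions, while the same call with a normal frequency of 0.05 fails the ratio test.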