1 Genotype cancer hotspots

cancerhotspots allows rapid genotyping of known somatic hotspots from tumor BAM files. It gives a quick overview of 3,181 known somatic hotspots in a matter of minutes, without spending hours on variant calling and annotation. Put simply, it fetches nucleotide frequencies at known somatic hotspots and prioritizes them by allele frequency.

The output includes a browsable HTML file with the variants that pass the VAF and read-depth filters, and a TSV file with nucleotide counts for all variants analyzed.

Input BAM file           :  Tumor.bam
Variants                 :  cancerhotspots_v2_GRCh37.tsv
VAF filter               :  0.050
min reads for t_allele   :  8
MAPQ filter              :  10
FLAG filter              :  1024
Coverage filter          :  30
HTSlib version           :  1.7

Processed 1000 entries..
Processed 2000 entries..
Processed 3000 entries..
Done!

Summary:
Total variants processed :  3181
Variants > 0.05 threshold:  3
Avg. depth of coverage   :  83.02
Output html report       :  Tumor.html
Output TSV file          :  Tumor.tsv

The command above generates an HTML report and a TSV file with the read counts.
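To make the filtering step concrete, the thresholds reported in the log (VAF 0.05, at least 8 reads supporting the tumor allele, coverage of at least 30) can be applied to a table of counts by hand. The following is a minimal base R sketch and not the maftools implementation: the variant names, counts, and column names are hypothetical, depth is simplified to reference plus alternate reads, and the use of inclusive (`>=`) comparisons is an assumption.

```r
# Hypothetical read counts at three hotspots (illustrative values, not real data)
counts <- data.frame(
  variant     = c("hs1", "hs2", "hs3"),
  t_ref_count = c(90, 25, 40),
  t_alt_count = c(10, 2, 3),
  stringsAsFactors = FALSE
)

# Thresholds taken from the run log
min_vaf       <- 0.05
min_alt_reads <- 8
min_depth     <- 30

# Simplification: depth is ref + alt reads only
counts$t_depth <- counts$t_ref_count + counts$t_alt_count
counts$vaf     <- counts$t_alt_count / counts$t_depth

# Keep variants that pass all three filters
# (assumption: thresholds are inclusive)
keep <- counts$vaf >= min_vaf &
  counts$t_alt_count >= min_alt_reads &
  counts$t_depth >= min_depth
passed <- counts[keep, ]

# Prioritize by allele frequency, highest first
passed <- passed[order(-passed$vaf), ]
print(passed)
```

In this toy table only `hs1` survives: `hs2` fails the depth and supporting-read filters, and `hs3` has enough coverage and VAF but too few supporting reads.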

A CLI version of cancerhotspots is also available.

2 Fetch readcounts for targeted loci

The bamreadcounts function extracts the nucleotide distribution at targeted loci from BAM files. The function name is an homage to the bam-readcount tool, and the function additionally supports INDELs.

##     chr  pos
## 1: seq1 1340
## 2: seq2 1483
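To illustrate what a nucleotide distribution at a locus means, the sketch below counts bases at the two loci from the table above using a handful of made-up reads. It is a toy pileup in base R under strong assumptions (hypothetical reads, 1-based starts, no indels, clipping, or quality filters) and is not how bamreadcounts works internally, since that function reads real BAM files. The helper `count_bases` and the `reads` table are hypothetical names introduced for this example.

```r
# Toy aligned reads: 1-based start position and ungapped sequence
reads <- data.frame(
  chr   = c("seq1", "seq1", "seq1", "seq2"),
  start = c(1338, 1339, 1340, 1480),
  seq   = c("ACGTA", "CGTAC", "TTACG", "GGCAT"),
  stringsAsFactors = FALSE
)

# Target loci: chromosome name and position
loci <- data.frame(
  chr = c("seq1", "seq2"),
  pos = c(1340, 1483),
  stringsAsFactors = FALSE
)

# Count A/C/G/T among reads overlapping one position (hypothetical helper)
count_bases <- function(chr, pos, reads) {
  hit <- reads$chr == chr &
    reads$start <= pos &
    pos < reads$start + nchar(reads$seq)
  # Offset of the target position within each overlapping read
  offset <- pos - reads$start[hit] + 1
  bases <- substr(reads$seq[hit], offset, offset)
  vapply(c("A", "C", "G", "T"), function(b) sum(bases == b), integer(1))
}

rows <- lapply(seq_len(nrow(loci)), function(i) {
  count_bases(loci$chr[i], loci$pos[i], reads)
})
rc <- cbind(loci, do.call(rbind, rows))
print(rc)
```

Each row of `rc` holds the per-base counts for one locus, the same shape of information that a readcount table provides for real alignments.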

Get readcounts for every nucleotide from a BAM file

## R version 4.1.2 (2021-11-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.14-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.14-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] maftools_2.10.05
## 
## loaded via a namespace (and not attached):
##  [1] lattice_0.20-45    digest_0.6.29      grid_4.1.2         R6_2.5.1          
##  [5] jsonlite_1.8.0     magrittr_2.0.2     evaluate_0.15      stringi_1.7.6     
##  [9] rlang_1.0.1        cli_3.2.0          data.table_1.14.2  jquerylib_0.1.4   
## [13] Matrix_1.4-0       bslib_0.3.1        rmarkdown_2.11     splines_4.1.2     
## [17] RColorBrewer_1.1-2 tools_4.1.2        stringr_1.4.0      survival_3.2-13   
## [21] xfun_0.29          yaml_2.3.5         fastmap_1.1.0      compiler_4.1.2    
## [25] htmltools_0.5.2    knitr_1.37         sass_0.4.0