MapScape is a visualization tool for spatial clonal evolution. It displays a cropped anatomical image surrounded by two representations of each tumour sample, which together show the distribution of clones throughout anatomic space. The first, a cellular aggregate or donut view, displays the prevalence of each clone. The second shows a skeleton of the patient’s clonal phylogeny while highlighting only those clones present in the sample.
Note: the cellular aggregate does not accurately represent the positions of clones within a sample. We therefore provide the alternative donut chart view as a less artistic representation of the tumour sample. See the Interactivity section below for instructions to switch between views.
To install MapScape, type the following commands in R:
Run the examples by:
Three visualizations will appear in your browser (optimized for Chrome).
For instance, the first visualization is of metastatic prostate cancer data published in Gundem et al. (2015):
The required parameters for MapScape are as follows:
\(clonal\_prev\) is a data frame consisting of clonal prevalences for each clone in each tumour sample. The columns in this data frame are:
\(tree\_edges\) is a data frame describing the edges of a rooted clonal phylogeny. The columns in this data frame are:
\(sample\_locations\) is a data frame describing the anatomical locations for each tumour sample. The columns in this data frame are:
\(img\_ref\) is a reference to the custom anatomical image to use, in PNG format, given as either a URL to an image hosted online or a path to an image on the local file system.
\(mutations\) is a data frame consisting of the mutations originating in each clone. The columns in this data frame are:
If this parameter is provided, a mutation table will appear at the bottom of the view.
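As a minimal sketch of what "rooted" means for \(tree\_edges\), the Python snippet below checks that an edge list has exactly one root, that no clone has more than one parent, and that every clone is reachable from the root. It is an illustration only and not part of MapScape: the function name find_root is hypothetical, and the (source, target) pair layout is assumed from the source and target column names used for E-scape input.

```python
from collections import defaultdict


def find_root(tree_edges):
    """Return the single root of a (source, target) edge list.

    Raises ValueError if the edges do not describe a rooted tree.
    Hypothetical helper for illustration; not a MapScape function.
    """
    parents = defaultdict(set)    # clone -> set of parent clones
    children = defaultdict(list)  # clone -> list of child clones
    nodes = set()
    for source, target in tree_edges:
        nodes.update((source, target))
        parents[target].add(source)
        children[source].append(target)

    # In a rooted tree, every clone except the root has exactly one parent.
    multi = sorted(n for n, p in parents.items() if len(p) > 1)
    if multi:
        raise ValueError(f"clones with more than one parent: {multi}")

    roots = sorted(n for n in nodes if n not in parents)
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found: {roots}")
    root = roots[0]

    # Every clone must be reachable from the root (rules out detached cycles).
    seen, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    if seen != nodes:
        raise ValueError(f"clones not reachable from root: {sorted(nodes - seen)}")
    return root
```

Running such a check before plotting can catch malformed edge lists early, for example a clone that is listed with two different parents.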
The parameter \(sample\_ids\) is used to specify the order in which the user would like to display the samples radially in the visualization. Compare:
The \(n\_cells\) parameter specifies how many cells should be shown in the cellular aggregate representation of each tumour sample. Compare:
If \(show\_low\_prev\_gtypes\) is set to FALSE, the low-prevalence (<0.01) genotypes will NOT be shown in the phylogenetic tree of each tumour sample. If, however, \(show\_low\_prev\_gtypes\) is set to TRUE, the low-prevalence genotypes WILL be shown in the phylogenetic tree of each tumour sample as empty circles. Note that some clonality inference methods always assign a non-zero value to each clone in each sample, indicating that there is some (albeit small) probability of that clone existing in the sample. Hence, if this parameter is set to TRUE, it may be that all clones are shown in each tumour sample’s phylogeny. Compare:
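The two settings can be sketched in Python as follows. This is an illustration of the rule described above, not MapScape's implementation: the function name, the "filled"/"empty" labels, and the choice to hide clones with a prevalence of exactly zero are assumptions inferred from the text.

```python
LOW_PREV_THRESHOLD = 0.01  # threshold stated in the text above


def clones_in_sample_tree(prevalences, show_low_prev_gtypes=False):
    """Map each clone shown in one sample's phylogeny to a display style.

    prevalences: dict of clone_id -> clonal prevalence in a single sample.
    Hypothetical helper for illustration; not a MapScape function.
    """
    shown = {}
    for clone_id, prev in prevalences.items():
        if prev >= LOW_PREV_THRESHOLD:
            shown[clone_id] = "filled"
        elif show_low_prev_gtypes and prev > 0:
            # Low-prevalence clone, drawn as an empty circle.
            shown[clone_id] = "empty"
        # Otherwise the clone is omitted from this sample's tree.
    return shown


sample = {"A": 0.6, "B": 0.395, "C": 0.005, "D": 0.0}
hidden_low = clones_in_sample_tree(sample, show_low_prev_gtypes=False)
shown_low = clones_in_sample_tree(sample, show_low_prev_gtypes=True)
```

Here hidden_low contains only clones A and B, while shown_low additionally contains clone C marked as "empty". If an inference method assigns every clone a small non-zero prevalence, the second call would return every clone.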
Many titles throughout the view may be altered, including the phylogeny title (parameter \(phylogeny\_title\)), anatomy title in the legend (parameter \(anatomy\_title\)), and classification title in the legend (parameter \(classification\_title\)).
Each sample can be represented as either a cellular aggregate or a donut chart.
E-scape takes as input a clonal phylogeny and clonal prevalences per clone per sample. At the time of submission many methods have been proposed for obtaining these values, and accurate estimation of these quantities is the focus of ongoing research. We describe a method for estimating clonal phylogenies and clonal prevalences using PyClone (Roth et al., 2014; source code available at https://bitbucket.org/aroth85/pyclone/wiki/Home) and citup (Malikic et al., 2015; source code available at https://github.com/sfu-compbio/citup).
In brief, PyClone inputs are prepared by processing the fastq files resulting from a targeted deep sequencing experiment. Using samtools mpileup (http://samtools.sourceforge.net/mpileup.shtml), the numbers of nucleotides matching the reference and non-reference alleles are counted for each targeted SNV. Copy number is also required for each SNV. We recommend inferring copy number from whole genome or whole exome sequencing of samples taken from the same anatomic location / timepoint as the samples to which targeted deep sequencing was applied. Copy number can be inferred using Titan (Ha et al., 2014; source code available at https://github.com/gavinha/TitanCNA). Sample-specific SNV information is compiled into a set of TSV files, one per sample. Each table includes the mutation id, reference and variant read counts, normal copy number, and major and minor tumour copy number (see PyClone readme). PyClone is run on these files using the PyClone run_analysis_pipeline subcommand, and produces the tables/cluster.tsv file in the working directory.
Citup can be used to infer a clonal phylogeny and clone prevalences from the cellular prevalences produced by PyClone. The tables/cluster.tsv file contains per-sample, per-SNV-cluster estimates of cellular prevalence. The table is reshaped into a TSV file of cellular prevalences with rows as clusters and columns as samples, using the mean of each cluster taken from tables/cluster.tsv for the values of the table. The iterative version of citup is run on the table of cellular frequencies, producing an hdf5 output results file. Within the hdf5 results, the /results/optimal entry can be used to identify the id of the optimal tree solution. The clonal phylogeny as an adjacency list is then the /trees/{tree_solution}/adjacency_list entry and the clone frequencies are the /trees/{tree_solution}/clone_freq entry in the hdf5 file. The adjacency list can be written as a TSV with the column names source, target to be input into E-scape, and the clone frequencies should be reshaped such that each row represents a clonal frequency in a specific sample for a specific clone, with the columns representing the time or space ID, the clone ID, and the clonal prevalence.
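The reshaping steps described above can be sketched in Python using only the standard library. This is a sketch under stated assumptions rather than a definitive implementation: the function names are hypothetical, the sample_id and cluster_id column names of tables/cluster.tsv are assumed (only the mean column is named above), the citup input is written without header rows, clone_freq is assumed to have one row per clone and one column per sample, and the output header names sample_id, clone_id and clonal_prev are placeholders to check against the package documentation. Reading the hdf5 file itself is left out.

```python
import csv
import io  # handy for trying the functions on in-memory text


def cluster_table_to_matrix(cluster_tsv, out_tsv):
    """Reshape long-format cluster rows into a clusters x samples table of means.

    Returns (cluster_ids, sample_ids) in the row/column order written.
    Column names sample_id and cluster_id are assumptions.
    """
    means = {}      # cluster_id -> {sample_id: mean}
    samples = []    # sample ids in order of first appearance
    for row in csv.DictReader(cluster_tsv, delimiter="\t"):
        means.setdefault(row["cluster_id"], {})[row["sample_id"]] = row["mean"]
        if row["sample_id"] not in samples:
            samples.append(row["sample_id"])
    writer = csv.writer(out_tsv, delimiter="\t", lineterminator="\n")
    for cluster_id, by_sample in means.items():
        # Raises KeyError if a cluster has no estimate for some sample.
        writer.writerow([by_sample[s] for s in samples])
    return list(means), samples


def edges_to_tsv(adjacency_list, out_tsv):
    """Write (source, target) pairs as a TSV with a source/target header."""
    writer = csv.writer(out_tsv, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "target"])
    writer.writerows(adjacency_list)


def clone_freq_to_long(clone_freq, sample_ids, out_tsv):
    """Write one row per (sample, clone) pair.

    clone_freq[c][s] is assumed to be the frequency of clone c in sample s;
    transpose the matrix first if it is stored the other way round.
    The row index is used as the clone ID (an assumption).
    """
    writer = csv.writer(out_tsv, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample_id", "clone_id", "clonal_prev"])  # assumed names
    for clone_id, row in enumerate(clone_freq):
        for sample_id, freq in zip(sample_ids, row):
            writer.writerow([sample_id, clone_id, freq])
```

Each function accepts open file objects, so it can be used with files on disk or with io.StringIO buffers when experimenting. Keeping the returned sample order from the first step makes it possible to label the columns of the citup clone frequencies in the last step.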
Interactive components in the top toolbar:
Interactive components in main view:
Interactive components in legend:
Interactive components in mutation table:
To view the documentation for MapScape, type the following command in R:
or:
MapScape was developed at the Shah Lab for Computational Cancer Biology at the BC Cancer Research Centre.
References:
Gundem, Gunes, et al. “The evolutionary history of lethal metastatic prostate cancer.” Nature 520.7547 (2015): 353-357.
Ha, Gavin, et al. “TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data.” Genome research 24.11 (2014): 1881-1893.
Malikic, Salem, et al. “Clonality inference in multiple tumor samples using phylogeny.” Bioinformatics 31.9 (2015): 1349-1356.
McPherson, Andrew, et al. “Divergent modes of clonal spread and intraperitoneal mixing in high-grade serous ovarian cancer.” Nature genetics (2016).
Roth, Andrew, et al. “PyClone: statistical inference of clonal population structure in cancer.” Nature methods 11.4 (2014): 396-398.