TimeScape is a visualization tool for temporal clonal evolution.
To install TimeScape, type the following commands in R:
Run the examples by:
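The two steps above (installation, then running the bundled examples) might look like the following sketch. It assumes the Bioconductor release of the package, published under the name timescape; if you are working from a different source (for example a development repository), the installation call will differ.

```r
# Assumption: the package is installed from Bioconductor as "timescape".
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("timescape")

# Attach the package and run the examples shipped with its help page.
library(timescape)
example("timescape")
```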
The following visualizations will appear in your browser (optimized for Chrome):
The first visualization is of the acute myeloid leukemia patient from Ding et al., 2012. The second visualization is of the metastatic ovarian cancer patient 7 from McPherson and Roth et al., 2016.
The required parameters for TimeScape are as follows:
\(clonal\_prev\) is a data frame consisting of clonal prevalences for each clone at each time point. The columns in this data frame are:
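As an illustration, a clonal prevalence table for a hypothetical patient with three clones and two time points could be built as below. The column names (timepoint, clone_id, clonal_prev) are an assumption based on the description of the input; please confirm them against the package help page before use.

```r
# Hypothetical toy data: three clones (A, B, C) at two time points (T1, T2).
# Column names are assumed; confirm them with the package documentation.
clonal_prev <- data.frame(
  timepoint   = rep(c("T1", "T2"), each = 3),
  clone_id    = rep(c("A", "B", "C"), times = 2),
  clonal_prev = c(0.7, 0.3, 0.0,
                  0.2, 0.3, 0.5),
  stringsAsFactors = FALSE
)

# Sanity check for this toy data set: total prevalence per time point.
prev_totals <- tapply(clonal_prev$clonal_prev, clonal_prev$timepoint, sum)
print(prev_totals)
```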
\(tree\_edges\) is a data frame describing the edges of a rooted clonal phylogeny. The columns in this data frame are:
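A minimal sketch of an edge table for a hypothetical phylogeny in which clone A is the root and clones B and C are its children. The source and target column names follow the adjacency-list description given later in this document; the clone ids are made up.

```r
# Hypothetical rooted phylogeny: A -> B and A -> C.
tree_edges <- data.frame(
  source = c("A", "A"),
  target = c("B", "C"),
  stringsAsFactors = FALSE
)

# In a rooted tree, exactly one node appears as a source but never as a target.
root <- setdiff(unique(tree_edges$source), tree_edges$target)
stopifnot(length(root) == 1)
print(root)
```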
\(mutations\) is a data frame consisting of the mutations originating in each clone. The columns in this data frame are:
If this parameter is provided, a mutation table will appear at the bottom of the view.
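A mutation table for the same kind of toy example might look like the sketch below. The column names (chrom, coord, clone_id, timepoint, VAF) and all values are assumptions made for illustration, with one row per mutation per time point; check the package help page for the exact layout expected.

```r
# Hypothetical mutations: one originating in clone B, one in clone C,
# each listed once per time point. Column names are assumed.
mutations <- data.frame(
  chrom     = c("1", "1", "7", "7"),
  coord     = c(1234567, 1234567, 7654321, 7654321),
  clone_id  = c("B", "B", "C", "C"),
  timepoint = c("T1", "T2", "T1", "T2"),
  VAF       = c(0.15, 0.14, 0.00, 0.25),
  stringsAsFactors = FALSE
)

# Variant allele frequencies are proportions, so they should lie in [0, 1].
vaf_ok <- all(mutations$VAF >= 0 & mutations$VAF <= 1)
print(vaf_ok)
```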
Clone colours may be changed using the \(clone\_colours\) parameter. For instance, compare the default colours:
with specified custom colours:
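A custom colour table could be sketched as follows, assuming the parameter takes a data frame with one row per clone and columns named clone_id and colour holding hex colour strings. The clone ids and colours here are hypothetical.

```r
# Hypothetical colour assignment; column names are assumed.
clone_colours <- data.frame(
  clone_id = c("A", "B", "C"),
  colour   = c("#1b9e77", "#d95f02", "#7570b3"),
  stringsAsFactors = FALSE
)

# Check that every colour is a six-digit hex string such as "#1b9e77".
valid_hex <- grepl("^#[0-9A-Fa-f]{6}$", clone_colours$colour)
print(valid_hex)
```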
The alpha value of each colour may be adjusted with the \(alpha\) parameter (a numeric value in the range [0, 100]). Compare an alpha of 10:
with an alpha of 90:
The x-axis, y-axis and phylogeny titles may be changed using the \(xaxis\_title\), \(yaxis\_title\) and \(phylogeny\_title\) parameters, which take in a character string.
Here are some custom titles:
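The alpha and title settings described above could be combined in a single call, as in this sketch. The toy inputs and the title strings are hypothetical, and the call itself only runs if the timescape package is installed.

```r
# Hypothetical toy inputs (column names assumed, as elsewhere in this document).
clonal_prev <- data.frame(
  timepoint   = rep(c("T1", "T2"), each = 3),
  clone_id    = rep(c("A", "B", "C"), times = 2),
  clonal_prev = c(0.7, 0.3, 0.0, 0.2, 0.3, 0.5),
  stringsAsFactors = FALSE
)
tree_edges <- data.frame(source = c("A", "A"), target = c("B", "C"),
                         stringsAsFactors = FALSE)

# Display options: a faint fill (alpha = 10) and custom titles.
display_args <- list(
  clonal_prev     = clonal_prev,
  tree_edges      = tree_edges,
  alpha           = 10,
  xaxis_title     = "Months since diagnosis",
  yaxis_title     = "Relative clonal prevalence",
  phylogeny_title = "Clonal phylogeny"
)
stopifnot(display_args$alpha >= 0, display_args$alpha <= 100)

# Build the widget only when the package is available.
if (requireNamespace("timescape", quietly = TRUE)) {
  widget <- do.call(timescape::timescape, display_args)
  # Evaluate `widget` at the console to open it in the viewer or browser.
}
```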
The position of each genotype with respect to its ancestor can be altered. The “stack” layout is the default. It stacks genotypes one on top of another to clearly display genotype prevalences at each time point. The “space” layout uses the same stacking method while maintaining (where possible) a minimum amount of space between each genotype. The “centre” layout centres genotypes with respect to their ancestors. Here we’ll see an example of each:
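One way to produce all three layouts is to loop over the layout names, as sketched here. The argument name genotype_position is an assumption that should be checked against the package help page, and the toy inputs are hypothetical.

```r
# Hypothetical toy inputs (column names assumed).
clonal_prev <- data.frame(
  timepoint   = rep(c("T1", "T2"), each = 3),
  clone_id    = rep(c("A", "B", "C"), times = 2),
  clonal_prev = c(0.7, 0.3, 0.0, 0.2, 0.3, 0.5),
  stringsAsFactors = FALSE
)
tree_edges <- data.frame(source = c("A", "A"), target = c("B", "C"),
                         stringsAsFactors = FALSE)

# The three layout names described in the text.
layouts <- c("stack", "space", "centre")

# Build one widget per layout, only when the package is available.
if (requireNamespace("timescape", quietly = TRUE)) {
  widgets <- lapply(layouts, function(layout) {
    timescape::timescape(clonal_prev = clonal_prev,
                         tree_edges = tree_edges,
                         genotype_position = layout)  # assumed argument name
  })
  names(widgets) <- layouts
}
```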
Perturbation events may be added to the TimeScape view using the \(perturbations\) parameter. Each perturbation adds a label along the x-axis at the point where it occurs. The \(perturbations\) parameter is a data frame consisting of the following columns:
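A perturbation table might be sketched as below, assuming two columns: pert_name for the label and prev_tp for the time point immediately before the perturbation. Both column names and the example event are assumptions made for illustration.

```r
# Time points present in the (hypothetical) clonal prevalence data.
timepoints <- c("T1", "T2")

# One hypothetical perturbation, occurring after time point T1.
perturbations <- data.frame(
  pert_name = "Chemotherapy",
  prev_tp   = "T1",
  stringsAsFactors = FALSE
)

# Each perturbation should refer to a time point that exists in the data.
pert_ok <- all(perturbations$prev_tp %in% timepoints)
print(pert_ok)
```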
E-scape takes as input a clonal phylogeny and clonal prevalences per clone per sample. At the time of submission, many methods have been proposed for obtaining these values, and accurate estimation of these quantities is the focus of ongoing research. We describe a method for estimating clonal phylogenies and clonal prevalences using PyClone (Roth et al., 2014; source code available at https://bitbucket.org/aroth85/pyclone/wiki/Home) and citup (Malikic et al., 2015; source code available at https://github.com/sfu-compbio/citup).
In brief, PyClone inputs are prepared by processing fastq files resulting from a targeted deep sequencing experiment. Using samtools mpileup (http://samtools.sourceforge.net/mpileup.shtml), the nucleotides matching and not matching the reference are counted for each targeted SNV. Copy number is also required for each SNV. We recommend inferring copy number from whole genome or whole exome sequencing of samples taken from the same anatomic location / timepoint as the samples to which targeted deep sequencing was applied. Copy number can be inferred using Titan (Ha et al., 2014; source code available at https://github.com/gavinha/TitanCNA). Sample-specific SNV information is compiled into a set of TSV files, one per sample. Each table includes the mutation id, reference and variant read counts, normal copy number, and major and minor tumour copy number (see the PyClone readme). PyClone is run on these files using the PyClone run_analysis_pipeline subcommand, and produces the tables/cluster.tsv file in the working directory.
Citup can be used to infer a clonal phylogeny and clone prevalences from the cellular prevalences produced by PyClone. The tables/cluster.tsv file contains per-sample, per-SNV-cluster estimates of cellular prevalence. The table is reshaped into a TSV file of cellular prevalences with rows as clusters and columns as samples, using the mean value of each cluster from tables/cluster.tsv as the table values. The iterative version of citup is run on the table of cellular frequencies, producing an HDF5 results file. Within the HDF5 results, the /results/optimal entry can be used to identify the id of the optimal tree solution. The clonal phylogeny as an adjacency list is then the /trees/{tree_solution}/adjacency_list entry, and the clone frequencies are the /trees/{tree_solution}/clone_freq entry in the HDF5 file. The adjacency list can be written as a TSV with the column names source, target to be input into E-scape, and the clone frequencies should be reshaped such that each row represents a clonal frequency in a specific sample for a specific clone, with the columns representing the time or space ID, the clone ID, and the clonal prevalence.
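The two reshaping steps described above can be sketched in base R on small in-memory stand-ins. Everything here is hypothetical: the toy values, the assumed column names of the cluster table (sample_id, cluster_id, mean), and the assumed orientation of the clone frequency matrix (one row per sample, one column per clone). Reading the real TSV and HDF5 files is left out.

```r
# Stand-in for tables/cluster.tsv in long format (column names assumed).
cluster_tbl <- data.frame(
  sample_id  = rep(c("T1", "T2"), each = 3),
  cluster_id = rep(0:2, times = 2),
  mean       = c(0.95, 0.30, 0.02,
                 0.97, 0.28, 0.51)
)

# Step 1: wide table for citup, with rows as clusters and columns as samples.
cellular_prev <- tapply(cluster_tbl$mean,
                        list(cluster_tbl$cluster_id, cluster_tbl$sample_id),
                        mean)

# Stand-ins for the citup results: an adjacency list and a clone frequency
# matrix (assumed orientation: rows are samples, columns are clones).
adjacency  <- matrix(c(0, 1,
                       0, 2), ncol = 2, byrow = TRUE)
clone_freq <- matrix(c(0.65, 0.30, 0.05,
                       0.20, 0.29, 0.51),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("T1", "T2"), c("0", "1", "2")))

# Step 2a: adjacency list as an edge table with source and target columns.
tree_edges <- data.frame(source = as.character(adjacency[, 1]),
                         target = as.character(adjacency[, 2]),
                         stringsAsFactors = FALSE)

# Step 2b: long table with one row per sample and clone.
# as.vector() reads the matrix column by column, so the sample labels cycle
# fastest and each clone label is repeated once per sample.
clonal_prev <- data.frame(
  timepoint   = rep(rownames(clone_freq), times = ncol(clone_freq)),
  clone_id    = rep(colnames(clone_freq), each = nrow(clone_freq)),
  clonal_prev = as.vector(clone_freq),
  stringsAsFactors = FALSE
)
```

The resulting tree_edges and clonal_prev data frames have the shape described above: an edge list, and one prevalence value per sample and clone.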
Interactive components:
To view the documentation for TimeScape, type the following command in R:
or:
TimeScape was developed at the Shah Lab for Computational Cancer Biology at the BC Cancer Research Centre.
References:
Ding, Li, et al. “Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing.” Nature 481.7382 (2012): 506-510.
Ha, Gavin, et al. “TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data.” Genome research 24.11 (2014): 1881-1893.
Malikic, Salem, et al. “Clonality inference in multiple tumor samples using phylogeny.” Bioinformatics 31.9 (2015): 1349-1356.
McPherson, Andrew, et al. “Divergent modes of clonal spread and intraperitoneal mixing in high-grade serous ovarian cancer.” Nature genetics (2016).
Roth, Andrew, et al. “PyClone: statistical inference of clonal population structure in cancer.” Nature methods 11.4 (2014): 396-398.