Large libraries of cancer cell lines (collections of cells that represent tumor types seen in cancer patients) can yield profound insights into tumors’ unique genetic features and their sensitivities to current and potential treatments. The data produced by these libraries is invaluable for developing new therapeutic options for patients.
Such is the case with the Cancer Cell Line Encyclopedia (CCLE), a collection of more than 900 cell lines assembled starting in 2008 by the Broad Cancer Program in collaboration with the Novartis Institutes for BioMedical Research. In 2012, the CCLE collaborators took a deep dive into the genomic features and drug sensitivities of these cells, cataloging gene expression, chromosomal copy number, and targeted gene sequencing data from all 947 lines, along with a number of drug-response profiles.
This information has transformed how cancer scientists characterize drug targets and measure drug activity. For instance, the CCLE collection was instrumental in pinpointing the gene PRMT5 as a promising target in certain brain, lung, pancreatic, ovarian, and blood cancers, and the gene WRN as a target in cancer cells lacking a key DNA proofreading mechanism.
A multi-center research team has now greatly augmented this cancer research resource by incorporating new cell lines and adding new data spanning the molecular spectrum from sequence to expression to protein. Writing in Nature, the team (led by core institute member William Sellers, institute member on leave Levi Garraway, and Broad alumni Mahmoud Ghandi and Franklin Huang) reports a major expansion of the CCLE dataset, which now includes:
- RNA sequencing data for 1,019 cell lines
- MicroRNA expression profiles for 954 lines
- Protein array data (899 lines)
- Genome-wide histone modifications (897)
- DNA methylation (843)
- Whole genome sequencing (329), and
- Whole exome sequencing (326)
The new dataset, which is freely available at https://depmap.org/portal/ccle/, also blends in CRISPR and RNA interference gene dependency data from the Broad’s Cancer Dependency Map (DepMap) team and drug sensitivity data from the Wellcome Trust Sanger Institute’s Genomics of Drug Sensitivity in Cancer project.
“We suspect that there are ways of looking beyond pairwise correlations like expression and protein levels to identify states of cancer that only reveal themselves when you see all the data in aggregate,” Sellers explained. “We hope that with all of the data available, the community will help draw those macro-level pictures, enabling improved drug discovery efforts broadly in industry and academia.”
In a companion paper in Nature Medicine, another team led by Sellers, Chemical Biology and Therapeutics Science Program graduate student Haoxin Li, and institute scientist and Metabolomics Platform senior director Clary Clish also opened a new view into cancer biology by probing the abundances of 225 metabolites in 928 of the CCLE lines, the first such systematic metabolomic survey of a cell line collection of this size and diversity.
“These data, along with statistical models, allow us to see otherwise-hidden connections between genetic and epigenetic errors in cancer cells and changes in those cells’ metabolic profiles,” Li said. “The data reveal metabolic dependencies that, for instance, point to opportunities to expand the use of the anti-cancer drug asparaginase, and to exploit levels of a metabolite called kynurenine as a prognostic biomarker for certain kinds of immunotherapy.”