Tracing the evolution of single-cell 3D genomes in Kras-driven cancers

Ethics Statement

All animal studies were approved by the Institutional Animal Care and Use Committee of Yale University. The tumor sizes in this study did not exceed the maximal tumor size (1 cm3) permitted by the institutional review board.

Mouse and cell lines

Mouse strains

Mice were housed in a specific-pathogen-free facility with controlled temperature, humidity and day/night cycles and maintained in a mixed background. MADM11-GT (013749), MADM11-TG (013751) and Pdx1-cre (014647) were obtained from the Jackson Laboratory (JAX). KrasLSL-G12D (JAX, 008179) and Trp53KO (JAX, 002101) mice were a gift from T. Jacks. KrasLSL-G12D/WT; MADM11-TG, Trp53KO/MADM11-TG, Trp53WT breeder mice were generated as previously described30. KrasLSL-G12D/WT; MADM11-TG, Trp53KO/MADM11-TG, Trp53WT were crossed with MADM11-GT mice (with or without Pdx1-cre) to generate K-MADM-Trp53 or Pdx1-cre; K-MADM-Trp53 experimental mice of both sexes for lung and pancreatic cancer analyses, respectively. Genotyping primers and protocols have been previously reported30. Large-scale chromatin tracing in lung included four WT mice (91-day male, 191-day female, 283-day male and 91-day female), four mice with adenoma (225-day male, 221-day male, 232-day female and 224-day male) and five mice with LUAD (232-day male, 487-day male, 494-day male, 221-day male and 551-day male). Large-scale chromatin tracing in pancreas included a WT mouse (59-day male) and a mouse with PanIN and PDAC tumors (46-day male). Fine-scale chromatin tracing in mouse lung included a WT mouse (191-day female), a mouse with adenoma (225-day male) and a mouse with LUAD (494-day male).

Lentivirus production and infection

pPGK-Cre lentiviral backbone was amplified and transfected into 293T cells using VSV-G (Addgene, 8454) and psPAX2 (Addgene, 12260) packaging vectors and TransIT-LT1 kit (MirusBio). After 48–72 h, virus was collected, filtered and ultracentrifuged. Lentivirus was resuspended in OptiMEM (Thermo Fisher Scientific) and administered intratracheally to K-MADM-Trp53 mice at 6–10 weeks of age to generate lung tumors.

Cell line

The KP LUAD cell line, 31671, was a gift from N. Joshi and was derived from an autochthonous KrasLSL-G12D/WT; Trp53flox/flox mouse administered with Adeno-Cre77. The K-MADM-Trp53 LUAD cell line (SA6082inf) was derived from collagenase-based dissociation of a large GFP+ tumor dissected from a K-MADM-Trp53 mouse. Kras and Trp53 genotypes were confirmed by PCR. Both cell lines were cultured in DMEM (Corning, 10-013-CV) containing 10% (vol/vol) FBS (Gibco, 26140-079) and 1% (vol/vol) Penicillin-Streptomycin (Thermo Fisher Scientific, 15140-122) at 37 °C with 5% CO2. Cells were passaged whenever they reached confluency. Both cell lines were seeded onto UV-sterilized coverslips (Bioptechs, 40-mm diameter; 1.5) and allowed to grow until 60–70% confluency before primary probe hybridization. All cell lines were tested for the presence of mycoplasma using PCR (ATCC, 30-1012K).

Probe design and synthesis

Detailed probe design and synthesis information are included in the Supplementary Note. Template probe sequences for chromatin tracing and RNA MERFISH are provided in Supplementary Table 1. Codebooks for chromatin tracing and RNA MERFISH, and genome coordinates and target region features in chromatin tracing are provided in Supplementary Table 2. PCR primers, reverse transcription primers for probe synthesis and adaptors, readout and blocking probes for sequential readout imaging are provided in Supplementary Table 3.

Mouse K-MADM-Trp53 lung tissue chromatin tracing experiments

Tissue preparation

K-MADM-Trp53 mice were dissected at sign of respiratory distress through CO2 asphyxiation. Lungs were perfused with ice-cold 1× PBS, incubated in 4% (vol/vol) paraformaldehyde (Electron Microscopy Sciences, 15710-S) overnight, and cryoprotected with 30% sucrose. Tissue was embedded in Tissue-Tek optimal cutting temperature compound (OCT) and frozen on dry ice before storage at −80 °C.

Tissue sectioning

Frozen mouse lung blocks were sectioned at a thickness of 10 μm at −20 °C on a cryostat. The following three consecutive tissue slices were sectioned at a time: one for hematoxylin and eosin (H&E) staining, one for whole-section fluorescence imaging and one for chromatin tracing. The sections were air-dried at room temperature for 1 h and then either used directly or stored at −20 °C for months.

H&E staining and whole-section fluorescence imaging

H&E staining was performed by Yale Pathology Tissue Services. For whole-section fluorescence imaging, the section was mounted in mounting medium with DAPI (VectorLabs, H-1800-2). Stitched fluorescence images in 353-nm, 488-nm and 592-nm illumination channels for DAPI, GFP and tdTomato, respectively, were collected with a Plan-Apochromat ×10/0.45 M27 objective on a Zeiss Axio Imager M2 microscope.

Co-immunofluorescence and fluorescent protein imaging of the chromatin tracing section

Frozen sections were first balanced at room temperature for 10 min, followed by hydration in Dulbecco’s phosphate-buffered saline (DPBS) for 5 min twice. Tissue sections were permeabilized with 0.5% (vol/vol) Triton X-100 in DPBS for 30 min at room temperature and washed twice in DPBS for 2 min. Tissue sections were then blocked for 30 min at room temperature in blocking buffer containing 1% (wt/vol) BSA (Sigma-Aldrich, A9647-100G), 22.52 mg ml−1 glycine (AmericanBio, AB00730), 10% (vol/vol) donkey serum (MilliporeSigma, S30–100 ml), 5% (vol/vol) goat serum (Invitrogen, 31873) and 0.1% (vol/vol) Tween-20 in DPBS. Tissue sections were incubated with rabbit anti-SPC antibody (MilliporeSigma, AB3786, 1:50) and rat anti-CD45 antibody (BioLegend, 103101, 1:100) in blocking buffer at 4 °C overnight. The samples were washed thrice in DPBS for 5 min and then incubated with DyLight 800-labeled donkey anti-rabbit secondary antibody (Invitrogen, SA5-10044, 1:1,000) and Alexa Fluor 647-labeled goat anti-rat secondary antibody (Invitrogen, A-21247, 1:1,000) in blocking buffer for 1 h at room temperature. The samples were washed thrice in DPBS for 5 min and then incubated with DAPI (Thermo Fisher Scientific, 62248) at 1:1,000 dilution in 2× saline-sodium citrate buffer (SSC) for 10 min. The samples were then mounted onto a Bioptechs FCS2 flow chamber and replenished with freshly prepared imaging buffer with an oxygen scavenging system (50 mM Tris–HCl pH 8.0, 10% wt/vol glucose, 2 mM Trolox (Sigma-Aldrich, 238813), 0.5 mg ml−1 glucose oxidase (Sigma-Aldrich, G2133), 40 μg ml−1 catalase (Sigma-Aldrich, C30)). The imaging buffer was covered with a layer of mineral oil (Sigma-Aldrich, 330779) in the reservoir tube to prevent continuous oxidation. We then selected multiple fields of view (FOVs) at predefined tumor regions based on tumor grades of the adjacent whole-section fluorescence images. Tumor grades were confirmed independently by two investigators (S.S.A. and M.D.M.). At each FOV, we sequentially took z-stack images of DAPI, GFP fluorescence, tdTomato fluorescence, anti-SPC immunofluorescence and anti-CD45 immunofluorescence with 405-nm, 488-nm, 560-nm, 647-nm and 750-nm laser illuminations. The z-stacks (7–10 μm total) had a 200-nm step size and a 0.4 s exposure time per step.

Primary probe hybridization

After imaging, tissue sections were de-assembled from the chamber, treated with 1 μg ml−1 proteinase K (Invitrogen, AM2546) in 2% (vol/vol) sodium dodecyl sulfate in 2× SSC at 37 °C for 10 min, and washed twice in DPBS for 2 min. Tissue sections were then treated with 0.1 M HCl for 5 min and washed twice in DPBS for 2 min. Tissue sections were digested with 0.1 mg ml−1 RNase A (Thermo Fisher Scientific, EN0531) in DPBS for 45 min at 37 °C and washed twice with 2× SSC for 2 min. Tissue sections were treated with prehybridization buffer containing 50% (vol/vol) formamide (Sigma-Aldrich, F7503) and 0.1% (vol/vol) Tween-20 in 2× SSC. Synthesized primary probes were dissolved in 25 μl hybridization buffer containing 50% (vol/vol) formamide and 20% (vol/vol) dextran sulfate (MilliporeSigma, S4030) in 2× SSC. The total probe concentration was 30–40 μM. Coverslips were immersed in hybridization buffer containing probes in 60 mm petri dishes, heat denatured in a 90 °C water bath for 3.5 min and subsequently incubated at 47 °C in a humid chamber for 36–48 h. Samples were washed with 50% (vol/vol) formamide in 2× SSC for 15 min twice followed by 2× SSC for 15 min all at room temperature. Washed samples were then incubated with 0.1-μm yellow–green beads (Invitrogen, F8803) resuspended in 2× SSC as fiducial markers for drift correction, washed with 2× SSC briefly, incubated with DAPI at 1:1,000 dilution in 2× SSC for 10 min for image registration and washed again with 2× SSC for 2 min twice.

Readout probe hybridization and imaging

After the primary probe hybridization, the sample was repeatedly hybridized with adaptors and readout probes, imaged and photobleached for 50 rounds. Each adaptor probe was 60 nucleotides (nt) in length, containing a 20-nt primary probe binding region and two consecutive 20-nt readout probe binding regions. Readout probes were 30-nt oligos conjugated with Alexa Fluor 647 (or Cy5) or ATTO 565 (or Cy3) fluorophores with 20-nt complementary to the readout probe binding regions of adaptors (Supplementary Table 3). To perform automatic buffer exchange, we used a Bioptechs FCS2 flow chamber and a computer-controlled, custom-built fluidics system11,72. First, we used DAPI images for registration to images taken before primary probe hybridization. We selected the same FOVs as those selected during co-immunofluorescence and fluorescent protein imaging. At each FOV, we took z-stack images with 488-nm and 405-nm laser illuminations for fiducial beads and DAPI, respectively. For each round of readout probe hybridization, we flowed 2 ml of readout probe hybridization buffer (20% vol/vol ethylene carbonate (Sigma-Aldrich, E26258) in 2× SSC) containing two adaptors each at 50 nM through the chamber and incubated for 15 min at room temperature. We flowed through 2 ml wash buffer (20% vol/vol ethylene carbonate in 2× SSC) for 2 min and 2 ml readout probe hybridization buffer containing two dye-labeled readout probes each at 75 nM, with 15 min incubation at room temperature. We further flowed through 2 ml wash buffer for 2 min and 2 ml imaging buffer. At each FOV, dye-labeled readout probes were imaged in 647-nm and 560-nm channels and fiducial beads were imaged in the 488-nm channel. The z-stacks (7–10 μm total) had a 200-nm step size and a 0.4 s exposure time per step. After imaging, we switched buffer to readout probe hybridization buffer containing 1 μM dye-free readout probes (blocking oligos) and photobleached with continuous simultaneous laser illuminations with 750-nm, 647-nm and 560-nm lasers for 40 s. We then flowed 5 ml 2× SSC to wash away unbound blocking oligos before the next hybridization round. A total of 50 hybridization rounds was performed.

Arrayed RNAi screen of cell proliferation

Lentiviral infection conditions were optimized in 96-well plates for initial cell seeding number, lentiviral dosage, antibiotic concentration and assay time. KP and K-MADM-Trp53 LUAD cells were seeded in complete cell culture media at a density of 250 cells per well in a 96-well plate, incubated for 24 h, and infected with lentiviral supernatant of Sigma Mission shRNAs targeting CPDs obtained from Yale Cancer Center Functional Genomics Core. A total of 200 μl of media was added to each well, comprising 50 μl of lentiviral supernatant and 150 μl of complete cell culture medium with 10 μg ml−1 polybrene (EMD Millipore, TR-1003-G). Each shRNA hairpin was tested in triplicate. All lentiviral infections were performed in duplicate to rule out the influence of lentiviral dosage on cell growth—one replicate with 6 μg/ml puromycin selection and the other replicate with no puromycin selection. After 24 h of lentiviral incubation, corresponding wells were treated with or without puromycin selection for 48 h. The cells were then incubated with complete cell culture media for 4–5 days. Total viable cell count was determined with CellTiter-Glo Luminescent Cell Viability Assay (Promega, G7572) using a Promega luminescence plate reader. For data analysis, we deducted luminescence readout values of blank wells from those of test wells. We then normalized luminescence readout values of each target shRNA hairpin to those of nontargeting control shRNA sequence (shNTC). All shRNA hairpin sequences are provided in Supplementary Table 6.

Rnf2 targeted degradation

Two million cells of the KP LUAD cell line 31671 were nucleofected (Amaxa) with 2 μg homology-directed repair (HDR) template and 2 μM RNP (IDT Alt-R CRISPR–Cas9 sgRNA) for Rnf2-dTAG. The HDR template contained a 555-nt 5′ homology arm homologous to the 5′ end of the Rnf2 stop codon, a 30-nt linker and two HA tags, an FKBPF36V and mScarlet insert (Supplementary Table 6; amplified from a gift plasmid from D. Schatz at Yale University) and a 999-nt 3′ homology arm homologous to the 3′ end of the Rnf2 stop codon. After transfection, the cells were seeded into 6-well plates and cultured for 24–48 h before FACS sorting. mScarlet+ cells were sorted into 96-well plates. Single colonies were manually picked, genotyped with PCR and validated with sequencing. The spacer sequence for Rnf2 sgRNA is GACTTTATTATGCACCCACCA. The dTAG-13 ligand and negative control ligand were added to the cells at a final concentration of 500 nM. The cells were incubated at 37 °C with 5% CO2.

Image analysis

All image analyses were performed with MATLAB R2019b.

The genome-wide chromatin tracing image analysis pipeline consists of the following steps: color correction, drift correction, nucleus segmentation, foci fitting, decoding and trace linking. Detailed methods are listed in the Supplementary Note.

Data analysis and statistics

Decompaction

To compare levels of chromatin compaction, we defined a decompaction score as the mean interloci distance between a locus pair on a chromosome of a cell state.

Heterogeneity

To compare chromatin conformation heterogeneity, we defined a heterogeneity score as the coefficient of variation (COV) of interloci distances between a locus pair on a chromosome of a cell state (variation among different chromosome copies). Similarly, the heterogeneity scores for cell pairs were defined as the root mean square of interloci distance difference normalized by the mean.

Demixing

To compare levels of chromatin intermixing, we defined a demixing score as the s.d. of all normalized mean interloci distances (mean interloci distance normalized to the mean of all mean interloci distances on the chromosome) on a chromosome of a cell state. The normalization accounted for and excluded the influence of chromatin compaction on demixing measurements.

A and B compartment polarization analysis

To identify A and B compartments, we adapted a previous algorithm to determine A and B compartment scores11,24. In brief, we generated a mean interloci distance matrix for each chromosome, normalized the mean distance to the expected distance calculated by power-law fitting of spatial versus genomic distances and calculated the Pearson correlation coefficient among each column pair of the normalized matrix. Finally, we applied a principal component analysis (PCA) of the Pearson correlation matrix and used coefficients of the first principal component as compartment scores. We calculated correlations between compartment scores and an averaged profile of histone H3 lysine 4 monomethylation (H3K4me1), H3K4me3, DNase I hypersensitivity site and gene densities78,79, and flipped signs of compartment scores if necessary, so that A and B compartment genomic regions had positive and negative scores, respectively. We then used a previously described polarization index metric to quantify the polarization of A and B compartments24. For chromosome 6 in Fig. 3d, we used the first 30 target genomic regions to calculate the polarization index to match target region numbers in different chromosomes. We downloaded H3K4me1 and H3K4me3 chromatin immunoprecipitation sequencing peaks and DNA DNase I hypersensitivity sites from The Encyclopedia of DNA Elements (ENCODE)—ENCFF536DWZ, ENCFF508WEP and ENCFF268DLZ. The gene density profile was downloaded from the UCSC table browser.

Radial score analysis

To calculate the radial score of each genomic region in each single cell, we measured the mean spatial distance among each region to the centroid of all target regions and normalized the distance to the average spatial distances from all genomic regions to the centroid.

The ‘Trace2State’ pipeline for single-cell chromatin conformation-based cell state visualization and classification

First, we constructed an input matrix where each row represented a single cell and each column represented the scA/B score of each genomic region35. The scA/B score was calculated as the mean A–B compartment score of all its spatially adjacent genomic regions within a 1,200-nm 3D neighborhood. Only cells with at least 10 traces were analyzed. Missing values were replaced with 0s. The matrix was used as input in a PCA, and the first 50 principal components were used for downstream visualizations. We scaled the 50 principal components with the PCA().fit-transform function in Python 3.11.12, and visualized the single-cell data with t-distributed stochastic neighbor embedding, uniform manifold approximation and projection and pairwise controlled manifold approximation37,38,39. To classify different cell states with supervised machine learning, we supplied the Classification Learner App of MATLAB with the same input scA/B score matrix and applied fivefold cross-validation to prevent overfitting. All machine learning models were trained, and the model with the highest prediction accuracy (medium Gaussian support vector machine, regularization included with box constraint level = 1) was retained to plot the confusion matrix and receiver operating characteristic curves.

The ‘Trace2Biomarker’ pipeline for CPD gene identification

To identify CPD genes, we first identified marker genomic regions with substantially changed scA/B scores during LUAD progression. Specifically, we extracted cells with more than 325 genomic regions. We then performed rank normalization (using the tiedrank function in MATLAB) of scA/B scores in each cell. We performed Wilcoxon rank-sum tests of rank-normalized scA/B scores of each genomic region comparing AdenomaG and LUAD. Genomic regions with P < 0.1 were defined as marker genomic regions. We then defined CPD genes as genes located in marker genomic regions with increased scA/B scores and higher expression (mean expression count > 10, fold change > 3, false discovery rate (FDR) < 0.05) in LUAD versus AdenomaG cells. We defined control genes as those with increased expression (mean expression count > 10, fold change > 3, FDR < 0.05) in regions with unchanged scA/B scores (P ≥ 0.1). Genes with increased expression only are defined as those with increased expression (mean expression count > 10, fold change > 3, FDR < 0.05) in LUAD versus AdenomaG cells, irrespective of scA/B score changes. Equal gene numbers (top 21 upregulated by fold change) were used for each gene list above and interrogated in TCGA patient survival analyses.

The ‘Trace2Regulator’ pipeline for 3D genome regulator identification

To identify chromatin regulators that bind to genes in marker loci, we first identified genes with increased expression (FDR < 0.1, fold change > 1) in genomic regions with increased scA/B scores (P < 0.1) from AdenomaG to LUAD cells. We then input the gene list into BART55,56 to generate chromatin regulators predicted to bind to the input genes, using default parameters. We further knocked down candidate chromatin regulators with shRNAs in the KP LUAD cell line and performed DNA MERFISH. We further compared 3D genome alterations upon the candidate chromatin regulator knockdown with those during the AdenomaG-to-LUAD progression. Candidate regulators showing concordant 3D genome alterations were identified as 3D genome regulators.

CUT&RUN experiment and analysis

CUT&RUN protocol

CUT&RUN was performed using the Epicypher CUTANA ChIC/CUT&RUN Kit (14-1048) according to the manufacturer’s specifications with the following conditions and modifications: for binding to the Concanavalin beads, 500,000 cells per sample were prepared, counted twice using a hemacytometer, and averaged. Nuclei were prepared according to the Epicypher Cut and Run Manual Appendix. Nuclei were incubated with activated beads. For Escherichia coli spike-in DNA, 0.1 ng was added to each sample. The following antibodies (0.5 μg of antibody per reaction) were used: IgG control (CUTANA Kit Rabbit IgG CUT&RUN Negative Control Antibody), Rnf2 antibody (Cell Signaling Technology, 5694), H3K4me3 antibody (EpiCypher, 13-0041), H3K27me3 antibody (Cell Signaling Technology, 9733), H2AK119ub antibody (Cell Signaling Technology, 8240), BMI1 antibody (Active Motif, 39993) and RNA Pol II p-ser5 (Abcam, ab193467).

Library preparation

Libraries were prepared using Epicypher CUTANA CUT&RUN Library Prep Kit (14-1001) according to the manufacturer’s specifications with the following modification: SPRI-select beads were used after library preparation to perform 200–700 bp size selection to enrich for DNA fragments from CUT&RUN and remove adaptor dimers or high molecular weight DNA. Library quality was analyzed using Agilent Tapestation D1000 High Sensitivity Tapes (5067–5584). Sequencing was performed with 10 million reads per sample, 150 bp paired-end reads, on an Illumina NovaSeq 6000.

CUT&RUN data analysis

CUT&RUN reads were analyzed for quality control using FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and trimmed using Trimmomatic. Reads were then aligned to mm9 and K-12 E. coli genome U00096.3 by Bowtie2 2.3.4. Picard 2.27.4 was used to downsample the reads so that the read depths of E. coli sequences from different samples matched each other. SAMTools 1.11 was then used to convert to BAM format, index, isolate uniquely mapped paired reads and remove duplicates. MACS2 2.2.7.1 was used to call sample narrow peaks, using IgG as an input. Read counts across genomic intervals and peak visualization were performed using deepTools 3.3 (ref. 80), and .bw files were visualized using Integrative Genomics Viewer.

Other detailed methods are included in the Supplementary Note.

Statistics and reproducibility

Statistical tests and sample sizes are reported in the corresponding sections. The number of replicates was determined to be the maximum number that could be collected with the available equipment and personnel during the study period. Our study analyzed AT2 cells, AT2-derived cancer cells and immune cells expressing the pan-immune marker CD45. Other cells are excluded from the analyses. Random subsampling was performed to show that as few as 100 cells per cell state were sufficient to recapitulate the 3D genome organization changes for most structural features in lung. The investigators were not blinded to allocation during experiments and outcome assessment. We visually inspected data distributions for each statistical test, but normality and equal variances were not formally tested. Wilcoxon tests were used to avoid normality assumptions.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.


Source link

Leave a Reply

Your email address will not be published. Required fields are marked *