Mammalian cell culture
All mammalian cell cultures were maintained in a 37 °C incubator at 5% CO2. HEK293T human embryonic kidney (Thermo Fisher), A549 human lung cancer (a gift from S. Garg) and HeLa human cervical cancer (a gift from M. Stewart) cells were maintained in Dulbecco’s modified Eagle’s medium with high glucose, sodium pyruvate and GlutaMAX (DMEM; 10569, Thermo Fisher) supplemented with 10% fetal bovine serum (FBS; 10438, Thermo Fisher) and 100 U ml−1 penicillin–streptomycin (15140, Thermo Fisher). V6.5 mouse ESCs (a gift from R. Jaenisch) were maintained in DMEM with high glucose and sodium pyruvate (11995, Thermo Fisher) supplemented with 15% FBS (SH30070, GE Healthcare), 1 mM HEPES (15630, Thermo Fisher), 2 mM L-glutamine (25030, Thermo Fisher), 1X MEM non-essential amino acids (NEAAs; 11140, Thermo Fisher), 0.0008% 2-mercaptoethanol (M6250, Millipore-Sigma), 1,000 U ml−1 leukemia inhibitory factor (LIF; ESG1107, Millipore-Sigma) and 100 U ml−1 penicillin–streptomycin (30-002-Cl, Corning). V6.5 cells were grown on plates coated with 0.2% gelatin (G1890, Millipore-Sigma). HEK293T cells without or with a GFP transgene insertion were used as previously described21. ESCs with a GFP transgene insertion were created by infection of V6.5 cells with pLX TRC209 lentivirus and isolation of single-cell clones expressing GFP. All cell lines were tested for mycoplasma contamination and confirmed mycoplasma free. Cell lines were not authenticated. No commonly misidentified cell lines were used.
Mutagenesis and cloning
PE2 and PEmax prime editors were obtained from pCMV-PE2 and pCMV-PEmax, and a cloning backbone for pegRNA expression was obtained from pU6-pegRNA-GG-acceptor, which were gifts from D. R. Liu. PE7 prime editor was obtained from Lenti-PE7-P2A-Puro, which was a gift from F. J. Sanchez-Rivera. PE was created by restriction cloning of Cas9n(H840A) from pCMV-PE2 into pCMV-PEmax using NotI and SacI digestion. PE mutagenesis was performed using PCR-driven splicing by overlap extension using primers listed in Supplementary Table 1. In brief, one fragment was amplified by PCR from PE using the pe-FWD or pe-mid-FWD and mutant-BOT primers, and a second fragment was amplified using the mutant-TOP and pe-mid-REV or pe-rt-REV primers for each mutant. Each pair of fragments was then spliced by overlap extension PCR using the pe-FWD and pe-mid-REV or pe-mid-FWD and pe-rt-REV primers to create a PE gene fragment with a single-residue mutation. These PE gene fragments were then each cloned back into PE using unique NotI, SacI and BamHI restriction sites to replace the PE sequence with the mutant sequence. Additional mutants (double, triple and quadruple mutants) were made iteratively starting from these single-mutant plasmids. PE7 and vPE were created by restriction cloning of a fragment containing La amplified from Lenti-PE7-P2A-Puro into pCMV-PEmax, for PE7, or into xPE, for vPE, using BamHI and BshTI digestion. The pegRNA oligos, listed in Supplementary Table 2, were cloned into pU6-pegRNA-GG-acceptor by Golden Gate cloning with Eco31I digestion.
Cas9 variants were generated as previously described21. A custom gRNA cloning backbone vector was created by PCR amplification from pX330 using the gRNA-scaffold-NheI-FWD and gRNA-scaffold-EcoRI-REV primers and restriction cloning into pUC19 (Thermo Fisher) using NheI and EcoRI digestion. The nicking gRNA spacer sequence oligos, listed in Supplementary Table 2, were phosphorylated with T4 polynucleotide kinase (NEB) and cloned into the gRNA cloning backbone by Golden Gate cloning with BpiI digestion.
Primers were synthesized by IDT. Restriction enzymes were obtained from Thermo Fisher. T7 DNA ligase was obtained from NEB. Plasmids were transformed into Stbl3 chemically competent Escherichia coli (Thermo Fisher). Sequences for the PEmax, PE, pPE, xPE, PE7 and vPE vectors are presented in the Sequences section in the Supplementary Information.
Structure analysis
Crystal structures of Cas9 with substrate DNA bound (5F9R) or without substrate DNA bound (4ZT0) were analysed using PyMOL (Schrödinger).
Cell transfection
Cells were seeded in the maintenance medium (without penicillin–streptomycin for HEK293T, A549 and HeLa cells) into 48-well plates at 50,000 cells per well. Transfections with dual gRNAs were carried out 24 h after seeding using 200 ng Cas9 expression vector and 72 ng of each gRNA expression vector formulated with 0.86 µl Lipofectamine 2000 (Thermo Fisher) at a total volume of 34.4 µl in OptiMEM I (Thermo Fisher) per well. Transfections with prime-editing vectors were carried out 24 h after seeding using 238 ng PE expression vector, 57 ng pegRNA expression vector and 72 ng nicking gRNA expression vector (for pegRNA + ngRNA editing) formulated with 0.74–0.92 µl (equal volume per DNA) Lipofectamine 2000 at a total volume of 29.5–36.7 µl (equal DNA concentration) in OptiMEM I per well. For sequencing assays, genomic DNA was extracted 72 h after transfection using QuickExtract (Epicentre). For flow cytometry assays, cells were transferred to 10-cm dishes 72 h after transfection and harvested 9 days after transfection in PBS with 5% FBS (Thermo Fisher).
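The reagent volumes quoted above are consistent with a fixed Lipofectamine-to-DNA ratio and a fixed DNA concentration. The following is a minimal sketch of that arithmetic. The two constants are inferred from the dual-gRNA condition (0.86 µl and 34.4 µl for 344 ng of DNA) and are not stated explicitly in the text; the function name is hypothetical.

```python
# Assumed ratios, inferred from the dual-gRNA condition in the text:
# 0.86 µl Lipofectamine 2000 per 344 ng DNA, and 344 ng DNA in 34.4 µl.
LIPO_UL_PER_NG = 0.0025
DNA_NG_PER_UL = 10.0


def transfection_mix(dna_ng_by_vector):
    """Return per-well reagent amounts for a dict of vector name -> ng DNA."""
    total_ng = sum(dna_ng_by_vector.values())
    return {
        "total_dna_ng": total_ng,
        "lipofectamine_ul": total_ng * LIPO_UL_PER_NG,
        "total_volume_ul": total_ng / DNA_NG_PER_UL,
    }


# DNA masses per well as listed in the text.
dual_grna = transfection_mix({"Cas9": 200, "gRNA_1": 72, "gRNA_2": 72})
peg_only = transfection_mix({"PE": 238, "pegRNA": 57})
peg_ngrna = transfection_mix({"PE": 238, "pegRNA": 57, "ngRNA": 72})

for label, mix in [("dual gRNA", dual_grna),
                   ("pegRNA only", peg_only),
                   ("pegRNA + ngRNA", peg_ngrna)]:
    print(label, round(mix["lipofectamine_ul"], 2), round(mix["total_volume_ul"], 1))
```

Under these assumed ratios, the pegRNA-only and pegRNA + ngRNA conditions reproduce the two ends of the 0.74–0.92 µl and 29.5–36.7 µl ranges given in the text.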
High-throughput sequencing
The targeted loci were amplified from extracted genomic DNA by PCR using Herculase II polymerase (Agilent). The PCR primers included Illumina sequencing handles as well as replicate-specific barcodes. These PCR products were then tagged with sample-specific barcodes and sequenced on an Illumina MiSeq. Primers, listed in Supplementary Table 3, were synthesized by IDT. A sequencing file listing and sequencing depth data are available in Supplementary Table 4.
Flow cytometry
Flow cytometry analysis was performed on an LSR Fortessa analyser and data were collected using FACSDiva (BD Biosciences). Cells were first gated comparing side scatter (SSC) and forward scatter (FSC) parameters, starting with SSC-A and FSC-A, then SSC-H and SSC-W, then FSC-H and FSC-W parameters to select for single cells (Supplementary Fig. 1). To assess editing frequencies, cells were gated for GFP (488-nm laser excitation, 530/30-nm filter detection) and BFP (405-nm laser excitation, 450/50-nm filter detection). Flow cytometry data were analysed using FlowJo (FlowJo). Intended edit rates were quantified as the fraction of cells gated as BFP+ for a prime-edited sample minus the fraction gated as BFP+ for the unedited control sample. Indel rates were quantified as the fraction of cells gated as GFP− and BFP− for a prime-edited sample minus the fraction gated as GFP− and BFP− for the unedited control sample.
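The background subtraction described above can be sketched as follows. This is an illustration only: the function names and the cell counts are hypothetical, and the gating itself is assumed to have been done upstream (for example in FlowJo), so the inputs are simply counts of single cells falling in each gate.

```python
def gated_fraction(n_gated, n_single_cells):
    """Fraction of single cells that fall inside a gate."""
    if n_single_cells <= 0:
        raise ValueError("n_single_cells must be positive")
    return n_gated / n_single_cells


def background_subtracted_rate(sample_fraction, control_fraction):
    """Gated fraction in the edited sample minus that in the unedited control."""
    return sample_fraction - control_fraction


# Hypothetical counts for one prime-edited sample and its unedited control.
edited = {"single_cells": 10_000, "bfp_pos": 2_500, "gfp_neg_bfp_neg": 300}
control = {"single_cells": 10_000, "bfp_pos": 20, "gfp_neg_bfp_neg": 50}

# Intended edit rate: BFP+ fraction, control-subtracted.
intended_edit_rate = background_subtracted_rate(
    gated_fraction(edited["bfp_pos"], edited["single_cells"]),
    gated_fraction(control["bfp_pos"], control["single_cells"]),
)

# Indel rate: GFP-/BFP- fraction, control-subtracted.
indel_rate = background_subtracted_rate(
    gated_fraction(edited["gfp_neg_bfp_neg"], edited["single_cells"]),
    gated_fraction(control["gfp_neg_bfp_neg"], control["single_cells"]),
)
```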
Genome-editing analysis
To measure editing outcomes, the high-throughput sequencing data were analysed using CRISPResso2 (ref. 36). Data for prime-editing experiments were processed using the ‘prime editing’ mode in CRISPResso2 by including sequence values for the parameters ‘prime_editing_pegRNA_spacer_seq’, ‘prime_editing_pegRNA_extension_seq’, ‘prime_editing_pegRNA_scaffold_seq’ and ‘prime_editing_nicking_guide_seq’ (for pegRNA + ngRNA modes). Editing window parameters ‘prime_editing_pegRNA_extension_quantification_window_size’ and ‘w’ were set to 5. The ‘ignore_substitutions’ option was used to account for small sequence variations that occur due to PCR and sequencing errors. Intended edit rates were quantified as the fraction of reads marked as prime edited out of total sequencing reads. Indel rates were quantified as the fraction of reads marked as indels out of total sequencing reads. Frequencies of specific indel sizes were quantified as the fraction of reads containing these sizes out of all indel reads or out of total sequencing reads, as noted, and were averaged over three independent replicates. Mean indel sizes were calculated as the mean of the absolute values of indel sizes weighted by their indel fractions. Depletion of specific indel sizes was quantified as the fractional reduction in the frequency of that indel size, comparing different editors. Plots of insertion and deletion positions were produced from data generated in CRISPResso2 and averaged over three independent replicates.
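As an illustration of the mean indel size and depletion calculations described above, the sketch below assumes that indel sizes are signed (negative for deletions, positive for insertions), that the weighted mean is normalized by the sum of the indel fractions, and that depletion is measured relative to a reference editor. The function names and example values are hypothetical.

```python
def mean_indel_size(indel_fractions):
    """Weighted mean of absolute indel sizes.

    indel_fractions maps a signed indel size (negative = deletion,
    positive = insertion) to the fraction of reads with that size.
    """
    total = sum(indel_fractions.values())
    if total <= 0:
        raise ValueError("indel fractions must sum to a positive value")
    weighted = sum(abs(size) * frac for size, frac in indel_fractions.items())
    return weighted / total


def fractional_depletion(freq_reference, freq_variant):
    """Fractional reduction in an indel-size frequency, variant vs reference."""
    if freq_reference <= 0:
        raise ValueError("reference frequency must be positive")
    return 1 - freq_variant / freq_reference


# Hypothetical indel size distribution for one sample.
example = {-2: 0.02, 1: 0.01, -10: 0.01}
example_mean = mean_indel_size(example)

# Hypothetical frequencies of one indel size for two editors.
example_depletion = fractional_depletion(0.04, 0.01)
```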
Plots of editing outcome alleles were processed using the standard mode in CRISPResso2. The editing window parameter ‘w’ was set to 30. The plot size parameter ‘plot_window_size’ was set to 30, the minimum allele frequency parameter ‘min_frequency_alleles_around_cut_to_plot’ was set to 0.05 and the allele number parameter ‘max_rows_alleles_around_cut_to_plot’ was set to 30. Accordingly, at most 30 alleles were displayed, and only those meeting the minimum frequency threshold.
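For orientation, the allele-plot settings above could be passed to the CRISPResso2 command line roughly as sketched here. The helper name, file path, sequences and sample name are placeholders and are not taken from the study; only the parameter values come from the text. The command is built but not executed.

```python
def allele_plot_command(fastq_r1, amplicon_seq, guide_seq, name):
    """Build a CRISPResso2 (standard mode) argument list for allele plots."""
    return [
        "CRISPResso",
        "--fastq_r1", fastq_r1,
        "--amplicon_seq", amplicon_seq,
        "--guide_seq", guide_seq,
        "--name", name,
        "-w", "30",                                            # editing window
        "--plot_window_size", "30",                            # plot size
        "--min_frequency_alleles_around_cut_to_plot", "0.05",  # minimum allele frequency
        "--max_rows_alleles_around_cut_to_plot", "30",         # maximum number of alleles
    ]


# Placeholder inputs, for illustration only.
cmd = allele_plot_command(
    fastq_r1="sample_R1.fastq.gz",
    amplicon_seq="ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT",
    guide_seq="ACGTACGTACGTACGTACGT",
    name="example_sample",
)
print(" ".join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` on a machine where CRISPResso2 is installed.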
DNA nick shift frequency and end degradation analysis
To quantitate nick shift frequencies and nicked end degradation for Cas9 variants in cells, editing outcomes for dual-gRNA cutting of genomic DNA were analysed21,22. In these datasets, HEK293T cells were edited with pairs of gRNAs separately targeting either the EMX1 or CXCR4 locus. The CXCR4 dataset was generated from new experiments, whereas the EMX1 dataset was reanalysed from previously published data21. The gRNA pairs were complementary to the same strand at each locus and were expected to make cuts 84 bp apart, resulting in junctions. The loci were amplified and sequenced by high-throughput sequencing. The high-throughput sequencing data were analysed using CRISPResso2 with the expected junction as a reference sequence. To assess DNA nick position, sequencing reads aligned to the junction reference were analysed for retained sequences perfectly matching the sequences between the expected gRNA cut sites. The lengths of these retained sequences in the junctions were used to infer shifts in nick positioning leading to each read. The most frequent cut position with a frequency of 5% of reads or greater was presented as the dominant shifted nick position. The nick shift frequency was quantified as the fraction of reads containing a retained sequence indicating shifted nicks out of the total sequencing reads containing either a retained sequence or a perfect junction sequence. To assess DNA nicked end degradation, sequencing reads aligned to the junction reference were analysed for deletion sequences. The deletions to the PAM side of the junction were inferred to correspond to lengths of non-target strand 5′ end degradation, whereas the deletions to the non-PAM side of the junction were inferred to correspond to lengths of non-target strand 3′ end degradation. The degradation lengths were quantified as the median length of these degradation products. 
The degradation frequencies were quantified as the fraction of reads containing deletions on the PAM or non-PAM sides of the junction out of all reads containing either a deletion sequence or a perfect junction sequence, and were presented as the ratio between degradation on the PAM side versus non-PAM side.
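The nick shift quantification described above can be sketched as follows. This is an illustration under stated assumptions: each junction read is taken to have already been classified upstream as either a perfect junction or a junction with a retained sequence of known length, and the 5% threshold for the dominant shifted position is assumed to use the same denominator as the nick shift frequency. The function name and read counts are hypothetical.

```python
from collections import Counter


def nick_shift_summary(retained_lengths, n_perfect_junction, min_fraction=0.05):
    """Return (nick shift frequency, dominant retained length or None).

    retained_lengths: one entry per read carrying a retained sequence,
    giving the length (bp) of that retained sequence.
    n_perfect_junction: number of reads with a perfect junction sequence.
    """
    n_retained = len(retained_lengths)
    denom = n_retained + n_perfect_junction
    if denom == 0:
        raise ValueError("no informative reads")
    shift_frequency = n_retained / denom

    dominant = None
    if retained_lengths:
        # Most frequent retained length, reported only if it passes the threshold.
        length, count = Counter(retained_lengths).most_common(1)[0]
        if count / denom >= min_fraction:
            dominant = length
    return shift_frequency, dominant


# Hypothetical read counts: 30 reads retain 1 bp, 5 retain 2 bp, 65 are perfect.
freq, dominant = nick_shift_summary([1] * 30 + [2] * 5, n_perfect_junction=65)
```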
To estimate nicked end degradation for prime editor variants, editing outcomes for a dual-nick genomic DNA degradation sensor were analysed. In these datasets, HEK293T cells were edited with a paired pegRNA and ngRNA targeting the AAVS1 locus. The gRNA pair was complementary to opposite strands and was expected to make nicks 43 bp apart, resulting in annealing of homologous sequences flanking the nick position. The frequency of these flap homology deletions could then be used to quantify the degradation of the nicked ends for each prime editor variant, as degradation would reduce deletion frequency. The pegRNA also generated a larger edit 8 bp from the nick that could be used as an activity marker, such that the effects of a mutation on overall activity of the prime editor variant could be determined in the same assay. The loci were amplified and sequenced by high-throughput sequencing. The high-throughput sequencing data were analysed using CRISPResso2 as described above for prime-editing data. Flap degradation was quantified as the number of reads containing the activity marker edit divided by the number of reads containing the flap homology deletions, normalized to this ratio for the standard prime editor PE.
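The normalized ratio described above can be written out as a short calculation. The function names and read counts below are hypothetical and serve only to show the order of operations: a marker-to-deletion ratio is computed per editor, then divided by the same ratio for the reference editor PE.

```python
def marker_to_deletion_ratio(marker_reads, flap_deletion_reads):
    """Reads with the activity marker edit per read with a flap homology deletion."""
    if flap_deletion_reads <= 0:
        raise ValueError("flap_deletion_reads must be positive")
    return marker_reads / flap_deletion_reads


def flap_degradation(variant_counts, reference_counts):
    """Variant ratio normalized to the reference editor ratio.

    Each argument is a (marker_reads, flap_deletion_reads) tuple.
    """
    return (marker_to_deletion_ratio(*variant_counts)
            / marker_to_deletion_ratio(*reference_counts))


# Hypothetical counts: similar activity, fewer flap homology deletions in the variant.
score = flap_degradation(variant_counts=(4000, 500), reference_counts=(4000, 2000))
```

A value above 1 in this sketch means fewer flap homology deletions per marker edit than the reference editor, which the text interprets as greater degradation of the nicked ends.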
To estimate nick shift frequencies for prime editor variants, editing outcomes comparing edits at a positive and negative position relative to a nick were analysed. In these datasets, HEK293T cells were edited with pegRNAs targeting the TGFB1 locus. The pegRNAs targeted the same sequence, but installed an edit either at the +6 or −1 position. The frequency of the −1 edit could then be used to quantify the nick shift frequency, as the nick would have to shift from the canonical position between −1 and +1 to enable a −1 edit. The loci were amplified and sequenced by high-throughput sequencing. The high-throughput sequencing data were analysed using CRISPResso2 as described above for prime-editing data. Negative:positive edit ratios were quantified as the intended edit rate for the −1 edit divided by the total editing rate for the +6 edit, normalized to this ratio for the standard prime editor PE.
Off-target editing
To measure off-target editing, two to three of the most commonly cleaved Cas9 off-target sites for given gRNAs were analysed as previously described2,13. Analysis was performed for off-target editing by pegRNAs targeting the EMX1, FANCF and HEK3 loci at known off-target sites (OT2 and OT3 for EMX1; OT1, OT3 and OT4 for FANCF; and OT1, OT2 and OT4 for HEK3). High-throughput sequencing was performed on these amplified sequences and data were processed using CRISPResso2. The quality filtering parameter ‘q’ was set to 30, whereas the editing window centre parameter ‘wc’ was set to 0 and the editing window size parameter ‘w’ was set to 3. The ‘discard_indel_reads’ option was used to remove reads containing deletions or insertions from the analysis. For each off-target locus, the sequence on the PAM side of the off-target nick was compared with the sequence encoded by the pegRNA template on the PAM side of the target nick to identify the first nucleotide on the off-target locus where these sequences differ. Sequencing reads at the off-target locus that matched the pegRNA template sequence from the nick to this first differing nucleotide position were considered off-target edit reads. Off-target-editing efficiencies were quantified as the fraction of reads marked as off-target edits out of total sequencing reads.
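The read classification rule described above can be illustrated with a minimal sketch. It assumes all sequences are given as uppercase strings that start at the nick and read towards the PAM side, and that a match must include the first differing nucleotide. The function names and the short sequences are hypothetical.

```python
def first_differing_index(offtarget_pam_side, template_pam_side):
    """Index of the first position where the off-target locus and the
    pegRNA-templated sequence differ, or None if they never differ."""
    for i, (a, b) in enumerate(zip(offtarget_pam_side, template_pam_side)):
        if a != b:
            return i
    return None


def is_offtarget_edit(read_pam_side, offtarget_pam_side, template_pam_side):
    """True if the read matches the template from the nick up to and
    including the first nucleotide that distinguishes it from the locus."""
    idx = first_differing_index(offtarget_pam_side, template_pam_side)
    if idx is None:
        return False  # template and locus are indistinguishable here
    return read_pam_side[: idx + 1] == template_pam_side[: idx + 1]


# Hypothetical sequences on the PAM side of the nick.
locus = "ACGTAC"
template = "ACCTAC"
reads = ["ACCTAC", "ACGTAC", "ACCTAC", "ACGTAC", "ACGTAC"]

n_edit = sum(is_offtarget_edit(r, locus, template) for r in reads)
offtarget_rate = n_edit / len(reads)
```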
Edit notation
Edits were denoted based on the position where the edit begins relative to expected gRNA nick position for wild-type Cas9, denoting position +1 as 3 bp upstream of the first PAM position. Substitution edits were noted using a ‘>’ mark, deletions were noted by a ‘del’ mark, and insertions were noted by an ‘ins’ mark. The base identities of the strand containing the gRNA spacer sequence were used in all cases.
Statistical analysis
Specific statistical comparisons are indicated in the figure legends. Error bars indicate the standard error for independent replicates as noted. Where noted, significance was assessed using unpaired, two-tailed Student’s t-tests. Correlations were determined by Pearson coefficients. Figures and analysis were produced using GraphPad Prism and Microsoft Excel software.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.