Genome-scale CRISPR screens identify PTGES3 as a direct modulator of androgen receptor function in advanced prostate cancer

Ethical statement

All experiments detailed in this paper were performed in compliance with the Institutional Review Board and the Institutional Animal Care and Use Committee (IACUC) at the University of California, San Francisco and Fred Hutchinson Cancer Center. All animal studies were conducted in compliance with UCSF IACUC guidelines with protocol number AN182067.

Cell lines and reagents

Cell lines LNCaP, C42B, 22RV1, VCaP, DU145, PC3 and 293T were purchased from the American Type Culture Collection (ATCC) and cultured per ATCC protocols in RPMI 1640 (Gibco) or DMEM (ATCC) with 10% FBS (Gibco) in a humidified 5% CO2 incubator at 37 °C. MR49F cells were a gift from A. Zoubeidi (Vancouver Prostate Center) and maintained in RPMI 1640 with 10% FBS containing 10 μM Enzalutamide. Cell line short tandem repeat authentications were conducted (University of California (UC) Berkeley DNA Sequencing facility). All chemicals, unless otherwise stated, were purchased from Sigma Aldrich, Enamine, Combi-blocks or Astatech. Enzalutamide and Apalutamide were purchased from Selleckchem. ML-792 was purchased from MedKoo. AR PROTAC degrader ARD-61 was a gift from S. Wang (University of Michigan).

Endogenous AR reporter cell C42BmNG2-AR

Synthetic sgRNAs (Supplementary Table 1) and Cas9 2NLS nuclease were purchased from Synthego. For the knock-in of mNG2_11, 200-nt homology-directed recombination templates were ordered in single-stranded DNA form as ultramer oligos (Integrated DNA Technologies). C42B cells were treated with 50 ng ml⁻¹ nocodazole (Sigma) for 14 h before electroporation. Cas9–sgRNA RNP complexes were assembled with 100 pmol Cas9 protein and 130 pmol sgRNA before electroporation and combined with 375 pmol homology-directed recombination template and 1 µg pCE-mp53DD (Addgene, cat. no. 41856) in a final volume of 25 µl. Electroporation was carried out following the protocol of the SF Cell Line 4D-Nucleofector X Kit (Lonza, cat. no. V4XC-2024). Nocodazole-treated cells (1 × 10⁶) were collected and resuspended in 75 µl SF solution. The RNP/template/pCE-mp53DD mix (25 µl) was then added to the cell suspension. Cells were electroporated immediately with the 4D-Nucleofector (Lonza) program EN-120 and transferred to a six-well plate with prewarmed medium. Electroporated cells were allowed to recover for 5 days before infection with lentivirus expressing mNG2_1-10 (Addgene, cat. no. 82610). At 72 h after infection, mNG2 fluorescence-positive cells were sorted by flow cytometry into single clones. Genomic DNA from AR reporter single clones was extracted for Sanger sequencing (MCLab) as well as exome sequencing (QB3 UC Berkeley) to verify the mNG2_11 insertion.

CRISPRi flow cytometry screen

The genome-scale CRISPRi screens were performed following the Weissman laboratory protocol (weissmanlab.ucsf.edu) as we reported previously60. In brief, two individual clones of C42BmNG2-AR cells (clone 1 and clone 2) were infected with a lentivirus encoding the CRISPRi dCas9-BFP-KRAB protein (Addgene, cat. no. 46910). Following infection, CRISPRi BFP+ cells were sorted to purity and expanded, and CRISPRi silencing activity was confirmed. These cells, which stably express CRISPRi, were used for both the genome-scale screens and subsequent experiments58. The genome-scale CRISPRi-V2 library (Addgene cat. no. 1000000091) virus was generated and titered following the large-scale lentivirus production protocol (https://weissman.wi.mit.edu/resources/) (Extended Data Fig. 3f). Approximately 150 million C42BimNG2-AR cells were then infected at 30% infection in duplicate with the CRISPRi-V2 lentiviral library with 8 µg ml⁻¹ polybrene (TR-1003-G), to achieve an average of 500× cells per library sgRNA after transduction. After 3 days, cells were selected using 8 µg ml⁻¹ puromycin (−03) for 3 days and then allowed to recover without puromycin for 24 h. Cells were then harvested and fixed in 3% PFA (5 million cells per 1 ml of PFA solution) at room temperature for 20 min, washed with cold 1× phosphate-buffered saline (PBS), quenched by 30 mM glycine/PBS (pH 7.5), washed and resuspended in cold FACS buffer (PBS + 10% FBS) and sorted based on mNG2-AR expression levels on a BD FACSAria Fusion cell sorter to collect the 25% highest and 25% lowest cells. Genomic DNA was harvested with the QIAamp DNA FFPE Tissue Kit (cat. no. 56404). A standard genomic DNA PCR (22 cycles) was performed using NEB Next Ultra II Q5 master mix and primers containing TruSeq Indexes for next-generation sequencing (NGS) analysis. NGS libraries were sequenced on a HiSeq 4000. The genome-scale CRISPRi screens were performed in two independent C42BimNG2-AR clones. Clone 1 was tested in a single replicate as a pilot screen experiment, while clone 2 was tested in two screen replicates. Common hits identified in both clones were nominated for follow-up.
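
The relationship between the number of cells infected, the infection rate and the per-sgRNA coverage can be checked with a short calculation. The sketch below is illustrative only: the cell number and infection rate are taken from the text, whereas LIBRARY_SIZE is a placeholder chosen for the example and is not the actual sgRNA count of the CRISPRi-V2 library.

```python
def transduced_cells(cells_infected: float, infection_rate: float) -> float:
    """Expected number of cells carrying an sgRNA after infection."""
    return cells_infected * infection_rate


def coverage_per_sgrna(cells_infected: float, infection_rate: float,
                       library_size: int) -> float:
    """Average number of transduced cells per library sgRNA."""
    return transduced_cells(cells_infected, infection_rate) / library_size


# From the text: ~150 million cells infected at 30% infection.
CELLS_INFECTED = 150e6
INFECTION_RATE = 0.30
# Placeholder library size (assumption for illustration only).
LIBRARY_SIZE = 90_000

print(round(coverage_per_sgrna(CELLS_INFECTED, INFECTION_RATE, LIBRARY_SIZE)))
```

Substituting the true number of sgRNAs in the library gives the coverage actually achieved for a given infection.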

CRISPR screen data analysis

Raw sequencing data were processed using the Weissman laboratory ScreenProcessing35 protocol and GitHub pipeline (https://github.com/mhorlbeck/ScreenProcessing). After setting up the Python environment and required library files as outlined in the tutorial document, sgRNA counts were generated from the raw sequencing files using the command: ‘run fastqgz_to_counts.py -p 6 --trim_start 1 --trim_end 29 library_reference/CRISPRi_V2_human.trim_1_29.’ To calculate sgRNA-level and gene-level phenotypes and P values, the process_experiments.py function was used. The corresponding configuration files are available at GitHub (https://github.com/haolongli87/ptges3_resources). Raw sgRNA counts and phenotype scores are provided in Source Data Fig. 2. To validate the nominated hits, gene-level phenotype scores were also calculated using the MAGeCK pipeline61 based on sgRNA raw counts from both clones, and results are provided in Source Data Fig. 2.
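
For orientation, a sort-based sgRNA phenotype is commonly expressed as the log2 ratio of an sgRNA's read fraction in the high bin to that in the low bin, centered on the non-targeting controls. The following is a minimal sketch of that idea and is not the ScreenProcessing implementation; the function names, the pseudocount and the example counts are all assumptions made for illustration.

```python
import math
from statistics import median


def log2_enrichment(high: int, low: int, total_high: int, total_low: int,
                    pseudocount: float = 1.0) -> float:
    """log2 ratio of read fractions (high bin over low bin) for one sgRNA."""
    frac_high = (high + pseudocount) / total_high
    frac_low = (low + pseudocount) / total_low
    return math.log2(frac_high / frac_low)


def sgrna_phenotypes(counts: dict, control_ids: list,
                     pseudocount: float = 1.0) -> dict:
    """counts maps sgRNA id -> (high_bin_count, low_bin_count).

    Returns log2 enrichments centered on the median of the control sgRNAs.
    """
    total_high = sum(h for h, _ in counts.values())
    total_low = sum(l for _, l in counts.values())
    raw = {
        sg: log2_enrichment(h, l, total_high, total_low, pseudocount)
        for sg, (h, l) in counts.items()
    }
    center = median(raw[c] for c in control_ids)
    return {sg: value - center for sg, value in raw.items()}


# Hypothetical counts: three non-targeting controls and one depleted guide.
example = {"ctrl_1": (100, 100), "ctrl_2": (100, 100),
           "ctrl_3": (100, 100), "sgA": (25, 400)}
scores = sgrna_phenotypes(example, ["ctrl_1", "ctrl_2", "ctrl_3"])
print(scores["sgA"] < 0)  # an sgRNA depleted from the high bin scores negative
```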

Plasmids, luciferase reporter assays and siRNAs

PTGES3 ORF (Addgene, cat. no. 108224) was cloned into the pLVX-TetOne-Puro-C-Flag vector using Gateway cloning (Invitrogen). PTGES3 mutant constructs were cloned with Q5 Site-Directed Mutagenesis (NEB). Guide RNA expression vectors targeting CTRL or individual genes were cloned into pU6-sgRNA EF1Alpha-puro-T2A-mCherry or Lenti-sgRNA-blast (Addgene, cat. no. 104993). Cloning primer sequences are listed in Supplementary Tables 2 and 3. HA-tagged AR construct plasmids were obtained from Addgene: HA-FL-AR (Addgene, cat. no. 171234), HA-AR-NTD (Addgene, cat. no. 171235), HA-AR-NTD-DBD (Addgene, cat. no. 171236) and HA-AR-DBD-LBD (Addgene, cat. no. 171236). The ARE was cloned from the ARR3tk plasmid (Addgene, cat. no. 132360) into the pGL4 vector (Promega) to construct the ARE–luciferase reporter (ARE-luc). Luciferase activities were determined using the luciferin reagent (Promega) according to the manufacturer’s protocol. Transfection efficiency was normalized to Renilla luciferase activities. For siRNA knockdown experiments, cells were transfected with siRNA against control (siNTC, Dharmacon, cat. no. D-001810-10-05), PTGES3 (siPTGES3-1 Dharmacon, cat. no. J-004496-10-0002, siPTGES3 Dharmacon, cat. no. J-004496-10-0002) or AR (siAR-1 Dharmacon, cat. no. J-003400-06-0002, siAR-2 Dharmacon, cat. no. J-003400-07-0002) at a final concentration of 50 nM using Opti-MEM and Lipofectamine RNAiMAX (Invitrogen), and incubated for at least 48 h. Detailed siRNA information is listed in Supplementary Table 4.

STRING and gene set enrichment analysis

For protein network analysis, the top 51 positive AR regulator hits from the screen were submitted to STRING37 (v.11.5). Genes without gene interactions among any of the top 51 hits were removed. Physical subnetwork was chosen. Connections were based on confidence. Line thickness indicates the strength of data support. Genes were clustered by unsupervised Markov Cluster Algorithm (MCL clustering, inflation parameter = 3).

For RNA-seq, gene set enrichment analyses were performed using the preranked method implemented in the fgsea R package (https://doi.org/10.18129/B9.bioc.GSEABase). Hallmark gene sets were downloaded from the Molecular Signatures Database (MSigDB), and genes were ranked by the Wald statistic from DESeq2.
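
To make the preranked approach concrete, the sketch below computes the weighted running-sum enrichment score that underlies preranked gene set enrichment analysis. It is a minimal Python illustration and not the fgsea implementation (which also estimates P values); the function name and the toy gene list are assumptions.

```python
def enrichment_score(ranked_genes, stats, gene_set, weight=1.0):
    """Running-sum enrichment score for a preranked gene list.

    ranked_genes: gene ids sorted by decreasing ranking statistic.
    stats: ranking statistic for each gene, in the same order.
    gene_set: set of gene ids to test.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(stats, in_set)]
    total = sum(hit_weights)
    if total == 0:
        raise ValueError("all genes in the set have a zero statistic")
    running, best = 0.0, 0.0
    for w, hit in zip(hit_weights, in_set):
        # Step up at set members (weighted), step down at non-members.
        running += w / total if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy example: six genes ranked by a Wald-like statistic.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
wald = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
print(enrichment_score(genes, wald, {"g1", "g2"}) > 0)  # set at the top of the ranking
```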

RNA extraction and qPCR

RNA was extracted from the cells per manufacturer’s protocol using the Zymo Quick-RNA extraction kit (cat. no. R1054). cDNA was prepared using SuperScript III First-Strand Synthesis System (cat. no. 18080). mRNA expression was measured using primers (Supplementary Table 5) and SYBR Green Real-Time PCR Master Mixes (Thermo) in the QuantStudio Flex Real-Time PCR system.

Western blot, nuclear and cytoplasmic fractionation, and IP

Protein was extracted with RIPA buffer containing protease inhibitor (Thermo, cat. no. 78430). Nuclear and cytoplasmic extracts were prepared using the NE-PER Nuclear and Cytoplasmic Extraction Reagents kit (Thermo Scientific, cat. no. 78833) according to the manufacturer’s instructions. Protein concentrations were quantified by BCA assay before downstream analysis. Western blots were performed as described previously62. For immunoprecipitation, protein lysates were incubated with primary antibody overnight and processed as previously described62. Antibody information is listed in Supplementary Table 6. Each experiment was performed twice to determine reproducibility, and representative images are shown.

Mass spectrometry

A total of 1 × 10⁷ 22RV1 cells were lysed and then reduced and alkylated at 90 °C for 10 min while shaken at 1,000 rpm per the manufacturer’s protocol (PreOmics iST Kit). Clarified protein samples (100 μg) were digested enzymatically using Trypsin/Lys-C mix at 37 °C for 90 min, and peptides were desalted and dried down in a speedvac overnight at room temperature (CentriVap, Labconco). Peptide samples were resuspended in Solvent A (2% acetonitrile (ACN), 0.1% formic acid (FA)) and quantified using the Pierce Quantitative Peptide Assay (Thermo Scientific). A total of 1 μg of peptides from each sample was loaded onto an EASY-Spray nanocolumn (Thermo Fisher Scientific, cat. no. ES900) installed on a Dionex Ultimate 3000 NanoRSLC coupled with a Q-Exactive Plus mass spectrometer (Thermo Fisher Scientific). Peptides were separated over a 95-min gradient of ACN ranging from 2% to 30%, followed by a quick ramp up to 80% ACN. Tandem mass spectrometry (MS/MS) scans were performed over a mass range of m/z 350–1,500 with a resolution of 17,500. The isolation window was set to 4.0 m/z and the charge state was set to 2.

The MS/MS raw data (.raw files) were processed using MSFragger within FragPipe63 with default label-free quantification settings and searched against the human UniProt database. Contaminant and decoy protein sequences were added to the search database using FragPipe. The computational tools PeptideProphet and ProteinProphet were used for statistical validation of results and subsequent mapping of the peptides to the proteins, respectively, with a 1% false discovery rate. The label-free quantification intensities were used for secondary data analysis. Features with more than 50% missing values were eliminated, and k-nearest neighbor was used to impute the remaining missing values. The data were log transformed, normalized and subjected to Welch’s t-tests (Source Data Fig. 3). Proteins with P values less than 0.05 were considered differentially expressed (DE). Statistical analysis and data visualization were performed in RStudio and MetaboAnalyst 5.064.
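
The missing-value filter and Welch's t statistic described above can be sketched as follows. This is a minimal standard-library illustration, not the MetaboAnalyst or R code used for the analysis; the function names are hypothetical, missing values are represented as None, and the P value (which requires the t distribution) is omitted.

```python
import math
from statistics import mean, variance


def keep_feature(values, max_missing_fraction=0.5):
    """Keep a feature unless more than 50% of its values are missing (None)."""
    missing = sum(v is None for v in values)
    return missing / len(values) <= max_missing_fraction


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Inputs are log-transformed intensities with at least two values per group.
    """
    var_a = variance(group_a) / len(group_a)
    var_b = variance(group_b) / len(group_b)
    t_stat = (mean(group_a) - mean(group_b)) / math.sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(group_a) - 1) + var_b ** 2 / (len(group_b) - 1)
    )
    return t_stat, dof


# Hypothetical log2 intensities for one protein in two conditions.
t_stat, dof = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(t_stat < 0, dof)
```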

IHC and staining score evaluation

The prostate tumor microarray (TMA) with correlated clinical information from 120 clinically localized PCa patients was constructed by the University of Michigan Center for Translational Pathology as previously reported65. IHC was performed as described previously66. Briefly, paraffin-embedded TMA slides were deparaffinized and rehydrated following standard protocols. Antigen retrieval was carried out with HIER EDTA buffer. Endogenous peroxidase activity was blocked using 1% hydrogen peroxide. Slides were probed overnight at 4 °C with anti-PTGES3 (1:20, Sigma cat. no. HPA038672), washed, incubated with biotinylated secondary anti-rabbit antibody (Jackson cat. no. 111-065-144, 1:2,000), followed by incubation with streptavidin-HRP (Invitrogen cat. no. SA10001, 1:250). Staining was visualized using DAB developing kit (Vector Laboratories) and nuclei were counterstained with hematoxylin (Vector Laboratories). The stained TMA slide was scanned by a Leica Aperio AT2 scanner and digital images were evaluated independently and scored blindly by two pathologists (B.A.S. and J.Z.). IHC scores of nuclear PTGES3 were calculated by IHC signal intensity (0–1 as weak and 2–3 as strong).

Recombinant protein expression and purification

Expression constructs for recombinant mouse AR, including the DNA-binding domain (ARDBD, residues 548–651) and a variant lacking the N-terminal domain (ARΔNTD, residues 548–909), were generated, expressed and purified as described previously50,51. Full-length human PTGES3 was engineered with a Smt3 N-terminal tag and cloned into pET28a (Gene Universal), transformed into BL21(DE3) codon plus cells (Novagen) and grown in lysogeny broth, with expression induced by addition of 0.2 mM isopropyl-β-D-thiogalactoside and overnight shaking at 18 °C. Cells were lysed by French press (Constant Systems) and supernatants purified by Ni-NTA (Qiagen), followed by purification by size exclusion chromatography (Superdex 75, Cytiva), overnight cleavage of the Smt3 tag by Ulp1 (when assaying the tagless variant) and final purification by size exclusion chromatography (Superdex 75, Cytiva) in a final buffer of 500 mM NaCl, 20 mM HEPES pH 7.5, 1 mM Tris(2-carboxyethyl) phosphine (TCEP), 5% glycerol and 0.005% IGEPAL CA-630.

DNA-binding assays

Duplex DNAs (36-base-pair) with 5′ fluorescein labels were purchased from Integrated DNA Technologies. The sense strand of the duplex DNAs had the following sequences, with AREs in bold: ARE: /56-FAM/TAAAATGTAAACAACGTAGAACATCAGGAACTCCGG, Canonical ARE: /56-FAM/TAAAATGTAAACAACGTAGAACATCATGTTCTCCGG. Binding buffer for fluorescence polarization (FP) and electrophoretic mobility shift assays (EMSA) consisted of 150 mM NaCl, 20 mM HEPES pH 7.5, 1 mM TCEP, 1 μM DHT, 10% glycerol and 0.05% IGEPAL CA-630. Unless otherwise indicated, equimolar amounts of ARΔNTD (FP) or ARDBD (EMSA) and PTGES3 variants were preincubated on ice for 30 min at 8 μM before mixing with 100 nM of the indicated DNA. For EMSAs, gel-shifted products were resolved on 4% to 20% Tris/borate/EDTA (TBE) PAGE, and data from n = 3 experiments were analyzed by densitometry using Fiji software and fit to a two-phase specific binding model (GraphPad Prism). To calculate apparent Kd values for the FP data, a model for receptor depletion was applied, and data are presented as mean ± s.d. from n = 4 experiments. Data were analyzed using two-way ANOVA, with ****P < 0.0001; NS (not significant), P > 0.05.
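
When the labeled DNA concentration (100 nM here) is not negligible relative to the Kd, depletion-corrected models replace the simple hyperbolic isotherm with the quadratic solution of the 1:1 binding equilibrium. The sketch below shows one common form of that equation; it is offered as an illustration under the assumption of 1:1 binding and is not necessarily the exact model fit in GraphPad Prism. Function and parameter names are hypothetical.

```python
import math


def fraction_dna_bound(protein_total_nm: float, dna_total_nm: float,
                       kd_nm: float) -> float:
    """Fraction of labeled DNA bound for 1:1 binding with depletion.

    Solves [PD]^2 - (Kd + P_t + D_t)[PD] + P_t * D_t = 0 for the complex
    concentration [PD] and returns [PD] / D_t.
    """
    b = kd_nm + protein_total_nm + dna_total_nm
    complex_nm = (b - math.sqrt(b * b - 4.0 * protein_total_nm * dna_total_nm)) / 2.0
    return complex_nm / dna_total_nm


def polarization(protein_total_nm: float, dna_total_nm: float, kd_nm: float,
                 fp_free: float, fp_bound: float) -> float:
    """Observed FP signal as a mix of free and bound DNA signals."""
    fb = fraction_dna_bound(protein_total_nm, dna_total_nm, kd_nm)
    return fp_free + (fp_bound - fp_free) * fb


# Example with 100 nM DNA and a hypothetical Kd of 100 nM.
print(round(fraction_dna_bound(100.0, 100.0, 100.0), 3))
```

Fitting fp_free, fp_bound and kd_nm to a titration then yields the apparent Kd.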

Mass photometry

Complex formation between ARΔNTD and Smt3-PTGES3 in the absence and presence of unlabeled 36-base-pair double-stranded ARE DNA (bearing the sequence TAAAATGTAAACAACGTAGAACATCATGTTCTCCGG, with AREs in bold) was assessed by mass photometry (Refeyn TwoMP). Tagged PTGES3 was used because its molecular weight (32 kDa) is within the linear range of the Refeyn TwoMP assay, and because we determined that the Smt3 tag does not interfere with the ability of PTGES3 to alter AR’s DNA binding activity. Individual proteins and AR–PTGES3 complexes were prepared at 5 μM and incubated on ice for 30 min in DNA binding buffer in the presence or absence of DNA. Samples were subsequently flash diluted into PBS to a final concentration of 10 nM. Video recordings were performed immediately to detect single particles over the course of 1 min at a rate of 100 frames per second in ratiometric acquisition settings. Ratiometric counts were converted to molecular weight in kilodaltons using a standard curve generated with ovalbumin (43 kDa), conalbumin (75 kDa), aldolase (158 kDa) and thyroglobulin (669 kDa). Data were analyzed with Refeyn DiscoverMP software to produce histograms of mass values representing the bulk population and are presented as normalized counts. Data shown are representative of at least three independent measurements.

Structural modeling of AR–PTGES3 complexes

Structural models of PTGES3 and AR chaperone and DNA-bound complexes were generated using a combination of experimentally derived structural models (PDB: 1XOW; PDB: 1R4I; PDB: 7KRJ)43,50,67,68 and AlphaFold 369. AlphaFold models are color coded according to pLDDT confidence estimates, where red is very high and blue is very low confidence. PyMOL (Schrödinger) was used to perform structural alignments and for figure rendering.

ChIP–qPCR and Dual X ChIP–qPCR

ChIP and Dual X ChIP assays were performed as described previously62,70. In brief, 1 × 10⁷ cells were collected, washed and crosslinked with 1% formaldehyde at room temperature for 10 min and then quenched in 125 mM glycine. For Dual X ChIP crosslinking, cells were treated with 1.5 mM ethylene glycol bis(succinimidyl succinate) (EGS; in PBS, Sigma) for 30 min, subsequently crosslinked with 1% formaldehyde at room temperature for 10 min, and quenched with 1.25 M glycine for 5 min. Genomic DNA extraction and ChIP assay were then performed using Bioruptor Pico sonication (Diagenode) and the HighCell# ChIP kit (Diagenode, cat. no. C01010061). Antibodies used for ChIP–qPCR are listed in Supplementary Table 6. Bound DNA was quantified by qPCR (SYBR Green master mix, Invitrogen) using the primer sets listed in Supplementary Table 7. The qPCR results are presented as fold enrichment over control IgG antibody and normalized based on the total input (nonprecipitated chromatin). Primers for the GAPDH promoter were used as a negative control.
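
Input normalization and fold enrichment over IgG are typically derived from Ct values as sketched below. This is a generic illustration that assumes 100% amplification efficiency; the function names, the input fraction and the Ct values are hypothetical and are not taken from the study.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Percent of input recovered in an immunoprecipitation.

    input_fraction is the share of chromatin saved as input (e.g. 0.01 for 1%).
    The input Ct is first adjusted to represent 100% of the chromatin.
    """
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def fold_enrichment(ct_ip: float, ct_igg: float, ct_input: float,
                    input_fraction: float) -> float:
    """Input-normalized signal of the specific antibody relative to IgG."""
    specific = percent_input(ct_ip, ct_input, input_fraction)
    background = percent_input(ct_igg, ct_input, input_fraction)
    return specific / background


# Hypothetical Ct values with a 1% input sample.
print(round(fold_enrichment(ct_ip=25.0, ct_igg=28.0, ct_input=25.0,
                            input_fraction=0.01), 2))
```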

RNA sequencing

RNA was extracted from the cells and reverse transcribed to DNA as described above. The QuantSeq 3′ mRNA-Seq library prep kit FWD (Lexogen, cat. no. 015.24) was used to prepare NGS gene expression libraries per the manufacturer’s protocol. Quality control was performed using the Agilent Bioanalyzer 2100 system and the samples were sequenced using an Illumina HiSeq 4000. The entire RNA-seq experiment was done in two biological replicates. The RNA-seq single-end fastq data generated by the Illumina HiSeq 4000 sequencing system were first trimmed to remove adapter sequences using Cutadapt v.2.6 with the ‘-q 10 -m 20’ option71. After adapter trimming, FASTQC v.0.11.8 was used to evaluate the sequence trimming as well as overall sequence quality. Using the splice-aware aligner STAR72 (v.2.7.1a), RNA-seq reads were aligned onto the Human reference genome build GRCh38decoy using the ‘--outSAMtype BAM SortedByCoordinate --outSAMunmapped Within --outSAMmapqUnique 50 --sjdbOverhang 65 --chimSegmentMin 12 --twopassMode Basic’ option and exon–exon junctions, with human gene model annotation from GENCODE v.30. Gene expression quantification of uniquely mapping reads was performed using the ‘featureCounts’ function within the Rsubread R package73 with the ‘GTF.featureType = ‘exon’, GTF.attrType = ‘gene_id’, useMetaFeatures=TRUE, allowMultiOverlap=FALSE, countMultiMappingReads=FALSE, isLongRead=FALSE, ignoreDup=FALSE, strandSpecific=0, juncCounts=TRUE, genome=NULL, isPairedEnd=FALSE, requireBothEndsMapped=FALSE, checkFragLength=FALSE, countChimericFragments=TRUE, autosort=TRUE’ option. Cross-sample normalization of expression values and differential expression analysis between the PTGES3 knockdown and control were done using the DESeq2 R package74. Benjamini–Hochberg corrected P < 0.05 and log2 foldchange >1 or < −1 were considered statistically significant. All pipelines for RNA-seq data processing have been described previously75 and are available at https://github.com/haolongli87/ptges3_resources.
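
The significance rule stated above (Benjamini–Hochberg corrected P < 0.05 and |log2 fold change| > 1) can be illustrated with a short standalone sketch. DESeq2 performs the multiple-testing correction itself, so the code below is only a didactic reimplementation of the Benjamini–Hochberg step-up adjustment with hypothetical function names and toy values.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def is_significant(padj, log2_fold_change, alpha=0.05, lfc_cutoff=1.0):
    """Apply the adjusted-P and fold-change thresholds to one gene."""
    if padj is None:  # genes without an adjusted P value are not called
        return False
    return padj < alpha and abs(log2_fold_change) > lfc_cutoff


# Toy P values and log2 fold changes for four genes.
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
calls = [is_significant(p, lfc) for p, lfc in zip(padj, [1.8, -2.1, 0.4, 3.0])]
print(calls)
```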

Chromatin immunoprecipitation sequencing

Cells were fixed with 1% formaldehyde in PBS for 10 min at room temperature and then quenched in 125 mM glycine. Nuclei were sequentially isolated using ice-cold LB1 buffer (50 mM HEPES-KOH, pH 7.5, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP-40, 0.25% Triton X-100), followed by ice-cold LB2 buffer (10 mM Tris-HCl, pH 8.0, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA). Subsequently, nuclear lysis was performed on ice for 10 min using ChIP lysis buffer (1% SDS, 5 mM EDTA, 50 mM Tris-HCl, pH 8.1). Chromatin was sheared to approximately 300-bp fragments with a Bioruptor Plus Sonicator (Diagenode). Immunoprecipitation was carried out using an anti-AR antibody (ab108341). Protein–DNA complexes were reverse crosslinked overnight at 65 °C. ChIP and input DNA samples were purified the following day with the QIAquick PCR Purification Kit (Qiagen, cat. no. 28104). Sequencing libraries were constructed from approximately 0.8 ng ChIP DNA using the ThruPLEX DNA-Seq Prep Kit (Takara Bio, cat. no. R400675) with 12 cycles of PCR amplification. Libraries underwent NGS (100 bp, paired-end) on an Illumina NovaSeq X instrument at the Fred Hutch Genomics Core.

ChIP–seq data analyses

Raw FASTQ sequencing data were trimmed using Trimmomatic (v.0.39) and aligned to the human genome (hg38) using Burrows–Wheeler Aligner76 (BWA-mem, v.0.7.17). Alignments with MAPQ scores lower than 30 were filtered out using SAMtools77, and ENCODE blacklisted regions78 were excluded using BEDTools79 (v.2.31.0). Duplicate reads were identified and removed using Picard MarkDuplicates (v.2.25.1) (http://broadinstitute.github.io/picard). Peak calling was performed using MACS380 (v.3.0.0) with a Q value cutoff of 0.01. Normalized BigWig files were generated by bamCoverage from deepTools81 (v.3.5.4) using RPGC normalization. Coverage heatmaps centered around genomic features of interest were produced using the computeMatrix and plotHeatmap modules within deepTools (v.3.5.4). Visualization snapshots were obtained using Integrative Genomics Viewer82 (IGV, v.2.19.1). Motif analyses utilized HOMER83 (v.4.11), applying the Known Motif Discovery approach.

ATAC-seq

ATAC-seq was performed as described previously84,85 with the following modifications. Cells were resuspended in buffer (Illumina, cat. no. 20034198), incubated on ice for 10 min and lysed using a dounce homogenizer. A total of 50,000 nuclei were incubated with 25 µl 2× TD Buffer and 1.25 µl Transposase (Illumina Tagment Enzyme/Buffer, cat. no. 20034210) shaking at 300 rpm at 37 °C for 30 min. Zymo DNA Clean and Concentrator 5 kit (cat. no. D4014) was then used to purify DNA. Transposed DNA was amplified using PCR master mix and indexes from Nextera DNA Library Prep kit (cat. no. 15028211) for five cycles and then assessed using qPCR. Final cleanup was performed using 1.8× AMPure XP beads (cat. no. A63881) and libraries quantified using the DNA High Sensitivity Agilent 2100 Bioanalyzer System. Samples were sequenced at the UCSF Core Facility on a HiSeq4000 as PE100 libraries. The entire experimental setup was performed in two technical replicates.

ATAC-seq data processing

The ATAC-seq paired-end fastq files were first trimmed to remove Illumina Nextera adapter sequence using Cutadapt v2.686 with the ‘-q 10 -m 20’ option. FASTQC v.0.11.887 was used to evaluate sequence trimming and overall sequence quality. Bowtie2 v.2.3.5.188 was used to align ATAC-seq reads against the Human reference genome build GRCh38decoy using the ‘--very-sensitive’ option. Uniquely mapped reads were obtained in SAM format. Samtools version 1.989 was used to convert the SAM file to BAM format and to sort the BAM file. Picard (https://broadinstitute.github.io/picard/) was then used to remove duplicate reads using the MarkDuplicates tool with the ‘REMOVE_DUPLICATES=true’ option. The read positions in the resulting BAM file were then corrected by a constant offset to the read start (‘+’ stranded +4 bp, ‘−’ stranded −5 bp) using deepTools2 v.3.3.281 with the ‘alignmentSieve --ATACshift’ option. This resulted in the final aligned, de-duplicated BAM file that was used in all downstream analyses. ATAC-seq peak calling was performed using MACS2 v.2.2.580 to obtain narrow peaks with the ‘callpeak -f BAMPE -g hs --qvalue 0.05 --nomodel -B --keep-dup all --call-summits’ option. The resulting peaks that mapped to the mitochondrial genome or to genomic regions listed in the ENCODE hg38 blacklist (https://www.encodeproject.org/annotations/ENCSR636HFF/), or that extended beyond the ends of chromosomes, were filtered out. Nonoverlapping unique ATAC-seq narrow peak regions were obtained from all samples analyzed. Only those nonoverlapping unique peak regions present in at least two samples were considered for further analysis. Sequencing reads mapped to these nonoverlapping unique regions were counted using the ‘featureCounts’ function within the Rsubread72 R package with the ‘isPairedEnd=TRUE, countMultiMappingReads=FALSE, maxFragLength=100, autosort=TRUE’ option. Further library-size normalization of the feature counts and differential OCRs between sgPTGES3 and sgCTRL were obtained using the DESeq274 R package. Only those peak regions with Benjamini–Hochberg corrected P < 0.05 and log2 foldchange >1 or <−1 were considered statistically significant. The ATAC-seq peaks were annotated using the ChIPseeker90 R package based on hg38 GENCODE v.30 annotations.
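
The read-start correction described above (+4 bp on the ‘+’ strand, −5 bp on the ‘−’ strand) amounts to a simple strand-dependent offset. The sketch below illustrates the arithmetic only; it is not the deepTools implementation, and the function name and coordinates are hypothetical.

```python
def shift_read_start(five_prime_pos: int, strand: str) -> int:
    """Apply the ATAC-seq offset to the 5' end of a read.

    five_prime_pos: genomic coordinate of the read's 5' end.
    strand: '+' or '-'.
    """
    if strand == "+":
        return five_prime_pos + 4
    if strand == "-":
        return five_prime_pos - 5
    raise ValueError("strand must be '+' or '-'")


# Hypothetical read starts on each strand.
print(shift_read_start(100, "+"), shift_read_start(200, "-"))
```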

Over-representation analysis

The genes nearest to the ATAC-seq peaks that mapped within the promoter regions of the corresponding gene were considered for this analysis. These genes were tested for enrichment against Hallmark gene sets in Molecular Signature Database (MSigDB) v.7.0. A hypergeometric test-based over-representation analysis was used for this purpose. A cutoff threshold of false discovery rate ≤ 0.01 was used to obtain the significantly enriched gene sets.
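
A hypergeometric over-representation test asks how likely it is to observe at least the seen overlap between a query gene list and a gene set by chance. The following standard-library sketch shows the upper-tail probability; the function name and the example numbers are illustrative assumptions, and multiple-testing correction across gene sets is not included.

```python
from math import comb


def hypergeom_upper_tail(overlap: int, universe: int, set_size: int,
                         query_size: int) -> float:
    """P(X >= overlap) for a hypergeometric draw.

    universe: number of background genes.
    set_size: genes in the tested gene set (within the universe).
    query_size: genes in the query list (drawn without replacement).
    """
    max_overlap = min(set_size, query_size)
    total = comb(universe, query_size)
    favorable = sum(
        comb(set_size, i) * comb(universe - set_size, query_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return favorable / total


# Toy example: 10 background genes, a 5-gene set, and a 5-gene query
# that overlaps the set completely.
print(hypergeom_upper_tail(overlap=5, universe=10, set_size=5, query_size=5))
```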

TF binding analysis

The differentially accessible ATAC-seq peaks that mapped to promoter and intergenic regions were used for TF binding analysis. The MEME tool91 was used to discover TF binding motifs de novo. These potential TF binding motifs were annotated for known TF motif from the JASPAR92 database using the TomTom tool within the MEME tool suite. All pipelines for ATAC data processing are described previously93 and are available at https://github.com/haolongli87/ptges3_resources.

Clinical cohort analysis

Publicly available gene expression data from a matched cohort of mCRPC established previously59 were downloaded from cBioPortal (20 May 2021)94,95,96. Samples were grouped based on expression levels above or below the 25th percentile for PTGES3 separately for poly-A or capture-based RNA-seq, and the capture-based results were used if a sample had sequencing data available from both methods. Differences in survival between groups were visualized using the Kaplan–Meier method and a log-rank test was used to test for differences in survival, using the survival and survminer R packages97,98. Hazard ratios were calculated using Cox proportional hazards regression.

Tumor models

All animal studies were conducted in compliance with UCSF Institutional Animal Care and Use Committee (IACUC) guidelines. NOD-SCID-Gamma (NSG) mice (Jackson Laboratory, strain code 005557) were housed in a pathogen-free barrier facility under a 12 h light/12 h dark cycle at 18–23 °C and 40–60% humidity. All in vivo experiments used male mice aged 6–8 weeks. Doxycycline-inducible LNCaPi cells established previously99 were transduced with lentiviral constructs encoding sgPTGES3 or sgCTRL and selected with puromycin. Cells were mixed with Matrigel (Corning, cat. no. 354230, 1:1) and 2 × 10⁶ cells were injected subcutaneously on each flank of male NSG mice. Once the tumors were palpable, mice were randomized into two groups receiving either doxycycline diet (Bio-Serv, cat. no. S3888) or control diet. Tumors were measured using digital calipers. Tumor volume was calculated using the equation: volume = length × width² × 0.52, where the length represents the longer axis. Average tumor volume was plotted and two-way ANOVA was used to measure statistical significance, denoted by asterisks (*P < 0.05, **P < 0.01, ***P < 0.001). The mice were humanely euthanized per UCSF LARC protocol once the tumor volume reached 1,000 mm³.
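
As a worked example of the volume equation, the sketch below computes tumor volume from two caliper measurements and checks it against the 1,000 mm³ endpoint stated above. The function names and the example measurements are hypothetical.

```python
ENDPOINT_MM3 = 1000.0  # humane endpoint stated in the text


def tumor_volume_mm3(axis_a_mm: float, axis_b_mm: float) -> float:
    """volume = length x width^2 x 0.52, with length as the longer axis."""
    length = max(axis_a_mm, axis_b_mm)
    width = min(axis_a_mm, axis_b_mm)
    return length * width ** 2 * 0.52


def reached_endpoint(axis_a_mm: float, axis_b_mm: float) -> bool:
    """True once the calculated volume reaches the endpoint."""
    return tumor_volume_mm3(axis_a_mm, axis_b_mm) >= ENDPOINT_MM3


# Hypothetical caliper readings in millimeters.
print(round(tumor_volume_mm3(10.0, 8.0), 1), reached_endpoint(10.0, 8.0))
```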

WST-1, clonogenic and IncuCyte assays

Cell viability assays were performed as described previously62 using the cell proliferation reagent WST-1 (Sigma) according to the manufacturer’s protocol. Clonogenic assays were performed as described previously99. In brief, 1,000 cells were seeded in a six-well plate and treated as indicated for 10 days. The cell colonies were then washed with PBS and fixed/stained with 25% methanol plus crystal violet (0.05% w/v). Images were scanned and analyzed using a GelCount (Oxford Optronix), and the average number of colonies was counted. For IncuCyte experiments, cells labeled with Nuclight red were seeded into 96-well plates. Time-lapse images were captured and analyzed by the Incucyte S3 live-cell analysis system.

Proximity ligation assays

Proximity ligation assays were performed using the Duolink in situ red starter kit mouse/rabbit (Sigma). LNCaP cells were fixed with 4% PFA and permeabilized by 0.2% Triton X-100. Fixed cells were incubated with primary antibodies overnight at 4 °C. Antibodies used for proximity ligation assays are listed in Supplementary Table 6. Secondary probe, ligation and amplification reactions were performed following the manufacturer’s instructions. Fluorescence images were captured by Zeiss fluorescent microscope (Carl Zeiss).

Statistical analysis and reproducibility

Spearman’s correlation was used to determine statistical significance for all the correlation plots. For gene expression and correlation, the Wilcoxon rank-sum test was used to test for differences between two groups, unless otherwise stated. Unpaired t-tests were used to determine statistical significance for the column plots, denoted by asterisks (*P < 0.05, **P < 0.01, ***P < 0.001). Two-way ANOVA was used to determine statistical significance in the in vivo data. In RNA-seq data, the Benjamini–Hochberg correction was applied; corrected P < 0.05 and log2 foldchange >0.5 or <−0.5 were considered statistically significant. In ATAC-seq data, peak regions with Benjamini–Hochberg corrected P < 0.05 and log2 foldchange >0.5 or <−0.5 were considered statistically significant. No statistical methods were used to predetermine sample sizes for any experiments. For all analyses, data distribution was assumed to be normal, but this was not formally tested.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

