Extended Data Fig. 3: Transcription and replication time influence DNA damage induced mutation rate but replication strand bias has negligible impact. | Nature

From: Strand-resolved mutagenicity of DNA damage and repair

a, Relative enrichment (RE) of early versus late replication time for 21 quantile bins of replication fork direction bias (RFD; x-axis shared with b-d). RE was calculated as RE = (early − late)/(early + late), using the number of nucleotides annotated as early or late replicating in each RFD bin. b, Percentage of genic nucleotides in each quantile bin, stratified as transcribed (red; >1 transcript per million (TPM) in P15 mouse liver) or non-transcribed (grey). c, Relative enrichment of strand-biased transcription across RFD bins (RE = (forward − reverse)/(forward + reverse)), calculated using the number of nucleotides contained within the transcription-strand-resolved genomic span of expressed genes (panel b). d, Mutation rate (normalised for nucleotide composition) for RFD bins, calculated separately for forward-strand and reverse-strand lesions; whiskers show 95% C.I. from bootstrap sampling. e, Percentage of nucleotides that are transcribed (>1 TPM, P15 mouse liver) in each of the 21 quantile bins of replication strand bias (RSB; x-axis shared with f). RSB is the RFD metric with all data oriented so that lesions would be on the reverse strand. f, Mutation rates for the 21 RSB bins. g, Mutation rates (y-axis) for the same RSB bins as in panel f, but with the x-axis showing the percentage of nucleotides with transcription over a lesion-containing template strand, illustrating that transcription using a lesion-containing strand is the main determinant of mutation rate. Linear modelling (shaded area, 95% C.I.) and extrapolation of this correlation accurately predict the observed mutation rate in non-genic regions (orange point). h, Mutation rates (y-axis) for the whole genome (gold) stratified into 21 quantile bins of RSB (x-axis). Equivalent analysis is shown for the fractions of the genome contained within expressed genes (tan) and non-genic regions (orange). This repeats the analysis shown in Fig. 1f, confirming the results using Repli-seq data from a second, independent hepatocyte cell line (Hepa1-6 in h, rather than Hep-74.3a in Fig. 1f, which is used except where otherwise stated). i, Multivariate regression modelling based on consecutive 10 kb genomic windows finds that all five tested parameters make nominally significant (right of the dashed line), independent contributions to variation in mutation rate (calculated separately for forward-strand and reverse-strand lesions; blue and gold, respectively). The predominant contributions are transcription over a lesion-containing template strand and, to a lesser extent, replication time. Residual genomic annotation (annotated genes not meeting the >1 TPM threshold for expression) is notably significant, indicating that sub-threshold expression contributes to reducing the mutation rate. The results are highly reproducible when independently using either Hep-74.3a or Hepa1-6 Repli-seq measures (circles and crosses, respectively). j, Multi-regression analysis considering only 10 kb segments that are >5 kb from annotated genes demonstrates that replication time significantly influences mutation rate but that replication strand bias does not. Forward-strand lesions (blue) and reverse-strand lesions (gold) were calculated separately.
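The relative enrichment statistic used in panels a and c can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the genome has already been split into segments, each with an RFD value and counts of early- and late-replicating nucleotides. The function names and the input arrays (`rfd`, `early_nt`, `late_nt`) are hypothetical, and the quantile binning is one plausible reading of "21 quantile bins".

```python
import numpy as np


def relative_enrichment(a, b):
    """RE = (a - b) / (a + b), elementwise; NaN where a + b is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.full(total.shape, np.nan)
    np.divide(a - b, total, out=out, where=total > 0)
    return out


def quantile_bin_index(values, n_bins=21):
    """Assign each value to one of n_bins quantile bins (0 .. n_bins - 1)."""
    values = np.asarray(values, dtype=float)
    edges = np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))
    # Use interior edges only, so the minimum and maximum fall in the
    # first and last bin respectively.
    return np.searchsorted(edges[1:-1], values, side="right")


def re_per_bin(rfd, early_nt, late_nt, n_bins=21):
    """Per-bin RE of early versus late replicating nucleotides.

    rfd, early_nt and late_nt are hypothetical per-segment arrays:
    the segment's RFD value and its early/late nucleotide counts.
    """
    idx = quantile_bin_index(rfd, n_bins)
    early = np.bincount(idx, weights=early_nt, minlength=n_bins)
    late = np.bincount(idx, weights=late_nt, minlength=n_bins)
    return relative_enrichment(early, late)
```

RE ranges from −1 (all late, or all reverse) to +1 (all early, or all forward). The same `relative_enrichment` function applies to panel c if forward- and reverse-strand transcribed nucleotide counts are passed in place of the early and late counts.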
