Extended Data Fig. 8: Mutation enrichment and depletion at transcription factor binding sites (TFBS). | Nature

From: Strand-resolved mutagenicity of DNA damage and repair

a, The compositionally corrected mutation rate shows helical (10 bp) periodicity over nucleosomes. Separating the mutation rates by the lesion-containing strand (blue, forward; gold, reverse) reveals two partially offset periodic profiles (top panel). Orientating both strands 5′ → 3′ demonstrates that the profiles are mirror images (bottom panel). Mutation rate peaks (black) correspond to regions where the DNA major groove faces into the histones, and valleys (red) to regions where the major groove faces outward. Mutation enrichment is shown with shaded 95% bootstrap confidence intervals (blue, gold). b, For the lesion-containing strand, mutation rates are significantly higher for the peaks on the 3′ side of the nucleosome dyad than on the 5′ side (significant p-values shown, two-tailed Wilcoxon tests). c, Comparing the compositionally corrected multiallelic rates shows significantly increased multiallelic variation for the 3′ peaks (significant p-values shown, two-tailed Wilcoxon tests), indicating that the increased mutation rate results from slower repair on the 3′ side of the dyad. d, The molecular structure of the CTCF:DNA interface (top) reflects the strand-specific mutation profiles of CTCF binding sites (histograms, composition corrected). A composite crystal structure of CTCF zinc fingers 2–11 (grey surface) is shown binding DNA (blue and gold strands), with close protein:DNA contacts (≤3 Å) illustrated below the structure. At nucleotide positions with close contact between CTCF and atoms thought to acquire mutagenic lesions (red circles), the corresponding strand-specific mutation rates are generally lower than genome-wide expectation (y ≤ 0; excepting apparent A → N mutations considered later). Mutation rates are high (y > 0) for nucleotide positions that have backbone-only contacts, or no close contacts, but are still occluded by CTCF. CTCF motif position 6 exhibits an exceptionally high T → N mutation rate that cannot be readily reconciled with the structure, but the strand specificity demonstrates that it is a consequence of DEN exposure. e, The profile of DNA accessibility around CTCF binding sites defines the categories of sequence (shaded areas) considered subsequently. f, Mutation rates are higher than genome-wide expectation (y = 0) for CTCF binding motif nucleotides and their close flanks. g, This is not reflected in increased rates of multiallelic variation. CTCF-occluded positions (positions -5 to 3 of the CTCF motif) show the greatest elevation of mutation rate but evidence of decreased multiallelic variation. Both high information content (motif-high, bit score > 0.2) and low information content (motif-low, bit score ≤ 0.2) motif positions have high mutation rates. h, DNA accessibility around non-CTCF transcription factor binding sites (TFBS), as in e. i,j, In contrast to the situation for CTCF, all categories of TFBS sites have a suppressed mutation rate compared to genome-wide expectation, y = 0 (i), and suppression of multiallelic variation (j) indicates enhanced repair. However, high information content motif sites (motif-high) have an exceptionally reduced mutation rate that is not similarly reflected in multiallelic variation, suggesting there may be reduced damage in addition to efficient repair at these sites.
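
The motif-high and motif-low split in panels g, i and j can be illustrated with a minimal sketch. It assumes that "bit score" means the per-position Shannon information content of the motif, measured against a uniform background of the four bases; the legend does not state how the score was calculated, so this may differ from the authors' method. The function names and the toy count matrix are hypothetical; only the 0.2 bit threshold comes from the legend.

```python
import math

# Threshold quoted in the legend: > 0.2 bits is motif-high, <= 0.2 is motif-low.
THRESHOLD_BITS = 0.2


def information_content(counts):
    """Information content (bits) of one motif position.

    Assumption: uniform background over A, C, G, T, so the maximum
    is log2(4) = 2 bits and a position with equal counts scores 0.
    """
    total = sum(counts)
    probs = [c / total for c in counts]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2.0 - entropy


def classify_positions(matrix, threshold=THRESHOLD_BITS):
    """Label each motif position as motif-high or motif-low."""
    return [
        "motif-high" if information_content(row) > threshold else "motif-low"
        for row in matrix
    ]


# Hypothetical toy matrix: one row per motif position, counts for A, C, G, T.
toy_matrix = [
    [97, 1, 1, 1],     # strongly constrained position
    [25, 25, 25, 25],  # unconstrained position
    [30, 20, 30, 20],  # weakly constrained position
]

labels = classify_positions(toy_matrix)
print(labels)
```

Under these assumptions, only the strongly constrained position exceeds the 0.2 bit threshold; the other two fall into the motif-low category.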
