This Rmarkdown notebook contains simple R code to generate heatmaps for H4K16ac Input normalized ChIP-seq bigwigs (for female and male third instar larvae of Drosophila melanogaster) using EnrichedHeatmap R library. You can find the datasets used (from Valsecchi et al., 2020) here and here.

All the code is hidden by default. To see it, click the ‘show’ buttons on the right side. For example, the chunk below shows all the R libraries used in this notebook.

library(GenomicRanges)
library(GenomicFeatures)
library(tidyverse)
library(TxDb.Dmelanogaster.UCSC.dm6.ensGene)
library(EnrichedHeatmap)
library(reshape2)
library(gridExtra)
library(rtracklayer)
library(circlize)

There’s the Session Info at the bottom of the page.

Load data

Genes and promoters from TxDb package for dmel

TxDb is a convenient way to load transcriptome databases for model organisms.

genes_dm <- genes(TxDb.Dmelanogaster.UCSC.dm6.ensGene)
genes_dm <- genes_dm[seqnames(genes_dm) %in% c("chr2L", "chr2R", "chr3L", "chr3R", "chrX")]
promos <- promoters(genes_dm, upstream = 0, downstream = 1)

Load bigwigs

rtracklayer provides a convenient way to import a variety of genomic file formats into R.

# Import bw
h4k16ac_fem <- import.bw("~/Downloads/GSM3722191_Dmel_female_H4K16ac_ChIP_mergedReps_mapq10_log2FCvsInput.bw")
seqlevels(h4k16ac_fem) <- paste0("chr", seqlevels(h4k16ac_fem))
h4k16ac_fem <- h4k16ac_fem[seqnames(h4k16ac_fem) %in% c("chr2L", "chr2R", "chr3L", "chr3R", "chr4", "chrX", "chrY")]

h4k16ac_male <- import.bw("~/Downloads/GSM3722201_Dmel_male_H4K16ac_ChIP_mergedReps_mapq10_log2FCvsInput.bw")
seqlevels(h4k16ac_male) <- paste0("chr", seqlevels(h4k16ac_male))
h4k16ac_male <- h4k16ac_male[seqnames(h4k16ac_male) %in% c("chr2L", "chr2R", "chr3L", "chr3R", "chr4", "chrX", "chrY")]

Make heatmaps on TSSs +- 5000 nt

We’ve imported two bigwig files with H4K16ac ChIP-seq results for females and males and prepared a TSS reference dataset. Let’s build a heatmap on all the primary TSSs (+-5000 nt) to see if they show similar distributions.

tss_5kb_mat <- lapply(list(
  "H4K16ac_fem" = h4k16ac_fem,
  "H4K16ac_male" = h4k16ac_male
), function(bw){
  mat <- normalizeToMatrix(bw, promos, value_column = 'score',
                           extend = 5000, w = 50, background = NA,
                           smooth = T,
                           mean_mode = 'absolute')
  return(mat)
})

# Keep genes with at least some signal
indkeep <- !(rowSums(tss_5kb_mat[[1]]) == 0 & rowSums(tss_5kb_mat[[2]]) == 0)
tss_5kb_mat <- lapply(tss_5kb_mat, function(mat) mat[indkeep,])

corr_col_fun = colorRamp2(c(-5, 0, 5), c("darkblue", "white", "red"))


hm_tss_fem <- EnrichedHeatmap(tss_5kb_mat[[1]], name = "ChIP/Input", axis_name_rot = 90,
                              col = corr_col_fun,
                column_title = "H4K16ac L3 female",
                #row_split = factor(km_chip_wh$cluster),
                top_annotation = HeatmapAnnotation(
                  enrich = anno_enriched(
                    ylim = c(-1.3, 0.1),
                    axis_param = list(side = "left"))
                ))

hm_tss_male <- EnrichedHeatmap(tss_5kb_mat[[2]], axis_name_rot = 90,
                               col = corr_col_fun,
                               show_heatmap_legend = FALSE,
                               column_title = "H4K16ac L3 male",
                               top_annotation = HeatmapAnnotation(
                                 enrich = anno_enriched(
                                   ylim = c(-1.3, 0.1)
                                 )
                               )
                                     )

# Create a separate legend for clusters
# lgd_cluster <- Legend(
#   labels = names(chr_col),
#   legend_gp = gpar(fill = chr_col),
#   title = "Chr"
# )

draw(hm_tss_fem+hm_tss_male)

We see that male larvae have elevated level of H4K16ac, but it’s not clear what’s the cause of this effect. It is also possible that this is due to a normalization issue.

Separate TSSs by chromosomes

Let’s look at major drosophila chromosome arms (in terms of gene content) separately. Maybe we’ll find something interesting this way.

chrsp <- factor(seqnames(promos[match(rownames(tss_5kb_mat[[1]]), promos$gene_id)]))
chr_col <- viridis::magma(7)[1:5]
names(chr_col) <- c("chr2L", "chr2R", "chr3L", "chr3R", "chrX")

hm_tss_fem <- EnrichedHeatmap(tss_5kb_mat[[1]], name = "ChIP/Input",
                              col = corr_col_fun, axis_name_rot = 90,
                column_title = "H4K16ac L3 female",
                row_split = chrsp,
                top_annotation = HeatmapAnnotation(
                  enrich = anno_enriched(
                    ylim = c(-1.5, 1.8),
                    gp = gpar(col = chr_col),
                    axis_param = list(side = "left"))
                ))


hm_tss_male <- EnrichedHeatmap(tss_5kb_mat[[2]], axis_name_rot = 90,
                               col = corr_col_fun,
                               show_heatmap_legend = FALSE,
                               column_title = "H4K16ac L3 male",
                               top_annotation = HeatmapAnnotation(
                                enrich = anno_enriched(
                                  ylim = c(-1.5, 1.8),
                                  gp = gpar(col = chr_col))
                              )
                                     )
# Create a separate legend for clusters
lgd_cluster <- Legend(
  labels = names(chr_col),
  legend_gp = gpar(fill = chr_col),
  title = "Chr"
)

htlist <- hm_tss_fem+hm_tss_male
htlist <- Heatmap(chrsp, show_row_names = FALSE, show_heatmap_legend = F, 
    col = chr_col, 
    show_column_names = FALSE, width = unit(2, "mm"))+htlist 

#draw(hm_tss_fem+hm_tss_male, annotation_legend_list = list(lgd_cluster))
draw(htlist,
    cluster_rows = FALSE, show_row_dend = FALSE,
    row_split = chrsp, heatmap_legend_side = "right", ht_gap = unit(2, "mm"))

So, apparently H4K16ac distribution is almost indistinguishable on autosomes between males and females, but there’s a huge enrichment on male X-chromosome! Actually, it’s pretty much expected, because this histone mark is brought by MSL protein complex which promotes the transcriptional upregulation of X-linked genes to match gene expression from a single X-chromosome copy in males to the expression from two X-chromosome copies in females. This is called ‘dosage compensation’.

Session info

 R version 4.4.0 (2024-04-24)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Stockholm
tzcode source: internal

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] circlize_0.4.16                           
 [2] rtracklayer_1.64.0                        
 [3] gridExtra_2.3                             
 [4] reshape2_1.4.4                            
 [5] EnrichedHeatmap_1.34.0                    
 [6] ComplexHeatmap_2.20.0                     
 [7] TxDb.Dmelanogaster.UCSC.dm6.ensGene_3.12.0
 [8] lubridate_1.9.3                           
 [9] forcats_1.0.0                             
[10] stringr_1.5.1                             
[11] dplyr_1.1.4                               
[12] purrr_1.0.2                               
[13] readr_2.1.5                               
[14] tidyr_1.3.1                               
[15] tibble_3.2.1                              
[16] ggplot2_3.5.1                             
[17] tidyverse_2.0.0                           
[18] GenomicFeatures_1.56.0                    
[19] AnnotationDbi_1.66.0                      
[20] Biobase_2.64.0                            
[21] GenomicRanges_1.56.1                      
[22] GenomeInfoDb_1.40.1                       
[23] IRanges_2.38.1                            
[24] S4Vectors_0.42.1                          
[25] BiocGenerics_0.50.0                       

loaded via a namespace (and not attached):
 [1] DBI_1.2.3                   bitops_1.0-8               
 [3] rlang_1.1.4                 magrittr_2.0.3             
 [5] clue_0.3-65                 GetoptLong_1.0.5           
 [7] matrixStats_1.4.1           compiler_4.4.0             
 [9] RSQLite_2.3.7               png_0.1-8                  
[11] vctrs_0.6.5                 pkgconfig_2.0.3            
[13] shape_1.4.6.1               crayon_1.5.3               
[15] fastmap_1.2.0               XVector_0.44.0             
[17] utf8_1.2.4                  Rsamtools_2.20.0           
[19] rmarkdown_2.28              tzdb_0.4.0                 
[21] UCSC.utils_1.0.0            bit_4.0.5                  
[23] xfun_0.47                   zlibbioc_1.50.0            
[25] cachem_1.1.0                jsonlite_1.8.8             
[27] blob_1.2.4                  DelayedArray_0.30.1        
[29] BiocParallel_1.38.0         cluster_2.1.6              
[31] parallel_4.4.0              R6_2.5.1                   
[33] bslib_0.8.0                 stringi_1.8.4              
[35] RColorBrewer_1.1-3          jquerylib_0.1.4            
[37] Rcpp_1.0.13                 SummarizedExperiment_1.34.0
[39] iterators_1.0.14            knitr_1.48                 
[41] Matrix_1.7-0                timechange_0.3.0           
[43] tidyselect_1.2.1            rstudioapi_0.16.0          
[45] abind_1.4-5                 yaml_2.3.10                
[47] doParallel_1.0.17           codetools_0.2-20           
[49] curl_5.2.2                  plyr_1.8.9                 
[51] lattice_0.22-6              withr_3.0.1                
[53] KEGGREST_1.44.1             evaluate_0.24.0            
[55] Biostrings_2.72.1           pillar_1.9.0               
[57] MatrixGenerics_1.16.0       foreach_1.5.2              
[59] generics_0.1.3              RCurl_1.98-1.16            
[61] hms_1.1.3                   munsell_0.5.1              
[63] scales_1.3.0                glue_1.7.0                 
[65] tools_4.4.0                 BiocIO_1.14.0              
[67] locfit_1.5-9.10             GenomicAlignments_1.40.0   
[69] XML_3.99-0.17               colorspace_2.1-1           
[71] GenomeInfoDbData_1.2.12     restfulr_0.0.15            
[73] cli_3.6.3                   fansi_1.0.6                
[75] S4Arrays_1.4.1              gtable_0.3.5               
[77] sass_0.4.9                  digest_0.6.37              
[79] SparseArray_1.4.8           rjson_0.2.22               
[81] memoise_2.0.1               htmltools_0.5.8.1          
[83] lifecycle_1.0.4             httr_1.4.7                 
[85] GlobalOptions_0.1.2         bit64_4.0.5