The unfolded protein response of the endoplasmic reticulum protects Caenorhabditis elegans against DNA damage caused by stalled replication forks

Visualisation

Author: Richard J. Acton

Published: March 24, 2024

Modified: March 24, 2024

Paper DOI: 10.1093/g3journal/jkae017

We will be focusing our attention on Figure 7 E & F from this paper (Xu et al. 2024). This serves as something of a pretext to discuss a challenging problem: how to usefully summarise the results of functional enrichment analyses in visual form.

Figures 7 E & F do not do a great job of highlighting the insights that the authors drew from the underlying data analysis. In their original context, the easyGSEA dashboard on eVITTA referenced by the authors, these graphics make a little more sense: there they are interactive graphics with tooltips, and clicking on the bar for a term exposes further information. They are one of a number of visuals with which you can explore the data, so they do not have to do as much heavy lifting on their own. Designing a graph to accommodate arbitrary, dynamically generated content is a harder problem than designing a one-off visual, especially where data ranges can vary quite dramatically. Generally you have to make certain compromises to achieve generality for a graph. That said, there are some critiques which still apply even with the benefit of this context.

There is a problem of emphasis in these plots. The terms are arguably the most important thing for people to be able to read and interpret. Yet the terms are buried in strings along with the ‘aspect’ or source of the term and the term’s identifier. These components are separated with underscores, the terms do not start at a predictable position, and they are sometimes truncated. The direction is for the most part as important as the terms, as knowing the direction of change of a term can also be important for its interpretation. The specific significance, score and rank order of terms are less crucial; interpretation of functional enrichment results is a fairly qualitative process.

The ID of the term is very helpful for anyone following up to weigh other factors in how to interpret these functional enrichment results, but it is only needed for reference and need not feature prominently. (Having it sometimes truncated is not optimal for the ability to look up the term.)

Other factors that are potentially relevant for interpretation are:

A smaller term in which all, or a very high fraction, of the genes show a difference may be more biologically interesting to follow up than a larger term in which a merely substantial fraction of genes change. This can hold even when the smaller term is less significant, because its small size imposes an effective ‘limit of resolution’ on the significance it can reach.

The latter point is perhaps the hardest to capture in a summary.
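As a rough illustration of the point about small terms, the sketch below compares two rows copied from the GSEA results table reproduced later in this post. The column name leading_fraction is introduced here purely for illustration; it is not part of the original data.

```r
# Two rows copied from the GSEA results table shown later in this post.
terms <- data.frame(
    term      = c("germ cell proliferation", "DNA repair"),
    n_leading = c(12L, 45L),    # genes in the leading edge subset
    size      = c(16L, 128L),   # genes annotated to the term
    padj      = c(0.1581, 0.0813)
)

# Fraction of each term's genes found in the leading edge subset
# (leading_fraction is a name made up for this sketch).
terms$leading_fraction <- terms$n_leading / terms$size

# Order by the fraction of the term that changed rather than by significance
terms[order(-terms$leading_fraction), ]
```

Here the smaller term has the larger leading-edge fraction (12 of 16 genes) but the weaker adjusted p-value, which is the kind of case a ranking by significance alone would push down the list.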

When assessing a single functional enrichment analysis, and not attempting to compare the enrichment of multiple different gene sets, the case can be made that a table is a better form of visual representation for this information than a graph. We are asking: ‘Which subsets of a set of genes that differ from some single point of comparison, by varying degrees, stand out in some way?’ Which subsets we look at, how we quantify the degree of difference from some baseline, and what we mean by ‘stand out’ can all vary, but we are starting from a single vector of functional elements, be they genes, transcripts, or proteins, with a single metric, be it their rank order or some score such as a p-value or \(\log_2\) fold change. Thus a single table of terms, with columns to represent various parameters of these sets, can potentially provide us with a more efficient and effective representation.

For anyone thinking about the design of tables for a scientific publication, a good resource is chapter 2 of Maarten Boers’ excellent book: “Data Visualization for Biomedical Scientists” (Boers 2022).

Code
suppressPackageStartupMessages({
    library(purrr)
    library(dplyr)
    library(tidyr)
    library(fs)
    library(readr)
    library(gt)
    library(gtExtras)
    library(DT)
    library(reactable)
    library(reactablefmtr)
    library(webshot2)
    library(htmltools)
    library(tippy)
    library(ggplot2)
    library(uuid)
})
set.seed(42)

A good case can be made that gene set enrichment analysis results lend themselves more to table and table-like representations than graphs.

There are a number of reasons that a table or a hybrid table/graph may be better suited to this data than a purely graphical representation.

The key information is the term name, which is text, often quite long and quite varied in length.

Rank order, precision, and multiple variables for each term can all provide important context for interpretation.

How big the terms are.

Which genes are in the leading edge set (in GSEA; in other types of functional enrichment analysis, which differentially expressed genes are in the over-represented terms).

The overlap between the genes in one term and another. Broad terms can be parents of narrower terms and be enriched as a result of changes in the same or different sets of genes.

Data underpinning these figure panels is available directly from figshare, great!

Code
# Downloading data 
data_file_urls <- c(
    "table_s4.csv" = "https://ndownloader.figshare.com/files/44137139",
    "table_s5.csv" = "https://ndownloader.figshare.com/files/44137145"
)
if(!all(file_exists(names(data_file_urls)))) {
    purrr::iwalk(data_file_urls, ~download.file(.x, .y))
}

R has lots of options for packages that format tables; some examples, discussed below, include gt, kable, rhandsontable, DT, reactable and flextable.

These have varying degrees of support for use in printed vs web-based outputs. gt and kable tables have good print support and also work in web outputs; gt is a little better at being consistent between different output formats. rhandsontable is somewhat spreadsheet-like and can be good in shiny apps with complex interactive tabular inputs. DT and reactable have good interactive sorting, searching and filtering capabilities, which can run client-side in JavaScript, providing quite a lot of interactivity from a static web page as long as the data is not too big. However, reactable is perhaps a little better documented for R users, and more advanced customisations with DT tend to require knowledge of JS/CSS.

flextable has quite nice syntax, somewhat reminiscent of ggplot2. It is better at static than dynamic tables, unlike DT, reactable and rhandsontable, and R-native visual customisations are easier than in DT or reactable, where JS/CSS knowledge is helpful. It makes it very easy to embed arbitrary R plots into cells of your table, so making custom ‘sparkline’ style plots with ggplot2 and adding them to your tables is easy. It also plays very well with embedding tables into other larger R graphics using grid.

#' process_gsea_table
#' 
#' Function for gene set enrichment analysis data processing
#' 
#' @param path the path to a csv file with GSEA data
#' 
process_gsea_table <- function(path) {
    # Send the path to the file to the read comma separated values function
    path %>%
        # Hide the column type guesses
        readr::read_csv(show_col_types = FALSE) %>%
        # Split the contents of the pathway column into three new columns:
        # aspect, term, and id. The regex has three groups (one per column).
        # Group 1: 1 or more word characters, matched non-greedily, then an
        # underscore. Group 2: 0 or more of any character, matched
        # non-greedily, then a % sign. Group 3: 1 or more word characters.
        tidyr::extract(
            pathway, into = c("aspect", "term", "id"),
            regex = "(\\w+?)_(.*?)%(\\w+)",
            remove = FALSE
        ) %>%
        # Make some changes to the following columns
        dplyr::mutate(
            # Convert the type of size to an integer
            size = as.integer(size),
            # Replace all _ in terms with spaces
            term = gsub("_", " ", term),
            # Split the genes string into a list of genes by the ; character
            Genes = strsplit(Genes, ";"),
            # Count the number of genes in the list
            n_genes_in_leading_edge_subset = purrr::map_int(Genes, length),
            # If NES is positive set direction to Up, if it is negative to Down
            direction = dplyr::if_else(sign(NES) == 1, "Up", "Down")
        )
}
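To see what the three regex groups capture, here is a small base R sketch applied to a single string. The example pathway value is an assumption made for illustration (the real contents of the pathway column may be laid out differently), and regexec() is used here in place of tidyr::extract() only to keep the sketch dependency-free.

```r
# Hypothetical pathway string; the real column contents may differ.
pathway <- "BP_DNA_repair%0006281"
pattern <- "(\\w+?)_(.*?)%(\\w+)"

# regexec() returns the full match followed by one entry per capture group
parts <- regmatches(pathway, regexec(pattern, pathway, perl = TRUE))[[1]]
names(parts) <- c("match", "aspect", "term", "id")

parts[["aspect"]]                # "BP"
gsub("_", " ", parts[["term"]])  # "DNA repair"
parts[["id"]]                    # "0006281"
```

Because the first group is non-greedy it stops at the first underscore, which is why underscores inside the term itself end up in the second group and can be swapped for spaces afterwards.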
Code
table_s4 <- names(data_file_urls)[1] %>% process_gsea_table()
# table_s4

A quick approximate reproduction of Figure 7 E

Code
table_s4 %>%
    dplyr::group_by(direction) %>%
    dplyr::slice(1:30) %>%
    dplyr::ungroup() %>%
    dplyr::arrange(desc(direction),desc(ES)) %>%
    dplyr::mutate(
        `-log10(pval)*sign(ES)` = -log10(pval) * sign(ES),
        pathway = factor(pathway, levels = pathway, ordered = TRUE)
    ) %>%
    
    ggplot2::ggplot(ggplot2::aes(ES, pathway)) + 
        ggplot2::geom_point(
            ggplot2::aes(
                size = n_genes_in_leading_edge_subset,
                color = `-log10(pval)*sign(ES)`
            ), alpha = 0.8
        ) +
        ggplot2::guides(size = "none") +
        ggplot2::scale_color_distiller(palette = "RdBu", type = "div") + 
        ggplot2::geom_vline(xintercept = 0, color = "grey") +
        ggplot2::theme_minimal() + 
        ggplot2::labs(x = "Enrichment Score (ES)", y = "")

Code
table_s5 <- names(data_file_urls)[2] %>% process_gsea_table()
# table_s5

Only Biological Process (BP) terms are present in table S4

table_s4 %>% distinct(aspect)
# A tibble: 1 × 1
  aspect
  <chr> 
1 BP    

The tables are pre-filtered

table_s4 %>% nrow()
[1] 83
table_s4 %>% filter(pval < 0.05, padj < 0.25) %>% nrow()
[1] 83

gt example

Code
table_s4_gt_prep <- table_s4 %>%
    dplyr::group_by(direction) %>%
    dplyr::slice(1:30) %>%
    dplyr::ungroup() %>%
    arrange(dplyr::desc(direction), ES) %>%
    dplyr::select(
        dir = direction,
        term, NES, padj,
        n = n_genes_in_leading_edge_subset,
        size, go_id = id
    ) %>%
    dplyr::mutate(
        leading_pc = round((n / size) * 100),
        leading = paste0(n, "/", size, " (%", leading_pc, ")"),
        go = paste0("GO:", go_id),
        go_url = paste0("https://amigo.geneontology.org/amigo/term/GO:", go_id),
        # `GO ID` = purrr::map2(go, go_url, gtExtras::gt_hyperlink)
        `GO ID` = purrr::map2(go_url, go, ~{
            lnk <- paste0('<a href="', .x, '">', .y, '</a>')
            attributes(lnk)$html <- TRUE
            attributes(lnk)$class <- c("html", "character") 
            lnk
        })
    ) %>%
    dplyr::group_by(dir) %>%
    dplyr::arrange(dplyr::desc(dir), padj) %>%
    dplyr::relocate(term, NES, padj, leading, `GO ID`)

table_s4_gt_table <- table_s4_gt_prep %>%
    gt::gt() %>%
    gt::tab_options(
        table.width = 900,
        data_row.padding = gt::px(3),
        row_group.font.weight = "bold"# ,
        # latex.use_longtable = TRUE # not supported until v0.11.1
    ) %>%
    gtExtras::gt_color_rows(
        columns = padj,
        palette = "khroma::batlow",
        domain = table_s4_gt_prep$padj #!!!
    ) %>%
    gtExtras::gt_color_rows(
        columns = NES,
        palette = "RColorBrewer::RdBu",
        direction = 1, 
        domain = table_s4_gt_prep$NES #!!!
    ) %>%
    gt::fmt_number(columns = c(padj), decimals = 4) %>% # pval,
    gt::fmt_number(columns = c(NES), decimals = 3) %>% # ES, 
    gt::cols_hide(c(n, size, leading_pc, go, go_id, go_url)) %>%
    gt::tab_style(
        style = gt::cell_borders(sides = "all", weight = gt::px(0)),
        locations = gt::cells_body()
    )
Warning: Since gt v0.9.0, the `colors` argument has been deprecated.
• Please use the `fn` argument instead.
This warning is displayed once every 8 hours.
Code
table_s4_gt_table
term | NES | padj | leading | GO ID
Up
peptidyl-tyrosine phosphorylation | 1.695 | 0.0802 | 31/77 (%40) | GO:0018108
protein dephosphorylation | 1.889 | 0.0802 | 41/108 (%38) | GO:0006470
peptidyl-serine phosphorylation | 1.975 | 0.0802 | 55/119 (%46) | GO:0018105
olfactory behavior | 1.841 | 0.0802 | 37/67 (%55) | GO:0042048
detection of chemical stimulus involved in sensory perception of smell | 2.116 | 0.0802 | 50/60 (%83) | GO:0050911
nuclear-transcribed mRNA catabolic process | 1.811 | 0.0802 | 7/19 (%37) | GO:0000956
response to stress | 1.781 | 0.0802 | 7/16 (%44) | GO:0006950
DNA repair | 1.492 | 0.0813 | 45/128 (%35) | GO:0006281
spermatogenesis | 1.542 | 0.0813 | 21/58 (%36) | GO:0007283
RNA phosphodiester bond hydrolysis, endonucleolytic | 1.547 | 0.0813 | 13/57 (%23) | GO:0090502
sensory perception of chemical stimulus | 1.642 | 0.0813 | 29/68 (%43) | GO:0007606
nucleosome assembly | 1.653 | 0.0813 | 5/16 (%31) | GO:0006334
RNA metabolic process | 1.787 | 0.0813 | 9/21 (%43) | GO:0016070
proteolysis involved in protein catabolic process | 1.551 | 0.1063 | 16/48 (%33) | GO:0051603
protein import into nucleus | 1.580 | 0.1063 | 16/30 (%53) | GO:0006606
DNA replication | 1.512 | 0.1073 | 36/72 (%50) | GO:0006260
mitotic sister chromatid segregation | 1.554 | 0.1244 | 23/33 (%70) | GO:0000070
DNA recombination | 1.522 | 0.1436 | 18/32 (%56) | GO:0006310
histone acetylation | 1.528 | 0.1464 | 6/28 (%21) | GO:0016573
germ cell proliferation | 1.532 | 0.1581 | 12/16 (%75) | GO:0036093
gastrulation | 1.504 | 0.1694 | 5/22 (%23) | GO:0007369
polar body extrusion after meiotic divisions | 1.492 | 0.1865 | 17/21 (%81) | GO:0040038
DNA replication initiation | 1.494 | 0.2034 | 11/16 (%69) | GO:0006270
Down
neuron differentiation | −1.765 | 0.0813 | 25/36 (%69) | GO:0030182
chemosensory behavior | −1.824 | 0.0813 | 14/41 (%34) | GO:0007635
mechanosensory behavior | −1.615 | 0.0813 | 7/29 (%24) | GO:0007638
protein glycosylation | −1.804 | 0.0813 | 31/88 (%35) | GO:0006486
endocytosis | −1.602 | 0.0813 | 28/92 (%30) | GO:0006897
regulation of postsynaptic membrane potential | −1.530 | 0.0813 | 25/91 (%27) | GO:0060078
regulation of membrane potential | −1.538 | 0.0813 | 25/93 (%27) | GO:0042391
excitatory postsynaptic potential | −1.491 | 0.0813 | 23/85 (%27) | GO:0060079
nervous system process | −1.423 | 0.0813 | 24/89 (%27) | GO:0050877
fatty acid metabolic process | −1.403 | 0.0813 | 31/90 (%34) | GO:0006631
locomotion | −1.440 | 0.0829 | 25/96 (%26) | GO:0040011
nervous system development | −1.941 | 0.0996 | 42/114 (%37) | GO:0007399
monoatomic ion transmembrane transport | −1.799 | 0.1063 | 36/125 (%29) | GO:0034220
chemical synaptic transmission | −1.381 | 0.1063 | 30/128 (%23) | GO:0007268
regulation of monoatomic ion transmembrane transport | −1.627 | 0.1196 | 17/28 (%61) | GO:0034765
regulation of catalytic activity | −1.722 | 0.1223 | 44/140 (%31) | GO:0050790
potassium ion transport | −1.576 | 0.1244 | 28/55 (%51) | GO:0006813
axon guidance | −1.553 | 0.1313 | 22/64 (%34) | GO:0007411
mitochondrion organization | −1.550 | 0.1419 | 18/34 (%53) | GO:0007005
regulation of oviposition | −1.421 | 0.1436 | 22/48 (%46) | GO:0046662
potassium ion transmembrane transport | −1.326 | 0.1439 | 43/85 (%51) | GO:0071805
cytoskeleton organization | −1.519 | 0.1464 | 14/29 (%48) | GO:0007010
molting cycle, collagen and cuticulin-based cuticle | −1.498 | 0.1464 | 16/58 (%28) | GO:0018996
mitochondrial translation | −1.512 | 0.1479 | 23/32 (%72) | GO:0032543
gonad morphogenesis | −1.488 | 0.1479 | 12/38 (%32) | GO:0035262
glutathione metabolic process | −1.462 | 0.1641 | 28/55 (%51) | GO:0006749
muscle contraction | −1.571 | 0.1657 | 19/26 (%73) | GO:0006936
muscle organ development | −1.574 | 0.1845 | 12/18 (%67) | GO:0007517
tRNA methylation | −1.513 | 0.1968 | 13/23 (%57) | GO:0030488
proton motive force-driven ATP synthesis | −1.525 | 0.2171 | 13/17 (%76) | GO:0015986

As a rule of thumb it is good to use proportional fonts in your tables; they are generally easier to read than monospace fonts, where all the letters are the same width. However, some proportional fonts also have digit characters which are proportional in width. This can cause problems with the alignment of numbers in tables, which should be aligned on the decimal point.

Numerical formatting rules in some table packages can be somewhat limited, so to produce correct alignments it may be necessary to convert numerical values to strings, which you can space properly with more powerful numerical formatting functions like R’s sprintf. This, however, has the disadvantage that the table library no longer sees these columns as numerical values, so it cannot do things like sort them interactively on a web page. Even table libraries with extensive numerical formatting options frequently fail to align their formatted results correctly around the decimal point. Annoyingly, it may be necessary in these cases to compromise on correct alignment to get these interactive features, until such time as these alignment deficiencies can be fixed in the upstream table packages. gt does have decimal alignment in versions >= v0.8.0, but this is not the default behavior, which I would contend it should be. I have not found a better solution for DT.
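The string-conversion approach can be sketched in base R as follows. The numbers are made up for illustration, and the width of 8 characters with 3 decimal places is an arbitrary choice that would need adjusting to the range of a real column.

```r
# Made-up values with differing signs and numbers of integer digits
x <- c(1.5, -12.25, 123, 0.001)

# "%8.3f": 3 decimal places, left-padded with spaces to a total width of 8,
# so the decimal point lands in the same character position in every string
aligned <- sprintf("%8.3f", x)

# In a monospace font these line up on the decimal point
writeLines(aligned)

# The cost described above: the column is now text, not numbers
is.character(aligned)
```

The alignment only holds when every character is rendered at the same width, which is why this trick is tied to monospace fonts.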

Unfortunately it is also a habit of some web-based tables to ignore leading white-space characters, like spaces, which you may be relying upon to align your numbers correctly. Setting the CSS properties font-family to monospace and white-space to pre for the relevant columns of your table gets around these issues, as an alignment based on the character widths in a string should then be reliably spaced when rendered on the web page.
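A minimal sketch of those two CSS properties, written as plain HTML strings in base R so that it does not depend on any particular table package. The values and the use of inline styles on td elements are assumptions for illustration; a real table package will have its own way of attaching styles to a column.

```r
# Made-up values, padded to a fixed width as plain strings
x <- c(1.5, -12.25, 123)
padded <- sprintf("%8.3f", x)

# Wrap each value in a table cell whose inline CSS keeps the leading spaces
# (white-space: pre) and gives every character the same width (monospace)
cells <- sprintf(
    '<td style="font-family: monospace; white-space: pre;">%s</td>',
    padded
)
writeLines(cells)
```

In reactable the equivalent would presumably be a column style such as list(fontFamily = "monospace", whiteSpace = "pre"), by analogy with the fontFamily styles used in the reactable example in this post, though that variant has not been run here.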

Reactable

Code
gsea_reactable <- function(
    data, element_id = paste0("gsea-reactable-", uuid::UUIDgenerate())
){
    padj_color <- colourScaleR::universal_colour_scaler(
        data$padj, type = "viridis", palette = "viridis",
        n_breaks = 30, mode = "closure"
    )
    NES_color <- colourScaleR::universal_colour_scaler(
        data$NES, type = "scico", palette = "roma",
        n_breaks = 30, mode = "closure"
    )
    prop_color <- colourScaleR::universal_colour_scaler(
        0:1, type = "viridis", palette = "plasma",
        n_breaks = 30, mode = "closure"
    )
    reactable::reactable(
        data = data,
        elementId = element_id,
        columns = list(
            padj = reactable::colDef(
                name = "Adjusted p-value",
                align = "left",
                width = 140,
                cell = reactablefmtr::data_bars(
                    data, text_position = "inside-base",
                    fill_color = colourScaleR::universal_colour_scaler(
                        sort(data$padj), direction = -1, scale = "log",
                        palette = "Reds", type = "brewer"
                    )
                ),
                style = list(fontFamily = "sans")
            ),
            NES = reactable::colDef(
                align = "center",
                width = 380,
                cell = reactablefmtr::data_bars(
                    data,
                    fill_color = c("#FE9915", "#73C08C")
                ),
                style = list(fontFamily = "sans")
            ),
            ES = reactable::colDef(show = FALSE),
            n = reactable::colDef(show = FALSE),
            leading_pc = reactable::colDef(show = FALSE),
            Genes = reactable::colDef(
                cell = function(){""},
                width = 35,
                details = function(index) {
                    htmltools::div(
                        "Genes: ",
                        paste(data$Genes[[index]], collapse = ", "),
                        style = list(fontSize = "0.75rem")
                    )
                },
                filterInput = function(values, name) {
                    htmltools::tags$input(
                        type = "text",
                        onchange = sprintf(
                            "Reactable.setFilter('%s', '%s', this.value)",
                            element_id, name
                        ),
                        "aria-label" = sprintf("Filter %s", name),
                        style = "width: 100%;"
                    )
                }
            ),
            term = reactable::colDef(width = 300),
            direction = reactable::colDef(
                show = FALSE, defaultSortOrder = "desc"
            ),
            size = reactable::colDef(width = 75, show = FALSE),
            leading = reactable::colDef(
                cell = reactablefmtr::data_bars(
                    data, fill_by = "leading_pc",
                    fill_color = viridisLite::viridis(5)
                ),
                
                name = "(leading/size) %",
                align = "left",
                width = 400,
                style = list(fontFamily = "sans")
            )
        ),
        highlight = TRUE, bordered = FALSE, borderless = TRUE, striped = TRUE,
        pagination = FALSE, compact = TRUE,
        height = 1000,
        defaultSorted = c("direction", "padj")
        # defaultSorted = c("direction", "ES")
    )
}

table_s4 %>% 
    dplyr::group_by(direction) %>%
    dplyr::slice(1:30) %>%
    dplyr::ungroup() %>%
    dplyr::select(
        Genes, term,
        ES, padj, NES,
        direction,
        n = n_genes_in_leading_edge_subset,
        size
    ) %>%
    dplyr::mutate(
        leading_pc = round((n / size) * 100),
        leading = paste0(n, "/", size, " (%",leading_pc,")"),
    ) %>%
    gsea_reactable()

Flextable

Code
table_s4_flxtbl_prep <- table_s4 %>%
    dplyr::group_by(direction) %>%
    dplyr::slice(1:30) %>%
    dplyr::ungroup() %>%
    # arrange(dplyr::desc(direction), ES) %>%
    arrange(dplyr::desc(direction), padj) %>%
    dplyr::select(
        `⇅` = direction,
        term, NES, padj, Genes,
        n = n_genes_in_leading_edge_subset,
        size, id
    ) %>%
    dplyr::mutate(
        leading_pc = round((n / size) * 100),
        leading = paste0(n, "/", size, " (%",leading_pc,")"),
        go = paste0("GO:", id),
        go_url = paste0("https://amigo.geneontology.org/amigo/term/GO:", id)
    ) %>%
    dplyr::mutate(
        Genes = purrr::map_chr(Genes, ~paste0(.x, collapse = ", "))
    )
Code
table_s4_flxtbl <- table_s4_flxtbl_prep %>%
    flextable::flextable(
        col_keys = c("⇅", "term", "NES", "padj", "leading", "go")
    ) %>%
    flextable::bg(
        i = ~`⇅` == "Up", j = "⇅", bg = "#73C08C"
    ) %>%
    flextable::bg(
        i = ~`⇅` == "Down", j = "⇅", bg = "#FE9915"
    ) %>%
    flextable::merge_v(j = "⇅") %>%
    flextable::colformat_double(
        j = c("padj", "NES"),
        big.mark=",", digits = 3, na_str = "N/A"
    ) %>%
    flextable::compose(
        j = "go", value = flextable::as_paragraph(
            flextable::hyperlink_text(x = go, url = go_url)
        )
    ) %>%
    flextable::compose(
        j = "term", value = flextable::as_paragraph(
            flextable::as_chunk(
                term,
                props = flextable::fp_text_default(font.size = 10)
            )
        )
    ) %>%
    flextable::bg(
        j = c("padj"),#"pval", 
        bg = colourScaleR::universal_colour_scaler(
            sort(c(table_s4_flxtbl_prep$padj)), ## !!
            scale = "log", mode = "closure", direction = 1, end = 1, begin = 0.2,
            palette = "plasma", type = "viridis"
        )
    ) %>%
    flextable::bold(j = c("padj")) %>%
    flextable::colformat_double(
        j = c("padj"), 
        big.mark=",", digits = 5, na_str = "N/A"
    ) %>%
    flextable::bg(
        j = c("leading"),#"pval",
        bg = colourScaleR::universal_colour_scaler(
            sort(c(table_s4_flxtbl_prep$leading_pc)),
            mode = "closure", direction = -1, end = 1, begin = 0.2,
            palette = "imola", type = "scico"
        )(table_s4_flxtbl_prep$leading_pc)## !!
    ) %>%
    flextable::compose(
        j = "NES",
        value = flextable::as_paragraph(
            flextable::as_chunk(
                sprintf("% .3f\n", NES),
                props = flextable::fp_text_default(font.size = 8)
            ),
            flextable::minibar(
                value = abs(NES), height = 0.1,
                barcol = colourScaleR::universal_colour_scaler(
                    table_s4_flxtbl_prep$NES, ## !!
                    mode = "closure", direction = 1,
                    palette = "roma", type = "scico"
                )(NES)
            )
        )
    ) %>%
    flextable::align(
        j = c("padj", "NES"), align = "left", part = "all"
    ) %>%
    flextable::hline(
        i = ~{
            x <- `⇅` == "Up"
            x[c(which(x)[-length(which(x))], which(!x))] <- FALSE
            x
        },
        border = flextable::fp_border_default(color = "#666666", width = 2) 
    ) %>%
    flextable::width(
        width = c(0.5, 3, 0.8, 0.8, 0.5, 0.8)
    ) %>%
    flextable::height(height = 8, part = "body", unit = "mm")
    #flextable::set_table_properties(layout = "autofit", width = 1)
res <- table_s4_flxtbl %>% flextable::save_as_image("table_s4_flxtbl.png")
Registered S3 method overwritten by 'webshot':
  method        from    
  print.webshot webshot2
PhantomJS not found. You can install it with webshot::install_phantomjs(). If it is installed, please make sure the phantomjs executable can be found via the PATH variable.
Code
table_s4_flxtbl

term | NES | padj | leading | go
Up
response to stress | 1.781 | 0.08016 | 7/16 (%44) | GO:0006950
nuclear-transcribed mRNA catabolic process | 1.811 | 0.08016 | 7/19 (%37) | GO:0000956
detection of chemical stimulus involved in sensory perception of smell | 2.116 | 0.08016 | 50/60 (%83) | GO:0050911
olfactory behavior | 1.841 | 0.08016 | 37/67 (%55) | GO:0042048
peptidyl-serine phosphorylation | 1.975 | 0.08016 | 55/119 (%46) | GO:0018105
protein dephosphorylation | 1.889 | 0.08016 | 41/108 (%38) | GO:0006470
peptidyl-tyrosine phosphorylation | 1.695 | 0.08016 | 31/77 (%40) | GO:0018108
RNA metabolic process | 1.787 | 0.08130 | 9/21 (%43) | GO:0016070
nucleosome assembly | 1.653 | 0.08130 | 5/16 (%31) | GO:0006334
sensory perception of chemical stimulus | 1.642 | 0.08130 | 29/68 (%43) | GO:0007606
RNA phosphodiester bond hydrolysis, endonucleolytic | 1.547 | 0.08130 | 13/57 (%23) | GO:0090502
spermatogenesis | 1.542 | 0.08130 | 21/58 (%36) | GO:0007283
DNA repair | 1.492 | 0.08130 | 45/128 (%35) | GO:0006281
protein import into nucleus | 1.580 | 0.10625 | 16/30 (%53) | GO:0006606
proteolysis involved in protein catabolic process | 1.551 | 0.10625 | 16/48 (%33) | GO:0051603
DNA replication | 1.512 | 0.10733 | 36/72 (%50) | GO:0006260
mitotic sister chromatid segregation | 1.554 | 0.12440 | 23/33 (%70) | GO:0000070
DNA recombination | 1.522 | 0.14359 | 18/32 (%56) | GO:0006310
histone acetylation | 1.528 | 0.14644 | 6/28 (%21) | GO:0016573
germ cell proliferation | 1.532 | 0.15807 | 12/16 (%75) | GO:0036093
gastrulation | 1.504 | 0.16936 | 5/22 (%23) | GO:0007369
polar body extrusion after meiotic divisions | 1.492 | 0.18646 | 17/21 (%81) | GO:0040038
DNA replication initiation | 1.494 | 0.20342 | 11/16 (%69) | GO:0006270
Down
fatty acid metabolic process | -1.403 | 0.08130 | 31/90 (%34) | GO:0006631
nervous system process | -1.423 | 0.08130 | 24/89 (%27) | GO:0050877
excitatory postsynaptic potential | -1.491 | 0.08130 | 23/85 (%27) | GO:0060079
regulation of membrane potential | -1.538 | 0.08130 | 25/93 (%27) | GO:0042391
regulation of postsynaptic membrane potential | -1.530 | 0.08130 | 25/91 (%27) | GO:0060078
endocytosis | -1.602 | 0.08130 | 28/92 (%30) | GO:0006897
protein glycosylation | -1.804 | 0.08130 | 31/88 (%35) | GO:0006486
mechanosensory behavior | -1.615 | 0.08130 | 7/29 (%24) | GO:0007638
chemosensory behavior | -1.824 | 0.08130 | 14/41 (%34) | GO:0007635
neuron differentiation | -1.765 | 0.08130 | 25/36 (%69) | GO:0030182
locomotion | -1.440 | 0.08291 | 25/96 (%26) | GO:0040011
nervous system development | -1.941 | 0.09959 | 42/114 (%37) | GO:0007399
chemical synaptic transmission | -1.381 | 0.10625 | 30/128 (%23) | GO:0007268
monoatomic ion transmembrane transport | -1.799 | 0.10625 | 36/125 (%29) | GO:0034220
regulation of monoatomic ion transmembrane transport | -1.627 | 0.11957 | 17/28 (%61) | GO:0034765
regulation of catalytic activity | -1.722 | 0.12226 | 44/140 (%31) | GO:0050790
potassium ion transport | -1.576 | 0.12440 | 28/55 (%51) | GO:0006813
axon guidance | -1.553 | 0.13133 | 22/64 (%34) | GO:0007411
mitochondrion organization | -1.550 | 0.14192 | 18/34 (%53) | GO:0007005
regulation of oviposition | -1.421 | 0.14359 | 22/48 (%46) | GO:0046662
potassium ion transmembrane transport | -1.326 | 0.14395 | 43/85 (%51) | GO:0071805
molting cycle, collagen and cuticulin-based cuticle | -1.498 | 0.14644 | 16/58 (%28) | GO:0018996
cytoskeleton organization | -1.519 | 0.14644 | 14/29 (%48) | GO:0007010
gonad morphogenesis | -1.488 | 0.14791 | 12/38 (%32) | GO:0035262
mitochondrial translation | -1.512 | 0.14791 | 23/32 (%72) | GO:0032543
glutathione metabolic process | -1.462 | 0.16414 | 28/55 (%51) | GO:0006749
muscle contraction | -1.571 | 0.16566 | 19/26 (%73) | GO:0006936
muscle organ development | -1.574 | 0.18448 | 12/18 (%67) | GO:0007517
tRNA methylation | -1.513 | 0.19675 | 13/23 (%57) | GO:0030488
proton motive force-driven ATP synthesis | -1.525 | 0.21711 | 13/17 (%76) | GO:0015986

Clustered / selected Set Membership overlap

Representing the overlap between genes in different terms is challenging to do in a way that can be included in a combined visualisation.

Code
table_s4_set_map <- table_s4 %>%
    dplyr::select(Genes, term) %>%
    tidyr::unnest(cols = c(Genes)) %>%
    dplyr::filter(grepl("DNA", term)) %>%
    #dplyr::filter(term %in% c(head(unique(table_s4$term)))) %>%
    dplyr::group_by(Genes) %>%
    dplyr::mutate(n = n()) %>%
    dplyr::ungroup() %>%
    dplyr::arrange(dplyr::desc(n)) %>%
    dplyr::mutate(
        Genes = factor(Genes, levels = unique(Genes), ordered = TRUE),
        term = factor(term, levels = unique(term), ordered = TRUE)
    ) %>%
    
    ggplot2::ggplot(aes(Genes, term)) +
        ggplot2::geom_tile(fill = "red", colour = "grey") +
        ggplot2::theme_void() + 
        ggplot2::theme(
            #plot.background = ggplot2::element_rect(fill = "black"),
            axis.text.y = ggplot2::element_text(hjust = 1, size = 8),
            axis.text.x = ggplot2::element_text(
                angle = 70, size = 8, vjust = 0.5, hjust = 0.5
            )
        )# + 
        #ggplot2::coord_equal()

ggplot2::ggsave("table_s4_set_map.png", table_s4_set_map, height = 1.5, width = 12)

table_s4_set_map
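One possible numeric companion to the tile plot is a pairwise overlap score between terms. The sketch below uses the Jaccard index, which is a choice made here rather than something taken from the paper or from easyGSEA, and the gene sets and the jaccard helper are hypothetical placeholders.

```r
# Hypothetical gene sets standing in for the Genes list column
gene_sets <- list(
    "term A" = c("gene1", "gene2", "gene3", "gene4"),
    "term B" = c("gene3", "gene4", "gene5"),
    "term C" = c("gene6", "gene7")
)

# Jaccard index: shared genes divided by all genes found in either set
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Pairwise overlap matrix: 1 on the diagonal, 0 for disjoint sets
idx <- seq_along(gene_sets)
overlap <- outer(
    idx, idx,
    Vectorize(function(i, j) jaccard(gene_sets[[i]], gene_sets[[j]]))
)
dimnames(overlap) <- list(names(gene_sets), names(gene_sets))
round(overlap, 2)
```

With the objects from this post, something like setNames(table_s4$Genes, table_s4$term) should give a list of the same shape, though that has not been run here. A matrix like this could feed a clustering step or a small heatmap column alongside the table.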

Reproducibility & Online Graphical Portals

Results from online graphical portals are easier to reproduce where the version of the tool is clear, where it is possible to use old versions, and where any underlying database or dataset is also versioned and accessible.
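Where a portal or repository does not make versions explicit, one partial workaround is to record what was retrieved and when. This is only a sketch: record_provenance is a hypothetical helper, the URL is a placeholder, and a checksum identifies a file without telling you which upstream database release produced it.

```r
# Hypothetical helper: note where a file came from and exactly what it was
record_provenance <- function(path, url) {
    data.frame(
        file = basename(path),
        url = url,
        md5 = unname(tools::md5sum(path)),  # checksum of the file contents
        retrieved = format(Sys.time(), "%Y-%m-%d %H:%M:%S", tz = "UTC"),
        r_version = R.version.string,
        stringsAsFactors = FALSE
    )
}

# Demonstration on a temporary file standing in for a downloaded table
tmp <- tempfile(fileext = ".csv")
writeLines(c("pathway,NES", "example,1.0"), tmp)
prov <- record_provenance(tmp, "https://example.org/placeholder.csv")
prov
```

Saving a record like this next to the downloaded tables makes it possible to tell later whether a re-download returned the same file.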

Slides

slides from the discussion session.

Session Info

sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] uuid_1.1-0          ggplot2_3.3.6       tippy_0.1.0        
 [4] htmltools_0.5.7     webshot2_0.1.0      reactablefmtr_2.0.0
 [7] reactable_0.4.4     DT_0.23             gtExtras_0.4.0     
[10] gt_0.10.1           readr_2.1.2         fs_1.6.3           
[13] tidyr_1.2.0         dplyr_1.1.4         purrr_0.3.4        

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0    viridisLite_0.4.0   farver_2.1.0       
 [4] viridis_0.6.2       fastmap_1.1.0       promises_1.2.0.1   
 [7] digest_0.6.29       lifecycle_1.0.3     ellipsis_0.3.2     
[10] processx_3.6.1      magrittr_2.0.3      compiler_4.3.1     
[13] rlang_1.1.0         sass_0.4.8          tools_4.3.1        
[16] utf8_1.2.2          yaml_2.3.5          data.table_1.14.2  
[19] knitr_1.39          labeling_0.4.2      htmlwidgets_1.6.4  
[22] bit_4.0.4           xml2_1.3.6          RColorBrewer_1.1-3 
[25] scico_1.3.0         websocket_1.4.1     withr_2.5.0        
[28] grid_4.3.1          fansi_1.0.3         gdtools_0.2.4      
[31] colorspace_2.0-3    paletteer_1.4.0     scales_1.3.0       
[34] cli_3.6.2           rmarkdown_2.14      crayon_1.5.1       
[37] generics_0.1.2      tzdb_0.3.0          chromote_0.1.0     
[40] stringr_1.4.0       parallel_4.3.1      base64enc_0.1-3    
[43] vctrs_0.6.5         webshot_0.5.3       jsonlite_1.8.0     
[46] hms_1.1.1           bit64_4.0.5         systemfonts_1.0.4  
[49] crosstalk_1.2.0     fontawesome_0.2.2   glue_1.6.2         
[52] reactR_0.5.0        rematch2_2.1.2      ps_1.7.1           
[55] stringi_1.7.6       flextable_0.7.2     gtable_0.3.0       
[58] shape_1.4.6         colourScaleR_1.1.1  later_1.3.0        
[61] prismatic_1.1.0     munsell_0.5.0       tibble_3.2.1       
[64] pillar_1.9.0        circlize_0.4.15     R6_2.5.1           
[67] vroom_1.5.7         evaluate_0.15       renv_0.15.5        
[70] Rcpp_1.0.8.3        zip_2.2.0           gridExtra_2.3      
[73] officer_0.4.3       xfun_0.38           pkgconfig_2.0.3    
[76] GlobalOptions_0.1.2

References

Boers, Maarten. 2022. Data Visualization for Biomedical Scientists: Creating Tables and Graphs That Work. Amsterdam: VU University Press.
Xu, Jiaming, Brendil Sabatino, Junran Yan, Glafira Ermakova, Kelsie R S Doering, and Stefan Taubert. 2024. “The Unfolded Protein Response of the Endoplasmic Reticulum Protects Caenorhabditis elegans Against DNA Damage Caused by Stalled Replication Forks.” Edited by S Lee. G3: Genes, Genomes, Genetics, January. https://doi.org/10.1093/g3journal/jkae017.
