Chapter 16 A complex example of Chord diagram
In this chapter, we demonstrate how to make a complex Chord diagram and how to customize additional tracks by visualizing chromatin state transitions as well as methylation changes. A chromatin state transition matrix shows how much a chromatin state in the genome has been changed from e.g. one group of samples to the other. The genomic regions for which the chromatin states change also have variable methylation patterns which may show interesting correspondance to chromatin states change.
The data used in this post is processed from Roadmap dataset. The chromatin states are learned from five core chromatin marks. Roadmap samples are classified into two groups based on expression profile. In each group, a chromatin state is assigned to the corresponding genomic bin if it is recurrent in at least half of the samples in each group.
The processed data can be set from https://jokergoo.github.io/circlize_book/data/chromatin_transition.RData.
load("data/chromatin_transition.RData")
In the RData file, there are three matrix: mat
, meth_mat_1
and
meth_mat_2
which are:
mat
: chromatin state transition matrix. Rows correspond to states in group 1 and columns correspond to group 2. The value in the matrix are total base pairs that transite from one state to the other. E.g.mat["TssA", "TxFlnk"]
is the total base pairs that have “TssA” state in samples in group 1 and transites to “TxFlnk” state in samples in group 2. On the diagonal are the regions where the states have not been changed in the two groups.meth_mat_1
: mean methylation for each set of regions in group 1. E.g.meth_mat_1["TssA", "TxFlnk"]
is the mean methylation for the regions in group 1 that have “TssA” state in group 1 and “TxFlnk” state in group 2.meth_mat_2
: mean methylation for each set of regions in group 2. E.g.meth_mat_2["TssA", "TxFlnk"]
is the mean methylation for the regions in group 2 that have “TssA” state in group 1 and “TxFlnk” state in group 2.
1:4, 1:4] mat[
## TssA TssAFlnk TxFlnk Tx
## TssA 497200 79600 13400 1800
## TssAFlnk 56400 233200 5000 800
## TxFlnk 0 400 43000 1800
## Tx 800 200 200 166400
1:4, 1:4] meth_mat_1[
## TssA TssAFlnk TxFlnk Tx
## TssA 0.1647232 0.1580874 0.1917435 0.2690045
## TssAFlnk 0.2591677 0.2689880 0.3616242 0.3411387
## TxFlnk NA 0.3697514 0.3360386 0.4752722
## Tx 0.8268626 0.7822987 0.5799682 0.6595322
Normally, in majority in the genome, chromatin states of regions are not changed in the two groups, thus, we should only look at the regions in which the states are changed.
# proportion of the unchanges states in the genome
sum(diag(mat))/sum(mat)
## [1] 0.6192262
# remove the unchanged states
diag(mat) = 0
When making the plot, actually rows and columns are different (because one is
from group 1 and the other is from group 2), thus we give them different names
and the original names are stored in all_states
.
= rownames(mat)
all_states = nrow(mat)
n_states
rownames(mat) = paste0("R_", seq_len(n_states))
colnames(mat) = paste0("C_", seq_len(n_states))
dimnames(meth_mat_1) = dimnames(mat)
dimnames(meth_mat_2) = dimnames(mat)
Next we set the colors. colmat
is the color of the links and the colors
are represent as hexadecimal code. Links have more transparent (A0
) if they
contain few transitions (< 70th percentile) because we don’t want it
to disturb the visualization of the major transitions.
= c("TssA" = "#E41A1C", "TssAFlnk" = "#E41A1C",
state_col "TxFlnk" = "#E41A1C", "Tx" = "#E41A1C",
"TxWk" = "#E41A1C", "EnhG" = "#E41A1C",
"Enh" = "#E41A1C", "ZNF/Rpts" = "#E41A1C",
"Het" = "#377EB8", "TssBiv" = "#377EB8",
"BivFlnk" = "#377EB8", "EnhBiv" = "#377EB8",
"ReprPC" = "#377EB8", "ReprPCWk" = "#377EB8",
"Quies" = "black")
# one for rows and one for columns
= c(state_col, state_col)
state_col2 names(state_col2) = c(rownames(mat), colnames(mat))
= rep(state_col2[rownames(mat)], n_states)
colmat = rgb(t(col2rgb(colmat)), maxColorValue = 255)
colmat
= quantile(mat, 0.7)
qati > qati] = paste0(colmat[mat > qati], "A0")
colmat[mat <= qati] = paste0(colmat[mat <= qati], "20")
colmat[mat dim(colmat) = dim(mat)
Now we can use chordDiagram()
function to make the plot. Here we set one
pre-allocated track in which the methylation information will be added later.
Also we only set annotationTrack
to grid
and the axes and sector labels
will be customized in later code.
chordDiagram()
returns a data frame which contains coordinates for all links.
circos.par(cell.padding = c(0, 0, 0, 0), points.overflow.warning = FALSE)
= chordDiagram(mat, col = colmat, grid.col = state_col2,
cdm_res directional = TRUE, annotationTrack = "grid",
big.gap = 10, small.gap = 1,
preAllocateTracks = list(track.height = 0.1),
link.target.prop = FALSE)
head(cdm_res)
## rn cn value1 value2 o1 o2 x1 x2 col
## 1 R_1 C_1 0 0 15 13 431200 267200 #E41A1C20
## 2 R_2 C_1 56400 56400 15 12 159800 267200 #E41A1CA0
## 3 R_3 C_1 0 0 15 11 3600 210800 #E41A1C20
## 4 R_4 C_1 800 800 15 10 34600 210800 #E41A1C20
## 5 R_5 C_1 98200 98200 15 9 1411600 210000 #E41A1CA0
## 6 R_6 C_1 0 0 15 8 139800 111800 #E41A1C20
Now the axes are added in the second track, also, the index of states are
added at the center of the grids in the second track, if the degree for a
sector is larger than 3 degrees. Note since there is already one pre-allocated
track, the circular rectangles are in the second track (track.index = 2
).
circos.track(track.index = 2, panel.fun = function(x, y) {
if(abs(CELL_META$cell.start.degree - CELL_META$cell.end.degree) > 3) {
= CELL_META$sector.index
sn = as.numeric(gsub("(C|R)_", "", sn))
i_state circos.text(CELL_META$xcenter, CELL_META$ycenter, i_state, col = "white",
font = 2, cex = 0.7, adj = c(0.5, 0.5), niceFacing = TRUE)
= CELL_META$xlim
xlim = seq(0, xlim[2], by = 4e5)
breaks circos.axis(major.at = breaks, labels = paste0(breaks/1000, "KB"), labels.cex = 0.5)
}bg.border = NA) },
On the top half, it is easy to see the proportion of different transitions in group 1 that come to every state in group 2. However, it is not straightforward for the states in the bottom half to see the proportion of different states in group 2 they transite to. This can be solved by adding small circular rectangles to represent the proportions. In following example, the newly added circular rectangles in the bottom half show e.g. how much the state 15 in group 1 has been transited to different states in group 2.
Note from version 0.4.11, there is a new argument link.target.prop
in chordDiagram()
.
It basically automatically applies following code. However, we still suggest users
to read the following code because more tracks with similar code will be drawn later.
for(i in seq_len(nrow(cdm_res))) {
if(cdm_res$value1[i] > 0) {
circos.rect(cdm_res[i, "x1"], -mm_y(1),
"x1"] - abs(cdm_res[i, "value1"]), -mm_y(2),
cdm_res[i, col = state_col2[cdm_res$cn[i]], border = state_col2[cdm_res$cn[i]],
sector.index = cdm_res$rn[i], track.index = 2)
} }
Methylation in each category is put on the most outside of the circle. On this track, we will
put two paralle rectangles which are mean methylation and methylation difference between group 1
and group 2. Basically, on the bottom, we show meth_mat_2 - meth_mat_1
and on the top we show
meth_mat_1 - meth_mat_2
.
The logic of following code is simple that it just simply adds rectangles repeatedly.
= quantile(abs(c(meth_mat_1, meth_mat_2) - 0.5), 0.95, na.rm = TRUE)
abs_max = colorRamp2(c(0.5 - abs_max, 0.5, 0.5 + abs_max), c("blue", "white", "red"))
col_fun = colorRamp2(c(-abs_max, 0, abs_max), c("green", "white", "orange"))
col_fun2
= get.cell.meta.data("ylim", sector.index = rownames(mat)[1], track.index = 1)
ylim = ylim[1] + (ylim[2] - ylim[1])*0.4
y1 = ylim[2]
y2 for(i in seq_len(nrow(cdm_res))) {
if(cdm_res$value1[i] > 0) {
circos.rect(cdm_res[i, "x1"], y1, cdm_res[i, "x1"] - abs(cdm_res[i, "value1"]), y1 + (y2-y1)*0.45,
col = col_fun(meth_mat_1[cdm_res$rn[i], cdm_res$cn[i]]),
border = col_fun(meth_mat_1[cdm_res$rn[i], cdm_res$cn[i]]),
sector.index = cdm_res$rn[i], track.index = 1)
circos.rect(cdm_res[i, "x1"], y1 + (y2-y1)*0.55, cdm_res[i, "x1"] - abs(cdm_res[i, "value1"]), y2,
col = col_fun2(meth_mat_2[cdm_res$rn[i], cdm_res$cn[i]] - meth_mat_1[cdm_res$rn[i], cdm_res$cn[i]]),
border = col_fun2(meth_mat_2[cdm_res$rn[i], cdm_res$cn[i]] - meth_mat_1[cdm_res$rn[i], cdm_res$cn[i]]),
sector.index = cdm_res$rn[i], track.index = 1)
circos.rect(cdm_res[i, "x2"], y1, cdm_res[i, "x2"] - abs(cdm_res[i, "value1"]), y1 + (y2-y1)*0.45,
col = col_fun(meth_mat_2[cdm_res$rn[i], cdm_res$cn[i]]),
border = col_fun(meth_mat_2[cdm_res$rn[i], cdm_res$cn[i]]),
sector.index = cdm_res$cn[i], track.index = 1)
circos.rect(cdm_res[i, "x2"], y1 + (y2-y1)*0.55, cdm_res[i, "x2"] - abs(cdm_res[i, "value1"]), y2,
col = col_fun2(meth_mat_1[cdm_res$rn[i], cdm_res$cn[i]] - meth_mat_2[cdm_res$rn[i], cdm_res$cn[i]]),
border = col_fun2(meth_mat_1[cdm_res$rn[i], cdm_res$cn[i]] - meth_mat_2[cdm_res$rn[i], cdm_res$cn[i]]),
sector.index = cdm_res$cn[i], track.index = 1)
}
}circos.clear()
Legends can be added according to instructions discussed in Section 4.