normalizeToMatrix.rd
Normalize Associations between Genomic Signals and Target Regions into a Matrix
normalizeToMatrix(signal, target, extend = 5000, w = max(extend)/100,
value_column = NULL, mapping_column = NULL, background = ifelse(smooth, NA, 0),
empty_value = NULL, mean_mode = c("absolute", "weighted", "w0", "coverage"),
include_target = any(width(target) > 1),
target_ratio = min(c(0.4, mean(width(target))/(sum(extend) + mean(width(target))))),
k = min(c(20, min(width(target)))), smooth = FALSE, smooth_fun = default_smooth_fun,
keep = c(0, 1), limit = NULL, trim = NULL, flip_upstream = FALSE, verbose = TRUE)
A GRanges-class
object.
A GRanges-class
object.
Extended base pairs to the upstream and/or downstream of target
. It can be a vector of length one or two. Length one means same extension to the upstream and downstream.
Window size for splitting upstream and downstream, measured in base pairs
Column index in signal
that is mapped to colors. If it is not set, it assumes values for all signal regions are 1.
Mapping column to restrict overlapping between signal
and target
. By default it tries to look for all regions in signal
that overlap with every target.
Values for windows that don't overlap with signal
.
Deprecated, please use background
instead.
When a window is not perfectly overlapped to signal
, how to summarize values to the window. See 'Details' section for a detailed explanation.
Whether include target
in the heatmap? If the width of all regions in target
is 1, include_target
is enforced to FALSE
.
The ratio of target
columns in the normalized matrix. If the value is 1, extend
will be reset to 0.
Number of windows only when target_ratio = 1
or extend == 0
, otherwise ignored.
Whether apply smoothing on rows in the matrix?
The smoothing function that is applied to each row in the matrix. This self-defined function accepts a numeric vector (may contain NA
values) and returns a vector with same length. If the smoothing is failed, the function should call stop
to throw errors so that normalizeToMatrix
can catch how many rows are failed in smoothing. See the default default_smooth_fun
for example.
Percentiles in the normalized matrix to keep. The value is a vector of two percent values. Values less than the first percentile is replaces with the first pencentile and values larger than the second percentile is replaced with the second percentile.
Similar as keep
, but it provides boundary for absolute values. The value should be a vector of length two.
Deprecated, please use keep
instead.
Sometimes whether the signals are on the upstream or the downstream of the targets are not important and users only want to show the relative distance to targets. If the value is set to TRUE
, the upstream part in the normalized matrix is flipped and added to the downstream part The flipping is only allowed when the targets are single-point targets or the targets are excluded in the normalized matrix (by setting include_target = FALSE
). If the extension for the upstream and downstream is not equal, the smaller extension is used for the final matrix.
Whether to print help messages.
In order to visualize associations between signal
and target
, the data is transformed into a matrix
and visualized as a heatmap by EnrichedHeatmap
afterwards.
Upstream and downstream also with the target body are splitted into a list of small windows and overlap
to signal
. Since regions in signal
and small windows do not always 100 percent overlap, there are four different averaging modes:
Following illustrates different settings for mean_mode
(note there is one signal region overlapping with other signals):
40 50 20 values in signal regions
++++++ +++ +++++ signal regions
30 values in signal region
++++++ signal region
================= a window (17bp), there are 4bp not overlapping to any signal regions.
4 6 3 3 overlap
absolute: (40 + 30 + 50 + 20)/4
weighted: (40*4 + 30*6 + 50*3 + 20*3)/(4 + 6 + 3 + 3)
w0: (40*4 + 30*6 + 50*3 + 20*3)/(4 + 6 + 3 + 3 + 4)
coverage: (40*4 + 30*6 + 50*3 + 20*3)/17
A matrix with following additional attributes:
upstream_index
column index corresponding to upstream of target
target_index
column index corresponding to target
downstream_index
column index corresponding to downstream of target
extend
extension on upstream and downstream
smooth
whether smoothing was applied on the matrix
failed_rows
index of rows which are failed after smoothing
The matrix is wrapped into a simple normalizeToMatrix
class.
signal = GRanges(seqnames = "chr1",
ranges = IRanges(start = c(1, 4, 7, 11, 14, 17, 21, 24, 27),
end = c(2, 5, 8, 12, 15, 18, 22, 25, 28)),
score = c(1, 2, 3, 1, 2, 3, 1, 2, 3))
target = GRanges(seqnames = "chr1", ranges = IRanges(start = 10, end = 20))
normalizeToMatrix(signal, target, extend = 10, w = 2)
#> Normalize signal to target:
#> Upstream 10 bp (5 windows)
#> Downstream 10 bp (5 windows)
#> Include target regions (6 windows)
#> 1 target region
normalizeToMatrix(signal, target, extend = 10, w = 2, include_target = TRUE)
#> Normalize signal to target:
#> Upstream 10 bp (5 windows)
#> Downstream 10 bp (5 windows)
#> Include target regions (6 windows)
#> 1 target region
normalizeToMatrix(signal, target, extend = 10, w = 2, value_column = "score")
#> Normalize signal to target:
#> Upstream 10 bp (5 windows)
#> Downstream 10 bp (5 windows)
#> Include target regions (6 windows)
#> 1 target region