
Supplementary S3. The averaging model

Author: Zuguang Gu (z.gu@dkfz.de)

Date: 2016-02-27


When points or rectangles are used under “normal” mode, or pixels under “pixel” mode, each point, rectangle or pixel is mapped to a small window in the genome. When genomic regions are overlapped to the curve and a window is not completely covered by the input regions, a proper averaging method should be applied to summarize the value in the window.

HilbertCurve provides three averaging methods for different scenarios: “absolute”, “weighted” and “w0”.

The overlapping model is illustrated in the following plot. The red line at the bottom represents the small window on the Hilbert curve. The black lines at the top are the input regions that overlap with the window. The thick segments indicate the intersection between each input region and the window.

[Figure: input regions (black, top) overlapping a window on the Hilbert curve (red, bottom)]

For a given window on the curve, $n$ is the number of input regions that overlap with the window (3 in the above plot), $w_i$ is the width of the $i$-th intersected segment (black thick lines), and $x_i$ is the value associated with the $i$-th original region.
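The quantities defined above can be derived from plain coordinates. The following is a minimal sketch, not the package's implementation: the function name `intersection_widths`, the tuple layout `(start, end, value)` and the half-open interval convention are all assumptions made for illustration.

```python
def intersection_widths(window, regions):
    """Return (x_i, w_i) pairs for the regions that overlap the window.

    Assumption: intervals are half-open [start, end), so width = end - start.
    """
    win_start, win_end = window
    overlaps = []
    for start, end, value in regions:
        # Width of the intersection between this region and the window.
        width = min(end, win_end) - max(start, win_start)
        if width > 0:
            overlaps.append((value, width))
    return overlaps


# Hypothetical window and input regions (start, end, value).
window = (100, 200)
regions = [
    (80, 120, 0.2),   # overlaps the left edge of the window
    (130, 150, 0.8),  # lies entirely inside the window
    (190, 260, 0.5),  # overlaps the right edge of the window
    (300, 400, 0.9),  # does not overlap and is ignored
]

overlaps = intersection_widths(window, regions)
n = len(overlaps)  # number of overlapping regions
print(n, overlaps)
```

Here three of the four regions overlap the window, matching the situation drawn in the plot.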

The “absolute” method is denoted as $v_a$ and is simply calculated as the mean of all input regions regardless of their width:

$$v_a = \frac{\sum_{i=1}^{n} x_i}{n}$$

The “weighted” method is denoted as $v_w$ and is calculated as the mean of all input regions weighted by the width of their intersections:

$$v_w = \frac{\sum_{i=1}^{n} x_i w_i}{\sum_{i=1}^{n} w_i}$$
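To make the difference between the two means concrete, here is a small sketch with made-up numbers. The function names and the example values are assumptions for illustration and are not taken from the package.

```python
def absolute_mean(values):
    """Mean of the region values, ignoring intersection widths."""
    return sum(values) / len(values)


def weighted_mean(values, widths):
    """Mean of the region values weighted by intersection widths."""
    return sum(x * w for x, w in zip(values, widths)) / sum(widths)


# Hypothetical values x_i and intersection widths w_i for n = 3 regions.
values = [1.0, 4.0, 10.0]
widths = [10, 30, 60]

v_a = absolute_mean(values)          # (1 + 4 + 10) / 3
v_w = weighted_mean(values, widths)  # (10 + 120 + 600) / 100
print(v_a, v_w)
```

The weighted mean is pulled towards the value of the region with the widest intersection, while the absolute mean treats all three regions equally.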

The “absolute” and “weighted” modes should be applied when background values should not be taken into consideration. For example, when summarizing the mean methylation in a small window, the non-CpG background should be ignored, because methylation is only associated with CpG sites and not with other positions.

The “w0” mode is the weighted mean between the intersected parts and the non-intersected parts:

$$v_{w0} = \frac{v_w W + v_b W'}{W + W'}$$

$W$ is the sum of the widths of the intersected parts ($\sum_{i=1}^{n} w_i$) and $W'$ is the sum of the widths of the non-intersected parts. $v_b$ is the value corresponding to the background. For example, when averaging colors which are represented as numeric RGB values, the background value is set to 255, which corresponds to white.
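The “w0” mean can be sketched in the same way. This is an illustrative sketch under stated assumptions, not the package's code: the function name `w0_mean` is hypothetical, and the non-intersected width is taken as the window width minus the intersected width, which assumes that the intersected segments do not overlap each other.

```python
def w0_mean(values, widths, window_width, background):
    """Weighted mean over intersected and non-intersected parts of a window.

    Assumption: the intersected segments do not overlap each other, so the
    non-intersected width is window_width minus the sum of the widths.
    """
    covered = sum(widths)               # W
    uncovered = window_width - covered  # W'
    if covered == 0:
        # Nothing overlaps the window, so only the background remains.
        return background
    v_w = sum(x * w for x, w in zip(values, widths)) / covered
    return (v_w * covered + background * uncovered) / (covered + uncovered)


# Hypothetical example: one color channel with a white background (255).
values = [0.0, 100.0]  # x_i
widths = [20, 30]      # w_i, so W = 50
window_width = 100     # so W' = 50

v_w0 = w0_mean(values, widths, window_width, background=255)
print(v_w0)
```

In this example the weighted mean of the covered half is 60, and mixing it equally with the white background gives 157.5, so a half-covered window is drawn in a lighter color than a fully covered one.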