Partition the DAG

partition_by_level(dag, level = 1, from = NULL, term_pos = NULL)

partition_by_size(dag, size = round(dag_n_terms(dag)/5))

Arguments

dag

An ontology_DAG object.

level

Depth in the DAG at which to cut. The DAG is cut below the terms with depth == level, i.e. the links from these terms to their child terms are removed.

from

A list of terms at which to cut the DAG. If it is set, level is ignored.

term_pos

Internally used.

size

Number of terms in a cluster. The splitting stops on a term if all its child-trees are smaller than size.

Value

A character vector with one element per term, giving the top term of the partition to which the term is assigned.

Details

Let's call the terms below the from terms "top terms", because they sit on top of the sub-DAGs after the partitioning. A term in the middle of the DAG may be traced back to more than one top term. To partition all terms exclusively, a term is assigned to the sub-DAG of the top term with the largest distance to it. If a term has the same largest distance to several top terms, one of them is selected at random.
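The assignment rule above can be illustrated with a minimal base-R sketch on a toy DAG. It does not use the package's internals: the terms, the edge table and the helper `longest_dist()` are hypothetical names introduced only for illustration, and the assumption that a top term is assigned to itself (distance 0) is ours.

```r
# Toy DAG as parent -> child edges (hypothetical terms, not taken from GO).
edges <- data.frame(
  parent = c("root", "root", "a", "c", "b", "b"),
  child  = c("a",    "b",    "c", "d", "d", "e"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: length of the longest directed path from `from` to `to`,
# or -Inf if `to` cannot be reached. Plain recursion, fine for a toy DAG only.
longest_dist <- function(from, to, edges) {
  if (from == to) return(0)
  kids <- edges$child[edges$parent == from]
  if (length(kids) == 0) return(-Inf)
  1 + max(vapply(kids, longest_dist, numeric(1), to = to, edges = edges))
}

terms     <- unique(c(edges$parent, edges$child))
top_terms <- edges$child[edges$parent == "root"]   # terms below the from term

# Matrix of distances: rows are terms, columns are top terms.
dist <- sapply(top_terms, function(top) {
  sapply(terms, longest_dist, from = top, edges = edges)
})

# Assign each term to the top term with the largest distance;
# ties are broken at random, unreachable terms get NA.
partition <- apply(dist, 1, function(d) {
  if (all(is.infinite(d))) return(NA_character_)
  best <- names(d)[d == max(d)]
  if (length(best) > 1) best <- sample(best, 1)
  best
})

print(partition)
```

In this toy DAG the term `d` can be reached from both `a` (distance 2) and `b` (distance 1), so it is assigned to `a`; `root` is not below any top term and gets `NA`.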

In partition_by_size(), the DAG is first reduced to a tree in which each child term has only one parent. The tree is then partitioned recursively by cutting each term into its child-trees. The splitting stops on a term when all of its child-trees are smaller than size.
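The recursive splitting can be sketched in base R on a toy tree. This is only an illustration of the stopping rule described above, not the package's implementation: `children`, `tree_size()` and `find_tops()` are hypothetical names, and the sketch only returns the terms on which the splitting stops.

```r
# Toy tree: every child has exactly one parent (hypothetical terms).
children <- list(
  root = c("a", "b"),
  a    = c("a1", "a2", "a3"),
  b    = c("b1"),
  a1 = character(0), a2 = character(0), a3 = character(0),
  b1 = character(0)
)

# Number of terms in the tree rooted at `term` (the term itself included).
tree_size <- function(term) {
  1 + sum(vapply(children[[term]], tree_size, numeric(1)))
}

# Return the terms on which the splitting stops: a term is kept whole
# if all of its child-trees are smaller than `size`, otherwise it is cut
# into its child-trees and each of them is processed in the same way.
find_tops <- function(term, size) {
  kids <- children[[term]]
  kid_sizes <- vapply(kids, tree_size, numeric(1))
  if (length(kids) == 0 || all(kid_sizes < size)) return(term)
  unlist(lapply(kids, find_tops, size = size), use.names = FALSE)
}

print(find_tops("root", size = 3))
print(find_tops("root", size = 10))
```

With `size = 3` the child-tree under `a` has 4 terms, so `root` is split and the splitting stops on `a` and `b`. With `size = 10` every child-tree of `root` is smaller than `size`, so the whole tree stays in one piece. Note that a tree on which the splitting stops can still be larger than `size`, because the rule only looks at the sizes of its child-trees.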

NA is assigned to the from terms, to their ancestor terms, and to terms that have an infinite directed distance to the from terms, i.e. terms that cannot be reached from them.

Examples

# \donttest{
dag = create_ontology_DAG_from_GO_db()
#> relations: is_a, part_of
pa = partition_by_level(dag)
#> going through 27942 / 27942 nodes ... Done.
table(pa)
#> pa
#> GO:0000003 GO:0002376 GO:0008152 GO:0009987 GO:0016032 GO:0023052 GO:0032501 
#>        238        104       5904       2198         89          1        581 
#> GO:0032502 GO:0040007 GO:0040011 GO:0042592 GO:0043473 GO:0044419 GO:0044848 
#>       3752         19          6        137          5        377         61 
#> GO:0048511 GO:0050896 GO:0051179 GO:0051703 GO:0065007 
#>          3       1938       1760          1      10767 
pa = partition_by_size(dag, size = 1000)
#> going through 27942 / 27942 nodes ... Done.
table(pa)
#> pa
#> GO:0000003 GO:0001748 GO:0001775 GO:0001887 GO:0001906 GO:0002376 GO:0002682 
#>        238          1         82          1          5        104        297 
#> GO:0003360 GO:0006081 GO:0006082 GO:0006091 GO:0006116 GO:0006139 GO:0006276 
#>          1         20        874         68          1       1462          4 
#> GO:0006457 GO:0006730 GO:0006735 GO:0006749 GO:0006790 GO:0006793 GO:0006807 
#>         10          2          1          3         41        166        389 
#> GO:0006810 GO:0006950 GO:0007017 GO:0007049 GO:0007154 GO:0007155 GO:0007163 
#>       1298        174         12         82        112         46         15 
#> GO:0007389 GO:0007552 GO:0007562 GO:0007568 GO:0008037 GO:0008218 GO:0008219 
#>        160          1          2          1          7          1         77 
#> GO:0008283 GO:0009056 GO:0009058 GO:0009292 GO:0009605 GO:0009607 GO:0009628 
#>         50         46         55          5        458         40        123 
#> GO:0009653 GO:0009698 GO:0009719 GO:0009790 GO:0009791 GO:0009812 GO:0009838 
#>        238         63         27        216        104         17          4 
#> GO:0009888 GO:0009889 GO:0009892 GO:0009893 GO:0009894 GO:0010098 GO:0010118 
#>        190        230         63         64         54          1          3 
#> GO:0010191 GO:0010259 GO:0010712 GO:0014823 GO:0014854 GO:0015977 GO:0015979 
#>          8          1          1          4          4          2          1 
#> GO:0016032 GO:0016043 GO:0016049 GO:0016999 GO:0018884 GO:0018893 GO:0018894 
#>         89       1173          4          1          1          1          1 
#> GO:0018930 GO:0018933 GO:0018942 GO:0018955 GO:0018958 GO:0019438 GO:0019439 
#>          1          1         10          1         85          1         12 
#> GO:0019536 GO:0019748 GO:0019835 GO:0021700 GO:0022406 GO:0022611 GO:0023051 
#>          1         26          1          2          6         14          3 
#> GO:0023052 GO:0030029 GO:0031099 GO:0031128 GO:0031137 GO:0031323 GO:0031413 
#>          1         14         24          4          9       1181          1 
#> GO:0031503 GO:0032196 GO:0032259 GO:0032501 GO:0032879 GO:0032963 GO:0033013 
#>         16          3          1        581       1211          1         51 
#> GO:0033036 GO:0033059 GO:0034337 GO:0035188 GO:0035212 GO:0035295 GO:0036166 
#>        380          1          2          2          1         86          3 
#> GO:0040007 GO:0040008 GO:0040011 GO:0040012 GO:0042180 GO:0042221 GO:0042335 
#>         19         81          6        221         31       1104         11 
#> GO:0042430 GO:0042440 GO:0042537 GO:0042558 GO:0042592 GO:0042620 GO:0042752 
#>          9         22         61          6        137          2          4 
#> GO:0043094 GO:0043170 GO:0043335 GO:0043455 GO:0043473 GO:0043696 GO:0043903 
#>          3       1093          1          1          5          6         20 
#> GO:0043934 GO:0044085 GO:0044145 GO:0044238 GO:0044281 GO:0044419 GO:0044848 
#>          2         19          6        756         64        377         61 
#> GO:0045103 GO:0045730 GO:0046483 GO:0046950 GO:0048229 GO:0048511 GO:0048513 
#>          1          1         22          1         27          3        117 
#> GO:0048518 GO:0048519 GO:0048583 GO:0048589 GO:0048647 GO:0048731 GO:0048736 
#>        133        149        533          4          6       2406         54 
#> GO:0048857 GO:0048869 GO:0048870 GO:0050792 GO:0050793 GO:0050794 GO:0051171 
#>          1        268         95         20       1161       3000        394 
#> GO:0051235 GO:0051239 GO:0051301 GO:0051606 GO:0051640 GO:0051641 GO:0051649 
#>         16        435         17          1         46         10          1 
#> GO:0051656 GO:0051674 GO:0051703 GO:0051716 GO:0051775 GO:0052803 GO:0060021 
#>          1          1          1         41          2          5          6 
#> GO:0060255 GO:0060263 GO:0060322 GO:0060352 GO:0061027 GO:0061061 GO:0061842 
#>        515          1          3          1          1          3          3 
#> GO:0061919 GO:0062012 GO:0062097 GO:0065008 GO:0065009 GO:0070085 GO:0070121 
#>          1          9          1        158        400          4          1 
#> GO:0070988 GO:0071554 GO:0071696 GO:0071728 GO:0072059 GO:0072521 GO:0072592 
#>          2          3          4          1          2          4          1 
#> GO:0072593 GO:0080090 GO:0090345 GO:0090399 GO:0090618 GO:0097006 GO:0097354 
#>          7        362          2          1          1          9          1 
#> GO:0097737 GO:0098727 GO:0098743 GO:0099120 GO:0106336 GO:0110051 GO:0120161 
#>          1          7         15         16          1          1          1 
#> GO:0120246 GO:0120252 GO:0120254 GO:0120305 GO:0120312 GO:0140253 GO:0160062 
#>          9         39          5         20          1          1          1 
#> GO:1900558 GO:1900594 GO:1900597 GO:1900619 GO:1900819 GO:1900908 GO:1901125 
#>          3          1          1          4          1          2          3 
#> GO:1901135 GO:1901275 GO:1901286 GO:1901360 GO:1901440 GO:1901615 GO:1901764 
#>        134          1          1          3         10        163          1 
#> GO:1901902 GO:1902421 GO:1903664 GO:1903700 GO:1903701 GO:1903866 GO:1903867 
#>          1          1          1          1          1          1          5 
#> GO:1904817 GO:1904888 GO:1905328 GO:1990045 GO:1990845 GO:2000241 GO:2001290 
#>          4          1          1          1          2         14          1 
# }