adjust_matrix.Rd
Remove rows with low variance and impute missing values
adjust_matrix(m, sd_quantile = 0.05, max_na = 0.25, verbose = TRUE)
m: A numeric matrix.
sd_quantile: Quantile cutoff for the row standard deviations. Rows whose standard deviation is at or below this quantile are removed.
max_na: Maximum fraction of NA values allowed in a row. Rows with a larger NA fraction are removed.
verbose: Whether to print messages.
The function uses impute.knn to impute missing values, then uses adjust_outlier to adjust outliers, and finally removes rows with low standard deviations.
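The two row filters can be illustrated in base R. The sketch below is not the package's implementation: `filter_rows_sketch` is a hypothetical helper that skips the imputation and outlier-adjustment steps and only shows how `max_na` and `sd_quantile` might be applied. It assumes that constant rows are dropped before the quantile cutoff is computed, as the messages printed in the example suggest.

```r
# Hypothetical helper, not part of the package: illustrates the row filters only.
# Assumes max_na < 1, so no row consisting entirely of NA values survives the first filter.
filter_rows_sketch = function(m, sd_quantile = 0.05, max_na = 0.25) {
    # Drop rows whose fraction of NA values exceeds max_na.
    na_frac = rowMeans(is.na(m))
    m = m[na_frac <= max_na, , drop = FALSE]

    # Row standard deviations, ignoring any remaining NA values.
    row_sd = apply(m, 1, sd, na.rm = TRUE)

    # Drop constant rows first (assumption: this happens before the quantile step).
    keep = row_sd > 0
    m = m[keep, , drop = FALSE]
    row_sd = row_sd[keep]

    # Drop rows whose standard deviation is at or below the quantile cutoff.
    cutoff = quantile(row_sd, sd_quantile)
    m[row_sd > cutoff, , drop = FALSE]
}

set.seed(123)
m = matrix(rnorm(100), nrow = 10)
m[1, ] = 0                      # a constant row, removed by the zero-variance filter
dim(filter_rows_sketch(m))
```

Because the cutoff is a quantile of the observed row standard deviations, at least the row with the smallest standard deviation is removed whenever `sd_quantile` is greater than 0.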
A numeric matrix with missing values imputed, outliers adjusted, and low-variance rows removed.
set.seed(123)
m = matrix(rnorm(100), nrow = 10)
m[sample(length(m), 5)] = NA
m[1, ] = 0
m
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 0.00000000 0.00000000 0.0000000 0.0000000 0.00000000 0.0000000
#> [2,] 0.70610908 -0.01260453 0.3207204 NA 1.71798542 -1.4614942
#> [3,] 1.48902132 0.17625437 -0.7470537 NA -0.32774780 0.3494173
#> [4,] -1.81509255 -1.58368027 -0.2179812 1.0140337 0.39628362 1.0456287
#> [5,] 0.33040958 0.46779179 -0.5137169 1.5790067 NA -0.6271371
#> [6,] -1.14215571 1.19461175 0.6018131 0.6687560 0.01330582 0.3957078
#> [7,] 0.15719342 0.77458193 -1.5498219 -0.3535221 -0.63898639 -1.2075114
#> [8,] -2.06540724 0.08710445 -1.7096228 -1.4058683 2.24830001 0.9457638
#> [9,] -0.44054688 NA 0.7701488 -2.4582271 0.06632788 NA
#> [10,] 0.00395328 1.21997897 -0.7168730 2.8895694 0.03159166 -0.2721696
#> [,7] [,8] [,9] [,10]
#> [1,] 0.000000000 0.00000000 0.00000000 0.000000000
#> [2,] -1.360684436 -1.23465746 -1.56740996 0.008486843
#> [3,] -0.320849537 0.04221836 -0.61225871 0.773146260
#> [4,] -1.123577619 -0.79198362 -0.29797771 -1.151920752
#> [5,] 1.052020871 -0.38886174 0.34063680 0.862577468
#> [6,] -1.036255798 -0.74227068 0.13373072 0.566374004
#> [7,] 1.114446832 0.77322966 0.86266186 -0.653870279
#> [8,] -0.530456611 0.68250298 0.05063779 0.075560593
#> [9,] 0.001133325 -0.21793606 1.22458533 0.557868206
#> [10,] -1.231623777 -0.63878196 0.01722424 -1.098786692
m2 = adjust_matrix(m)
#> There are NA values in the data, now impute missing data.
#> 1 rows have been removed with zero variance.
#> 1 rows have been removed with too low variance (sd <= 0.05 quantile)
m2
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 0.70610908 -0.01260453 0.3207204 0.1352084 1.26264107 -1.46149424
#> [2,] -1.71095702 -1.58368027 -0.2179812 1.0140337 0.39628362 1.03141096
#> [3,] 0.33040958 0.46779179 -0.5137169 1.3418631 0.63966296 -0.57609802
#> [4,] -1.09450075 0.95797664 0.6018131 0.6687560 0.01330582 0.39570778
#> [5,] 0.15719342 0.77458193 -1.3957821 -0.3535221 -0.63898639 -1.20751138
#> [6,] -1.90530426 0.08710445 -1.7096228 -1.4058683 1.66215873 0.94576382
#> [7,] -0.44054688 0.23279917 0.7701488 -1.5502710 0.06632788 -0.08002031
#> [8,] 0.00395328 1.21997897 -0.7168730 2.1382537 0.03159166 -0.27216958
#> [,7] [,8] [,9] [,10]
#> [1,] -1.360684436 -1.2346575 -1.51974788 0.008486843
#> [2,] -1.123577619 -0.7919836 -0.29797771 -1.151920752
#> [3,] 1.052020871 -0.3888617 0.34063680 0.862577468
#> [4,] -1.036255798 -0.7422707 0.13373072 0.566374004
#> [5,] 1.001143595 0.7732297 0.86266186 -0.653870279
#> [6,] -0.530456611 0.6825030 0.05063779 0.075560593
#> [7,] 0.001133325 -0.2179361 1.02008889 0.557868206
#> [8,] -1.171847089 -0.6387820 0.01722424 -1.098786692