R language learning: common errors, causes and solutions

Posted by djwinder on Tue, 09 Nov 2021 08:04:03 +0100

 

1. Values can only be assigned to variables of a compatible data type

Error: a conditional assignment to a variable in a data frame fails with the following message:

Error in x[...] <- m : invalid subscript type 'builtin'
In addition: Warning message:
In `[<-.factor`(`*tmp*`, is.na, value = 0) :
  invalid factor level, NA generated

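Reason: when data is imported from Excel into a data.frame and a column contains text or logical values alongside numbers, the column is read as character and, with stringsAsFactors = TRUE (the default before R 4.0.0), stored as a factor. A factor only accepts values that are already among its levels, so assigning any other value produces NA with the warning shown above. The "invalid subscript type 'builtin'" part of the message points to a second problem: the call in the warning shows the function is.na itself being used as the subscript, where a logical index such as is.na(x) was presumably intended.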

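Solution: convert the variable with a function such as as.character() or as.numeric() before performing the assignment. For a factor whose labels are numbers, use as.numeric(as.character(x)), because as.numeric() applied directly to a factor returns the internal level codes rather than the values.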
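The following is a minimal sketch of the problem and the fix on a toy vector. The values in x and f are made up for illustration and are not taken from the water quality data.

```r
# Toy column (hypothetical data) mixing numbers and text, stored as a factor
x <- factor(c("1.5", "<0.5", NA, "2.0"))

# Assigning a value that is not an existing level leaves NA and raises
# the warning "invalid factor level, NA generated"
y <- x
y[is.na(y)] <- 0
print(y)

# Convert to character first, then assign
z <- as.character(x)
z[is.na(z)] <- "0"
print(z)

# as.numeric() on a factor returns the internal level codes, not the labels,
# so go through as.character() when the labels are numbers
f <- factor(c("10", "20", "10"))
codes  <- as.numeric(f)
values <- as.numeric(as.character(f))
print(codes)
print(values)
```

Note that the index is written as is.na(y), a logical vector, and not as the bare function name is.na.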

To convert the data type of several columns of a data frame at once, the sapply and lapply functions can be combined.

sapply simplifies its result to a vector (or matrix) where possible. A logical vector obtained this way can be used as the column index of the data frame, which achieves the conditional filtering.

lapply always returns a list. Applied with a type conversion function as.XX, it yields the converted columns, which can then be assigned back into the data frame.
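The difference between the two return types can be checked on a small data frame. This is only a sketch: df and its column names are hypothetical, and stringsAsFactors = TRUE is set explicitly so that the behaviour does not depend on the R version.

```r
# Hypothetical three-row data frame with two factor columns and one numeric
df <- data.frame(
  site = c("A", "B", "C"),
  tp   = c("<0.01", "0.02", "0.03"),
  tn   = c(1.03, 1.05, 1.09),
  stringsAsFactors = TRUE
)

# sapply() simplifies to a named logical vector, usable as a column index
is_fac <- sapply(df, is.factor)
print(is_fac)

# lapply() returns a list with one converted vector per selected column
converted <- lapply(df[is_fac], as.character)
str(converted)

# Assign the list back into the selected columns
df[is_fac] <- converted
str(df)
```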

 

Example:

1. Import 20 rows of water quality monitoring data from the clipboard

wq <- read.table("clipboard",sep="\t",header=T)
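Reading from the clipboard cannot be reproduced by the reader, so here is a small sketch that uses the text argument of read.table instead. The tab-separated string txt is invented for illustration. It also shows the effect of stringsAsFactors, whose default changed from TRUE to FALSE in R 4.0.0, so the Factor columns shown in the str() output of this post only appear on older versions or when the argument is set explicitly.

```r
# Hypothetical tab-separated data with a "<" value in the TP column
txt <- "site\tTP\tTN\nA\t<0.01\t1.03\nB\t0.02\t1.05\n"

# Behaviour of R before 4.0.0: text columns become factors
old_style <- read.table(text = txt, sep = "\t", header = TRUE,
                        stringsAsFactors = TRUE)

# Behaviour of R 4.0.0 and later: text columns stay character
new_style <- read.table(text = txt, sep = "\t", header = TRUE,
                        stringsAsFactors = FALSE)

str(old_style)
str(new_style)
```

With stringsAsFactors = FALSE the factor-to-character conversion step is not needed at all.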

 

View the structure of the wq dataset

> str(wq)
'data.frame':	20 obs. of  37 variables:
 $ Station name: Factor w/ 20 levels "Bashang (Longwang Temple)",..: 2 17 5 4 15 13 9 18 14 11 ...
 $ Station code: int  99930128 99930133 99930129 99930130 99910136 99930134 99930131 99910124 99910126 99910125 ...
 $ Vertical line number: int  1 1 1 1 1 1 1 1 1 1 ...
 $ Layer number: int  1 1 1 1 1 1 1 1 1 1 ...
 $ Water type: int  1 1 1 1 3 1 3 1 1 NA ...
 $ Sampling time: Factor w/ 1 level "2021/1/1 0:00": 1 1 1 1 1 1 1 1 1 1 ...
 $ WT      : num  10.2 11.4 11.9 12 8.2 11 11.9 5.6 4.5 6.9 ...
 $ PH      : num  8.2 7.7 7.7 7.7 8.1 7.7 7.6 8 8 8 ...
 $ DOX     : num  9.4 9.5 9.6 9.7 11.7 10.1 9.2 14.1 12.8 13.2 ...
 $ CODMN   : num  1.65 1.55 1.52 1.59 1.8 1.57 1.57 1.1 0.8 0.9 ...
 $ CODCR   : Factor w/ 14 levels "<2.3","2.5","2.6",..: 7 13 5 13 3 11 6 10 1 1 ...
 $ BOD5    : Factor w/ 11 levels "<0.5","0.1","0.4",..: 1 1 1 1 8 3 2 8 7 6 ...
 $ NH3N    : Factor w/ 14 levels "<0.025","0.022",..: 2 6 4 10 3 10 13 1 1 1 ...
 $ TP      : Factor w/ 6 levels "<0.01","0.01",..: 1 1 1 2 3 1 1 5 4 3 ...
 $ TN      : num  1.03 1.05 1.09 1.08 1.04 1.15 1.44 3.91 1.8 2.24 ...
 $ CU      : Factor w/ 3 levels "<0.002","0.002",..: 1 1 1 1 1 1 1 3 2 2 ...
 $ ZN      : Factor w/ 11 levels "<0.0006","0.0006",..: 1 1 1 10 1 1 1 3 2 6 ...
 $ F       : num  0.15 0.21 0.13 0.19 0.13 0.19 0.2 0.19 0.13 0.12 ...
 $ SE      : Factor w/ 1 level "<0.00041": 1 1 1 1 1 1 1 1 1 1 ...
 $ ARS     : Factor w/ 13 levels "<0.00012","0.00016",..: 7 9 3 9 5 12 4 13 6 6 ...
 $ HG      : Factor w/ 1 level "<0.00001": 1 1 1 1 1 1 1 1 1 1 ...
 $ CD      : Factor w/ 1 level "<0.0005": 1 1 1 1 1 1 1 1 1 1 ...
 $ CR6     : Factor w/ 1 level "<0.004": 1 1 1 1 1 1 1 1 1 1 ...
 $ PB      : Factor w/ 1 level "<0.004": 1 1 1 1 1 1 1 1 1 1 ...
 $ CN      : Factor w/ 1 level "<0.001": 1 1 1 1 1 1 1 1 1 1 ...
 $ VLPH    : Factor w/ 5 levels "<0.0003","0.0005",..: 1 1 1 1 1 1 1 4 2 3 ...
 $ OIL     : Factor w/ 1 level "<0.01": 1 1 1 1 1 1 1 1 1 1 ...
 $ LAS     : Factor w/ 1 level "<0.05": 1 1 1 1 1 1 1 1 1 1 ...
 $ S2      : Factor w/ 1 level "<0.005": 1 1 1 1 1 1 1 1 1 1 ...
 $ FCG     : Factor w/ 14 levels "<10","10","100",..: 8 1 1 2 1 1 13 1 12 7 ...
 $ SO4     : Factor w/ 16 levels "18.41","18.66",..: 14 6 1 13 11 7 10 16 16 16 ...
 $ CL      : Factor w/ 16 levels "2.79","2.83",..: 15 10 7 14 2 5 13 16 16 16 ...
 $ NO3     : Factor w/ 14 levels "0.92","0.94",..: 4 2 2 2 1 5 10 14 14 14 ...
 $ FE      : Factor w/ 15 levels "0.0536","0.054",..: 2 5 3 1 4 5 7 15 15 15 ...
 $ MN      : Factor w/ 14 levels "0.002","0.0035",..: 2 6 3 4 1 2 12 14 14 14 ...
 $ CLARITY : Factor w/ 13 levels "1","1.2","1.5",..: 9 10 11 12 6 12 8 13 13 13 ...
 $ CHLA    : Factor w/ 11 levels "0.39","1.13",..: 5 9 7 6 10 8 3 11 11 11 ...

You can see that the data frame wq contains integer, numeric and factor variables

2. Replace every "Not monitored" entry in the monitoring data with NA

Because wq contains factor columns, the conditional assignment takes two steps: first convert the data type, then assign the value

(1) Convert the factor columns in dataset wq to character type

> wq[sapply(wq,is.factor)] <- lapply(wq[sapply(wq,is.factor)],as.character)
> str(wq)
'data.frame':	20 obs. of  37 variables:
 $ Station name: chr  "Baidu Beach" "Zhang Zhai, Xianghua town" "Danku Center" "Cangfang town-Zhao gou" ...
 $ Station code: int  99930128 99930133 99930129 99930130 99910136 99930134 99930131 99910124 99910126 99910125 ...
 $ Vertical line number: int  1 1 1 1 1 1 1 1 1 1 ...
 $ Layer number: int  1 1 1 1 1 1 1 1 1 1 ...
 $ Water type: int  1 1 1 1 3 1 3 1 1 NA ...
 $ Sampling time: chr  "2021/1/1 0:00" "2021/1/1 0:00" "2021/1/1 0:00" "2021/1/1 0:00" ...
 $ WT      : num  10.2 11.4 11.9 12 8.2 11 11.9 5.6 4.5 6.9 ...
 $ PH      : num  8.2 7.7 7.7 7.7 8.1 7.7 7.6 8 8 8 ...
 $ DOX     : num  9.4 9.5 9.6 9.7 11.7 10.1 9.2 14.1 12.8 13.2 ...
 $ CODMN   : num  1.65 1.55 1.52 1.59 1.8 1.57 1.57 1.1 0.8 0.9 ...
 $ CODCR   : chr  "4.2" "7.2" "3.1" "7.2" ...
 $ BOD5    : chr  "<0.5" "<0.5" "<0.5" "<0.5" ...
 $ NH3N    : chr  "0.022" "0.054" "0.045" "0.112" ...
 $ TP      : chr  "<0.01" "<0.01" "<0.01" "0.01" ...
 $ TN      : num  1.03 1.05 1.09 1.08 1.04 1.15 1.44 3.91 1.8 2.24 ...
 $ CU      : chr  "<0.002" "<0.002" "<0.002" "<0.002" ...
 $ ZN      : chr  "<0.0006" "<0.0006" "<0.0006" "0.0028" ...
 $ F       : num  0.15 0.21 0.13 0.19 0.13 0.19 0.2 0.19 0.13 0.12 ...
 $ SE      : chr  "<0.00041" "<0.00041" "<0.00041" "<0.00041" ...
 $ ARS     : chr  "0.00041" "0.00049" "0.00027" "0.00049" ...
 $ HG      : chr  "<0.00001" "<0.00001" "<0.00001" "<0.00001" ...
 $ CD      : chr  "<0.0005" "<0.0005" "<0.0005" "<0.0005" ...
 $ CR6     : chr  "<0.004" "<0.004" "<0.004" "<0.004" ...
 $ PB      : chr  "<0.004" "<0.004" "<0.004" "<0.004" ...
 $ CN      : chr  "<0.001" "<0.001" "<0.001" "<0.001" ...
 $ VLPH    : chr  "<0.0003" "<0.0003" "<0.0003" "<0.0003" ...
 $ OIL     : chr  "<0.01" "<0.01" "<0.01" "<0.01" ...
 $ LAS     : chr  "<0.05" "<0.05" "<0.05" "<0.05" ...
 $ S2      : chr  "<0.005" "<0.005" "<0.005" "<0.005" ...
 $ FCG     : chr  "30" "<10" "<10" "10" ...
 $ SO4     : chr  "23.83" "20.27" "18.41" "22.69" ...
 $ CL      : chr  "5.66" "3.66" "3.27" "4.31" ...
 $ NO3     : chr  "0.98" "0.94" "0.94" "0.94" ...
 $ FE      : chr  "0.054" "0.0637" "0.0547" "0.0536" ...
 $ MN      : chr  "0.0035" "0.0062" "0.0036" "0.0039" ...
 $ CLARITY : chr  "2.7" "3" "3.7" "4" ...
 $ CHLA    : chr  "2.29" "3.05" "2.66" "2.32" ...

At this point, you can see through the str() function that all the factor type variables in the wq data frame have been converted to character type
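With 37 columns, scanning the str() output by eye is tedious. A more compact check, sketched here on a hypothetical two-row data frame df, is to collect the class of every column with sapply:

```r
# Hypothetical data frame with two factor columns and one numeric column
df <- data.frame(site = factor(c("A", "B")),
                 tp   = factor(c("<0.01", "0.02")),
                 tn   = c(1.03, 1.05))

# Same conversion as above: factor columns become character
df[sapply(df, is.factor)] <- lapply(df[sapply(df, is.factor)], as.character)

# One class name per column, and a single TRUE/FALSE answer
col_classes <- sapply(df, class)
print(col_classes)
any_factor_left <- any(sapply(df, is.factor))
print(any_factor_left)
```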

(2) Perform the conditional assignment

> wq[wq=="Not monitored"]=NA
> wq
         Station name station code vertical line number layer number water body type      Sampling time   WT  PH  DOX CODMN CODCR BOD5
1          Baidu beach 99930128        1        1        1 2021/1/1 0:00 10.2 8.2  9.4  1.65   4.2 <0.5
2      Xianghua town Zhangzhai 99930133        1        1        1 2021/1/1 0:00 11.4 7.7  9.5  1.55   7.2 <0.5
3        Danku center 99930129        1        1        1 2021/1/1 0:00 11.9 7.7  9.6  1.52   3.1 <0.5
4     Cangfang town-Zhaogou 99930130        1        1        1 2021/1/1 0:00 12.0 7.7  9.7  1.59   7.2 <0.5
5            Taocha 99910136        1        1        3 2021/1/1 0:00  8.2 8.1 11.7  1.80   2.6  0.9
6          Qingquangou 99930134        1        1        1 2021/1/1 0:00 11.0 7.7 10.1  1.57   5.3  0.4
7   Liangshui River-Taizishan 99930131        1        1        3 2021/1/1 0:00 11.9 7.6  9.2  1.57   4.1  0.1
8        Xianghe estuary 99910124        1        1        1 2021/1/1 0:00  5.6 8.0 14.1  1.10     5  0.9
9        Taohe estuary 99910126        1        1        1 2021/1/1 0:00  4.5 8.0 12.8  0.80  <2.3  0.8
10       Qihe estuary 99910125        1        1       NA 2021/1/1 0:00  6.9 8.0 13.2  0.90  <2.3  0.7
11     Laostork River Estuary 99910127        1        1       NA 2021/1/1 0:00  2.9 7.9 13.1  2.80   7.4    1
12   In front of the mountain in liupo town 99930108        1        1       NA 2021/1/1 0:00  9.1 7.9  9.5  1.49  <2.3  0.6
13       Hankou center 99930109        1        1       NA 2021/1/1 0:00  9.5 8.0  9.3  1.42     3 <0.5
14      lush mountain-Anyang 99930112        1        1       NA 2021/1/1 0:00  9.8 7.6  9.6  1.44   2.5  1.1
15       Yuanhe estuary 99930114        1        1       NA 2021/1/1 0:00 10.5 7.6  9.4  1.79  <2.3 <0.5
16  Wudang Mountain-Santang Bay 99930117        1        1       NA 2021/1/1 0:00 12.2 7.6  9.3  1.76  <2.3  0.5
17      Xiao Chuan-Longkou 99930118        1        1       NA 2021/1/1 0:00 11.7 7.6  9.0  1.88     5  1.6
18       Under langhekou 99930121        1        1       NA 2021/1/1 0:00 12.2 7.6  9.1  1.81   5.6 <0.5
19 Bashang (Longwang Temple) 99930123        1        1       NA 2021/1/1 0:00 12.2 7.5  7.8  1.88   4.4 <0.5
20 Baihe estuary (left) 99910102        1        1       NA 2021/1/1 0:00  8.4 8.2 13.4  1.50   4.7  0.8
     NH3N    TP   TN     CU      ZN    F       SE      ARS       HG      CD    CR6     PB     CN    VLPH
1   0.022 <0.01 1.03 <0.002 <0.0006 0.15 <0.00041  0.00041 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
2   0.054 <0.01 1.05 <0.002 <0.0006 0.21 <0.00041  0.00049 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
3   0.045 <0.01 1.09 <0.002 <0.0006 0.13 <0.00041  0.00027 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
4   0.112  0.01 1.08 <0.002  0.0028 0.19 <0.00041  0.00049 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
5    0.04  0.02 1.04 <0.002 <0.0006 0.13 <0.00041  0.00036 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
6   0.112 <0.01 1.15 <0.002 <0.0006 0.19 <0.00041  0.00095 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
7   0.141 <0.01 1.44 <0.002 <0.0006 0.20 <0.00041  0.00029 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
8  <0.025  0.04 3.91  0.003  0.0011 0.19 <0.00041    0.001 <0.00001 <0.0005 <0.004 <0.004 <0.001   0.001
9  <0.025  0.03 1.80  0.002  0.0006 0.13 <0.00041   0.0004 <0.00001 <0.0005 <0.004 <0.004 <0.001  0.0005
10 <0.025  0.02 2.24  0.002  0.0016 0.12 <0.00041   0.0004 <0.00001 <0.0005 <0.004 <0.004 <0.001  0.0006
11  0.118  0.08 5.94  0.003   0.002 0.17 <0.00041   0.0005 <0.00001 <0.0005 <0.004 <0.004 <0.001  0.0017
12  0.048  0.02 1.45 <0.002  0.0015 0.15 <0.00041 <0.00012 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
13  0.129  0.02 1.58 <0.002 <0.0006 0.18 <0.00041 <0.00012 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
14    0.1  0.02 1.46 <0.002  0.0022 0.14 <0.00041 <0.00012 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
15   0.06  0.02 1.44 <0.002 <0.0006 0.21 <0.00041 <0.00012 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
16  0.141  0.02 1.42 <0.002 <0.0006 0.13 <0.00041 <0.00012 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
17  0.118  0.01 1.42 <0.002 <0.0006 0.11 <0.00041 <0.00012 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
18  0.094  0.02 1.26 <0.002  0.0012 0.13 <0.00041  0.00016 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
19    0.1  0.01 1.06 <0.002  0.0021 0.17 <0.00041  0.00044 <0.00001 <0.0005 <0.004 <0.004 <0.001 <0.0003
20  0.163  0.04 1.50  0.002  0.0041 0.15 <0.00041   0.0007 <0.00001 <0.0005 <0.004 <0.004 <0.001  0.0005
     OIL   LAS     S2 FCG   SO4   CL  NO3     FE     MN CLARITY CHLA
1  <0.01 <0.05 <0.005  30 23.83 5.66 0.98  0.054 0.0035     2.7 2.29
2  <0.01 <0.05 <0.005 <10 20.27 3.66 0.94 0.0637 0.0062       3 3.05
3  <0.01 <0.05 <0.005 <10 18.41 3.27 0.94 0.0547 0.0036     3.7 2.66
4  <0.01 <0.05 <0.005  10 22.69 4.31 0.94 0.0536 0.0039       4 2.32
5  <0.01 <0.05 <0.005 <10 21.63 2.83 0.92 0.0582  0.002       2 3.45
6  <0.01 <0.05 <0.005 <10 20.48 3.11 0.99 0.0637 0.0035       4 2.71
7  <0.01 <0.05 <0.005  60 21.48 4.14 1.27 0.0828 0.0131     2.5 1.53
8  <0.01 <0.05 <0.005 <10  <NA> <NA> <NA>   <NA>   <NA>    <NA> <NA>
9  <0.01 <0.05 <0.005  52  <NA> <NA> <NA>   <NA>   <NA>    <NA> <NA>
10 <0.01 <0.05 <0.005  20  <NA> <NA> <NA>   <NA>   <NA>    <NA> <NA>
11 <0.01 <0.05 <0.005 390  <NA> <NA> <NA>   <NA>   <NA>    <NA> <NA>
12 <0.01 <0.05 <0.005  40 18.95 3.69  1.3  0.067 0.0119     2.5 2.32
13 <0.01 <0.05 <0.005 110 18.66 2.86 1.39 0.1099 0.0119     2.1 2.09
14 <0.01 <0.05 <0.005 160 20.24 3.44 1.32 0.1335  0.009     1.5 1.53
15 <0.01 <0.05 <0.005  80 26.38 2.79  1.2 0.1434 0.0162     1.8 1.53
16 <0.01 <0.05 <0.005 <10 21.28 3.21 1.19  0.167 0.0081     1.2 0.39
17 <0.01 <0.05 <0.005 100 18.85 3.89 1.24 0.1158 0.0045     1.5 1.13
18 <0.01 <0.05 <0.005  40 20.52 3.28 1.13 0.1415 0.0078       1 1.13
19 <0.01 <0.05 <0.005 190 22.07 2.84 0.97 0.1034 0.0122     1.7 1.13
20 <0.01 <0.05 <0.005 380  <NA> <NA> <NA>   <NA>   <NA>    <NA> <NA>

It can be seen that every "Not monitored" entry has become NA. If you are not sure, you can verify this with is.na()

> is.na(wq)
      Station name station code vertical line number layer number water body type sampling time    WT    PH   DOX CODMN CODCR  BOD5  NH3N
 [1,]    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [2,]    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [3,]    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [4,]    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [5,]    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [6,]    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [7,]    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [8,]    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [9,]    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[10,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[11,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[12,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[14,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[15,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[16,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[17,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[18,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[19,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[20,]    FALSE    FALSE    FALSE    FALSE     TRUE    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
         TP    TN    CU    ZN     F    SE   ARS    HG    CD   CR6    PB    CN  VLPH   OIL   LAS    S2
 [1,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [2,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [3,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [4,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [5,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [6,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [7,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [8,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [9,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[10,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[11,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[12,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[14,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[15,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[16,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[17,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[18,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[19,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[20,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
        FCG   SO4    CL   NO3    FE    MN CLARITY  CHLA
 [1,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
 [2,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
 [3,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
 [4,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
 [5,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
 [6,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
 [7,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
 [8,] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE    TRUE  TRUE
 [9,] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE    TRUE  TRUE
[10,] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE    TRUE  TRUE
[11,] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE    TRUE  TRUE
[12,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
[13,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
[14,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
[15,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
[16,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
[17,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
[18,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
[19,] FALSE FALSE FALSE FALSE FALSE FALSE   FALSE FALSE
[20,] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE    TRUE  TRUE
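The full logical matrix is long to read. As a sketch of a shorter check, the replacement and a per-column NA count can be run on a small data frame. Here df and its column names are hypothetical stand-ins for wq.

```r
# Hypothetical character data frame with "Not monitored" placeholders
df <- data.frame(
  site = c("A", "B", "C"),
  so4  = c("23.83", "Not monitored", "18.41"),
  cl   = c("5.66", "Not monitored", "3.27"),
  stringsAsFactors = FALSE
)

# Conditional assignment: the comparison yields a logical matrix used as index
df[df == "Not monitored"] <- NA

# Number of NA values per column instead of the full TRUE/FALSE matrix
na_per_column <- colSums(is.na(df))
print(na_per_column)

# Row and column positions of each NA
print(which(is.na(df), arr.ind = TRUE))
```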

  

3. Process data less than the detection limit (i.e. "< XX") and replace all such data with 0

Use the sapply() and gsub() functions to replace text in batch

The sapply() function loops over the columns and simplifies the result to a vector or matrix where possible

The gsub() function replaces every match of a regular expression in each element of a character vector; here it is used to build the conditional substitution

> sapply(wq,gsub,pattern="<[0-9]+\\.?[0-9]*",replacement="0")
      Station name         Station code vertical line number layer number water body type sampling time        WT     PH    DOX    CODMN 
 [1,] "Baidu Beach"         "99930128" "1"      "1"      "1"      "2021/1/1 0:00" "10.2" "8.2" "9.4"  "1.65"
 [2,] "Zhang Zhai, Xianghua town"     "99930133" "1"      "1"      "1"      "2021/1/1 0:00" "11.4" "7.7" "9.5"  "1.55"
 [3,] "Danku Center"       "99930129" "1"      "1"      "1"      "2021/1/1 0:00" "11.9" "7.7" "9.6"  "1.52"
 [4,] "Cangfang town-Zhao gou"    "99930130" "1"      "1"      "1"      "2021/1/1 0:00" "12"   "7.7" "9.7"  "1.59"
 [5,] "Taocha"           "99910136" "1"      "1"      "3"      "2021/1/1 0:00" "8.2"  "8.1" "11.7" "1.8" 
 [6,] "Qingquangou"         "99930134" "1"      "1"      "1"      "2021/1/1 0:00" "11"   "7.7" "10.1" "1.57"
 [7,] "Liangshui River-Taizi mountain"  "99930131" "1"      "1"      "3"      "2021/1/1 0:00" "11.9" "7.6" "9.2"  "1.57"
 [8,] "Xianghe Estuary"       "99910124" "1"      "1"      "1"      "2021/1/1 0:00" "5.6"  "8"   "14.1" "1.1" 
 [9,] "Taohe Estuary"       "99910126" "1"      "1"      "1"      "2021/1/1 0:00" "4.5"  "8"   "12.8" "0.8" 
[10,] "Qihe Estuary"       "99910125" "1"      "1"      NA       "2021/1/1 0:00" "6.9"  "8"   "13.2" "0.9" 
[11,] "Laostork River Estuary"     "99910127" "1"      "1"      NA       "2021/1/1 0:00" "2.9"  "7.9" "13.1" "2.8" 
[12,] "In front of the mountain in liupo town"   "99930108" "1"      "1"      NA       "2021/1/1 0:00" "9.1"  "7.9" "9.5"  "1.49"
[13,] "Hanku Center"       "99930109" "1"      "1"      NA       "2021/1/1 0:00" "9.5"  "8"   "9.3"  "1.42"
[14,] "lush mountain-Anyang"      "99930112" "1"      "1"      NA       "2021/1/1 0:00" "9.8"  "7.6" "9.6"  "1.44"
[15,] "Yuanhe Estuary"       "99930114" "1"      "1"      NA       "2021/1/1 0:00" "10.5" "7.6" "9.4"  "1.79"
[16,] "Wudang Mountain-Santang Bay"  "99930117" "1"      "1"      NA       "2021/1/1 0:00" "12.2" "7.6" "9.3"  "1.76"
[17,] "Xiao Chuan-Longkou"      "99930118" "1"      "1"      NA       "2021/1/1 0:00" "11.7" "7.6" "9"    "1.88"
[18,] "Under langhekou"       "99930121" "1"      "1"      NA       "2021/1/1 0:00" "12.2" "7.6" "9.1"  "1.81"
[19,] "Bashang (Longwang Temple)" "99930123" "1"      "1"      NA       "2021/1/1 0:00" "12.2" "7.5" "7.8"  "1.88"
[20,] "Baihe estuary (left)" "99910102" "1"      "1"      NA       "2021/1/1 0:00" "8.4"  "8.2" "13.4" "1.5" 
      CODCR BOD5  NH3N    TP     TN     CU      ZN       F      SE  ARS       HG  CD  CR6 PB  CN 
 [1,] "4.2" "0"   "0.022" "0"    "1.03" "0"     "0"      "0.15" "0" "0.00041" "0" "0" "0" "0" "0"
 [2,] "7.2" "0"   "0.054" "0"    "1.05" "0"     "0"      "0.21" "0" "0.00049" "0" "0" "0" "0" "0"
 [3,] "3.1" "0"   "0.045" "0"    "1.09" "0"     "0"      "0.13" "0" "0.00027" "0" "0" "0" "0" "0"
 [4,] "7.2" "0"   "0.112" "0.01" "1.08" "0"     "0.0028" "0.19" "0" "0.00049" "0" "0" "0" "0" "0"
 [5,] "2.6" "0.9" "0.04"  "0.02" "1.04" "0"     "0"      "0.13" "0" "0.00036" "0" "0" "0" "0" "0"
 [6,] "5.3" "0.4" "0.112" "0"    "1.15" "0"     "0"      "0.19" "0" "0.00095" "0" "0" "0" "0" "0"
 [7,] "4.1" "0.1" "0.141" "0"    "1.44" "0"     "0"      "0.2"  "0" "0.00029" "0" "0" "0" "0" "0"
 [8,] "5"   "0.9" "0"     "0.04" "3.91" "0.003" "0.0011" "0.19" "0" "0.001"   "0" "0" "0" "0" "0"
 [9,] "0"   "0.8" "0"     "0.03" "1.8"  "0.002" "0.0006" "0.13" "0" "0.0004"  "0" "0" "0" "0" "0"
[10,] "0"   "0.7" "0"     "0.02" "2.24" "0.002" "0.0016" "0.12" "0" "0.0004"  "0" "0" "0" "0" "0"
[11,] "7.4" "1"   "0.118" "0.08" "5.94" "0.003" "0.002"  "0.17" "0" "0.0005"  "0" "0" "0" "0" "0"
[12,] "0"   "0.6" "0.048" "0.02" "1.45" "0"     "0.0015" "0.15" "0" "0"       "0" "0" "0" "0" "0"
[13,] "3"   "0"   "0.129" "0.02" "1.58" "0"     "0"      "0.18" "0" "0"       "0" "0" "0" "0" "0"
[14,] "2.5" "1.1" "0.1"   "0.02" "1.46" "0"     "0.0022" "0.14" "0" "0"       "0" "0" "0" "0" "0"
[15,] "0"   "0"   "0.06"  "0.02" "1.44" "0"     "0"      "0.21" "0" "0"       "0" "0" "0" "0" "0"
[16,] "0"   "0.5" "0.141" "0.02" "1.42" "0"     "0"      "0.13" "0" "0"       "0" "0" "0" "0" "0"
[17,] "5"   "1.6" "0.118" "0.01" "1.42" "0"     "0"      "0.11" "0" "0"       "0" "0" "0" "0" "0"
[18,] "5.6" "0"   "0.094" "0.02" "1.26" "0"     "0.0012" "0.13" "0" "0.00016" "0" "0" "0" "0" "0"
[19,] "4.4" "0"   "0.1"   "0.01" "1.06" "0"     "0.0021" "0.17" "0" "0.00044" "0" "0" "0" "0" "0"
[20,] "4.7" "0.8" "0.163" "0.04" "1.5"  "0.002" "0.0041" "0.15" "0" "0.0007"  "0" "0" "0" "0" "0"
      VLPH     OIL LAS S2  FCG   SO4      CL       NO3      FE       MN       CLARITY  CHLA    
 [1,] "0"      "0" "0" "0" "30"  "23.83"  "5.66"   "0.98"   "0.054"  "0.0035" "2.7"    "2.29"  
 [2,] "0"      "0" "0" "0" "0"   "20.27"  "3.66"   "0.94"   "0.0637" "0.0062" "3"      "3.05"  
 [3,] "0"      "0" "0" "0" "0"   "18.41"  "3.27"   "0.94"   "0.0547" "0.0036" "3.7"    "2.66"  
 [4,] "0"      "0" "0" "0" "10"  "22.69"  "4.31"   "0.94"   "0.0536" "0.0039" "4"      "2.32"  
 [5,] "0"      "0" "0" "0" "0"   "21.63"  "2.83"   "0.92"   "0.0582" "0.002"  "2"      "3.45"  
 [6,] "0"      "0" "0" "0" "0"   "20.48"  "3.11"   "0.99"   "0.0637" "0.0035" "4"      "2.71"  
 [7,] "0"      "0" "0" "0" "60"  "21.48"  "4.14"   "1.27"   "0.0828" "0.0131" "2.5"    "1.53"  
 [8,] "0.001"  "0" "0" "0" "0"   "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored"
 [9,] "0.0005" "0" "0" "0" "52"  "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored"
[10,] "0.0006" "0" "0" "0" "20"  "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored"
[11,] "0.0017" "0" "0" "0" "390" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored"
[12,] "0"      "0" "0" "0" "40"  "18.95"  "3.69"   "1.3"    "0.067"  "0.0119" "2.5"    "2.32"  
[13,] "0"      "0" "0" "0" "110" "18.66"  "2.86"   "1.39"   "0.1099" "0.0119" "2.1"    "2.09"  
[14,] "0"      "0" "0" "0" "160" "20.24"  "3.44"   "1.32"   "0.1335" "0.009"  "1.5"    "1.53"  
[15,] "0"      "0" "0" "0" "80"  "26.38"  "2.79"   "1.2"    "0.1434" "0.0162" "1.8"    "1.53"  
[16,] "0"      "0" "0" "0" "0"   "21.28"  "3.21"   "1.19"   "0.167"  "0.0081" "1.2"    "0.39"  
[17,] "0"      "0" "0" "0" "100" "18.85"  "3.89"   "1.24"   "0.1158" "0.0045" "1.5"    "1.13"  
[18,] "0"      "0" "0" "0" "40"  "20.52"  "3.28"   "1.13"   "0.1415" "0.0078" "1"      "1.13"  
[19,] "0"      "0" "0" "0" "190" "22.07"  "2.84"   "0.97"   "0.1034" "0.0122" "1.7"    "1.13"  
[20,] "0.0005" "0" "0" "0" "380" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored" "Not monitored"
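As the quoted values and the [1,] row labels show, sapply returns a character matrix, and the result above is only printed, not stored. A possible follow-up, sketched under the assumption that the measurement columns should end up numeric, is to use lapply and assign the result back so that the data frame structure is kept. The data frame df, the column names in measure_cols and the anchored pattern are illustrative choices, not part of the original data.

```r
# Hypothetical data frame with below-detection-limit entries
df <- data.frame(
  site = c("A", "B", "C"),
  tp   = c("<0.01", "0.02", "0.03"),
  fcg  = c("<10", "30", "<10"),
  stringsAsFactors = FALSE
)

# "<" followed by digits, an optional literal dot, and optional further digits
pattern <- "^<[0-9]+\\.?[0-9]*$"
measure_cols <- c("tp", "fcg")   # columns to clean (assumed names)

# Replace the "<XX" entries with "0" and keep the data frame structure
df[measure_cols] <- lapply(df[measure_cols], gsub,
                           pattern = pattern, replacement = "0")

# Convert the cleaned character columns to numeric
df[measure_cols] <- lapply(df[measure_cols], as.numeric)
str(df)
```

Whether 0 is the right substitute for values below the detection limit depends on the analysis; the code only shows the mechanics of the replacement.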