First, we’ll load the package and the example datasets:
library(forestmangr)
data(exfm1)
data(exfm2)
data(exfm3)
data(exfm4)
data(exfm5)
data_acs_pilot <- as.data.frame(exfm3)
data_acs_def <- as.data.frame(exfm4)
data_ace_pilot <- as.data.frame(exfm1)
data_ace_def <- as.data.frame(exfm2)
data_as <- as.data.frame(exfm5)
The objective of this example is to survey an area of 46.8 ha using the simple random sampling method, with a target error of 20%. For the pilot inventory, 10 plots of 3000 m² each were measured. The collected data is shown below:
data_acs_pilot
#> TOTAL_AREA PLOT_AREA VWB VWB_m3ha
#> 1 46.8 3000 41 136.66667
#> 2 46.8 3000 33 110.00000
#> 3 46.8 3000 24 80.00000
#> 4 46.8 3000 31 103.33333
#> 5 46.8 3000 10 33.33333
#> 6 46.8 3000 32 106.66667
#> 7 46.8 3000 62 206.66667
#> 8 46.8 3000 16 53.33333
#> 9 46.8 3000 66 220.00000
#> 10 46.8 3000 25 83.33333
Now we’ll calculate the inventory variables for a target error of 20%,
considering a finite population, using the sprs function.
The plot area must be given in square meters, and the total area
in hectares:
sprs(data_acs_pilot, "VWB", 3000, 46.8, error = 20, pop = "fin")
#> Variables Values
#> 1 Total number of sampled plots (n) 10.0000
#> 2 Number of maximum plots (N) 156.0000
#> 3 Variance Quoeficient (VC) 53.2670
#> 4 t-student 2.2622
#> 5 recalculated t-student 2.0452
#> 6 Number of samples regarding the admited error 25.0000
#> 7 Variance (S2) 328.0000
#> 8 Standard deviation (s) 18.1108
#> 9 Mean (Y) 34.0000
#> 10 Standard error of the mean (Sy) 5.5405
#> 11 Absolute Error 12.5335
#> 12 Relative Error (%) 36.8634
#> 13 Estimated Total Value (Yhat) 5304.0000
#> 14 Total Error 1955.2326
#> 15 Inferior Confidence Interval (m3) 21.4665
#> 16 Superior Confidence Interval (m3) 46.5335
#> 17 Inferior Confidence Interval (m3/ha) 71.5549
#> 18 Superior Confidence Interval (m3/ha) 155.1118
#> 19 inferior Total Confidence Interval (m3) 3348.7674
#> 20 Superior Total Confidence Interval (m3) 7259.2326
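The main rows of this output can be reproduced with base R. The following is a minimal sketch, not the package's implementation: it assumes a 95% confidence level and the finite-population correction (1 - n/N), both of which agree with the printed values. All variable names are illustrative.

```r
# Pilot volumes (VWB) copied from data_acs_pilot above
vwb <- c(41, 33, 24, 31, 10, 32, 62, 16, 66, 25)

plot_area_m2  <- 3000   # plot area, in square meters
total_area_ha <- 46.8   # total area, in hectares

n <- length(vwb)
N <- total_area_ha * 10000 / plot_area_m2   # number of plots that fit in the area

y_mean <- mean(vwb)
y_var  <- var(vwb)
cv     <- sd(vwb) / y_mean * 100            # coefficient of variation, in %

# Standard error of the mean with the finite-population correction
se <- sqrt(y_var / n * (1 - n / N))

# Assumed 95% confidence level, n - 1 degrees of freedom
t_val   <- qt(0.975, df = n - 1)
abs_err <- t_val * se
rel_err <- abs_err / y_mean * 100

round(c(N = N, mean = y_mean, variance = y_var, CV = cv,
        SE = se, abs_error = abs_err, rel_error = rel_err), 4)
```

The results agree with the corresponding rows above: N = 156, a mean of 34, a standard error of about 5.54 and a relative error of about 36.9%.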
These results show a relative error of 36.9%, above the 20% target. To meet it, 25 plots are needed, that is, 15 more than the 10 already measured. After a new survey, this is the new data:
data_acs_def
#> TOTAL_AREA PLOT_AREA VWB
#> 1 46.8 3000 41
#> 2 46.8 3000 33
#> 3 46.8 3000 24
#> 4 46.8 3000 31
#> 5 46.8 3000 10
#> 6 46.8 3000 32
#> 7 46.8 3000 62
#> 8 46.8 3000 16
#> 9 46.8 3000 66
#> 10 46.8 3000 25
#> 11 46.8 3000 44
#> 12 46.8 3000 7
#> 13 46.8 3000 57
#> 14 46.8 3000 22
#> 15 46.8 3000 31
#> 16 46.8 3000 40
#> 17 46.8 3000 43
#> 18 46.8 3000 27
#> 19 46.8 3000 17
#> 20 46.8 3000 50
#> 21 46.8 3000 38
#> 22 46.8 3000 20
#> 23 46.8 3000 35
#> 24 46.8 3000 31
#> 25 46.8 3000 26
Now the definitive inventory can be done:
sprs(data_acs_def, "VWB", 3000, 46.8, error = 20, pop = "fin")
#> Variables Values
#> 1 Total number of sampled plots (n) 25.0000
#> 2 Number of maximum plots (N) 156.0000
#> 3 Variance Quoeficient (VC) 45.4600
#> 4 t-student 2.0639
#> 5 recalculated t-student 2.0930
#> 6 Number of samples regarding the admited error 20.0000
#> 7 Variance (S2) 226.6933
#> 8 Standard deviation (s) 15.0563
#> 9 Mean (Y) 33.1200
#> 10 Standard error of the mean (Sy) 2.7595
#> 11 Absolute Error 5.6952
#> 12 Relative Error (%) 17.1957
#> 13 Estimated Total Value (Yhat) 5166.7200
#> 14 Total Error 888.4555
#> 15 Inferior Confidence Interval (m3) 27.4248
#> 16 Superior Confidence Interval (m3) 38.8152
#> 17 Inferior Confidence Interval (m3/ha) 91.4159
#> 18 Superior Confidence Interval (m3/ha) 129.3841
#> 19 inferior Total Confidence Interval (m3) 4278.2645
#> 20 Superior Total Confidence Interval (m3) 6055.1755
The relative error is now 17.2%, so the 20% target was met.
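The "Number of samples regarding the admited error" rows (25 in the pilot run, 20 in the definitive run) are consistent with the usual sample-size formula applied twice: once with the t value of the current sample, and once with a t value recalculated from the first result. The sketch below illustrates that idea. `required_plots` is a hypothetical helper, not a forestmangr function, and the two-pass procedure with rounding up is an assumption inferred from the printed values.

```r
# Hypothetical helper (not part of forestmangr): number of plots needed
# to reach a target relative error, with one recalculation of t.
#   cv    : coefficient of variation, in %
#   n     : number of plots already sampled
#   error : target relative error, in %
#   N     : number of plots that fit in the area (Inf = infinite population)
required_plots <- function(cv, n, error, N = Inf, conf = 0.95) {
  size <- function(t_val) {
    # With N = Inf the correction term t^2 * cv^2 / N vanishes
    ceiling(t_val^2 * cv^2 / (error^2 + t_val^2 * cv^2 / N))
  }
  p  <- 1 - (1 - conf) / 2
  n1 <- size(qt(p, df = n - 1))    # first pass: t from the current sample
  size(qt(p, df = n1 - 1))         # second pass: t recalculated from n1
}

required_plots(cv = 53.2670, n = 10, error = 20, N = 156)   # pilot run
required_plots(cv = 45.4600, n = 25, error = 20, N = 156)   # definitive run
```

With the coefficients of variation printed above, the two calls return 25 and 20 plots.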
The area values can also be supplied as column names from the data:
sprs(data_acs_def, "VWB", "PLOT_AREA", "TOTAL_AREA",
error = 20, pop = "fin")
#> Variables Values
#> 1 Total number of sampled plots (n) 25.0000
#> 2 Number of maximum plots (N) 156.0000
#> 3 Variance Quoeficient (VC) 45.4600
#> 4 t-student 2.0639
#> 5 recalculated t-student 2.0930
#> 6 Number of samples regarding the admited error 20.0000
#> 7 Variance (S2) 226.6933
#> 8 Standard deviation (s) 15.0563
#> 9 Mean (Y) 33.1200
#> 10 Standard error of the mean (Sy) 2.7595
#> 11 Absolute Error 5.6952
#> 12 Relative Error (%) 17.1957
#> 13 Estimated Total Value (Yhat) 5166.7200
#> 14 Total Error 888.4555
#> 15 Inferior Confidence Interval (m3) 27.4248
#> 16 Superior Confidence Interval (m3) 38.8152
#> 17 Inferior Confidence Interval (m3/ha) 91.4159
#> 18 Superior Confidence Interval (m3/ha) 129.3841
#> 19 inferior Total Confidence Interval (m3) 4278.2645
#> 20 Superior Total Confidence Interval (m3) 6055.1755
It’s also possible to run several simple random sampling inventories
at once. To demonstrate this, we’ll use the example dataset for
stratified sampling, but compute simple random sampling statistics.
We’ll still use the sprs function, now with the .groups argument,
to run a separate inventory for each stratum:
sprs(data_ace_def, "VWB", "PLOT_AREA", "STRATA_AREA",
.groups = "STRATA", error = 20, pop = "fin")
#> Variables STRATA1 STRATA2 STRATA3
#> 1 Total number of sampled plots (n) 14.0000 20.0000 23.0000
#> 2 Number of maximum plots (N) 144.0000 164.0000 142.0000
#> 3 Variance Quoeficient (VC) 24.4785 15.8269 16.7813
#> 4 t-student 2.1604 2.0930 2.0739
#> 5 recalculated t-student 2.4469 4.3027 4.3027
#> 6 Number of samples regarding the admited error 9.0000 11.0000 12.0000
#> 7 Variance (S2) 2.1829 3.6161 5.3192
#> 8 Standard deviation (s) 1.4774 1.9016 2.3063
#> 9 Mean (Y) 6.0357 12.0150 13.7435
#> 10 Standard error of the mean (Sy) 0.3752 0.3984 0.4402
#> 11 Absolute Error 0.8105 0.8339 0.9130
#> 12 Relative Error (%) 13.4288 6.9409 6.6431
#> 13 Estimated Total Value (Yhat) 869.1429 1970.4600 1951.5739
#> 14 Total Error 116.7157 136.7670 129.6455
#> 15 Inferior Confidence Interval (m3) 5.2252 11.1811 12.8305
#> 16 Superior Confidence Interval (m3) 6.8462 12.8489 14.6565
#> 17 Inferior Confidence Interval (m3/ha) 52.2519 111.8105 128.3048
#> 18 Superior Confidence Interval (m3/ha) 68.4624 128.4895 146.5647
#> 19 inferior Total Confidence Interval (m3) 752.4271 1833.6930 1821.9284
#> 20 Superior Total Confidence Interval (m3) 985.8586 2107.2270 2081.2194
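Conceptually, the grouped call treats each stratum as an independent simple random sample. The base R sketch below shows that split-and-apply idea for a few of the rows above. It is an illustration under the same assumptions as before (95% confidence, finite-population correction), not the package's code, and `one_stratum` is a hypothetical helper.

```r
# Definitive volumes (VWB) per stratum, copied from data_ace_def
dat <- data.frame(
  STRATA      = rep(1:3, times = c(14, 20, 23)),
  STRATA_AREA = rep(c(14.4, 16.4, 14.2), times = c(14, 20, 23)),
  VWB = c(
    7.90, 3.80, 4.40, 6.25, 5.55, 8.10, 6.10,
    6.60, 7.40, 5.35, 5.90, 4.65, 4.25, 8.25,
    10.20, 15.25, 13.40, 13.60, 14.20, 9.85, 10.20, 11.55, 9.25, 11.30,
    13.95, 12.70, 10.15, 14.90, 10.80, 11.55, 13.90, 9.20, 12.45, 11.90,
    10.65, 12.15, 14.60, 10.90, 16.55, 17.90, 13.35, 14.90, 9.70, 15.20,
    13.45, 12.40, 14.45, 13.55, 12.30, 15.65, 14.20, 17.80, 14.80, 9.35,
    12.60, 13.80, 15.85
  )
)
plot_area_m2 <- 1000

# Hypothetical helper: simple random sampling statistics for one group
one_stratum <- function(d) {
  n  <- nrow(d)
  N  <- d$STRATA_AREA[1] * 10000 / plot_area_m2
  se <- sqrt(var(d$VWB) / n * (1 - n / N))     # finite-population correction
  abs_err <- qt(0.975, df = n - 1) * se        # assumed 95% confidence level
  c(n = n, N = N, mean = mean(d$VWB),
    rel_error = abs_err / mean(d$VWB) * 100)
}

# One column of results per stratum
res <- sapply(split(dat, dat$STRATA), one_stratum)
round(res, 4)
```

The means and relative errors agree with the "Mean (Y)" and "Relative Error (%)" rows of the grouped output.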
The objective of this example is to survey an area using the stratified random sampling method. The area was divided into 3 strata: one with 14.4 ha and 7 plots, another with 16.4 ha and 8 plots, and another with 14.2 ha and 7 plots. The plots have an area of 1000 square meters. In total, 22 plots were sampled for the pilot inventory. The data is shown below:
data_ace_pilot
#> STRATA STRATA_AREA PLOT_AREA VWB VWB_m3ha
#> 1 1 14.4 1000 7.90 79.0
#> 2 1 14.4 1000 3.80 38.0
#> 3 1 14.4 1000 4.40 44.0
#> 4 1 14.4 1000 6.25 62.5
#> 5 1 14.4 1000 5.55 55.5
#> 6 1 14.4 1000 8.10 81.0
#> 7 1 14.4 1000 6.10 61.0
#> 8 2 16.4 1000 10.20 102.0
#> 9 2 16.4 1000 15.25 152.5
#> 10 2 16.4 1000 13.40 134.0
#> 11 2 16.4 1000 13.60 136.0
#> 12 2 16.4 1000 14.20 142.0
#> 13 2 16.4 1000 9.85 98.5
#> 14 2 16.4 1000 10.20 102.0
#> 15 2 16.4 1000 11.55 115.5
#> 16 3 14.2 1000 10.65 106.5
#> 17 3 14.2 1000 12.15 121.5
#> 18 3 14.2 1000 14.60 146.0
#> 19 3 14.2 1000 10.90 109.0
#> 20 3 14.2 1000 16.55 165.5
#> 21 3 14.2 1000 17.90 179.0
#> 22 3 14.2 1000 13.35 133.5
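We’ll calculate the statistics with a target error of 5%, considering
a finite population, using the strs function. Area values can be
supplied as a numeric vector or as a column name. The plot area
must be in square meters, and the strata areas in hectares. Note that
the call below passes 3000 as the plot area, whereas the PLOT_AREA
column records 1000 m², so the plot counts, per-hectare values and
totals in this pilot output follow the 3000 m² figure: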
strs(data_ace_pilot, "VWB", 3000, c(14.4, 16.4, 14.2),
strata = "STRATA", error = 5, pop = "fin")
#> $Table1
#> Variables STRATA 1 STRATA 2
#> 1 STRATA_AREA 14.4000 16.4000
#> 2 Plot Area 3000.0000 3000.0000
#> 3 Number of sampled plots per stratum (nj) 7.0000 8.0000
#> 4 Total number of sampled plots (n) 22.0000 22.0000
#> 5 Number of maximum plots per stratum (Nj) 48.0000 54.6667
#> 6 Number of maximum plots (N) 150.0000 150.0000
#> 7 Nj/N Ratio (Pj) 0.3200 0.3644
#> 8 Stratum sum (Eyj) 42.1000 98.2500
#> 9 Stratum quadratic sum (Eyj2) 268.8950 1237.2275
#> 10 Mean of Yi per stratum (Yj) 6.0143 12.2812
#> 11 PjSj2 0.8370 1.5929
#> 12 PjSj 0.5175 0.7619
#> 13 PjYj 1.9246 4.4758
#> 14 t-student 2.0796 2.0796
#> 15 recalculated t-student 2.0129 2.0129
#> 16 Number of samples regarding the admited error 45.0000 45.0000
#> 17 Optimal number of samples per stratum (nj optimal) 11.0000 16.0000
#> 18 Optimal number of samples (n optimal) 46.0000 46.0000
#> 19 Total value of Y per stratum (Yhatj) 288.6857 671.3750
#> STRATA 3
#> 1 14.2000
#> 2 3000.0000
#> 3 7.0000
#> 4 22.0000
#> 5 47.3333
#> 6 150.0000
#> 7 0.3156
#> 8 96.1000
#> 9 1365.5500
#> 10 13.7286
#> 11 2.4316
#> 12 0.8760
#> 13 4.3321
#> 14 2.0796
#> 15 2.0129
#> 16 45.0000
#> 17 19.0000
#> 18 46.0000
#> 19 649.8190
#>
#> $Table2
#> Variables value
#> 1 t-student 2.0796
#> 2 Standard error of the mean (Sy) 0.4228
#> 3 Stratified Variance 4.8614
#> 4 Stratified Standard Deviation 2.1554
#> 5 Variance Quoeficient (VC) 20.0829
#> 6 Stratified Mean (Y) 10.7325
#> 7 Absolute Error 0.8793
#> 8 Relative Error (%) 8.1925
#> 9 Estimated Total Value (Yhat) 1609.8798
#> 10 Total Error 131.8894
#> 11 Inferior Confidence Interval (m3) 9.8533
#> 12 Superior Confidence Interval (m3) 11.6118
#> 13 Inferior Confidence Interval (m3/ha) 32.8442
#> 14 Superior Confidence Interval (m3/ha) 38.7060
#> 15 inferior Total Confidence Interval (m3) 1477.9904
#> 16 Superior Total Confidence Interval (m3) 1741.7691
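The values in Table2 can be traced back to the per-stratum rows of Table1. The sketch below is an illustration rather than the package's code. It assumes a 95% confidence level, and it uses a standard error of the form sqrt((ΣPjSj)² / n − ΣPjSj² / N), which is the expression that reproduces the printed 0.4228. N = 150 follows from the plot area of 3000 passed in the call above.

```r
# Pilot volumes (VWB) per stratum, copied from data_ace_pilot
vwb <- list(
  `1` = c(7.90, 3.80, 4.40, 6.25, 5.55, 8.10, 6.10),
  `2` = c(10.20, 15.25, 13.40, 13.60, 14.20, 9.85, 10.20, 11.55),
  `3` = c(10.65, 12.15, 14.60, 10.90, 16.55, 17.90, 13.35)
)
strata_area_ha <- c(14.4, 16.4, 14.2)
plot_area_m2   <- 3000          # value passed in the strs call above

Nj <- strata_area_ha * 10000 / plot_area_m2   # plots that fit in each stratum
N  <- sum(Nj)
Pj <- Nj / N                                  # stratum weights
n  <- sum(lengths(vwb))                       # total number of sampled plots

Yj  <- sapply(vwb, mean)                      # stratum means
Sj2 <- sapply(vwb, var)                       # stratum variances

strat_mean <- sum(Pj * Yj)                    # "Stratified Mean (Y)"
strat_var  <- sum(Pj * Sj2)                   # "Stratified Variance"
strat_sd   <- sum(Pj * sqrt(Sj2))             # "Stratified Standard Deviation"

# Standard error of the stratified mean (form inferred from the output)
se <- sqrt(strat_sd^2 / n - strat_var / N)

t_val   <- qt(0.975, df = n - 1)              # assumed 95% confidence level
abs_err <- t_val * se
rel_err <- abs_err / strat_mean * 100

round(c(mean = strat_mean, variance = strat_var, sd = strat_sd,
        SE = se, abs_error = abs_err, rel_error = rel_err), 4)
```

The relative error of about 8.2% is above the 5% target, so the pilot sample is not sufficient.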
Analyzing the first table, we can see that in order to achieve the desired error, we must sample 24 additional plots: 4 in stratum 1, 8 in stratum 2 and 12 in stratum 3.
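The per-stratum figures are consistent with an allocation proportional to PjSj (Neyman allocation), rounded up. The sketch below reads its inputs from Table1 of the pilot run. The allocation rule is an assumption inferred from the printed values, and the variable names are illustrative.

```r
# Values read from Table1 of the pilot run
pj_sj      <- c(0.5175, 0.7619, 0.8760)  # row "PjSj", strata 1 to 3
n_required <- 45                          # row "Number of samples regarding the admited error"
n_sampled  <- c(7, 8, 7)                  # row "Number of sampled plots per stratum (nj)"

# Allocate the required plots in proportion to PjSj, rounding up
n_optimal <- ceiling(n_required * pj_sj / sum(pj_sj))
n_extra   <- n_optimal - n_sampled

n_optimal        # 11 16 19
sum(n_optimal)   # 46
n_extra          # 4 8 12
```

Because each stratum is rounded up separately, the allocated total (46) is one plot higher than the required sample size (45).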
After a new survey, the new data is shown below:
data_ace_def
#> STRATA STRATA_AREA PLOT_AREA VWB VWB_m3ha
#> 1 1 14.4 1000 7.90 79.0
#> 2 1 14.4 1000 3.80 38.0
#> 3 1 14.4 1000 4.40 44.0
#> 4 1 14.4 1000 6.25 62.5
#> 5 1 14.4 1000 5.55 55.5
#> 6 1 14.4 1000 8.10 81.0
#> 7 1 14.4 1000 6.10 61.0
#> 8 1 14.4 1000 6.60 66.0
#> 9 1 14.4 1000 7.40 74.0
#> 10 1 14.4 1000 5.35 53.5
#> 11 1 14.4 1000 5.90 59.0
#> 12 1 14.4 1000 4.65 46.5
#> 13 1 14.4 1000 4.25 42.5
#> 14 1 14.4 1000 8.25 82.5
#> 15 2 16.4 1000 10.20 102.0
#> 16 2 16.4 1000 15.25 152.5
#> 17 2 16.4 1000 13.40 134.0
#> 18 2 16.4 1000 13.60 136.0
#> 19 2 16.4 1000 14.20 142.0
#> 20 2 16.4 1000 9.85 98.5
#> 21 2 16.4 1000 10.20 102.0
#> 22 2 16.4 1000 11.55 115.5
#> 23 2 16.4 1000 9.25 92.5
#> 24 2 16.4 1000 11.30 113.0
#> 25 2 16.4 1000 13.95 139.5
#> 26 2 16.4 1000 12.70 127.0
#> 27 2 16.4 1000 10.15 101.5
#> 28 2 16.4 1000 14.90 149.0
#> 29 2 16.4 1000 10.80 108.0
#> 30 2 16.4 1000 11.55 115.5
#> 31 2 16.4 1000 13.90 139.0
#> 32 2 16.4 1000 9.20 92.0
#> 33 2 16.4 1000 12.45 124.5
#> 34 2 16.4 1000 11.90 119.0
#> 35 3 14.2 1000 10.65 106.5
#> 36 3 14.2 1000 12.15 121.5
#> 37 3 14.2 1000 14.60 146.0
#> 38 3 14.2 1000 10.90 109.0
#> 39 3 14.2 1000 16.55 165.5
#> 40 3 14.2 1000 17.90 179.0
#> 41 3 14.2 1000 13.35 133.5
#> 42 3 14.2 1000 14.90 149.0
#> 43 3 14.2 1000 9.70 97.0
#> 44 3 14.2 1000 15.20 152.0
#> 45 3 14.2 1000 13.45 134.5
#> 46 3 14.2 1000 12.40 124.0
#> 47 3 14.2 1000 14.45 144.5
#> 48 3 14.2 1000 13.55 135.5
#> 49 3 14.2 1000 12.30 123.0
#> 50 3 14.2 1000 15.65 156.5
#> 51 3 14.2 1000 14.20 142.0
#> 52 3 14.2 1000 17.80 178.0
#> 53 3 14.2 1000 14.80 148.0
#> 54 3 14.2 1000 9.35 93.5
#> 55 3 14.2 1000 12.60 126.0
#> 56 3 14.2 1000 13.80 138.0
#> 57 3 14.2 1000 15.85 158.5
Now we’ll run the inventory again, this time with the definitive data:
strs(data_ace_def, "VWB", "PLOT_AREA", "STRATA_AREA",
strata = "STRATA", error = 5, pop = "fin")
#> $Table1
#> Variables STRATA 1 STRATA 2
#> 1 STRATA_AREA 14.4000 16.4000
#> 2 Plot Area 1000.0000 1000.0000
#> 3 Number of sampled plots per stratum (nj) 14.0000 20.0000
#> 4 Total number of sampled plots (n) 57.0000 57.0000
#> 5 Number of maximum plots per stratum (Nj) 144.0000 164.0000
#> 6 Number of maximum plots (N) 450.0000 450.0000
#> 7 Nj/N Ratio (Pj) 0.3200 0.3644
#> 8 Stratum sum (Eyj) 84.5000 240.3000
#> 9 Stratum quadratic sum (Eyj2) 538.3950 2955.9100
#> 10 Mean of Yi per stratum (Yj) 6.0357 12.0150
#> 11 PjSj2 0.6985 1.3179
#> 12 PjSj 0.4728 0.6930
#> 13 PjYj 1.9314 4.3788
#> 14 t-student 2.0032 2.0032
#> 15 recalculated t-student 2.0141 2.0141
#> 16 Number of samples regarding the admited error 46.0000 46.0000
#> 17 Optimal number of samples per stratum (nj optimal) 12.0000 17.0000
#> 18 Optimal number of samples (n optimal) 47.0000 47.0000
#> 19 Total value of Y per stratum (Yhatj) 869.1429 1970.4600
#> STRATA 3
#> 1 14.2000
#> 2 1000.0000
#> 3 23.0000
#> 4 57.0000
#> 5 142.0000
#> 6 450.0000
#> 7 0.3156
#> 8 316.1000
#> 9 4461.3350
#> 10 13.7435
#> 11 1.6785
#> 12 0.7278
#> 13 4.3368
#> 14 2.0032
#> 15 2.0141
#> 16 46.0000
#> 17 18.0000
#> 18 47.0000
#> 19 1951.5739
#>
#> $Table2
#> Variables value
#> 1 t-student 2.0032
#> 2 Standard error of the mean (Sy) 0.2339
#> 3 Stratified Variance 3.6949
#> 4 Stratified Standard Deviation 1.8936
#> 5 Variance Quoeficient (VC) 17.7851
#> 6 Stratified Mean (Y) 10.6471
#> 7 Absolute Error 0.4685
#> 8 Relative Error (%) 4.4003
#> 9 Estimated Total Value (Yhat) 4791.1768
#> 10 Total Error 210.8250
#> 11 Inferior Confidence Interval (m3) 10.1786
#> 12 Superior Confidence Interval (m3) 11.1156
#> 13 Inferior Confidence Interval (m3/ha) 101.7856
#> 14 Superior Confidence Interval (m3/ha) 111.1556
#> 15 inferior Total Confidence Interval (m3) 4580.3518
#> 16 Superior Total Confidence Interval (m3) 5002.0018
The relative error is now 4.4%, so the 5% target was met.
Now we’ll survey an area of 18 hectares in which 18 plots of 200 m² each were systematically sampled:
data_as
#> TOTAL_AREA PLOT_AREA VWB VWB_m3ha
#> 1 10 200 6 300
#> 2 10 200 8 400
#> 3 10 200 9 450
#> 4 10 200 10 500
#> 5 10 200 13 650
#> 6 10 200 12 600
#> 7 10 200 18 900
#> 8 10 200 19 950
#> 9 10 200 20 1000
#> 10 10 200 20 1000
#> 11 10 200 24 1200
#> 12 10 200 23 1150
#> 13 10 200 26 1300
#> 14 10 200 30 1500
#> 15 10 200 31 1550
#> 16 10 200 31 1550
#> 17 10 200 33 1650
#> 18 10 200 32 1600
First, let’s see what error we would get if we used the simple random sampling method:
sprs(data_as, "VWB", 200, 18)
#> Variables Values
#> 1 Total number of sampled plots (n) 18.0000
#> 2 Number of maximum plots (N) 900.0000
#> 3 Variance Quoeficient (VC) 44.6505
#> 4 t-student 2.1098
#> 5 recalculated t-student 1.9873
#> 6 Number of samples regarding the admited error 79.0000
#> 7 Variance (S2) 81.9771
#> 8 Standard deviation (s) 9.0541
#> 9 Mean (Y) 20.2778
#> 10 Standard error of the mean (Sy) 2.1341
#> 11 Absolute Error 4.5025
#> 12 Relative Error (%) 22.2042
#> 13 Estimated Total Value (Yhat) 18250.0000
#> 14 Total Error 4052.2580
#> 15 Inferior Confidence Interval (m3) 15.7753
#> 16 Superior Confidence Interval (m3) 24.7803
#> 17 Inferior Confidence Interval (m3/ha) 788.7634
#> 18 Superior Confidence Interval (m3/ha) 1239.0143
#> 19 inferior Total Confidence Interval (m3) 14197.7420
#> 20 Superior Total Confidence Interval (m3) 22302.2580
This gives a 22.2% error. Now let’s calculate the sampling error using
the method of successive differences, with the ss_diffs function.
To use this function, the rows must be in the order in which the plots
were measured, the plot area must be in square meters, and the total
area must be in hectares:
ss_diffs(data_as, "VWB", 200, 18)
#> Variables Values
#> 1 Total number of sampled plots (n) 18.0000
#> 2 Number of maximum plots (N) 900.0000
#> 3 Variance Quoeficient (VC) 44.6505
#> 4 t-student 2.1098
#> 5 recalculated t-student 1.9873
#> 6 Number of samples regarding the admited error 79.0000
#> 7 Variance (S2) 81.9771
#> 8 Standard deviation (S) 9.0541
#> 9 Mean (Y) 20.2778
#> 10 Standard error of the mean (Sy) 0.4041
#> 11 Absolute Error 0.8527
#> 12 Relative Error (%) 4.2050
#> 13 Estimated Total Value (Yhat) 18250.0000
#> 14 Total Error 767.4046
#> 15 Inferior Confidence Interval (m3) 19.4251
#> 16 Superior Confidence Interval (m3) 21.1304
#> 17 Inferior Confidence Interval (m3/ha) 971.2553
#> 18 Superior Confidence Interval (m3/ha) 1056.5225
#> 19 inferior Total Confidence Interval (m3) 17482.5954
#> 20 Superior Total Confidence Interval (m3) 19017.4046
This gives a 4.2% error, far lower than the 22.2% obtained with the simple random sampling estimator on the same plots.
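The difference between the two estimators comes entirely from the standard error of the mean. The sketch below reproduces both values in base R. It is an illustration, not the package's code: it assumes a 95% confidence level, and it applies the finite-population correction only to the successive-differences estimator, because that is what matches the two outputs above.

```r
# Volumes (VWB) in measurement order, copied from data_as
vwb <- c(6, 8, 9, 10, 13, 12, 18, 19, 20, 20, 24, 23, 26, 30, 31, 31, 33, 32)

n <- length(vwb)
N <- 18 * 10000 / 200               # 18 ha of 200 m² plots
t_val <- qt(0.975, df = n - 1)      # assumed 95% confidence level

# Simple random sampling estimator (infinite population)
se_srs  <- sqrt(var(vwb) / n)
rel_srs <- t_val * se_srs / mean(vwb) * 100

# Successive differences: squared differences between consecutive plots
d2      <- sum(diff(vwb)^2)
se_sd   <- sqrt(d2 / (2 * n * (n - 1)) * (1 - n / N))
rel_sd  <- t_val * se_sd / mean(vwb) * 100

round(c(SE_srs = se_srs, rel_srs = rel_srs,
        SE_diffs = se_sd, rel_diffs = rel_sd), 4)
```

Because the volumes increase steadily along the measurement order, consecutive plots differ little, and the successive-differences estimator gives a much smaller standard error than the overall variance does.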