metan
provides tools for computing several world-known
stability statistics. The following stability methods are
implemented.
performs_ammi()
.ammi_indexes()
.waasb()
.blup_indexes()
.ecovalence()
.gge()
.Annicchiarico()
.gai()
.ge_reg()
.Schmildt()
.ge_polar()
.ge_factanal()
.Shukla()
.ge_acv()
.waas()
(fixed-effect model) and waasb()
(mixed-effect model).superiority()
.Fox()
.Huehn()
.Thennarasu()
.The easiest way to compute the above-mentioned stability indexes is
by using the function ge_stats()
.
This is a wrapper that basically returns a summary of each method. If
you are looking for more details from each method like
plot()
and print()
, I’d suggest computing the
methods using their own function.
The complete functionality of the package is described at https://tiagoolivoto.github.io/metan/index.html. You’re welcome to check it out!
Brief examples will be shown using the dataset data_ge
that contains data on two variables assessed in 10 genotypes growing in
14 environments.
First of all, we will check the data for possible problems with the
function inspect()
.
library(metan)
inspect(data_ge)
# # A tibble: 5 × 10
# Variable Class Missing Levels Valid_n Min Median Max Outlier Text
# <chr> <chr> <chr> <chr> <int> <dbl> <dbl> <dbl> <dbl> <lgl>
# 1 ENV factor No 14 420 NA NA NA NA NA
# 2 GEN factor No 10 420 NA NA NA NA NA
# 3 REP factor No 3 420 NA NA NA NA NA
# 4 GY numeric No - 420 0.67 2.61 5.09 0 NA
# 5 HM numeric No - 420 38 48 58 0 NA
Then, the details of the multi-environment trial can be obtained with
the function ge_details()
.
Note that to apply the function to all numeric variables quickly, we can
use the select
helper everything()
in the argument
resp
.
ge_details(data_ge,
env = ENV,
gen = GEN,
resp = everything())
# # A tibble: 10 × 3
# Parameters GY HM
# <chr> <chr> <chr>
# 1 Mean "2.67" "48.09"
# 2 SE "0.05" "0.21"
# 3 SD "0.92" "4.37"
# 4 CV "34.56" "9.09"
# 5 Min "0.67 (G10 in E11)" "38 (G2 in E14)"
# 6 Max "5.09 (G8 in E5)" "58 (G8 in E11)"
# 7 MinENV "E11 (1.37)" "E14 (41.03)"
# 8 MaxENV "E3 (4.06)" "E11 (54.2)"
# 9 MinGEN "G10 (2.47) " "G2 (46.66) "
# 10 MaxGEN "G8 (3) " "G5 (49.3) "
We can create a plot to show the performance of the genotypes across
the environments with ge_plot()
.
ge_plot(data_ge, GEN, ENV, GY)
Or obtain the means for genotypes, environments or
genotype-environment interaction with ge_means()
.
Note that the function round_cols()
provided by metan
round all numeric columns of a data frame
to two (default) significant figures.
<- ge_means(data_ge,
mge env = ENV,
gen = GEN,
resp = everything())
# Genotype-environment means
get_model_data(mge) %>% round_cols()
# Class of the model: ge_means
# Variable extracted: ge_means
# # A tibble: 140 × 4
# ENV GEN GY HM
# <fct> <fct> <dbl> <dbl>
# 1 E1 G1 2.37 46.5
# 2 E1 G10 1.97 46.9
# 3 E1 G2 2.9 45.3
# 4 E1 G3 2.89 45.9
# 5 E1 G4 2.59 48.3
# 6 E1 G5 2.19 49.9
# 7 E1 G6 2.3 48.2
# 8 E1 G7 2.77 47.4
# 9 E1 G8 2.9 48.0
# 10 E1 G9 2.33 47.7
# # … with 130 more rows
# Environment means
get_model_data(mge, what = "env_means") %>% round_cols()
# Class of the model: ge_means
# Variable extracted: env_means
# # A tibble: 14 × 3
# ENV GY HM
# <fct> <dbl> <dbl>
# 1 E1 2.52 47.4
# 2 E10 2.18 44.3
# 3 E11 1.37 54.2
# 4 E12 1.61 49.6
# 5 E13 2.91 46.6
# 6 E14 1.78 41.0
# 7 E2 3.18 44.1
# 8 E3 4.06 52.9
# 9 E4 3.68 50
# 10 E5 3.91 52.2
# 11 E6 2.66 45.9
# 12 E7 1.99 48.5
# 13 E8 2.54 45.2
# 14 E9 3.06 51.3
# Genotype means
get_model_data(mge, what = "gen_means") %>% round_cols()
# Class of the model: ge_means
# Variable extracted: gen_means
# # A tibble: 10 × 3
# GEN GY HM
# <fct> <dbl> <dbl>
# 1 G1 2.6 47.1
# 2 G10 2.47 48.5
# 3 G2 2.74 46.7
# 4 G3 2.96 47.6
# 5 G4 2.64 48.0
# 6 G5 2.54 49.3
# 7 G6 2.53 48.7
# 8 G7 2.74 48.0
# 9 G8 3 49.1
# 10 G9 2.51 47.9
The AMMI model may be fitted with with both functions performs_ammi()
and waas()
,
which is the acronym for the weighted average of absolute scores (Olivoto, Lúcio, Da silva, Marchioro, et al.
2019).
<- performs_ammi(data_ge, ENV, GEN, REP, resp = c(GY, HM))
ammi_model # variable GY
# ---------------------------------------------------------------------------
# AMMI analysis table
# ---------------------------------------------------------------------------
# Source Df Sum Sq Mean Sq F value Pr(>F) Proportion Accumulated
# ENV 13 279.574 21.5057 62.33 0.00e+00 NA NA
# REP(ENV) 28 9.662 0.3451 3.57 3.59e-08 NA NA
# GEN 9 12.995 1.4439 14.93 2.19e-19 NA NA
# GEN:ENV 117 31.220 0.2668 2.76 1.01e-11 NA NA
# PC1 21 10.749 0.5119 5.29 0.00e+00 34.4 34.4
# PC2 19 9.924 0.5223 5.40 0.00e+00 31.8 66.2
# PC3 17 4.039 0.2376 2.46 1.40e-03 12.9 79.2
# PC4 15 3.074 0.2049 2.12 9.60e-03 9.8 89.0
# PC5 13 1.446 0.1113 1.15 3.18e-01 4.6 93.6
# PC6 11 0.932 0.0848 0.88 5.61e-01 3.0 96.6
# PC7 9 0.567 0.0630 0.65 7.53e-01 1.8 98.4
# PC8 7 0.362 0.0518 0.54 8.04e-01 1.2 99.6
# PC9 5 0.126 0.0252 0.26 9.34e-01 0.4 100.0
# Residuals 252 24.367 0.0967 NA NA NA NA
# Total 536 389.036 0.7258 NA NA NA NA
# ---------------------------------------------------------------------------
#
# variable HM
# ---------------------------------------------------------------------------
# AMMI analysis table
# ---------------------------------------------------------------------------
# Source Df Sum Sq Mean Sq F value Pr(>F) Proportion Accumulated
# ENV 13 5710.32 439.255 57.22 1.11e-16 NA NA
# REP(ENV) 28 214.93 7.676 2.70 2.20e-05 NA NA
# GEN 9 269.81 29.979 10.56 7.41e-14 NA NA
# GEN:ENV 117 1100.73 9.408 3.31 1.06e-15 NA NA
# PC1 21 381.13 18.149 6.39 0.00e+00 34.6 34.6
# PC2 19 319.43 16.812 5.92 0.00e+00 29.0 63.6
# PC3 17 114.26 6.721 2.37 2.10e-03 10.4 74.0
# PC4 15 81.96 5.464 1.92 2.18e-02 7.4 81.5
# PC5 13 68.11 5.240 1.84 3.77e-02 6.2 87.7
# PC6 11 59.07 5.370 1.89 4.10e-02 5.4 93.0
# PC7 9 46.69 5.188 1.83 6.33e-02 4.2 97.3
# PC8 7 26.65 3.808 1.34 2.32e-01 2.4 99.7
# PC9 5 3.41 0.682 0.24 9.45e-01 0.3 100.0
# Residuals 252 715.69 2.840 NA NA NA NA
# Total 536 9112.21 17.000 NA NA NA NA
# ---------------------------------------------------------------------------
#
# All variables with significant (p < 0.05) genotype-vs-environment interaction
# Done!
<- waas(data_ge, ENV, GEN, REP, GY, verbose = FALSE) waas_index
The cross-validation procedures implemented in the metan
are based on the splitting of the original data into a training set and
a validation set. The model is fitted using the training set and the
predicted value is compared with the validation set. This process is
iterated many times, say, 1000 times. The lesser the difference between
predicted and validation data, the higher the predictive accuracy of the
model. More information may be found here.
The well-known AMMI2 biplot may be obtained using the function plot_scores()
.
ggplot2-based graphics are obtained. Please, note that since performs_ammi()
and , waas()
functions allow analyzing multiple variables at the same time, e.g.,
resp = c(v1, v2, ...)
, the output ammi_model
is a list that in this case has two elements, (GY and HM). To produce an
AMMI2 biplot with IPCA1 and IPCA3, for example, we use the argument
second
to change the default value of the y axis.
<- plot_scores(ammi_model)
a <- plot_scores(ammi_model,
b type = 2,
second = "PC3")
<- plot_scores(ammi_model,
c type = 2,
polygon = TRUE,
col.gen = "black",
col.env = "gray70",
col.segm.env = "gray70",
axis.expand = 1.5)
arrange_ggplot(a, b, c, tag_levels = "a", ncol = 1)
The S3 method predict()
is implemented for objects of class performs_ammi
and may
be used to estimate the response of each genotype in each environment
considering different number of Interaction Principal Component Axis
(IPCA). As a example, to predict the variables GY and HM we will use
four and six IPCA (number of significant IPCAs, respectively). In
addition, we will create a two way table with make_mat()
to show the predicted values for the variable GY.
<- predict(ammi_model, naxis = c(4, 6))
predicted %>%
predicted subset(TRAIT == "GY") %>%
make_mat(GEN, ENV, YpredAMMI) %>%
round_cols()
# E1 E10 E11 E12 E13 E14 E2 E3 E4 E5 E6 E7 E8 E9
# G1 2.52 2.15 1.30 1.60 3.05 1.62 3.01 4.06 3.53 4.02 2.63 1.87 2.39 2.71
# G10 1.96 1.52 0.89 1.04 1.83 1.90 3.13 4.16 4.21 3.34 2.54 2.18 2.75 3.15
# G2 2.88 2.28 1.48 1.93 3.02 1.48 3.23 4.62 3.62 3.84 2.69 1.90 2.05 3.39
# G3 2.76 2.46 1.71 1.84 3.28 2.07 3.62 4.22 4.05 4.21 2.94 2.09 2.88 3.24
# G4 2.54 2.24 1.41 1.66 2.70 1.78 3.16 3.88 3.38 3.61 2.51 2.06 2.40 3.65
# G5 2.31 2.09 1.32 1.44 2.64 1.77 3.21 3.62 3.47 3.55 2.44 1.83 2.49 3.34
# G6 2.30 2.17 1.40 1.43 2.88 1.77 3.26 3.40 3.41 3.68 2.43 1.67 2.56 3.12
# G7 2.75 2.48 1.36 1.88 3.12 1.88 2.63 4.05 3.05 4.20 2.70 2.57 2.50 3.19
# G8 2.84 2.57 1.72 1.92 3.62 2.10 3.43 4.26 3.97 4.59 3.06 2.19 2.97 2.79
# G9 2.34 1.79 1.10 1.34 2.96 1.43 3.11 4.38 4.06 4.07 2.69 1.53 2.37 1.99
The implementation of linear-mixed effect models to predict the
response variable in MET is made with the function gamem_met()
.
By default, genotype and genotype-vs-environment interaction are assumed
to have random effects. Use the argument random
to change
this default. In the following example the model is fitted to all
numeric variables in data_ge
.
<- gamem_met(data_ge, ENV, GEN, REP, everything())
model2 # Evaluating trait GY |====================== | 50% 00:00:00
|============================================| 100% 00:00:01
Evaluating trait HM # Method: REML/BLUP
# Random effects: GEN, GEN:ENV
# Fixed effects: ENV, REP(ENV)
# Denominador DF: Satterthwaite's method
# ---------------------------------------------------------------------------
# P-values for Likelihood Ratio Test of the analyzed traits
# ---------------------------------------------------------------------------
# model GY HM
# COMPLETE NA NA
# GEN 1.11e-05 5.07e-03
# GEN:ENV 2.15e-11 2.27e-15
# ---------------------------------------------------------------------------
# All variables with significant (p < 0.05) genotype-vs-environment interaction
Several residual plots may be obtained using the S3 generic function
plot()
..
plot(model2, which = c(1, 2, 7), ncol = 1)
The distribution of the random effects may be obtained using the
argument type = "re"
.
plot(model2, type = "re", nrow = 3)
We can get easily the model results such as the Likelihood Ration
Test for random effects, the variance components, and the BLUPs for
genotypes with get_model_data()
.
By default, the function returns the genetic parameters.
get_model_data(model2) %>% round_cols(digits = 3)
# Class of the model: waasb
# Variable extracted: genpar
# # A tibble: 9 × 3
# Parameters GY HM
# <chr> <dbl> <dbl>
# 1 Phenotypic variance 0.181 5.52
# 2 Heritability 0.154 0.089
# 3 GEIr2 0.313 0.397
# 4 h2mg 0.815 0.686
# 5 Accuracy 0.903 0.828
# 6 rge 0.37 0.435
# 7 CVg 6.26 1.46
# 8 CVr 11.6 3.50
# 9 CV ratio 0.538 0.415
library(ggplot2)
<- plot_blup(model2)
d <- plot_blup(model2,
e prob = 0.1,
col.shape = c("gray20", "gray80")) +
coord_flip()
arrange_ggplot(d, e, tag_levels = list(c("d", "e")), ncol = 1)
get_model_data(model2, what = "blupge") %>%
round_cols()
# Class of the model: waasb
# Variable extracted: blupge
# # A tibble: 140 × 4
# ENV GEN GY HM
# <fct> <fct> <dbl> <dbl>
# 1 E1 G1 2.4 46.6
# 2 E1 G10 2.11 47.2
# 3 E1 G2 2.78 45.7
# 4 E1 G3 2.84 46.2
# 5 E1 G4 2.55 48.0
# 6 E1 G5 2.27 49.4
# 7 E1 G6 2.34 48.1
# 8 E1 G7 2.7 47.4
# 9 E1 G8 2.86 48.0
# 10 E1 G9 2.35 47.6
# # … with 130 more rows
The WAASB index (Olivoto, Lúcio, Da silva, Marchioro, et al.
2019) is a quantitative stability measure based on the
weighted average of the absolute scores from the singular value
decomposition of the BLUPs for genotype-vs-interaction effects. We can
obtain this statistic with the function waasb()
combined with get_model_data()
using what = "WAASB"
.
<- waasb(data_ge, ENV, GEN, REP, everything(), verbose = FALSE)
model3 get_model_data(model3, what = "WAASB") %>%
round_cols()
# Class of the model: waasb
# Variable extracted: WAASB
# # A tibble: 10 × 3
# GEN GY HM
# <chr> <dbl> <dbl>
# 1 G1 0.13 0.38
# 2 G10 0.46 1.03
# 3 G2 0.21 0.79
# 4 G3 0.1 0.36
# 5 G4 0.25 0.6
# 6 G5 0.22 0.88
# 7 G6 0.17 0.41
# 8 G7 0.32 0.68
# 9 G8 0.26 0.44
# 10 G9 0.37 0.56
The function blup_indexes()
can be used to compute the harmonic mean of genotypic values (HMGV), the
relative performance of the genotypic values (RPGV) and the harmonic
mean of the relative performance of genotypic values (HMRPGV). See Alves et al. (2018) for more details. We use the
function get_model_data()
to get the HMRPGV (default) for all analyzed variables.
<- blup_indexes(model3)
index get_model_data(index) %>% round_cols()
# Class of the model: blup_ind
# Variable extracted: HMRPGV
# # A tibble: 10 × 3
# GEN GY HM
# <chr> <dbl> <dbl>
# 1 G1 0.97 0.98
# 2 G10 0.9 1.01
# 3 G2 1.02 0.97
# 4 G3 1.1 0.99
# 5 G4 0.99 1
# 6 G5 0.95 1.02
# 7 G6 0.95 1.01
# 8 G7 1.03 1
# 9 G8 1.12 1.02
# 10 G9 0.92 1
The GGE model is fitted with the function gge()
.
This function produces a GGE model based on both a two-way table (in our
case the object table
) with genotypes in the rows and
environments in columns, or a data.frame containing at least the columns
for genotypes, environments and the response variable(s).
<- gge(data_ge, ENV, GEN, GY) gge_model
The generic function plot()
is used to generate a biplot using as input a fitted model of class
gge
. The type of biplot is chosen by the argument
type
in the function. Ten biplots type are available
according to {Yan and Kang (2003)}.
type = 1
A basic biplot.type = 2
Mean performance vs. stability.type = 3
Which-won-where.type = 4
Discriminativeness
vs. representativeness.type = 5
Examine an environment.type = 6
Ranking environments.type = 7
Examine a genotype.type = 8
Ranking gentoypes.type = 9
Compare two genotypes.type = 10
Relationship among environments.<- plot(gge_model)
f <- plot(gge_model, type = 2)
g arrange_ggplot(e, f, tag_levels = list(c("e", "f")), ncol = 1)
ge_stats()
To compute all the stability statistics at once, we can use the
function ge_stats()
.
Again we get the results with get_model_data()
.
<- ge_stats(data_ge, ENV, GEN, REP, GY)
stat_ge # Evaluating trait GY |============================================| 100% 00:00:08
get_model_data(stat_ge) %>%
round_cols()
# Class of the model: ge_stats
# Variable extracted: stats
# # A tibble: 10 × 44
# var GEN Y CV ACV POLAR Var Shukla Wi_g Wi_f Wi_u Ecoval
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 GY G1 2.6 35.2 34.1 0.03 10.9 0.03 84.4 89.2 81.1 1.22
# 2 GY G10 2.47 42.4 38.6 0.14 14.2 0.24 59.2 64.6 54.4 7.96
# 3 GY G2 2.74 34.0 35.2 0.06 11.3 0.09 82.8 95.3 75.6 3.03
# 4 GY G3 2.96 29.9 33.8 0.02 10.1 0.01 104. 99.7 107. 0.72
# 5 GY G4 2.64 31.4 31 -0.05 8.93 0.06 85.9 79.6 91.9 2.34
# 6 GY G5 2.54 30.6 28.8 -0.12 7.82 0.05 82.7 82.2 82.4 1.84
# 7 GY G6 2.53 29.6 27.8 -0.15 7.34 0.05 83.0 83.7 81.8 1.81
# 8 GY G7 2.74 27.4 28.3 -0.13 7.33 0.12 83.9 77.6 93.4 4.16
# 9 GY G8 3 30.4 35.0 0.05 10.8 0.07 98.8 90.5 107. 2.57
# 10 GY G9 2.51 42.4 39.4 0.15 14.8 0.17 68.8 68.9 70.3 5.56
# # … with 32 more variables: bij <dbl>, Sij <dbl>, R2 <dbl>, ASTAB <dbl>,
# # ASI <dbl>, ASV <dbl>, AVAMGE <dbl>, DA <dbl>, DZ <dbl>, EV <dbl>, FA <dbl>,
# # MASI <dbl>, MASV <dbl>, SIPC <dbl>, ZA <dbl>, WAAS <dbl>, WAASB <dbl>,
# # HMGV <dbl>, RPGV <dbl>, HMRPGV <dbl>, Pi_a <dbl>, Pi_f <dbl>, Pi_u <dbl>,
# # Gai <dbl>, S1 <dbl>, S2 <dbl>, S3 <dbl>, S6 <dbl>, N1 <dbl>, N2 <dbl>,
# # N3 <dbl>, N4 <dbl>
The multi-trait stability index (MTSI) was proposed by Olivoto, Lúcio, Da silva, Sari, et al. (2019) and is used for simultaneous selection considering mean performance and stability (of several traits) in the analysis of METs using both fixed and mixed-effect models. For more details see the complete vignette.