| Title: | Package Admix for Admixture (aka Contamination) Models | 
| Version: | 2.5.1 | 
| Description: | Implements techniques to estimate the unknown quantities related to two-component admixture models, where the two components can belong to any distribution (note that in the case of multinomial mixtures, the two components must belong to the same family). Estimation methods depend on the assumptions made on the unknown component density; see Bordes and Vandekerkhove (2010) <doi:10.3103/S1066530710010023>, Patra and Sen (2016) <doi:10.1111/rssb.12148>, and Milhaud, Pommeret, Salhi, Vandekerkhove (2024) <doi:10.3150/23-BEJ1593>. In practice, one can estimate both the mixture weight and the unknown component density in a wide variety of frameworks. On top of that, hypothesis tests can be performed in one and two-sample contexts to test the unknown component density (see Milhaud, Pommeret, Salhi and Vandekerkhove (2022) <doi:10.1016/j.jspi.2021.05.010>, and Milhaud, Pommeret, Salhi, Vandekerkhove (2024) <doi:10.3150/23-BEJ1593>). Finally, clustering of unknown mixture components is also feasible in a K-sample setting (see Milhaud, Pommeret, Salhi, Vandekerkhove (2024) https://jmlr.org/papers/v25/23-0914.html). | 
| License: | GPL (≥ 3) | 
| URL: | https://github.com/XavierMilhaud/admix-Rpackage | 
| BugReports: | https://github.com/XavierMilhaud/admix-Rpackage/issues | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.3.2 | 
| Imports: | base, cubature, EnvStats, fdrtool, graphics, Iso, MASS, orthopolynom, pracma, Rcpp, Rdpack, stats, utils | 
| Suggests: | doParallel, doRNG, evd, flexsurv, foreach, gridExtra, knitr, lattice, logitnorm, plyr, reshape2, rmarkdown, rmutil, testthat (≥ 3.0.0) | 
| Depends: | R (≥ 2.10) | 
| LazyData: | true | 
| LinkingTo: | Rcpp | 
| RdMacros: | Rdpack | 
| Config/testthat/edition: | 3 | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | yes | 
| Packaged: | 2025-09-02 15:13:55 UTC; XM | 
| Author: | Xavier Milhaud [aut, cre], Pierre Vandekerkhove [ctb], Denys Pommeret [ctb], Yahia Salhi [ctb] | 
| Maintainer: | Xavier Milhaud <xavier.milhaud.research@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-09-05 18:10:08 UTC | 
admix: Package Admix for Admixture (aka Contamination) Models
Description
Implements techniques to estimate the unknown quantities related to two-component admixture models, where the two components can belong to any distribution (note that in the case of multinomial mixtures, the two components must belong to the same family). Estimation methods depend on the assumptions made on the unknown component density; see Bordes and Vandekerkhove (2010) doi:10.3103/S1066530710010023, Patra and Sen (2016) doi:10.1111/rssb.12148, and Milhaud, Pommeret, Salhi, Vandekerkhove (2024) doi:10.3150/23-BEJ1593. In practice, one can estimate both the mixture weight and the unknown component density in a wide variety of frameworks. On top of that, hypothesis tests can be performed in one and two-sample contexts to test the unknown component density (see Milhaud, Pommeret, Salhi and Vandekerkhove (2022) doi:10.1016/j.jspi.2021.05.010, and Milhaud, Pommeret, Salhi, Vandekerkhove (2024) doi:10.3150/23-BEJ1593). Finally, clustering of unknown mixture components is also feasible in a K-sample setting (see Milhaud, Pommeret, Salhi, Vandekerkhove (2024) https://jmlr.org/papers/v25/23-0914.html).
Author(s)
Maintainer: Xavier Milhaud xavier.milhaud.research@gmail.com
Other contributors:
- Pierre Vandekerkhove [contributor] 
- Denys Pommeret [contributor] 
- Yahia Salhi [contributor] 
See Also
Useful links:
- Report bugs at https://github.com/XavierMilhaud/admix-Rpackage/issues 
Equality test of K unknown component distributions
Description
Equality test of the unknown component distributions coming from K (K > 1) admixture models, based on the Inversion - Best Matching (IBM) approach. Recall that we have K populations following admixture models, each one with probability density functions (pdf) l_k = p_k*f_k + (1-p_k)*g_k, where g_k is the known pdf and l_k corresponds to the observed sample. Perform the following hypothesis test: H0 : f_1 = ... = f_K against H1 : f_i differs from f_j (i different from j, and i,j in 1,...,K).
Usage
IBM_k_samples_test(
  samples,
  admixMod,
  conf_level = 0.95,
  sim_U = NULL,
  tune_penalty = TRUE,
  n_sim_tab = 100,
  parallel = FALSE,
  n_cpu = 2
)
Arguments
| samples | A list of the K samples to be studied, all following admixture distributions. | 
| admixMod | A list of objects of class admix_model, containing useful information about distributions and parameters. | 
| conf_level | The confidence level of the K-sample test. | 
| sim_U | (default to NULL) Random draws of the inner convergence part of the contrast as defined in the IBM approach (see references below). | 
| tune_penalty | (default to TRUE) A boolean that allows to choose between a classical penalty term or an optimized penalty (embedding some tuning parameters, automatically optimized). Optimized penalty is particularly useful for low or unbalanced sample sizes to detect alternatives to the null hypothesis (H0). It is recommended to set it to TRUE. | 
| n_sim_tab | (default to 100) Number of simulated Gaussian processes when tabulating the inner convergence distribution in the 'icv' testing method using the IBM estimation approach. | 
| parallel | (default to FALSE) Boolean to indicate whether parallel computations are performed (speed-up the tabulation). | 
| n_cpu | (default to 2) Number of cores used when paralleling computations. | 
Value
An object of class 'IBM_test', containing 17 attributes: 1) the number of samples for the test; 2) the sizes of each sample; 3) the information about component distributions for each sample; 4) the reject decision of the test; 5) the confidence level of the test (1-alpha, where alpha refers to the first-type error); 6) the test p-value; 7) the 95th-percentile of the contrast tabulated distribution; 8) the test statistic value; 9) the selected rank (number of terms involved in the test statistic); 10) the terms (pairwise contrasts) involved in the test statistic; 11) A boolean indicating whether the penalization corresponds to the null hypothesis has been considered; 12) the value of penalized test statistics; 13) the selected optimal 'gamma' parameter (see reference below); 14) the selected optimal constant involved in the penalization process (see also the reference); 15) the tabulated distribution of the contrast; 16) the estimated mixing proportions (not implemented yet, since that makes sense only in case of equal unknown component distributions); 17) the matrix of pairwise contrasts (distance between two samples); 18) the matrix of the ranks of pairwise contrasts; and 19) the matrix of identifiers of pairwise contrasts.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
References
Milhaud X, Pommeret D, Salhi Y, Vandekerkhove P (2024). “Contamination-source based K-sample clustering.” Journal of Machine Learning Research, 25(287), 1–32. https://jmlr.org/papers/v25/23-0914.html.
Examples
## Not run: 
####### Under the null hypothesis H0 (with K=3 populations):
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 450, weight = 0.4,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
mixt2 <- twoComp_mixt(n = 380, weight = 0.7,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("mean" = 1, "sd" = 1)))
mixt3 <- twoComp_mixt(n = 400, weight = 0.8,
                      comp.dist = list("norm", "exp"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("rate" = 1)))
data1 <- get_mixture_data(mixt1)
data2 <- get_mixture_data(mixt2)
data3 <- get_mixture_data(mixt3)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
admixMod3 <- admix_model(knownComp_dist = mixt3$comp.dist[[2]],
                         knownComp_param = mixt3$comp.param[[2]])
## Perform the 3-samples test:
IBM_k_samples_test(samples = list(data1, data2, data3),
                   admixMod = list(admixMod1, admixMod2, admixMod3),
                   conf_level = 0.95, sim_U = NULL, n_sim_tab = 5,
                   tune_penalty = FALSE, parallel = FALSE)
## End(Not run)
Cluster K populations following admixture models
Description
Create clusters on the unknown components related to the K populations following admixture models. Based on the K-sample test using Inversion - Best Matching (IBM) approach, see 'Details' below for further information.
Usage
admix_cluster(
  samples,
  admixMod,
  conf_level = 0.95,
  tune_penalty = TRUE,
  tabul_dist = NULL,
  echo = TRUE,
  ...
)
Arguments
| samples | A list of the K (K>1) samples to be studied, all following admixture distributions. | 
| admixMod | A list of objects of class admix_model, containing useful information about distributions and parameters. | 
| conf_level | (default to 0.95) The confidence level of the k-sample tests used in the clustering procedure. | 
| tune_penalty | (default to TRUE) A boolean that allows to choose between a classical penalty term or an optimized penalty (embedding some tuning parameters, automatically optimized). Optimized penalty is particularly useful for low/mid-sized samples, or unbalanced sample sizes to detect alternatives to the null hypothesis (H0). It is recommended to use it. | 
| tabul_dist | (default to NULL) Only useful for comparisons of detected clusters at different confidence levels. A list of the tabulated distributions of the stochastic integral used in the k-sample test, each element for each cluster previously detected. | 
| echo | (default to TRUE) Display the remaining computation time. | 
| ... | Optional arguments to IBM_k_samples_test; namely 'n_sim_tab', 'parallel' and 'n_cpu'. These are crucial to speed-up the building of clusters. | 
Value
An object of class admix_cluster, containing 12 attributes: 1) the number of samples under study; 2) the sizes of samples; 3) the information about mixture components in each sample (distributions and parameters); 4) the number of detected clusters; 5) the list of p-values for each k-sample test at the origin of detected clusters; 6) the cluster affiliation for each sample; 7) the confidence level of statistical tests; 8) which samples in which cluster; 9) the size of clusters; 10) the estimated weights of the unknown component distributions inside each cluster (remind that estimated weights are consistent only if unknown components are tested to be identical, which is the case inside clusters); 11) the matrix of pairwise discrepancies across all samples; 12) the list of tabulated distributions used for statistical tests involved in building the clusters.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
References
Milhaud X, Pommeret D, Salhi Y, Vandekerkhove P (2024). “Contamination-source based K-sample clustering.” Journal of Machine Learning Research, 25(287), 1–32. https://jmlr.org/papers/v25/23-0914.html.
Examples
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 1600, weight = 0.8,
                      comp.dist = list("gamma", "exp"),
                      comp.param = list(list("shape" = 16, "scale" = 1/4),
                                        list("rate" = 1/3.5)))
mixt2 <- twoComp_mixt(n = 2000, weight = 0.7,
                      comp.dist = list("gamma", "exp"),
                      comp.param = list(list("shape" = 14, "scale" = 1/2),
                                        list("rate" = 1/5)))
mixt3 <- twoComp_mixt(n = 2500, weight = 0.6,
                      comp.dist = list("gamma", "gamma"),
                      comp.param = list(list("shape" = 16, "scale" = 1/4),
                                        list("shape" = 12, "scale" = 1/2)))
mixt4 <- twoComp_mixt(n = 3800, weight = 0.5,
                      comp.dist = list("gamma", "exp"),
                      comp.param = list(list("shape" = 14, "scale" = 1/2),
                                        list("rate" = 1/7)))
data1 <- get_mixture_data(mixt1)
data2 <- get_mixture_data(mixt2)
data3 <- get_mixture_data(mixt3)
data4 <- get_mixture_data(mixt4)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
admixMod3 <- admix_model(knownComp_dist = mixt3$comp.dist[[2]],
                         knownComp_param = mixt3$comp.param[[2]])
admixMod4 <- admix_model(knownComp_dist = mixt4$comp.dist[[2]],
                         knownComp_param = mixt4$comp.param[[2]])
## Clustering procedure:
admix_cluster(samples = list(data1, data2, data3, data4),
              admixMod = list(admixMod1, admixMod2, admixMod3, admixMod4),
              conf_level = 0.95, tune_penalty = TRUE, n_sim_tab = 10)
Estimate the unknown weight in the admixture model
Description
Estimate the unknown component weight (and location shift parameter in case of a symmetric unknown component density), using different estimation techniques. We remind that the i-th admixture model has probability density function (pdf) l_i such that: l_i = p_i * f_i + (1-p_i) * g_i, where g_i is the known component density. The unknown quantities p_i and f_i then have to be estimated.
Usage
admix_estim(samples, admixMod, est_method = c("PS", "BVdk", "IBM"), ...)
Arguments
| samples | A list of the K (K>0) samples to be studied, all following admixture distributions. | 
| admixMod | A list of objects of class admix_model, containing useful information about distributions and parameters. | 
| est_method | The estimation method to be applied. Can be one of 'BVdk' (Bordes and Vandekerkhove estimator), 'PS' (Patra and Sen estimator), or 'IBM' (Inversion Best-Matching approach) in the continuous case (continuous random variable). Only 'IBM' for discrete random variables. The same estimation method is performed on each sample if several samples are provided. | 
| ... | Optional arguments to estim_PS, estim_BVdk or estim_IBM depending on the choice made by the user for the estimation method. | 
Details
For further details on the different estimation techniques, see references below on i) Patra and Sen estimator ; ii) Bordes and Vandekerkhove estimator ; iii) Inversion Best-Matching approach. Important note: estimation by 'IBM' requires at least two samples at hand, and provides unbiased estimators only if the distributions of unknown components are equal (meaning that it requires to perform previously this test between the pairs of samples, see admix_test.
Value
An object of class admix_estim, containing at least 5 attributes: 1) the number of samples under study; 2) the information about the mixture components (distributions and parameters); 3) the sizes of the samples; 4) the chosen estimation technique (one of 'BVdk', 'PS' or 'IBM'); 5) the estimated mixing proportions (weights of the unknown component distributions in the mixture model). In case of 'BVdk' estimation, one additional attribute corresponding to the estimated location shift parameter is included.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
References
Patra RK, Sen B (2016). “Estimation of a two-component mixture model with applications to multiple testing.” Journal of the Royal Statistical Society Series B, 78(4), 869-893. Bordes L, Delmas C, Vandekerkhove P (2006). “Semiparametric Estimation of a Two-Component Mixture Model Where One Component Is Known.” Scandinavian Journal of Statistics, 33(4), 733–752. ISSN 03036898, 14679469, http://www.jstor.org/stable/4616955. Bordes L, Vandekerkhove P (2010). “Semiparametric two-component mixture model with a known component: An asymptotically normal estimator.” Mathematical Methods of Statistics, 19(1), 22–41. doi:10.3103/S1066530710010023. Milhaud X, Pommeret D, Salhi Y, Vandekerkhove P (2024). “Two-sample contamination model test.” Bernoulli, 30(1), 170–197. doi:10.3150/23-BEJ1593.
Examples
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 300, weight = 0.7,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
mixt2 <- twoComp_mixt(n = 250, weight = 0.85,
                      comp.dist = list("norm", "exp"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("rate" = 1)))
mixt3 <- twoComp_mixt(n = 500, weight = 0.5,
                      comp.dist = list("pois", "pois"),
                      comp.param = list(list("lambda" = 2),
                                        list("lambda" = 7)))
mixt4 <- twoComp_mixt(n = 1500, weight = 0.2, comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size"=1, "prob" = c(0.8,0.1,0.1)),
                                   list("size"=1, "prob" = c(0.1,0.2,0.7))))
data1 <- get_mixture_data(mixt1)
data2 <- get_mixture_data(mixt2)
data3 <- get_mixture_data(mixt3)
data4 <- get_mixture_data(mixt4)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
admixMod3 <- admix_model(knownComp_dist = mixt3$comp.dist[[2]],
                         knownComp_param = mixt3$comp.param[[2]])
admixMod4 <- admix_model(knownComp_dist = mixt4$comp.dist[[2]],
                         knownComp_param = mixt4$comp.param[[2]])
# Estimation by different methods:
admix_estim(samples = list(data1), admixMod = list(admixMod1), est_method = "BVdk")
admix_estim(samples = list(data1, data2, data3, data4),
            admixMod = list(admixMod1, admixMod2, admixMod3, admixMod4), est_method = "PS")
admix_estim(samples = list(data1,data2), admixMod = list(admixMod1,admixMod2), est_method = "IBM")
Define the distribution/parameter(s) of the known component
Description
Create an object of class 'admix_model', containing the information about the known component distribution in the admixture model. An admixture (aka contamination) model is a two-component mixture model with one known component. Both the second component distribution and the mixing weight are unknown.
Usage
admix_model(knownComp_dist, knownComp_param)
Arguments
| knownComp_dist | (Character) The name of the distribution (specified as in R glossary) of the known component of the admixture model | 
| knownComp_param | (Character) A vector of the names of the parameters (specified as in R glossary) involved in the chosen known distribution, with their values. | 
Value
An object of class admix_model, containing 2 attributes: 1) a list that gives the information about the distributions involved in the two-component mixture model (the unknown and the known ones); 2) a list that gives the information about the corresponding parameters of those distributions.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Examples
admix_model(knownComp_dist = "norm", knownComp_param = list("mean"=0, "sd"=1))
admix_model(knownComp_dist = "exp", knownComp_param = list("rate"=2))
admix_model(knownComp_dist = "pois", knownComp_param = list("lambda"=5))
admix_model(knownComp_dist = "multinom", knownComp_param = list("size"=1, "prob"=c(0.2,0.8,0.1)))
Equality test for the unknown components of admixture models
Description
Perform hypothesis test between unknown components of a list of admixture models, where we remind that the i-th admixture model has probability density function (pdf) l_i such that: l_i = p_i * f_i + (1-p_i) * g_i, with g_i the known component density. The unknown quantities p_i and f_i are thus estimated, leading to the test given by the following null and alternative hypothesis: H0: f_i = f_j for all i != j against H1 : there exists at least i != j such that f_i differs from f_j. The test can be performed using two methods, either the comparison of coefficients obtained through polynomial basis expansions of the component densities, or by the inner-convergence property obtained using the IBM approach. See 'Details' below for further information.
Usage
admix_test(
  samples,
  admixMod,
  test_method = c("poly", "icv"),
  conf_level = 0.95,
  ...
)
Arguments
| samples | A list of the K (K > 0) samples to be studied, each one assumed to follow a mixture distribution. | 
| admixMod | A list of objects of class admix_model, containing useful information about distributions and parameters of the contamination / admixture models under study. | 
| test_method | The testing method to be applied. Can be either 'poly' (polynomial basis expansion) or 'icv' (inner convergence from IBM). The same testing method is performed between all samples. In the one-sample case, only 'poly' is available and the test is a gaussianity test. For further details, see section 'Details' below. | 
| conf_level | The confidence level of the K-sample test. | 
| ... | Depending on the choice made by the user for the test method ('poly' or 'icv'), optional arguments to gaussianity_test or orthobasis_test (in case of 'poly'), and to IBM_k_samples_test in case of 'icv'. | 
Details
For further details on implemented hypothesis tests, see the references hereafter. .
Value
An object of class 'htest' containing the classical attributes of the latter class, as well as other attributes specific to the inherited object class. Usually, the test decision (reject the null hypothesis or not); the confidence level of the test (1-alpha, where alpha denotes the level of the test or equivalently the type-I error); the number of samples under study; the respective size of each sample; the information about known mixture components.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
References
Milhaud X, Pommeret D, Salhi Y, Vandekerkhove P (2024). “Contamination-source based K-sample clustering.” Journal of Machine Learning Research, 25(287), 1–32. https://jmlr.org/papers/v25/23-0914.html. Milhaud X, Pommeret D, Salhi Y, Vandekerkhove P (2022). “Semiparametric two-sample admixture components comparison test: The symmetric case.” Journal of Statistical Planning and Inference, 216, 135-150. ISSN 0378-3758, doi:10.1016/j.jspi.2021.05.010. Pommeret D, Vandekerkhove P (2019). “Semiparametric density testing in the contamination model.” Electronic Journal of Statistics, 4743–4793. doi:10.1214/19-EJS1650.
Examples
####### Example with 2 samples
mixt1 <- twoComp_mixt(n = 380, weight = 0.7,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
mixt2 <- twoComp_mixt(n = 350, weight = 0.85,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("mean" = -1, "sd" = 1)))
data1 <- get_mixture_data(mixt1)
data2 <- get_mixture_data(mixt2)
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
admix_test(samples = list(data1,data2), admixMod = list(admixMod1,admixMod2),
           conf_level = 0.95, test_method = "poly", ask_poly_param = FALSE, support = "Real")
Measurements of heliocentric velocities
Description
An evolving data frame of velocities for 4 dSph galaxies (namely Carina, Sextans, Sculptor and Fornax), from SIMBAD astronomical database.
Usage
allGalaxies
Format
Currently contains 8,862 rows and 3 columns, with information on:
- Target
- Target identification; Galaxy-ID number 
- HV
- Weighted mean Heliocentric rest frame velocity 
- Name
- The name of the galaxy 
Source
Estimates the decontaminated CDF of the unknown component in an admixture
Description
Estimates the decontaminated cumulative distribution function (CDF) of the unknown component in an admixture model, using inversion of the admixture CDF. Recall that an admixture model follows the cumulative distribution function (CDF) L, where L = p*F + (1-p)*G, with g a known CDF and p and f unknown quantities.
Usage
decontaminated_cdf(sample1, estim.p, admixMod)
Arguments
| sample1 | Observations of the sample under study. | 
| estim.p | The estimated weight of the unknown component distribution, related to the proportion of the unknown component in the admixture model studied. | 
| admixMod | An object of class 'admix_model', containing useful information about distributions and parameters. | 
Details
The decontaminated CDF is obtained by inverting the admixture CDF, given by L = p*F + (1-p)*G, to isolate the unknown component F after having estimated p. This means that F = (1/hat(p)) * (hat(L)-(1-p)*G).
Value
The decontaminated CDF F of the admixture model, of class 'stepfun' (step function).
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Examples
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 400, weight = 0.4,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = 3, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
mixt2 <- twoComp_mixt(n = 300, weight = 0.6,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = 3, "sd" = 0.5),
                                        list("mean" = 5, "sd" = 2)))
data1 <- get_mixture_data(mixt1)
data2 <- get_mixture_data(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1,data2),
                   admixMod = list(admixMod1,admixMod2), est_method = 'PS')
prop <- get_mixing_weights(est)
## Determine the decontaminated version of the unknown CDF by inversion:
F1 <- decontaminated_cdf(sample1 = data1, estim.p = prop[1], admixMod = admixMod1)
F2 <- decontaminated_cdf(sample1 = data2, estim.p = prop[2], admixMod = admixMod2)
abs <- seq(from=-1, to=4, length.out=100)
plot(x=abs, y=F1(abs), xlim=c(-1,4), ylim=c(0,1), type="l")
par(new = TRUE)
plot(x=abs, y=F2(abs), xlim=c(-1,4), ylim=c(0,1), type="l", col="red")
Probability density function of the unknown component
Description
Estimate the decontaminated density of the unknown component in the admixture model under study, after inversion of the admixture cumulative distribution function. Recall that an admixture model follows the cumulative distribution function (CDF) L, where L = p*F + (1-p)*G, with g a known CDF and p and f unknown quantities.
Usage
decontaminated_density(sample1, estim.p, admixMod)
Arguments
| sample1 | Sample under study. | 
| estim.p | The estimated weight, related to the proportion of the unknown component distribution in the admixture model studied. | 
| admixMod | An object of class 'admix_model', containing useful information about known distribution(s) and parameter(s). | 
Details
The decontaminated density is obtained by inverting the admixture density, given by l = p*f + (1-p)*g, to isolate the unknown component f after having estimated p.
Value
An object of class 'decontaminated_density', containing 3 attributes: 1) the data under study; 2) the type of support for the underlying distribution (either discrete or continuous, useful for plots); 3) the decontaminated density function.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Examples
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 400, weight = 0.4,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
data1 <- get_mixture_data(mixt1)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1), admixMod = list(admixMod1),
                   est_method = 'PS')
## Determine the decontaminated version of the unknown density by inversion:
x <- decontaminated_density(sample1 = data1, estim.p = get_mixing_weights(est),
                       admixMod = admixMod1)
print(x)
summary(x)
plot(x, type = "l")
####### Discrete support:
mixt1 <- twoComp_mixt(n = 5000, weight = 0.6,
                      comp.dist = list("pois", "pois"),
                      comp.param = list(list("lambda" = 3),
                                        list("lambda" = 2)))
mixt2 <- twoComp_mixt(n = 4000, weight = 0.8,
                      comp.dist = list("pois", "pois"),
                      comp.param = list(list("lambda" = 3),
                                        list("lambda" = 4)))
data1 <- get_mixture_data(mixt1)
data2 <- get_mixture_data(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1, data2),
                   admixMod = list(admixMod1, admixMod2), est_method = 'IBM')
## Determine the decontaminated version of the unknown density by inversion:
x <- decontaminated_density(sample1 = data1, estim.p = get_mixing_weights(est)[1],
                       admixMod = admixMod1)
summary(x)
plot(x)
####### Finite discrete support:
mixt1 <- twoComp_mixt(n = 12000, weight = 0.6,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size" = 1, "prob" = c(0.3,0.4,0.3)),
                                        list("size" = 1, "prob" = c(0.6,0.3,0.1))))
mixt2 <- twoComp_mixt(n = 10000, weight = 0.8,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size" = 1, "prob" = c(0.3,0.4,0.3)),
                                        list("size" = 1, "prob" = c(0.2,0.6,0.2))))
data1 <- get_mixture_data(mixt1)
data2 <- get_mixture_data(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1, data2),
                   admixMod = list(admixMod1, admixMod2), est_method = 'IBM')
## Determine the decontaminated version of the unknown density by inversion:
x <- decontaminated_density(sample1 = data1, estim.p = get_mixing_weights(est)[1],
                       admixMod = admixMod1)
summary(x)
plot(x)
Detect the type of support of some random variables
Description
Given one or two sets of observations (samples), the function provides with the most plausible type of support for the underlying random variables to be studied. If less than 3 percents of the observations have different values, we consider that the support is discrete. Otherwise, we consider it as a continuous support.
Usage
detect_support_type(sample1, sample2 = NULL)
Arguments
| sample1 | The first sample of observations under study. | 
| sample2 | The second sample of observations under study. | 
Value
The type of support, either discrete or continuous.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Examples
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 1500, weight = 0.5,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = 3, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
data1 <- get_mixture_data(mixt1)
mixt2 <- twoComp_mixt(n = 2000, weight = 0.7,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = 3, "sd" = 0.5),
                                        list("mean" = 5, "sd" = 2)))
data2 <- get_mixture_data(mixt2)
## Test the type of support:
detect_support_type(data1, data2)
Estimation of the admixture parameters by Bordes & Vandekerkhove (2010)
Description
Estimates parameters in an admixture model where the unknown component is assumed to have a symmetric density. More precisely, estimates the two parameters (mixture weight and location shift) in the admixture model with pdf: l(x) = p*f(x-mu) + (1-p)*g(x), x in R, where g is the known component, p is the proportion and f is the unknown component with symmetric density. The localization shift parameter is denoted mu, and the component weight p. See the reference below for further details.
Usage
estim_BVdk(
  samples,
  admixMod,
  method = c("L-BFGS-B", "Nelder-Mead"),
  compute_var = FALSE
)
Arguments
| samples | The observed sample under study. | 
| admixMod | An object of class admix_model, containing useful information about distributions and parameters. | 
| method | The method used throughout the optimization process, either 'L-BFGS-B' or 'Nelder-Mead' (see ?optim). | 
| compute_var | (default to FALSE) A boolean that indicates whether one computes the variance of the estimators of unknown mixing proportions and location shift parameter. | 
Value
An object of class estim_BVdk, containing 8 attributes: 1) the number of sample under study (set to 1 here); 2) the sample size; 3) the information about mixture components (distributions and parameters); 4) the estimation method (Bordes and Vandekerkhove here, see the given reference); 5) the estimated mixing proportion (weight of the unknown component distribution); 6) the estimated location parameter of the unknown component distribution (with symetric density); 7) the variance of the two estimators (respectively the mixing proportion and location shift); 8) the optimization method that was used.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
References
Bordes L, Vandekerkhove P (2010). “Semiparametric two-component mixture model with a known component: An asymptotically normal estimator.” Mathematical Methods of Statistics, 19(1), 22–41. doi:10.3103/S1066530710010023.
See Also
print.estim_BVdk() for printing a short version of the results from this estimation method,
and summary.estim_BVdk() for more comprehensive results.
Examples
## Not run: 
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 200, weight = 0.4,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
## Retrieves the mixture data:
data1 <- get_mixture_data(mixt1)
## Define the admixture model:
admixMod <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                        knownComp_param = mixt1$comp.param[[2]])
## Perform the estimation of parameters in real-life:
ex <- estim_BVdk(samples = data1, admixMod = admixMod, method = 'L-BFGS-B')
print.estim_BVdk(ex)
## Second example:
mixt2 <- twoComp_mixt(n = 200, weight = 0.65,
                      comp.dist = list("norm", "exp"),
                      comp.param = list(list("mean" = -1, "sd" = 0.5),
                                        list("rate" = 1)))
data2 <- get_mixture_data(mixt2)
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                        knownComp_param = mixt2$comp.param[[2]])
## Perform the estimation of parameters in real-life:
ex <- estim_BVdk(samples = data2, admixMod = admixMod2, method = 'L-BFGS-B',
                 compute_var = TRUE)
print.estim_BVdk(ex)
## End(Not run)
Estimates weights of unknown components from 2 admixtures using IBM
Description
Estimation of the component weights from the Inversion - Best Matching (IBM) method, related to two admixture models with respective probability density function (pdf) l1 and l2, such that: l1 = p1*f1 + (1-p1)g1 and l2 = p2f2 + (1-p2)*g2, where g1 and g2 are the known component densities. For further details about IBM approach, see 'Details' below.
Usage
estim_IBM(samples, admixMod, n.integ = 1000, compute_var = FALSE)
Arguments
| samples | A list of the two considered samples. | 
| admixMod | A list of two objects of class admix_model, one for each sample. | 
| n.integ | Number of data points generated for the distribution on which to integrate. | 
| compute_var | (default to FALSE) A boolean that indicates whether one computes the variance of the estimators of unknown mixing proportions. | 
Value
An object of class estim_IBM, containing 7 attributes: 1) the number of samples under study; 2) the sizes of samples; 3) the information about mixture components (distributions and parameters) for each sample; 4) the estimation method (Inversion Best Matching here, see the given reference); 5) the estimated mixing proportions (weights of the unknown component distributions in each sample); 6) the arbitrary value of the mixing weight in the first admixture sample (in case of equal known components, see the given reference); 7) the support of integration that was used in the computations.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
References
Milhaud X, Pommeret D, Salhi Y, Vandekerkhove P (2024). “Two-sample contamination model test.” Bernoulli, 30(1), 170–197. doi:10.3150/23-BEJ1593.
See Also
print.estim_IBM() for printing a short version of the results from this estimation method,
and summary.estim_IBM() for more comprehensive results.
Examples
## Not run: 
## Continuous support: simulate mixture data.
mixt1 <- twoComp_mixt(n = 1500, weight = 0.5,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = 3, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
data1 <- get_mixture_data(mixt1)
mixt2 <- twoComp_mixt(n = 2000, weight = 0.7,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = 3, "sd" = 0.5),
                                        list("mean" = 5, "sd" = 2)))
data2 <- get_mixture_data(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimate the mixture weights of the two admixture models (provide only hat(theta)_n):
ex <- estim_IBM(samples = list(data1,data2),
                admixMod = list(admixMod1,admixMod2), n.integ = 1000)
print.estim_IBM(ex)
## Example 2: multinomial distribution (discrete case).
mixt1 <- twoComp_mixt(n = 1500, weight = 0.8,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size" = 1, "prob" = c(0.2,0.3,0.5)),
                                        list("size" = 1, "prob" = c(0.1,0.6,0.3))))
data1 <- get_mixture_data(mixt1)
mixt2 <- twoComp_mixt(n = 2000, weight = 0.3,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size" = 1, "prob" = c(0.2,0.3,0.5)),
                                        list("size" = 1, "prob" = c(0.7,0.1,0.2))))
data2 <- get_mixture_data(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimate the mixture weights of the two admixture models (provide only hat(theta)_n):
ex <- estim_IBM(samples = list(data1,data2), admixMod = list(admixMod1,admixMod2),
                compute_var = TRUE)
print.estim_IBM(ex)
## End(Not run)
Estimates in an admixture using Patra and Sen approach
Description
Estimation of both the weight and the distribution of the unknown component in an admixture model, by Patra and Sen approach. Remind that the admixture probability density function (pdf) l is given by l = p*f + (1-p)*g, where g is the known component of the two-component mixture, p is the unknown proportion of the unknown component distribution f. More information in 'Details' below concerning the estimation method.
Usage
estim_PS(
  samples,
  admixMod,
  method = c("fixed", "lwr.bnd", "cv"),
  c.n = 0.1 * log(log(length(samples))),
  folds = 10,
  reps = 1,
  cn.s = NULL,
  cn.length = 100,
  gridsize = 1200
)
Arguments
| samples | Sample to be studied. | 
| admixMod | An object of class admix_model, containing information about the known component distribution and its parameter(s). | 
| method | One of 'lwr.bnd', fixed' or 'cv': depending on whether compute some lower bound of the mixing proportion, the estimate based on the value of 'c.n' or use cross-validation for choosing 'c.n' (tuning parameter). | 
| c.n | (default to NULL) A positive number for the penalization, see reference below. If NULL, equals to 0.1*log(log(n)). | 
| folds | (optional, default to 10) Number of folds used for cross-validation. | 
| reps | (optional, default to 1) Number of replications for cross-validation. | 
| cn.s | (optional) A sequence of 'c.n' to be used for cross-validation (vector of values). Default is equally spaced grid of 100 values between .001 x log(log(n)) and 0.2 x log(log(n)). | 
| cn.length | (optional, default to 100) Number of equally spaced tuning parameter (between .001 x log(log(n)) and 0.2 x log(log(n))). Values to search from. | 
| gridsize | (default to 600) Number of equally spaced points (between 0 and 1) to evaluate the distance function. Larger values are more computationally intensive but also lead to more accurate estimates. | 
Value
An object of class estim_PS, containing 10 attributes: 1) the number of samples studied (1 in this case); 2) the sample size; 3) the information about component distributions of the admixture model; 4) the estimation method 5patra and Sen here); 5) the estimated mixing weight (estimate of the unknown component proportion); 6) the estimated decontaminated CDF; 7) an object of the class 'dist.fun' (that gives the distance); 8) the tuning parameter 'c.n'; 9) the lower bound of the estimated mixing proportion (if such an option has been chosen); 10) the number of observations.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
References
Patra RK, Sen B (2016). “Estimation of a two-component mixture model with applications to multiple testing.” Journal of the Royal Statistical Society Series B, 78(4), 869-893.
See Also
print.estim_PS() for printing a short version of the results from this estimation method,
and summary.estim_PS() for more comprehensive results.
Examples
## Not run: 
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 800, weight = 0.33,
                      comp.dist = list("gamma", "exp"),
                      comp.param = list(list("shape" = 2, "scale" = 0.5),
                                        list("rate" = 0.25)))
data1 <- get_mixture_data(mixt1)
## Define the admixture model:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
## Estimation step:
ex <- estim_PS(samples = data1, admixMod = admixMod1, method = 'fixed')
print.estim_PS(ex)
## End(Not run)
Gaussianity test in an admixture model
Description
Performs an hypothesis test to check for the gaussianity of the unknown mixture component. Recall that an admixture model has probability density function (pdf) l = p*f + (1-p)*g, where g is the known pdf and l is observed (others are unknown). This test requires optimization (to estimate the unknown parameters) as defined by Bordes & Vandekerkhove (2010), which means that the unknown mixture component must have a symmetric density.
Usage
gaussianity_test(
  sample,
  admixMod,
  conf_level = 0.95,
  ask_poly_param = FALSE,
  K = 3,
  s = 0.25,
  support = c("Real", "Integer", "Positive", "Bounded.continuous"),
  ...
)
Arguments
| sample | (numeric) The sample under study. | 
| admixMod | An object of class admix_model, containing useful information about distributions and parameters. | 
| conf_level | (default to 0.95) The confidence level. Equals 1-alpha, where alpha is the level of the test (type-I error). | 
| ask_poly_param | (default to FALSE) If TRUE, ask the user to choose both the order 'K' of expansion coefficients in the orthonormal polynomial basis, and the penalization rate 's' involved on the penalization rule for the test. | 
| K | (default to 3) If not asked (see the previous argument), number of coefficients considered for the polynomial basis expansion. | 
| s | (in ]0,1/2[, default to 0.25) If not asked (see the previous argument), normalization rate involved in the penalization rule for model selection. See the reference below. | 
| support | Support of the probability distributions, useful to choose the appropriate polynomial orthonormal basis. One of 'Real', 'Integer', 'Positive', or 'Bounded.continuous'. | 
| ... | Optional arguments to estim_BVdk. | 
Value
An object of class gaussianity_test, inherited class from "htest". Contains attributes like 1) the number of populations under study (1 in this case); 2) the sample size; 3) the information about the known component distribution; 4) the reject decision of the test; 5) the confidence level of the test, 6) the p-value of the test; 7) the value of the test statistic; 8) the variance of the test statistic at each order in the polynomial orthobasis expansion; 9) the selected rank (order) for the test statistic; 10) a list of estimates (mixing weight, mean and standard deviation of the Gaussian unknown distribution).
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
References
Pommeret D, Vandekerkhove P (2019). “Semiparametric density testing in the contamination model.” Electronic Journal of Statistics, 4743–4793. doi:10.1214/19-EJS1650.
Examples
## Not run: 
####### Under the null hypothesis H0.
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 250, weight = 0.4,
                      comp.dist = list("norm", "exp"),
                      comp.param = list(list("mean" = -2, "sd" = 0.5),
                                        list("rate" = 1)))
data1 <- get_mixture_data(mixt1)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
## Performs the test:
gaussianity_test(sample = data1, admixMod = admixMod1,
                 conf_level = 0.95, K = 3, s = 0.1, support = "Real")
## End(Not run)
Extractor for object of class 'admix_cluster'
Description
Extract the clusters that were discovered among K samples, where belonging to one given cluster means having equal unknown component distributions.
Usage
get_cluster_members(x)
Arguments
| x | An object of class 'admix_cluster'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'admix_cluster'
Description
Extract the clusters that were discovered among K samples, where belonging to one given cluster means having equal unknown component distributions.
Usage
## S3 method for class 'admix_cluster'
get_cluster_members(x)
Arguments
| x | An object of class 'admix_cluster'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'admix_cluster'
Description
Provide the number of samples in each cluster.
Usage
get_cluster_sizes(x)
Arguments
| x | An object of class 'admix_cluster'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'admix_cluster'
Description
Provide the number of samples in each cluster.
Usage
## S3 method for class 'admix_cluster'
get_cluster_sizes(x)
Arguments
| x | An object of class 'admix_cluster'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'IBM_test'
Description
Provide the matrix storing the identifiers of discrepancies using Inversion-Best Matching approach between all couples among the K samples under study.
Usage
get_discrepancy_id(x)
Arguments
| x | An object of class 'IBM_test'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'IBM_test'
Description
Provide the matrix storing the identifiers of discrepancies using Inversion-Best Matching approach between all couples among the K samples under study.
Usage
## S3 method for class 'IBM_test'
get_discrepancy_id(x)
Arguments
| x | An object of class 'IBM_test'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'IBM_test' or 'admix_cluster'
Description
Provide the matrix storing discrepancies using Inversion-Best Matching approach between all couples among the K samples under study.
Usage
get_discrepancy_matrix(x)
Arguments
| x | An object of class 'IBM_test' or 'admix_cluster'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'IBM_test'
Description
Provide the matrix storing discrepancies using Inversion-Best Matching approach between all couples among the K samples under study.
Usage
## S3 method for class 'IBM_test'
get_discrepancy_matrix(x)
Arguments
| x | An object of class 'IBM_test'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'admix_cluster'
Description
Provide the matrix storing discrepancies using Inversion-Best Matching approach between all couples among the K samples under study.
Usage
## S3 method for class 'admix_cluster'
get_discrepancy_matrix(x)
Arguments
| x | An object of class 'admix_cluster'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'IBM_test'
Description
Provide the matrix storing the ranks of discrepancies using Inversion-Best Matching approach between all couples among the K samples under study.
Usage
get_discrepancy_rank(x)
Arguments
| x | An object of class 'IBM_test'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'IBM_test'
Description
Provide the matrix storing the ranks of discrepancies using Inversion-Best Matching approach between all couples among the K samples under study.
Usage
## S3 method for class 'IBM_test'
get_discrepancy_rank(x)
Arguments
| x | An object of class 'IBM_test'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for objects of class 'admix_estim', 'htest' or 'admix_cluster'
Description
Get the known component(s) of admixture model(s) considered for estimation, test, or clustering.
Usage
get_known_component(x)
Arguments
| x | An object of class 'admix_estim', 'htest', or 'admix_cluster'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for objects of class 'admix_cluster'
Description
Get the known component(s) of admixture model(s) considered for clustering.
Usage
## S3 method for class 'admix_cluster'
get_known_component(x)
Arguments
| x | An object of class 'admix_cluster'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for objects of class 'admix_estim'
Description
Get the known component(s) of admixture model(s) considered for estimation.
Usage
## S3 method for class 'admix_estim'
get_known_component(x)
Arguments
| x | An object of class 'admix_estim'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for objects of class 'htest'
Description
Get the known component(s) of admixture model(s) considered for testing
Usage
## S3 method for class 'htest'
get_known_component(x)
Arguments
| x | An object of class 'htest'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for objects of class 'admix_estim' or 'htest'
Description
Get the estimated unknown mixing proportion(s) related to the weight(s) of the unknown component distribution(s) of the admixture model(s).
Usage
get_mixing_weights(x)
Arguments
| x | An object of class 'admix_estim' or 'htest'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for objects of class 'admix_estim'
Description
Get the estimated unknown mixing proportion(s) related to the weight(s) of the unknown component distribution(s) of the admixture model(s).
Usage
## S3 method for class 'admix_estim'
get_mixing_weights(x)
Arguments
| x | An object of class 'admix_estim'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for objects of class 'htest'
Description
Get the estimated unknown mixing proportion(s) related to the weight(s) of the unknown component distribution(s) of the admixture model(s).
Usage
## S3 method for class 'htest'
get_mixing_weights(x)
Arguments
| x | An object of class 'htest'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'twoComp_mixt'
Description
Get the mixture data generated from method 'twoComp_mixt'.
Usage
get_mixture_data(x)
Arguments
| x | An object of class 'twoComp_mixt'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'IBM_test'
Description
Provide the terms (or discrepancies) that compose the test statistic for the k-sample test.
Usage
get_statistic_components(x)
Arguments
| x | An object of class 'IBM_test'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'IBM_test'
Description
Provide the terms (or discrepancies) that compose the test statistic for the k-sample test.
Usage
## S3 method for class 'IBM_test'
get_statistic_components(x)
Arguments
| x | An object of class 'IBM_test'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'IBM_test'
Description
Provide the tabulated distribution that allows to define the extreme quantile against which the test statistic is compared.
Usage
get_tabulated_dist(x)
Arguments
| x | An object of class 'IBM_test'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'IBM_test'
Description
Provide the tabulated distribution that allows to define the extreme quantile against which the test statistic is compared.
Usage
## S3 method for class 'IBM_test'
get_tabulated_dist(x)
Arguments
| x | An object of class 'IBM_test'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Equality of known components in two admixture models
Description
Test if the known component distributions coming from two admixture models are identical.
Usage
is_equal_knownComp(admixMod1, admixMod2)
Arguments
| admixMod1 | An object of class 'admix_model' related to the first admixture model. | 
| admixMod2 | An object of class 'admix_model' related to the second admixture model. | 
Value
A boolean (TRUE if the known components are the same, otherwise FALSE).
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Examples
admixMod1 <- admix_model(knownComp_dist = "norm",
                         knownComp_param = list("mean"=0, "sd"=1))
admixMod2 <- admix_model(knownComp_dist = "norm",
                         knownComp_param = list("mean"=0, "sd"=1))
is_equal_knownComp(admixMod1, admixMod2)
admixMod1 <- admix_model(knownComp_dist = "multinom",
                         knownComp_param = list("size"=1, "prob"=c(0.2,0.5,0.3)))
admixMod2 <- admix_model(knownComp_dist = "multinom",
                         knownComp_param = list("size"=1, "prob"=c(0.2,0.4,0.4)))
is_equal_knownComp(admixMod1, admixMod2)
Dataset of heliocentric velocity for the Milky Way
Description
Dataset of heliocentric velocity for the Milky Way
Usage
milkyWay
Format
A data frame with 170,601 rows and 1 column:
- V1
- Heliocentric velocity measurements of the Milky way 
Source
https://www.aanda.org/articles/aa/full_html/2018/08/aa32905-18/aa32905-18.html
References
Walker MG, Mateo M, Olszewski EW, Gnedin OY, Wang X, Sen B, Woodroofe M (2007). “Velocity Dispersion Profiles of Seven Dwarf Spheroidal Galaxies.” The Astrophysical Journal, 667(1), L53–L56. ISSN 1538-4357, doi:10.1086/521998, http://dx.doi.org/10.1086/521998.
Dataset of deaths statistics in 11 european countries
Description
Dataset of deaths statistics in 11 european countries
Usage
mortality_sample
Format
Dataset providing the exposure-to-death (population size) and number of deaths for males in 11 european countries, between 1908 and 2020, with ages ranging from 30 years old to 85 years old. Exported from the Human Mortality Database (HMD). The two first lists relate to some subsample of the population size and number of deaths in those countries, with random sampling from the original dataset.
An evolving data frame of exposure-to-death and number of deaths in Belgium, Switzerland, Denmark, Spain, Finland, France, United Kingdom, Italia, The Netherlands, Norway and Sweden.
- XP
- A list of eleven elements (one for each country) giving a subset of the exposure-to-death (or reduced population size), each element having 56 rows (ages 30-85) and 113 columns (period 1908-2020) 
- DX
- A list of eleven elements (one for each country) giving a subset of the number of deaths, each element having 56 rows (ages 30-85) and 113 columns (period 1908-2020) 
- names
- A list of eleven elements giving the names of the countries, in the same order as the elements in other lists 
Source
Equality test of two unknown component distributions using polynomial expansions
Description
Tests the null hypothesis (H0: f1=f2) using the decomposition of unknown component densities of two admixture distributions in an adequate orthonormal polynomial basis. Recall that we have two admixture models with respective probability density functions (pdf) l1 = p1*f1 + (1-p1)g1 and l2 = p2f2 + (1-p2)*g2, where g1 and g2 are the only known elements and l1 and l2 are observed. The admixture weights p1 and p2 thus have to be estimated. For further information on this method, see 'Details' below.
Usage
orthobasis_test(
  samples,
  admixMod,
  conf_level = 0.95,
  est_method = c("BVdk", "PS"),
  ask_poly_param = FALSE,
  K = 3,
  s = 0.25,
  nb_echBoot = 100,
  support = c("Real", "Integer", "Positive", "Bounded.continuous", "Bounded.discrete"),
  bounds_supp = NULL,
  ...
)
Arguments
| samples | List of the two samples, each one following the mixture distribution given by l = p*f + (1-p)*g, with f and p unknown and g known. | 
| admixMod | A list of objects of class admix_model, containing useful information about distributions and parameters. | 
| conf_level | The confidence level, default to 95 percent. Equals 1-alpha, where alpha is the level of the test (type-I error). | 
| est_method | Estimation method to get the component weights, either 'PS' (Patra and Sen estimation) or 'BVdk' (Bordes and Vendekerkhove estimation). Choosing 'PS' requires to specify the number of bootstrap samples. | 
| ask_poly_param | (default to FALSE) If TRUE, ask the user to choose both the order 'K' of expansion coefficients in the orthonormal polynomial basis, and the penalization rate 's' involved on the penalization rule for the test. | 
| K | (K > 0, default to 3) If not asked (see the previous argument), number of coefficients considered for the polynomial basis expansion. | 
| s | (in ]0,1/2[, default to 0.25) If not asked (see the previous argument), rate at which the normalization factor is set in the penalization rule for model selection (in ]0,1/2[). Low values of 's' favors the detection of alternative hypothesis. See reference below. [, default to 0.25) If not asked (see the previous argument), rate at which the normalization factor is set in the penalization rule for model selection (in ]: R:,%20default%20to%200.25)%20If%20not%20asked%20(see%20the%20previous%20argument),%20rate%20at%20which%20the%20normalization%20factor%20is%20set%20in%0A%20%20%20%20%20%20%20%20%20%20the%20penalization%20rule%20for%20model%20selection%20(in%20 | 
| nb_echBoot | (default to 100) Number of bootstrap samples, useful when choosing 'PS' estimation method. | 
| support | Support of the probability distributions, useful to choose the appropriate polynomial orthonormal basis. One of 'Real', 'Integer', 'Positive', or 'Bounded.continuous'. | 
| bounds_supp | (default to NULL) Useful if support = 'Bounded.continuous', a list of minimum and maximum bounds, specified as following: list( list(min.f1,min.g1,min.f2,min.g2) , list(max.f1,max.g1,max.f2,max.g2) ) | 
| ... | Optional arguments to estim_BVdk or estim_PS, depending on the chosen argument 'est_method' (see above). | 
Value
An object of class orthobasis_test, containing ten attributes: 1) the number of populations under study (2 in this case); 2) the sizes of samples; 3) the information about the known component distribution; 4) the reject decision of the test; 5) the confidence level of the test, 6) the p-value of the test; 7) the value of the test statistic; 8) the variance of the test statistic at each order in the polynomial orthobasis expansion; 9) the selected rank (order) for the test statistic; 10) a vector of estimates, related to the estimated mixing proportions in the two samples.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
References
Milhaud X, Pommeret D, Salhi Y, Vandekerkhove P (2022). “Semiparametric two-sample admixture components comparison test: The symmetric case.” Journal of Statistical Planning and Inference, 216, 135-150. ISSN 0378-3758, doi:10.1016/j.jspi.2021.05.010.
Examples
## Not run: 
#### Under the null hypothesis H0.
mixt1 <- twoComp_mixt(n = 300, weight = 0.77,
                      comp.dist = list("norm", "exp"),
                      comp.param = list(list("mean" = 1, "sd" = 1),
                                        list("rate" = 0.33)))
data1 <- get_mixture_data(mixt1)
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                        knownComp_param = mixt1$comp.param[[2]])
mixt2 <- twoComp_mixt(n = 500, weight = 0.62,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = 1, "sd" = 1),
                                        list("mean" = -2, "sd" = 0.5)))
data2 <- get_mixture_data(mixt2)
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                        knownComp_param = mixt2$comp.param[[2]])
## Test procedure:
orthobasis_test(samples = list(data1,data2), admixMod = list(admixMod1,admixMod2),
                conf_level = 0.95, est_method = 'BVdk', support = 'Real')
## End(Not run)
Plot method for objects of class 'admix_model'
Description
Plots the probability density function of the known component of the admixture model, where we recall that an admixture model has probability density function (pdf) l_i such that: l_i = p_i * f_i + (1-p_i) * g_i, with g_i the known component density. The unknown quantities are therefore p_i and f_i.
Usage
## S3 method for class 'admix_model'
plot(x, ...)
Arguments
| x | An object of class 'admix_model'. | 
| ... | A list of additional parameters belonging to the default method. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Examples
plot(admix_model(knownComp_dist = "norm", knownComp_param = list("mean"=0, "sd"=1)))
plot(admix_model(knownComp_dist = "pois", knownComp_param = list("lambda"=1.5)))
Plot method for object of class 'decontaminated_density'
Description
Plot the decontaminated density of the unknown component from some admixture model, after inversion of the admixture cumulative distribution functions.
Usage
## S3 method for class 'decontaminated_density'
plot(x, x_val = NULL, add_plot = FALSE, ...)
Arguments
| x | An object of class 'decontaminated_density' (see ?decontaminated_density). | 
| x_val | (numeric) A vector of points at which to evaluate the probability mass/density function. | 
| add_plot | (default to FALSE) A boolean specifying if one plots the decontaminated density over an existing plot. Used for visual comparison purpose. | 
| ... | Arguments to be passed to generic method 'plot', such as graphical parameters (see ?par). | 
Details
The decontaminated density is obtained by inverting the admixture density, given by l = p*f + (1-p)*g, to isolate the unknown component f after having estimated p and l.
Value
The plot of the decontaminated density.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Examples
####### Continuous support:
## Simulate mixture data:
mixt1 <- twoComp_mixt(n = 400, weight = 0.4,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = 3, "sd" = 0.5),
                                        list("mean" = 0, "sd" = 1)))
mixt2 <- twoComp_mixt(n = 350, weight = 0.6,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean" = 3, "sd" = 0.5),
                                        list("mean" = 5, "sd" = 2)))
data1 <- get_mixture_data(mixt1)
data2 <- get_mixture_data(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1,data2),
                   admixMod = list(admixMod1,admixMod2), est_method = 'PS')
prop <- get_mixing_weights(est)
## Determine the decontaminated version of the unknown density by inversion:
res1 <- decontaminated_density(sample1 = data1, estim.p = prop[1],
                               admixMod = admixMod1)
res2 <- decontaminated_density(sample1 = data2, estim.p = prop[2],
                               admixMod = admixMod2)
## Use appropriate sequence of x values:
plot(x = res1)
plot(x = res2, col="red", add_plot=TRUE)
plot(x = res1, x_val = seq(from=0, to=6, length.out=100), type = "l")
plot(x = res2, x_val = seq(from=0, to=6, length.out=100), col="red", add_plot=TRUE)
####### Countable discrete support:
mixt1 <- twoComp_mixt(n = 4000, weight = 0.7,
                      comp.dist = list("pois", "pois"),
                      comp.param = list(list("lambda" = 3),
                                        list("lambda" = 2)))
mixt2 <- twoComp_mixt(n = 3500, weight = 0.85,
                      comp.dist = list("pois", "pois"),
                      comp.param = list(list("lambda" = 3),
                                        list("lambda" = 4)))
data1 <- get_mixture_data(mixt1)
data2 <- get_mixture_data(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1,data2),
                   admixMod = list(admixMod1,admixMod2), est_method = "IBM")
prop <- get_mixing_weights(est)
## Determine the decontaminated version of the unknown density by inversion:
res1 <- decontaminated_density(sample1 = data1, estim.p = prop[1],
                               admixMod = admixMod1)
res2 <- decontaminated_density(sample1 = data2, estim.p = prop[2],
                               admixMod = admixMod2)
## Use appropriate sequence of x values:
plot(x = res1, add_plot = FALSE)
plot(x = res2, add_plot = TRUE, col = "red")
####### Finite discrete support:
mixt1 <- twoComp_mixt(n = 4000, weight = 0.7,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size" = 1, "prob" = c(0.3,0.4,0.3)),
                                        list("size" = 1, "prob" = c(0.6,0.3,0.1))))
mixt2 <- twoComp_mixt(n = 3500, weight = 0.85,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size" = 1, "prob" = c(0.3,0.4,0.3)),
                                        list("size" = 1, "prob" = c(0.2,0.6,0.2))))
data1 <- get_mixture_data(mixt1)
data2 <- get_mixture_data(mixt2)
## Define the admixture models:
admixMod1 <- admix_model(knownComp_dist = mixt1$comp.dist[[2]],
                         knownComp_param = mixt1$comp.param[[2]])
admixMod2 <- admix_model(knownComp_dist = mixt2$comp.dist[[2]],
                         knownComp_param = mixt2$comp.param[[2]])
## Estimation:
est <- admix_estim(samples = list(data1,data2),
                   admixMod = list(admixMod1,admixMod2), est_method = "IBM")
prop <- get_mixing_weights(est)
## Determine the decontaminated version of the unknown density by inversion:
res1 <- decontaminated_density(sample1 = data1, estim.p = prop[1],
                               admixMod = admixMod1)
res2 <- decontaminated_density(sample1 = data2, estim.p = prop[2],
                               admixMod = admixMod2)
## Use appropriate sequence of x values:
plot(x = res1, add_plot = FALSE)
plot(x = res2, add_plot = TRUE, col = "red")
Plots several mixture densities on the same graph
Description
Plots the empirical densities of the samples with optional arguments to improve the visualization.
Usage
## S3 method for class 'twoComp_mixt'
plot(x, add_plot = FALSE, ...)
Arguments
| x | Object of class 'twoComp_mixt' from which the density will be plotted. | 
| add_plot | (default to FALSE) Option to plot another mixture distribution on the same graph. | 
| ... | further classical arguments and graphical parameters for methods plot and hist. | 
Value
a plot with the densities of the samples provided as inputs.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Print method for object of class 'admix_cluster'
Description
Print the main results when clustering the unknown component distributions coming from various admixture samples, i.e. the obtained clusters.
Usage
## S3 method for class 'admix_cluster'
print(x, ...)
Arguments
| x | An object of class 'admix_cluster' (see ?admix_clustering). | 
| ... | further arguments passed to or from other methods. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Print method for object of class 'admix_estim'
Description
Print method for object of class 'admix_estim'
Usage
## S3 method for class 'admix_estim'
print(x, ...)
Arguments
| x | An object of class 'admix_estim' (see ?admix_estim). | 
| ... | further arguments passed to or from other methods. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Print method for objects of class 'admix_model'
Description
Print an object of class 'admix_mod'. An admixture model has probability density function (pdf) l_i such that: l_i = p_i * f_i + (1-p_i) * g_i, with g_i the known component density. The unknown quantities are therefore p_i and f_i.
Usage
## S3 method for class 'admix_model'
print(x, ...)
Arguments
| x | An object of class 'admix_model'. | 
| ... | A list of additional parameters belonging to the default method. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Print method for object of class 'decontaminated_density'
Description
Print some overview of the decontaminated density function.
Usage
## S3 method for class 'decontaminated_density'
print(x, ...)
Arguments
| x | An object of class 'decontaminated_density' (see ?decontaminated_density). | 
| ... | Arguments to be passed to generic method 'plot', such as graphical parameters (see par). | 
Value
The function related to the estimated decontaminated density.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Print method for objects 'estim_BVdk'
Description
Print the results stored in an object of class 'estim_BVdk'.
Usage
## S3 method for class 'estim_BVdk'
print(x, ...)
Arguments
| x | An object of class 'estim_BVdk'. | 
| ... | A list of additional parameters belonging to the default method. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Print method for objects of class 'estim_IBM'
Description
Print the results stored in an object of class 'estim_IBM'.
Usage
## S3 method for class 'estim_IBM'
print(x, ...)
Arguments
| x | An object of class 'estim_IBM'. | 
| ... | A list of additional parameters belonging to the default method. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Print method for objects of class 'estim_PS'
Description
Print all the attributes of objects of class 'estim_PS'. Results of estimated quantities in an admixture using Patra and Sen approach
Usage
## S3 method for class 'estim_PS'
print(x, ...)
Arguments
| x | An object of class 'estim_PS'. | 
| ... | further arguments passed to or from other methods. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Print method for objects 'twoComp_mixt'
Description
Print an object of class 'twoComp_mixt'. A two-component mixture model has probability density function (pdf) l such that: l = p * f + (1-p) * g, where p is the mixing proportion, and f and g are the component distributions.
Usage
## S3 method for class 'twoComp_mixt'
print(x, ...)
Arguments
| x | An object of class 'twoComp_mixt'. | 
| ... | A list of additional parameters belonging to the default method. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'htest'
Description
Provide the decision of the statistical test: reject or do not reject the null hypothesis.
Usage
reject_nullHyp(x)
Arguments
| x | An object of class 'htest'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'htest'
Description
Provide the decision of the statistical test: reject or do not reject the null hypothesis.
Usage
## S3 method for class 'htest'
reject_nullHyp(x)
Arguments
| x | An object of class 'htest'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Dataset of Short-term Mortality Fluctuations (STMF) from HMD
Description
Restricted to 6 countries: Belgium, France, Italy, Netherlands, Spain, Germany. Weekly death counts provide the most objective and comparable way of assessing the scale of short-term mortality elevations across countries and time. Extraction date from the Human Mortality Database (HMD): 09/21/2020.
Usage
stmf_small
Format
A data frame with 88146 rows and 19 variables:
- CountryCode
- Mortality database country code 
- Year
- Year 
- Week
- Week number 
- Sex
- Gender ('m': male, 'f': female, 'b': both) 
- D0_14
- Age range 0-14 
- D15_64
- Age range 15-64 
- D65_74
- Age range 65-74 
- D75_84
- Age range 75-84 
- D85p
- Age range 85-+ 
- DTotal
- Count of deaths for all ages combined 
- R0_14
- Crude death rate for age range 0-14 
- R15_64
- Crude death rate for age range 15-64 
- R65_74
- Crude death rate for age range 65-74 
- R75_84
- Crude death rate for age range 75-84 
- R85p
- Crude death rate for age range 85-+ 
- RTotal
- Crude death rate for all ages combined 
- Split
- Indicates if data were split from aggregated age groups (0 if the original data has necessary detailed age scale). For example, if the original age scale was 0-4, 5-29, 30-65, 65+, then split will be equal to 1 
- SplitSex
- Indicates if the original data are available by sex (0) or data are interpolated (1) 
- Forecast
- Equals 1 for all years where forecasted population exposures were used to calculate weekly death rates 
Source
Summary method for object of class 'admix_cluster'
Description
Summarizes the results obtained when clustering the unknown component distributions coming from various admixture samples.
Usage
## S3 method for class 'admix_cluster'
summary(object, ...)
Arguments
| object | An object of class 'admix_cluster' (see ?admix_clustering). | 
| ... | further arguments passed to or from other methods. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Summary method for object of class 'admix_estim'
Description
Summarize the estimated weight(s) of the unknown component(s), and admixture model(s) under study. Recall that an admixture model follows the cumulative distribution function (CDF) L, where L = p*F + (1-p)*G, with G a known CDF, and p and F unknown quantities.
Usage
## S3 method for class 'admix_estim'
summary(object, ...)
Arguments
| object | An object of class 'admix_estim' (see ?admix_estim). | 
| ... | further arguments passed to or from other methods. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Summary method for objects of class 'admix_model'
Description
Summarizes the information related to the known component of the two-component mixture. An admixture model has probability density function (pdf) l_i such that: l_i = p_i * f_i + (1-p_i) * g_i, with g_i the known component density. The unknown quantities are therefore p_i and f_i.
Usage
## S3 method for class 'admix_model'
summary(object, ...)
Arguments
| object | An object of class 'admix_model'. | 
| ... | A list of additional parameters belonging to the default method. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Summary method for object of class 'decontaminated_density'
Description
Summarizes information about the estimated decontaminated density function.
Usage
## S3 method for class 'decontaminated_density'
summary(object, ...)
Arguments
| object | An object of class 'decontaminated_density' (see ?decontaminated_density). | 
| ... | Arguments to be passed to generic method 'summary'. | 
Value
Classical statistical indicators about the decontaminated density.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Summary method for objects 'estim_BVdk'
Description
Summarizes the results stored in an object of class 'estim_BVdk'.
Usage
## S3 method for class 'estim_BVdk'
summary(object, ...)
Arguments
| object | An object of class 'estim_BVdk'. | 
| ... | A list of additional parameters belonging to the default method. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Summary method for objects 'estim_IBM'
Description
Summarizes the results stored in an object of class 'estim_IBM'.
Usage
## S3 method for class 'estim_IBM'
summary(object, ...)
Arguments
| object | An object of class 'estim_IBM'. | 
| ... | A list of additional parameters belonging to the default method. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Summary method for objects 'estim_PS'
Description
Summarizes the results stored in an object of class 'estim_PS'.
Usage
## S3 method for class 'estim_PS'
summary(object, ...)
Arguments
| object | An object of class 'estim_PS'. | 
| ... | A list of additional parameters belonging to the default method. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Summary method for objects 'twoComp_mixt'
Description
Provides statistical indicators of an object of class 'twoComp_mixt'. A two-component mixture model has probability density function (pdf) l such that: l = p * f + (1-p) * g, where p is the mixing proportion, and f and g are the component distributions.
Usage
## S3 method for class 'twoComp_mixt'
summary(object, ...)
Arguments
| object | An object of class 'twoComp_mixt'. | 
| ... | A list of additional parameters belonging to the default method. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Simulation of a two-component mixture model
Description
Simulate a two-component mixture model following the probability density function (pdf) l such that l = p*f + (1-p)*g, with f and g the mixture component distributions, and p the mixing weight.
Usage
twoComp_mixt(
  n = 1000,
  weight = 0.5,
  comp.dist = list("norm", "norm"),
  comp.param = list(list(mean = 0, sd = 1), list(mean = 2, sd = 1))
)
Arguments
| n | Number of observations to be simulated. | 
| weight | Weight of the first component distribution (distribution f) in the mixture. | 
| comp.dist | A list of two elements corresponding to the component distributions (with available names listed in object 'Distribution.df' in package EnvStats) involved in the mixture model. These elements respectively refer to the two component distributions f and g. By convention, in the framework of admixture models where one of the two components is unknown, the first element of the list corresponds to the 'unknown' component distribution, whereas the second one refers to the known one. | 
| comp.param | A list of two elements corresponding to the parameters of the component distributions, each element being a list itself. The names used in each list must correspond to the available parameters listed in object 'Distribution.df' in package EnvStats. These elements respectively refer to the parameters of f and g distributions of the mixture model. By convention, in the framework of admixture models where one of the two components is unknown, the first element of the list corresponds to the 'unknown' component parameters, whereas the second one refers to the known ones. | 
Value
An object of class twoComp_mixt, containing eight attributes: 1) the number of simulated observations, 2) the simulated mixture data, 3) the support of the distributions, 4) the name of the component distributions, 5) the name of the parameters of the component distributions and their values, 6) the mixing proportion, 7) the observations coming from the first component, 8) the observations coming from the second component.
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Examples
## Mixture of continuous random variables:
sim.X <- twoComp_mixt(n = 2000, weight = 0.5,
                      comp.dist = list("norm", "norm"),
                      comp.param = list(list("mean"=3, "sd"=0.5),
                                        list("mean"=0, "sd"=1)))
print(sim.X)
sim.Y <- twoComp_mixt(n = 1200, weight = 0.7,
                      comp.dist = list("norm", "exp"),
                      comp.param = list(list("mean"=-3, "sd"=0.5),
                                        list("rate"=1)))
plot(sim.X, xlim=c(-5,5), ylim=c(0,0.5))
plot(sim.Y, add_plot = TRUE, xlim=c(-5,5), ylim=c(0,0.5), col = "red")
## Mixture of discrete random variables:
sim.X <- twoComp_mixt(n = 2000, weight = 0.5,
                      comp.dist = list("multinom", "multinom"),
                      comp.param = list(list("size"=1, "prob"=c(0.3,0.4,0.3)),
                                        list("size"=1, "prob"=c(0.1,0.2,0.7))))
plot(sim.X)
Extractor for object of class 'htest'
Description
Provide the selected rank of the test statistic (connected to the expansion order of the densities in the orthonormal polynomial basis if method 'poly' was chosen; or to the number of terms, i.e. discrepancies between couples of samples, included in the test statistic).
Usage
which_rank(x)
Arguments
| x | An object of class 'htest'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com
Extractor for object of class 'htest'
Description
Provide the selected rank of the test statistic (connected to the expansion order of the densities in the orthonormal polynomial basis if method 'poly' was chosen; or to the number of terms, i.e. discrepancies between couples of samples, included in the test statistic).
Usage
## S3 method for class 'htest'
which_rank(x)
Arguments
| x | An object of class 'htest'. | 
Author(s)
Xavier Milhaud xavier.milhaud.research@gmail.com