Help for package npRmpi

Version:

0.60-20

Date:

2026-02-11

Imports:

boot, cubature, methods, quadprog, quantreg, stats, parallel

Suggests:

crs, MASS, logspline, ks, testthat, np, Rmpi

Title:

Parallel Nonparametric Kernel Smoothing Methods for Mixed Data Types Using 'MPI'

Maintainer:

Jeffrey S. Racine <racinej@mcmaster.ca>

Description:

Nonparametric (and semiparametric) kernel methods that seamlessly handle a mix of continuous, unordered, and ordered factor data types. This package is a parallel implementation of the 'np' package based on the 'MPI' specification that incorporates the 'Rmpi' package (Hao Yu <hyu@stats.uwo.ca>) with minor modifications and we are extremely grateful to Hao Yu for his contributions to the 'R' community. We would like to gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC, https://www.nserc-crsng.gc.ca/), the Social Sciences and Humanities Research Council of Canada (SSHRC, https://www.sshrc-crsh.gc.ca/), and the Shared Hierarchical Academic Research Computing Network (SHARCNET, https://sharcnet.ca/). We would also like to acknowledge the contributions of the 'GNU GSL' authors. In particular, we adapt the 'GNU GSL' B-spline routine 'gsl_bspline.c' adding automated support for quantile knots (in addition to uniform knots), providing missing functionality for derivatives, and for extending the splines beyond their endpoints.

License:

GPL-2 | GPL-3 [expanded from: GPL]

URL:

https://github.com/JeffreyRacine/R-Package-np

BugReports:

https://github.com/JeffreyRacine/R-Package-np/issues

Repository:

CRAN

NeedsCompilation:

yes

Packaged:

2026-02-11 20:43:22 UTC; jracine

Author:

Jeffrey S. Racine [aut, cre], Tristen Hayfield [aut], Hao Yu [ctb, cph], The GSL Team [cph], Numerical Recipes Software [cph]

Date/Publication:

2026-02-16 17:20:13 UTC

Parallel Nonparametric Kernel Smoothing Methods for Mixed Data Types

Description

This package provides a variety of nonparametric and semiparametric kernel methods that seamlessly handle a mix of continuous, unordered, and ordered factor data types (unordered and ordered factors are often referred to as ‘nominal’ and ‘ordinal’ categorical variables respectively). A vignette, intended as a gentle introduction to this package and containing many of the examples found in the npRmpi help files, can be accessed via vignette("npRmpi", package="npRmpi").

For a listing of all routines in the npRmpi package type: ‘library(help="npRmpi")’.

Bandwidth selection is a key aspect of sound nonparametric and semiparametric kernel estimation. npRmpi is designed from the ground up to make bandwidth selection the focus of attention. To this end, one typically begins by creating a ‘bandwidth object’ which embodies all aspects of the method, including specific kernel functions, data names, data types, and the like. One then passes these bandwidth objects to other functions, and those functions can grab the specifics from the bandwidth object thereby removing potential inconsistencies and unnecessary repetition. Furthermore, many functions such as plot (which automatically calls npplot) can work with the bandwidth object directly without having to do the subsequent companion function evaluation.

As of npRmpi version 0.20-0, we allow the user to combine these steps. When using npRmpi versions 0.20-0 and higher, if the first step (bandwidth selection) is not performed explicitly then the second step will automatically call the omitted first step bandwidth selector using defaults unless otherwise specified, and the bandwidth object could then be retrieved retroactively if so desired via objectname$bws. Furthermore, options for bandwidth selection will be passed directly to the bandwidth selector function. Note that the combined approach would not be a wise choice for certain applications such as when bootstrapping (as it would involve unnecessary computation since the bandwidths would properly be those for the original sample and not the bootstrap resamples) or when conducting quantile regression (as it would involve unnecessary computation when different quantiles are computed from the same conditional cumulative distribution estimate).

There are two ways in which you can interact with functions in npRmpi, either i) using data frames, or ii) using a formula interface, where appropriate.

To some, it may be natural to use the data frame interface. The R data.frame function preserves a variable's type once it has been cast (unlike cbind, which we avoid for this reason). If you find this most natural for your project, you first create a data frame casting data according to their type (i.e., one of continuous (default, numeric), factor, ordered). Then you would simply pass this data frame to the appropriate npRmpi function, for example npudensbw(dat=data).

To others, however, it may be natural to use the formula interface that is used for the regression examples, among others. For nonparametric regression functions such as npreg, you would proceed as you would using lm (e.g., bw <- npregbw(y~factor(x1)+x2)) except that you would of course not need to specify, e.g., polynomials in variables, interaction terms, or create a number of dummy variables for a factor. Every function in npRmpi supports both interfaces, where appropriate.

Note that if your factor is in fact a character string such as, say, X being either "MALE" or "FEMALE", npRmpi will handle this directly, i.e., there is no need to map the string values into unique integers such as (0,1). Once the user casts a variable as a particular data type (i.e., factor, ordered, or continuous (default, numeric)), all subsequent methods automatically detect the type and use the appropriate kernel function and method where appropriate.
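The casting described above can be illustrated with base R alone. The following is a minimal sketch on made-up data (the variable names income, sex, and rating are hypothetical); it shows that data.frame keeps each column's type once cast, that a character-valued variable can be cast to a factor directly, and that cbind coerces everything to a common type.

```r
## Hypothetical toy data; the variable names are illustrative only.
sex    <- c("MALE", "FEMALE", "FEMALE", "MALE")
rating <- c("low", "high", "medium", "low")
income <- c(31.5, 42.0, 28.3, 55.1)

## data.frame() keeps each column's type once it has been cast.
## The character strings are cast directly; no integer recoding is needed.
dat <- data.frame(income = income,
                  sex    = factor(sex),
                  rating = ordered(rating, levels = c("low", "medium", "high")))

## Report the leading class of each column.
print(sapply(dat, function(col) class(col)[1]))

## cbind() coerces all columns to a common type (here character),
## so the numeric and factor casts are lost.
m <- cbind(income, sex, rating)
print(class(m[, "income"]))
```

A data frame built this way is what one would then pass to a bandwidth selector such as npudensbw.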

All estimation methods are fully multivariate, i.e., there are no limitations on the number of variables one can model (or number of observations for that matter). Execution time for most routines is, however, exponentially increasing in the number of observations and increases with the number of variables involved.

Nonparametric methods include unconditional density (distribution), conditional density (distribution), regression, mode, and quantile estimators along with gradients where appropriate, while semiparametric methods include single index, partially linear, and smooth (i.e., varying) coefficient models.

A number of tests are included such as consistent specification tests for parametric regression and quantile regression models along with tests of significance for nonparametric regression.

A variety of bootstrap methods for computing standard errors, nonparametric confidence bounds, and bias-corrected bounds are implemented.

A variety of bandwidth methods are implemented including fixed, nearest-neighbor, and adaptive nearest-neighbor.

A variety of data-driven methods of bandwidth selection are implemented, while the user can specify their own bandwidths should they so choose (either a raw bandwidth or scaling factor).

A flexible plotting utility, npplot (which is automatically invoked by plot), facilitates graphing of multivariate objects. An example for creating postscript graphs using the npplot utility and pulling this into a LaTeX document is provided.

The function npksum allows users to create or implement their own kernel estimators or tests should they so desire.

The underlying functions are written in C for computational efficiency. Despite this, due to their nature, data-driven bandwidth selection methods involving multivariate numerical search can be time-consuming, particularly for large datasets. The npRmpi package uses the Rmpi wrapper so that this software can be deployed in a clustered computing environment to facilitate computation involving large datasets.

To cite the npRmpi package, type citation("npRmpi") from within R for details.

Details

The kernel methods in npRmpi employ the so-called ‘generalized product kernels’ found in Hall, Racine, and Li (2004), Li, Lin, and Racine (2013), Li, Ouyang, and Racine (2013), Li and Racine (2003), Li and Racine (2004), Li and Racine (2007), Li and Racine (2010), Ouyang, Li, and Racine (2006), and Racine and Li (2004), among others. For details on a particular method, kindly refer to the original references listed above.

We briefly describe the particulars of various univariate kernels used to generate the generalized product kernels that underlie the kernel estimators implemented in the npRmpi package. In a nutshell, the generalized kernel functions that underlie the kernel estimators in npRmpi are formed by taking the product of univariate kernels such as those listed below. When you cast your data as a particular type (continuous, factor, or ordered factor) in a data frame or formula, the routines will automatically recognize the type of variable being modelled and use the appropriate kernel type for each variable in the resulting estimator.

Second Order Gaussian (x is continuous)

k(z) = \exp(-z^2/2)/\sqrt{2\pi} where z=(x_i-x)/h, and h>0.

Second Order Truncated Gaussian (x is continuous)

k(z) = (\exp(-z^2/2)-\exp(-b^2/2))/(\textrm{erf}(b/\sqrt{2})\sqrt{2\pi}-2b\exp(-b^2/2)) where z=(x_i-x)/h, b>0, |z|\le b and h>0.

See nptgauss for details on modifying b.

Second Order Epanechnikov (x is continuous)

k(z) = 3\left(1 - z^2/5\right)/(4\sqrt{5}) if z^2<5, 0 otherwise, where z=(x_i-x)/h, and h>0.

Uniform (x is continuous)

k(z) = 1/2 if |z|<1, 0 otherwise, where z=(x_i-x)/h, and h>0.
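As a check on the four continuous kernels above, the following base R sketch codes each formula as written and verifies numerically that it integrates to one. The function names are hypothetical, and b = 3 is an assumed value chosen only for illustration; the identity erf(b/\sqrt{2}) = 2\Phi(b) - 1 is used because base R has no erf function.

```r
## Second-order continuous kernels as written above, with z = (x_i - x)/h.
k_gaussian <- function(z) exp(-z^2 / 2) / sqrt(2 * pi)

## Truncated Gaussian; erf(b/sqrt(2)) is computed as 2*pnorm(b) - 1.
## b = 3 is an illustrative assumption.
k_tgauss <- function(z, b = 3) {
  denom <- (2 * pnorm(b) - 1) * sqrt(2 * pi) - 2 * b * exp(-b^2 / 2)
  ifelse(abs(z) <= b, (exp(-z^2 / 2) - exp(-b^2 / 2)) / denom, 0)
}

k_epanechnikov <- function(z) {
  ifelse(z^2 < 5, 3 * (1 - z^2 / 5) / (4 * sqrt(5)), 0)
}

k_uniform <- function(z) ifelse(abs(z) < 1, 1 / 2, 0)

## Each kernel should integrate to one over its support.
areas <- c(gaussian     = integrate(k_gaussian, -Inf, Inf)$value,
           tgauss       = integrate(k_tgauss, -3, 3)$value,
           epanechnikov = integrate(k_epanechnikov, -sqrt(5), sqrt(5))$value,
           uniform      = integrate(k_uniform, -1, 1)$value)
print(round(areas, 6))
```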

Aitchison and Aitken (x is a (discrete) factor)

l(x_i,x,\lambda) = 1 - \lambda if x_i=x, and \lambda/(c-1) if x_i \neq x, where c is the number of (discrete) outcomes assumed by the factor x.

Note that \lambda must lie between 0 and (c-1)/c.
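A small base R sketch of the Aitchison and Aitken kernel may help fix ideas. The function name l_aa and the four factor levels are hypothetical; n_cat plays the role of c in the formula. The sketch checks that the weights sum to one across the c outcomes and that the upper bound \lambda = (c-1)/c assigns equal weight 1/c to every outcome.

```r
## Aitchison and Aitken kernel; n_cat corresponds to c in the text.
l_aa <- function(xi, x, lambda, n_cat) {
  ifelse(xi == x, 1 - lambda, lambda / (n_cat - 1))
}

lev   <- c("a", "b", "c", "d")   # hypothetical factor levels
n_cat <- length(lev)

## Weights given to each outcome when evaluating at x = "b".
w <- l_aa(lev, "b", lambda = 0.2, n_cat = n_cat)
print(sum(w))

## At the upper bound every outcome receives the same weight.
w_max <- l_aa(lev, "b", lambda = (n_cat - 1) / n_cat, n_cat = n_cat)
print(w_max)
```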

Wang and van Ryzin (x is a (discrete) ordered factor)

l(x_i,x,\lambda) = 1 - \lambda if |x_i-x|=0, and ((1-\lambda)/2)\lambda^{|x_i-x|} if |x_i - x|\ge1.

Note that \lambda must lie between 0 and 1.
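The Wang and van Ryzin kernel can be sketched in the same way. The function name l_wvr is hypothetical, and the integer support is truncated to -200:200 purely so that the sum can be computed; on that range the omitted geometric tail is negligible for \lambda = 0.3.

```r
## Wang and van Ryzin kernel for an ordered factor coded as integers.
l_wvr <- function(xi, x, lambda) {
  d <- abs(xi - x)
  ifelse(d == 0, 1 - lambda, ((1 - lambda) / 2) * lambda^d)
}

## Truncated integer support (an assumption made for the numerical check).
support <- -200:200
total   <- sum(l_wvr(support, 0, lambda = 0.3))
print(total)
```

The weights decay geometrically in the distance |x_i - x|, which is what distinguishes the ordered kernel from the unordered one.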

Li and Racine (x is a (discrete) factor)

l(x_i,x,\lambda) = 1 if x_i=x, and \lambda if x_i \neq x.

Note that \lambda must lie between 0 and 1.

Li and Racine Normalised for Unconditional Objects (x is a (discrete) factor)

l(x_i,x,\lambda) = 1/(1+(c-1)\lambda) if x_i=x, and \lambda/(1+(c-1)\lambda) if x_i \neq x.

Note that \lambda must lie between 0 and 1.

Li and Racine (x is a (discrete) ordered factor)

l(x_i,x,\lambda) = 1 if |x_i-x|=0, and \lambda^{|x_i-x|} if |x_i - x|\ge1.

Note that \lambda must lie between 0 and 1.

Li and Racine Normalised for Unconditional Objects (x is a (discrete) ordered factor)

l(x_i,x,\lambda) = (1-\lambda)/(1+\lambda) if |x_i-x|=0, and ((1-\lambda)/(1+\lambda))\lambda^{|x_i-x|} if |x_i - x|\ge1.

Note that \lambda must lie between 0 and 1.
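The difference between the raw and normalised Li and Racine kernels above can be seen by summing the weights. The following base R sketch uses hypothetical function names, three made-up factor levels, and an integer support truncated to -200:200 for the ordered case; n_cat plays the role of c.

```r
## Unordered factor: raw and normalised forms.
l_lr_u      <- function(xi, x, lambda) ifelse(xi == x, 1, lambda)
l_lr_u_norm <- function(xi, x, lambda, n_cat) {
  l_lr_u(xi, x, lambda) / (1 + (n_cat - 1) * lambda)
}

## Ordered factor (integer coded): raw and normalised forms.
l_lr_o      <- function(xi, x, lambda) lambda^abs(xi - x)
l_lr_o_norm <- function(xi, x, lambda) {
  ((1 - lambda) / (1 + lambda)) * lambda^abs(xi - x)
}

lambda <- 0.25
lev    <- c("a", "b", "c")   # hypothetical levels, so n_cat = 3
sums_u <- c(raw  = sum(l_lr_u(lev, "a", lambda)),
            norm = sum(l_lr_u_norm(lev, "a", lambda, n_cat = length(lev))))

support <- -200:200          # truncated support, for the check only
sums_o  <- c(raw  = sum(l_lr_o(support, 0, lambda)),
             norm = sum(l_lr_o_norm(support, 0, lambda)))

print(sums_u)
print(sums_o)
```

The raw weights sum to 1 + (c-1)\lambda and (1+\lambda)/(1-\lambda) respectively, while the normalised weights sum to one.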

So, if you had two variables, x_{i1} and x_{i2}, and x_{i1} was continuous while x_{i2} was, say, binary (0/1), and you created a data frame of the form X <- data.frame(x1,factor(x2)), then the kernel function used by npRmpi would be K(\cdot)=k(\cdot)\times l(\cdot) where the particular kernel functions k(\cdot) and l(\cdot) would be, say, the second order Gaussian (ckertype="gaussian") and Aitchison and Aitken (ukertype="aitchisonaitken") kernels by default, respectively.
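To make the product form concrete, here is a minimal base R sketch of a joint density estimate built from K = k x l for one continuous and one binary variable. It does not call npRmpi; the simulated data, the function names, and the hand-picked bandwidths h and lambda are all assumptions made for illustration (in practice the bandwidths would come from a data-driven selector).

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)                     # continuous variable
x2 <- factor(rbinom(n, 1, 0.4))    # binary (0/1) variable cast as a factor
X  <- data.frame(x1, x2)

## Univariate kernels: second order Gaussian and Aitchison and Aitken.
k_gaussian <- function(z) exp(-z^2 / 2) / sqrt(2 * pi)
l_aa <- function(xi, x, lambda, n_cat) {
  ifelse(xi == x, 1 - lambda, lambda / (n_cat - 1))
}

## Hypothetical bandwidths chosen by hand, not by cross-validation.
h      <- 0.4
lambda <- 0.1

## Product-kernel estimate of the joint density at (x1 = e1, x2 = e2).
fhat <- function(e1, e2) {
  K <- k_gaussian((X$x1 - e1) / h) * l_aa(X$x2, e2, lambda, nlevels(X$x2))
  mean(K) / h
}

print(fhat(0, "0"))
print(fhat(0, "1"))
```

Because each univariate kernel integrates (or sums) to one, the estimate integrates to one over x1 and the two levels of x2.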

Note that higher order continuous kernels (i.e., fourth, sixth, and eighth order) are derived from the second order kernels given above (see Li and Racine (2007) for details).

For particulars on any given method, kindly see the references listed for the method in question.

Author(s)

Tristen Hayfield <tristen.hayfield@gmail.com>, Jeffrey S. Racine <racinej@mcmaster.ca>

Maintainer: Jeffrey S. Racine <racinej@mcmaster.ca>

We are grateful to John Fox and Achim Zeileis for their valuable input and encouragement. We would like to gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC:www.nserc.ca), the Social Sciences and Humanities Research Council of Canada (SSHRC:www.sshrc.ca), and the Shared Hierarchical Academic Research Computing Network (SHARCNET:www.sharcnet.ca).

References

Aitchison, J. and C.G.G. Aitken (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413-420.

Hall, P. and J.S. Racine and Q. Li (2004), “Cross-validation and the estimation of conditional probability densities,” Journal of the American Statistical Association, 99, 1015-1026.

Li, Q. and J. Lin and J.S. Racine (2013), “Optimal bandwidth selection for nonparametric conditional distribution and quantile functions”, Journal of Business and Economic Statistics, 31, 57-65.

Li, Q. and D. Ouyang and J.S. Racine (2013), “Categorical Semiparametric Varying-Coefficient Models,” Journal of Applied Econometrics, 28, 551-589.

Li, Q. and J.S. Racine (2003), “Nonparametric estimation of distributions with categorical and continuous data,” Journal of Multivariate Analysis, 86, 266-292.

Li, Q. and J.S. Racine (2004), “Cross-validated local linear nonparametric regression,” Statistica Sinica, 14, 485-512.

Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Li, Q. and J.S. Racine (2010), “Smooth varying-coefficient estimation and inference for qualitative and quantitative data,” Econometric Theory, 26, 1-31.

Ouyang, D. and Q. Li and J.S. Racine (2006), “Cross-validation and the estimation of probability distributions with categorical data,” Journal of Nonparametric Statistics, 18, 69-100.

Racine, J.S. and Q. Li (2004), “Nonparametric estimation of regression functions with both categorical and continuous data,” Journal of Econometrics, 119, 99-130.

Pagan, A. and A. Ullah (1999), Nonparametric Econometrics, Cambridge University Press.

Scott, D.W. (1992), Multivariate Density Estimation: Theory, Practice and Visualization, New York: Wiley.

Silverman, B.W. (1986), Density Estimation, London: Chapman and Hall.

Wang, M.C. and J. van Ryzin (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301-309.

1995 British Family Expenditure Survey

Description

British cross-section data consisting of a random sample taken from the British Family Expenditure Survey for 1995. The households consist of married couples with an employed head-of-household between the ages of 25 and 55 years. There are 1655 household-level observations in total.

Usage

data("Engel95")

Format

A data frame with 10 columns, and 1655 rows.

food: expenditure share on food, of type numeric
catering: expenditure share on catering, of type numeric
alcohol: expenditure share on alcohol, of type numeric
fuel: expenditure share on fuel, of type numeric
motor: expenditure share on motor, of type numeric
fares: expenditure share on fares, of type numeric
leisure: expenditure share on leisure, of type numeric
logexp: logarithm of total expenditure, of type numeric
logwages: logarithm of total earnings, of type numeric
nkids: number of children, of type numeric

Source

Richard Blundell and Dennis Kristensen

References

Blundell, R. and X. Chen and D. Kristensen (2007), “Semi-Nonparametric IV Estimation of Shape-Invariant Engel Curves,” Econometrica, 75, 1613-1669.

Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Examples

## Not run: 
## Not run in checks: this IV example is computationally expensive and can
## exceed check time limits in MPI environments.
## Example - compute nonparametric instrumental regression using
## Landweber-Fridman iteration of Fredholm integral equations of the
## first kind.

## We consider an equation with an endogenous regressor (`z') and an
## instrument (`w'). Let y = phi(z) + u where phi(z) is the function of
## interest. Here E(u|z) is not zero hence the conditional mean E(y|z)
## does not coincide with the function of interest, but if there exists
## an instrument w such that E(u|w) = 0, then we can recover the
## function of interest by solving an ill-posed inverse problem.

## The following example is adapted for interactive parallel execution
## in R. Here we spawn 1 slave so that there will be two compute nodes
## (master and slave). Kindly see the batch examples in the demos
## directory (npRmpi/demos) and study them carefully. Also kindly see
## the more extensive examples in the np package itself. See the npRmpi
## vignette for further details on running parallel np programs via
## vignette("npRmpi",package="npRmpi").

## Start npRmpi for interactive execution. If slaves are already running and
## `options(npRmpi.reuse.slaves=TRUE)` (default on some systems), this will
## reuse the existing pool instead of respawning. To change the number of
## slaves, call `npRmpi.stop(force=TRUE)` then restart.
npRmpi.start(nslaves=1)

data(Engel95)

## Sort on logexp (the endogenous regressor) for plotting purposes

Engel95 <- Engel95[order(Engel95$logexp),] 
mpi.bcast.Robj2slave(Engel95)

mpi.bcast.cmd(attach(Engel95),
              caller.execute=TRUE)

mpi.bcast.cmd(model.iv <- npregiv(y=food,z=logexp,w=logwages,method="Landweber-Fridman"),
              caller.execute=TRUE)
phi <- model.iv$phi

## Compute the non-IV regression (i.e. regress y on z)

mpi.bcast.cmd(ghat <- npreg(food~logexp,regtype="ll"),
              caller.execute=TRUE)

## For the plots, restrict focal attention to the bulk of the data
## (i.e. for the plotting area trim out 1/4 of one percent from each
## tail of y and z)

trim <- 0.0025

plot(logexp,food,
     ylab="Food Budget Share",
     xlab="log(Total Expenditure)",
     xlim=quantile(logexp,c(trim,1-trim)),
     ylim=quantile(food,c(trim,1-trim)),
     main="Nonparametric Instrumental Kernel Regression",
     type="p",
     cex=.5,
     col="lightgrey")

lines(logexp,phi,col="blue",lwd=2,lty=2)

lines(logexp,fitted(ghat),col="red",lwd=2,lty=4)

legend(quantile(logexp,trim),quantile(food,1-trim),
       c(expression(paste("Nonparametric IV: ",hat(varphi)(logexp))),
         "Nonparametric Regression: E(food | logexp)"),
       lty=c(2,4),
       col=c("blue","red"),
       lwd=c(2,2))

## For the interactive run only we close the slaves perhaps to proceed
## with other examples and so forth. This is redundant in batch mode.

## Note: on some systems (notably macOS+MPICH), repeatedly spawning and
## tearing down slaves in the same R session can lead to hangs/crashes.
## npRmpi may therefore keep slave daemons alive by default and
## `npRmpi.stop()` performs a "soft close". Use `force=TRUE` to
## actually shut down the slaves.
##
## You can disable reuse via `options(npRmpi.reuse.slaves=FALSE)` or by
## setting the environment variable `NP_RMPI_NO_REUSE_SLAVES=1` before
## loading the package.

npRmpi.stop()               ## soft close (may keep slaves alive)
## npRmpi.stop(force=TRUE)  ## hard close

## Note that in order to exit npRmpi properly avoid quit(), and instead
## use mpi.quit() as follows.

## mpi.bcast.cmd(mpi.quit(),
##               caller.execute=TRUE)

## End(Not run)

Italian GDP Panel

Description

Italian GDP growth panel for 21 regions covering the period 1951-1998 (millions of Lire, 1990=base). There are 1008 observations in total.

Usage

data("Italy")

Format

A data frame with 2 columns, and 1008 rows.

year: the first column, of type ordered
gdp: the second column, of type numeric: millions of Lire, 1990=base

Source

Giovanni Baiocchi

References

Baiocchi, G. (2006), “Economic Applications of Nonparametric Methods,” Ph.D. Thesis, University of York.

Examples

## Not run: 
## Not run in checks: excluded to keep MPI examples stable and check times short.
## The following example is adapted for interactive parallel execution
## in R. Here we spawn 1 slave so that there will be two compute nodes
## (master and slave). Kindly see the batch examples in the demos
## directory (npRmpi/demos) and study them carefully. Also kindly see
## the more extensive examples in the np package itself. See the npRmpi
## vignette for further details on running parallel np programs via
## vignette("npRmpi",package="npRmpi").

## Start npRmpi for interactive execution. If slaves are already running and
## `options(npRmpi.reuse.slaves=TRUE)` (default on some systems), this will
## reuse the existing pool instead of respawning. To change the number of
## slaves, call `npRmpi.stop(force=TRUE)` then restart.
npRmpi.start(nslaves=1)

data("Italy")
mpi.bcast.Robj2slave(Italy)

attach(Italy)

plot(ordered(year), gdp, xlab="Year (ordered factor)",
     ylab="GDP (millions of Lire, 1990=base)")

detach(Italy)

## For the interactive run only we close the slaves perhaps to proceed
## with other examples and so forth. This is redundant in batch mode.

## Note: on some systems (notably macOS+MPICH), repeatedly spawning and
## tearing down slaves in the same R session can lead to hangs/crashes.
## npRmpi may therefore keep slave daemons alive by default and
## `npRmpi.stop()` performs a "soft close". Use `force=TRUE` to
## actually shut down the slaves.
##
## You can disable reuse via `options(npRmpi.reuse.slaves=FALSE)` or by
## setting the environment variable `NP_RMPI_NO_REUSE_SLAVES=1` before
## loading the package.

npRmpi.stop()               ## soft close (may keep slaves alive)
## npRmpi.stop(force=TRUE)  ## hard close

## Note that in order to exit npRmpi properly avoid quit(), and instead
## use mpi.quit() as follows.

## mpi.bcast.cmd(mpi.quit(),
##               caller.execute=TRUE)

## End(Not run)

Compute Optimal Block Length for Stationary and Circular Bootstrap

Description

b.star is a function which computes the optimal block length for the continuous variable data using the method described in Patton, Politis and White (2009).

Usage

b.star(data,
       Kn = NULL,
       mmax= NULL,
       Bmax = NULL,
       c = NULL,
       round = FALSE)

Arguments

data

data, an n x k matrix, each column being a data series.

Kn

See footnote c, page 59, Politis and White (2004). Defaults to ceiling(log10(n)).

mmax

See Politis and White (2004). Defaults to ceiling(sqrt(n))+Kn.

Bmax

See Politis and White (2004). Defaults to ceiling(min(3*sqrt(n),n/3)).

c

See Politis and White (2004). Defaults to qnorm(0.975).

round

whether to round the result or not. Defaults to FALSE.

Details

b.star is a function which computes optimal block lengths for the stationary and circular bootstraps. This allows the use of tsboot from the boot package to be fully automatic by using the output from b.star as an input to the argument l = in tsboot. See below for an example.
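The hand-off to tsboot described above might look as follows. This is a sketch under stated assumptions: the AR(1) series is simulated, the statistic (the sample mean) and R = 199 are arbitrary choices, and the fallback block length of 10 used when npRmpi is not installed is a placeholder so that the code still runs, not a recommendation. Column 1 of the b.star result is used because it holds the stationary bootstrap block length, and sim = "geom" requests the stationary bootstrap in tsboot.

```r
library(boot)

set.seed(12345)

## Simulated AR(1) series with coefficient 0.5 (illustrative data only).
yt <- as.numeric(arima.sim(model = list(ar = 0.5), n = 500))

## Block length for the stationary bootstrap (row 1, column 1 of b.star).
## The fallback value 10 is an arbitrary assumption used only when
## npRmpi is unavailable.
l.sb <- if (requireNamespace("npRmpi", quietly = TRUE)) {
  npRmpi::b.star(yt, round = TRUE)[1, 1]
} else {
  10
}

## Stationary bootstrap of the sample mean with geometric block lengths.
boot.out <- tsboot(yt, statistic = mean, R = 199, l = l.sb, sim = "geom")

## Bootstrap standard error of the mean.
se.mean <- sd(as.vector(boot.out$t))
print(se.mean)
```

For the circular bootstrap one would take column 2 of the b.star result instead.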

Value

A kx2 matrix of optimal bootstrap block lengths computed from data for the stationary bootstrap and circular bootstrap (column 1 is for the stationary bootstrap, column 2 the circular).

Author(s)

Tristen Hayfield <tristen.hayfield@gmail.com>, Jeffrey S. Racine <racinej@mcmaster.ca>

References

Patton, A. and D.N. Politis and H. White (2009), “CORRECTION TO "Automatic block-length selection for the dependent bootstrap" by D. Politis and H. White”, Econometric Reviews 28(4), 372-375.

Politis, D.N. and J.P. Romano (1994), “Limit theorems for weakly dependent Hilbert space valued random variables with applications to the stationary bootstrap”, Statistica Sinica 4, 461-476.

Politis, D.N. and H. White (2004), “Automatic block-length selection for the dependent bootstrap”, Econometric Reviews 23(1), 53-70.

Examples

## Not run: 
## Not run in checks: excluded to keep MPI examples stable and check times short.
set.seed(12345)

# Function to generate an AR(1) series

ar.series <- function(phi,epsilon) {
  n <- length(epsilon)
  series <- numeric(n)
  series[1] <- epsilon[1]/(1-phi)
  for(i in 2:n) {
    series[i] <- phi*series[i-1] + epsilon[i]
  }
  return(series)
}

yt <- ar.series(0.1,rnorm(10000))
b.star(yt,round=TRUE)

yt <- ar.series(0.9,rnorm(10000))
b.star(yt,round=TRUE)

## End(Not run)

Canadian High School Graduate Earnings

Description

Canadian cross-section wage data consisting of a random sample taken from the 1971 Canadian Census Public Use Tapes for male individuals having common education (grade 13). There are 205 observations in total.

Usage

data("cps71")

Format

A data frame with 2 columns, and 205 rows.

logwage: the first column, of type numeric
age: the second column, of type integer

Source

Aman Ullah

References

Pagan, A. and A. Ullah (1999), Nonparametric Econometrics, Cambridge University Press.

Examples

## Not run: 
## Not run in checks: excluded to keep MPI examples stable and check times short.
## The following example is adapted for interactive parallel execution
## in R. Here we spawn 1 slave so that there will be two compute nodes
## (master and slave). Kindly see the batch examples in the demos
## directory (npRmpi/demos) and study them carefully. Also kindly see
## the more extensive examples in the np package itself. See the npRmpi
## vignette for further details on running parallel np programs via
## vignette("npRmpi",package="npRmpi").

## Start npRmpi for interactive execution. If slaves are already running and
## `options(npRmpi.reuse.slaves=TRUE)` (default on some systems), this will
## reuse the existing pool instead of respawning. To change the number of
## slaves, call `npRmpi.stop(force=TRUE)` then restart.
npRmpi.start(nslaves=1)

data("cps71")
mpi.bcast.Robj2slave(cps71)

attach(cps71)

plot(age, logwage, xlab="Age", ylab="log(wage)")

detach(cps71)

## For the interactive run only we close the slaves perhaps to proceed
## with other examples and so forth. This is redundant in batch mode.

## Note: on some systems (notably macOS+MPICH), repeatedly spawning and
## tearing down slaves in the same R session can lead to hangs/crashes.
## npRmpi may therefore keep slave daemons alive by default and
## `npRmpi.stop()` performs a "soft close". Use `force=TRUE` to
## actually shut down the slaves.
##
## You can disable reuse via `options(npRmpi.reuse.slaves=FALSE)` or by
## setting the environment variable `NP_RMPI_NO_REUSE_SLAVES=1` before
## loading the package.

npRmpi.stop()               ## soft close (may keep slaves alive)
## npRmpi.stop(force=TRUE)  ## hard close

## Note that in order to exit npRmpi properly avoid quit(), and instead
## use mpi.quit() as follows.

## mpi.bcast.cmd(mpi.quit(),
##               caller.execute=TRUE)

## End(Not run)

Extract Gradients

Description

gradients is a generic function which extracts gradients from objects.

Usage

gradients(x, ...)

## S3 method for class 'condensity'
gradients(x, errors = FALSE, ...)

## S3 method for class 'condistribution'
gradients(x, errors = FALSE, ...)

## S3 method for class 'npregression'
gradients(x, errors = FALSE, ...)

## S3 method for class 'qregression'
gradients(x, errors = FALSE, ...)

## S3 method for class 'singleindex'
gradients(x, errors = FALSE, ...)

Arguments

x

an object for which the extraction of gradients is meaningful.

...

other arguments.

errors

a logical value specifying whether or not standard errors of gradients are desired. Defaults to FALSE.

Details

This function provides a generic interface for extraction of gradients from objects.
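For readers unfamiliar with S3 dispatch, the mechanism behind such a generic can be sketched with a toy class. Everything here is hypothetical: grad_demo stands in for the generic (so as not to mask gradients), toyfit is a made-up class, and the component names grad and gerr are assumptions for illustration rather than a description of how npRmpi objects are laid out.

```r
## Hypothetical generic: dispatches on the class of x.
grad_demo <- function(x, ...) UseMethod("grad_demo")

## Hypothetical method for objects of class "toyfit".
## errors = FALSE returns the gradients, errors = TRUE their standard errors.
grad_demo.toyfit <- function(x, errors = FALSE, ...) {
  if (errors) x$gerr else x$grad
}

## A made-up fitted object carrying gradients and their standard errors.
fit <- structure(list(grad = c(0.9, 1.1, 1.0),
                      gerr = c(0.05, 0.04, 0.06)),
                 class = "toyfit")

print(grad_demo(fit))
print(grad_demo(fit, errors = TRUE))
```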

Value

Gradients extracted from the model object x.

Note

This method currently only supports objects from the npRmpi library.

Author(s)

Tristen Hayfield <tristen.hayfield@gmail.com>, Jeffrey S. Racine <racinej@mcmaster.ca>

References

See the references for the method being interrogated via gradients in the appropriate help file. For example, for the particulars of the gradients for nonparametric regression see the references in npreg.

Examples

## Not run: 
## Not run in checks: excluded to keep MPI examples stable and check times short.
## The following example is adapted for interactive parallel execution
## in R. Here we spawn 1 slave so that there will be two compute nodes
## (master and slave).  Kindly see the batch examples in the demos
## directory (npRmpi/demos) and study them carefully. Also kindly see
## the more extensive examples in the np package itself. See the npRmpi
## vignette for further details on running parallel np programs via
## vignette("npRmpi",package="npRmpi").

## Start npRmpi for interactive execution. If slaves are already running and
## `options(npRmpi.reuse.slaves=TRUE)` (default on some systems), this will
## reuse the existing pool instead of respawning. To change the number of
## slaves, call `npRmpi.stop(force=TRUE)` then restart.
npRmpi.start(nslaves=1)

set.seed(42)

x <- runif(10)
y <- x + rnorm(10, sd = 0.1)
mydat <- data.frame(x,y)
rm(x,y)

mpi.bcast.Robj2slave(mydat)

mpi.bcast.cmd(model <- npreg(y~x, data=mydat, gradients=TRUE),
              caller.execute=TRUE)

gradients(model)

## For the interactive run only we close the slaves perhaps to proceed
## with other examples and so forth. This is redundant in batch mode.

## Note: on some systems (notably macOS+MPICH), repeatedly spawning and
## tearing down slaves in the same R session can lead to hangs/crashes.
## npRmpi may therefore keep slave daemons alive by default and
## `npRmpi.stop()` performs a "soft close". Use `force=TRUE` to
## actually shut down the slaves.
##
## You can disable reuse via `options(npRmpi.reuse.slaves=FALSE)` or by
## setting the environment variable `NP_RMPI_NO_REUSE_SLAVES=1` before
## loading the package.

npRmpi.stop()               ## soft close (may keep slaves alive)
## npRmpi.stop(force=TRUE)  ## hard close

## Note that in order to exit npRmpi properly avoid quit(), and instead
## use mpi.quit() as follows.

## mpi.bcast.cmd(mpi.quit(),
##               caller.execute=TRUE)

## End(Not run)

Hosts Information

Description

lamhosts finds the host name associated with its node number. It can be used by mpi.spawn.Rslaves to spawn R slaves on selected hosts. This is an MPI implementation-specific function.

mpi.is.master checks whether it is running on the master or on a slave.

mpi.hostinfo finds the host information for an individual process, including its rank and size in a comm.

slave.hostinfo is executed only by the master and finds the host information for the master and all slaves in a comm.

Usage

lamhosts()
mpi.is.master()
mpi.hostinfo(comm = 1)
slave.hostinfo(comm = 1, short=TRUE)

Arguments

comm

a communicator number

short

if true, a short form is printed

Value

lamhosts returns the CPU node numbers with their host names.

mpi.is.master returns TRUE if it is on master and FALSE otherwise.

mpi.hostinfo sends to stdio a host name, rank, size and comm.

slave.hostinfo sends to stdio a list of host, rank, size, and comm information for the master and all slaves. With short=TRUE and 8 slaves or more, the first 3 and last 2 slaves are shown.

Author(s)

Hao Yu (minor modifications by Jeffrey S. Racine <racinej@mcmaster.ca>)

MPI_Abort API

Description

mpi.abort makes a “best attempt” to abort all tasks in a comm.

Usage

mpi.abort(comm = 1)

Arguments

comm

a communicator number

Value

1 if success. Otherwise 0.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI Constants

Description

Find MPI constants: MPI_ANY_SOURCE, MPI_ANY_TAG, or MPI_PROC_NULL

Usage

mpi.any.source()
mpi.any.tag()
mpi.proc.null()

Arguments

None

Details

These constants are mainly used by mpi.send, mpi.recv, and mpi.probe. Different implementations of MPI may use different integers for MPI_ANY_SOURCE, MPI_ANY_TAG, and MPI_PROC_NULL. Hence one should use these functions instead of hard-coded integers for MPI communications.

Value

Each function returns an integer value.

References

https://www.mpich.org, https://www.mpich.org/static/docs/latest/www3/

Scatter an array to slaves and then apply a FUN

Description

An array X (length <= total number of slaves) is scattered to the slaves so that the first slave calls FUN with arguments X[[1]] and ..., the second one calls FUN with arguments X[[2]] and ..., and so on. mpi.iapply is a nonblocking version of mpi.apply, so it will not consume CPU on the master node.

Usage

mpi.apply(X, FUN, ..., comm=1)  
mpi.iapply(X, FUN, ..., comm=1, sleep=0.01)

Arguments

X

an array

FUN

a function

...

optional arguments to FUN

comm

a communicator number

sleep

a sleep interval on master node (in sec)

Value

A list of the results is returned. Its length is the same as that of X. If the call to FUN with arguments X[[i]] and ... fails on the ith slave, the corresponding error message is returned in that position of the list.

Author(s)

Hao Yu

Examples

## Not run: 
# Not run in checks: requires pre-spawned slaves and a live worker communicator.
# Running this without the expected MPI session can deadlock.
#Assume that there are at least 5 slaves running
#Otherwise run mpi.spawn.Rslaves(nslaves=5)
x=c(10,20)
mpi.apply(x,runif)
meanx=1:5
mpi.apply(meanx,rnorm,n=2,sd=4)

## End(Not run)

(Load balancing) parallel apply

Description

(Load balancing) parallel lapply and related functions.

Usage

mpi.applyLB(X, FUN, ..., apply.seq=NULL, comm=1)
mpi.parApply(X, MARGIN, FUN, ..., job.num = mpi.comm.size(comm)-1,
                    apply.seq=NULL, comm=1)
mpi.parLapply(X, FUN, ..., job.num=mpi.comm.size(comm)-1, apply.seq=NULL, 
		comm=1)  
mpi.parSapply(X, FUN, ..., job.num=mpi.comm.size(comm)-1, apply.seq=NULL, 
		simplify=TRUE, USE.NAMES = TRUE, comm=1)  
mpi.parRapply(X, FUN, ..., job.num=mpi.comm.size(comm)-1, apply.seq=NULL, 
		comm=1)  
mpi.parCapply(X, FUN, ..., job.num=mpi.comm.size(comm)-1, apply.seq=NULL, 
		comm=1)  
mpi.parReplicate(n, expr, job.num=mpi.comm.size(comm)-1, apply.seq=NULL, 
		simplify = TRUE, comm=1)
mpi.parMM (A, B, job.num=mpi.comm.size(comm)-1, comm=1)

Arguments

X

an array or matrix.

MARGIN

vector specifying the dimensions to use.

FUN

a function.

simplify

logical; should the result be simplified to a vector or matrix if possible?

USE.NAMES

logical; if TRUE and if X is character, use X as names for the result unless it had names already.

n

number of replications.

A

a matrix

B

a matrix

expr

expression to evaluate repeatedly.

job.num

Total number of jobs. The default value is the total number of slaves. If the number of jobs is bigger than the total number of slaves, a load balancing approach is used.

apply.seq

if reproducing the same computation (simulation) is desirable, set it to the integer vector .mpi.applyLB generated in previous computation (simulation).

...

optional arguments to FUN

comm

a communicator number

Details

When the length of X does not exceed the total number of slaves (slave.num), mpi.applyLB is the same as mpi.apply. Otherwise, mpi.applyLB sends the next job to whichever slave has just delivered a finished job. The sequence of slaves delivering results to the master is saved in .mpi.applyLB, which keeps track of which slaves computed which parts of the results. .mpi.applyLB can be used to reproduce the same simulation result if the same seed is used and the argument apply.seq is set to .mpi.applyLB.

With the default value of the argument job.num, which is slave.num, mpi.parApply, mpi.parLapply, mpi.parSapply, mpi.parRapply, mpi.parCapply, and mpi.parMM are clones of snow's parApply, parLapply, parSapply, parRapply, parCapply, and parMM, respectively. When job.num is bigger than slave.num, a load balancing approach is used.
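
The dispatch rule described above can be illustrated without MPI. The following base R sketch is a toy simulation only, not the package's implementation: simulate_lb, durations, and n_workers are hypothetical names, and job run times are supplied by hand rather than measured. The recorded sequence plays the role that .mpi.applyLB plays when it is passed back through apply.seq.

```r
# Toy simulation of load-balanced dispatch (no MPI involved).
# durations[j] is the assumed run time of job j; n_workers stands in for slave.num.
simulate_lb <- function(durations, n_workers, apply.seq = NULL) {
  n_jobs <- length(durations)
  busy_until <- numeric(n_workers)   # time at which each worker becomes free
  owner <- integer(n_jobs)           # which worker ran which job
  for (j in seq_len(n_jobs)) {
    if (j <= n_workers) {
      w <- j                         # first round: one job per worker
    } else if (is.null(apply.seq)) {
      w <- which.min(busy_until)     # next job goes to the first worker to finish
    } else {
      w <- apply.seq[j - n_workers]  # replay a previously recorded sequence
    }
    owner[j] <- w
    busy_until[w] <- busy_until[w] + durations[j]
  }
  # seq is the analogue of the recorded sequence of reporting workers
  list(owner = owner, seq = owner[-seq_len(min(n_workers, n_jobs))])
}

first  <- simulate_lb(c(5, 1, 1, 2, 2, 1, 3), n_workers = 3)
# Different timings, but replaying the recorded sequence gives the same assignment
replay <- simulate_lb(rep(1, 7), n_workers = 3, apply.seq = first$seq)
first$owner
identical(first$owner, replay$owner)
```

In this sketch the replay reproduces the job-to-worker assignment even though the timings differ, which is why the same seed plus the same apply.seq can reproduce a simulation.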

Value

Returns an object with the same structure as the corresponding base apply call (typically a list or simplified vector/array when 'simplify = TRUE').

Warning

When using the argument apply.seq with .mpi.applyLB, be sure all settings are the same as before, i.e., the same data, job.num, slave.num, and seed. Otherwise a deadlock could occur. Notice that apply.seq is useful only if job.num is bigger than slave.num.

Examples

## Not run: 
# Not run in checks: requires pre-spawned slaves and load-balancing state.
# A mismatched communicator or apply.seq can deadlock.
#Assume that there are some slaves running

#mpi.applyLB
x=1:7
mpi.applyLB(x,rnorm,mean=2,sd=4)

#get the same simulation 
mpi.remote.exec(set.seed(111))
mpi.applyLB(x,rnorm,mean=2,sd=4)
mpi.remote.exec(set.seed(111))
mpi.applyLB(x,rnorm,mean=2,sd=4,apply.seq=.mpi.applyLB)

#mpi.parApply
x=1:24
dim(x)=c(2,3,4)
mpi.parApply(x, MARGIN=c(1,2), FUN=mean,job.num = 5)

#mpi.parLapply
mdat <- matrix(c(1,2,3, 7,8,9), nrow = 2, ncol=3, byrow=TRUE,
                    dimnames = list(c("R.1", "R.2"), c("C.1", "C.2", "C.3")))
mpi.parLapply(mdat, rnorm) 

#mpi.parSapply
mpi.parSapply(mdat, rnorm) 

#mpi.parMM
A=matrix(1:1000^2,ncol=1000)
mpi.parMM(A,A)

## End(Not run)

MPI_Barrier API

Description

mpi.barrier blocks the caller until all members have called it.

Usage

  mpi.barrier(comm = 1)

Arguments

comm

a communicator number

Value

1 if success. Otherwise 0.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI_Bcast API

Description

mpi.bcast is a collective call among all members in a comm. It broadcasts a message from the specified rank to all members.

Usage

mpi.bcast(x, type, rank = 0, comm = 1, buffunit=100)

Arguments

x

data to be sent or received. Must be the same type among all members.

type

1 for integer, 2 for double, and 3 for character. type=5 selects the chunked transfer of doubles described in Details. Others are not supported.

rank

the sender.

comm

a communicator number.

buffunit

a buffer unit number.

Details

mpi.bcast is a blocking call among all members in a comm, i.e., all members have to wait until everyone calls it. All members have to prepare the same type of message (buffer). Hence it is relatively difficult to use in an R environment, since the receivers may not know what type of data to expect, let alone its length. Users should instead use the extensions of mpi.bcast provided in R: mpi.bcast.Robj, mpi.bcast.cmd, and mpi.bcast.Robj2slave.

When type=5, an MPI contiguous datatype (double) is defined with unit size given by buffunit. It is used to transfer huge data, where a double vector or matrix is divided into many chunks of buffunit elements each. In total, ceiling(length(obj)/buffunit) units are transferred. Under the MPI specification, neither buffunit nor the total number of units transferred can exceed 2^31-1. Note that the last chunk may not be completely filled with data because of rounding, so special care is needed.
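
The chunk arithmetic for type=5 can be checked in plain R. This is a minimal sketch under the assumption that chunks are filled in order; chunk_plan is a hypothetical helper and makes no MPI calls.

```r
# Chunk arithmetic for a chunked transfer of n doubles (no MPI calls are made).
chunk_plan <- function(n, buffunit = 100) {
  stopifnot(n >= 1, buffunit >= 1)
  units <- ceiling(n / buffunit)         # number of units transferred
  last  <- n - (units - 1) * buffunit    # elements actually used in the last unit
  stopifnot(buffunit <= 2^31 - 1, units <= 2^31 - 1)
  list(units = units, last = last, padding = units * buffunit - n)
}

plan <- chunk_plan(n = 1050, buffunit = 100)
plan$units     # 11 units in total
plan$last      # only 50 of the 100 slots in the last unit hold data
```

The padding element shows how many trailing slots of the last unit carry no data, which is the rounding issue the paragraph above warns about.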

Value

mpi.bcast returns the message broadcasted by the sender (specified by the rank).

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Extensions of MPI_Bcast API

Description

mpi.bcast.Robj and mpi.bcast.Robj2slave are used to move a general R object around among master and all slaves.

Usage

mpi.bcast.Robj(obj = NULL, rank = 0, comm = 1)
mpi.bcast.Robj2slave(obj, comm = 1, all = FALSE)
mpi.bcast.Rfun2slave(comm = 1)
mpi.bcast.data2slave(obj, comm = 1, buffunit = 100)

Arguments

obj

an R object to be transmitted from the sender

rank

the sender.

comm

a communicator number.

all

a logical. If TRUE, all R objects on master are transmitted to slaves.

buffunit

a buffer unit number.

Details

mpi.bcast.Robj is an extension of mpi.bcast for moving a general R object from a sender to everyone. mpi.bcast.Robj2slave transmits an R object from the master to all slaves unless all=TRUE, in which case all of the master's objects in the global environment are transmitted to all slaves.

mpi.bcast.data2slave transfers data (a double vector or a matrix) natively without (un)serialization. It should be used with a huge vector or matrix. It results in less memory usage and faster transmission. Note that data with missing values (NA) are allowed.

Value

mpi.bcast.Robj returns no value for the sender and the transmitted object for others. mpi.bcast.Robj2slave returns no value for the master and the transmitted R object along with its name on slaves. mpi.bcast.Rfun2slave transmits all of the master's functions to slaves and returns no value. mpi.bcast.data2slave transmits a double vector or a matrix to slaves and returns no value.

Author(s)

Hao Yu

Extension of MPI_Bcast API

Description

mpi.bcast.cmd is an extension of mpi.bcast. It is mainly used to transmit a command from master to all R slaves spawned by using slavedaemon.R script.

Usage

mpi.bcast.cmd(cmd=NULL,
              ...,
              rank = 0,
              comm = 1,
              nonblock=FALSE,
              sleep=0.1,
              caller.execute = FALSE)

Arguments

cmd

a command to be sent from master.

...

used as arguments to cmd (function command) for passing their (master) values to R slaves, i.e., if ‘myfun(x)’ will be executed on R slaves with ‘x’ as master variable, use mpi.bcast.cmd(cmd=myfun, x=x).

rank

the sender

comm

a communicator number

nonblock

logical. If TRUE, a nonblocking procedure is used on all receivers so that they consume little or no CPU while waiting.

sleep

a sleep interval (in sec), used when nonblock=TRUE. The smaller sleep is, the more responsive the slaves are, but the more CPU they consume.

caller.execute

a logical value indicating whether the master node is additionally to execute the command

Details

mpi.bcast.cmd is a collective call. This means all members in a communicator must execute it at the same time. If slaves are spawned (created) by using slavedaemon.R (Rprofile script), then they are running mpi.bcast.cmd in an infinite loop (idle state). Hence the master can execute mpi.bcast.cmd alone to start computation. On the master, cmd and ... are put together as a list, which is then broadcast (after serialization) to all slaves (using a for loop with an mpi.send and mpi.recv pair). All slaves will return an expression, which will be evaluated either by slavedaemon.R or by any R script based on slavedaemon.R.
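
The idea of bundling a command with its arguments, serializing the bundle, and rebuilding an expression on the receiving side can be sketched in base R. This is an illustration only and is not claimed to mirror the internal code: pack_cmd and unpack_cmd are hypothetical helpers, and no message is actually sent.

```r
# Sketch of "bundle a command with its arguments, serialize, rebuild, evaluate".
# pack_cmd and unpack_cmd are hypothetical helpers, not npRmpi functions.
pack_cmd <- function(cmd, ...) {
  # The list of command and arguments becomes a raw vector ready for transmission
  serialize(list(cmd = cmd, args = list(...)), connection = NULL)
}
unpack_cmd <- function(raw_msg) {
  msg <- unserialize(raw_msg)
  as.call(c(list(msg$cmd), msg$args))   # an unevaluated call
}

myfun <- function(x) x^2
x <- 3
wire   <- pack_cmd(myfun, x = x)   # what a sender would transmit
expr   <- unpack_cmd(wire)         # what a receiver would rebuild
result <- eval(expr)               # evaluation happens on the receiving side
result
```

Because the argument values travel with the command, the receiver does not need its own copy of x, which matches the description of the ... argument.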

If nonblock=TRUE, then on the receiving side a nonblocking procedure is used to check whether there is a message. If not, the receiver sleeps for the specified amount of time and checks again.

Please use mpi.remote.exec if you want the executed results returned from R slaves.

Value

mpi.bcast.cmd returns no value for the sender and an expression of the transmitted command for others.

Warning

Be cautious about using mpi.bcast.cmd alone on the master in the middle of a computation. It can be used only when all slaves are in idle states (waiting for instructions from the master). Otherwise it may result in miscommunication with other MPI calls.

Author(s)

Hao Yu (minor modifications by Jeffrey S. Racine racinej@mcmaster.ca)

MPI_Cart_coords

Description

mpi.cart.coords translates a rank to its Cartesian topology coordinate.

Usage

mpi.cart.coords(comm=3, rank, maxdims)

Arguments

comm

Communicator with Cartesian structure

rank

rank of a process within group

maxdims

length of vector coord in the calling program

Details

This function is the rank-to-coordinates translator. It is the inverse map of mpi.cart.rank. maxdims is at least as big as ndims as returned by mpi.cartdim.get.
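
The two translations are inverses of each other, which can be illustrated without MPI. The sketch below assumes the row-major, 0-based numbering that the MPI standard prescribes for Cartesian topologies and assumes ranks are not reordered; rank_to_coords and coords_to_rank are hypothetical helpers, not package functions.

```r
# Row-major rank <-> coordinate mapping for a Cartesian grid (0-based, as in MPI).
rank_to_coords <- function(rank, dims) {
  coords <- integer(length(dims))
  for (k in rev(seq_along(dims))) {   # the last dimension varies fastest
    coords[k] <- rank %% dims[k]
    rank <- rank %/% dims[k]
  }
  coords
}
coords_to_rank <- function(coords, dims) {
  rank <- 0
  for (k in seq_along(dims)) rank <- rank * dims[k] + coords[k]
  rank
}

dims <- c(3, 3)
rank_to_coords(4, dims)         # coordinates of rank 4 on a 3 x 3 grid
coords_to_rank(c(1, 0), dims)   # rank of the process at coordinates (1, 0)
# Round trip over all 9 ranks returns the ranks unchanged
round_trip <- sapply(0:8, function(r) coords_to_rank(rank_to_coords(r, dims), dims))
```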

Value

mpi.cart.coords returns an integer array containing the Cartesian coordinates of a specified process.

Author(s)

Alek Hunchak and Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
# Not run in checks: requires a Cartesian communicator built from spawned slaves.
#Need at least 9 slaves
mpi.bcast.cmd(mpi.cart.create(1,c(3,3),c(F,T)))
mpi.cart.create(1,c(3,3),c(F,T))
mpi.cart.coords(3,4,2)

## End(Not run)

MPI_Cart_create

Description

mpi.cart.create creates a Cartesian structure of arbitrary dimension.

Usage

 mpi.cart.create(commold=1, dims, periods, reorder=FALSE, commcart=3)

Arguments

commold

Input communicator

dims

Integer array of size ndims specifying the number of processes in each dimension

periods

Logical array of size ndims specifying whether the grid is periodic or not in each dimension

reorder

ranks may be reordered or not

commcart

The new communicator to which the Cartesian topology information is attached

Details

If reorder = FALSE, then the rank of each process in the new group is the same as its rank in the old group. If the total size of the Cartesian grid is smaller than the size of the group of commold, then some processes are returned mpi.comm.null. The call is erroneous if it specifies a grid that is larger than the group size.

Value

mpi.cart.create returns 1 if success and 0 otherwise.

Author(s)

Alek Hunchak and Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
# Not run in checks: requires a multi-rank MPI session with spawned slaves.
#Need at least 9 slaves
mpi.bcast.cmd(mpi.cart.create(1,c(3,3),c(F,T)))
mpi.cart.create(1,c(3,3),c(F,T))

## End(Not run)

MPI_Cart_get

Description

mpi.cart.get provides the user with information on the Cartesian topology associated with a comm.

Usage

 mpi.cart.get(comm=3, maxdims)

Arguments

comm

Communicator with Cartesian structure

maxdims

length of vectors dims, periods, and coords in the calling program

Details

The coordinates returned are those of the calling process.

Value

mpi.cart.get returns a vector containing information on the Cartesian topology associated with comm. maxdims must be at least ndims as returned by mpi.cartdim.get.

Author(s)

Alek Hunchak and Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
# Not run in checks: requires a Cartesian communicator built from spawned slaves.
#Need at least 9 slaves
mpi.bcast.cmd(mpi.cart.create(1,c(3,3),c(F,T)))
mpi.cart.create(1,c(3,3),c(F,T))
mpi.remote.exec(mpi.cart.get(3,2))

## End(Not run)

MPI_Cart_rank

Description

mpi.cart.rank translates a Cartesian topology coordinate to its rank.

Usage

 mpi.cart.rank(comm=3, coords)

Arguments

comm

Communicator with Cartesian structure

coords

Specifies the Cartesian coordinates of a process

Details

For a process group with a Cartesian topology, this function translates the logical process coordinates to process ranks as they are used by the point-to-point routines. It is the inverse map of mpi.cart.coords.

Value

mpi.cart.rank returns the rank of the specified process.

Author(s)

Alek Hunchak and Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
# Not run in checks: requires a Cartesian communicator built from spawned slaves.
#Need at least 9 slaves
mpi.bcast.cmd(mpi.cart.create(1,c(3,3),c(F,T)))
mpi.cart.create(1,c(3,3),c(F,T))
mpi.cart.rank(3,c(1,0))

## End(Not run)

MPI_Cart_shift

Description

mpi.cart.shift finds the source and destination ranks of a shift in a Cartesian topology, given a direction and a displacement.

Usage

 mpi.cart.shift(comm=3, direction, disp)

Arguments

comm

Communicator with Cartesian structure

direction

Coordinate dimension of the shift

disp

displacement (>0 for upwards or left shift, <0 for downwards or right shift)

Details

mpi.cart.shift provides neighbor ranks from given direction and displacement. The direction argument indicates the dimension of the shift. direction=1 means the first dim, direction=2 means the second dim, etc. disp=1 or -1 provides immediate neighbor ranks and disp=2 or -2 provides neighbor's neighbor ranks. Negative ranks mean out of boundary. They correspond to mpi.proc.null.
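
The neighbor lookup can be sketched in base R for a small grid. This is a minimal illustration, assuming row-major 0-based ranks and using -1 as a placeholder for mpi.proc.null (the real value depends on the MPI implementation); shift_neighbours is a hypothetical helper that makes no MPI calls.

```r
# Neighbour lookup on a Cartesian grid with optional periodic wrap (no MPI calls).
# direction is 1-based as described above; PROC_NULL = -1 is a placeholder value.
shift_neighbours <- function(rank, dims, periods, direction, disp, PROC_NULL = -1) {
  # Convert the rank to row-major coordinates
  coords <- integer(length(dims))
  r <- rank
  for (k in rev(seq_along(dims))) {
    coords[k] <- r %% dims[k]
    r <- r %/% dims[k]
  }
  move <- function(step) {
    cc <- coords
    cc[direction] <- cc[direction] + step
    if (periods[direction]) {
      cc[direction] <- cc[direction] %% dims[direction]   # wrap around
    } else if (cc[direction] < 0 || cc[direction] >= dims[direction]) {
      return(PROC_NULL)                                   # off the grid
    }
    out <- 0
    for (k in seq_along(dims)) out <- out * dims[k] + cc[k]
    out
  }
  c(source = move(-disp), dest = move(disp))
}

# 3 x 3 grid, periodic in the second dimension only
inner <- shift_neighbours(4, c(3, 3), c(FALSE, TRUE), direction = 2, disp = 1)
edge  <- shift_neighbours(0, c(3, 3), c(FALSE, TRUE), direction = 1, disp = 1)
inner
edge
```

For rank 0 the source in the non-periodic first dimension falls outside the grid, so the placeholder for mpi.proc.null is returned in its place.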

Value

mpi.cart.shift returns a vector containing information regarding the rank of the source process and rank of the destination process.

Author(s)

Alek Hunchak and Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
# Not run in checks: requires a Cartesian communicator built from spawned slaves.
#Need at least 9 slaves
mpi.bcast.cmd(mpi.cart.create(1,c(3,3),c(F,T)))
mpi.cart.create(1,c(3,3),c(F,T))
mpi.remote.exec(mpi.cart.shift(3,2,1))#get neighbor ranks
mpi.remote.exec(mpi.cart.shift(3,1,1))

## End(Not run)

MPI_Cartdim_get

Description

mpi.cartdim.get gets dim information about a Cartesian topology.

Usage

 mpi.cartdim.get(comm=3)

Arguments

comm

Communicator with Cartesian structure

Details

Can be used to provide other functions with the correct size of arrays.

Value

mpi.cartdim.get returns the number of dimensions of the Cartesian structure

Author(s)

Alek Hunchak and Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
# Not run in checks: requires a Cartesian communicator built from spawned slaves.
#Need at least 9 slaves
mpi.bcast.cmd(mpi.cart.create(1,c(3,3),c(F,T)))
mpi.cart.create(1,c(3,3),c(F,T))
mpi.cartdim.get(comm=3)

## End(Not run)

MPI_Comm_disconnect API

Description

mpi.comm.disconnect disconnects itself from a communicator and then deallocates the communicator so it points to MPI_COMM_NULL.

Usage

mpi.comm.disconnect(comm=1)

Arguments

comm

a communicator number

Details

When members associated with a communicator finish their jobs or exit, they have to call mpi.comm.disconnect to release resources if the communicator was created from an intercommunicator by mpi.intercomm.merge. If mpi.comm.free is used instead, mpi.finalize called by slaves may have undefined effects on a master that wishes to keep running.

Value

1 if success. Otherwise 0.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI_Comm_free API

Description

mpi.comm.free deallocates a communicator so it points to MPI_COMM_NULL.

Usage

  mpi.comm.free(comm=1)

Arguments

comm

a communicator number

Details

When members associated with a communicator finish their jobs or exit, they have to call mpi.comm.free to release resources, after which mpi.comm.size will return 0. If the comm was created from an intercommunicator by mpi.intercomm.merge, use mpi.comm.disconnect instead.

Value

1 if success. Otherwise 0.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI_Comm_get_parent, MPI_Comm_remote_size, MPI_Comm_test_inter APIs

Description

mpi.comm.get.parent is mainly used by slaves to find the intercommunicator or the parent who spawns them. The intercommunicator is saved in the specified comm number.

mpi.comm.remote.size is mainly used by master to find the total number of slaves spawned.

mpi.comm.test.inter tests if a comm is an intercomm or not.

Usage

  mpi.comm.get.parent(comm = 2)
  mpi.comm.remote.size(comm = 2)
  mpi.comm.test.inter(comm = 2)

Arguments

comm

an intercommunicator number.

Value

mpi.comm.get.parent and mpi.comm.test.inter return 1 if success and 0 otherwise.

mpi.comm.remote.size returns the total number of members in the remote group in an intercomm.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI_Comm_set_errhandler API

Description

mpi.comm.set.errhandler sets the error handler of a communicator to MPI_ERRORS_RETURN instead of MPI_ERRORS_ARE_FATAL (the default), which crashes R on any type of MPI error. Almost all MPI API calls return error codes that can be mapped to specific MPI error messages. All MPI related error messages come from the predefined MPI_Error_string.

Usage

mpi.comm.set.errhandler(comm = 1)

Arguments

comm

a communicator number

Value

1 if success. Otherwise 0.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI_Comm_c2f, MPI_Comm_dup, MPI_Comm_rank, and MPI_Comm_size APIs

Description

mpi.comm.c2f converts the comm (a C communicator) and returns an integer that can be used as the communicator in external FORTRAN code. mpi.comm.dup duplicates (copies) a comm to a new comm. mpi.comm.rank returns its rank in a comm. mpi.comm.size returns the total number of members in a comm.

Usage

  mpi.comm.c2f(comm=1)
  mpi.comm.dup(comm, newcomm)
  mpi.comm.rank(comm = 1)
  mpi.comm.size(comm = 1)

Arguments

comm

a communicator number

newcomm

a new communicator number

Value

mpi.comm.c2f: integer communicator for use in FORTRAN code.
mpi.comm.dup: integer identifier of the duplicated communicator.
mpi.comm.rank: integer rank within the communicator.
mpi.comm.size: integer size of the communicator.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
## Not run in checks when toggled to dontrun: communicator examples are
## documented for manual MPI sessions.
mpi.comm.rank(comm=0)
mpi.comm.size(comm=0)
mpi.comm.dup(comm=0, newcomm=5)

## End(Not run)

MPI_Comm_spawn API

Description

mpi.comm.spawn tries to start nslaves identical copies of slaves, establishing communication with them and returning an intercommunicator. The spawned slaves are referred to as children, and the process that spawned them is called the parent (master). The children have their own MPI_COMM_WORLD represented by comm 0. To make communication possible among master and slaves, all slaves should use mpi.comm.get.parent to find their parent and use mpi.intercomm.merge to merge an intercomm into a comm.

Usage

 mpi.comm.spawn(slave, slavearg = character(0),
                nslaves = mpi.universe.size(), info = 0,
                root = 0, intercomm = 2, quiet = FALSE)

Arguments

slave

a file name to an executable program.

slavearg

an argument list (a char vector) to slave.

nslaves

number of slaves to be spawned.

info

an info number.

root

the root member who spawns slaves.

intercomm

an intercomm number.

quiet

a logical. If TRUE, do not print anything unless an error occurs.

Value

Unless quiet = TRUE, a message is printed to indicate how many slaves are successfully spawned and how many failed.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI_Dims_create

Description

mpi.dims.create creates a Cartesian dimension vector for use by mpi.cart.create.

Usage

 mpi.dims.create(nnodes, ndims, dims=integer(ndims))

Arguments

nnodes

Number of nodes in a cluster

ndims

Number of dimensions in a Cartesian topology

dims

Initial dimension numbers

Details

The entries in the return value are set to describe a Cartesian grid with ndims dimensions and a total of nnodes nodes. The dimensions are set to be as close to each other as possible, using an appropriate divisibility algorithm. The return value can be constrained by specifying positive number(s) in dims. Only those 0 values in dims are modified by mpi.dims.create.
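
The balancing and constraint rules can be illustrated for the two-dimensional case in base R. This is a sketch under simplifying assumptions (two dimensions only, larger factor listed first), not the algorithm used by any MPI implementation; balanced_dims_2d is a hypothetical helper.

```r
# Balanced 2-D factorisation with optional fixed entries (0 means "free").
# balanced_dims_2d is a hypothetical helper, not a replacement for mpi.dims.create.
balanced_dims_2d <- function(nnodes, dims = c(0, 0)) {
  fixed <- dims > 0
  if (all(fixed)) {
    stopifnot(prod(dims) == nnodes)
    return(dims)
  }
  if (any(fixed)) {
    stopifnot(nnodes %% dims[fixed] == 0)   # the fixed entry must divide nnodes
    dims[!fixed] <- nnodes / dims[fixed]    # only the 0 entry is modified
    return(dims)
  }
  # Largest divisor not exceeding sqrt(nnodes) gives the most balanced pair
  d <- max(which(nnodes %% seq_len(floor(sqrt(nnodes))) == 0))
  c(nnodes / d, d)
}

free_36        <- balanced_dims_2d(36)             # 6 6
constrained_12 <- balanced_dims_2d(12, c(0, 4))    # 3 4
free_12        <- balanced_dims_2d(12)             # 4 3
```

In every case the product of the returned entries equals nnodes, which is the property a caller of mpi.cart.create relies on.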

Value

mpi.dims.create returns the dimension vector to be used in mpi.cart.create.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
## Not run in checks when toggled to dontrun: this MPI utility example is
## intended for manual interactive use.
#What are the dim numbers of a 2-dim Cartesian topology under a grid of 36 nodes
mpi.dims.create(36,2)  #return c(6,6)
#Constrained dim numbers: the free entry must satisfy dims[1]*4 == 12
mpi.dims.create(12,2,c(0,4)) #return c(3,4)

## End(Not run)

Exit MPI Environment

Description

mpi.exit terminates MPI execution environment and detaches the library Rmpi. After that, you can still work on R.

mpi.quit terminates MPI execution environment and quits R.

Usage

mpi.exit()
mpi.quit(save = "no")

Arguments

save

the same argument as in quit, but defaults to "no".

Details

Normally, mpi.finalize is used to clean all MPI states. However, it will not detach the library Rmpi. To leave MPI more safely, mpi.exit not only calls mpi.finalize but also detaches the library Rmpi. This makes reloading the library Rmpi impossible.

If leaving MPI and R altogether, one simply uses mpi.quit.

Value

mpi.exit always returns 1

Author(s)

Hao Yu

MPI_Finalize API

Description

Terminates MPI execution environment.

Usage

  mpi.finalize()

Arguments

None

Details

This routine must be called by each slave (master) before it exits. This call cleans all MPI state. Once mpi.finalize has been called, no MPI routine may be called. To leave MPI more safely, please use mpi.exit, which not only calls mpi.finalize but also detaches the library Rmpi. This makes reloading the library Rmpi impossible.

Value

Always returns 1.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI_Gather, MPI_Gatherv, MPI_Allgather, and MPI_Allgatherv APIs

Description

mpi.gather and mpi.gatherv (vector variant) gather each member's message to the member specified by the argument root. The root member receives the messages and stores them in rank order. mpi.allgather and mpi.allgatherv are the same as mpi.gather and mpi.gatherv except that all members receive the result instead of just the root.

Usage

mpi.gather(x, type, rdata, root = 0, comm = 1) 
mpi.gatherv(x, type, rdata, rcounts, root = 0, comm = 1) 

mpi.allgather(x, type, rdata, comm = 1) 
mpi.allgatherv(x, type, rdata, rcounts, comm = 1)

Arguments

x

data to be gathered. Must be the same type.

type

1 for integer, 2 for double, and 3 for character. Others are not supported.

rdata

the receive buffer. Must be the same type as the sender and big enough to include all message gathered.

rcounts

int vector specifying the length of each message.

root

rank of the receiver

comm

a communicator number

Details

For mpi.gather and mpi.allgather, the message to be gathered must be the same dim and the same type. The receive buffer can be prepared as either integer(size * dim) or double(size * dim), where size is the total number of members in a comm. For mpi.gatherv and mpi.allgatherv, the message to be gathered can have different dims but must be the same type. The argument rcounts records these different dims into an integer vector in rank order. Then the receive buffer can be prepared as either integer(sum(rcounts)) or double(sum(rcounts)).
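
How rcounts determines the size and layout of the receive buffer can be shown in base R. This is an illustration with made-up messages and no MPI calls; msgs[[i]] stands for the message contributed by the member of rank i - 1.

```r
# Receive-buffer layout for a variable-length gather (no MPI calls are made).
# msgs[[i]] stands for the message contributed by rank i - 1 (made-up data).
msgs    <- list(0, 1.5, c(2.5, 2.6), c(3.5, 3.6, 3.7))
rcounts <- lengths(msgs)                      # 1 1 2 3, in rank order
rdata   <- double(sum(rcounts))               # receive buffer of the right size
offsets <- cumsum(c(0, head(rcounts, -1)))    # start position of each rank's block
for (i in seq_along(msgs)) {
  rdata[offsets[i] + seq_len(rcounts[i])] <- msgs[[i]]
}
rdata
```

The buffer length sum(rcounts) and the rank-ordered blocks correspond to the double(sum(rcounts)) preparation described above.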

Value

For mpi.gather or mpi.gatherv, it returns the gathered message for the root member. For other members, it returns what is in rdata, i.e., rdata (or rcounts) is ignored. For mpi.allgather or mpi.allgatherv, it returns the gathered message for all members.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
# Not run in checks: requires a fixed number of spawned slaves and rank-specific buffers.
# Running this with a different communicator layout can deadlock.
#Need 3 slaves to run properly
#Or use mpi.spawn.Rslaves(nslaves=3)
 mpi.bcast.cmd(id <-mpi.comm.rank(.comm), comm=1)
mpi.bcast.cmd(mpi.gather(letters[id],type=3,rdata=string(1)))
mpi.gather(letters[10],type=3,rdata=string(4))

 mpi.bcast.cmd(x<-rnorm(id))
 mpi.bcast.cmd(mpi.gatherv(x,type=2,rdata=double(1),rcounts=1))
 mpi.gatherv(double(1),type=2,rdata=double(sum(1:3)+1),rcounts=c(1,1:3))

mpi.bcast.cmd(out1<-mpi.allgatherv(x,type=2,rdata=double(sum(1:3)+1),
		rcounts=c(1,1:3)))
mpi.allgatherv(double(1),type=2,rdata=double(sum(1:3)+1),rcounts=c(1,1:3))

## End(Not run)

Extensions of MPI_Gather and MPI_Allgather APIs

Description

mpi.gather.Robj gathers each member's object to the member specified by the argument root. The root member receives the objects as a list. mpi.allgather.Robj is the same as mpi.gather.Robj except that all members receive the result instead of just the root.

Usage

mpi.gather.Robj(obj=NULL, root = 0, comm = 1, ...)

mpi.allgather.Robj(obj=NULL, comm = 1)

Arguments

obj

data to be gathered. Could be different type.

root

rank of the gatherer (the member that receives the objects)

comm

a communicator number

...

optional arguments to sapply.

Details

Since sapply is used to gather all results, its default option "simplify=TRUE" simplifies the output. In some situations, this is not desirable. Passing "simplify=FALSE" in place of ... tells sapply not to simplify, so that a list of outputs is returned.
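
The effect of simplify can be seen with sapply alone. In this sketch the lists are made-up stand-ins for objects gathered from four members; no MPI calls are involved.

```r
# Made-up stand-ins for objects gathered from four members.
gathered_equal  <- list(c(1, 2), c(3, 4), c(5, 6), c(7, 8))
gathered_ragged <- list(1, 1:2, 1:3, 1:4)

m <- sapply(gathered_equal, identity)                    # simplified to a 2 x 4 matrix
l <- sapply(gathered_equal, identity, simplify = FALSE)  # stays a list of length 4
r <- sapply(gathered_ragged, identity)                   # unequal lengths: stays a list
```

Equal-length results collapse into a matrix by default, so simplify=FALSE is the safer choice when the caller expects one list element per member.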

Value

mpi.gather.Robj returns a list containing the gathered message for the root member. mpi.allgather.Robj returns a list containing the gathered message for all members.

Author(s)

Hao Yu and Wei Xia

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
# Not run in checks: requires pre-spawned slaves and a live worker communicator.
#Assume that there are some slaves running
mpi.bcast.cmd(id<-mpi.comm.rank())
mpi.bcast.cmd(x<-rnorm(id))
mpi.bcast.cmd(mpi.gather.Robj(x))
x<-"test mpi.gather.Robj"
mpi.gather.Robj(x)

mpi.bcast.cmd(obj<-rnorm(id+10))
mpi.bcast.cmd(nn<-mpi.allgather.Robj(obj))
obj<-rnorm(5)
mpi.allgather.Robj(obj)
mpi.remote.exec(nn)

## End(Not run)

MPI_Get_count API

Description

mpi.get.count finds the length of a received message.

Usage

mpi.get.count(type, status = 0)

Arguments

type

1 for integer, 2 for double, 3 for char.

status

a status number

Details

When mpi.recv is used to receive a message, the receive buffer can be set larger than the incoming message. mpi.get.count then finds the exact length of the received message. It must be called immediately after mpi.recv; otherwise the status may have changed.

Value

the length of a received message.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI_Get_processor_name API

Description

mpi.get.processor.name returns the host name (a string) where it is executed.

Usage

  mpi.get.processor.name(short = TRUE)

Arguments

short

a logical.

Value

a base host name if short = TRUE and a full host name otherwise.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Utility for finding the source and tag of a received message

Description

mpi.get.sourcetag finds the source and tag of a received message.

Usage

  mpi.get.sourcetag(status = 0)

Arguments

status

a status number

Details

When mpi.any.source and/or mpi.any.tag are used by mpi.recv or mpi.probe, one can use mpi.get.sourcetag to find who sends the message or with what tag number. mpi.get.sourcetag must be called immediately after calling mpi.recv or mpi.probe otherwise the obtained information may not be right.

Value

An integer vector of length 2. The first integer is the source and the second is the tag.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

(Load balancing) parallel apply with nonblocking features

Description

(Load balancing) parallel lapply and related functions with nonblocking features.

Usage

mpi.iapplyLB(X, FUN, ..., apply.seq=NULL, comm=1, sleep=0.01)
mpi.iparApply(X, MARGIN, FUN, ..., job.num = mpi.comm.size(comm)-1,
                    apply.seq=NULL, comm=1, sleep=0.01)
mpi.iparLapply(X, FUN, ..., job.num=mpi.comm.size(comm)-1, apply.seq=NULL, 
		    comm=1,sleep=0.01)  
mpi.iparSapply(X, FUN, ..., job.num=mpi.comm.size(comm)-1, apply.seq=NULL, 
		simplify=TRUE, USE.NAMES = TRUE, comm=1, sleep=0.01)  
mpi.iparRapply(X, FUN, ..., job.num=mpi.comm.size(comm)-1, apply.seq=NULL, 
		comm=1, sleep=0.01)  
mpi.iparCapply(X, FUN, ..., job.num=mpi.comm.size(comm)-1, apply.seq=NULL, 
		comm=1,sleep=0.01)  
mpi.iparReplicate(n, expr, job.num=mpi.comm.size(comm)-1, apply.seq=NULL, 
		simplify = TRUE, comm=1,sleep=0.01)
mpi.iparMM(A, B, comm=1, sleep=0.01)

Arguments

X

an array or matrix.

MARGIN

vector specifying the dimensions to use.

FUN

a function.

simplify

logical; should the result be simplified to a vector or matrix if possible?

USE.NAMES

logical; if TRUE and if X is character, use X as names for the result unless it had names already.

n

number of replications.

A

a matrix

B

a matrix

expr

expression to evaluate repeatedly.

job.num

Total number of jobs. The default value is the total number of slaves. If the number of jobs is bigger than the total number of slaves, a load balancing approach is used.

apply.seq

if reproducing the same computation (simulation) is desirable, set it to the integer vector .mpi.applyLB generated in previous computation (simulation).

...

optional arguments to FUN

comm

a communicator number

sleep

a sleep interval on master node (in sec)

Details

mpi.iparApply, mpi.iparLapply, mpi.iparSapply, mpi.iparRapply, mpi.iparCapply, mpi.iparReplicate, and mpi.iparMM are nonblocking versions of mpi.parApply, mpi.parLapply, mpi.parSapply, mpi.parRapply, mpi.parCapply, mpi.parReplicate, and mpi.parMM, respectively. The main difference is that mpi.iprobe and Sys.sleep are used so that the master node consumes almost no CPU cycles while waiting for results from the slaves. However, because of the frequent wake/sleep cycles on the master, these functions are not suitable for running small jobs on slave nodes. If the anticipated computing time for each job is relatively long, e.g., minutes or hours, setting sleep to 1 second or longer will reduce the load on the master slightly further.
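
The poll-and-sleep pattern can be sketched in base R. This is an illustration only: make_probe is a hypothetical stand-in for a nonblocking probe that reports a ready message on its k-th call, and no MPI calls are made.

```r
# Poll-and-sleep loop. make_probe is a hypothetical stand-in for a nonblocking
# probe: the function it returns reports "message ready" on the k-th call.
make_probe <- function(k) {
  calls <- 0
  function() {
    calls <<- calls + 1
    calls >= k
  }
}
wait_for_message <- function(probe, sleep = 0.01) {
  polls <- 0
  repeat {
    polls <- polls + 1
    if (probe()) break     # a result is available: stop waiting
    Sys.sleep(sleep)       # otherwise yield the CPU before polling again
  }
  polls
}

n_polls <- wait_for_message(make_probe(5), sleep = 0.01)
n_polls
```

A larger sleep means fewer polls per second on the master but a longer delay before a finished job is noticed, which is the trade-off behind the advice to avoid these functions for very short jobs.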

Value

Returns an object with the same structure as the corresponding base or 'mpi.par*' apply call (typically a list or simplified vector/array when 'simplify = TRUE').

MPI_Info_create, MPI_Info_free, MPI_Info_get, MPI_Info_set APIs

Description

Many MPI APIs take an info argument for additional information passing. An info is an object which consists of many (key,value) pairs. Rmpi uses an internal memory to store an info object.

mpi.info.create creates a new info object.

mpi.info.free frees an info object and sets it to MPI_INFO_NULL.

mpi.info.get retrieves the value associated with key in an info.

mpi.info.set adds the key and value pair to info.

Usage

  mpi.info.create(info = 0)
  mpi.info.free(info = 0)
  mpi.info.get(info = 0, key, valuelen)
  mpi.info.set(info = 0, key, value)

Arguments

info

an info number.

key

a char (length 1).

valuelen

the length (nchar) of the value to be retrieved

value

a char (length 1).

Value

mpi.info.create, mpi.info.free, and mpi.info.set return 1 on success and 0 otherwise.

mpi.info.get returns the value (a char) for a given info, key, and valuelen.

Author(s)

Hao Yu

MPI_Intercomm_merge API

Description

Creates an intracommunicator from an intercommunicator

Usage

mpi.intercomm.merge(intercomm=2, high=0, comm=1)

Arguments

intercomm

an intercommunicator number

high

Used to order the groups of the two intracommunicators within comm when creating the new communicator

comm

a (intra)communicator number

Details

When the master spawns slaves, an intercommunicator is created. To communicate (point-to-point or groupwise) among master and slaves, an intracommunicator must be created. mpi.intercomm.merge is used for that purpose. This is a collective call, so the master and all slaves must call it together. R slaves spawned by mpi.spawn.Rslaves should use mpi.comm.get.parent to get (set) an intercomm to a number, followed by merging the intercomm into an intracomm. One can use mpi.comm.test.inter to test if a communicator is an intercommunicator or not.

Value

1 if success. Otherwise 0.

Author(s)

Hao Yu

References

https://www.mpich.org, https://www.mpich.org/static/docs/latest/www3/

Parallel Monte Carlo Simulation

Description

Carry out a parallel Monte Carlo simulation on R slaves spawned with the slavedaemon.R script. All executed results are returned to the master.

Usage

mpi.parSim(n=100, rand.gen=rnorm, rand.arg=NULL,statistic, 
nsim=100, run=1, slaveinfo=FALSE, sim.seq=NULL, simplify=TRUE, comm=1, ...)

Arguments

n

sample size.

rand.gen

the random data generating function. See the details section

rand.arg

additional argument list to rand.gen.

statistic

the statistic function to be simulated. See the details section

nsim

the number of simulations carried out on a slave, which is counted as one slave job.

run

the number of loops. See the details section.

slaveinfo

if TRUE, the numbers of jobs finished by slaves will be displayed.

sim.seq

if reproducing the same simulation is desirable, set it to the integer vector .mpi.parSim generated in previous simulation.

simplify

logical; should the result be simplified to a vector or matrix if possible?

comm

a communicator number

...

optional arguments to statistic

Details

It is assumed that one simulation is carried out as statistic(rand.gen(n)), where rand.gen(n) can return any values as long as statistic can take them. Additional arguments can be passed to rand.gen by rand.arg as a list. Optional arguments can also be passed to statistic by the argument ....

Each slave job consists of replicate(nsim,statistic(rand.gen(n))), i.e., each job runs nsim simulations. The returned values are transported from slaves to master.
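A single slave job can be reproduced locally to check rand.gen, rand.arg, and statistic before going parallel. The sketch below is an assumption about how the pieces fit together (rand.arg spliced in through do.call, extra arguments forwarded to statistic). The helper one_sim is hypothetical, and set.seed is used here only for a local illustration, not as a parallel RNG.

```r
set.seed(1)  # local seed for illustration only; not a parallel RNG

n         <- 50
nsim      <- 200
rand.gen  <- rnorm
rand.arg  <- list(mean = 2, sd = 3)   # extra arguments for rand.gen
statistic <- function(x, trim = 0) mean(x, trim = trim)

# One simulation: statistic(rand.gen(n)), with rand.arg spliced in.
one_sim <- function(...) {
  x <- do.call(rand.gen, c(list(n), rand.arg))
  statistic(x, ...)
}

# One slave job: nsim simulations.
job <- replicate(nsim, one_sim(trim = 0.1))
length(job)
```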

The total number of simulations (TNS) is calculated as follows. Let slave.num be the total number of slaves in a comm, i.e., mpi.comm.size(comm)-1. Then TNS=slave.num*nsim*run and the total number of slave jobs is slave.num*run, where run is the number of loops from the master's perspective. If run=1, each slave will run one slave job. If run=2, each slave will run two slave jobs on average, and so on.

The purpose of run is twofold. It allows tuning of the slave job size and the total number of slave jobs to deal with two different cluster environments. On a cluster of slaves with equal CPU power, run=1 is often enough. But if nsim is too big, one can double run and halve the slave job size to nsim/2 so that TNS=slave.num*(nsim/2)*(2*run) is unchanged. This may improve R computation efficiency slightly. On a cluster of slaves with different CPU power, one can choose a big value of run and a small value of nsim so that the master can dispatch more jobs to slaves that run faster than others. This will keep all slaves busy so that load balancing is achieved.
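The arithmetic above can be checked with a few lines of plain R. The values of slave.num, nsim, and run are illustrative assumptions. In a live session slave.num would come from mpi.comm.size(comm)-1.

```r
slave.num <- 3     # mpi.comm.size(comm) - 1 in a live session
nsim      <- 100
run       <- 2

jobs <- slave.num * run          # total number of slave jobs
TNS  <- slave.num * nsim * run   # total number of simulations

# Halving the job size and doubling run leaves TNS unchanged.
TNS_rebalanced <- slave.num * (nsim / 2) * (2 * run)

c(jobs = jobs, TNS = TNS, TNS_rebalanced = TNS_rebalanced)
```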

The sequence of slaves that deliver results to the master is saved into .mpi.parSim. It keeps track of which slaves produce which parts of the results. .mpi.parSim can be used to reproduce the same simulation result if the same seed is used and the argument sim.seq is equal to .mpi.parSim.

See the warning section before you use mpi.parSim.

Value

The returned values depend on the values returned by replicate of statistic(rand.gen(n)) and the total number of simulations (TNS). If statistic returns a single value, then the result is a vector of length TNS. If statistic returns a vector (list) of length nrow, then the result is a matrix of dimension c(nrow, TNS).
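The two result shapes follow from how replicate simplifies its output, and can be previewed locally. This sketch uses plain replicate with a small illustrative TNS. The names scalar_stat and vector_stat are hypothetical.

```r
set.seed(2)
TNS <- 12  # small value for illustration

scalar_stat <- function(x) mean(x)
vector_stat <- function(x) c(mean = mean(x), sd = sd(x))

res_vec <- replicate(TNS, scalar_stat(rnorm(10)))  # vector of length TNS
res_mat <- replicate(TNS, vector_stat(rnorm(10)))  # matrix, one column per simulation

length(res_vec)
dim(res_mat)
```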

Warning

It is assumed that a parallel RNG is used on all slaves. Run mpi.setup.rngstream on the master to set up a parallel RNG. Though mpi.parSim works without a parallel RNG, the quality of the simulation is not guaranteed.

mpi.parSim will automatically transfer rand.gen and statistic to slaves. However, any functions that rand.gen and statistic rely on but that are not on the slaves must be transferred to the slaves before using mpi.parSim. You can use mpi.bcast.Robj2slave for that purpose. The same applies to required packages or C/Fortran code. You can either use mpi.bcast.cmd or put require(package) and/or dyn.load(so.lib) into rand.gen and statistic.

If simplify is TRUE, sapply style simplification is applied. Otherwise a list of length slave.num*run is returned.

Author(s)

Hao Yu

MPI_Probe and MPI_Iprobe APIs

Description

mpi.probe uses the source and tag of incoming message to set a status. mpi.iprobe does the same except it is a nonblocking call, i.e., returns immediately.

Usage

mpi.probe(source, tag, comm = 1, status = 0)
mpi.iprobe(source, tag, comm = 1, status = 0)

Arguments

source

the source of incoming message or mpi.any.source() for any source.

tag

a tag number or mpi.any.tag() for any tag.

comm

a communicator number

status

a status number

Details

When mpi.send or a nonblocking send such as mpi.isend is used to send a message, the receiver may not know the exact length before receiving it. mpi.probe is used to probe the incoming message and put some information into a status. The exact length can then be found by applying mpi.get.count to that status. If the wild card mpi.any.source or mpi.any.tag is used, then one can use mpi.get.sourcetag to find the exact source or tag of the sender.

Value

mpi.probe returns 1 only after a matching message has been found.

mpi.iprobe returns TRUE if there is a message that can be received; FALSE otherwise.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Find and increase the lengths of MPI opaques comm, request, and status

Description

mpi.comm.maxsize, mpi.request.maxsize, and mpi.status.maxsize find the lengths of comm, request, and status arrays respectively.

mpi.realloc.comm, mpi.realloc.request and mpi.realloc.status increase the lengths of comm, request and status arrays to newmaxsize respectively if newmaxsize is bigger than the original maximum size.

Usage

mpi.realloc.comm(newmaxsize)
mpi.realloc.request(newmaxsize)
mpi.realloc.status(newmaxsize)
mpi.comm.maxsize()
mpi.request.maxsize()
mpi.status.maxsize()

Arguments

newmaxsize

an integer.

Details

When Rmpi is loaded, it allocates a comm array of size 10, a request array of size 10,000, and a status array of size 5,000. These should be enough in most cases. They use less than 150KB of system memory. In rare cases, one can use mpi.realloc.comm, mpi.realloc.request and mpi.realloc.status to increase them to bigger arrays.

Value

mpi.realloc.comm, mpi.realloc.request, mpi.realloc.status: no return value (called for side effects).
mpi.comm.maxsize, mpi.request.maxsize, mpi.status.maxsize: integer size limits.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI_Reduce and MPI_Allreduce APIs

Description

mpi.reduce and mpi.allreduce are global reduction operations. mpi.reduce combines each member's result, using the operation op, and returns the combined value(s) to the member specified by the argument dest. mpi.allreduce is the same as mpi.reduce except that all members receive the combined value(s).

Usage

mpi.reduce(x, type=2, op=c("sum","prod","max","min","maxloc","minloc"), 
	dest = 0, comm = 1) 

mpi.allreduce(x, type=2, op=c("sum","prod","max","min","maxloc","minloc"), 
	comm = 1)

Arguments

x

data to be reduced. Must be the same dim and the same type for all members.

type

1 for integer and 2 for double. Others are not supported.

op

one of "sum", "prod", "max", "min", "maxloc", or "minloc".

dest

rank of destination

comm

a communicator number

Details

It is important that all members in a comm call either all mpi.reduce or all mpi.allreduce, even if the master takes no part in the computation. They must provide vectors of exactly the same type and dim to be reduced. If the operation "maxloc" or "minloc" is used, the combined vector is twice as long as the original one since the maximum or minimum ranks are included.
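The element-wise nature of the reduction can be emulated in plain R, with no MPI calls, to see what each operation would combine. In this sketch each list element plays the role of one member's vector x. The list members and the variable names are illustrative assumptions, and "maxloc"/"minloc" are not emulated.

```r
# Each list element plays the role of one member's vector x.
members <- list(
  c(1, 5, 3),
  c(4, 2, 6),
  c(0, 9, 1)
)

# All members must supply vectors of the same length.
stopifnot(length(unique(lengths(members))) == 1L)

reduce_sum  <- Reduce(`+`, members)    # op = "sum"
reduce_prod <- Reduce(`*`, members)    # op = "prod"
reduce_max  <- do.call(pmax, members)  # op = "max"
reduce_min  <- do.call(pmin, members)  # op = "min"

rbind(sum = reduce_sum, prod = reduce_prod, max = reduce_max, min = reduce_min)
```

With mpi.reduce only the member given by dest would hold such a combined vector, whereas with mpi.allreduce every member would.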

Value

mpi.reduce returns the combined value(s) to the member specified by dest. mpi.allreduce returns the combined value(s) to every member in a comm. The combined value(s) may be the sum, product, maximum, or minimum as specified by the argument op. If the op is either "maxloc" or "minloc", then the maximum (minimum) value(s) along with the maximum (minimum) rank(s) will be returned.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Remote Executions on R slaves

Description

Remotely execute a command on R slaves spawned by using slavedaemon.R script and return all executed results back to master.

Usage

mpi.remote.exec(cmd, ..., simplify = TRUE, comm = 1, ret = TRUE)

Arguments

cmd

the command to be executed on R slaves

...

used as arguments to cmd (function command) for passing their (master) values to R slaves, i.e., if ‘myfun(x)’ will be executed on R slaves with ‘x’ as master variable, use mpi.remote.exec(cmd=myfun, x).

simplify

logical; should the result be simplified to a data.frame if possible?

comm

a communicator number.

ret

return executed results from R slaves if TRUE.

Details

Once R slaves are spawned by mpi.spawn.Rslaves with the slavedaemon.R script, they are waiting for instructions from master. One can use mpi.bcast.cmd to send a command to R slaves. However it will not return executed results. Hence mpi.remote.exec can be considered an extension to mpi.bcast.cmd.

Value

Returns the executed results from R slaves if the argument ret is set to TRUE. The value could be a data.frame if the values (integer or double) from each slave have the same dimension. Otherwise a list is returned.

Warning

mpi.remote.exec may have difficulty guessing invisible results on R slaves. Use ret = FALSE instead.

Author(s)

Hao Yu

Examples

## Not run: 
# Not run in checks: requires pre-spawned slaves and a live worker communicator.
mpi.remote.exec(mpi.comm.rank())
 x=5
mpi.remote.exec(rnorm,x)

## End(Not run)

MPI_Scatter and MPI_Scatterv APIs

Description

mpi.scatter and mpi.scatterv are the inverse operations of mpi.gather and mpi.gatherv respectively.

Usage

mpi.scatter(x, type, rdata, root = 0,  comm = 1) 
mpi.scatterv(x, scounts, type, rdata, root = 0, comm = 1)

Arguments

x

data to be scattered.

type

1 for integer, 2 for double, and 3 for character. Others are not supported.

rdata

the receive buffer. Must be the same type as the sender

scounts

int vector specifying the block length inside a message to be scattered to other members.

root

rank of the sender (the member that scatters x)

comm

a communicator number

Details

mpi.scatter scatters the message x to all members. Each member receives a portion of x with dim as length(x)/size in rank order, where size is the total number of members in a comm. So the receive buffer can be prepared as either integer(length(x)/size) or double(length(x)/size). For mpi.scatterv, scounts counts the portions (different dims) of x sent to each member. Each member needs to prepare the receive buffer as either integer(scounts[i]) or double(scounts[i]).
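The portion sizes can be worked out in plain R before any buffers are allocated. The sketch below only emulates the partitioning, with no MPI calls. It assumes four members (root included) and reuses the string and scounts values from this page's example. The names scatter_parts and scatterv_parts are hypothetical.

```r
size <- 4  # total members in the comm, root included

# mpi.scatter: equal portions of length(x)/size, in rank order.
x <- 1:8
portion <- length(x) / size
scatter_parts <- split(x, rep(seq_len(size) - 1L, each = portion))

# mpi.scatterv: portion lengths are given by scounts.
num     <- "123456789abcd"
scounts <- c(2, 3, 1, 7)
ends    <- cumsum(scounts)
starts  <- ends - scounts + 1
scatterv_parts <- substring(num, starts, ends)
names(scatterv_parts) <- seq_len(size) - 1L  # names are ranks

scatter_parts
scatterv_parts
```

Rank i would therefore prepare a receive buffer of length portion for mpi.scatter and of length scounts[i+1] for mpi.scatterv.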

Value

For non-root members, mpi.scatter or mpi.scatterv returns the scattered message and ignores whatever is in x (or scounts). For the root member, it returns the portion belonging to itself.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
# Not run in checks: requires a fixed number of spawned slaves and rank-specific buffers.
# Running this with a different communicator layout can deadlock.
#Need 3 slaves to run properly
#Or run  mpi.spawn.Rslaves(nslaves=3)
  num="123456789abcd"
  scounts<-c(2,3,1,7)
  mpi.bcast.cmd(strnum<-mpi.scatter(integer(1),type=1,rdata=integer(1),root=0))
  strnum<-mpi.scatter(scounts,type=1,rdata=integer(1),root=0)
  mpi.bcast.cmd(ans <- mpi.scatterv(string(1),scounts=0,type=3,rdata=string(strnum),
					root=0))
  mpi.scatterv(as.character(num),scounts=scounts,type=3,rdata=string(strnum),root=0)
  mpi.remote.exec(ans)

## End(Not run)

Extensions of MPI_SCATTER and MPI_SCATTERV

Description

mpi.scatter.Robj and mpi.scatter.Robj2slave are used to scatter a list to all members. They are more efficient than using any parallel apply functions.

Usage

mpi.scatter.Robj(obj = NULL, root = 0, comm = 1)
mpi.scatter.Robj2slave(obj, comm = 1)

Arguments

obj

a list object to be scattered from the root or master

root

rank of the member that scatters the list.

comm

a communicator number.

Details

mpi.scatter.Robj is an extension of mpi.scatter for scattering a list object from a sender (root) to everyone. mpi.scatter.Robj2slave scatters a list to all slaves.
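Preparing the list to scatter is ordinary R work and can be checked without MPI. The sketch below builds one row block per slave from a matrix, assuming three running slaves. It uses parallel::splitIndices as a stand-in for whatever splitting rule suits the data, and the name blocks is hypothetical.

```r
dat <- matrix(1:24, ncol = 3)
nslaves <- 3  # assumed number of running slaves

# Row indices for each slave, then one row block per list component.
idx    <- parallel::splitIndices(nrow(dat), nslaves)
blocks <- lapply(idx, function(i) dat[i, , drop = FALSE])

# Component k is what slave k would receive from mpi.scatter.Robj2slave(blocks).
sapply(blocks, nrow)
```

Keeping drop = FALSE ensures each component stays a matrix even if a block contains a single row.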

Value

For non-root members, mpi.scatter.Robj returns the scattered R object. For the root member, it returns the portion belonging to itself. mpi.scatter.Robj2slave returns no value for the master, and all slaves get their corresponding components in the list, i.e., the first slave gets the first component in the list.

Author(s)

Hao Yu and Wei Xia

Examples

## Not run: 
# Not run in checks: requires pre-spawned slaves and a live worker communicator.
#assume that there are three slaves running
mpi.bcast.cmd(x<-mpi.scatter.Robj())

xx <- list("master",rnorm(3),letters[2],1:10)
mpi.scatter.Robj(obj=xx)

mpi.remote.exec(x)

#scatter a matrix to slaves
dat=matrix(1:24,ncol=3)
splitmatrix = function(x, ncl) lapply(.splitIndices(nrow(x), ncl), function(i) x[i,])
dat2=splitmatrix(dat,3)
mpi.scatter.Robj2slave(dat2)
mpi.remote.exec(dat2)

## End(Not run)

MPI_Send, MPI_Isend, MPI_Recv, and MPI_Irecv APIs

Description

The pair mpi.send and mpi.recv are the two most used blocking calls for point-to-point communications. An int, double or char vector can be transmitted from any source to any destination.

The pair mpi.isend and mpi.irecv are the same except that they are nonblocking calls.

Blocking and nonblocking calls are interchangeable, e.g., nonblocking sends can be matched with blocking receives, and vice-versa.

Usage

mpi.send(x, type, dest, tag,  comm = 1)
mpi.isend(x, type, dest, tag,  comm = 1, request=0)
mpi.recv(x, type, source, tag,  comm = 1, status = 0)
mpi.irecv(x, type, source, tag,  comm = 1, request = 0)

Arguments

x

data to be sent or received. Must be the same type for source and destination. The receive buffer must be as large as the send buffer.

type

1 for integer, 2 for double, and 3 for character. Others are not supported.

dest

the destination rank. Use mpi.proc.null for a fake destination.

source

the source rank. Use mpi.any.source for any source. Use mpi.proc.null for a fake source.

tag

non-negative integer. Use mpi.any.tag for any tag flag.

comm

a communicator number.

request

a request number.

status

a status number.

Details

The pair mpi.send (or mpi.isend) and mpi.recv (or mpi.irecv) must be used together, i.e., if there is a sender, then there must be a receiver. Any mismatch will result in a deadlock, i.e., the programs stop responding. The receive buffer must be large enough to contain an incoming message, otherwise the program will crash. One can use mpi.probe (or mpi.iprobe) and mpi.get.count to find the length of an incoming message before calling mpi.recv. If mpi.any.source or mpi.any.tag is used in mpi.recv, one can use mpi.get.sourcetag to find out the source or tag of the received message. To send/receive an R object rather than an int, double or char vector, please use the pair mpi.send.Robj and mpi.recv.Robj.

Since mpi.irecv is a nonblocking call, x with enough buffer must be created before using it. Then use nonblocking completion calls such as mpi.wait or mpi.test to test if x contains data from sender.

If multiple nonblocking sends or receives are used, please use request numbers consecutively from 0. For example, to receive two messages from two slaves, try

mpi.irecv(x,1,source=1,tag=0,comm=1,request=0)
mpi.irecv(y,1,source=2,tag=0,comm=1,request=1)

Then mpi.waitany, mpi.waitsome or mpi.waitall can be used to complete the operations.

Value

mpi.send and mpi.isend return no value. mpi.recv returns the int, double or char vector sent from source. However, mpi.irecv returns no value. See details for explanation.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

 
## Not run: 
# Not run in checks: send/recv calls must be paired across ranks.
# Running one side without a matching peer can deadlock.
#on a slave
mpi.send(1:10,1,0,0)

#on master
x <- integer(10)
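mpi.irecv(x,1,1,0)   # nonblocking receive; uses the default request=0
x                    # may not yet contain the message
mpi.wait(0)          # complete request 0
x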

## End(Not run)

Extensions of MPI_Send and MPI_Recv APIs

Description

mpi.send.Robj and mpi.recv.Robj are two extensions of mpi.send and mpi.recv. They are used to transmit a general R object from any source to any destination.

mpi.isend.Robj is a nonblocking version of mpi.send.Robj.

Usage

mpi.send.Robj(obj, dest, tag, comm = 1)
mpi.isend.Robj(obj, dest, tag, comm = 1, request=0)
mpi.recv.Robj(source, tag, comm = 1, status = 0)

Arguments

obj

an R object. Can be any R object.

dest

the destination rank.

source

the source rank or mpi.any.source() for any source.

tag

non-negative integer or mpi.any.tag() for any tag.

comm

a communicator number.

request

a request number.

status

a status number.

Details

mpi.send.Robj and mpi.isend.Robj use serialize to encode an R object into a binary char vector. They send the message to the destination. The receiver decodes the message back into an R object by using unserialize.
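The encode/decode round trip can be tried locally with base R's serialize and unserialize, without sending anything. This is a sketch of the idea only. The object obj is illustrative, and the exact internal representation used by the package for transport is not shown here.

```r
obj <- list(id = 1L, coef = c(a = 0.5, b = -1.25), label = "fit")

payload  <- serialize(obj, connection = NULL)  # raw vector: the message body
restored <- unserialize(payload)

length(payload)           # message length in bytes
identical(restored, obj)
```

Because the payload length depends on the object, a receiver cannot know it in advance, which is why probing for the message length matters on the receiving side.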

If mpi.isend.Robj is used, mpi.wait or mpi.test must be used to check that the object has been sent.

Value

mpi.send.Robj and mpi.isend.Robj return no value. mpi.recv.Robj returns the transmitted R object.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

MPI_Sendrecv and MPI_Sendrecv_replace APIs

Description

mpi.sendrecv and mpi.sendrecv.replace execute blocking send and receive operations. Both of them combine the sending of one message to a destination and the receiving of another message from a source in one call. The source and destination are possibly the same. The send buffer and receive buffer are disjoint for mpi.sendrecv, while the buffers are not disjoint for mpi.sendrecv.replace.

Usage

mpi.sendrecv(senddata, sendtype, dest, sendtag, recvdata, recvtype, 
source, recvtag, comm = 1, status = 0)

mpi.sendrecv.replace(x, type, dest, sendtag, source, recvtag, 
comm = 1, status = 0)

Arguments

x

data to be sent or received. Must be the same type for source and destination.

senddata

data to be sent. May have different datatypes and lengths

recvdata

data to be received. May have different datatypes and lengths

type

type of the data to be sent or received. 1 for integer, 2 for double, and 3 for character. Others are not supported.

sendtype

type of the data to be sent. 1 for integer, 2 for double, and 3 for character. Others are not supported.

recvtype

type of the data to be received. 1 for integer, 2 for double, and 3 for character. Others are not supported.

dest

the destination rank. Use mpi.proc.null for a fake destination.

source

the source rank. Use mpi.any.source for any source. Use mpi.proc.null for a fake source.

sendtag

non-negative integer. Use mpi.any.tag for any tag flag.

recvtag

non-negative integer. Use mpi.any.tag for any tag flag.

comm

a communicator number.

status

a status number.

Details

The receive buffer must be large enough to contain an incoming message, otherwise the program will crash. Send-receive is compatible with normal sends and receives: a message sent by a send-receive can be received by a regular receive, and a send-receive can receive a message sent by a regular send.

Value

Returns the int, double or char vector sent from the send buffers.

Author(s)

Kris Chen

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Examples

## Not run: 
## Not run in checks when toggled to dontrun: paired send/recv calls are
## documented for manual MPI sessions.
mpi.sendrecv(as.integer(11:20),1,0,33,integer(10),1,0,33,comm=0)
mpi.sendrecv.replace(seq(1,2,by=0.1),2,0,99,0,99,comm=0)

## End(Not run)

Setup parallel RNG on all slaves

Description

mpi.setup.rngstream sets up RNGstream on all slaves.

Usage

mpi.setup.rngstream(iseed=NULL, comm = 1)

Arguments

iseed

An integer to be supplied to set.seed, or NULL not to set reproducible seeds.

comm

A comm number.

Details

mpi.setup.rngstream can be run only on master node. It can be run later on with the same or different iseed.

Value

No value returned.

Author(s)

Hao Yu

Spawn and Close R Slaves

Description

mpi.spawn.Rslaves spawns R slaves to those hosts automatically chosen by MPI or specific hosts assigned by the argument hosts. Those R slaves are running in R BATCH mode with a specific Rscript file. The default Rscript file "slavedaemon.R" provides interactive R slave environments.

mpi.close.Rslaves shuts down R slaves spawned by mpi.spawn.Rslaves.

tailslave.log views the tail of the R slave log files (assuming they are all in one working directory).

Usage

mpi.spawn.Rslaves(Rscript=system.file("slavedaemon.R", package="npRmpi"),
                  nslaves=mpi.universe.size(),
                  root = 0,
                  intercomm = 2,
                  comm = 1,
                  hosts = NULL,
                  needlog = FALSE,
                  mapdrive=TRUE,
                  quiet = FALSE,
                  nonblock=TRUE,
                  sleep=0.1)

mpi.close.Rslaves(dellog = TRUE, comm = 1, force = FALSE)
tailslave.log(nlines = 3, comm = 1)

Arguments

Rscript

an R script file used to run R in BATCH mode.

nslaves

number of slaves to be spawned.

root

the rank number of the member who spawns R slaves.

intercomm

an intercommunicator number

comm

a communicator number merged from an intercomm.

hosts

NULL or LAM node numbers to specify where R slaves are to be spawned.

needlog

a logical. If TRUE, R BATCH outputs will be saved in log files. If FALSE, the outputs will be sent to /dev/null.

mapdrive

a logical. If TRUE and the master's working dir is on a network, mapping the network drive is attempted on remote nodes on the Windows platform.

quiet

a logical. If TRUE, do not print anything unless an error occurs.

nonblock

a logical. If TRUE, a nonblocking procedure is used on all slaves so that they consume little or no CPU while waiting.

sleep

a sleep interval, used when nonblock=TRUE. The smaller sleep is, the more responsive the slaves are, but the more CPU they consume.

dellog

a logical specifying if R slave's log files are deleted or not.

force

a logical. If TRUE, force a hard shutdown of slave daemons. When options(npRmpi.reuse.slaves=TRUE) and force=FALSE, mpi.close.Rslaves() performs a soft-close (i.e., keeps daemons alive for reuse).

nlines

number of lines to view from tail in R slave's log files.

Details

The R slaves that mpi.spawn.Rslaves spawns are really running a shell program which can be found in system.file("Rslaves.sh",package="npRmpi") which takes a Rscript file as one of its arguments. Other arguments are used to see if a log file (R output) is needed and how to name it. The master process id and the comm number, along with host names where R slaves are running are used to name these log files.

Once R slaves are successfully spawned, the mergers from an intercomm (default ‘intercomm = 2’) to a comm (default ‘comm = 1’) are automatically done on master and slaves (if the default Rscript is replaced, the replacement script should perform this merger itself). If additional sets of R slaves are needed, please use ‘comm = 3’, ‘comm = 4’, etc. to spawn them. At most a comm number up to 10 can be used. Notice that the default comm number for R slaves (using slavedaemon.R) is always 1, which is saved as .comm.

On some systems (notably macOS+MPICH), repeatedly spawning and tearing down slaves in the same R session can lead to hangs/crashes. To avoid this, npRmpi may reuse an existing slave pool when options(npRmpi.reuse.slaves=TRUE). In this mode, mpi.spawn.Rslaves() becomes idempotent and mpi.close.Rslaves(force=FALSE) performs a soft-close.

To spawn R slaves on specific hosts, please use the argument hosts with a list of those node numbers (an integer vector). The node numbers along with their host names can be found by using mpi.hostinfo. Notice that this is MPI implementation specific.

Value

Unless quiet = TRUE, mpi.spawn.Rslaves prints to stdio how many slaves are successfully spawned and where they are running.

mpi.close.Rslaves returns a status code. When options(npRmpi.reuse.slaves=TRUE) and force=FALSE, this may be a no-op (soft-close) so that spawned daemons can be reused within the same R session.

tailslave.log returns last lines of R slave's log files.

Author(s)

Hao Yu

Examples

## Not run: 
# Not run in checks: spawning/tearing down MPI daemons is environment-dependent
# and can interfere with later examples in the same session.
mpi.spawn.Rslaves(nslaves=2)
tailslave.log()
mpi.remote.exec(rnorm(10))
mpi.close.Rslaves()

## End(Not run)

MPI_Universe_size API

Description

mpi.universe.size returns the total number of CPUs available in a cluster. Some MPI implementations may not have this MPI call available.

Usage

  mpi.universe.size()

Arguments

None.

Value

An integer giving the total number of CPUs available in the MPI universe for the current configuration.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Nonblocking completion operations

Description

mpi.cancel cancels a nonblocking send or receive request.

mpi.test.cancelled tests whether a request was successfully cancelled by mpi.cancel.

wait, waitall, waitany, and waitsome are used to complete nonblocking send or receive requests. They are not local.

test, testall, testany, and testsome are used to complete nonblocking send and receive requests. They are local.

Usage

mpi.cancel(request)
mpi.test.cancelled(status=0)
mpi.test(request, status=0)
mpi.testall(count)
mpi.testany(count, status=0)
mpi.testsome(count)
mpi.wait(request, status=0)
mpi.waitall(count)
mpi.waitany(count, status=0)
mpi.waitsome(count)

Arguments

count

total number of nonblocking operations.

request

a request number.

status

a status number.

Details

mpi.wait and mpi.test are used to complete a nonblocking send or receive request: use the same request number that was passed to mpi.isend or mpi.irecv. Once completed, the associated request is set to MPI_REQUEST_NULL and status contains information such as source, tag, and length of message.

If multiple nonblocking sends or receives are initiated, the following calls are more efficient. Make sure that request numbers are used consecutively as request=0, request=1, request=2, etc. In this way, the following calls can find request information in system memory.

mpi.waitany and mpi.testany are used to complete one out of several requests.

mpi.waitall and mpi.testall are used to complete all requests.

mpi.waitsome and mpi.testsome are used to complete all enabled requests.

Value

mpi.cancel returns no value.

mpi.test.cancelled returns TRUE if a nonblocking call is cancelled; FALSE otherwise.

mpi.wait returns no value. Instead status contains information that can be retrieved by mpi.get.count and mpi.get.sourcetag.

mpi.test returns TRUE if a request is complete; FALSE otherwise. If TRUE, it is the same as mpi.wait.

mpi.waitany returns which request (index) has been completed. In addition, status contains information that can be retrieved by mpi.get.count and mpi.get.sourcetag.

mpi.testany returns a list with components index (the request index) and flag (TRUE if a request is complete; FALSE otherwise, in which case index is not meaningful). If flag is TRUE, it is the same as mpi.waitany.

mpi.waitall returns no value. Instead statuses 0, 1, ..., count-1 contain corresponding information that can be retrieved by mpi.get.count and mpi.get.sourcetag.

mpi.testall returns TRUE if all requests are complete; FALSE otherwise. If TRUE, it is the same as mpi.waitall.

mpi.waitsome returns a list with components count (the number of requests that have been completed) and indices (an integer vector of size count of those completed request numbers, in 0, 1, ..., count-1). In addition, statuses 0, 1, ..., count-1 contain corresponding information that can be retrieved by mpi.get.count and mpi.get.sourcetag.

mpi.testsome is the same as mpi.waitsome except that count may be 0, in which case indices is not meaningful.

Author(s)

Hao Yu

References

https://www.mpich.org/, https://www.mpich.org/static/docs/latest/www3/

Initialize Master and Slave Nodes for the np Package

Description

np.mpi.initialize is used to initialize master and slave nodes.

Usage

np.mpi.initialize()

Value

np.mpi.initialize returns no value for the sender and an expression of the transmitted command for others.

Author(s)

Jeffrey S. Racine racinej@mcmaster.ca

Cross-Validated Pairs Plot (Helper Functions)

Description

Compute pairwise nonparametric regressions and densities for a set of variables, then plot a pairs-style display with fitted smoothers.

Usage

np.pairs(y_vars, y_dat, ...)
np.pairs.plot(pair_list)

Arguments

y_vars

character vector of column names in y_dat. If y_vars is named, the names are used as plot labels.

y_dat

data frame containing the variables listed in y_vars.

...

additional arguments passed to npudens and npreg.

pair_list

list returned by np.pairs.

Details

On the diagonal, npudens is used to compute kernel density estimates. Off-diagonal panels use npreg with residuals to draw scatterplots and smoothers.

Value

np.pairs returns a list with components y_vars, pair_names, and pair_kerns. np.pairs.plot returns NULL (invisibly).

Examples

## Not run: 
## Not run in checks: excluded to keep MPI examples stable and check times short.
## The following example is adapted for interactive parallel execution
## in R. Here we spawn 1 slave so that there will be two compute nodes
## (master and slave).  Kindly see the batch examples in the demos
## directory (npRmpi/demos) and study them carefully. Also kindly see
## the more extensive examples in the np package itself. See the npRmpi
## vignette for further details on running parallel np programs via
## vignette("npRmpi",package="npRmpi").

## Start npRmpi for interactive execution. If slaves are already running and
## `options(npRmpi.reuse.slaves=TRUE)` (default on some systems), this will
## reuse the existing pool instead of respawning. To change the number of
## slaves, call `npRmpi.stop(force=TRUE)` then restart.
npRmpi.start(nslaves=1)

data("USArrests")
y_vars <- c("Murder", "UrbanPop")
names(y_vars) <- c("Murder Arrests per 100K", "Pop. Percent Urban")

mpi.bcast.Robj2slave(USArrests)
mpi.bcast.Robj2slave(y_vars)

mpi.bcast.cmd(pair_list <- np.pairs(y_vars = y_vars, y_dat = USArrests,
                                    ckertype = "epanechnikov", 
                                    bwscaling = TRUE),
              caller.execute=TRUE)

np.pairs.plot(pair_list)

## For the interactive run only we close the slaves perhaps to proceed
## with other examples and so forth. This is redundant in batch mode.

## Note: on some systems (notably macOS+MPICH), repeatedly spawning and
## tearing down slaves in the same R session can lead to hangs/crashes.
## npRmpi may therefore keep slave daemons alive by default and
## `npRmpi.stop()` performs a "soft close". Use `force=TRUE` to
## actually shut down the slaves.
##
## You can disable reuse via `options(npRmpi.reuse.slaves=FALSE)` or by
## setting the environment variable `NP_RMPI_NO_REUSE_SLAVES=1` before
## loading the package.

npRmpi.stop()               ## soft close (may keep slaves alive)
## npRmpi.stop(force=TRUE)  ## hard close

## Note that in order to exit npRmpi properly avoid quit(), and instead
## use mpi.quit() as follows.

## mpi.bcast.cmd(mpi.quit(),
##               caller.execute=TRUE)

## End(Not run)

Internal npRmpi functions

Description

Internal functions used by other MPI functions. These are not intended to be called directly by the user.

Usage

mpi.comm.is.null(comm)
string(length)
.docall(fun, args)
.force.type(x, type)
.mpi.undefined()
.mpi.worker.apply(n, tag)
.mpi.worker.applyLB(n)
.mpi.worker.exec(tag, ret, simplify)
.mpi.worker.sim(n, nsim, run)
.simplify(n, answer, simplify, len = 1, recursive = FALSE)
.splitIndices(nx, ncl)
.typeindex(x)

Arguments

comm

a communicator number.

length

length of a string.

fun

a function or name of a function.

args

a list of arguments.

x

an object.

type

a type indicator.

n

number of tasks.

tag

an MPI tag.

ret

logical; whether to return a value.

simplify

logical; whether to simplify the result.

nsim

number of simulations.

run

run indicator.

answer

a result list.

len

expected length.

recursive

logical; whether to unlist recursively.

nx

number of elements.

ncl

number of clusters.

Details

These functions are required for internal MPI communication and slave execution.

Value

Internal helpers; return values vary by function:

mpi.comm.is.null: logical indicator.
string: character string of requested length.
.docall: result of calling fun with args.
.force.type: coerced object of the requested type.
.mpi.undefined: integer constant used by MPI.
.mpi.worker.apply, .mpi.worker.applyLB, .mpi.worker.exec, .mpi.worker.sim: internal worker results (typically lists or vectors).
.simplify: simplified result (vector, matrix, or list).
.splitIndices: list of index vectors.
.typeindex: integer type code.
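As a rough illustration of what "list of index vectors" means for .splitIndices, the sketch below uses the exported parallel::splitIndices() from base R. It is an assumption, not confirmed by this page, that the internal helper partitions indices in the same manner; the values 10 and 3 are hypothetical.

```r
## Illustration only: uses parallel::splitIndices(), assumed (not confirmed)
## to behave like the internal .splitIndices(nx, ncl) helper.
library(parallel)

nx  <- 10   # hypothetical number of elements
ncl <- 3    # hypothetical number of chunks ("clusters")

idx <- splitIndices(nx, ncl)

## One index vector per chunk; together they cover 1..nx exactly once.
lengths(idx)
sort(unlist(idx))
```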

Author(s)

Hao Yu and Jeffrey Racine

Start/Stop Helpers for Interactive npRmpi Sessions

Description

Convenience helpers for interactive use of npRmpi. These functions provide a recommended, robust workflow: initialize a slave pool once and reuse it across multiple examples within the same R session.

Usage

npRmpi.start(..., nslaves = 1, comm = 1)

npRmpi.stop(force = FALSE, dellog = TRUE, comm = 1)

npRmpi.session.info(comm = 1)

Arguments

...

Additional arguments passed to mpi.spawn.Rslaves().

nslaves

Number of slaves to spawn for interactive execution.

comm

Communicator used for the master+slaves pool (defaults to 1).

force

Logical; when TRUE, force a hard shutdown of slave daemons.

dellog

Logical; when TRUE, remove slave log files (if applicable).

Details

npRmpi.start() ensures that a slave pool exists (spawning if needed) and runs np.mpi.initialize() on all ranks via mpi.bcast.cmd().

npRmpi.stop() is idempotent: if no slaves are running it returns silently. When options(npRmpi.reuse.slaves=TRUE) (default on some systems), force=FALSE performs a soft-close to keep daemons alive for reuse within the session; use force=TRUE to actually shut down the slaves.

npRmpi.session.info() prints and returns a list of useful version, platform, and MPI/communicator details to aid reproducibility and bug reports.

Examples

## Not run: 
## Not run in checks: excluded to keep MPI examples stable and check times short.
## Start once, run many examples, then stop.
npRmpi.start(nslaves=1)

## ... run np* calls here ...

## Soft-stop (may keep daemons alive for reuse)
npRmpi.stop()

## Hard-stop (actually shuts down slaves)
## npRmpi.stop(force=TRUE)

## End(Not run)

Kernel Conditional Density Estimation with Mixed Data Types

Description

npcdens computes kernel conditional density estimates on p+q-variate evaluation data, given a set of training data (both explanatory and dependent) and a bandwidth specification (a conbandwidth object or a bandwidth vector, bandwidth type, and kernel type) using the method of Hall, Racine, and Li (2004). The data may be continuous, discrete (unordered and ordered factors), or some combination thereof.

Usage

npcdens(bws, ...)

## S3 method for class 'formula'
npcdens(bws, data = NULL, newdata = NULL, ...)

## S3 method for class 'call'
npcdens(bws, ...)

## S3 method for class 'conbandwidth'
npcdens(bws,
        txdat = stop("invoked without training data 'txdat'"),
        tydat = stop("invoked without training data 'tydat'"),
        exdat,
        eydat,
        gradients = FALSE,
        ...)

## Default S3 method:
npcdens(bws, txdat, tydat, ...)

Arguments

bws

a bandwidth specification. This can be set as a conbandwidth object returned from a previous invocation of npcdensbw, or as a p+q-vector of bandwidths, with each element i up to i=q corresponding to the bandwidth for column i in tydat, and each element i from i=q+1 to i=p+q corresponding to the bandwidth for column i-q in txdat. If specified as a vector, then additional arguments will need to be supplied as necessary to specify the bandwidth type, kernel types, training data, and so on.

gradients

a logical value specifying whether to return estimates of the gradients at the evaluation points. Defaults to FALSE.

...

additional arguments supplied to specify the bandwidth type, kernel types, and so on. This is necessary if you specify bws as a p+q-vector and not a conbandwidth object, and you do not desire the default behaviours. To do this, you may specify any of bwmethod, bwscaling, bwtype, cxkertype, cxkerorder, cykertype, cykerorder, uxkertype, uykertype, oxkertype, oykertype, as described in npcdensbw.

data

an optional data frame, list or environment (or object coercible to a data frame by as.data.frame) containing the variables in the model. If not found in data, the variables are taken from environment(bws), typically the environment from which npcdensbw was called.

newdata

An optional data frame in which to look for evaluation data. If omitted, the training data are used.

txdat

a p-variate data frame of sample realizations of explanatory data (training data). Defaults to the training data used to compute the bandwidth object.

tydat

a q-variate data frame of sample realizations of dependent data (training data). Defaults to the training data used to compute the bandwidth object.

exdat

a p-variate data frame of explanatory data on which conditional densities will be evaluated. By default, evaluation takes place on the data provided by txdat.

eydat

a q-variate data frame of dependent data on which conditional densities will be evaluated. By default, evaluation takes place on the data provided by tydat.
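The ordering of a bandwidth vector passed as bws (dependent-variable bandwidths first, then explanatory-variable bandwidths) can be sketched in base R as follows. The dimensions and bandwidth values are hypothetical and serve only to show the indexing.

```r
## Hypothetical dimensions and values, for illustration only.
q <- 1L                      # number of columns in tydat
p <- 2L                      # number of columns in txdat
bws <- c(0.40, 1.10, 0.25)   # a p+q-vector of bandwidths

## Elements 1..q belong to tydat, elements q+1..p+q to txdat.
ybw <- bws[seq_len(q)]
xbw <- bws[q + seq_len(p)]
```

Here ybw holds the bandwidth for the single dependent variable and xbw holds the bandwidths for the two explanatory variables, in column order.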

Details

npcdens implements a variety of methods for estimating multivariate conditional distributions (p+q-variate) defined over a set of possibly continuous and/or discrete (unordered, ordered) data. The approach is based on Li and Racine (2004) who employ ‘generalized product kernels’ that admit a mix of continuous and discrete data types.

Three classes of kernel estimators for the continuous data types are available: fixed, adaptive nearest-neighbor, and generalized nearest-neighbor. Adaptive nearest-neighbor bandwidths change with each sample realization in the set, x_i, when estimating the density at the point x. Generalized nearest-neighbor bandwidths change with the point at which the density is estimated, x. Fixed bandwidths are constant over the support of x.
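The distinction between the three bandwidth classes can be sketched for a univariate density with a Gaussian kernel in base R. This is a minimal illustration under stated assumptions (simulated data, a hypothetical k and h, and a simple k-th nearest-neighbor distance); it is not the package's implementation.

```r
set.seed(42)
x <- rnorm(100)   # simulated sample (illustration only)
k <- 10           # hypothetical number of nearest neighbours

## Distance from 'pt' to its k-th nearest point in 'pts'.
knn_dist <- function(pt, pts, k) sort(abs(pts - pt))[k]

## Fixed: a single bandwidth h over the whole support.
f_fixed <- function(x0, h = 0.4) mean(dnorm((x0 - x) / h) / h)

## Generalized nearest-neighbor: bandwidth depends on the evaluation point x0.
f_gnn <- function(x0) {
  h <- knn_dist(x0, x, k)
  mean(dnorm((x0 - x) / h) / h)
}

## Adaptive nearest-neighbor: one bandwidth per sample realization x_i
## (k + 1 is used because each x_i is at distance 0 from itself).
h_i <- vapply(x, function(xi) knn_dist(xi, x, k + 1), numeric(1))
f_ann <- function(x0) mean(dnorm((x0 - x) / h_i) / h_i)

est <- c(fixed = f_fixed(0), gnn = f_gnn(0), ann = f_ann(0))
```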

Training and evaluation input data may be a mix of continuous (default), unordered discrete (to be specified in the data frames using factor), and ordered discrete (to be specified in the data frames using ordered). Data can be entered in an arbitrary order and data types will be detected automatically by the routine (see npRmpi for details).

A variety of kernels may be specified by the user. Kernels implemented for continuous data types include the second, fourth, sixth, and eighth order Gaussian and Epanechnikov kernels, and the uniform kernel. Unordered discrete data types use a variation on Aitchison and Aitken's (1976) kernel, while ordered data types use a variation of the Wang and van Ryzin (1981) kernel.
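For orientation, the textbook forms of the two discrete kernels named above can be written in a few lines of R. The package uses variations on these kernels, so the functions below (aa_kernel and wvr_kernel are hypothetical names) should be read as a sketch of the standard definitions rather than the package's exact code.

```r
## Aitchison and Aitken (1976) kernel for an unordered variable with
## n_levels categories: weight 1 - lambda on a match, and the remaining
## lambda spread evenly over the other categories.
aa_kernel <- function(xi, x, lambda, n_levels) {
  ifelse(xi == x, 1 - lambda, lambda / (n_levels - 1))
}

## Wang and van Ryzin (1981) kernel for an ordered variable coded as
## integers: weights decay geometrically with the distance |xi - x|.
wvr_kernel <- function(xi, x, lambda) {
  ifelse(xi == x, 1 - lambda, 0.5 * (1 - lambda) * lambda^abs(xi - x))
}

## Hypothetical usage: three categories, evaluation point 2, lambda = 0.3.
aa_w  <- aa_kernel(1:3, 2, lambda = 0.3, n_levels = 3)
wvr_w <- wvr_kernel(1:3, 2, lambda = 0.3)
```

With lambda = 0 both kernels reduce to an indicator of an exact match, which corresponds to no smoothing of the discrete variable.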

Value

npcdens returns a condensity object. The generic accessor functions fitted, se, and gradients extract estimated values, asymptotic standard errors on estimates, and gradients, respectively, from the returned object. Furthermore, the functions predict, summary, and plot support condensity objects. The returned object has the following components:

xbw

bandwidth(s), scale factor(s) or nearest neighbours for the explanatory data, txdat

ybw

bandwidth(s), scale factor(s) or nearest neighbours for the dependent data, tydat

xeval

the evaluation points of the explanatory data

yeval

the evaluation points of the dependent data

condens

estimates of the conditional density at the evaluation points

conderr

standard errors of the conditional density estimates

congrad

if invoked with gradients = TRUE, estimates of the gradients at the evaluation points

congerr

if invoked with gradients = TRUE, standard errors of the gradients at the evaluation points

log_likelihood

log likelihood of the conditional density estimate

Usage Issues

If you are using data of mixed types, then it is advisable to use the data.frame function to construct your input data and not cbind, since cbind will typically not work as intended on mixed data types and will coerce the data to the same type.
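The difference is easy to see in base R. In the sketch below (variable names are hypothetical), cbind produces a numeric matrix in which the factor has been replaced by its internal integer codes, whereas data.frame keeps each column's type.

```r
x_num <- c(1.5, 2.5, 3.5)
x_fac <- factor(c("a", "b", "a"))

m  <- cbind(x_num, x_fac)        # numeric matrix: factor becomes integer codes
df <- data.frame(x_num, x_fac)   # each column keeps its own type

is.numeric(m)       # the factor information is lost
sapply(df, class)   # column classes are preserved
```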

Author(s)

Tristen Hayfield tristen.hayfield@gmail.com, Jeffrey S. Racine racinej@mcmaster.ca

References

Aitchison, J. and C.G.G. Aitken (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413-420.

Hall, P., J.S. Racine, and Q. Li (2004), “Cross-validation and the estimation of conditional probability densities,” Journal of the American Statistical Association, 99, 1015-1026.

Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Pagan, A. and A. Ullah (1999), Nonparametric Econometrics, Cambridge University Press.

Scott, D.W. (1992), Multivariate Density Estimation. Theory, Practice and Visualization, New York: Wiley.

Silverman, B.W. (1986), Density Estimation, London: Chapman and Hall.

Wang, M.C. and J. van Ryzin (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301-309.

Examples

## Not run: 
## Not run in checks: excluded to keep MPI examples stable and check times short.
## The following example is adapted for interactive parallel execution
## in R. Here we spawn 1 slave so that there will be two compute nodes
## (master and slave).  Kindly see the batch examples in the demos
## directory (npRmpi/demos) and study them carefully. Also kindly see
## the more extensive examples in the np package itself. See the npRmpi
## vignette for further details on running parallel np programs via
## vignette("npRmpi",package="npRmpi").

## Start npRmpi for interactive execution. If slaves are already running and
## `options(npRmpi.reuse.slaves=TRUE)` (default on some systems), this will
## reuse the existing pool instead of respawning. To change the number of
## slaves, call `npRmpi.stop(force=TRUE)` then restart.
npRmpi.start(nslaves=1)

mpi.bcast.cmd(data("Italy"),
              caller.execute=TRUE)
mpi.bcast.cmd(attach(Italy),
              caller.execute=TRUE)

mpi.bcast.cmd(bw <- npcdensbw(formula=gdp~ordered(year)),
              caller.execute=TRUE)

mpi.bcast.cmd(fhat <- npcdens(bws=bw),
              caller.execute=TRUE)

summary(fhat)

## For the interactive run only we close the slaves perhaps to proceed
## with other examples and so forth. This is redundant in batch mode.

## Note: on some systems (notably macOS+MPICH), repeatedly spawning and
## tearing down slaves in the same R session can lead to hangs/crashes.
## npRmpi may therefore keep slave daemons alive by default and
## `npRmpi.stop()` performs a "soft close". Use `force=TRUE` to
## actually shut down the slaves.
##
## You can disable reuse via `options(npRmpi.reuse.slaves=FALSE)` or by
## setting the environment variable `NP_RMPI_NO_REUSE_SLAVES=1` before
## loading the package.

npRmpi.stop()               ## soft close (may keep slaves alive)
## npRmpi.stop(force=TRUE)  ## hard close

## Note that in order to exit npRmpi properly avoid quit(), and instead
## use mpi.quit() as follows.

## mpi.bcast.cmd(mpi.quit(),
##               caller.execute=TRUE)

## End(Not run)

Kernel Conditional Density Bandwidth Selection with Mixed Data Types

Description

npcdensbw computes a conbandwidth object for estimating the conditional density of a p+q-variate kernel density estimator defined over mixed continuous and discrete (unordered, ordered) data using either the normal-reference rule-of-thumb, likelihood cross-validation, or least-squares cross-validation using the method of Hall, Racine, and Li (2004).

Usage

npcdensbw(...)

## S3 method for class 'formula'
npcdensbw(formula, data, subset, na.action, call, ...)

## S3 method for class 'NULL'
npcdensbw(xdat = stop("data 'xdat' missing"),
          ydat = stop("data 'ydat' missing"),
          bws, ...)

## S3 method for class 'conbandwidth'
npcdensbw(xdat = stop("data 'xdat' missing"),
          ydat = stop("data 'ydat' missing"),
          bws,
          bandwidth.compute = TRUE,
          nmulti,
          remin = TRUE,
          itmax = 10000,
          ftol = 1.490116e-07,
          tol = 1.490116e-04,
          small = 1.490116e-05,
          memfac = 500,
          lbc.dir = 0.5,
          dfc.dir = 3,
          cfac.dir = 2.5*(3.0-sqrt(5)),
          initc.dir = 1.0,
          lbd.dir = 0.1,
          hbd.dir = 1,
          dfac.dir = 0.25*(3.0-sqrt(5)),
          initd.dir = 1.0,
          lbc.init = 0.1,
          hbc.init = 2.0,
          cfac.init = 0.5,
          lbd.init = 0.1,
          hbd.init = 0.9,
          dfac.init = 0.375, 
          scale.init.categorical.sample = FALSE,
          transform.bounds = FALSE,
          invalid.penalty = c("baseline","dbmax"),
          penalty.multiplier = 10,
          ...)

## Default S3 method:
npcdensbw(xdat = stop("data 'xdat' missing"),
          ydat = stop("data 'ydat' missing"),
          bws,
          bandwidth.compute = TRUE,
          nmulti,
          remin,
          itmax,
          ftol,
          tol,
          small,
          memfac,
          lbc.dir,
          dfc.dir,
          cfac.dir,
          initc.dir,
          lbd.dir,
          hbd.dir,
          dfac.dir,
          initd.dir,
          lbc.init,
          hbc.init,
          cfac.init,
          lbd.init,
          hbd.init,
          dfac.init,
          scale.init.categorical.sample,
          transform.bounds,
          invalid.penalty,
          penalty.multiplier,
          bwmethod,
          bwscaling,
          bwtype,
          cxkertype,
          cxkerorder,
          cykertype,
          cykerorder,
          uxkertype,
          uykertype,
          oxkertype,
          oykertype,
          ...)

Arguments

formula

a symbolic description of variables on which bandwidth selection is to be performed. The details of constructing a formula are described below.

data

an optional data frame, list or environment (or object coercible to a data frame by as.data.frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which the function is called.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The (recommended) default is na.omit.

call

the original function call. This is passed internally by npRmpi when a bandwidth search has been implied by a call to another function. It is not recommended that the user set this.

xdat

a p-variate data frame of explanatory data on which bandwidth selection will be performed. The data types may be continuous, discrete (unordered and ordered factors), or some combination thereof.

ydat

a q-variate data frame of dependent data on which bandwidth selection will be performed. The data types may be continuous, discrete (unordered and ordered factors), or some combination thereof.

bws

a bandwidth specification. This can be set as a conbandwidth object returned from a previous invocation, or as a p+q-vector of bandwidths, with each element i up to i=q corresponding to the bandwidth for column i in ydat, and each element i from i=q+1 to i=p+q corresponding to the bandwidth for column i-q in xdat. In either case, the bandwidth supplied will serve as a starting point in the numerical search for optimal bandwidths. If specified as a vector, then additional arguments will need to be supplied as necessary to specify the bandwidth type, kernel types, selection methods, and so on. This can be left unset.

...

additional arguments supplied to specify the bandwidth type, kernel types, selection methods, and so on, detailed below.

bwmethod

which method to use to select bandwidths. cv.ml specifies likelihood cross-validation, cv.ls specifies least-squares cross-validation, and normal-reference just computes the ‘rule-of-thumb’ bandwidth h_j using the standard formula h_j = 1.06 \sigma_j n^{-1/(2P+l)}, where \sigma_j is an adaptive measure of spread of the jth continuous variable defined as min(standard deviation, mean absolute deviation/1.4826, interquartile range/1.349), n the number of observations, P the order of the kernel, and l the number of continuous variables. Note that when there exist factors and the normal-reference rule is used, there is zero smoothing of the factors. Defaults to cv.ml.

bwscaling

a logical value: when TRUE, the supplied bandwidths are interpreted as ‘scale factors’ (c_j); when FALSE, they are interpreted as ‘raw bandwidths’ (h_j for continuous data types, \lambda_j for discrete data types). For continuous data types, c_j and h_j are related by the formula h_j = c_j \sigma_j n^{-1/(2P+l)}, where \sigma_j is an adaptive measure of spread of continuous variable j defined as min(standard deviation, mean absolute deviation/1.4826, interquartile range/1.349), n the number of observations, P the order of the kernel, and l the number of continuous variables. For discrete data types, c_j and h_j are related by the formula h_j = c_j n^{-2/(2P+l)}, where here j denotes discrete variable j. Defaults to FALSE.

bwtype

character string used for the continuous variable bandwidth type, specifying the type of bandwidth to compute and return in the conbandwidth object. Defaults to fixed. Option summary:
fixed: compute fixed bandwidths
generalized_nn: compute generalized nearest neighbors
adaptive_nn: compute adaptive nearest neighbors

bandwidth.compute

a logical value which specifies whether to do a numerical search for bandwidths or not. If set to FALSE, a conbandwidth object will be returned with bandwidths set to those specified in bws. Defaults to TRUE.

cxkertype

character string used to specify the continuous kernel type for xdat. Can be set as gaussian, epanechnikov, or uniform. Defaults to gaussian.

cxkerorder

numeric value specifying kernel order for xdat (one of (2,4,6,8)). Kernel order specified along with a uniform continuous kernel type will be ignored. Defaults to 2.

cykertype

character string used to specify the continuous kernel type for ydat. Can be set as gaussian, epanechnikov, or uniform. Defaults to gaussian.

cykerorder

numeric value specifying kernel order for ydat (one of (2,4,6,8)). Kernel order specified along with a uniform continuous kernel type will be ignored. Defaults to 2.

uxkertype

character string used to specify the unordered categorical kernel type. Can be set as aitchisonaitken or liracine. Defaults to aitchisonaitken.

uykertype

character string used to specify the unordered categorical kernel type. Can be set as aitchisonaitken or liracine.

oxkertype

character string used to specify the ordered categorical kernel type. Can be set as wangvanryzin or liracine. Defaults to liracine.

oykertype

character string used to specify the ordered categorical kernel type. Can be set as wangvanryzin or liracine.

nmulti

integer number of times to restart the process of finding extrema of the cross-validation function from different (random) initial points

remin

a logical value; when set to TRUE, the search routine restarts from located minima for a minor gain in accuracy. Defaults to TRUE.

itmax

integer number of iterations before failure in the numerical optimization routine. Defaults to 10000.

ftol

fractional tolerance on the value of the cross-validation function evaluated at located minima (of order the machine precision or perhaps slightly larger so as not to be diddled by roundoff). Defaults to 1.490116e-07 (1.0e+01*sqrt(.Machine$double.eps)).

tol

tolerance on the position of located minima of the cross-validation function (tol should generally be no smaller than the square root of your machine's floating point precision). Defaults to 1.490116e-04 (1.0e+04*sqrt(.Machine$double.eps)).

small

a small number used to bracket a minimum (it is hopeless to ask for a bracketing interval of width less than sqrt(epsilon) times its central value, a fractional width of only about 10^{-4} (single precision) or 3x10^{-8} (double precision)). Defaults to small = 1.490116e-05 (1.0e+03*sqrt(.Machine$double.eps)).

lbc.dir, dfc.dir, cfac.dir, initc.dir

lower bound, chi-square degrees of freedom, stretch factor, and initial non-random values for direction set search for Powell's algorithm for numeric variables. See Details.

lbd.dir, hbd.dir, dfac.dir, initd.dir

lower bound, upper bound, stretch factor, and initial non-random values for direction set search for Powell's algorithm for categorical variables. See Details.

lbc.init, hbc.init, cfac.init

lower bound, upper bound, and non-random initial values for scale factors for numeric variables for Powell's algorithm. See Details.

lbd.init, hbd.init, dfac.init

lower bound, upper bound, and non-random initial values for scale factors for categorical variables for Powell's algorithm. See Details.

scale.init.categorical.sample

a logical value that when set to TRUE scales lbd.dir, hbd.dir, dfac.dir, and initd.dir by n^{-2/(2P+l)}, n the number of observations, P the order of the kernel, and l the number of numeric variables. See Details.

transform.bounds

a logical value that when set to TRUE applies an internal transformation that maps the unconstrained search to the feasible bandwidth domain. Defaults to FALSE.

invalid.penalty

a character string specifying the penalty used when the optimizer encounters invalid bandwidths. "baseline" returns a finite penalty based on a baseline objective; "dbmax" returns DBL_MAX. Defaults to "baseline".

penalty.multiplier

a numeric multiplier applied to the baseline penalty when invalid.penalty="baseline". Defaults to 10.

memfac

The algorithm to compute the least-squares objective function uses a block-based algorithm to eliminate or minimize redundant kernel evaluations. Due to memory, hardware and software constraints, a maximum block size must be imposed by the algorithm. This block size is roughly equal to memfac*10^5 elements. Empirical tests on modern hardware find that a memfac of 500 performs well. If you experience out of memory errors, or strange behaviour for large data sets (>100k elements) setting memfac to a lower value may fix the problem.
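The default search tolerances listed above (ftol, tol, and small) are all multiples of the square root of the machine's double-precision epsilon. The short base R sketch below reproduces the documented values; the variable names are chosen for illustration.

```r
## Square root of double-precision machine epsilon (about 1.49e-08).
eps_root <- sqrt(.Machine$double.eps)

ftol_default  <- 1.0e+01 * eps_root   # documented default 1.490116e-07
tol_default   <- 1.0e+04 * eps_root   # documented default 1.490116e-04
small_default <- 1.0e+03 * eps_root   # documented default 1.490116e-05

signif(c(ftol = ftol_default, tol = tol_default, small = small_default), 7)
```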

Details

npcdensbw implements a variety of methods for choosing bandwidths for multivariate distributions (p+q-variate) defined over a set of possibly continuous and/or discrete (unordered, ordered) data. The approach is based on Li and Racine (2004) who employ ‘generalized product kernels’ that admit a mix of continuous and discrete data types.

The cross-validation methods employ multivariate numerical search algorithms (direction set (Powell's) methods in multidimensions).

Bandwidths can (and will) differ for each variable which is, of course, desirable.

npcdensbw may be invoked either with a formula-like symbolic description of variables on which bandwidth selection is to be performed or through a simpler interface whereby data is passed directly to the function via the xdat and ydat parameters. Use of these two interfaces is mutually exclusive.

Data contained in the data frames xdat and ydat may be a mix of continuous (default), unordered discrete (to be specified in the data frames using factor), and ordered discrete (to be specified in the data frames using ordered). Data can be entered in an arbitrary order and data types will be detected automatically by the routine (see npRmpi for details).

Data for which bandwidths are to be estimated may be specified symbolically. A typical description has the form dependent data ~ explanatory data, where dependent data and explanatory data are both series of variables specified by name, separated by the separation character '+'. For example, y1 + y2 ~ x1 + x2 specifies that the bandwidths for the joint distribution of variables y1 and y2 conditioned on x1 and x2 are to be estimated. See below for further examples.
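The split of such a formula into dependent and explanatory variables can be inspected with base R alone. The sketch below only illustrates the notation; it does not show how the package parses formulas internally.

```r
## A formula can be built without the variables existing yet.
f <- y1 + y2 ~ x1 + x2

dep_vars <- all.vars(f[[2]])   # left-hand side: dependent data
exp_vars <- all.vars(f[[3]])   # right-hand side: explanatory data

dep_vars
exp_vars
```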

The optimizer invoked for search is Powell's conjugate direction method which requires the setting of (non-random) initial values and search directions for bandwidths, and, when restarting, random values for successive invocations. Bandwidths for numeric variables are scaled by robust measures of spread, the sample size, and the number of numeric variables where appropriate. Two sets of parameters for bandwidths for numeric variables can be modified, those for initial values for the parameters themselves, and those for the directions taken (Powell's algorithm does not involve explicit computation of the function's gradient). The default values are set by considering search performance for a variety of difficult test cases and simulated cases. We highly recommend restarting search a large number of times to avoid the presence of local minima (achieved by modifying nmulti). Further refinement for difficult cases can be achieved by modifying these sets of parameters. However, these parameters are intended more for the authors of the package to enable ‘tuning’ for various methods rather than for the user themselves.
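The scaling of a numeric-variable bandwidth by spread and sample size follows the relation h_j = c_j \sigma_j n^{-1/(2P+l)} given under bwscaling, with c_j = 1.06 for the normal-reference rule. A minimal base R sketch for one continuous variable is shown below. The robust spread uses R's mad(), which already includes the 1.4826 consistency constant; this is an assumption about how the absolute-deviation term is computed and may differ from the package's definition.

```r
set.seed(1)
x <- rnorm(250)    # simulated continuous variable (illustration only)

n <- length(x)     # number of observations
P <- 2             # kernel order
l <- 1             # number of continuous variables

## Adaptive measure of spread. ASSUMPTION: mad() stands in for the
## absolute-deviation term described in the bwscaling argument.
sigma_j <- min(sd(x), mad(x), IQR(x) / 1.349)

c_j <- 1.06                                    # normal-reference scale factor
h_j <- c_j * sigma_j * n^(-1 / (2 * P + l))    # raw bandwidth
```

Reading the relation in the other direction, a raw bandwidth h_j corresponds to the scale factor h_j / (sigma_j * n^(-1/(2P+l))).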

Value

npcdensbw returns a conbandwidth object, with the following components:

xbw

bandwidth(s), scale factor(s) or nearest neighbours for the explanatory data, xdat

ybw

bandwidth(s), scale factor(s) or nearest neighbours for the dependent data, ydat

fval

objective function value at minimum

The functions predict, summary and plot support objects of type conbandwidth.

Usage Issues

Caution: multivariate data-driven bandwidth selection methods are, by their nature, computationally intensive. Virtually all methods require dropping the ith observation from the data set, computing an object, repeating this for all observations in the sample, then averaging each of these leave-one-out estimates for a given value of the bandwidth vector, and only then repeating this a large number of times in order to conduct multivariate numerical minimization/maximization. Furthermore, due to the potential for local minima/maxima, restarting this procedure a large number of times may often be necessary. This can be frustrating for users possessing large datasets. For exploratory purposes, you may wish to override the default search tolerances, say, setting ftol=.01 and tol=.01 and conduct multistarting (the default is to restart min(5, ncol(xdat,ydat)) times) as is done for a number of examples. Once the procedure terminates, you can restart search with default tolerances using those bandwidths obtained from the less rigorous search (i.e., set bws=bw on subsequent calls to this routine where bw is the initial bandwidth object). Because npRmpi is the MPI-based parallel implementation of the np package, it can be deployed in a clustered computing environment to facilitate computation involving large datasets.
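The leave-one-out idea and the coarse-then-fine search strategy can be sketched in base R for a single continuous variable with a fixed bandwidth. This is an illustration of likelihood cross-validation using optimize(), not the package's Powell-based multivariate search; the data, search interval, and coarse tolerance are hypothetical.

```r
set.seed(1)
x <- rnorm(200)   # simulated data (illustration only)
n <- length(x)

## Leave-one-out log likelihood for bandwidth h (Gaussian kernel).
loo_loglik <- function(h) {
  K <- dnorm(outer(x, x, "-") / h) / h   # n x n matrix of kernel weights
  diag(K) <- 0                           # drop the i-th observation
  f_loo <- rowSums(K) / (n - 1)          # leave-one-out density at each x_i
  sum(log(f_loo))
}

## Coarse exploratory search with a loose tolerance.
coarse <- optimize(loo_loglik, interval = c(0.05, 2),
                   maximum = TRUE, tol = 0.01)

## Refined search with the default tolerance, started near the coarse result.
fine <- optimize(loo_loglik,
                 interval = c(coarse$maximum / 2, coarse$maximum * 2),
                 maximum = TRUE)

c(coarse = coarse$maximum, fine = fine$maximum)
```

Each evaluation of loo_loglik costs on the order of n^2 kernel evaluations, which is why loosening tolerances for an exploratory pass and refining afterwards can save substantial time on large samples.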

Author(s)

Tristen Hayfield tristen.hayfield@gmail.com, Jeffrey S. Racine racinej@mcmaster.ca

References

Aitchison, J. and C.G.G. Aitken (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413-420.

Hall, P., J.S. Racine, and Q. Li (2004), “Cross-validation and the estimation of conditional probability densities,” Journal of the American Statistical Association, 99, 1015-1026.

Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Pagan, A. and A. Ullah (1999), Nonparametric Econometrics, Cambridge University Press.

Scott, D.W. (1992), Multivariate Density Estimation. Theory, Practice and Visualization, New York: Wiley.

Silverman, B.W. (1986), Density Estimation, London: Chapman and Hall.

Wang, M.C. and J. van Ryzin (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301-309.

Examples

## Not run: 
## Not run in checks: excluded to keep MPI examples stable and check times short.
## The following example is adapted for interactive parallel execution
## in R. Here we spawn 1 slave so that there will be two compute nodes
## (master and slave).  Kindly see the batch examples in the demos
## directory (npRmpi/demos) and study them carefully. Also kindly see
## the more extensive examples in the np package itself. See the npRmpi
## vignette for further details on running parallel np programs via
## vignette("npRmpi",package="npRmpi").

## Start npRmpi for interactive execution. If slaves are already running and
## `options(npRmpi.reuse.slaves=TRUE)` (default on some systems), this will
## reuse the existing pool instead of respawning. To change the number of
## slaves, call `npRmpi.stop(force=TRUE)` then restart.
npRmpi.start(nslaves=1)

mpi.bcast.cmd(data("Italy"),
              caller.execute=TRUE)
mpi.bcast.cmd(attach(Italy),
              caller.execute=TRUE)

mpi.bcast.cmd(bw <- npcdensbw(formula=gdp~ordered(year)),
              caller.execute=TRUE)

summary(bw)

## For the interactive run only we close the slaves perhaps to proceed
## with other examples and so forth. This is redundant in batch mode.

## Note: on some systems (notably macOS+MPICH), repeatedly spawning and
## tearing down slaves in the same R session can lead to hangs/crashes.
## npRmpi may therefore keep slave daemons alive by default and
## `npRmpi.stop()` performs a "soft close". Use `force=TRUE` to
## actually shut down the slaves.
##
## You can disable reuse via `options(npRmpi.reuse.slaves=FALSE)` or by
## setting the environment variable `NP_RMPI_NO_REUSE_SLAVES=1` before
## loading the package.

npRmpi.stop()               ## soft close (may keep slaves alive)
## npRmpi.stop(force=TRUE)  ## hard close

## Note that to exit npRmpi properly, avoid quit() and instead
## use mpi.quit() as follows.

## mpi.bcast.cmd(mpi.quit(),
##               caller.execute=TRUE)

## End(Not run)

Kernel Conditional Distribution Estimation with Mixed Data Types

Description

npcdist computes kernel cumulative conditional distribution estimates on p+q-variate evaluation data, given a set of training data (both explanatory and dependent) and a bandwidth specification (a condbandwidth object or a bandwidth vector, bandwidth type, and kernel type) using the method of Li and Racine (2008) and Li, Lin, and Racine (2013). The data may be continuous, discrete (unordered and ordered factors), or some combination thereof.

Usage

npcdist(bws, ...)

## S3 method for class 'formula'
npcdist(bws, data = NULL, newdata = NULL, ...)

## S3 method for class 'call'
npcdist(bws, ...)

## S3 method for class 'condbandwidth'
npcdist(bws,
        txdat = stop("invoked without training data 'txdat'"),
        tydat = stop("invoked without training data 'tydat'"),
        exdat,
        eydat,
        gradients = FALSE,
        ...)

## Default S3 method:
npcdist(bws, txdat, tydat, ...)

Arguments

bws

a bandwidth specification. This can be set as a condbandwidth object returned from a previous invocation of npcdistbw, or as a p+q-vector of bandwidths, with each element i up to i=q corresponding to the bandwidth for column i in tydat, and each element i from i=q+1 to i=p+q corresponding to the bandwidth for column i-q in txdat. If specified as a vector, then additional arguments will need to be supplied as necessary to specify the bandwidth type, kernel types, training data, and so on.

gradients

a logical value specifying whether to return estimates of the gradients at the evaluation points. Defaults to FALSE.

...

additional arguments supplied to specify the bandwidth type, kernel types, and so on. This is necessary if you specify bws as a p+q-vector and not a condbandwidth object, and you do not desire the default behaviours. To do this, you may specify any of bwmethod, bwscaling, bwtype, cxkertype, cxkerorder, cykertype, cykerorder, uxkertype, oxkertype, oykertype, as described in npcdistbw.

data

newdata

An optional data frame in which to look for evaluation data. If omitted, the training data are used.

txdat

a p-variate data frame of sample realizations of explanatory data (training data). Defaults to the training data used to compute the bandwidth object.

tydat

a q-variate data frame of sample realizations of dependent data (training data). Defaults to the training data used to compute the bandwidth object.

exdat

a p-variate data frame of explanatory data on which cumulative conditional distributions will be evaluated. By default, evaluation takes place on the data provided by txdat.

eydat

a q-variate data frame of dependent data on which cumulative conditional distributions will be evaluated. By default, evaluation takes place on the data provided by tydat.
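The ordering of a bandwidth vector described under bws (dependent variables first, explanatory variables second) can be sketched as follows. The variable names and bandwidth values are hypothetical and chosen only to make the ordering visible; the snippet uses base R and does not call npRmpi.

```r
# Hypothetical bandwidths for q = 1 dependent and p = 2 explanatory variables
ybw <- c(gdp = 0.8)                  # columns of tydat come first
xbw <- c(year = 0.5, income = 1.2)   # columns of txdat follow

q <- length(ybw)
p <- length(xbw)

# p+q-vector in the order expected by the bws argument
bws_vec <- c(ybw, xbw)

bws_vec[seq_len(q)]       # elements 1..q: bandwidths for tydat
bws_vec[q + seq_len(p)]   # elements q+1..p+q: bandwidths for txdat
```

When bws is supplied this way, the training data (txdat, tydat) and any non-default bandwidth or kernel options must also be passed, as noted under the ... argument.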

Details

npcdist implements a variety of methods for estimating multivariate conditional cumulative distributions (p+q-variate) defined over a set of possibly continuous and/or discrete (unordered, ordered) data. The approach is based on Li and Racine (2004) who employ ‘generalized product kernels’ that admit a mix of continuous and discrete data types.
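As a rough illustration of what a kernel conditional CDF estimate computes, the sketch below treats the simplest case only: one continuous x, one continuous y, second-order Gaussian kernels, and fixed bandwidths. It assumes a local-constant form in which smoothed indicators of y_i <= y are averaged with kernel weights in x. The function cond_cdf_sketch and the bandwidths hx and hy are hypothetical names, not part of npRmpi, and the package's own computation (mixed data types, other kernels and bandwidth types) is more general.

```r
# Illustrative only: cond_cdf_sketch, hx and hy are hypothetical names.
cond_cdf_sketch <- function(y_eval, x_eval, y, x, hy, hx) {
  w <- dnorm((x - x_eval) / hx)   # kernel weights in the x direction
  G <- pnorm((y_eval - y) / hy)   # smoothed version of the indicator y_i <= y_eval
  sum(G * w) / sum(w)             # kernel-weighted average of smoothed indicators
}

set.seed(123)
x <- runif(200)
y <- x + rnorm(200, sd = 0.2)

# Estimated F(y | x = 0.5) on an increasing grid of y values
y_grid <- seq(-0.5, 1.5, by = 0.25)
F_hat <- sapply(y_grid, function(yy) {
  cond_cdf_sketch(yy, x_eval = 0.5, y = y, x = x, hy = 0.1, hx = 0.1)
})
```

Because the Gaussian CDF is increasing in its argument and the weights do not depend on y, the resulting estimate is non-decreasing in y and lies between 0 and 1.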

Three classes of kernel estimators for the continuous data types are available: fixed, adaptive nearest-neighbor, and generalized nearest-neighbor. Adaptive nearest-neighbor bandwidths change with each sample realization in the set, x_i, when estimating the cumulative conditional distribution at the point x. Generalized nearest-neighbor bandwidths change with the point at which the cumulative conditional distribution is estimated, x. Fixed bandwidths are constant over the support of x.
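The difference between the three bandwidth classes can be sketched for a single continuous variable. The snippet below assumes the usual definition of a nearest-neighbor bandwidth as the distance to the k-th nearest sample point; the helper kth_nn_dist and the values of k and h_fixed are hypothetical, and the package's internal computation may differ in detail.

```r
# Hypothetical helper: distance from `point` to its k-th nearest value in `sample`
kth_nn_dist <- function(point, sample, k) sort(abs(sample - point))[k]

set.seed(42)
x <- rnorm(50)            # sample realizations x_i
x_eval <- c(-1, 0, 1)     # points x at which the estimate is wanted
k <- 5

# Fixed: one bandwidth over the whole support of x
h_fixed <- 0.5

# Generalized nearest-neighbor: one bandwidth per evaluation point x
h_gnn <- sapply(x_eval, function(pt) kth_nn_dist(pt, x, k))

# Adaptive nearest-neighbor: one bandwidth per sample realization x_i,
# measured against the remaining sample points
h_ann <- sapply(seq_along(x), function(i) kth_nn_dist(x[i], x[-i], k))
```

In sparse regions of the data both nearest-neighbor variants produce larger bandwidths, whereas the fixed bandwidth stays the same everywhere.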

Value

npcdist returns a condistribution object. The generic accessor functions fitted, se, and gradients extract estimated values, asymptotic standard errors on estimates, and gradients, respectively, from the returned object. Furthermore, the functions predict, summary, and plot support objects of this class. The returned object has the following components:

xbw

bandwidth(s), scale factor(s) or nearest neighbours for the explanatory data, txdat

ybw

bandwidth(s), scale factor(s) or nearest neighbours for the dependent data, tydat

xeval

the evaluation points of the explanatory data

yeval

the evaluation points of the dependent data

condist

estimates of the conditional cumulative distribution at the evaluation points

conderr

standard errors of the cumulative conditional distribution estimates

congrad

if invoked with gradients = TRUE, estimates of the gradients at the evaluation points

congerr

if invoked with gradients = TRUE, standard errors of the gradients at the evaluation points

log_likelihood

log likelihood of the cumulative conditional distribution estimate

Usage Issues

Author(s)

Tristen Hayfield <tristen.hayfield@gmail.com>, Jeffrey S. Racine <racinej@mcmaster.ca>

References

Aitchison, J. and C.G.G. Aitken (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413-420.

Hall, P., J.S. Racine and Q. Li (2004), “Cross-validation and the estimation of conditional probability densities,” Journal of the American Statistical Association, 99, 1015-1026.

Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Li, Q. and J.S. Racine (2008), “Nonparametric estimation of conditional CDF and quantile functions with mixed categorical and continuous data,” Journal of Business and Economic Statistics, 26, 423-434.

Li, Q., J. Lin and J.S. Racine (2013), “Optimal bandwidth selection for nonparametric conditional distribution and quantile functions,” Journal of Business and Economic Statistics, 31, 57-65.

Pagan, A. and A. Ullah (1999), Nonparametric Econometrics, Cambridge University Press.

Scott, D.W. (1992), Multivariate Density Estimation. Theory, Practice and Visualization, New York: Wiley.

Silverman, B.W. (1986), Density Estimation, London: Chapman and Hall.

Wang, M.C. and J. van Ryzin (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301-309.

Examples

## Not run: 
## Not run in checks: this example performs bandwidth search on panel data and
## can be too slow/unstable for automated MPI checks.
## The following example is adapted for interactive parallel execution
## in R. Here we spawn 1 slave so that there will be two compute nodes
## (master and slave).  Kindly see the batch examples in the demos
## directory (npRmpi/demos) and study them carefully. Also kindly see
## the more extensive examples in the np package itself. See the npRmpi
## vignette for further details on running parallel np programs via
## vignette("npRmpi",package="npRmpi").

## Start npRmpi for interactive execution. If slaves are already running and
## `options(npRmpi.reuse.slaves=TRUE)` (default on some systems), this will
## reuse the existing pool instead of respawning. To change the number of
## slaves, call `npRmpi.stop(force=TRUE)` then restart.
npRmpi.start(nslaves=1)

data("Italy")

mpi.bcast.Robj2slave(Italy)

mpi.bcast.cmd(bw <- npcdistbw(formula=gdp~ordered(year),
                              data=Italy),
              caller.execute=TRUE)

mpi.bcast.cmd(F <- npcdist(bws=bw),
              caller.execute=TRUE)

summary(F)

## For the interactive run only, we close the slaves so that we can
## proceed with other examples. This is redundant in batch mode.

## Note: on some systems (notably macOS+MPICH), repeatedly spawning and
## tearing down slaves in the same R session can lead to hangs/crashes.
## npRmpi may therefore keep slave daemons alive by default and
## `npRmpi.stop()` performs a "soft close". Use `force=TRUE` to
## actually shut down the slaves.
##
## You can disable reuse via `options(npRmpi.reuse.slaves=FALSE)` or by
## setting the environment variable `NP_RMPI_NO_REUSE_SLAVES=1` before
## loading the package.

npRmpi.stop()               ## soft close (may keep slaves alive)
## npRmpi.stop(force=TRUE)  ## hard close

## Note that to exit npRmpi properly, avoid quit() and instead
## use mpi.quit() as follows.

## mpi.bcast.cmd(mpi.quit(),
##               caller.execute=TRUE)

## End(Not run)

Kernel Conditional Distribution Bandwidth Selection with Mixed Data Types

Description

npcdistbw computes a condbandwidth object for use with a p+q-variate kernel conditional cumulative distribution estimator defined over mixed continuous and discrete (unordered xdat, ordered xdat and ydat) data, using either the normal-reference rule-of-thumb or the least-squares cross-validation method of Li and Racine (2008) and Li, Lin and Racine (2013).

Usage

npcdistbw(...)

## S3 method for class 'formula'
npcdistbw(formula, data, subset, na.action, call, gdata = NULL, ...)

## S3 method for class 'NULL'
npcdistbw(xdat = stop("data 'xdat' missing"),
          ydat = stop("data 'ydat' missing"),
          bws, ...)

## S3 method for class 'condbandwidth'
npcdistbw(xdat = stop("data 'xdat' missing"),
          ydat = stop("data 'ydat' missing"),
          gydat = NULL,
          bws,
          bandwidth.compute = TRUE,
          nmulti,
          remin = TRUE,
          itmax = 10000,
          do.full.integral = FALSE,
          ngrid = 100,
          ftol = 1.490116e-07,
          tol = 1.490116e-04,
          small = 1.490116e-05,
          memfac = 500.0,
          lbc.dir = 0.5,
          dfc.dir = 3,
          cfac.dir = 2.5*(3.0-sqrt(5)),
          initc.dir = 1.0,
          lbd.dir = 0.1,
          hbd.dir = 1,
          dfac.dir = 0.25*(3.0-sqrt(5)),
          initd.dir = 1.0,
          lbc.init = 0.1,
          hbc.init = 2.0,
          cfac.init = 0.5,
          lbd.init = 0.1,
          hbd.init = 0.9,
          dfac.init = 0.375, 
          scale.init.categorical.sample = FALSE,
          transform.bounds = FALSE,
          invalid.penalty = c("baseline","dbmax"),
          penalty.multiplier = 10,
          ...)

## Default S3 method:
npcdistbw(xdat = stop("data 'xdat' missing"),
          ydat = stop("data 'ydat' missing"),
          gydat,
          bws,
          bandwidth.compute = TRUE,
          nmulti,
          remin,
          itmax,
          do.full.integral,
          ngrid,
          ftol,
          tol,
          small,
          memfac,
          lbc.dir,
          dfc.dir,
          cfac.dir,
          initc.dir,
          lbd.dir,
          hbd.dir,
          dfac.dir,
          initd.dir,
          lbc.init,
          hbc.init,
          cfac.init,
          lbd.init,
          hbd.init,
          dfac.init,
          scale.init.categorical.sample,
          transform.bounds,
          invalid.penalty,
          penalty.multiplier,
          bwmethod,
          bwscaling,
          bwtype,
          cxkertype,
          cxkerorder,
          cykertype,
          cykerorder,
          uxkertype,
          oxkertype,
          oykertype,
          ...)

Arguments

formula

a symbolic description of variables on which bandwidth selection is to be performed. The details of constructing a formula are described below.

data

an optional data frame, list or environment (or object coercible to a data frame by as.data.frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which the function is called.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

call

the original function call. This is passed internally by npRmpi when a bandwidth search has been implied by a call to another function. It is not recommended that the user set this.

gdata

a grid of data on which the indicator function for least-squares cross-validation is to be computed (can be the sample or a grid of quantiles).

xdat

a p-variate data frame of explanatory data on which bandwidth selection will be performed. The data types may be continuous, discrete (unordered and ordered factors), or some combination thereof.

ydat

a q-variate data frame of dependent data on which bandwidth selection will be performed. The data types may be continuous, discrete (ordered factors), or some combination thereof.

gydat

a grid of data on which the indicator function for least-squares cross-validation is to be computed (can be the sample or a grid of quantiles for ydat).

bws

a bandwidth specification. This can be set as a condbandwidth object returned from a previous invocation, or as a p+q-vector of bandwidths, with each element i up to i=q corresponding to the bandwidth for column i in ydat, and each element i from i=q+1 to i=p+q corresponding to the bandwidth for column i-q in xdat. In either case, the bandwidth supplied will serve as a starting point in the numerical search for optimal bandwidths. If specified as a vector, then additional arguments will need to be supplied as necessary to specify the bandwidth type, kernel types, selection methods, and so on. This can be left unset.

...

additional arguments supplied to specify the bandwidth type, kernel types, selection methods, and so on, detailed below.

bwmethod

which method to use to select bandwidths. cv.ls specifies least-squares cross-validation (Li, Lin and Racine (2013)), and normal-reference computes the ‘rule-of-thumb’ bandwidth h_j using the standard formula h_j = 1.06 \sigma_j n^{-1/(2P+l)}, where \sigma_j is an adaptive measure of spread of the jth continuous variable defined as min(standard deviation, mean absolute deviation/1.4826, interquartile range/1.349), n is the number of observations, P is the order of the kernel, and l is the number of continuous variables. Note that when there exist factors and the normal-reference rule is used, there is zero smoothing of the factors. Defaults to cv.ls.

bwscaling

bwtype

character string used for the continuous variable bandwidth type, specifying the type of bandwidth to compute and return in the condbandwidth object. Defaults to fixed. Option summary:
fixed: compute fixed bandwidths
generalized_nn: compute generalized nearest neighbors
adaptive_nn: compute adaptive nearest neighbors

bandwidth.compute

a logical value which specifies whether to do a numerical search for bandwidths or not. If set to FALSE, a condbandwidth object will be returned with bandwidths set to those specified in bws. Defaults to TRUE.

cxkertype

character string used to specify the continuous kernel type for xdat. Can be set as gaussian, epanechnikov, or uniform. Defaults to gaussian.

cxkerorder

numeric value specifying kernel order for xdat (one of (2,4,6,8)). Kernel order specified along with a uniform continuous kernel type will be ignored. Defaults to 2.

cykertype

character string used to specify the continuous kernel type for ydat. Can be set as gaussian, epanechnikov, or uniform. Defaults to gaussian.

cykerorder

numeric value specifying kernel order for ydat (one of (2,4,6,8)). Kernel order specified along with a uniform continuous kernel type will be ignored. Defaults to 2.

uxkertype

character string used to specify the unordered categorical kernel type for xdat. Can be set as aitchisonaitken or liracine. Defaults to aitchisonaitken.

oxkertype

character string used to specify the ordered categorical kernel type for xdat. Can be set as wangvanryzin or liracine. Defaults to liracine.

oykertype

character string used to specify the ordered categorical kernel type for ydat. Can be set as wangvanryzin or liracine.

nmulti

integer number of times to restart the search for extrema of the cross-validation function from different (random) initial points.

remin

a logical value which, when set to TRUE, causes the search routine to restart from located minima for a minor gain in accuracy. Defaults to TRUE.

itmax

integer number of iterations before failure in the numerical optimization routine. Defaults to 10000.

do.full.integral

a logical value which, when set to TRUE, evaluates the moment-based integral on the entire sample. Defaults to FALSE.

ngrid

integer number of grid points to use when computing the moment-based integral. Defaults to 100.

ftol

tol

small

lbc.dir, dfc.dir, cfac.dir, initc.dir

lower bound, chi-square degrees of freedom, stretch factor, and initial non-random values for direction set search for Powell's algorithm for numeric variables. See Details.

lbd.dir, hbd.dir, dfac.dir, initd.dir

lower bound, upper bound, stretch factor, and initial non-random values for direction set search for Powell's algorithm for categorical variables. See Details.

lbc.init, hbc.init, cfac.init

lower bound, upper bound, and non-random initial values for scale factors for numeric variables for Powell's algorithm. See Details.

lbd.init, hbd.init, dfac.init

lower bound, upper bound, and non-random initial values for scale factors for categorical variables for Powell's algorithm. See Details.

scale.init.categorical.sample

transform.bounds

a logical value that when set to TRUE applies an internal transformation that maps the unconstrained search to the feasible bandwidth domain. Defaults to FALSE.

invalid.penalty

penalty.multiplier

a numeric multiplier applied to the baseline penalty when invalid.penalty="baseline". Defaults to 10.

memfac

The least-squares objective function is computed with a block-based algorithm that eliminates or minimizes redundant kernel evaluations. Because of memory, hardware, and software constraints, a maximum block size must be imposed. This block size is roughly equal to memfac*10^5 elements. Empirical tests on modern hardware find that a memfac of around 500 performs well. If you experience out-of-memory errors or strange behaviour for large data sets (>100k elements), setting memfac to a lower value may fix the problem. Defaults to 500.
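The normal-reference rule given under bwmethod, h_j = 1.06 \sigma_j n^{-1/(2P+l)}, can be worked through numerically. The helper normal_reference_bw below is hypothetical and not an npRmpi function; it takes the spread measure \sigma_j as an input rather than computing it, so it illustrates only the rate and the constant.

```r
# Hypothetical helper, not part of npRmpi.
# sigma: spread measure for the variable; n: number of observations;
# P: kernel order; l: number of continuous variables.
normal_reference_bw <- function(sigma, n, P = 2, l = 1) {
  1.06 * sigma * n^(-1 / (2 * P + l))
}

# Second-order kernel, one continuous variable, 100 observations, unit spread
h <- normal_reference_bw(sigma = 1, n = 100, P = 2, l = 1)

# With more continuous variables the exponent shrinks in magnitude,
# so the bandwidth for the same n is larger
h3 <- normal_reference_bw(sigma = 1, n = 100, P = 2, l = 3)
```

With P = 2 and l = 1 the exponent is -1/5, the familiar univariate rate.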

Details

npcdistbw implements a variety of methods for choosing bandwidths for multivariate distributions (p+q-variate) defined over a set of possibly continuous and/or discrete (unordered xdat, ordered xdat and ydat) data. The approach is based on Li and Racine (2004) who employ ‘generalized product kernels’ that admit a mix of continuous and discrete data types.

The cross-validation methods employ multivariate numerical search algorithms (direction set (Powell's) methods in multidimensions).
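The role of nmulti (restarting the search from several random initial points and keeping the best result) can be sketched in base R. This is an illustration under stated assumptions, not the package's search: optim() with the Nelder-Mead method stands in for Powell's direction set method, and multistart_minimize and toy_cv are hypothetical names.

```r
# Hypothetical helper, not part of npRmpi. optim() with Nelder-Mead is a
# stand-in for the package's Powell direction set search.
multistart_minimize <- function(objective, lower, upper, nmulti = 5) {
  best <- NULL
  for (i in seq_len(nmulti)) {
    start <- runif(length(lower), min = lower, max = upper)   # random initial point
    fit <- optim(start, objective, method = "Nelder-Mead")
    if (is.null(best) || fit$value < best$value) best <- fit  # keep the lowest value
  }
  best
}

# Toy stand-in for a cross-validation function of two bandwidths,
# minimized at h = (1, 0.5)
toy_cv <- function(h) (h[1] - 1)^2 + (h[2] - 0.5)^2

set.seed(1)
best <- multistart_minimize(toy_cv, lower = c(0.1, 0.1), upper = c(2, 2),
                            nmulti = 5)
best$par     # located minimizer
best$value   # objective function value at the minimum (compare fval)
```

Real cross-validation surfaces can have several local minima, which is why restarting from different initial points and retaining the smallest objective value is worthwhile.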

Bandwidths can (and will) differ for each variable, which is, of course, desirable.

Three classes of kernel estimators for the continuous data types are available: fixed, adaptive nearest-neighbor, and generalized nearest-neighbor. Adaptive nearest-neighbor bandwidths change with each sample realization in the set, x_i, when estimating the cumulative distribution at the point x. Generalized nearest-neighbor bandwidths change with the point at which the cumulative distribution is estimated, x. Fixed bandwidths are constant over the support of x.

npcdistbw may be invoked either with a formula-like symbolic description of variables on which bandwidth selection is to be performed or through a simpler interface whereby data is passed directly to the function via the xdat and ydat parameters. Use of these two interfaces is mutually exclusive.

Data contained in the data frame xdat may be a mix of continuous (default), unordered discrete (to be specified in the data frames using factor), and ordered discrete (to be specified in the data frames using ordered). Data contained in the data frame ydat may be a mix of continuous (default) and ordered discrete (to be specified in the data frames using ordered). Data can be entered in an arbitrary order and data types will be detected automatically by the routine (see npRmpi for details).
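The data type conventions above can be sketched with simulated data. The column names and values below are hypothetical; the point is only that continuous columns are left as numeric, unordered discrete columns are wrapped in factor, and ordered discrete columns are wrapped in ordered. The calls to npcdistbw are left commented out because they need a running npRmpi session.

```r
set.seed(7)
n <- 20

# Explanatory data: continuous, unordered discrete, and ordered discrete
xdat <- data.frame(
  income = rnorm(n),                                      # continuous (default)
  region = factor(sample(c("north", "south"), n, TRUE)),  # unordered discrete
  year   = ordered(sample(2001:2003, n, TRUE))            # ordered discrete
)

# Dependent data: continuous and ordered discrete only
ydat <- data.frame(
  gdp    = rnorm(n),
  rating = ordered(sample(c("low", "mid", "high"), n, TRUE),
                   levels = c("low", "mid", "high"))
)

# Leading class of each explanatory column
col_types <- sapply(xdat, function(col) class(col)[1])

# Data-frame interface, after npRmpi.start() and broadcasting the data:
# mpi.bcast.Robj2slave(xdat)
# mpi.bcast.Robj2slave(ydat)
# mpi.bcast.cmd(bw <- npcdistbw(xdat = xdat, ydat = ydat),
#               caller.execute = TRUE)
```

The column order within each data frame is arbitrary, since the types are read from the columns themselves.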

Value

npcdistbw returns a condbandwidth object, with the following components:

xbw

bandwidth(s), scale factor(s) or nearest neighbours for the explanatory data, xdat

ybw

bandwidth(s), scale factor(s) or nearest neighbours for the dependent data, ydat

fval

objective function value at minimum

The functions predict, summary and plot support objects of type condbandwidth.