In this release, we introduce several estimation methods aimed at accommodating bandwidths that may vary for each variable and adapt according to location. Following the initial works of Yang et al (2011) for developing a Flexible Bandwidth approach for GWR, Yang (2014); Leong and Yue (2017); Lu et al (2017); Fotheringham et al (2017) (hereafter FT2017) made a significant contribution to this issue by allowing for the identification of different optimal bandwidths for each variable j, using jacobi iteration or backfitting algorithm. However existing implementation of the multiscale GWR using backfitting methods demonstrates sluggish performance and is only suitable for datasets containing fewer than 2500 observations and fewer than 6 explanatory variables. To achieve this for larger sample size, we rely on an innovative approach that departs from the optimization of a single optimal bandwidth and instead utilizes a decreasing series of bandwidths in a manner reminiscent of gradient boosting algorithms.
The gwr_multiscale function directly transcribes the backfitting procedure used in the gwr.multiscale of the GWmodel package. It leverages the mgwrsar estimation framework to reduce estimation time and enables the estimation of multiscale GWR with spatial dependence. The criterion used for finding optimal (univariate) bandwidth stopping criteria is based on AICc while RMSE is used as stopping criteria for the backfitting algorithm. For this method a demo is provided only for the analyses of in sample Betas estimates.
The tds_mgwr is an innovative approach for multiscale GWR that departs from the original backfitting procedure proposed by Fotheringham and al. (2017) by employing a set of decreasing bandwidths without fully optimizing the bandwidth at each iteration of the backfitting algorithm. It leverages the top-down scale method, employing a backfitting procedure only to preserve the favorable properties observed in the univariate case. This drastically reduces the number of estimations, and convergence is ensured by a test procedure conducted in parallel on a batch of ascending and descending bandwidths. Along the algorithm the full AICc is computed using Yu et al. (2020) proposition.
The atds_mgwr procedure constitutes a second step after the estimation of tds_mgwr, once the optimal bandwidths for each variable are known. It launches a backfitting on each variable to perform an univariate tds_gwr estimation to test if adapting the bandwidth based on location improves the estimations. (The tds_gwr model is an innovative top-down scale approach for univariate GWR, allowing adaptation of the bandwidth to the data’s location. It employs a gradient boosting framework instead of a backfitting procedure.)
Both tds_mgwr and atds_mgwr imply several \(n\times n\) matrix multiplications for computing the hat matrix, using Wu et al 2018 for backfitting part and Buhlmann and Hothorn (2007) for the gradient boosting part. For both cases we propose approximation method to speed up calculation.
In the following example, the simu_multiscale allow to simulate the DGP proposed by Geniaux 2025 (Top down scale approaches for multiscale GWR and other semi-parametric partially linear varying coefficient regression with multiple locally-self-adaptive bandwidths, in prep) illustrated in the next figure.
library(mgwrsar)
n=1000
simu1=simu_multiscale(n,myseed = 5,config_snr=0.9,config_beta='default',config_eps='normal')
mydata=simu1$mydata
coords=simu1$coords
TRUEBETA<-as.matrix(mydata[,c('Beta1','Beta2','Beta3','Beta4')],ncol=4)
myformula<-as.formula('Y ~ X1 + X2 + X3')With a GWR, we obtain for this simulated data a global bandwidth equal to 0.04339215 and a mean RMSE of the Beta coefficients equal to 0.8726403.
############
## GWR as a benshmark model (we use the whole dataset for estimation)
############
res1=golden_search_bandwidth(formula=myformula,H2=NULL,data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'), Model = 'GWR',control=list(verbose=F,NN=300,criterion='AICc',adaptive=F),lower.bound=0, upper.bound=1,tolerance=0.001)
res1$minimum # 0.04339215## [1] 0.04339215## [1] 4755.424##    Intercept            X1               X2                X3          
##  Min.   :-1.856   Min.   :-1.317   Min.   :-3.0094   Min.   :-3.92712  
##  1st Qu.: 1.167   1st Qu.: 1.116   1st Qu.:-0.5407   1st Qu.:-1.91987  
##  Median : 2.052   Median : 2.304   Median : 0.1700   Median :-0.19165  
##  Mean   : 2.075   Mean   : 2.330   Mean   : 0.2149   Mean   :-0.05694  
##  3rd Qu.: 2.856   3rd Qu.: 3.522   3rd Qu.: 0.9500   3rd Qu.: 1.72766  
##  Max.   : 6.890   Max.   : 6.092   Max.   : 3.2380   Max.   : 5.48828### Beta RMSE
Beta_RMSE=apply((res1$model@Betav-TRUEBETA %>% as.data.frame()),2,rmse)
Beta_RMSE #0.5719872 0.6340170 1.0473393 1.2372179##     Beta1     Beta2     Beta3     Beta4 
## 0.5719872 0.6340170 1.0473393 1.2372179## [1] 0.8726403## [1] 4755.426The multiscale_gwr function implements the backfitting algorithm proposed by Fotheringham et al. (2017). Key features of this implementation include:
While this implementation of multiscale GWR does not significantly reduce the bias of Beta coefficients compared to standard GWR, it highlights the challenges of selecting appropriate bandwidths for complex spatial patterns. For instance, the bandwidth chosen for \(\beta_4\) (1.359149) is too large, leading to suboptimal results.
############
## multiscale_gwr 
############
model_multiscale_gwr <-multiscale_gwr(formula=myformula,data=mydata,coords=coords,kernels='gauss',verbose=T,get_AICg=TRUE,control=list(SE=F,adaptive=F,NN=300,isgcv=FALSE,verbose=F,family=gaussian(),doMC=F, ncore=1),init='GWR',nstable=5,tolerance =0.0001)## H0= 0.04345371 
## AICc= 4755.423  RMSE= 1.698892 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  1  H  0.05462896 0.05796469 0.03096999 1.359149  RMSE= 3.045944  delta_rmse  0.414929  AICg= 5657.412  stable 1 1 1 1 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  2  H  0.1214272 0.06336201 0.03926689 1.359149  RMSE= 3.007683  delta_rmse  0.01256141  AICg= 5400.491  stable 1 1 1 2 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  3  H  0.131717 0.06479464 0.03917704 1.359149  RMSE= 3.015468  delta_rmse  -0.002588628  AICg= 5397.519  stable 1 1 1 3 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  4  H  0.1337998 0.06502987 0.0391215 1.359149  RMSE= 3.016645  delta_rmse  -0.0003902467  AICg= 5397.2  stable 1 1 1 4 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  5  H  0.1341804 0.06502987 0.0391215 1.359149  RMSE= 3.016867  delta_rmse  -7.342889e-05  AICg= 5397.149  stable 1 2 2 5 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  6  H  0.134236 0.06502987 0.0391215 1.359149  RMSE= 3.016904  delta_rmse  -1.224577e-05  AICg= 5397.144  stable 1 3 3 6 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  7  H  0.1342703 0.06502987 0.0391215 1.359149  RMSE= 3.01692  delta_rmse  -5.271203e-06  AICg= 5397.142  stable 1 4 4 7 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  8  H  0.1342703 0.06502987 0.0391215 1.359149  RMSE= 3.016921  delta_rmse  -3.256739e-07  AICg= 5397.142  stable 2 5 5 8 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  9  H  0.1342703 0.06502987 0.0391215 1.359149  RMSE= 3.016921  delta_rmse  -3.307275e-08  AICg= 5397.142  stable 3 6 6 9 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  10  H  0.1342703 0.06502987 0.0391215 1.359149  RMSE= 3.016921  delta_rmse  -3.812308e-09  AICg= 5397.142  stable 4 7 7 10 
## 
##  backfitting   Intercept    X1    X2    X3  
##  iter  11  H  0.1342703 0.06502987 0.0391215 1.359149  RMSE= 3.016921  delta_rmse  -4.792044e-10  AICg= 5397.142  stable 5 8 8 11 
## Time =  257.014## Call:
## MGWRSAR(formula = myformula, data = data, coords = coords, fixed_vars = NULL, 
##     kernels = kernels, H = c(h, H2), Model = "multiscale_gwr", 
##     control = control)
## Model: multiscale_gwr 
## Method for spatial autocorrelation: 2SLS 
## Kernels function: gauss 
## Kernels adaptive: NO 
## Kernels type: GD 
## Bandwidth: 0.1342703 0.06502987 0.0391215 1.359149 
## Computation time: 257.014 
## Use of parallel computing: FALSE  ncore = 1 
## Use of rough kernel: YES, 300 neighbors / 1000 
## Use of Target Points: NO 
## Number of data points: 1000 
## Varying parameters: X3 
##         Intercept        X1        X2      X3
## Min.     0.092348 -1.468016 -4.121051 -0.8917
## 1st Qu.  1.403305  1.329966 -0.720194 -0.2933
## Median   2.218123  2.284794  0.260265 -0.1072
## Mean     2.161043  2.343896  0.195910 -0.0408
## 3rd Qu.  2.847813  3.488064  0.981496  0.3363
## Max.     4.485215  5.771464  4.081508  0.7085
## Effective degrees of freedom: 996.5876 
## AICc: 5397.142 
## Residual sum of squares: 9101.81 
## RMSE: 3.016921# Computation time: 211.175 with macbookpro M1
### Beta RMSE
Beta_RMSE=apply((model_multiscale_gwr@Betav-TRUEBETA %>% as.data.frame()),2,rmse)
Beta_RMSE # 0.2915894 0.6517318 1.0764612 2.8593433 ##     Beta1     Beta2     Beta3     Beta4 
## 0.2915045 0.6529376 1.0763655 2.8593369## [1] 1.220036## [1] 5397.142The tds_mgwr algorithm is designed to achieve the same objectives as multiscale GWR, specifically to identify a unique bandwidth for each variable. The model is defined as:
\(y_{i}=\sum_{k=1}^{K}\beta_{k}(u_{i},v_{i};K(h_k))x_{ij}+\epsilon_{i}\)
############
### tds_mgwr 
############
# model estimate
model_tds_mgwr=tds_mgwr(formula=myformula,Model='tds_mgwr',data=mydata,coords=coords,kernels='gauss',fixed_vars=NULL,H2=NULL,control_tds=list(nns=30,get_AIC=T),control=list(adaptive=F,doMC=F,ncore=1))
# Estimation can be faster using get_AIC=F
### Beta RMSE
Beta_RMSE=apply((model_tds_mgwr@Betav-TRUEBETA %>% as.data.frame()),2,rmse)
Beta_RMSE # 0.1866531 0.4445685 0.8608283 0.9585331##     Beta1     Beta2     Beta3     Beta4 
## 0.1866531 0.4445685 0.8608283 0.9585331## [1] 0.6126457##          
## 4547.775The atds_mgwr algorithm extends tds_mgwr by incorporating location-specific bandwidths, enabling local adaptation and improved estimation accuracy. The model is defined as:
\(y_{i}=\sum_{k=1}^{K}\beta_{k}(u_{i},v_{i};K(h_k(u_i,v_i)))x_{ij}+\epsilon_{i}\)
The atds_mgwr algorithm integrates a backfitting approach with a univariate GWR gradient boosting algorithm. Unlike tds_mgwr, which applies GWR at each iteration of the backfitting process, atds_mgwr employs a gradient boosting GWR that follows a sequence of decreasing bandwidths (using the same sequence as in tds_mgwr). For further details, see Geniaux (2025).
The atds_mgwr algorithm significantly improves estimation accuracy compared to tds_mgwr:
############
## atds_mgwr 
############
model_atds_mgwr=tds_mgwr(formula=myformula,Model='atds_mgwr',data=mydata,coords=coords,kernels='gauss',fixed_vars=NULL,control_tds=list(nns=30,model_stage1=model_tds_mgwr),control=list(adaptive=F,verbose=FALSE,isgcv =FALSE,Type ='GD',doMC=F,ncore=1))
summary(model_atds_mgwr)## Call:
## tds_mgwr(formula = myformula, data = mydata, coords = coords, 
##     Model = "atds_mgwr", kernels = "gauss", fixed_vars = NULL, 
##     control_tds = list(nns = 30, model_stage1 = model_tds_mgwr), 
##     control = list(adaptive = F, verbose = FALSE, isgcv = FALSE, 
##         Type = "GD", doMC = F, ncore = 1))
## Model: atds_mgwr 
## Method for spatial autocorrelation:  
## Kernels function: gauss 
## Kernels adaptive: NO 
## Kernels type: GD 
## Bandwidth: 0.3218526 0.1437018 0.04387009 0.04730317 
## Computation time: 263.314 
## Use of parallel computing: FALSE  ncore = 1 
## Use of rough kernel: NO 
## Use of Target Points: NO 
## Number of data points: 1000 
## Varying parameters: Intercept X1 X2 X3 
##         Intercept       X1       X2      X3
## Min.     -0.65501 -0.92293 -4.32107 -5.6627
## 1st Qu.   1.19592  1.06110 -0.87681 -2.6450
## Median    2.06740  2.19465  0.18820 -0.5112
## Mean      2.05897  2.32279  0.24090 -0.1311
## 3rd Qu.   2.88027  3.56081  1.17471  2.3528
## Max.      4.98107  5.41300  7.74459  6.9789
## Effective degrees of freedom: 747.5733 
## AICc: 4222.661 
## Residual sum of squares:  
## RMSE: 1.422596### Beta RMSE
Beta_RMSE=apply((model_atds_mgwr@Betav-TRUEBETA %>% as.data.frame()),2,rmse)
Beta_RMSE # nrounds=3 0.1016979 0.2814159 0.7942419 0.6162484;##      Beta1      Beta2      Beta3      Beta4 
## 0.08675225 0.27901447 0.80444136 0.63377897## [1] 0.4509968## [1] 4222.661Even though considering specific bandwidth for each Beta, the multiscale_gwr algorithm performs poorly for estimating \(Beta_4\). In this simulation, the estimations of \(Beta_4\) are significantly improved using tds_mgwr, and even more so with atds_mgwr.
The following table compare rmse for Betas estimates using GWR and multiscale GWR like estimators of mgwrsar R package on a single simulation of the DGP proposed by Geniaux (2025).
| \[\beta_1\] | \[\beta_2\] | \[\beta_3\] | \[\beta_4\] | |
|---|---|---|---|---|
| GWR | 0.5719872 | 0.6340170 | 1.0473393 | 1.2372179 | 
| multiscale_gwr | 0.2915894 | 0.6517318 | 1.0764612 | 2.8593433 | 
| TDS_MGWR | 0.1866531 | 0.4445685 | 0.8608283 | 0.9585331 | 
| ATDS_MGWR | 0.1437748 | 0.3569051 | 0.8220955 | 0.6234205 | 
### PREDICTION
# build in sample and out sample data
# folds
length_out=800
index_in=sample(1:1000,length_out)
index_out=(1:1000)[-index_in]
# out sample data
newdata=mydata[index_out,]
newdata_coords=coords[index_out,]
## GWR model
res14=golden_search_bandwidth(formula=myformula,H2=NULL,data = mydata[index_in,],coords=coords[index_in,], fixed_vars=NULL,kernels=c('gauss'), Model = 'GWR',control=list(verbose=F,NN=300,criterion='AICc',adaptive=F),lower.bound=0, upper.bound=1,tolerance=0.001)
### three methods of predictions with GWR are now tested
# surprisingly, the target point method leads to outsample 
#estimates and predictions better than methods 'shepard' and 'model'
## outsample prediction with the GWR model with target points
res_pred1=predict(res14$model, newdata=newdata, newdata_coords=newdata_coords,method_pred='TP',beta_proj=TRUE)
Y_pred=res_pred1$Y_predicted
head(mydata$Y[index_out])## [1] -0.7209412  7.2004245 -1.1266273  8.4334616 -2.4885930 -1.5429266# RMSE for predicted Y on the out sample fold
gwr_out_rmse = sqrt(mean((mydata$Y[index_out]-Y_pred)^2))
sqrt(mean((mydata$Y[index_out]-Y_pred)^2)) #  2.831865## [1] 2.907076## [1] 1.747453# out sample RMSE for extrapolated beta coefficients
Beta_RMSE_outsample_gwr=apply((res_pred1$Beta_proj_out-TRUEBETA[index_out,] %>% as.data.frame()),2,rmse)
mean(Beta_RMSE_outsample_gwr) #  1.009409## [1] 1.096563# in sample RMSE for beta coefficients
Beta_RMSE_insample_gwr=apply((res14$model@Betav-TRUEBETA[index_in,] %>% as.data.frame()),2,rmse)
Beta_RMSE_insample_gwr##     Beta1     Beta2     Beta3     Beta4 
## 0.6210909 0.7177753 1.1229993 1.3723732#    Beta1     Beta2     Beta3     Beta4 
# 0.6129079 0.6707307 1.0983044 1.4200666 
# -> Beta 4  (and Beta 3) spatial structure are not well capture by the GWR
# (large RMSE)
mean(Beta_RMSE_insample_gwr) # 0.9505024## [1] 0.9585597## Predictions with the GWR model (method_pred='shepard')
res_pred3=predict(res14$model, newdata=newdata, newdata_coords=newdata_coords,method_pred='shepard',k_extra=8,beta_proj=TRUE)
Y_pred=res_pred3$Y_predicted
head(mydata$Y[index_out])## [1] -0.7209412  7.2004245 -1.1266273  8.4334616 -2.4885930 -1.5429266## [1] 2.897205## [1] 1.747453Beta_RMSE_outsample=apply((res_pred3$Beta_proj_out-TRUEBETA[index_out,] %>% as.data.frame()),2,rmse)
mean(Beta_RMSE_outsample) #  1.021705 ## [1] 1.095181## [1] 0.9585597## Predictions with the GWR model (method_pred='model')
res_pred4=predict(res14$model, newdata=newdata, newdata_coords=newdata_coords,method_pred='model',k_extra=5,beta_proj=TRUE)
Y_pred=res_pred4$Y_predicted
head(mydata$Y[index_out])## [1] -0.7209412  7.2004245 -1.1266273  8.4334616 -2.4885930 -1.5429266## [1] 3.043557## [1] 1.747453Beta_RMSE_outsample=apply((res_pred4$Beta_proj_out-TRUEBETA[index_out,] %>% as.data.frame()),2,rmse)
mean(Beta_RMSE_outsample) #  1.066749## [1] 1.11062## [1] 0.9585597##############################################
### Algorithm Top down scale MGWR (tds_mgwr)
# model calibration
model_tds_mgwr_in=tds_mgwr(formula=myformula,Model='tds_mgwr',data=mydata[index_in,],coords=coords[index_in,],kernels='gauss',fixed_vars=NULL,control_tds=list(nns=30,get_AIC=T),control=list(adaptive=F,verbose=FALSE,isgcv =FALSE,Type ='GD',doMC=F,ncore=1))
summary(model_tds_mgwr_in)## Call:
## tds_mgwr(formula = myformula, data = mydata[index_in, ], coords = coords[index_in, 
##     ], Model = "tds_mgwr", kernels = "gauss", fixed_vars = NULL, 
##     control_tds = list(nns = 30, get_AIC = T), control = list(adaptive = F, 
##         verbose = FALSE, isgcv = FALSE, Type = "GD", doMC = F, 
##         ncore = 1))
## Model: tds_mgwr 
## Method for spatial autocorrelation:  
## Kernels function: gauss 
## Kernels adaptive: NO 
## Kernels type: GD 
## Bandwidth: 0.07310735 0.0570755 0.03317522 0.03317522 
## Computation time: 102.001 
## Use of parallel computing: FALSE  ncore = 1 
## Use of rough kernel: NO 
## Use of Target Points: NO 
## Number of data points: 800 
## Varying parameters: Intercept X1 X2 X3 
##         Intercept       X1       X2      X3
## Min.     -1.10029 -0.25180 -3.11937 -4.4422
## 1st Qu.   1.17345  1.27204 -0.73879 -2.2401
## Median    2.34015  2.34312  0.17149 -0.2979
## Mean      2.13598  2.44952  0.19176 -0.1209
## 3rd Qu.   2.94012  3.66287  1.06636  2.0009
## Max.      5.15849  5.63382  4.58067  5.4436
## Effective degrees of freedom: 513.4306 
## AICc: 3721.427 
## Residual sum of squares:  
## RMSE: 1.414451# prediction on outsample data, beta coefficients extrapolation using (method_pred='tWtp_model')
res_pred2=predict(model_tds_mgwr_in, newdata=newdata, newdata_coords=newdata_coords,method_pred='tWtp_model',beta_proj=TRUE)
# predicted values for the dependent variable
Y_pred=res_pred2$Y_predicted
# RMSE for out sample predictions
sqrt(mean((mydata$Y[index_out]-Y_pred)^2)) #  2.565149 (# for GWR 2.831865)## [1] 2.587307## [1] 1.414451# note that tds_mgwr improves predictions RMSE comparing to GWR 
## RMSE on coefficients estimates
# out sample
Beta_RMSE_outsample_tds_mgwr=apply((res_pred2$Beta_proj_out-TRUEBETA[index_out,] %>% as.data.frame()),2,rmse)
# outsample rmse for each coefficient
Beta_RMSE_outsample_tds_mgwr##     Beta1     Beta2     Beta3     Beta4 
## 0.2977538 0.4905907 1.1510240 1.4276576            #Beta1     Beta2     Beta3     Beta4 
# TDS-MGWR 0.2030172 0.4936683 1.0147774 1.3803936 
# GWR      0.6129079 0.6707307 1.0983044 1.4200666 
## TDS-MGWR model improves outsample beta estimates espescially for
# Beta_1 and Beta_2. For Beta_4 the improvement is small.
# average RMSE
mean(Beta_RMSE_outsample_tds_mgwr) #  0.7729641 (GWR 1.009409)## [1] 0.8417565# in sample
Beta_RMSE_insample_tds_mgwr=apply((model_tds_mgwr_in@Betav-TRUEBETA[index_in,] %>% as.data.frame()),2,rmse)
Beta_RMSE_insample_tds_mgwr##     Beta1     Beta2     Beta3     Beta4 
## 0.2874578 0.4825926 0.9396987 1.0851012## [1] 0.6987126## prediction with (method_pred='shepard')
# prediction performances close to the ones obtained with (method_pred='tWtp_model')
res_pred6=predict(model_tds_mgwr_in, newdata=newdata, newdata_coords=newdata_coords,method_pred='shepard',k_extra=5,beta_proj=TRUE)
Y_pred=res_pred6$Y_predicted
sqrt(mean((mydata$Y[index_out]-Y_pred)^2)) # 2.617877 ## [1] 2.584122## [1] 1.414451Beta_RMSE_outsample_tds_mgwr=apply((res_pred6$Beta_proj_out-TRUEBETA[index_out,] %>% as.data.frame()),2,rmse)
mean(Beta_RMSE_outsample_tds_mgwr) #  0.7874575## [1] 0.8543411## [1] 0.6987126###################################################################
###################################################################
### atds_mgwr algorithm
# model estimate using in sample data, Gaussian non adaptive kernel
model_atds_mgwr_in=tds_mgwr(formula=myformula,Model='atds_mgwr',data=mydata[index_in,],coords=coords[index_in,],kernels='gauss',fixed_vars=NULL,control_tds=list(nns=30,get_AIC=T),control=list(adaptive=F,verbose=FALSE,isgcv =FALSE,Type ='GD',doMC=F,ncore=1))
summary(model_atds_mgwr_in)## Call:
## tds_mgwr(formula = myformula, data = mydata[index_in, ], coords = coords[index_in, 
##     ], Model = "atds_mgwr", kernels = "gauss", fixed_vars = NULL, 
##     control_tds = list(nns = 30, get_AIC = T), control = list(adaptive = F, 
##         verbose = FALSE, isgcv = FALSE, Type = "GD", doMC = F, 
##         ncore = 1))
## Model: atds_mgwr 
## Method for spatial autocorrelation:  
## Kernels function: gauss 
## Kernels adaptive: NO 
## Kernels type: GD 
## Bandwidth: 0.2532088 0.1367192 0.0432803 0.0522608 
## Computation time: 298.204 
## Use of parallel computing: FALSE  ncore = 1 
## Use of rough kernel: NO 
## Use of Target Points: NO 
## Number of data points: 800 
## Varying parameters: Intercept X1 X2 X3 
##         Intercept       X1       X2      X3
## Min.     -0.93601 -1.02440 -4.16357 -5.2688
## 1st Qu.   1.17042  1.13717 -0.93081 -2.5715
## Median    2.17956  2.33458  0.12425 -0.5127
## Mean      2.10036  2.40935  0.22340 -0.1595
## 3rd Qu.   3.00911  3.67388  1.29080  2.2497
## Max.      5.02085  5.37713 11.29336  7.3030
## Effective degrees of freedom: 574.2321 
## AICc: 3446.372 
## Residual sum of squares:  
## RMSE: 1.403203# prediction on out sample data, method_pred='tWtp_model' 
res_pred7=predict(model_atds_mgwr_in, newdata=newdata, newdata_coords=newdata_coords,method_pred='tWtp_model',beta_proj=TRUE)
Y_pred=res_pred7$Y_predicted
# RMSE for predicted Y on the out sample fold
sqrt(mean((mydata$Y[index_out]-Y_pred)^2)) ## [1] 2.43769# ATDS-MGWR 2.445863 # TDS-MGWR 2.565149  # GWR 2.831865)
# RMSE for insample predicted Y
model_atds_mgwr_in@RMSE # 1.380074## [1] 1.403203## beta estimates
RMSE_outsample=apply((res_pred7$Beta_proj_out-TRUEBETA[index_out,] %>% as.data.frame()),2,rmse)
mean(RMSE_outsample) # 0.7191038## [1] 0.778225RMSE_insample_atds=apply((model_atds_mgwr_in@Betav-TRUEBETA[index_in,] %>% as.data.frame()),2,rmse)
mean(RMSE_insample_atds) # 0.5305797## [1] 0.5684047## predictions with method_pred='shepard': 
# in this example this methods leads to outsample RMSE better than 'tWtp_model'
res_pred8=predict(model_atds_mgwr_in, newdata=newdata, newdata_coords=newdata_coords,method_pred='shepard',k_extra=8,beta_proj=TRUE)
# predictions
Y_pred=res_pred8$Y_predicted
sqrt(mean((mydata$Y[index_out]-Y_pred)^2)) #   2.372426## [1] 2.328874## [1] 1.403203# RMSE for beta estimates
RMSE_outsample_atds=apply((res_pred8$Beta_proj_out-TRUEBETA[index_out,] %>% as.data.frame()),2,rmse)
RMSE_outsample_atds##     Beta1     Beta2     Beta3     Beta4 
## 0.1700092 0.3449231 1.0705712 1.1302407            #Beta1     Beta2     Beta3     Beta4 
# ARTS-MGWR 0.1265100 0.3675659 1.0143203 1.0776401 
# TDS-MGWR  0.2030172 0.4936683 1.0147774 1.3803936 
# GWR       0.6129079 0.6707307 1.0983044 1.4200666 
                     # ATDS-MGWR   # TDS-MGWR  # GWR
mean(RMSE_outsample_atds) #  0.6465091  # 0.7729641 # 1.009409## [1] 0.678936## [1] 0.5684047#the method atds\_MGWR allows to significantly improve $\Beta$ estimates, both for in sample and out sample data.The results for out-of-sample prediction of the dependent variable demonstrate that both tds_mgwr and atds_mgwr significantly improve prediction accuracy compared to standard GWR estimates. The Root Mean Squared Prediction Error (RMSPE) values are as follows:
While atds_mgwr achieves the lowest RMSPE, the improvement in out-of-sample prediction accuracy between atds_mgwr and tds_mgwr is less pronounced than the improvement in the RMSE of the coefficients.
In this simulation, the GWR model, which employs a single bandwidth, fails to adequately capture the spatial structure of the \(\beta\) coefficients. Despite relatively low prediction errors, the estimation errors are significantly higher, indicating potential misspecification issues in the GWR model.
The tds_mgwr method addresses this limitation by assigning a unique bandwidth to each covariate, leading to substantial improvements in the estimation of \(\beta\) coefficients. This is particularly evident for \(\beta_1\) and \(\beta_2\), where the model achieves significantly better accuracy.
Building on the tds_mgwr model, atds_mgwr introduces location-specific bandwidths, enabling further improvements for \(\beta_1\) and \(\beta_2\). Additionally, atds_mgwr achieves a notable improvement for \(\beta_4\) by applying a sequence of bandwidths that better capture the spatial pattern of this coefficient.