library(spsur)
library(spdep)
library(spatialreg)
library(sf)

1 Introduction

The spsur R-package (Mínguez, López, and Mur 2022; López, Mı́nguez, and Mur 2020) has been designed for estimate spatial regression models in a multiequational framework. However, because of its flexibility, it is also possible to obtain useful results for uniequational models. On the oher hand, the spatialreg package (Bivand, Pebesma, and Gómez-Rubio (2013); Bivand and Piras (2015)), is the most adequate alternative for working with uniequational models, in a pure cross-sectional setting. The purpose of this vignette is to compare the results of spsur and spatialreg in case of uniequational models. As we will see, the differences between the two are negligible.

In sum, the purpose of this vignette is to compare the results of spsur and spatialreg in the following issues:

Estimation of spatially independent model, or SIM models.
Lagrange Multiplier misspecification tests for SIM models.
ML estimation of SLM, SEM, SARAR models.
3SLS estimation of SLM and SDM models.

2 The data set NCOVR

Throughout the vignette we use the dataset NCOVR (National Consortium on Violence Research). These data were employed by Baller et al. (2001) to analyze the incidence of homicides rates in the US counties. The dataset can be freely dowloaded from https://geodacenter.github.io/data-and-lab/ncovr/

NCOVR contains 3085 spatial units (counties), for 4 different cross-sections (1960, 1970, 1980, 1990) and 69 variables. According to Baller et al. (2001), the main variables of interest are:

HR: homicide rate per 100000 inhabitants
RD: resource deprivation
PS: population structure
MA: median age
DV: divorce rate (% males over 14 divorced)
UE: unemployment rate
SOUTH: dummy variable for Southern counties (South = 1)

First, we can read the NCOVR dataset as a simple feature (sf) object (named NCOVR.sf),

data(NCOVR, package = "spsur")

The first three observations in NCOVR appear below:

NCOVR <- st_drop_geometry(NCOVR.sf)
knitr::kable(
  head((NCOVR[1:3, 1:6])),
  caption = 'First observations of NCOVR dataset' )

NAME	STATE_NAME	FIPS	HR60	HR70
Lake of the Woods	Minnesota	27077	0.000000	0.000000
Ferry	Washington	53019	0.000000	0.000000
Stevens	Washington	53065	1.863863	1.915158

Whereas the geometry of the USA counties is shown in Figure 1:

plot(st_geometry(NCOVR.sf))

Figure 1: Geometry of the USA counties

Following Baller et al. (2001), we consider a W matrix based on the criterion of 10 nearest-neighbourhood, which is immediate to obtain using the spdep package (Bivand, Pebesma, and Gómez-Rubio (2013); Bivand and Wong (2018)). The resulting weighting matrix will be called listw. Note that this matrix is non-symmetric and it is row-standardized.

# Obtain coordinates of centroids
co <- sf::st_coordinates(sf::st_centroid(NCOVR.sf))
listw <- spdep::nb2listw(spdep::knn2nb(spdep::knearneigh(co, k = 10,longlat = TRUE)))

Baller et al. (2001) specify a single linear model to explain the case of HR in the year 1960 with the results shown in the Table below,

\[\begin{equation} HR_{60} = \beta_{0}+\beta_{1}RD_{60}+\beta_{2}PS_{60}+\beta_{3}MA_{60}+\beta_{4}DV_{60} +\beta_{5}UE_{60} +\beta_{6}SOUTH+\epsilon_{60} \tag{2.1} \end{equation}\]

formula_60 <- HR60 ~ RD60 + PS60 + MA60 + DV60 + UE60 + SOUTH
lm_60 <- stats::lm(formula = formula_60, data = NCOVR.sf)
summary(lm_60)

#> 
#> Call:
#> stats::lm(formula = formula_60, data = NCOVR.sf)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -14.026  -2.217  -0.635   1.393  88.312 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  8.12591    0.63465  12.804  < 2e-16 ***
#> RD60         1.79824    0.12341  14.571  < 2e-16 ***
#> PS60         0.35871    0.09216   3.892 0.000101 ***
#> MA60        -0.23047    0.01932 -11.931  < 2e-16 ***
#> DV60         1.16002    0.09483  12.233  < 2e-16 ***
#> UE60        -0.06195    0.03515  -1.762 0.078138 .  
#> SOUTH        2.63862    0.23325  11.312  < 2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 4.743 on 3078 degrees of freedom
#> Multiple R-squared:  0.2966, Adjusted R-squared:  0.2952 
#> F-statistic: 216.3 on 6 and 3078 DF,  p-value: < 2.2e-16

The model can be estimated in usual way with the function stats::lm (R Core Team (2019)),

formula_60 <- HR60 ~ RD60 + PS60 + MA60 + DV60 + UE60 + SOUTH
lm_60 <- lm(formula = formula_60, data = NCOVR.sf)
summary(lm_60)

#> 
#> Call:
#> lm(formula = formula_60, data = NCOVR.sf)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -14.026  -2.217  -0.635   1.393  88.312 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  8.12591    0.63465  12.804  < 2e-16 ***
#> RD60         1.79824    0.12341  14.571  < 2e-16 ***
#> PS60         0.35871    0.09216   3.892 0.000101 ***
#> MA60        -0.23047    0.01932 -11.931  < 2e-16 ***
#> DV60         1.16002    0.09483  12.233  < 2e-16 ***
#> UE60        -0.06195    0.03515  -1.762 0.078138 .  
#> SOUTH        2.63862    0.23325  11.312  < 2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 4.743 on 3078 degrees of freedom
#> Multiple R-squared:  0.2966, Adjusted R-squared:  0.2952 
#> F-statistic: 216.3 on 6 and 3078 DF,  p-value: < 2.2e-16

The same results can be obtained with the function spsurml from spsur, selecting the argument type = “sim”

ols.spsur <- spsurml(formula = formula_60, type = "sim", 
                     data = NCOVR.sf)

#> Initial point:  
#> log_lik:  -9176.338 
#> Iteration:  1  log_lik:  -9176.338 
#> Time to fit the model:  0.07  seconds
#> Time to compute covariances:  0  seconds

summary(ols.spsur)

#> Call:
#> spsurml(formula = formula_60, data = NCOVR.sf, type = "sim")
#> 
#>  
#> Spatial SUR model type:  sim 
#> 
#> Equation  1 
#>                Estimate Std. Error  t value  Pr(>|t|)    
#> (Intercept)_1  8.125915   0.634033  12.8162 < 2.2e-16 ***
#> RD60_1         1.798240   0.123294  14.5850 < 2.2e-16 ***
#> PS60_1         0.358706   0.092069   3.8960 9.986e-05 ***
#> MA60_1        -0.230475   0.019298 -11.9428 < 2.2e-16 ***
#> DV60_1         1.160020   0.094737  12.2446 < 2.2e-16 ***
#> UE60_1        -0.061948   0.035120  -1.7639   0.07785 .  
#> SOUTH_1        2.638618   0.233024  11.3234 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.2966 
#>   
#> Residual standard error:  4.739

The two functions, stats::lm and spsurml, produce identical results. The output of spsurml is shorter than that of lm so, depending on the necessities of the user, he/she can choose between lm and spsurml without loss of information. If you want all the estimation details, then use lm.

3 LM tests for spatial autocorrelation

The existence of omitted spatial dependence in the results of a SIM model, estimated by LS, can be tested by using the classical LM tests. The function spdep::lm.LMtests report the values of these Lagrange Multipliers

lmtest.spdep <- spdep::lm.LMtests(lm_60, listw, test = "all")
print(lmtest.spdep)

#> 
#>  Lagrange multiplier diagnostics for spatial dependence
#> 
#> data:  
#> model: lm(formula = formula_60, data = NCOVR.sf)
#> weights: listw
#> 
#> LMerr = 208.12, df = 1, p-value < 2.2e-16
#> 
#> 
#>  Lagrange multiplier diagnostics for spatial dependence
#> 
#> data:  
#> model: lm(formula = formula_60, data = NCOVR.sf)
#> weights: listw
#> 
#> LMlag = 232.1, df = 1, p-value < 2.2e-16
#> 
#> 
#>  Lagrange multiplier diagnostics for spatial dependence
#> 
#> data:  
#> model: lm(formula = formula_60, data = NCOVR.sf)
#> weights: listw
#> 
#> RLMerr = 1.1565, df = 1, p-value = 0.2822
#> 
#> 
#>  Lagrange multiplier diagnostics for spatial dependence
#> 
#> data:  
#> model: lm(formula = formula_60, data = NCOVR.sf)
#> weights: listw
#> 
#> RLMlag = 25.138, df = 1, p-value = 5.337e-07
#> 
#> 
#>  Lagrange multiplier diagnostics for spatial dependence
#> 
#> data:  
#> model: lm(formula = formula_60, data = NCOVR.sf)
#> weights: listw
#> 
#> SARMA = 233.25, df = 2, p-value < 2.2e-16

The same tests can be obtained with the function lmtestspsur

lmtest.spsur <- lmtestspsur(formula = formula_60, listw = listw, 
                            data = NCOVR.sf)
print(lmtest.spsur)

#> [[1]]
#> 
#>  LM-SUR-SLM
#> 
#> data:  NCOVR.sf
#> LM-stat = 231.96, df = 1, p-value < 2.2e-16
#> 
#> 
#> [[2]]
#> 
#>  LM-SUR-SEM
#> 
#> data:  NCOVR.sf
#> LM-stat = 207.98, df = 1, p-value < 2.2e-16
#> 
#> 
#> [[3]]
#> 
#>  LM*-SUR-SLM
#> 
#> data:  NCOVR.sf
#> LM-stat = 25.13, df = 1, p-value = 5.36e-07
#> 
#> 
#> [[4]]
#> 
#>  LM*-SUR-SEM
#> 
#> data:  NCOVR.sf
#> LM-stat = 1.1523, df = 1, p-value = 0.2831
#> 
#> 
#> [[5]]
#> 
#>  LM-SUR-SARAR
#> 
#> data:  NCOVR.sf
#> LM-stat = 233.11, df = 2, p-value < 2.2e-16

Note that the ordering of the battery of Lagrange Multipliers is not the same. Otherwise, the results are almost identical.

4 The Spatial Lag Model

Both R packages spatialreg and spsur can estimate Spatial Lag Models for a single cross-section. Continuing with the example before, the model that we want to estimate is:

\[ \begin{equation} HR_{60} = \rho W HR_{60} +\beta_{0}+\beta_{1}RD_{60}+\beta_{2}PS_{60}+\beta_{3}MA_{60}+\beta_{4}DV_{60} +\beta_{5}UE_{60} +\beta_{6}SOUTH+\epsilon_{60}\ \tag{4.1} \end{equation} \]

The ML estimation of equation (4.1) using the function spsurml() of spsur renders the following results:

slm.spsur <- spsur::spsurml(formula = formula_60, type = "slm", 
                            listw = listw, data = NCOVR.sf)

#> neighbourhood matrix eigenvalues
#> Computing eigenvalues ...
#> 
#> Initial point:   log_lik:  -9101.721  rhos:  0.364 
#> Iteration:  1   log_lik:  -9098.57  rhos:  0.37 
#> Iteration:  2   log_lik:  -9098.569  rhos:  0.37 
#> Time to fit the model:  1.18  seconds
#> Time to compute covariances:  5.68  seconds

summary(slm.spsur)

#> Call:
#> spsur::spsurml(formula = formula_60, data = NCOVR.sf, listw = listw, 
#>     type = "slm")
#> 
#>  
#> Spatial SUR model type:  slm 
#> 
#> Equation  1 
#>                Estimate Std. Error t value  Pr(>|t|)    
#> (Intercept)_1  5.475515   0.647627  8.4547 < 2.2e-16 ***
#> RD60_1         1.358071   0.125738 10.8008 < 2.2e-16 ***
#> PS60_1         0.284375   0.089320  3.1838  0.001468 ** 
#> MA60_1        -0.165479   0.019384 -8.5367 < 2.2e-16 ***
#> DV60_1         0.884739   0.093725  9.4398 < 2.2e-16 ***
#> UE60_1        -0.021741   0.034004 -0.6394  0.522641    
#> SOUTH_1        1.355704   0.244626  5.5419 3.244e-08 ***
#> rho_1          0.370414   0.030275 12.2350 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.3409 
#>   
#> Residual standard error:  4.588
#>  LMM: 22.393  p-value: (2.22e-06)

The output of the function spatialreg::lagsarlm() from spatialreg is

slm.spatialreg <- spatialreg::lagsarlm(formula = formula_60, 
                                       listw = listw, type = "lag", 
                                       data = NCOVR.sf)
summary(slm.spatialreg)

#> 
#> Call:
#> spatialreg::lagsarlm(formula = formula_60, data = NCOVR.sf, listw = listw, 
#>     type = "lag")
#> 
#> Residuals:
#>       Min        1Q    Median        3Q       Max 
#> -13.54416  -2.11106  -0.63828   1.30754  88.57312 
#> 
#> Type: lag 
#> Coefficients: (asymptotic standard errors) 
#>              Estimate Std. Error z value  Pr(>|z|)
#> (Intercept)  5.475272   0.647529  8.4556 < 2.2e-16
#> RD60         1.358031   0.125719 10.8021 < 2.2e-16
#> PS60         0.284368   0.089306  3.1842  0.001452
#> MA60        -0.165474   0.019381 -8.5377 < 2.2e-16
#> DV60         0.884714   0.093710  9.4410 < 2.2e-16
#> UE60        -0.021737   0.033999 -0.6393  0.522596
#> SOUTH        1.355586   0.244590  5.5423 2.986e-08
#> 
#> Rho: 0.37045, LR test value: 155.54, p-value: < 2.22e-16
#> Asymptotic standard error: 0.030274
#>     z-value: 12.237, p-value: < 2.22e-16
#> Wald statistic: 149.74, p-value: < 2.22e-16
#> 
#> Log likelihood: -9098.569 for lag model
#> ML residual variance (sigma squared): 21.043, (sigma: 4.5872)
#> Number of observations: 3085 
#> Number of parameters estimated: 9 
#> AIC: 18215, (AIC for lm: 18369)
#> LM test for residual autocorrelation
#> test value: 22.435, p-value: 2.1742e-06

Once again, the two estimations are almost the same. The estimated log-likelihood of the SLM can be recovered using the function logLik()

logLik(slm.spsur)

#> 'log Lik.' -9098.569 (df=9)

The log-likelihood corresponding to the SIM model is much lower, -9176.338, which points to a severe misspecification in the last model. More formally, the LR, obtained with the function anova of spsur strongly rejects the SIM model in favour of the SLM alternative; the AIC and BIC statistics indicates the same.

anova(ols.spsur, slm.spsur)

#>               logLik df   AIC   BIC LRtest      p.val
#> model 1: sim -9176.3  8 18369 18353                  
#> model 2: slm -9098.6  9 18215 18197 155.54 1.0684e-35

The SLM model can also be estimate by Three Stages-Least-Squares (3SLS) by both R-packages. The function for spsur is spsur3sls()

slm.3sls.spsur <- spsur3sls(formula = formula_60, type = "slm", 
                            listw = listw, data = NCOVR.sf)

#> Time to fit the model:  0.05  seconds

summary(slm.3sls.spsur)

#> Call:
#> spsur3sls(formula = formula_60, data = NCOVR.sf, listw = listw, 
#>     type = "slm")
#> 
#>  
#> Spatial SUR model type:  slm 
#> 
#> Equation  1 
#>                  Estimate  Std. Error t value  Pr(>|t|)    
#> (Intercept)_1  4.00373231  0.83096624  4.8182 1.519e-06 ***
#> RD60_1         1.11364326  0.15193264  7.3298 2.931e-13 ***
#> PS60_1         0.24309853  0.09249060  2.6284  0.008622 ** 
#> MA60_1        -0.12938710  0.02331379 -5.5498 3.103e-08 ***
#> DV60_1         0.73187418  0.10955683  6.6803 2.819e-11 ***
#> UE60_1         0.00058645  0.03576227  0.0164  0.986917    
#> SOUTH_1        0.64329382  0.35018066  1.8370  0.066301 .  
#> rho_1          0.57610680  0.07601797  7.5786 4.595e-14 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.3447 
#>   
#> Residual standard error:  4.695

Whereas spatialreg uses the function spatialreg::stsls()

slm.3sls.spatialreg <- spatialreg::stsls(formula = formula_60, listw = listw, 
                             data = NCOVR.sf)
summary(slm.3sls.spatialreg)

#> 
#> Call:
#> spatialreg::stsls(formula = formula_60, data = NCOVR.sf, listw = listw)
#> 
#> Residuals:
#>       Min        1Q    Median        3Q       Max 
#> -14.78215  -2.06681  -0.57751   1.22823  88.71807 
#> 
#> Coefficients: 
#>                Estimate  Std. Error t value  Pr(>|t|)
#> Rho          0.57610680  0.07413902  7.7706 7.772e-15
#> (Intercept)  4.00373231  0.81042715  4.9403 7.801e-07
#> RD60         1.11364326  0.14817730  7.5156 5.662e-14
#> PS60         0.24309853  0.09020450  2.6950  0.007039
#> MA60        -0.12938710  0.02273754 -5.6905 1.267e-08
#> DV60         0.73187418  0.10684890  6.8496 7.405e-12
#> UE60         0.00058645  0.03487833  0.0168  0.986585
#> SOUTH        0.64329382  0.34152520  1.8836  0.059620
#> 
#> Residual variance (sigma squared): 20.967, (sigma: 4.579)

There is hardly any difference between them because the estimation algorithm is quasilinear and both functions use the same set of instruments. The case of the SDM model produces identical results.

5 Spatial Error Model

The model to estimate in this case is:

\[ \begin{equation} HR_{60} = \beta_{0}+\beta_{1}RD_{60}+\beta_{2}PS_{60}+\beta_{3}MA_{60}+\beta_{4}DV_{60} +\beta_{5}UE_{60} +\beta_{6}SOUTH+ u_{60} \\ u_{60} = \lambda W u_{60} + \epsilon_{60}\ \tag{5.1} \end{equation} \]

which can be solved by ML using the function spatialreg::errorsarlm(), from spatialreg.

sem.spatialreg <- spatialreg::errorsarlm(formula = formula_60, 
                             listw = listw, data = NCOVR.sf)
summary(sem.spatialreg)

#> 
#> Call:spatialreg::errorsarlm(formula = formula_60, data = NCOVR.sf, 
#>     listw = listw)
#> 
#> Residuals:
#>       Min        1Q    Median        3Q       Max 
#> -13.54014  -2.10753  -0.66838   1.27875  88.31401 
#> 
#> Type: error 
#> Coefficients: (asymptotic standard errors) 
#>              Estimate Std. Error z value  Pr(>|z|)
#> (Intercept)  7.527982   0.766617  9.8197 < 2.2e-16
#> RD60         1.742357   0.148632 11.7226 < 2.2e-16
#> PS60         0.346866   0.109221  3.1758  0.001494
#> MA60        -0.208285   0.023388 -8.9057 < 2.2e-16
#> DV60         0.957248   0.108389  8.8316 < 2.2e-16
#> UE60         0.018210   0.039933  0.4560  0.648383
#> SOUTH        2.495326   0.316229  7.8909 3.109e-15
#> 
#> Lambda: 0.38244, LR test value: 140.48, p-value: < 2.22e-16
#> Asymptotic standard error: 0.032062
#>     z-value: 11.928, p-value: < 2.22e-16
#> Wald statistic: 142.28, p-value: < 2.22e-16
#> 
#> Log likelihood: -9106.1 for error model
#> ML residual variance (sigma squared): 21.124, (sigma: 4.5961)
#> Number of observations: 3085 
#> Number of parameters estimated: 9 
#> AIC: 18230, (AIC for lm: 18369)

spsur always uses the same function for the ML algorithm, spsurml(); you only have to change the type of model to estimate.

sem.spsur <- spsurml(formula = formula_60, type = "sem",
                     listw = listw, data = NCOVR.sf)

#> neighbourhood matrix eigenvalues
#> Computing eigenvalues ...
#> 
#> Initial point:   log_lik:  -9108.886  lambdas:  0.375 
#> Iteration:  1  log_lik:  -9106.101  lambdas:  0.382 
#> Iteration:  2  log_lik:  -9106.1  lambdas:  0.382 
#> Time to fit the model:  2.52  seconds
#> Time to compute covariances:  97.7  seconds

summary(sem.spsur)

#> Call:
#> spsurml(formula = formula_60, data = NCOVR.sf, listw = listw, 
#>     type = "sem")
#> 
#>  
#> Spatial SUR model type:  sem 
#> 
#> Equation  1 
#>                Estimate Std. Error t value  Pr(>|t|)    
#> (Intercept)_1  7.528052   0.766724  9.8185 < 2.2e-16 ***
#> RD60_1         1.742363   0.148653 11.7210 < 2.2e-16 ***
#> PS60_1         0.346864   0.109237  3.1753  0.001511 ** 
#> MA60_1        -0.208288   0.023391 -8.9046 < 2.2e-16 ***
#> DV60_1         0.957272   0.108405  8.8305 < 2.2e-16 ***
#> UE60_1         0.018201   0.039939  0.4557  0.648628    
#> SOUTH_1        2.495357   0.316267  7.8900 4.163e-15 ***
#> lambda_1       0.382398   0.032109 11.9095 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.3387 
#>   
#> Residual standard error:  4.597
#>  LMM:  1.193  p-value: (0.275)

The LR test for nested models, in this case SIM vs SEM, can be obtained as before using the function anova()

anova(ols.spsur, sem.spsur)

#>               logLik df   AIC   BIC LRtest      p.val
#> model 1: sim -9176.3  8 18369 18353                  
#> model 2: sem -9106.1  9 18230 18212 140.48 2.0951e-32

Clearly, the SEM model is preferable to the SIM specification because there is spatial dependence in the data, which has been effectively captured by the SEM mechanism. Moreover, according to the LMM test below (this is a Marginal Lagrange Multiplier of the type \(LM(\rho|\lambda)\); see the spsur user guide), once the spatial errors are introduced in the equation, the spatial lag of the endogenous variable, \(WHR_{60}\) is not statistically significant.

print(sem.spsur$LMM)

#> [1] 1.193044

6 Spatial Simultaneous Autoregressive Model (SARAR)

The especification of the SARAR model is as usual

\[ \begin{equation} HR_{60} = \rho W HR_{60} + \beta_{0}+\beta_{1}RD_{60}+\beta_{2}PS_{60}+\beta_{3}MA_{60}+\beta_{4}DV_{60} +\beta_{5}UE_{60} +\beta_{6}SOUTH+ u_{60} \\ u_{60} = \lambda W u_{60} + \epsilon_{60}\ \tag{6.1} \end{equation} \]

spsur estimates this model by using the funtion spsurml(); you only have to adjust the argument type to sarar, that is

sarar.spsur <- spsurml(formula = formula_60, listw = listw, 
                       type ="sarar",data = NCOVR.sf)

#> neighbourhood matrix eigenvalues
#> Computing eigenvalues ...
#> 
#> Initial point:   log_lik:  -9085.419  rhos:  0.663  lambdas:  -0.603 
#> Iteration:  1  log_lik:  -9062.889  rhos:  0.748  lambdas:  -0.837 
#> Iteration:  2  log_lik:  -9060.502  rhos:  0.769  lambdas:  -0.905 
#> Iteration:  3  log_lik:  -9060.28  rhos:  0.775  lambdas:  -0.925 
#> Iteration:  4  log_lik:  -9060.259  rhos:  0.777  lambdas:  -0.93 
#> Iteration:  5  log_lik:  -9060.257  rhos:  0.777  lambdas:  -0.932 
#> Iteration:  6  log_lik:  -9060.256  rhos:  0.778  lambdas:  -0.932 
#> Time to fit the model:  46.67  seconds
#> Time to compute covariances:  9.31  seconds

summary(sarar.spsur)

#> Call:
#> spsurml(formula = formula_60, data = NCOVR.sf, listw = listw, 
#>     type = "sarar")
#> 
#>  
#> Spatial SUR model type:  sarar 
#> 
#> Equation  1 
#>                Estimate Std. Error  t value  Pr(>|t|)    
#> (Intercept)_1  2.424897   0.391608   6.1922 6.723e-10 ***
#> RD60_1         0.657738   0.079705   8.2521 2.281e-16 ***
#> PS60_1         0.149831   0.054133   2.7678  0.005677 ** 
#> MA60_1        -0.079625   0.011841  -6.7245 2.093e-11 ***
#> DV60_1         0.500529   0.063120   7.9298 3.044e-15 ***
#> UE60_1        -0.031844   0.021379  -1.4895  0.136455    
#> SOUTH_1        0.261410   0.127913   2.0437  0.041072 *  
#> rho_1          0.777641   0.018131  42.8893 < 2.2e-16 ***
#> lambda_1      -0.932459   0.083230 -11.2034 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> R-squared: 0.444 
#>   
#> Residual standard error:  4.241

In the case of spatialreg, the function for this case is spatialreg::sacsar()

sarar.spatialreg <- spatialreg::sacsarlm(formula = formula_60, 
                             listw = listw, data = NCOVR.sf)
summary(sarar.spatialreg)

#> 
#> Call:
#> spatialreg::sacsarlm(formula = formula_60, data = NCOVR.sf, listw = listw)
#> 
#> Residuals:
#>       Min        1Q    Median        3Q       Max 
#> -12.72590  -1.99405  -0.57621   1.20291  80.72854 
#> 
#> Type: sac 
#> Coefficients: (asymptotic standard errors) 
#>              Estimate Std. Error z value  Pr(>|z|)
#> (Intercept)  2.422847   0.410239  5.9059 3.506e-09
#> RD60         0.657260   0.085075  7.7256 1.110e-14
#> PS60         0.149719   0.054436  2.7504  0.005953
#> MA60        -0.079565   0.012404 -6.4146 1.412e-10
#> DV60         0.500200   0.066423  7.5305 5.063e-14
#> UE60        -0.031823   0.021382 -1.4883  0.136671
#> SOUTH        0.260793   0.131656  1.9809  0.047606
#> 
#> Rho: 0.77788
#> Asymptotic standard error: 0.023069
#>     z-value: 33.72, p-value: < 2.22e-16
#> Lambda: -0.93327
#> Asymptotic standard error: 0.072704
#>     z-value: -12.837, p-value: < 2.22e-16
#> 
#> LR test value: 232.16, p-value: < 2.22e-16
#> 
#> Log likelihood: -9060.256 for sac model
#> ML residual variance (sigma squared): 17.974, (sigma: 4.2396)
#> Number of observations: 3085 
#> Number of parameters estimated: 10 
#> AIC: 18141, (AIC for lm: 18369)

The same as before, the differences between the two codes are minimal.

Finally, according to the anova() function of spsur, the SARAR model is preferable to the SLM and SEM alternatives, which are nested in the SARAR, in spite of the high value estimated for the parameter \(\lambda\), -0.93327, close to a unit root case. The LR and both AIC and BIC statistics reach the same conclusion.

anova(slm.spsur,sarar.spsur)

#>                 logLik df   AIC   BIC LRtest      p.val
#> model 1: slm   -9098.6  9 18215 18197                  
#> model 2: sarar -9060.3 10 18141 18121 76.625 2.0666e-18

anova(sem.spsur,sarar.spsur)

#>                 logLik df   AIC   BIC LRtest      p.val
#> model 1: sem   -9106.1  9 18230 18212                  
#> model 2: sarar -9060.3 10 18141 18121 91.687 1.0151e-21

References

Baller, Robert D, Luc Anselin, Steven F Messner, Glenn Deane, and Darnell F Hawkins. 2001. “Structural Covariates of US County Homicide Rates: Incorporating Spatial Effects.” Criminology 39 (3): 561–88.

Bivand, Roger, Edzer Pebesma, and Virgilio Gómez-Rubio. 2013. Applied Spatial Data Analysis with R, Second Edition. Springer-Verlag, NY. https://asdar-book.org/.

Bivand, Roger, and Gianfranco Piras. 2015. “Comparing Implementations of Estimation Methods for Spatial Econometrics.” Journal of Statistical Software 63 (18): 1–36. https://doi.org/10.18637/jss.v063.i18.

Bivand, Roger, and David W. S. Wong. 2018. “Comparing Implementations of Global and Local Indicators of Spatial Association.” TEST 27 (3): 716–48. https://doi.org/10.1007/s11749-018-0599-x.

López, Fernando A, Román Mı́nguez, and Jesús Mur. 2020. “ML Versus IV Estimates of Spatial SUR Models: Evidence from the Case of Airbnb in Madrid Urban Area.” The Annals of Regional Science 64 (2): 313–47. https://doi.org/10.1007/s00168-019-00914-1.

Mínguez, Román, Fernando A. López, and Jesús Mur. 2022. “spsur: An R Package for Dealing with Spatial Seemingly Unrelated Regression Models.” Journal of Statistical Software 104 (11): 1–43. https://doi.org/10.18637/jss.v104.i11.

R Core Team. 2019. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

spsur vs spatialreg

Uniequational model

Román Mínguez, University of Castilla-La Mancha (Spain)

Fernando A. López, Technical University of Cartagena (Spain)

Jesús Mur, University of Zaragoza, (Spain)

2022-10-29