In many practical situations, information about an auxiliary variate \(x_i\) (correlated with \(y_i\)) is available for all the population units, or at least for each unit in the sample together with the population mean \(\bar X\). In practice, \(x_i\) is often the value of \(y_i\) at some previous time when a complete census was taken. This approach is used when the expected value and the variance of \(y_i\) are proportional to \(x_i\), so in the BLE setup we replace some hypotheses about the \(y\)’s with hypotheses about the first two moments of the rate \(y_i / x_i\). To the best of our knowledge, the new ratio estimator proposed below is a novel contribution to survey sampling theory.
The new ratio estimator is obtained as a particular case of model (2.4) under the hypothesis of exchangeability used in the Bayes linear approach, applied to the rate \(y_i / x_i\) for all \(i = 1, \ldots, N\), as described below:
\[\begin{equation} \tag{3.1} E \left( \frac{y_i}{x_i} \right) = m, \hspace{0.7cm} V \left( \frac{y_i}{x_i} \right) = v \hspace{0.7cm} \text{and} \hspace{0.7cm} Cov \left( \frac{y_i}{x_i},\frac{y_j}{x_j} \right) = c, \hspace{0.5cm} i,j = 1,\ldots,N, \hspace{0.5cm} i \neq j, \end{equation}\]
such that \(\sigma^2 = v - c\).
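To make the link between \(v\), \(c\) and \(\sigma^2\) concrete, here is a small sketch using the prior values that appear in the example call later in this section (`v = 0.24`, `sigma = sqrt(0.23998)`). It assumes, as that call suggests, that `sigma` is given on the standard-deviation scale while `v` is a variance. The names `c_cov` and `rho` are introduced here for illustration only.

```r
# Prior variance of the rate y_i / x_i and the value passed as `sigma`
# (assumed to be a standard deviation, so sigma^2 is the quantity in v - c).
v     <- 0.24
sigma <- sqrt(0.23998)

# Covariance between the rates of two different units, from sigma^2 = v - c.
c_cov <- v - sigma^2

# Implied prior correlation between the rates of two different units.
rho <- c_cov / v

c(c = c_cov, rho = rho)
```

A small \(c\) relative to \(v\) means the rates of different units are treated as only weakly correlated a priori.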
We can apply this with the BLE_Ratio() function, which receives the following parameters:
Letting \(v \to \infty\) and \(c \to \infty\), but keeping \(\sigma^2 = v - c\) fixed, that is, assuming prior ignorance, we recover the ratio-type estimator found in the design-based approach: \(\hat{T}_{ra} = N \bar{X} (\bar{y}_s / \bar{x}_s)\).
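As a minimal base-R sketch of the design-based formula \(\hat{T}_{ra} = N \bar{X} (\bar{y}_s / \bar{x}_s)\), the code below evaluates it on the same toy values used in the small example later in this section. The names `x_all`, `N`, `X_bar`, `ratio_hat` and `T_ra` are introduced here for illustration and are not part of the package.

```r
# Toy data: three sampled units and the auxiliary values of five non-sampled units.
ys     <- c(10, 8, 6)            # y observed in the sample
xs     <- c(5, 4, 3.1)           # x for the sampled units
x_nots <- c(1, 20, 13, 15, -5)   # x for the non-sampled units

# Population size and population mean of the auxiliary variate.
x_all <- c(xs, x_nots)
N     <- length(x_all)
X_bar <- mean(x_all)

# Ratio of sample means, then the ratio-type estimate of the total.
ratio_hat <- mean(ys) / mean(xs)
T_ra      <- N * X_bar * ratio_hat

print(T_ra)
#> [1] 111.2727
```

This value uses no prior information, so it serves as a baseline against which estimates obtained with an informative prior can be compared.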
This can be achieved using the BLE_Ratio() function by omitting either the prior mean (\(m\)) or the prior variance (\(v\)) from the call.
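library(TeachingSampling)  # provides the BigCity data set
library(BayesSampling)     # provides BLE_Ratio()
data(BigCity)
# Draw a reduced population of 10,000 units from BigCity.
n_rows <- dim(BigCity)[1]
s <- seq(from = 1, to = n_rows, by = 1)
set.seed(5)
samp <- sample(s, size = 10000, replace = FALSE)
ordered_samp <- sort(samp)
BigCity_red <- BigCity[ordered_samp,]
Expend <- BigCity_red$Expenditure
Income <- BigCity_red$Income
# Sample 10 units from the reduced population.
sampl <- sample(seq(1,10000),size=10)
ys <- Expend[sampl]
xs <- Income[sampl]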
The quantity we want to estimate is the true ratio between expenditure and income. In this example its value is known, because Expend and Income are available for every unit of the reduced population.
The design-based estimator of this ratio is the ratio between the sample means, \(\bar{y}_s / \bar{x}_s\).
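The two quantities just described can be sketched as follows. To keep the snippet runnable without the BigCity data, it uses simulated stand-in vectors: `pop_x`, `pop_y` and `idx` are hypothetical names, and the simulated values bear no relation to the real data. Taking the population ratio to be the ratio of population means is an assumption made for this sketch.

```r
set.seed(1)

# Simulated stand-ins for Income (pop_x) and Expenditure (pop_y); not real data.
pop_x <- rlnorm(1000, meanlog = 6, sdlog = 0.5)
pop_y <- 0.7 * pop_x * exp(rnorm(1000, sd = 0.2))

# Population ratio, assumed here to be the ratio of population means.
true_ratio <- mean(pop_y) / mean(pop_x)

# Design-based estimate from a small simple random sample: ratio of sample means.
idx       <- sample(seq_along(pop_x), size = 10)
ratio_hat <- mean(pop_y[idx]) / mean(pop_x[idx])

c(true = true_ratio, sample = ratio_hat)
```

With only 10 sampled units the estimate can sit noticeably away from the population value, which is the situation where prior information is most useful.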
Applying the prior information about the ratio we can get a better estimate, especially in cases when only a small sample is available:
x_nots <- BigCity_red$Income[-sampl]
Estimator <- BLE_Ratio(ys, xs, x_nots, m = 0.85, v = 0.24, sigma = sqrt(0.23998))
Estimator$est.beta
#> Beta
#> 1 0.7723287
Estimator$Vest.beta
#> V1
#> 1 1.383985e-05
Estimator$est.mean[1:4,]
#> [1] 104.2644 230.4165 826.3917 1241.5184
Estimator$Vest.mean[1:5,1:5]
#> V1 V2 V3 V4 V5
#> 1 32.6495313 0.5574125 1.999167 3.003421 0.5217451
#> 2 0.5574125 72.8274736 4.418010 6.637338 1.1530181
#> 3 1.9991667 4.4180104 272.623847 23.804893 4.1353134
#> 4 3.0034210 6.6373380 23.804893 421.530808 6.2126320
#> 5 0.5217451 1.1530181 4.135313 6.212632 68.0936545
Estimator$est.tot
#> [1] 4466282
ys <- c(10,8,6)
xs <- c(5,4,3.1)
x_nots <- c(1,20,13,15,-5)
m <- 2.5
v <- 10
sigma <- 2
Estimator <- BLE_Ratio(ys, xs, x_nots, m, v, sigma)
Estimator
#> $est.beta
#> Beta
#> 1 2.010444
#>
#> $Vest.beta
#> V1
#> 1 0.3133159
#>
#> $est.mean
#> y_nots
#> 1 2.010444
#> 2 40.208877
#> 3 26.135770
#> 4 30.156658
#> 5 -10.052219
#>
#> $Vest.mean
#> V1 V2 V3 V4 V5
#> 1 4.313316 6.266319 4.073107 4.699739 -1.56658
#> 2 6.266319 205.326371 81.462141 93.994778 -31.33159
#> 3 4.073107 81.462141 104.950392 61.096606 -20.36554
#> 4 4.699739 93.994778 61.096606 130.496084 -23.49869
#> 5 -1.566580 -31.331593 -20.365535 -23.498695 -12.16710
#>
#> $est.tot
#> [1] 112.4595
#>
#> $Vest.tot
#> [1] 782.5796
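The numbers printed above can be reproduced by hand in base R. The sketch below is not the package's source code. It assumes a working model in which \(\beta\) has prior mean \(m\) and prior variance \(c = v - \sigma^2\), and each \(y_i\) has mean \(\beta x_i\) and residual variance \(\sigma^2 x_i\). These formulas are an assumption chosen because they match the printed output, and every name other than the function inputs is introduced here for illustration.

```r
# Same inputs as the call above.
ys <- c(10, 8, 6); xs <- c(5, 4, 3.1); x_nots <- c(1, 20, 13, 15, -5)
m <- 2.5; v <- 10; sigma <- 2

# Prior variance of beta under the assumed model: c = v - sigma^2.
c_prior <- v - sigma^2

# Adjusted variance and mean of beta (precision-weighted combination).
V_beta <- 1 / (1 / c_prior + sum(xs) / sigma^2)
beta   <- V_beta * (m / c_prior + sum(ys) / sigma^2)

# Predictions for the non-sampled units and their covariance matrix.
y_nots_hat <- beta * x_nots
V_nots     <- V_beta * outer(x_nots, x_nots) + sigma^2 * diag(x_nots)

# Total: observed part plus predicted part, and the variance of the predicted part.
T_hat <- sum(ys) + sum(y_nots_hat)
V_T   <- sum(V_nots)

round(c(beta = beta, V_beta = V_beta, T_hat = T_hat, V_T = V_T), 4)
# beta 2.0104, V_beta 0.3133, T_hat 112.4595, V_T 782.5796
```

Under this reading, the negative diagonal entry in `$Vest.mean` comes from the term \(\sigma^2 x_i\) evaluated at the negative auxiliary value \(x_i = -5\).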
ys <- mean(c(10,8,6))
xs <- mean(c(5,4,3.1))
n <- 3
x_nots <- c(1,20,13,15,-5)
m <- 2.5
v <- 10
sigma <- 2
Estimator <- BLE_Ratio(ys, xs, x_nots, m, v, sigma, n)
#> sample means informed instead of sample observations, parameters 'n' and 'sigma' will be necessary
Estimator
#> $est.beta
#> Beta
#> 1 2.010444
#>
#> $Vest.beta
#> V1
#> 1 0.3133159
#>
#> $est.mean
#> y_nots
#> 1 2.010444
#> 2 40.208877
#> 3 26.135770
#> 4 30.156658
#> 5 -10.052219
#>
#> $Vest.mean
#> V1 V2 V3 V4 V5
#> 1 4.313316 6.266319 4.073107 4.699739 -1.56658
#> 2 6.266319 205.326371 81.462141 93.994778 -31.33159
#> 3 4.073107 81.462141 104.950392 61.096606 -20.36554
#> 4 4.699739 93.994778 61.096606 130.496084 -23.49869
#> 5 -1.566580 -31.331593 -20.365535 -23.498695 -12.16710
#>
#> $est.tot
#> [1] 112.4595
#>
#> $Vest.tot
#> [1] 782.5796
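The two calls return the same output. One way to see why, sketched below, is that the sample may enter the estimate only through its sums, and \(n\) times a sample mean recovers the corresponding sum. The helper `post_beta` is hypothetical and written only for this illustration. It assumes a working model (prior mean \(m\) and prior variance \(v - \sigma^2\) for \(\beta\), residual variance \(\sigma^2 x_i\)) that is consistent with the printed output but is not taken from the package documentation.

```r
# Hypothetical helper: adjusted mean and variance of beta from the sample sums,
# under the assumed working model described in the text.
post_beta <- function(sum_y, sum_x, m, v, sigma) {
  c_prior <- v - sigma^2
  V_beta  <- 1 / (1 / c_prior + sum_x / sigma^2)
  c(beta = V_beta * (m / c_prior + sum_y / sigma^2), V_beta = V_beta)
}

ys <- c(10, 8, 6); xs <- c(5, 4, 3.1); n <- length(ys)

# Using the individual observations ...
from_obs <- post_beta(sum(ys), sum(xs), m = 2.5, v = 10, sigma = 2)

# ... and using only the sample means together with the sample size n.
from_means <- post_beta(n * mean(ys), n * mean(xs), m = 2.5, v = 10, sigma = 2)

all.equal(from_obs, from_means)
```

This is why the sample size `n` has to be supplied when means are passed instead of observations: without it the sums cannot be recovered.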