memochange-Tutorial: Break in Persistence

The memochange package can be used for two things: checking for a break in persistence and checking for a change in mean. This vignette presents the functions related to a break in persistence. These are BP_estim, cusum_test, LBI_test, LKSN_test, MR_test, ratio_test, and pb_sim. Before considering the usage of these functions, a brief literature review elaborates on their connection.

Usage

The memochange package contains all procedures mentioned above to identify whether a time series exhibits a break in persistence. Additionally, several estimators are implemented which consistently estimate the point at which the series exhibits a break in persistence and the order of integration in the two regimes. We now show how to use the implemented procedures by investigating the price of crude oil.

First, we download the monthly price series from the FRED database.

oil=data.table::fread("https://fred.stlouisfed.org/graph/fredgraph.csv?bgcolor=%23e1e9f0&chart_type=line&drp=0&fo=open%20sans&graph_bgcolor=%23ffffff&height=450&mode=fred&recession_bars=on&txtcolor=%23444444&ts=12&tts=12&width=1168&nt=0&thu=0&trc=0&show_legend=yes&show_axis_titles=yes&show_tooltip=yes&id=MCOILWTICO&scale=left&cosd=1986-01-01&coed=2019-08-01&line_color=%234572a7&link_values=false&line_style=solid&mark_type=none&mw=3&lw=2&ost=-99999&oet=99999&mma=0&fml=a&fq=Monthly&fam=avg&fgst=lin&fgsnd=2009-06-01&line_index=1&transformation=lin&vintage_date=2019-09-23&revision_date=2019-09-23&nd=1986-01-01")

To get a first visual impression, we plot the series.

oil=as.data.frame(oil)
oil$observation_date=zoo::as.Date(oil$observation_date)
oil_xts=xts::xts(oil[,-1],order.by = oil$observation_date)
zoo::plot.zoo(oil_xts, xlab="", ylab="Price", main="Crude Oil Price: West Texas Intermediate")

From the plot we observe that the series seems to be more variable in its second part, from the year 2000 onwards. This is first evidence that a change in persistence has occurred. We can test this hypothesis using the functions cusum_test (Leybourne, Taylor, and Kim (2007), Sibbertsen and Kruse (2009)), LBI_test (Busetti and Taylor (2004)), LKSN_test (Leybourne et al. (2003)), MR_test (Martins and Rodrigues (2014)), and ratio_test (Busetti and Taylor (2004), Leybourne and Taylor (2004), Harvey, Leybourne, and Taylor (2006)). In this vignette we use the ratio and MR tests since these are the ones most often applied empirically. The functionality of the other tests is similar. They all require a univariate numeric vector x as an input variable and yield a matrix of test statistics and critical values as an output variable.

library(memochange)
x <- as.numeric(oil[,2])
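For readers who cannot reach FRED, a simulated series can stand in for x when trying the calls in this vignette. The following is a minimal sketch in base R. The name x_sim, the regime lengths, and the seed are illustrative assumptions, and the series is not meant to resemble the oil price.

```r
# Simulated stand-in: stationary noise (I(0)) followed by a random walk (I(1)).
set.seed(123)                       # arbitrary seed, for reproducibility only
n_first  <- 200                     # length of the I(0) regime (assumption)
n_second <- 200                     # length of the I(1) regime (assumption)

first_regime  <- rnorm(n_first)             # white noise, order of integration 0
second_regime <- cumsum(rnorm(n_second))    # cumulated noise, order of integration 1

# Start the second regime at the last value of the first to avoid a level jump.
x_sim <- c(first_regime, first_regime[n_first] + second_regime)

# Compare the sample variance of the two regimes.
var_first  <- var(x_sim[1:n_first])
var_second <- var(x_sim[(n_first + 1):length(x_sim)])
```

Such a vector can be passed to the tests in place of x, since they only require a univariate numeric vector.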

As a starting point the default version of the ratio test is applied.

ratio_test(x)
#>                                        90%    95%    99% Teststatistic
#> Against change from I(0) to I(1)    3.5148 4.6096 7.5536    225.943543
#> Against change from I(1) to I(0)    3.5588 4.6144 7.5304      1.170217
#> Against change in unknown direction 4.6144 5.7948 9.0840    225.943543

This yields a matrix that gives the test statistic and critical values for the null of constant \(I(0)\) against a change from \(I(0)\) to \(I(1)\) or vice versa. The last row reports the statistic for a change in an unknown direction. Its critical values account for the multiple testing problem that arises because we perform two tests. The results suggest that a change from \(I(0)\) to \(I(1)\) has occurred somewhere in the series since the test statistic exceeds the critical value at the one percent level. In addition, this value is also significant when accounting for the multiple testing problem. Consequently, the default version of the ratio test suggests a break in persistence.
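The decision rule described above can be written out explicitly. The sketch below rebuilds the printed matrix by hand from the output of ratio_test(x) shown above and compares each test statistic with its critical values. The object names res, crit, and reject are illustrative.

```r
# Critical values and test statistics copied from the ratio_test(x) output above.
res <- matrix(
  c(3.5148, 4.6096, 7.5536, 225.943543,
    3.5588, 4.6144, 7.5304,   1.170217,
    4.6144, 5.7948, 9.0840, 225.943543),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("Against change from I(0) to I(1)",
      "Against change from I(1) to I(0)",
      "Against change in unknown direction"),
    c("90%", "95%", "99%", "Teststatistic")
  )
)

# Reject the null at a given level when the statistic exceeds that critical value.
crit   <- res[, c("90%", "95%", "99%")]
reject <- res[, "Teststatistic"] > crit   # the statistic is recycled down each column
reject
```

The first and third rows reject at all three levels, while the second row does not reject at any level.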

We can modify this default version by choosing the arguments trend, tau, statistic, type, m, z, simu, and M (see the help page of the ratio test for details). The plot does not indicate a linear trend, so it seems unreasonable to change the trend argument. Also, the plot suggests that the break is in the middle of the series rather than at the beginning or the end, so changing tau seems unnecessary as well. The type of test statistic calculated can easily be changed using the statistic argument. However, simulation results indicate that the mean, max, and exp statistics deliver qualitatively similar results.
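To illustrate what the mean, max, and exp statistics aggregate, the following base R sketch computes a ratio-type statistic over a grid of candidate break points and then summarises it in three ways. It is a simplified illustration and not the package's implementation. The form of the ratio (scaled squared partial sums of demeaned observations after the break divided by those before it), the grid from 20% to 80% of the sample, and the form of the exp functional are assumptions. The names ratio_stat and ratio_functionals are hypothetical.

```r
# Ratio of scaled squared partial sums of demeaned observations after and
# before a candidate break point k (no-trend case only).
ratio_stat <- function(x, k) {
  n  <- length(x)
  e0 <- x[1:k] - mean(x[1:k])                 # residuals, first subsample
  e1 <- x[(k + 1):n] - mean(x[(k + 1):n])     # residuals, second subsample
  num <- sum(cumsum(e1)^2) / (n - k)^2
  den <- sum(cumsum(e0)^2) / k^2
  num / den
}

# Evaluate the statistic over a grid of break points and aggregate it.
ratio_functionals <- function(x, lower = 0.2, upper = 0.8) {
  n      <- length(x)
  breaks <- seq(ceiling(lower * n), floor(upper * n))
  k_tau  <- vapply(breaks, function(k) ratio_stat(x, k), numeric(1))
  half   <- k_tau / 2
  c(mean = mean(k_tau),
    max  = max(k_tau),
    # log of the average of exp(K/2), computed in a numerically stable way
    exp  = max(half) + log(mean(exp(half - max(half)))))
}

set.seed(1)
noise  <- rnorm(200)                            # constant I(0)
broken <- c(rnorm(100), cumsum(rnorm(100)))     # I(0) followed by I(1)
f_noise  <- ratio_functionals(noise)
f_broken <- ratio_functionals(broken)
```

All three summaries are built from the same sequence of statistics, which is one way to see why they tend to lead to similar conclusions.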

More important is the type of test performed. The default version uses the approach of Busetti and Taylor (2004). If the process is constant \(I(1)\), this test often spuriously identifies a break in persistence. Harvey, Leybourne and Taylor (2006) account for this issue by adjusting the test statistic such that its critical values are the same under constant \(I(0)\) and constant \(I(1)\). We can calculate their test statistic by setting type="HLT". For this purpose, we need to state the number of polynomials z used in their test statistic. The default value is 9, as suggested by Harvey, Leybourne and Taylor (2006). Choosing another value is only sensible for very large data sets (more than 10,000 observations), where the test statistic cannot be calculated due to computational singularity. In this case, decreasing z can allow the test statistic to be calculated. However, this invalidates the critical values, so we would have to simulate them by setting simu=1. As our data set is rather small, we can stick with the default of z=9.

ratio_test(x, type="HLT")
#>                                        90%    95%    99% Teststatistic 90%
#> Against change from I(0) to I(1)    3.5148 4.6096 7.5536        58.9102128
#> Against change from I(1) to I(0)    3.5588 4.6144 7.5304         0.3085619
#> Against change in unknown direction 4.6144 5.7948 9.0840        44.2193169
#>                                     Teststatistic 95% Teststatistic 99%
#> Against change from I(0) to I(1)           43.4794337        25.3386256
#> Against change from I(1) to I(0)            0.2290226         0.1290391
#> Against change in unknown direction        34.1387057        20.0073212

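Again, the test results suggest that there is a break from \(I(0)\) to \(I(1)\): at each significance level, the adjusted test statistic in the first row exceeds the corresponding critical value. Consequently, it is not a constant \(I(1)\) process that led to a spurious rejection of the test by Busetti and Taylor (2004).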

Another test for a change in persistence is that by Martins and Rodrigues (2014). This is more general as it is not restricted to the \(I(0)/I(1)\) framework, but can identify changes from \(I(d_1)\) to \(I(d_2)\) with \(d_1 \neq d_2\) and \(-1/2<d_1,d_2<2\). The default version is applied by

MR_test(x)
#>                                          90%      95%      99% Teststatistic
#> Against increase in memory          4.270666 5.395201 8.233674      16.21494
#> Against decrease in memory          4.060476 5.087265 7.719128       2.14912
#> Against change in unknown direction 5.065695 6.217554 9.136441      16.21494

Again, the function returns a matrix consisting of test statistics and critical values. Here, the alternative of the test is an increase or a decrease in memory, respectively. In line with the results of the ratio test, the approach by Martins and Rodrigues (2014) suggests that the series exhibits an increase in memory, i.e. that the memory of the series increases from \(d_1\) to \(d_2\) with \(d_1<d_2\) at some point in time. Again, this also holds if we consider the critical values that account for the multiple testing problem.

Like the ratio test and all other tests against a change in persistence in the memochange package, the MR test has the arguments trend, tau, simu, and M. Furthermore, we can again choose the type of test statistic. This time we can decide whether to use the squared t-statistic or the standard t-statistic.

MR_test(x, statistic="standard")
#>                                           90%       95%       99% Teststatistic
#> Against increase in memory          -1.637306 -1.920434 -2.504862     -2.880545
#> Against decrease in memory          -1.651586 -1.951420 -2.514165     -1.277410
#> Against change in unknown direction -1.933137 -2.203370 -2.722017     -2.880545

As for the ratio test, changing the type of statistic has a rather small effect on the empirical performance of the test.

If we believe that the underlying process exhibits additional short-run components, we can account for these by setting serial=TRUE:

MR_test(x, serial=TRUE)
#> Registered S3 method overwritten by 'quantmod':
#>   method            from
#>   as.zoo.data.frame zoo
#>                                          90%      95%      99% Teststatistic
#> Against increase in memory          4.270666 5.395201 8.233674     10.727202
#> Against decrease in memory          4.060476 5.087265 7.719128      6.758906
#> Against change in unknown direction 5.065695 6.217554 9.136441     10.727202

While the test statistic changes, the conclusion remains the same.

All tests indicate that the oil price series exhibits an increase in memory over time. To correctly model and forecast the series, the exact location of the break is important. This can be estimated by the BP_estim function. It is important for the function that the direction of the change is correctly specified. In our case, an increase in memory has occurred, so we set direction="01".

BP_estim(x, direction="01")
#> $Breakpoint
#> [1] 151
#> 
#> $d_1
#> [1] 0.8127501
#> 
#> $sd_1
#> [1] 0.08574929
#> 
#> $d_2
#> [1] 1.088039
#> 
#> $sd_2
#> [1] 0.07142857

This yields a list stating the location of the break (observation 151), semiparametric estimates of the order of integration in the two regimes (0.81 and 1.09) as well as the standard deviations of these estimates (0.09 and 0.07).

oil$observation_date[151]
#> [1] "1998-07-01"

Consequently, the function indicates that there is a break in persistence in July 1998. This means that from the beginning of the sample until June 1998 the series is integrated with an order of approximately 0.81, and from July 1998 on the order of integration increased to approximately 1.09.
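The mapping from the estimated break point to a calendar date can also be checked without the downloaded data. The sketch below assumes that the series contains every month from January 1986 to August 2019, as implied by the cosd and coed parameters of the download URL, with no missing observations. The names dates and break_index are illustrative.

```r
# Monthly dates matching the sample period in the download URL (cosd, coed).
dates <- seq(as.Date("1986-01-01"), as.Date("2019-08-01"), by = "month")

break_index <- 151                          # $Breakpoint reported by BP_estim above
first_month_regime_2 <- dates[break_index]      # first month of the second regime
last_month_regime_1  <- dates[break_index - 1]  # last month of the first regime

format(first_month_regime_2)
format(last_month_regime_1)
```

Under this assumption the first regime ends in June 1998 and the second regime starts in July 1998.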

As before, the function allows for various types of break point estimators. Instead of the default estimator of Busetti and Taylor (2004), one can also rely on the estimator of Leybourne, Kim, and Taylor (2007) by setting type="LKT". This estimator relies on estimates of the long-run variance. Therefore, m must also be chosen, which determines how many covariances are used when estimating the long-run variance. Leybourne, Kim, and Taylor (2007) suggest m=0.

BP_estim(x, direction="01", type="LKT", m=0)
#> $Breakpoint
#> [1] 148
#> 
#> $d_1
#> [1] 0.7660609
#> 
#> $sd_1
#> [1] 0.08703883
#> 
#> $d_2
#> [1] 1.067404
#> 
#> $sd_2
#> [1] 0.07142857

This yields a similar result with the break point lying in the year 1998 and d increasing from approximately 0.8 to approximately 1.
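To make the role of m more concrete, the following base R sketch shows a generic long-run variance estimator that uses m autocovariances. It is not the package's implementation. The Bartlett weights and the function name lrv_bartlett are assumptions chosen for illustration. With m=0 no autocovariances enter, so the estimate reduces to the sample variance.

```r
# Long-run variance estimate with Bartlett weights and m autocovariances.
lrv_bartlett <- function(x, m = 0) {
  n <- length(x)
  e <- x - mean(x)                     # demeaned series
  gamma0 <- sum(e^2) / n               # lag-0 autocovariance (sample variance)
  if (m == 0) return(gamma0)
  lags <- seq_len(m)
  # Sample autocovariances at lags 1, ..., m
  gammas <- vapply(lags,
                   function(j) sum(e[(j + 1):n] * e[1:(n - j)]) / n,
                   numeric(1))
  weights <- 1 - lags / (m + 1)        # Bartlett weights decline linearly in the lag
  gamma0 + 2 * sum(weights * gammas)
}

set.seed(7)
u <- rnorm(500)
lrv_m0 <- lrv_bartlett(u, m = 0)       # equals the sample variance
lrv_m4 <- lrv_bartlett(u, m = 4)       # also uses the first four autocovariances
```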

All other arguments of the function (trend, tau, serial) were already discussed above except for d_estim and d_bw. These two arguments determine which estimator and bandwidth are used to estimate the order of integration in the two regimes. Concerning the estimator, the GPH (Geweke and Porter-Hudak (1983)) and the exact local Whittle estimator (Shimotsu and Phillips (2005)) can be selected. Although the exact local Whittle estimator has a lower variance, the GPH estimator is still often considered in empirical applications due to its simplicity. In our example the two estimators yield similar results.

BP_estim(x, direction="01", d_estim="GPH")
#> $Breakpoint
#> [1] 151
#> 
#> $d_1
#> [1] 0.855238
#> 
#> $sd_1
#> [1] 0.129834
#> 
#> $d_2
#> [1] 1.034389
#> 
#> $sd_2
#> [1] 0.1468516
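The simplicity of the GPH estimator can be seen from a short base R sketch of the log-periodogram regression. This is an illustration under stated assumptions and not the code used by BP_estim. The function name gph_sketch and the rule for the number of frequencies, floor(1 + n^bw), are assumptions, and packaged implementations may differ in details.

```r
# Log-periodogram regression: regress the log periodogram at the first m
# Fourier frequencies on -2 * log(2 * sin(lambda / 2)); the slope estimates d.
gph_sketch <- function(x, bw = 0.7) {
  n <- length(x)
  m <- floor(1 + n^bw)                       # number of frequencies (assumed rule)
  lambda <- 2 * pi * seq_len(m) / n          # Fourier frequencies
  # Periodogram ordinates; element 1 of fft() is frequency zero and is skipped
  pgram <- Mod(fft(x - mean(x)))[2:(m + 1)]^2 / (2 * pi * n)
  regressor <- -2 * log(2 * sin(lambda / 2))
  fit <- lm(log(pgram) ~ regressor)
  unname(coef(fit)[2])                       # slope of the regression
}

set.seed(42)
d_noise <- gph_sketch(rnorm(2000))           # white noise, true d = 0
d_walk  <- gph_sketch(cumsum(rnorm(2000)))   # random walk, true d = 1
```

In this simulated setting the estimate for white noise should lie near 0 and the estimate for the random walk near 1.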

The d_bw argument determines how many frequencies are used for estimation. Larger values imply a lower variance of the estimates, but also bias the estimator if the underlying process possesses short run dynamics. Usually a value between 0.5 and 0.8 is considered.

BP_estim(x, direction="01", d_bw=0.75)
#> $Breakpoint
#> [1] 151
#> 
#> $d_1
#> [1] 0.9146951
#> 
#> $sd_1
#> [1] 0.07624929
#> 
#> $d_2
#> [1] 1.173524
#> 
#> $sd_2
#> [1] 0.0625
BP_estim(x, direction="01", d_bw=0.65)
#> $Breakpoint
#> [1] 151
#> 
#> $d_1
#> [1] 0.5803242
#> 
#> $sd_1
#> [1] 0.09805807
#> 
#> $d_2
#> [1] 0.9353325
#> 
#> $sd_2
#> [1] 0.08219949

In our setup, it can be seen that increasing d_bw to 0.75 does not severely change the estimated order of integration in the two regimes. Decreasing d_bw, however, leads to smaller estimates of \(d\).
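The effect of d_bw on the variance can be traced in the reported standard deviations. The sketch below assumes that the number of frequencies is m = floor(1 + n^d_bw) and that the standard deviation of the default estimator is the asymptotic local Whittle value 1/(2*sqrt(m)). Both rules, as well as the regime lengths of 150 and 254 observations, are assumptions inferred from the outputs above and not taken from the package documentation.

```r
# Regime lengths implied by a break at observation 151 of 404 (assumption).
n_regime <- c(first = 150, second = 254)
d_bw     <- c(0.65, 0.70, 0.75)

# Number of frequencies for each regime (rows) and bandwidth (columns).
m <- sapply(d_bw, function(bw) floor(1 + n_regime^bw))
colnames(m) <- paste0("d_bw=", d_bw)

# Asymptotic standard error of a local Whittle type estimator.
se <- 1 / (2 * sqrt(m))
m
round(se, 4)
```

Under these assumptions the first regime uses 26, 34, and 43 frequencies and the second regime 37, 49, and 64, which gives standard errors that agree with the sd_1 and sd_2 values printed above (for example 1/(2*sqrt(64)) = 0.0625 for d_bw=0.75).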

Janis Becker

2025-01-08