Background
We begin with a standard multiple linear regression model with $T$ observations and $m$ potential thresholds (producing $m+1$ regimes). (While we will use $t$ to index the $T$ observations, there is nothing in the structure of the model that requires time series data.)
For the observations in regime $j = 0, 1, \ldots, m$ we have the linear regression specification
$$y_t = X_t'\beta + Z_t'\delta_j + \epsilon_t \tag{35.1}$$
Note that the regressors are divided into two groups. The $X$ variables are those whose parameters do not vary across regimes, while the $Z$ variables have coefficients that are regime-specific.
Suppose that there is an observable threshold variable $q_t$ and strictly increasing threshold values $(\gamma_1 < \gamma_2 < \cdots < \gamma_m)$ such that we are in regime $j$ if and only if:
$$\gamma_j \le q_t < \gamma_{j+1}$$
where we set $\gamma_0 = -\infty$ and $\gamma_{m+1} = \infty$. Thus, we are in regime $j$ if the value of the threshold variable is at least as large as the $j$-th threshold value, but not as large as the $(j+1)$-th threshold. (Note that we follow the EViews convention of defining the threshold values as the first values of each regime.)
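For example, in the single threshold, two regime model, we have:
$$\begin{aligned} y_t &= X_t'\beta + Z_t'\delta_0 + \epsilon_t && \text{if } -\infty < q_t < \gamma_1 \\ y_t &= X_t'\beta + Z_t'\delta_1 + \epsilon_t && \text{if } \gamma_1 \le q_t < \infty \end{aligned} \tag{35.2}$$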
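Using an indicator function $1(\cdot)$ which takes the value 1 if the expression is true and 0 otherwise, and defining $1_j(q_t, \gamma) = 1(\gamma_j \le q_t < \gamma_{j+1})$, we may combine the $m+1$ individual regime specifications into a single equation:
$$y_t = X_t'\beta + \sum_{j=0}^{m} 1_j(q_t, \gamma)\, Z_t'\delta_j + \epsilon_t \tag{35.3}$$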
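The regime indicators in the combined equation can be illustrated with a short NumPy sketch. This is not EViews code; the function names `regime_index` and `regime_design` and the sample values are hypothetical, chosen only to show how the "first value of each regime" convention maps observations to regimes.

```python
import numpy as np

def regime_index(q, gamma):
    """Return j for each observation such that gamma_j <= q_t < gamma_{j+1},
    with gamma_0 = -inf and gamma_{m+1} = +inf (gamma must be increasing)."""
    gamma = np.asarray(gamma, dtype=float)
    # side="right" counts the thresholds that are <= q_t, so a value equal
    # to a threshold belongs to the regime that the threshold starts.
    return np.searchsorted(gamma, np.asarray(q, dtype=float), side="right")

def regime_design(Z, q, gamma):
    """Interact Z_t with each of the m+1 regime indicators and stack the
    resulting blocks side by side (one block of columns per regime)."""
    Z = np.asarray(Z, dtype=float)
    j = regime_index(q, gamma)
    m = len(gamma)
    blocks = [Z * (j == k)[:, None] for k in range(m + 1)]
    return np.hstack(blocks)

# Hypothetical example: m = 2 thresholds, hence 3 regimes.
q = np.array([-1.0, 0.0, 0.5, 2.0, 3.5])
gamma = [0.0, 2.0]
regimes = regime_index(q, gamma)

Z = np.column_stack([np.ones(5), np.arange(5.0)])
D = regime_design(Z, q, gamma)
```

Here the observations with `q` equal to 0.0 and 2.0 fall in regimes 1 and 2 respectively, since each threshold value is the first value of its regime. Each row of `D` contains the row of `Z` in exactly one block and zeros elsewhere.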
The identity of the threshold variable $q_t$ and the regressors $X_t$ and $Z_t$ will determine the type of TR specification. If $q_t$ is the $d$-th lagged value of $y$, Equation (35.3) is a self-exciting (SE) model with delay $d$; if $q_t$ is not a lagged dependent variable, it is a conventional TR model. If the regressors $X_t$ and $Z_t$ contain only a constant and lags of the dependent variable, we have an autoregressive (AR) model. Thus, a SETAR model is a threshold regression that combines an autoregressive specification with a lagged dependent threshold variable.
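To make the SETAR case concrete, the following is a minimal simulation sketch of a two-regime model in which each regime has its own constant and first-order autoregressive coefficient, and the threshold variable is the dependent variable lagged by the delay. The function name `simulate_setar` and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def simulate_setar(T, gamma1, delta0, delta1, d=1, sigma=1.0, seed=0):
    """Simulate y_t = c_j + phi_j * y_{t-1} + e_t, where regime j = 1 applies
    when q_t = y_{t-d} >= gamma1 and regime j = 0 applies otherwise.
    delta0 and delta1 are (constant, AR coefficient) pairs; d must be >= 1."""
    rng = np.random.default_rng(seed)
    start = max(d, 1)                 # number of zero-valued starting values
    y = np.zeros(T + start)
    for t in range(start, T + start):
        q_t = y[t - d]                # self-exciting: threshold variable is lagged y
        c, phi = delta1 if q_t >= gamma1 else delta0
        y[t] = c + phi * y[t - 1] + sigma * rng.standard_normal()
    q = y[start - d : T + start - d]  # q aligned with the returned y
    return y[start:], q

# Hypothetical parameter values.
y, q = simulate_setar(200, gamma1=0.0, delta0=(0.5, 0.6), delta1=(-0.5, 0.3))
```

With delay 1, the returned threshold series `q` is simply `y` shifted back by one observation.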
Given the threshold variable and the regression specification in Equation (35.1), we wish to find the coefficients $\delta = (\delta_0, \ldots, \delta_m)$ and $\beta$, and usually, the threshold values $\gamma = (\gamma_1, \ldots, \gamma_m)$. We may also use model selection to identify the threshold variable $q_t$.
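Nonlinear least squares is a natural approach for estimating the parameters of the model. If we define the sum-of-squares objective function
$$S(\delta, \beta, \gamma) = \sum_{t=1}^{T} \left( y_t - X_t'\beta - \sum_{j=0}^{m} 1_j(q_t, \gamma)\, Z_t'\delta_j \right)^2 \tag{35.4}$$
we may obtain threshold regression estimates by minimizing $S(\delta, \beta, \gamma)$ with respect to the parameters.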
Taking advantage of the fact that for a given $\gamma$, say $\bar{\gamma}$, minimization of the concentrated objective $S(\delta, \beta, \bar{\gamma})$ is a simple least squares problem, we can view estimation as finding the set of thresholds and corresponding OLS coefficient estimates that minimize the sum-of-squares across all possible sets of $m$-threshold partitions.
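The concentration idea can be sketched for the single threshold case as a grid search: for each candidate threshold, fit OLS and record the sum of squared residuals, then keep the candidate with the smallest value. This is an illustrative sketch, not the EViews algorithm. The function name `fit_single_threshold`, the use of observed values of the threshold variable as candidates, and the trimming fraction `trim` (which keeps a minimum share of observations in each regime) are assumptions.

```python
import numpy as np

def fit_single_threshold(y, X, Z, q, trim=0.15):
    """Grid search over candidate thresholds; OLS for each candidate."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    q = np.asarray(q, dtype=float)
    T = len(y)
    n_min = max(int(np.ceil(trim * T)), 1)    # assumed minimum regime size
    q_sorted = np.sort(q)
    candidates = np.unique(q_sorted[n_min : T - n_min + 1])
    best = None
    for g in candidates:
        upper = q >= g                        # regime 1: gamma_1 <= q_t
        # Common regressors, then Z interacted with each regime indicator.
        W = np.hstack([X, Z * (~upper)[:, None], Z * upper[:, None]])
        coef, *_ = np.linalg.lstsq(W, y, rcond=None)
        resid = y - W @ coef
        ssr = float(resid @ resid)            # concentrated objective at g
        if best is None or ssr < best[0]:
            best = (ssr, g, coef)
    ssr, gamma_hat, coef = best
    kx, kz = X.shape[1], Z.shape[1]
    return {"gamma": float(gamma_hat), "beta": coef[:kx],
            "delta0": coef[kx:kx + kz], "delta1": coef[kx + kz:], "ssr": ssr}

# Simulated data with hypothetical parameter values and a true threshold of 0.5.
rng = np.random.default_rng(42)
T = 400
q = rng.uniform(-2, 2, T)
X = rng.normal(size=(T, 1))                   # no constant here: Z carries it
Z = np.column_stack([np.ones(T), rng.normal(size=T)])
upper = q >= 0.5
y = (0.5 * X[:, 0]
     + np.where(upper, Z @ np.array([-2.0, 1.0]), Z @ np.array([1.0, 0.5]))
     + 0.1 * rng.normal(size=T))

fit = fit_single_threshold(y, X, Z, q)
```

Note that the constant appears only in `Z` in this example; placing a constant in both `X` and `Z` would make the regressors collinear. With $m$ thresholds the same search runs over sets of thresholds rather than single values, which is where the more efficient breakpoint algorithms become useful.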
This basic estimation setup is well known from the breakpoint testing and regression literature (see, for example, Hansen, 2001 and Perron, 2006), and indeed, by permuting the observation index so that the threshold variable is non-decreasing, one sees that estimation of the threshold and breakpoint models is fundamentally equivalent (Bai and Perron, 2003). In essence, threshold regressions can be thought of as breakpoint least squares regressions with data reordered with respect to the threshold variable. Alternatively, breakpoint regressions may be thought of as threshold regressions with time as the threshold variable.
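The reordering argument can be checked numerically. The sketch below, which uses simulated data, hypothetical helper names, and a model with only regime-specific regressors for simplicity, compares the sum of squared residuals from splitting the sample by a threshold value with the one from splitting the sorted sample at the corresponding position.

```python
import numpy as np

def ssr_ols(y, W):
    """Sum of squared residuals from an OLS fit of y on W."""
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    r = y - W @ coef
    return float(r @ r)

def ssr_threshold(y, Z, q, gamma1):
    """Split by the threshold variable: q_t < gamma1 versus q_t >= gamma1."""
    hi = q >= gamma1
    return ssr_ols(y[~hi], Z[~hi]) + ssr_ols(y[hi], Z[hi])

def ssr_break(y, Z, k):
    """Split by position: first k observations versus the remainder."""
    return ssr_ols(y[:k], Z[:k]) + ssr_ols(y[k:], Z[k:])

rng = np.random.default_rng(1)
T = 60
q = rng.normal(size=T)
Z = np.column_stack([np.ones(T), rng.normal(size=T)])
y = Z @ np.array([1.0, 0.5]) + 2.0 * (q >= 0.2) + 0.3 * rng.normal(size=T)

# Reorder the data so that the threshold variable is non-decreasing.
order = np.argsort(q, kind="stable")
y_s, Z_s, q_s = y[order], Z[order], q[order]

k = 25                                     # arbitrary break position
ssr_thr = ssr_threshold(y, Z, q, q_s[k])   # threshold at the k-th sorted value
ssr_brk = ssr_break(y_s, Z_s, k)           # break at position k in sorted data
```

Because the two splits select the same observations in each regime, `ssr_thr` and `ssr_brk` agree up to floating-point rounding.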
Accordingly, the discussion of breakpoint testing (“Multiple Breakpoint Tests”) and estimation (“Least Squares with Breakpoints”) may generally be applied in the current context. We will assume for our purposes that you are familiar with, or can refer to, this material, and in the interest of brevity, we will minimize the amount of repetition in our discussion below.