Background
 
M-estimation
  M-estimator definition
  M-estimator calculation
  M-estimator summary statistics
    R-squared
    Rw-squared
    Rn-squared Statistic
    Deviance
    Information Criteria
S-estimation
  S-estimator definition
  S-estimator calculation
  S-estimator summary statistics
    R-squared
    Deviance
    Rn-squared Statistic
MM Estimation
Coefficient Covariance Methods
Before describing the mechanics of estimating robust regression models in EViews, it will be useful to review the basics of the three estimation methods and to outline alternative approaches for computing the covariance matrix of the coefficient estimates.
M-estimation
The traditional least squares estimator is computed by finding coefficient values that minimize the sum of the squared residuals:

$\hat{\beta}_{LS} = \arg\min_{\beta} \sum_{i=1}^{N} r_i(\beta)^2$  (33.1)

where the residual function is given by

$r_i(\beta) = y_i - X_i'\beta$  (33.2)

Since the residuals $r_i$ enter the objective function on the right-hand side of Equation (33.1) after squaring, the effects of outliers are magnified accordingly.
M-estimator definition
One obvious approach to robust regression replaces the squaring of residuals in Equation (33.1) with a function that provides less weight to outliers. The Huber M-estimator (“M” for “maximum likelihood estimator-like”) computes the coefficient values that minimize the summed values of a function $\rho_c$ of the residuals:

$\hat{\beta}_{M} = \arg\min_{\beta} \sum_{i=1}^{N} \rho_c\!\left(\frac{r_i(\beta)\,w_i}{\sigma}\right)$  (33.3)

where $\sigma$ is a measure of the scale of the residuals, $c$ is an arbitrary positive tuning constant associated with the $\rho_c$ function, and where the $w_i$ are individual weights that are generally set to 1, but may be set to:

$w_i = \sqrt{1 - h_i}$  (33.4)

to down-weight observations with high leverage (large diagonals $h_i$ of the hat matrix $H = X(X'X)^{-1}X'$).
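The leverage down-weighting in Equation (33.4) is straightforward to compute directly from the design matrix. The following Python sketch illustrates the calculation; the function name is illustrative only and does not correspond to an EViews routine.

import numpy as np

# Sketch of the leverage-based weights of Equation (33.4) for a design
# matrix X with one row per observation (illustrative, not EViews code).
def leverage_weights(X):
    # h_i: diagonal elements of the hat matrix H = X (X'X)^{-1} X'.
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # w_i = sqrt(1 - h_i) shrinks the influence of high-leverage points.
    return np.sqrt(1.0 - h)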
The potential choices for the $\rho_c$ function (Andrews, Bisquare, Cauchy, Fair, Huber, Logistic, Median, Talworth, Welsch) are outlined below along with the default values of the tuning constants:
 
Name       Default c
Andrews    1.339
Bisquare   4.685
Cauchy     2.385
Fair       1.4
Huber      1.345
Logistic   1.205
Median     0.01
Talworth   2.796
Welsch     2.985
The default tuning constants for each function are taken from Holland and Welsch (1977), and are chosen so that the estimator achieves 95% asymptotic efficiency under residual normality.
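To make the role of the tuning constant concrete, two of these functions are sketched below in Python; this is expository code, not the EViews implementation.

import numpy as np

# Huber rho: quadratic near zero, linear in the tails (default c = 1.345).
def rho_huber(u, c=1.345):
    small = np.abs(u) <= c
    return np.where(small, 0.5 * u**2, c * np.abs(u) - 0.5 * c**2)

# Bisquare (biweight) rho: bounded at c**2 / 6 for |u| > c (default c = 4.685).
def rho_bisquare(u, c=4.685):
    t = np.clip(np.abs(u) / c, None, 1.0) ** 2
    return (c**2 / 6.0) * (1.0 - (1.0 - t) ** 3)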
M-estimator calculation
If the scale $\sigma$ is known, then the $k$-vector of coefficient estimates $\hat{\beta}_M$ may be found using standard iterative techniques for solving the $k$ nonlinear first-order equations:

$\sum_{i=1}^{N} \psi_c\!\left(\frac{r_i(\beta)\,w_i}{\sigma}\right) w_i X_{ji} = 0$  (33.5)

for $j = 1, \ldots, k$, where $\psi_c = \rho_c'$, the derivative of the $\rho_c$ function, and $X_{ji}$ is the value of the j-th regressor for observation $i$.

Since $\sigma$ is not known, a sequential procedure is used that alternates between: (1) computing updated estimates of the scale $\hat{\sigma}$ given coefficient estimates $\hat{\beta}$, and (2) using iterative methods to find the $\hat{\beta}$ that solves Equation (33.5) for a given $\hat{\sigma}$. The initial coefficients are obtained from ordinary least squares and are used to compute a scale estimate, $\hat{\sigma}_0$. From that scale are formed new coefficient estimates $\hat{\beta}_1$, followed by a new scale estimate $\hat{\sigma}_1$, and so on until convergence is reached.
Given an estimate $\hat{\beta}_j$, the updated scale $\hat{\sigma}_j$ is estimated using one of three different methods: Median Absolute Deviation – Zero Centered (MADZERO), Median Absolute Deviation – Median Centered (MADMED), or Huber Scaling:

MADZERO:  $\hat{\sigma}_j = \mathrm{median}\left(|r_i|\right) / 0.6745$

MADMED:  $\hat{\sigma}_j = \mathrm{median}\left(|r_i - \mathrm{median}(r_i)|\right) / 0.6745$

Huber:  $\hat{\sigma}_j^2 = \frac{1}{Nh}\sum_{i=1}^{N} \varphi_c\!\left(\frac{r_i}{\hat{\sigma}_{j-1}}\right)\hat{\sigma}_{j-1}^2$,  where $\varphi_c(u) = \min(u^2, c^2)$ and $h$ is a constant chosen so that the estimator is consistent under normality

where the $r_i$ are the residuals associated with $\hat{\beta}_j$ and where the initial scale required for the Huber method is estimated by:

$\hat{\sigma}_0 = \sqrt{\frac{1}{Nh}\sum_{i=1}^{N} r_i^2}$  (33.6)
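The alternation between coefficient and scale updates can be summarized in a short Python sketch. The MAD constant 0.6745 is the 0.75 quantile of the standard normal; solve_for_beta stands in for any routine that solves Equation (33.5) at a fixed scale and is an assumed helper, not an EViews function.

import numpy as np

def mad_zero(r):
    # Median Absolute Deviation, zero centered, normal-consistent.
    return np.median(np.abs(r)) / 0.6745

def mad_med(r):
    # Median Absolute Deviation, median centered.
    return np.median(np.abs(r - np.median(r))) / 0.6745

def m_estimate(y, X, solve_for_beta, scale=mad_zero, tol=1e-8, max_it=100):
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS starting values
    s = scale(y - X @ beta)
    for _ in range(max_it):
        beta_new = solve_for_beta(y, X, s)       # solve (33.5) given the scale
        s = scale(y - X @ beta_new)              # update the scale given beta
        if np.max(np.abs(beta_new - beta)) < tol * (1.0 + np.max(np.abs(beta))):
            break
        beta = beta_new
    return beta_new, s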
M-estimator summary statistics
EViews automatically computes a variety of robust summary statistics for equations estimated using M-estimators.
R-squared
Maronna (1996, p. 171) defines the robust $R^2$ statistic as

$R^2 = \frac{\sum_{i=1}^{N} \rho_c\!\left(\frac{y_i - \hat{\mu}}{\hat{\sigma}}\right) - \sum_{i=1}^{N} \rho_c\!\left(\frac{r_i}{\hat{\sigma}}\right)}{\sum_{i=1}^{N} \rho_c\!\left(\frac{y_i - \hat{\mu}}{\hat{\sigma}}\right)}$

where $\hat{\mu}$ is the M-estimate from the constant-only specification.

The adjusted $R^2$ is calculated as:

$\bar{R}^2 = 1 - \frac{N-1}{N-k}\left(1 - R^2\right)$  (33.7)
Both of these statistics can be highly sensitive to the choice of $\rho_c$ function, even when the coefficient estimates and standard errors are not. Studies have also found that these statistics may be upwardly biased (see, for example, Renaud and Victoria-Feser (2010)).
Rw-squared
Renaud and Victoria-Feser (2010) propose the $R_w^2$ statistic, and provide simulation results showing $R_w^2$ to be a better measure of fit than the robust $R^2$ outlined above. The $R_w^2$ statistic is defined as

$R_w^2 = \frac{\left(\sum_{i=1}^{N} w_i (y_i - \bar{y}_w)(\hat{y}_i - \bar{\hat{y}}_w)\right)^2}{\sum_{i=1}^{N} w_i (y_i - \bar{y}_w)^2 \, \sum_{i=1}^{N} w_i (\hat{y}_i - \bar{\hat{y}}_w)^2}$  (33.8)

where $w_i = \psi_c(u_i)/u_i$ is the weight function evaluated at the standardized residual value $u_i = r_i/\hat{\sigma}$, and where the weighted means are

$\bar{y}_w = \frac{\sum_i w_i y_i}{\sum_i w_i}, \qquad \bar{\hat{y}}_w = \frac{\sum_i w_i \hat{y}_i}{\sum_i w_i}$  (33.9)

As with the robust $R^2$, an adjusted value of $R_w^2$ may be calculated from the unadjusted statistic

$\bar{R}_w^2 = 1 - \frac{N-1}{N-k}\left(1 - R_w^2\right)$  (33.10)
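Given fitted values and the estimator's weights, $R_w^2$ is a weighted squared correlation and can be computed in a few lines. The sketch below follows the reconstruction of Equations (33.8) and (33.9) and should be checked against Renaud and Victoria-Feser (2010); it is not EViews code.

import numpy as np

def rw_squared(y, yhat, w):
    # Weighted means of the dependent variable and the fitted values.
    ybar = np.sum(w * y) / np.sum(w)
    fbar = np.sum(w * yhat) / np.sum(w)
    # Squared weighted correlation between y and yhat, Equation (33.8).
    num = np.sum(w * (y - ybar) * (yhat - fbar)) ** 2
    den = np.sum(w * (y - ybar) ** 2) * np.sum(w * (yhat - fbar) ** 2)
    return num / den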
Rn-squared Statistic
The $R_n^2$ statistic is a robust version of a Wald test of the hypothesis that all of the coefficients are equal to zero. It is calculated using the standard Wald test quadratic form:

$R_n^2 = \tilde{\beta}'\,\tilde{V}^{-1}\tilde{\beta}$  (33.11)

where $\tilde{\beta}$ are the $k-1$ non-intercept robust coefficient estimates and $\tilde{V}$ is the corresponding estimated covariance. Under the null hypothesis that all of the coefficients are equal to zero, the $R_n^2$ statistic is asymptotically distributed as a $\chi^2_{k-1}$.
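The quadratic form in Equation (33.11) is a standard Wald computation; a Python sketch follows, where beta_sub and V_sub denote the non-intercept coefficient vector and its covariance block, extracted by the caller.

import numpy as np
from scipy.stats import chi2

def rn_squared(beta_sub, V_sub):
    # Wald quadratic form: beta' V^{-1} beta, Equation (33.11).
    stat = float(beta_sub @ np.linalg.solve(V_sub, beta_sub))
    # Asymptotic chi-square distribution with k - 1 degrees of freedom.
    pval = chi2.sf(stat, df=len(beta_sub))
    return stat, pval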
Deviance
The deviance is the value of the objective function Equation (33.3) evaluated at the final coefficient estimates and estimate of the scale:

$D = 2\,\hat{\sigma}^2 \sum_{i=1}^{N} \rho_c\!\left(\frac{r_i(\hat{\beta}_M)}{\hat{\sigma}}\right)$  (33.12)
Information Criteria
EViews reports two information criteria for M-estimated equations: the robust equivalent of the Akaike Information Criterion ($AIC_R$), and a corresponding robust Schwarz Information Criterion ($SIC_R$):

$AIC_R = 2\sum_{i=1}^{N}\rho_c(u_i) + 2k\,\frac{N^{-1}\sum_i \psi_c(u_i)^2}{N^{-1}\sum_i \psi_c'(u_i)}$
$SIC_R = 2\sum_{i=1}^{N}\rho_c(u_i) + k\,\log(N)\,\frac{N^{-1}\sum_i \psi_c(u_i)^2}{N^{-1}\sum_i \psi_c'(u_i)}$  (33.13)

where $\psi_c'$ is the derivative of $\psi_c$ as outlined in Holland and Welsch (1977). See Ronchetti (1985) for details.
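A Python sketch of these criteria under the reconstruction in Equation (33.13) is given below; the penalty scaling should be verified against Ronchetti (1985), and the rho, psi, and psi_prime arguments are assumed callables supplied by the caller.

import numpy as np

def robust_ic(u, rho, psi, psi_prime, k, N):
    # Penalty weight: average psi^2 over average psi', per the
    # reconstruction of Equation (33.13).
    penalty = np.mean(psi(u) ** 2) / np.mean(psi_prime(u))
    dev = 2.0 * np.sum(rho(u))
    aic_r = dev + 2.0 * k * penalty
    sic_r = dev + np.log(N) * k * penalty
    return aic_r, sic_r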
S-estimation
The S-estimator (“S” for “scale statistic”) is a member of the class of high-breakdown-value estimators introduced by Rousseeuw and Yohai (1984). The breakdown value of an estimator is a measure of its robustness to outliers. (A good description of breakdown values and high-breakdown-value estimators can be found in Hubert and Debruyne (2009).)
S-estimator definition
S-estimators find the set of coefficients $\hat{\beta}_S$ that provide the smallest estimate of the scale $S$ such that:

$\frac{1}{N}\sum_{i=1}^{N} \rho_c\!\left(\frac{r_i(\beta)}{S}\right) = b$  (33.14)

for the $\rho_c$ function with tuning constant $c$, where $b$ is taken to be $E_{\Phi}\left[\rho_c(Z)\right]$ with $\Phi$ the standard normal. The breakdown value for this estimator is $b^* = b/\rho_c(c)$.
Following Rousseeuw and Yohai, we choose a $\rho_c$ function based on the integral of the Biweight function

$\rho_c(u) = \begin{cases} \frac{u^2}{2} - \frac{u^4}{2c^2} + \frac{u^6}{6c^4} & |u| \le c \\ \frac{c^2}{6} & |u| > c \end{cases}$  (33.15)
and estimate the scale using the Median Absolute Deviation, Zero Centered (MADZERO) method.
Note that $c$ affects the objective function through both $\rho_c$ and $b$, and is typically chosen to achieve a desired breakdown value. EViews defaults to $c = 1.5476$, implying a breakdown value of 0.5. Other notable values for $c$ (with associated breakdown values $b^*$) are:
 
c        b*
5.1824   0.10
4.0963   0.15
3.4207   0.20
2.9370   0.25
2.5608   0.30
1.9880   0.40
1.5476   0.50
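The mapping from $c$ to the breakdown value can be checked numerically from the definitions above: compute $b = E[\rho_c(Z)]$ by integration against the standard normal density and divide by $\rho_c(c) = c^2/6$. A Python sketch:

import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def rho_biweight(u, c):
    # Biweight rho of Equation (33.15); constant at c^2/6 beyond |u| = c.
    u = np.minimum(np.abs(u), c)
    return u**2 / 2.0 - u**4 / (2.0 * c**2) + u**6 / (6.0 * c**4)

def breakdown(c):
    # b = E[rho_c(Z)] for standard normal Z, divided by max rho = c^2/6.
    b, _ = quad(lambda z: rho_biweight(z, c) * norm.pdf(z), -np.inf, np.inf)
    return b / (c**2 / 6.0)

print(round(breakdown(1.5476), 2))  # 0.5, the EViews default
print(round(breakdown(5.1824), 2))  # 0.1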
S-estimator calculation
Calculation of S-estimates is computationally intensive, and there exist a number of fast algorithms that provide accurate approximations. EViews uses the Fast-S algorithm of Salibian-Barrera and Yohai (2006):
1. Obtain a random subsample of size $m$ from the data and compute the least squares regression to obtain an initial candidate coefficient vector. By default $m$ is set equal to $k$, the number of regressors. (Note that with the default $m = k$, the regression will produce an exact fit for the subsample.)
2. Using the full sample, perform a set of refinements to the initial coefficient estimates using a variant of M-estimation which takes a single step toward the solution of Equation (33.5) at every update. These modified M-estimate refinements employ the Bisquare function with tuning parameter $c$ and scale estimator

$\hat{\sigma}_j^2 = \frac{\hat{\sigma}_{j-1}^2}{N b}\sum_{i=1}^{N} \rho_c\!\left(\frac{r_i}{\hat{\sigma}_{j-1}}\right)$  (33.16)

where $\hat{\sigma}_{j-1}$ is the previous iteration's estimate of the scale and the constant $b$ is tied to the breakdown value defined earlier through $b = b^*\rho_c(c)$. The initial scale estimate $\hat{\sigma}_0$ is obtained using MADZERO.
3. Compute a new set of residuals over the entire sample using the possibly refined initial coefficient estimates, compute an estimate of the scale using MADZERO, and produce a final estimate of $\hat{\sigma}$ by iterating Equation (33.16) (with the MADZERO estimate in place of $\hat{\sigma}_{j-1}$ at the first step) to convergence, or until the maximum number of iterations is reached.
4. Steps 1–3 are repeated for a number of random subsamples. The best (smallest) scale estimates are refined using M-estimation as in Step 2, iterated to convergence. The smallest scale from those refined scales is the final estimate of $\hat{\sigma}$, and the final coefficient estimates are the corresponding estimates of $\hat{\beta}_S$.
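The scale-updating step in Equation (33.16) is the computational core of these refinements; a Python sketch is given below, assuming rho is the biweight function of Equation (33.15) and b the normal-expectation constant from Equation (33.14).

import numpy as np

def update_scale(r, s_prev, rho, b):
    # One application of Equation (33.16): rescale the squared scale by
    # the average rho value relative to its target b.
    return np.sqrt(s_prev**2 * np.mean(rho(r / s_prev)) / b)

def iterate_scale(r, s0, rho, b, tol=1e-10, max_it=50):
    # Iterate the update to convergence, as in Step 3 of the algorithm.
    s = s0
    for _ in range(max_it):
        s_new = update_scale(r, s, rho, b)
        if abs(s_new - s) <= tol * s:
            return s_new
        s = s_new
    return s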
S-estimator summary statistics
The following summary statistics are available for equations estimated by S-estimation:
R-squared
The robust version of $R^2$ for S-estimation is given by:

$R^2 = 1 - \frac{\hat{\sigma}^2}{\hat{\sigma}_0^2}$  (33.17)

where $\hat{\sigma}$ is the estimate of the scale from the final estimation, and $\hat{\sigma}_0$ is an estimate of the scale from S-estimation with only a constant as a regressor.
Deviance
The S-estimator deviance value is given by:

$D = 2\,\hat{\sigma}^2 \sum_{i=1}^{N} \rho_c\!\left(\frac{r_i(\hat{\beta}_S)}{\hat{\sigma}}\right)$  (33.18)
Rn-squared Statistic
The $R_n^2$ statistic is identical to the one computed for M-estimation. See “Rn-squared Statistic” for discussion.
MM Estimation
MM-estimation addresses outliers in both the dependent and the independent variables by combining S-estimation with M-estimation.
The MM-estimator first computes S-estimates of the coefficients and scale, then uses the estimate of the scale as a fixed value in iterating to find a solution to Equation (33.5). The second-stage M-estimation in EViews uses the Bisquare function with a default tuning parameter value of 4.685, which gives 95% relative efficiency for normal errors (Yohai, 1987).
The summary statistics for MM-estimation are obtained from the second-stage M-estimation procedure.
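The two-stage structure is easy to express in outline. In the Python sketch below, s_estimate and m_step_fixed_scale are assumed helpers standing in for the S- and M-estimation procedures described earlier; they are not EViews routines.

def mm_estimate(y, X, s_estimate, m_step_fixed_scale):
    # Stage 1: S-estimation supplies starting coefficients and the scale.
    beta_s, sigma_s = s_estimate(y, X)
    # Stage 2: M-estimation (Bisquare, default c = 4.685) with the scale
    # held fixed at the S-estimate.
    beta_mm = m_step_fixed_scale(y, X, beta_s, sigma_s)
    return beta_mm, sigma_s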
Coefficient Covariance Methods
EViews offers three different methods for computing the coefficient covariance matrix, taken from Huber (1981, p. 173, equations 6.5, 6.6 and 6.7). All three methods provide consistent estimates of the covariance matrix, with none having uniformly better properties than the others.
 
Type I (default):  $\hat{V}_1 = K^2\,\frac{(N-k)^{-1}\sum_i \psi_c(u_i)^2}{\left(N^{-1}\sum_i \psi_c'(u_i)\right)^2}\;\hat{\sigma}^2\,(X'X)^{-1}$

Type II:  $\hat{V}_2 = K\,\frac{(N-k)^{-1}\sum_i \psi_c(u_i)^2}{N^{-1}\sum_i \psi_c'(u_i)}\;\hat{\sigma}^2\,W^{-1}$

Type III:  $\hat{V}_3 = K^{-1}\,\frac{1}{N-k}\sum_i \psi_c(u_i)^2\;\hat{\sigma}^2\;W^{-1}(X'X)\,W^{-1}$

with

$W = \sum_{i=1}^{N} \psi_c'(u_i)\,X_i X_i', \qquad K = 1 + \frac{k}{N}\,\frac{\mathrm{var}(\psi_c'(u_i))}{\left(N^{-1}\sum_i \psi_c'(u_i)\right)^2}$  (33.19)

where $u_i = r_i/\hat{\sigma}$ as before, $X_i$ is the vector of regressors for observation $i$, and $X_{ji}$ is the value of the j-th regressor for observation $i$.
The first method (which is the easiest computationally) is the default choice.
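As an illustration, the Type I matrix under this reconstruction can be computed as follows; the formula should be checked against Huber (1981, p. 173), and psi and psi_prime are assumed callables for $\psi_c$ and $\psi_c'$.

import numpy as np

def cov_type1(X, u, psi, psi_prime, s):
    # Type I covariance per the reconstruction above (Huber's eq. 6.5):
    # a degrees-of-freedom-corrected psi ratio times sigma^2 (X'X)^{-1}.
    N, k = X.shape
    d = psi_prime(u)
    K = 1.0 + (k / N) * np.var(d) / np.mean(d) ** 2
    ratio = (np.sum(psi(u) ** 2) / (N - k)) / np.mean(d) ** 2
    return K**2 * ratio * s**2 * np.linalg.inv(X.T @ X)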