Notice the third column indicates “Robust” Standard Errors.

To replicate the result in R takes a bit more work. First we load the haven package to use the read_dta function, which allows us to import Stata data sets. Then we load two more packages: lmtest and sandwich. The lmtest package provides the coeftest function, which allows us to re-calculate a coefficient table using a different variance-covariance matrix. The sandwich package provides the vcovHC function, which allows us to calculate robust standard errors. The type argument allows us to specify what kind of robust standard errors to calculate. “HC1” is one of several types available in the sandwich package and happens to be the default type in Stata 16. (We talk more about the different types and why it’s called the “sandwich” package below.)

coefci(m, vcov.

While not really the point of this post, we should note the results say that larger turn circles and bigger trunks are associated with lower gas mileage.

Now that we know the basics of getting robust standard errors out of Stata and R, let’s talk a little about why they’re robust by exploring how they’re calculated. The usual method for estimating coefficient standard errors of a linear model can be expressed with this somewhat intimidating formula:

\[\text{Var}(\hat{\beta}) = (X^TX)^{-1} X^T\Omega X (X^TX)^{-1}\]

The \(X^T\Omega X\) in the middle is the “meat” of the sandwich, and the \((X^TX)^{-1}\) on either side is the “bread”. The usual standard errors assume constant variance, so every diagonal element of \(\Omega\) is the same \(\sigma^2\). HC1 instead fills the diagonal of \(\Omega\) with

\[\frac{n}{n-k}\hat{\mu}_i^2\]

where \(\hat{\mu}_i^2\) refers to squared residuals, \(n\) is the number of observations, and \(k\) is the number of coefficients. In our simple model above, \(k = 2\), since we have an intercept and a slope. Let’s modify our formula above to substitute HC1 “meat” in our sandwich:

\[\text{Var}(\hat{\beta}) = (X^TX)^{-1} X^T \, \text{diag}\!\left(\frac{n}{n-k}\hat{\mu}_i^2\right) X (X^TX)^{-1}\]

But it’s important to remember that large residuals (or evidence of non-constant variance) could be due to a misspecified model. We may be missing key predictors, interactions, or non-linear effects.
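To make the HC1 calculation concrete, here is a minimal sketch in base R that builds the sandwich by hand for a model with an intercept and one slope, so \(k = 2\). It is an illustration under stated assumptions, not the code from this post: the built-in mtcars data and the model mpg ~ wt are stand-ins, and the object names (bread, meat, omega, vcov_hc1) are made up for this example.

```r
# Hand-rolled HC1 standard errors using only base R.
# ASSUMPTION: mtcars and mpg ~ wt are stand-ins for illustration;
# they are not the data or model used elsewhere in this post.
m <- lm(mpg ~ wt, data = mtcars)

X <- model.matrix(m)   # n x k design matrix (intercept + slope)
u <- residuals(m)      # model residuals
n <- nrow(X)           # number of observations
k <- ncol(X)           # number of coefficients

# "Bread": (X'X)^{-1}
bread <- solve(t(X) %*% X)

# HC1 diagonal of Omega: squared residuals scaled by n / (n - k)
omega <- (n / (n - k)) * u^2

# "Meat": X' Omega X, with Omega a diagonal matrix
meat <- t(X) %*% diag(omega) %*% X

# Sandwich: bread %*% meat %*% bread
vcov_hc1 <- bread %*% meat %*% bread
se_hc1 <- sqrt(diag(vcov_hc1))

# Usual (constant-variance) standard errors, for comparison
se_usual <- sqrt(diag(vcov(m)))
print(cbind(usual = se_usual, HC1 = se_hc1))

# Optional cross-check, run only if the sandwich package is installed
if (requireNamespace("sandwich", quietly = TRUE)) {
  stopifnot(isTRUE(all.equal(vcov_hc1,
                             sandwich::vcovHC(m, type = "HC1"),
                             check.attributes = FALSE)))
}
```

If the sandwich package is available, the final check compares the hand-built matrix against vcovHC with type = "HC1". Building the full n x n diagonal matrix is fine for small data sets like this one, but it is wasteful for large n, where scaling the rows of X directly would be the more economical choice.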