I have a question on Wooldridge's Procedure 18.1, which relates to IV estimation when the endogenous regressor is a binary variable. Suppose I want to estimate the following equation:

Y = a0 + a1*X1 + a2*X2 + error

where X1 is an indicator variable and is also endogenous. Assume that we have exactly one instrument Z for X1. X2 is an exogenous variable. The parameter of interest is a1.

For more efficient estimation, it is often suggested to use the following two-step method:

Step 1: Estimate a probit of the binary endogenous regressor on all exogenous variables and the instrument to obtain fitted probabilities.

Probit: X1 = f(Z, X2) + error

This gives us the predicted X1: X1_hat1.

Step 2: Use the fitted probabilities X1_hat1 from Step 1 and all exogenous regressors as instruments to obtain a more efficient estimate of the coefficient on the binary endogenous regressor. In this step I use the standard 2SLS procedure:

First stage: X1 = b0 + b1*X1_hat1 + b2*X2 + error

This gives us the predicted value X1_hat2.

Second stage: Y = a0 + a1*X1_hat2 + a2*X2 + error

My question relates to the validity of the instrument Z. To argue for the validity of this instrument, should I consider Z's statistical significance in the probit model of Step 1? Or, since Step 1 is simply an extra step to obtain fitted probabilities to be used as instruments, should we only look at the statistical significance of the fitted values (X1_hat1) in the first stage of the 2SLS estimation in Step 2, i.e. whether or not b1 = 0?

Wooldridge claims that we can ignore the Step 1 estimation properties and focus only on Step 2. However, he does not offer any explanation for this. I would really appreciate any comments or suggestions you may have on this issue.

Thanks,
VB
None of the above (or below, in this case). Use the predicted value from your -probit- as an excluded instrument in -ivreg2- (on SSC) and read the help file for -ivreg2- for tests of identification and overid tests. The part you can ignore is that your excluded instrument is generated, which is like the zeroth stage (with the first and second stages being done by -ivreg2- in a single step). Wooldridge does not say you can ignore the first stage.

See also
http://www.stata-journal.com/article.html?article=st0136
or slide 55 out of 95 in
http://www.stata.com/meeting/germany09/nichols.pdf
or e.g.
http://www.stata.com/statalist/archive/2007-04/msg00415.html

On Thu, Jan 7, 2010 at 8:52 AM, vipul bhatt wrote:
> My question relates to the validity of the instrument Z. To argue for the validity of this instrument should I consider Z's statistical significance in the Probit model of Step 1?
>
> Or
>
> since step 1 is simply an extra step to obtain fitted probabilities to be used as instruments, we should only look at the statistical significance of the fitted values (X1_hat1) in the first stage of 2sls estimation in Step 2, i.e. whether or not b1=0 ?
>
> Wooldridge claims that we can ignore the step 1 estimation properties and focus only on step 2. However he does not offer any explanation for the same. I will really appreciate any comments or suggestions you may have on this issue.
>
> Thanks,
> VB
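A minimal sketch of the advice above, using the auto dataset that ships with Stata. The choice of variables (price as outcome, foreign as the binary endogenous regressor, weight as the exogenous regressor, mpg as the instrument) is only an illustrative assumption, and -ivreg2- must be installed from SSC first:

```stata
* ivreg2 is user-written; install once with: ssc install ivreg2
sysuse auto, clear

* Zeroth stage: probit of the binary endogenous regressor on the
* instrument (mpg) and the exogenous regressor (weight)
probit foreign weight mpg
predict double ghat, pr          // fitted probabilities

* First and second stages in one step, with the fitted probability
* as the single excluded instrument; -first- prints first-stage output
ivreg2 price weight (foreign = ghat), robust first
```

The diagnostics that -ivreg2- prints here treat ghat as if it were measured data, so they describe the generated instrument rather than mpg itself.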
"or e.g. http://www.stata.com/statalist/archive/2007-04/msg00415.html"

Did some characters go missing at some point from the code in this post? "esto" should probably be "eststo"?

HTH
Martin

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On behalf of Austin Nichols
Sent: Thursday, 7 January 2010 15:33
To: [hidden email]
Subject: Re: st: Question on Wooldridge's Procedure 18.1

vipul bhatt <[hidden email]> :
None of the above (or below, in this case). Use the predicted value from your -probit- as an excluded instrument in -ivreg2- (on SSC) and read the help file for -ivreg2- for tests of identification and overid tests. The part you can ignore is that your excluded instrument is generated, which is like the zeroth stage (with the first and second stages being done by -ivreg2- in a single step). Wooldridge does not say you can ignore the first stage.

On Thu, Jan 7, 2010 at 8:52 AM, vipul bhatt <[hidden email]> wrote:
<snip>

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
Thank you for these useful links. When I said Wooldridge mentions that we can ignore the Step 1 estimation, I was actually referring to the zeroth stage, where I run a probit of the binary endogenous regressor on my instrument Z. Can we ignore the insignificance of Z in this zeroth stage and simply focus on the validity of the predicted probabilities (as these are my 'generated instrument') using the output of ivreg2?

Thank you,
VB
That older post uses the earlier (now obsolete) forms of the user-written commands. Of course, nothing is lost if you simply ignore those commands:

    sysuse auto, clear
    treatreg pri wei, treat(for=mpg) r
    ivreg pri wei (for=mpg), r
    probit for wei mpg
    predict ghat
    ivreg pri wei (for=ghat), r
    reg for wei ghat, r
    predict vhat, resid
    reg pri wei for vhat, r

or rewrite:

    est clear
    sysuse auto, clear
    treatreg pri wei, treat(for=mpg) r
    est sto treat
    ivreg pri wei (for=mpg), r
    est sto ivreg
    probit for wei mpg
    predict ghat
    ivreg pri wei (for=ghat), r
    est sto pr18_1
    reg for wei ghat, r
    predict vhat, resid
    reg pri wei for vhat, r
    est sto pr18_1cf
    est table *, t eq(1)
No, you cannot. This is the point of the sentence "Note that the weak instrument diagnostics will fail miserably using these approaches." in
http://www.stata.com/meeting/germany09/nichols.pdf

Your weak instrument diagnostics should come from straight IV, not Procedure 18.1. Note that Procedure 18.1 would go through even if Z were pure noise, and the predicted value of your endogenous variable could still be very highly correlated with the endogenous variable, leading you to think you had very strong instruments. This is also a problem when predictions are used as excluded instruments in other settings (e.g. http://www.nber.org/papers/w11279 which uses predicted income as an instrument for income) and the diagnostics are then reported as if the predictions were actual measured data.

On Thu, Jan 7, 2010 at 10:13 AM, vipul bhatt wrote:
> Thank you for these useful links. When I said Wooldridge mentions that we can ignore step 1 estimation I was actually referring to zeroth stage where I do probit of binary endogenous regressor on my instrument Z. Can we ignore insignificance of Z in this zeroth stage and simply focus on the validity of the predicted probabilities (as these are my 'generated instrument') using output of ivreg2?
>
> Thank you,
> VB
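The warning above can be illustrated with a small simulation. This is only a sketch under assumed conditions: the variable names, sample size, seed, and data-generating process are all hypothetical, and z is constructed to be pure noise so that it carries no information about x1.

```stata
clear
set seed 12345
set obs 5000

* Hypothetical data: z is pure noise, x2 drives x1 nonlinearly,
* and u makes x1 endogenous in the y equation
generate double x2 = rnormal()
generate double z  = rnormal()
generate double u  = rnormal()
generate byte   x1 = (x2 + u > 0)
generate double y  = 1 + x1 + x2 + 0.5*u + rnormal()

* Straight IV first stage: test the actual instrument z
regress x1 z x2, vce(robust)
test z
scalar F_z = r(F)

* Zeroth stage plus first stage with the generated instrument
probit x1 z x2
predict double ghat, pr
regress x1 ghat x2, vce(robust)
test ghat
scalar F_ghat = r(F)

display "F on z: " F_z "    F on ghat: " F_ghat
```

In a setup like this, one would expect the test on ghat to look far stronger than the test on z, because ghat picks up the nonlinearity of the probit in x2 rather than any information in z. That is why the strength of Z should be judged from the straight IV first stage.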
Thank you for this clarification. VB
I'm currently using the 18.1 method in a project and have seen your warnings about relying on tests of instrument strength computed through the 18.1 method.

My first question is whether those warnings arise solely because the non-linearity in the probit may be correlated with the binary endogenous variable and so falsely indicate good instrument strength.

Second, what if there is both a strong correlation between the non-linearity in the probit and the endogenous variable AND strong instruments in the model: is there reason to worry that this non-linearity introduces a bias, especially since you can't test the exclusion restriction for it?

Thanks for any comments,
Brent
