Dear all,
I am trying to replicate the approach of Dittmar/Thakor (2007), "Why do firms issue equity?", p. 27: the authors divide their sample of debt and equity issuers into quartiles based on two explanatory variables, i.e. they build a matrix. Specifically, they examine the observed number of firms that fall into one of the four categories and compare them to the expected frequencies. After that, they apply a chi-squared test for independence to determine if there are more or fewer firms than expected in each category. Untabulated results show that each of these frequencies is significant.

I have managed to build the 4x3 matrix of observed and expected frequencies using the user-written program ". tabchi [1. Dimension] [2. Dimension]". The tabulated statistics include Pearson chi2(6) = 15.0080 Pr = 0.020 and likelihood-ratio chi2(6) = 15.4736 Pr = 0.017.

However, I struggle to conduct this chi-squared test for independence to determine if there are more or fewer firms than expected in each category. I have tried the user-written program ". chitesti" (part of the package tab_chi), plugging into it the expected and observed frequencies. This gives me Pearson chi2(11) = 15.0257 Pr = 0.181 and likelihood-ratio chi2(11) = 15.6908 Pr = 0.153. But this does not allow me to test the frequencies of each (!) category.

What am I doing wrong? What is the correct and straightforward approach in Stata for this type of problem?

Many thanks for considering this posting.

Regards
Marc

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
 On Thu, 15/7/10, Marc Michelsen wrote:
> I am trying to copy the approach of Dittmar/Thakor (2007)
> "Why do firms issue equity?"

Please report a complete reference.

> p. 27: The authors divide their sample of debt and equity
> issuers into quartiles based on two explanatory variables,

That is a horrible idea: you are throwing away huge amounts of information. Just use your favourite regression-like model with your dependent variable and your independent variables. If you worry about functional form, use splines.

> I have managed to build the 4x3 matrix of observed and
> expected frequencies using the user-written program ".
> tabchi [1. Dimension] [2. Dimension]". However, I struggle
> to conduct this chi-squared test for independence to
> determine if there are more or fewer firms than expected
> in each category.

You are already done: the chi-square test will only give you this overall measure of whether your table deviates from independence; it will not give you a cell-by-cell test. If you want to model patterns in your table you will have to use what sociologists call log-linear models; see (Hout 1983) for an introduction. You don't want to go there unless you really need to. You don't need to, since you should not do this anyhow, as you should not waste the valuable information you have by only using the quartiles.

Hope this helps,
Maarten

Mike Hout (1983) "Mobility Tables". Quantitative Applications in the Social Sciences, nr. 31. Thousand Oaks: Sage.

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------
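Editorial note: the overall test described above can be reproduced with official Stata commands. The sketch below is only an illustration under assumptions: the variable names prior_return and outlook are hypothetical, and outlook is assumed to be coded as three integer categories. It also shows why the two sets of results in the original question differ: a 4x3 table tested for independence has (4-1)(3-1) = 6 degrees of freedom, whereas feeding the 12 observed and expected counts to chitesti treats them as one list of 12 categories with 12-1 = 11 degrees of freedom, which is a different test.

```stata
* Hypothetical variable names: prior_return (continuous), outlook (3 categories)
* Cut the continuous variable into quartiles (1 = lowest, 4 = highest)
xtile ret_q = prior_return, nquantiles(4)

* Observed and expected counts plus the Pearson and likelihood-ratio
* chi-squared tests of independence for the whole 4x3 table (6 df)
tabulate ret_q outlook, chi2 lrchi2 expected
```

Both statistics refer to the table as a whole; neither says anything about an individual cell.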
In reply to this post by Marc Michelsen
On Thu, Jul 15, 2010 at 10:33 AM, Marc Michelsen <[hidden email]> wrote:
> I am trying to copy the approach of Dittmar/Thakor (2007) "Why do firms
> issue equity?" p. 27: The authors divide their sample of debt and equity
> issuers into quartiles based on two explanatory variables, i.e. building a
> matrix. Specifically, they examine the observed number of firms that fall
> into one of the four categories and compare them to the expected
> frequencies. After that, they apply a chi-squared test for independence to
> determine if there are more or fewer firms than expected in each category.
> Untabulated results show that each of these frequencies is significant.

I agree with Maarten: that's a strange approach. Not that it is totally inappropriate... but it smells like the 1960s, when computations were essentially restricted to how much handwriting you could fit onto two sheets of paper. Propagating strange approaches does not do a good service to whatever discipline you are in (finance?).

If those are continuous variables, you can use two-sample Kolmogorov-Smirnov tests to compare the distributions. I am pretty sure that bivariate versions of KS tests exist, but they are not implemented in Stata. If the explanatory variables are categorical, you can compare the samples using -tabulate variable debt_vs_equity- as they are.

If you want a fancier analysis, you can run -qreg- (or rather -sqreg-) over a set of quantiles, with debt/equity as the explanatory variable, to gauge whether the distributions of the continuous variables are the same for the two types of firms.

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
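Editorial note: a minimal sketch of the three alternatives mentioned in this reply, using official Stata commands. All variable names are hypothetical: equity is assumed to be 1 for equity issuers and 0 for debt issuers, prior_return is the continuous variable, and outlook is the categorical one. The quantiles and the number of bootstrap replications are arbitrary choices for illustration.

```stata
* Hypothetical names: equity (0 = debt issuer, 1 = equity issuer),
* prior_return (continuous), outlook (categorical)

* Two-sample Kolmogorov-Smirnov test: same distribution in both groups?
ksmirnov prior_return, by(equity)

* Categorical variable: cross-tabulate against issuer type as it is
tabulate outlook equity, chi2 column

* Simultaneous quantile regressions with bootstrapped standard errors;
* the coefficient on equity is the group difference at each quantile
sqreg prior_return equity, quantiles(.25 .5 .75) reps(100)
```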
Stas, Maarten,
many thanks for your comments.

The complete reference is: Dittmar, A., and A. Thakor. "Why do firms issue equity?" Journal of Finance 62 (2007), 1-54.

You are totally right, the authors use this analysis only as an add-on / robustness test. The main body of the paper consists of multivariate analyses. Nevertheless, it would be quite helpful to determine the relative importance of the two explanatory variables (dimensions), i.e. prior stock return (divided into quartiles) and credit rating outlook (positive, negative, stable). Do you have any idea how the authors have tested the significance of each of the frequencies?

I will have a look at your three proposed alternatives and see how fancy they are.

Regards
Marc
 On Fri, 16/7/10, Marc Michelsen wrote:
> Do you have any idea how the authors have tested
> the significance of each of the frequencies?

I don't even know what the null hypothesis should be: independence refers to the whole set of frequencies that make up a cross tabulation. A test on individual frequencies then just does not make sense within this context.

-- Maarten

--------------------------
Maarten L. Buis
http://www.maartenbuis.nl
--------------------------
In reply to this post by Marc Michelsen
 On Fri, 16/7/10, Marc Michelsen wrote:
> The complete reference is: Dittmar, A., and A. Thakor. "Why
> do firms issue equity?" Journal of Finance 62 (2007), 1-54.

Ok, I had a chance to look at this article, but there is no table on page 27; there is a reference to a Table IV on the next page. Are you referring to that? If that is the case, then that has absolutely nothing to do with a chi-square test of independence; it is just a collection of t-tests comparing the averages of two groups on a set of variables.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
http://www.maartenbuis.nl
--------------------------
Maarten,
many thanks for your efforts.

Indeed, the results for this analysis are untabulated. It just says in the text at the top of page 27 (re. Prediction 2): "Using a chi-squared test for independence to determine if there are more or fewer firms than expected in each category, we show that each of these frequencies is significant."

Marc
 On Fri, 16/7/10, Marc Michelsen wrote:
> Indeed, the results for this analysis are untabulated. It
> just says in the text at the top of page 27 (re. Prediction
> 2): "Using a chi-squared test for independence to determine
> if there are more or fewer firms than expected in each
> category, we show that each of these frequencies is
> significant."

OK, I see. I would recommend just forgetting about that test. As I mentioned before, a test on the individual frequencies just does not make sense to me: independence is a characteristic of the entire table, not a characteristic of individual frequencies. To do such a test right you'd have to specify a specific hypothesis on the structure of the counts, and then fit a log-linear model. This is just not worth the effort, given that breaking up your continuous variable into quartiles is a bad idea to begin with.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
http://www.maartenbuis.nl
--------------------------
Agree. Thanks.
Marc
In reply to this post by Maarten buis
However, one may want to test subtables when the overall hypothesis is one of homogeneity of various populations. The second kind of test is one of independence. For a full table, the two tests are identical. When one is looking at subtables, one is in multiple-testing mode. The way to do this is to compute the likelihood-ratio chi-square for the subtable and compare it to the critical value for the full table (i.e. with the (r-1)(c-1) degrees of freedom of the full table), even if one is looking at a 2x2 subtable.

I don't have the exact reference, but it is fairly old: either something by Novick and Grizzle in JASA or Gabriel in the Annals of Statistics. It is from before 1980. If there's demand for this, I can look it up next week.
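Editorial note: a rough sketch of the subtable procedure described in this message, for the 4x3 table discussed in the thread. The variable names and the particular 2x2 subtable (lowest and highest return quartile, first two outlook categories) are hypothetical and chosen only for illustration. The point is that the subtable's likelihood-ratio statistic is judged against the critical value with 6 degrees of freedom (about 12.59 at the 5% level) rather than the 1 degree of freedom of the 2x2 subtable itself (about 3.84).

```stata
* Hypothetical names: ret_q (quartile 1-4), outlook (coded 1-3)

* Critical value of the full 4x3 table: (4-1)*(3-1) = 6 df, 5% level
display invchi2tail(6, 0.05)

* Likelihood-ratio chi-squared for one 2x2 subtable
tabulate ret_q outlook if inlist(ret_q, 1, 4) & inlist(outlook, 1, 2), lrchi2

* Compare the subtable statistic with the full-table critical value
* (displays 1 if the subtable statistic exceeds it, 0 otherwise)
display r(chi2_lr) > invchi2tail(6, 0.05)
```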
In reply to this post by Marc Michelsen
Dear Statalist users,

I am estimating a logit model for a panel-style data set. In order to guarantee unbiased estimation, I have used company, industry and/or offer-year clusters (per Petersen, 2009). For my linear regressions I have had good experiences with fixed-effects models. Their application to binary outcome models is not as straightforward because the models rely solely on within-variance.

Running a fixed-effects logit model (xtlogit, fe) shows highly significant coefficients on my key variables, which would be very beneficial for my study. However, more than 50% of my observations get lost in the regression because of zero within-variance.

Is it consistent to also show a fixed-effects logit model beside standard logit models clustered by the above-mentioned characteristics? What do I have to keep in mind when interpreting the results (especially relative to the other ML models)? Is it possible to calculate marginal effects for such a fixed-effects model (similar to Cameron/Trivedi, 2009, p. 516)?

Thank you for considering this posting.

Regards
Marc

Cameron, A. C., and P. K. Trivedi. Microeconometrics Using Stata. Stata Press (2009).
Petersen, M. A. "Estimating standard errors in finance panel data sets: Comparing approaches." Review of Financial Studies 22 (2009), 435.
In reply to this post by Steven Samuels
Thank you very much for all the valuable comments. Having read all that, I will probably skip the analysis.

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Steve Samuels
Sent: Saturday, 17 July 2010 00:24
To: [hidden email]
Subject: Re: st: Chi-squared test for independence of observed and expected frequencies

Marc Michelsen wants to use the chi-square test of independence in a contingency table of two of his predictor variables, because the test occurs in a reference. Stas and Maarten suggested alternatives. But he wants to test the "significance of each of the frequencies". This would have been ill-advised but possible. See the section

"Nevertheless, it would be quite helpful to determine the relative importance of the two explanatory variables (dimensions), i.e. prior stock return (divided into quartiles) and credit rating outlook (positive, negative, stable)."

Agreed, but the chi-square test for independence of the two explanatory variables says _nothing_ about their relative importance as predictors. The same logic applies to a test for the correlation of two continuous predictors in ordinary regression. Correlation (multicollinearity) will make it difficult to disentangle the effects of the involved predictors, but it says nothing about the relative importance of any of them.

The authors of Marc's reference might have had other reasons for studying the association of the two predictors. They might have also tested a single cell with the residual shown on page 81 of A. Agresti, 2002, "Categorical Data Analysis", Wiley Books.

Steve

Steven Samuels
[hidden email]
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 8452460774
Fax: 2062024783
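Editorial note: the message does not spell out the residual it cites. A common cell-level diagnostic in Agresti's textbook is the standardized (adjusted) Pearson residual, and the formula below assumes that this is the one meant. Here n_{ij} is the observed count in row i and column j, n_{i+} and n_{+j} are the row and column totals, and n is the grand total.

```latex
\[
\hat{\mu}_{ij} = \frac{n_{i+}\, n_{+j}}{n}, \qquad
r_{ij} = \frac{n_{ij} - \hat{\mu}_{ij}}
              {\sqrt{\hat{\mu}_{ij}\,\bigl(1 - n_{i+}/n\bigr)\bigl(1 - n_{+j}/n\bigr)}}
\]
```

Under independence each r_{ij} is approximately standard normal, so values well beyond 2 in absolute size point to cells with more or fewer observations than expected. With 12 cells this is a multiple-comparison exercise, so the residuals are better read as a description of where the table departs from independence than as 12 separate significance tests.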
In reply to this post by Marc Michelsen
 On Mon, 19/7/10, Marc Michelsen wrote:
> I am estimating a logit model for a panel style data set.
> In order to guarantee unbiased estimation, I have used company,
> industry and/or offer year clusters (per Petersen, 2009). For
> my linear regressions I have made positive experience with
> fixed-effects models. Their application for binary outcome
> models is not as straightforward because the models rely solely
> on within-variance.
>
> more than 50% of my observations get lost in the regression
> because of zero within variance. Is it consistent to show also
> a fixed effects logit model beside standard logit models
> clustered by the above mentioned characteristics.

I would not do that. These two estimators just measure different things: the fixed-effects estimator controls for every characteristic that remains constant, while your model with clustered standard errors does not. I don't see how you can compare the results of these two models. The point of presenting two models side by side is that (it implies that) you can compare the models. If you can't compare those models, then presenting them side by side will just result in confusion.

The problem with a large proportion of dropped observations is that you may need to think again about the population to which you are trying to generalize. For that reason I would look at whether those that drop out of your analysis are in some sense different from those that are in the analysis in terms of your observed variables. If you are lucky there isn't much difference, and you can, with some arm waving, argue that it doesn't matter. If there are considerable differences, then I would just mention that, and at the very end of your paper discuss some hypotheses of how this may influence your estimates.

Remember that you are trying to do something that is by definition impossible: get an empirical estimate of an effect while controlling for stuff that you haven't seen. So do not expect to get the right answer. What you should aim at is to look at your data as containing some information on the effect that you are interested in; it is not enough, but it is not zero either. There are now a variety of strategies you can follow to extract that information. Pick one, and do that one right. There are two reasons for that. First, using these strategies right is hard (not surprising as they try to solve an unsolvable problem...), so it really pays to focus on one strategy. Second, it is much easier this way to write your paper in a way that helps the reader to follow what data you have used and what information it contains that helps you get an idea of what the effect of interest is (and what "information" comes from the (untestable) assumptions underlying your strategy). Others (or you in a different paper) can later use other strategies. After a sufficient body of literature has been assembled on this question, someone can try to summarize the different findings.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
http://www.maartenbuis.nl
--------------------------
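Editorial note: a minimal sketch of the check suggested above, i.e. comparing the observations that the fixed-effects logit keeps with those it drops. All variable names (issue, x1, x2, firm_id) are hypothetical, and the sketch assumes that e(sample) marks only the observations actually used after groups without within-group variation in the outcome have been dropped.

```stata
* Hypothetical names: issue (0/1 outcome), x1 x2 (covariates),
* firm_id (panel identifier)
xtset firm_id

* Conditional (fixed-effects) logit; firms whose outcome never
* changes contribute nothing and are dropped
xtlogit issue x1 x2, fe

* Flag the observations that entered the estimation
generate byte used = e(sample)

* Compare retained (used = 1) and dropped (used = 0) observations
* on the observed covariates
tabstat x1 x2, by(used) statistics(mean sd n)
```

If the two groups look similar on the observed variables, the loss of observations is easier to defend; if not, the differences are worth reporting.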
Maarten,
thanks for these very thoughtful comments. I will think this through another time.

Marc
In reply to this post by Maarten buis
Thanks Maarten, for a very useful and practical response.
Dear StatalistUsers,
I am looking for the appropriate statistic to test for differences in a sample of rated companies (panel data). Besides company financials, the dataset comprises the rating category (e.g. AA, A, BBB) and the rating outlook (positive, negative, stable) per year. In summary, I have firm-years per rating category that are subdivided by the prevailing rating outlook.

I want to test for differences in company characteristics between firms with a positive, negative, or stable outlook. However, the comparison has to be done within a rating category. So BBB companies with a negative outlook have to be compared with companies also rated BBB but with a different outlook. But I want to test the whole sample at once and not just subsamples.

How can I replicate this in Stata? Or, more generally, what is this type of problem called in econometric terms? "Clustered"?

Thanks for considering my post.
Marc
 On Thu, 28/10/10, Marc Michelsen wrote:
> I am looking for the appropriate statistic to test for
> differences in a rated company sample (panel data). The
> dataset comprises, besides company financials, the rating
> category (e.g. AA, A, BBB) and the rating outlook
> (positive, negative, stable) per year.
>
> I want to test for differences in company characteristics
> between firms with a positive, negative, stable outlook.
> However, the comparison has to be done within a rating
> category.
>
> How can I replicate this in Stata?

Sounds to me like a regression of a characteristic on rating and outlook. Rating would then be a control variable. You might consider adding interaction terms between rating and outlook.

> Or more general, what is this type of problem called in
> econometric terms?

Just regression.

Hope this helps,
Maarten

Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
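A minimal sketch of what such a regression could look like in Stata, with rating as a control and an optional rating-by-outlook interaction. The data are simulated and every variable name (`firm_id`, `rating`, `outlook`, `leverage`) is hypothetical; clustering on the firm is likewise an assumption that reflects the panel structure, not something prescribed in the reply.

```stata
* Hypothetical, simulated firm-year data; all names and values are illustrative
clear
set seed 12345
set obs 600
generate firm_id  = ceil(_n/6)            // 100 firms, 6 firm-years each
generate rating   = 1 + mod(firm_id, 3)   // 1 = AA, 2 = A, 3 = BBB
generate outlook  = ceil(3*runiform())    // 1 = negative, 2 = stable, 3 = positive
generate leverage = 0.2 + 0.05*rating + 0.02*(outlook == 1) + rnormal(0, 0.1)

* Outlook differences with the rating category held constant
regress leverage i.rating i.outlook, vce(cluster firm_id)
testparm i.outlook                        // joint test: any outlook differences?

* Let the outlook differences vary across rating categories
regress leverage i.rating##i.outlook, vce(cluster firm_id)
contrast outlook@rating                   // outlook effect within each rating
```

The first model uses the whole sample at once while comparing outlooks only among firms with the same rating; the second relaxes the assumption that the outlook differences are the same in every rating category.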
Maarten,
thanks for this. Would that be a multinomial logistic regression, as the rating outlook has the three values "positive, negative, stable"?

Regards
Marc

Original message
From: [hidden email] [mailto:[hidden email]] On behalf of Maarten buis
Sent: Thursday, 28 October 2010 15:25
To: [hidden email]
Subject: Re: st: Testing for differences
 On Thu, 28/10/10, Marc Michelsen wrote:
> thanks for this. Would that be a multinomial logistic
> regression as the rating outlook has the three values
> "positive, negative, stable"?

No, the rating and the outlook are both explanatory/independent/predictor/right-hand-side/x-variables. The characteristic (whatever that may be) is your explained/dependent/left-hand-side/y-variable. So the type of regression depends on the type of firm characteristic you want to investigate.

Hope this helps,
Maarten

Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
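To illustrate the point that the outcome type, not the outlook variable, determines the model, here is a small sketch on simulated data. All variable names (`leverage`, `pays_dividend`, `n_issues`) and the data-generating values are hypothetical; rating and outlook stay on the right-hand side in every case.

```stata
* Hypothetical, simulated data; names and values are illustrative only
clear
set seed 2010
set obs 500
generate rating  = ceil(3*runiform())     // 1 = AA, 2 = A, 3 = BBB
generate outlook = ceil(3*runiform())     // 1 = negative, 2 = stable, 3 = positive

generate leverage      = 0.2 + 0.05*rating + rnormal(0, 0.1)          // continuous
generate pays_dividend = runiform() < invlogit(-0.5 + 0.3*rating)     // binary
generate n_issues      = rpoisson(exp(0.1*outlook))                   // count

* Continuous characteristic: linear regression
regress leverage i.rating i.outlook

* Binary characteristic: logistic regression
logit pays_dividend i.rating i.outlook

* Count characteristic: Poisson regression
poisson n_issues i.rating i.outlook
```

A multinomial logit would only come into play if the outlook itself were the outcome to be explained, which is not the question asked here.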