

Dear Statalist,
1) I have an equation like this: y = a + b1*X1 + b2*X2 + b3*X3 + ... + c*T +
error, where a, b, c are coefficients;
2) y is one of several dependent variables, which could be binary or
continuous;
3) T is a time trend and I use it to capture the year effect;
4) My observations are user groups that were visited in different
years, and I pool them together, treating them as cross-sectional data.
My question: how should I treat T? Should I code it as 1, 2, 3, ..., or
as the calendar year (e.g., 1990, 1991, 1992, ...)? I ran regressions (both
probit and OLS) using both codings, and the regression results give me
different coefficients and t statistics for "T".
Could anyone explain why, and which coding is appropriate for pooled
data?
Thank you very much for your help.
Yan Sun
IFPRI
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/


Yan said
> 1) I have an equation like this: y = a + b1*X1 + b2*X2 + b3*X3 + ... + c*T +
> error, where a, b, c are coefficients;
> 2) y is one of several dependent variables, which could be binary or
> continuous;
> 3) T is a time trend and I use it to capture the year effect;
> 4) My observations are user groups that were visited in different
> years, and I pool them together, treating them as cross-sectional data.
>
> My question: how should I treat T? Should I code it as 1, 2, 3, ..., or
> as the calendar year (e.g., 1990, 1991, 1992, ...)? I ran regressions (both
> probit and OLS) using both codings, and the regression results give me
> different coefficients and t statistics for "T".
>
> Could anyone explain why, and which coding is appropriate for pooled
> data?
In a pooled setting, I would include time fixed effects (i.e., i.year in factor-variable notation), which will estimate a coefficient for each year. This set of variables will absorb all time-specific (or "macro") variation.
If you instead use a time trend, it does not matter whether it starts from 1 or from 1990; any variable for which D.time is a constant will yield the same results, in terms of explanatory power. But using a linear time trend constrains the time-effect coefficients to lie on a straight line, whereas estimating i.year allows the coefficient pattern over years to be whatever the data choose. If you have ten years, it is the difference between estimating nine coefficients and one coefficient. Are those eight constraints accepted by the data? That is an easily testable hypothesis.
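The point about the origin of the trend can be checked directly. The following is a minimal sketch, not Yan's actual model: it borrows the grunfeld data and its regressors (invest, mvalue, kstock) purely for illustration, and the variable t and the scalars b_year and se_year are hypothetical names introduced here. With a constant in the model, the slope on the trend and its standard error should agree under both codings; only the constant shifts.

```stata
* Illustration only: compare a trend coded as calendar year with one coded 1, 2, 3, ...
webuse grunfeld, clear
generate t = year - 1934              // hypothetical trend variable: 1935 -> 1

* trend coded as the calendar year
regress invest mvalue kstock year
scalar b_year  = _b[year]
scalar se_year = _se[year]

* trend coded 1, 2, 3, ...
regress invest mvalue kstock t

* slope and standard error under the two codings, side by side
display "slope: " b_year "  vs  " _b[t]
display "s.e.:  " se_year "  vs  " _se[t]
```

If the trend coefficient itself differs between the two codings in your own data, that suggests the two variables are not simple shifts of each other (for example, gaps between survey years), which is worth checking before interpreting either result.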
Kit Baum | Boston College Economics & DIW Berlin | http://ideas.repec.org/e/pba1.html
An Introduction to Stata Programming | http://www.stata-press.com/books/isp.html
An Introduction to Modern Econometrics Using Stata | http://www.stata-press.com/books/imeus.html


> To Professor Kit Baum: thank you very much for your help. Would you
> please let me know which test you mentioned at the end of your
> last email ("Are those eight constraints accepted by the data? That is an
> easily testable hypothesis.")? Thanks.
Say you estimate the model
y_it = b_0 + b_1*t, t=1,2,3
then the effects of time are b_1, 2 b_1, 3 b_1, respectively. You estimate two parameters.
Instead consider the model (with time=1 as the base category)
y_it = d_0 + d_1 T2 + d_2 T3
where T2, T3 are dummies for time=2 and time=3 respectively. You estimate three parameters.
d_0 is the conditional mean of y | time=1. If the effect of time is linear, d_2 should be twice d_1. That is one constraint which can be tested or imposed.
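To spell out where the constraint comes from, match the conditional means of the two models period by period (a short derivation using only the symbols defined above):

```latex
\begin{align*}
E[y \mid t=1] &= b_0 + b_1   = d_0 \\
E[y \mid t=2] &= b_0 + 2 b_1 = d_0 + d_1 \\
E[y \mid t=3] &= b_0 + 3 b_1 = d_0 + d_2
\end{align*}
```

Subtracting the first line from the other two gives d_1 = b_1 and d_2 = 2 b_1, hence d_2 = 2 d_1. With T periods the same argument gives T-1 dummy coefficients that must all be multiples of a single slope, that is, T-2 constraints; with ten years, that is the eight constraints mentioned earlier.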
Run the enclosed, in which the effect of time is constructed to be nonlinear, and you can see the difference.

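webuse grunfeld, clear
// keep three years (1935-1937) so that i.year has two non-base levels
drop if year>1937
// display the panel settings stored with the dataset
tsset
// seed added so the simulated noise is reproducible (arbitrary value)
set seed 12345
// time effect constructed to be nonlinear (quadratic) plus noise
generate y = year + 0.5*(year-1935)^2 + rnormal(0,1)
// allow for time effects => three coeffs to be estimated
regress y i.year
// test linearity: the 1937 effect should be twice the 1936 effect
test 2*1936.year = 1937.year
// force linear trend => two coefficients to be estimated
regress y year
// enforce the linearity constraint
constraint define 1 2*1936.year=1937.year
cnsreg y i.year, constraints(1)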
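The same idea extends to the ten-year case raised earlier in the thread. The sketch below is one possible way to test all eight linearity constraints jointly, again on the grunfeld data with its own regressors as stand-ins (an assumption for illustration, not the original poster's model). It relies on the accumulate and notest options of test to build up the joint hypothesis one restriction at a time.

```stata
* Illustration only: joint test that ten year effects lie on a straight line
webuse grunfeld, clear
keep if year <= 1944                  // ten years: 1935-1944, base year 1935
regress invest mvalue kstock i.year

// linearity: the coefficient on year k equals (k-1935) times the 1936 coefficient
test 1937.year = 2*1936.year, notest
forvalues k = 1938/1943 {
    local m = `k' - 1935
    test `k'.year = `m'*1936.year, accumulate notest
}
// the last restriction triggers the joint test of all eight constraints
test 1944.year = 9*1936.year, accumulate
```

A small p-value on the joint test is evidence against the linear trend in favour of the unrestricted year dummies; a large one suggests the single trend coefficient is an acceptable simplification.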

Kit Baum


Dear all,
Prof. Baum said
"In a pooled setting, I would include time fixed effects (i.e., i.year in factor-variable notation), which will estimate a coefficient for each year. This set of variables will absorb all time-specific (or "macro") variation.
If you instead use a time trend, it does not matter whether it starts from 1 or from 1990; any variable for which D.time is a constant will yield the same results, in terms of explanatory power. But using a linear time trend constrains the time-effect coefficients to lie on a straight line, whereas estimating i.year allows the coefficient pattern over years to be whatever the data choose. If you have ten years, it is the difference between estimating nine coefficients and one coefficient. Are those eight constraints accepted by the data? That is an easily testable hypothesis".
As far as I understand, one would introduce year fixed effects, for example in a difference-in-differences setup (fixed-effects regression), to capture common macro shocks. With periods t=1,...,T, the year fixed effects are dummies for t=2,...,T. Can I define a time trend equivalently over t=2,...,T, or is the time trend defined over t=1,...,T?
Kind regards,
Sebastian Blesse

