time trend or year effect for pooled data

Sun, Yan (IFPRI)
Dear Statalist,

1) I have an equation like this: y = a + b1*X1 + b2*X2 + b3*X3 + ... + c*T +
error, where a, b1, b2, ..., c are coefficients;
2) y stands for several dependent variables, which may be binary or
continuous;
3) T is a time trend, which I use to capture the year effect;
4) My observations are user groups that were visited in different
years; I pool them and treat them as cross-sectional data.

My question: how should I code T? Should it take the values 1, 2, 3, ..., or
the calendar years (e.g., 1990, 1991, 1992, ...)? I ran regressions (both
probit and OLS) using both codings, and the results give me
different coefficients and t statistics for "T".

Could anyone explain why and which method is appropriate for pooled
data?

Thank you very much for your help.

Yan Sun
IFPRI

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

re: time trend or year effect for pooled data

Christopher F Baum
Yan said

> 1) I have an equation as this: y=a+b1*X1+b2*X2+b3*X3+...+ c*T +
> error, where a, b, c are coefficients;
> 2) Y is a couple of dependent variables, which could be binary or
> continuous;
> 3) T is a time trend and I use it to capture year effect;
> 4) My observation is user groups which were visited in different
> years and I pool them together, treating them as cross-sectional data.
>
> My question: how should I treat T? Should I value it as 1, 2, 3, ..., OR
> just yearly (eg., 1990, 1991, 1992, ....). I run regressions (both
> Probit and OLS) using both methods, and the regression results give me
> different coefficients and t statistics for "T".
>
> Could anyone explain why and which method is appropriate for pooled
> data?

In a pooled setting, I would include time fixed effects (i.e., i.year in factor-variable notation), which will estimate a coefficient for each year. This set of variables will absorb all time-specific (or "macro") variation.

If you instead use a time trend, it does not matter whether it starts from 1 or from 1990; any variable for which D.time is a constant will yield the same results in terms of explanatory power. But using a linear time trend constrains the time-effect coefficients to lie on a straight line, whereas estimating i.year allows the coefficient pattern over years to be whatever the data choose. If you have ten years, the difference is between estimating nine coefficients and one coefficient. Are those eight constraints accepted by the data? That is an easily testable hypothesis.
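
A minimal sketch of the point about the origin of the trend, using the Grunfeld data shipped with Stata rather than Yan's data. The regressors mvalue and kstock and the variable name trend are chosen only for illustration. Under these assumptions the slope and its standard error should agree across the two codings, and only the constant should change.

```stata
webuse grunfeld, clear
* trend = 1, 2, 3, ... instead of 1935, 1936, 1937, ...
generate trend = year - 1934

* trend coded as the calendar year
regress invest mvalue kstock year
scalar b_year    = _b[year]
scalar se_year   = _se[year]
scalar cons_year = _b[_cons]

* trend coded as 1, 2, 3, ...
regress invest mvalue kstock trend

* compare slopes, standard errors and constants across the two codings
display "slope on year  = " b_year    "   slope on trend = " _b[trend]
display "s.e. on year   = " se_year   "   s.e. on trend  = " _se[trend]
display "constant (year) = " cons_year "   constant (trend) = " _b[_cons]
```

If the two codings give different slopes or t statistics in your own data, it is worth checking that the two trend variables really differ only by a constant in every observation.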

Kit Baum   |   Boston College Economics & DIW Berlin   |   http://ideas.repec.org/e/pba1.html
                              An Introduction to Stata Programming  |   http://www.stata-press.com/books/isp.html
   An Introduction to Modern Econometrics Using Stata  |   http://www.stata-press.com/books/imeus.html


re: time trend or year effect for pooled data

Sun, Yan (IFPRI)
In reply to this post by Sun, Yan (IFPRI)
Professor Baum, thank you very much for your help. Would you
please tell me which test you had in mind at the end of your
last email ("Are those eight constraints accepted by the data? That is an
easily testable hypothesis.")? Thanks.
Yan

re: time trend or year effect for pooled data

Christopher F Baum
In reply to this post by Sun, Yan (IFPRI)
> To professor Kit Baum, thank you very much for your help. Would you
> please let me know what is the test you mentioned at the end of your
> last email (are those eight constraints accepted by the data? That is an
> easily testable hypothesis.) Thanks.

Say you estimate the model

y_it = b_0 + b_1*t,  t = 1, 2, 3

Then the effects of time are b_1, 2*b_1 and 3*b_1, respectively. You estimate two parameters.

Instead consider the model with time dummies in place of the trend

y_it = d_0 + d_1*T2 + d_2*T3

where T2 and T3 are dummies for time=2 and time=3, respectively. You estimate three parameters.

d_0 is the conditional mean of y | time=1. If the effect of time is linear, then d_0 = b_0 + b_1, d_1 = b_1 and d_2 = 2*b_1, so d_2 should be twice d_1. That is one constraint, which can
be tested or imposed.

Run the enclosed, in which the effect of time is constructed to be nonlinear, and you can see the difference.

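------------------------------
webuse grunfeld, clear
// keep three years (1935-1937), so there are only two year dummies
drop if year>1937
// display the panel settings stored with the dataset
tsset
// fix the seed so the simulated y is reproducible (the seed value is arbitrary)
set seed 12345
// y is quadratic in year, so the true time effect is nonlinear
g y = year + 0.5*(year-1935)^2 + rnormal(0,1)
// allow for time effects => three coeffs to be estimated
reg y i.year
// test linearity: the 1937 effect should be twice the 1936 effect
test 2*1936.year = 1937.year
// force linear trend => two coefficients to be estimated
reg y year
// enforce the linearity constraint
constraint define 1 2*1936.year=1937.year
cnsreg y i.year, constraints(1)
--------------------------------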
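
The same idea extends to the ten-year case with its eight constraints. The following is a sketch only, not part of the original example: it keeps ten years of the Grunfeld data, uses invest, mvalue and kstock purely for illustration, and computes the usual F statistic for nested OLS models by hand from the two residual sums of squares. The scalar names are arbitrary.

```stata
webuse grunfeld, clear
* keep ten years, 1935-1944
keep if year <= 1944

* unrestricted model: nine year dummies
regress invest mvalue kstock i.year
scalar rss_u = e(rss)
scalar df_u  = e(df_r)

* restricted model: one linear trend coefficient
regress invest mvalue kstock year
scalar rss_r = e(rss)

* number of constraints = difference in residual degrees of freedom
scalar q = e(df_r) - df_u
scalar F = ((rss_r - rss_u)/q) / (rss_u/df_u)
display "F(" q ", " df_u ") = " %6.3f F "   p-value = " %6.4f Ftail(q, df_u, F)
```

A small p-value would suggest that the year effects do not lie on a straight line, in which case the year dummies are the safer specification. This F test applies to OLS; for probit, a likelihood-ratio comparison of the two models would play the same role.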

Kit Baum   |   Boston College Economics & DIW Berlin   |   http://ideas.repec.org/e/pba1.html
                              An Introduction to Stata Programming  |   http://www.stata-press.com/books/isp.html
   An Introduction to Modern Econometrics Using Stata  |   http://www.stata-press.com/books/imeus.html


Re: re: time trend or year effect for pooled data

sebastian.blesse
This post has NOT been accepted by the mailing list yet.
In reply to this post by Christopher F Baum


Dear all,

Prof. Baum said
"In a pooled setting, I would include time fixed effects (i.e. i.year in factor-variable notation) which will estimate a coefficient for each year. This set of variables will absorb all time-specific (or 'macro') variation.

If you use instead a time trend, it does not matter whether it starts from 1 or starts from 1990; any variable for which D.time is a constant will yield the same results, in terms of explanatory power. But using a linear time trend constrains the time-effect coefficients to lie on a straight line, whereas estimating i.time allows the coefficient pattern over years to be whatever the data chooses. If you have ten years, it is a difference between estimating nine coefficients and one coefficient. Are those eight constraints accepted by the data? That is an easily testable hypothesis".

As far as I understand, one would introduce yearly fixed effects, for example in a difference-in-differences setup (fixed effects regression), to capture common macro shocks. With periods t=1,...,T, the yearly fixed effects are dummies for t=2,...,T. Can I define a time trend equivalently over t=2,...,T, or must the time trend be defined over t=1,...,T?

Kind regards,

Sebastian Blesse