Insignificant coefficient in prediction


Mike Kim-3
Hi all,

I have a question about prediction. Linear predictions in Stata (probably in
all software) after any regression seem to use all estimated coefficients
regardless of their statistical significance. How should we understand the
use of insignificant coefficients in forecasting?

sysuse auto, clear
reg mpg weight length  // length is not significant
predict mpgh, xb

The predicted values use all coefficients including the coefficient of
'length' even if it is not significant.
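
To make concrete what -predict, xb- computes, here is a minimal sketch that rebuilds the linear prediction by hand from the stored coefficients. The variable name mpgh_manual is made up for illustration, and -predict double- is used only so that the comparison is not clouded by float rounding.

```stata
sysuse auto, clear
regress mpg weight length

* Linear prediction as Stata computes it
predict double mpgh, xb

* The same quantity by hand: constant plus every slope times its
* predictor, with no regard to each coefficient's p-value
generate double mpgh_manual = _b[_cons] + _b[weight]*weight + _b[length]*length

* Stops with an error if the two versions differ materially
assert reldif(mpgh, mpgh_manual) < 1e-6
```

If the -assert- passes silently, the two versions agree: the prediction is just the fitted equation evaluated at each observation.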

Thanks for your time in advance.

Mike.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

RE: Insignificant coefficient in prediction

Nick Cox
You get what you ask for. If you fit a model and then follow with -predict-, then Stata predicts from that model, using every coefficient in it.

You have scope to hunt for models in which every coefficient is significant at conventional levels and then -predict- with those.

Stata lets you do that, but it remains agnostic about your prejudices about how to work with data.

My own view is that insisting that every coefficient be significant is a dogma that admits many exceptions. For example,

1. Often the search for such a model involves a trawl through many possible models, which makes the usual interpretation of significance levels problematic. Is a project a fishing expedition or does it have a clear focus on scientific research questions?

2. Not every researcher feels compelled to use thresholds such as .05 or .01. My attitude to 0.049 is not distinguishable from my attitude to 0.051.

3. Often there are scientific and/or statistical grounds for including bundles of predictors, even if some or all don't qualify individually as significant. A perhaps esoteric example is that a sine and a cosine term usually belong together, even if one is not significant. A more common example, for many readers of this list, is that a bunch of indicators usually belong together, ditto.

4. Often much of the point of a project is to use the same model, meaning here specifically the same predictors, in different circumstances. Finding out how far results are similar or different is more straightforward than if researchers just fit a ragbag of different models and insist on everything being significant.

5. Significance is usually overemphasised in any case. Often the P-value is the dodgiest part of the results, especially if there is doubt about how far the underlying assumptions are satisfied, as there usually is.

I imagine that any others whose prejudices overlap with mine could easily extend my list.
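
Point 3 can be sketched with the auto data, assuming Stata 11 or later for factor-variable notation. Using rep78 as the bundle of indicators is an arbitrary choice for illustration, not something from the posts above. The idea is to judge the indicators jointly rather than one at a time.

```stata
sysuse auto, clear

* i.rep78 enters as a set of indicators, one per non-base category
regress mpg weight i.rep78

* Joint test that all the rep78 indicators are zero; this treats the
* indicators as one bundle instead of looking at individual p-values
testparm i.rep78
```

Whatever the joint test shows, a following -predict- would still use the whole bundle, which is consistent with keeping or dropping the indicators together.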

Nick

RE: RE: Insignificant coefficient in prediction

Mike Kim-3
Thanks. I share your skepticism about significance. Let me ask it this
way. Say a practitioner wants to find which factor has more impact on
sales in order to make an investment decision: store size or a good
location. Suppose a regression model shows that the effect of store size is
smaller than the effect of location but highly significant, while the effect
of location is larger but insignificant, with p>0.5, for example. What
should we recommend: investing in store size or picking a better location?
I guess he should invest in store size. If so, what does that imply for
prediction using both coefficients? Or are these two very different
problems that should not be mixed?

Thanks again.
Mike.

RE: Insignificant coefficient in prediction

Richard Williams-5
In reply to this post by Nick Cox
At 01:12 PM 12/1/2010, Nick Cox wrote:
>3. Often there are scientific and/or statistical grounds for
>including bundles of predictors, even if some or all don't qualify
>individually as significant. A perhaps esoteric example is that a
>sine and a cosine term usually belong together, even if one is not
>significant. A more common example, for many readers of this list,
>is that a bunch of indicators usually belong together, ditto.

Good list.  To this particular item I would add interaction terms and
main effects, e.g. if you've got x1*x2 in the model you generally
want x1 and x2, even if their effects aren't significant.  Likewise
with things like X and X^2.
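
A minimal sketch of both cases using factor-variable notation (Stata 11 or later is assumed; the choice of weight and length from the auto data is only for illustration). The ## operator adds the main effects along with the product term, so they cannot be left out by accident.

```stata
sysuse auto, clear

* x1*x2 together with x1 and x2: ## expands to both main effects
* plus the interaction c.weight#c.length
regress mpg c.weight##c.length

* X and X^2: the same operator applied to a variable with itself
* gives weight and c.weight#c.weight
regress mpg c.weight##c.weight
```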


-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  [hidden email]
WWW:    http://www.nd.edu/~rwilliam

Re: Insignificant coefficient in prediction

Richard Williams-5
In reply to this post by Mike Kim-3
At 12:54 PM 12/1/2010, Mike Kim wrote:
>Hi all,
>
>I have a question about prediction. Linear predictions in Stata (probably in
>all software) after any regression seem to use all estimated coefficients
>regardless of their statistical significance. How should we understand the
>use of insignificant coefficients in forecasting?

One more addition to what everyone else has said: Just because a
coefficient does not significantly differ from 0 does not mean its
estimated value is close to 0.  Both the coefficient and its standard
error could be huge, so simply resetting the coefficient to zero
yourself could have a huge impact on the prediction.  If you really
think the effect of a variable is zero, then re-estimate the model
without the variable.
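
The difference between the two approaches can be sketched as follows. The variable names xb_full, xb_zeroed and xb_refit are made up for illustration, and the example reuses the regression from the original question.

```stata
sysuse auto, clear

* Full model, as in the original question
regress mpg weight length
predict double xb_full, xb

* Ad hoc version: keep the other estimates but force the length
* coefficient to zero (must be computed before the next -regress-)
generate double xb_zeroed = _b[_cons] + _b[weight]*weight

* Re-estimated model without length: the constant and the weight
* coefficient are free to adjust
regress mpg weight
predict double xb_refit, xb

* Compare the three sets of predictions with the observed outcome
summarize mpg xb_full xb_zeroed xb_refit
```

Least-squares fitted values from a model with a constant have the same mean as the outcome over the estimation sample, so xb_full and xb_refit share the mean of mpg. The hand-zeroed version carries no such guarantee.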

