st: Proper usage of Macros stored in summarize

st: Proper usage of Macros stored in summarize

Thomas Speidel
I am trying to use macros stored in the summarize command to flag
outliers/influential observations if they fall outside of this range:
p25 - 2*IQR <= var <= p75 + 2*IQR

Suppose I try to do this on the weight var from the auto.dta dataset
(with a little modification):

sysuse auto, clear
set obs 75
replace weight = 8000 in 75
qui: summ  weight, d
gen weight_outlier=1 if (weight>`p(75)'+2*(`r(p75)'-`r(p25)') & (weight<.))
replace weight_outlier=1 if (weight<`p(25)'-2*(`r(p75)'-`r(p25)'))

If I were to do it by hand:
. di 3*2240-2*3670
-620
. di 3*3670-2*2240
6530
gen weight_outlier2=1 if weight>6530 & weight <.
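
For readers following the arithmetic: the two display lines come from expanding the fences algebraically. The sketch below takes p25 = 2240 and p75 = 3670, the quartiles implied by the numbers typed above, rather than values re-derived from the data.

```latex
\begin{aligned}
\mathrm{IQR}   &= p_{75} - p_{25} = 3670 - 2240 = 1430 \\
\mathrm{lower} &= p_{25} - 2\,\mathrm{IQR} = 3\,p_{25} - 2\,p_{75} = 3(2240) - 2(3670) = -620 \\
\mathrm{upper} &= p_{75} + 2\,\mathrm{IQR} = 3\,p_{75} - 2\,p_{25} = 3(3670) - 2(2240) = 6530
\end{aligned}
```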

There is something I am doing wrong in the first approach - read: poor
macro programming :-) - but can't quite grasp what the problem is.
Thanks

--
Thomas Speidel

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

st: RE: Proper usage of Macros stored in summarize

Nick Cox
In your first complicated command, p(75) is evidently a typo for r(p75).


In your second complicated command, p(25) is evidently a typo for
r(p25).

I didn't look further.

There is no gain here, and much fiddly extra typing, in writing e.g.
`r(p75)' rather than r(p75).
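
A minimal sketch of the corrected approach, referring to r(p25) and r(p75) directly instead of through macro quotes. The scalar names lo_fence and hi_fence are illustrative assumptions, not part of the original post, and the 0/1 indicator is a design choice that differs from the 1/missing coding in the question.

```stata
sysuse auto, clear
set obs 75
replace weight = 8000 in 75
quietly summarize weight, detail

* Copy the fences into scalars straight away: r() results are
* replaced by the next r-class command that runs.
* (lo_fence and hi_fence are hypothetical names chosen for this sketch.)
scalar lo_fence = r(p25) - 2*(r(p75) - r(p25))
scalar hi_fence = r(p75) + 2*(r(p75) - r(p25))

* 1 = outside the fences, 0 = inside, missing when weight is missing
generate byte weight_outlier = ///
    (weight < scalar(lo_fence) | weight > scalar(hi_fence)) if !missing(weight)

list weight if weight_outlier == 1
```

Given the thresholds worked out by hand in the question (-620 and 6530), only the added observation with weight 8000 should be listed.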

Note that -extremes- from SSC and the -egen- functions -adjl()- and
-adju()- from -egenmore- from SSC already incorporate similar
functionality.
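
A rough sketch of how those user-written tools might be applied to the same problem. The option names iqr() and factor() are given from memory of the help files and should be treated as assumptions; check -help extremes- and -help egenmore- after installing. The new variable names are hypothetical.

```stata
* Install once from SSC (requires an internet connection)
ssc install extremes
ssc install egenmore

sysuse auto, clear
set obs 75
replace weight = 8000 in 75

* Assumed syntax: list values more than 2 IQRs beyond the nearer quartile
extremes weight, iqr(2)

* Assumed syntax: adjacent values, i.e. the most extreme data points
* still within 2 IQRs of the lower and upper quartiles
egen weight_adjl = adjl(weight), factor(2)
egen weight_adju = adju(weight), factor(2)

* Anything beyond the adjacent values lies outside the fences
generate byte weight_flag = ///
    (weight < weight_adjl | weight > weight_adju) if !missing(weight)
```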

Nick

Re: st: RE: Proper usage of Macros stored in summarize

Thomas Speidel
Thank you Nick.  I kept looking in other directions when the answer was
staring me in the face.

Thanks again.

Quoting Nick Cox, Tue Dec  2 11:27:00 2008:

> In your first complicated command, p(75) is evidently a typo for r(p75).
>
>
> In your second complicated command, p(25) is evidently a typo for
> r(p25).
>
> I didn't look further.
>
> There is no gain here, and much fiddly extra typing, in writing e.g.
> `r(p75)' rather than r(p75).
>
> Note that -extremes- from SSC and the -egen- functions -adjl()- and
> -adju()- from -egenmore- from SSC already incorporate similar
> functionality.
>
> Nick

--
Thomas Speidel

