sqreg VERY slow

sqreg VERY slow

Lim,  Raymond
Hello Statalisters,
I'm running sqreg on 1.8 million observations using Stata-MP. It takes about a week to run. Are there options that will speed it up? I tried wlsiter(50), hoping the first set of iterations would make the second set of iterations faster, but wlsiter just made it slower.

Thanks!
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

RE: sqreg VERY slow

Martin Weiss-5


This example takes 50 seconds per replication.

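***********
clear*
set mem 1G
* same number of observations as in the question
set obs 1800000
d,s
* simulate two regressors and an outcome
gen x=4*rnormal()
gen x2=runiform()*10
gen y=2+x+2*x2+rnormal()
* three quantiles, 10 bootstrap replications
sqreg y x?, quantile(.25 .5 .75) reps(10)
***********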
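
A per-replication figure like the one quoted above can be measured on a smaller sample before committing to the full run. The following is only a sketch: the sample size of 20,000, the 5 replications, and the use of timer slot 1 are arbitrary choices for illustration, and the timing will not necessarily scale linearly to 1.8 million observations.

```stata
* Smaller version of the example, used only to time -sqreg-
clear
set seed 1
set obs 20000
gen x  = 4*rnormal()
gen x2 = runiform()*10
gen y  = 2 + x + 2*x2 + rnormal()

* Time 5 bootstrap replications with the built-in timer
timer clear 1
timer on 1
quietly sqreg y x x2, quantile(.25 .5 .75) reps(5)
timer off 1

* -timer list- leaves the elapsed seconds in r(t1)
quietly timer list 1
display "approx. seconds per replication: " r(t1)/5
```

Repeating this at two or three sample sizes gives a rough idea of how the run time grows before the full job is started.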


There is no guarantee that MP will make your command faster. -bsample-, for
instance, does not profit much, as you can see from page 40 of
http://www.stata.com/statamp/report.pdf


HTH
Martin


equality across quantile regressions WITHOUT sqreg

Lim,  Raymond
In reply to this post by Lim, Raymond
Hello Statalisters,
Can I test equality of coefficients across different quantile regressions (.25 .50 .75) manually after running qreg? If so, would the calculation be the usual "coefficient +/- 1.96*SE", checking whether the 95% confidence intervals overlap?

I'd like to avoid sqreg because I have a large sample and a single qreg takes two days to complete. Running sqreg with its bootstrapping computations would take forever.

Thanks!
-Raymond


Re: equality across quantile regressions WITHOUT sqreg

Maarten buis
--- On Mon, 27/9/10, Lim, Raymond wrote:
> Can I test equality of coefficients across different
> quantile regressions (.25 .50 .75) manually after running
> qreg? If so, would the calculation be the usual "coefficient
> +or- 1.96*SE" and see whether the 95% confidence intervals
> overlap?

Unfortunately no, that method never works. Basically you are
missing the covariance of the joint sampling distribution.
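
For reference, this is the joint test that -sqreg- makes possible, sketched on a small simulated dataset. The data, the variable names, and the low number of replications are assumptions chosen only to keep the example quick. Because -sqreg- bootstraps all quantiles together, e(V) contains the cross-quantile covariances that separate -qreg- runs do not provide, and -test- uses them.

```stata
* Simulated data; names and sizes are illustrative only
clear
set seed 2010
set obs 2000
gen x = rnormal()
gen y = 1 + 2*x + rnormal()

* Joint estimation: one equation per quantile (q25, q50, q75)
sqreg y x, quantile(.25 .5 .75) reps(50)

* The off-diagonal blocks are the covariances across quantiles
matrix list e(V)

* Wald tests of equal slopes across quantiles
test [q25]x = [q75]x
test ([q25]x = [q50]x) ([q50]x = [q75]x)
```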
 
Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------


     


Re: equality across quantile regressions WITHOUT sqreg

Maarten buis

--- On Mon, 27/9/10, Lim, Raymond wrote:
> > If so, would the calculation be the usual "coefficient
> > +or- 1.96*SE" and see whether the 95% confidence
> > intervals overlap?

--- On Mon, 27/9/10, Maarten buis wrote:
> Unfortunately no, that method never works. Basically you
> are missing the covariance of the joint sampling distribution.

To illustrate this point, consider the simulation below. We
know that the coefficients of 2.u and 3.u are equal in the
population, because I created the data that way. So a test at
the 5% level should reject that hypothesis in 5% of the samples.
That is the logic behind a statistical test. As you can see, if
I use the appropriate command, -test-, I get the right rejection
rate (mean of sig is approx. 0.05, i.e. 5%), but the rejection
rate of the "overlapping confidence intervals test" (mean of
sig_f) really misses the nominal 5% mark.

*------------------------ begin simulation --------------------
set seed 12345
set more off
program drop _all
program define sim, rclass
        drop _all
        set obs 500
        gen u = ceil(3*runiform())
        gen y = .5* (u >= 2) + .5*rnormal()
        reg y i.u
        local lb_2 = _b[2.u] - invttail(e(df_r),0.025)*_se[2.u]
        local ub_2 = _b[2.u] + invttail(e(df_r),0.025)*_se[2.u]
        local lb_3 = _b[3.u] - invttail(e(df_r),0.025)*_se[3.u]
        local ub_3 = _b[3.u] + invttail(e(df_r),0.025)*_se[3.u]
        return scalar sig_f = ( `lb_2'  > `ub_3' ) | ( `ub_2' < `lb_3' )
        test 3.u = 2.u
        return scalar sig = r(p) < .05
end
simulate sig_f=r(sig_f) sig=r(sig), reps(10000) : sim
sum
*-------------------------- end simulation ----------------------------
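
As a rough cross-check, the large-sample rejection rate of the overlap rule can be worked out directly for a simpler case: two independent estimates with the same standard error. That case is an assumption made for illustration and is not the setup simulated above, where both coefficients share a reference category and are therefore correlated. The intervals fail to overlap only when |b1 - b2| > 2*1.96*SE, while the standard error of the difference is sqrt(2)*SE.

```stata
* Assumption: two independent estimates with a common standard error.
* Non-overlap of the two 95% intervals requires |b1 - b2| > 2*1.96*SE,
* which is |z| > 2*1.96/sqrt(2) for the standardized difference.
scalar z_overlap = 2*invnormal(0.975)/sqrt(2)
scalar p_overlap = 2*(1 - normal(z_overlap))

display "implied z threshold:             " z_overlap
display "rejection rate under H0:         " p_overlap
display "rejection rate of a proper test: " 2*(1 - normal(invnormal(0.975)))
```

Under this assumption the overlap rule rejects a true null hypothesis in well under 5% of samples, so it is not a test at the 5% level.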

Hope this helps,
Maarten


Re: equality across quantile regressions WITHOUT sqreg

Scott Merryman
In reply to this post by Maarten buis
On Mon, Sep 27, 2010 at 8:49 AM, Maarten buis <[hidden email]> wrote:
> --- On Mon, 27/9/10, Lim, Raymond wrote:
>> Can I test equality of coefficients across different
>> quantile regressions (.25 .50 .75) manually after running
>> qreg? If so, would the calculation be the usual "coefficient
>> +or- 1.96*SE" and see whether the 95% confidence intervals
>> overlap?
>
> Unfortunately no, that method never works. Basically you are
> missing the covariance of the joint sampling distribution.

In addition, you might find the following article of interest:

Wolfe, R and Hanley, J (2002), "If we're so different, why do we keep
overlapping? When 1 plus 1 doesn't make 2", Canadian Medical
Association Journal, 166: 65-66.

Available at
http://www.cmaj.ca/cgi/reprint/166/1/65

Scott

replacing asterisks in a string

Lim,  Raymond
In reply to this post by Lim, Raymond
Hello Statalisters,
How does one flag a string containing an asterisk, which in Stata is a wildcard? I want to flag the observations with an asterisk and then delete the asterisk. This is what I would do to flag/delete a parenthesis.

replace flag_parenthesis=1 if strmatch(varname, "*)")
replace varname=subinstr(varname,")","",.)

Thanks!
-Raymond


Re: replacing asterisks in a string

Eric A. Booth

***********!
clear

inp str10(v1)
"*"
"("
")"
"test(*)"
"tes*t2(*)*"
end

g v2 = strpos(v1, `"("')
g v3 = strpos(v1, `"*"')

g v2a = subinstr(v1, `"("', "", .)
g v3a = subinstr(v1, `"*"', "", .)


//all together//
foreach s in * ( ) {
 replace v1 = subinstr(v1, `"`s'"', "", .)
 }
***********!
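
For the flagging step in the question, a minimal sketch along the same lines: strpos() treats the asterisk as a literal character, so a positive position marks the observations that contain one. The names varname and flag_asterisk are placeholders modeled on the question, and the three strings are made-up test cases.

```stata
clear
input str12 varname
"abc*"
"def"
"g*h(i)*"
end

* strpos() returns 0 when the substring is absent, so > 0 flags a literal *
gen byte flag_asterisk = strpos(varname, "*") > 0

* Remove every asterisk (the . means "replace all occurrences")
replace varname = subinstr(varname, "*", "", .)

list
```

Creating the flag before the -replace- matters, since afterwards there is nothing left to detect.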

- Eric

__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[hidden email]
Office: +979.845.6754




simultaneous foreach loops

Lim,  Raymond
In reply to this post by Lim, Raymond
Hello Statalisters,
Is it possible to have two foreach loops running simultaneously? For example, I'm creating simulated teams of particular sizes. I want team 102 to have 3 members, team 108 to have 5 members, team 202 to have 3 members, etc. The team numbers do NOT follow a particular pattern. If they did, I would use the loop below. Hence, I need to somehow index by (102, 3), (108, 5), (202, 3), etc.

local tt=100
qui foreach NN of numlist 3 5 3 4 {
        bsample `NN'
        local tt=`tt'+1
        replace team=`tt'
        save simteam_`tt'.dta, replace
        restore
        preserve
        }

Thanks!
-Raymond


Re: simultaneous foreach loops

Austin Nichols
Raymond <[hidden email]>:
Yes, but you need to specify further how the two loops interact or
mesh... and in the code below you need to -preserve- before you
-bsample- the first time through.  Is there a one to one mapping from
tt to NN?  As written, it just cycles through  3 5 3 4 over and
over...

On Fri, Nov 12, 2010 at 2:08 PM, Lim,  Raymond <[hidden email]> wrote:

> Hello Statalisters,
> Is it possible to have two foreach loops running simultaneously? For example, I'm creating simulated teams of particularly sizes. I have team 102 to have 3 members, team 108 to have 5 members, team 202 to have 3 members, etc. The team numbers do NOT have a particular pattern. If it did I would use the loop below. Hence, I need to somehow index by (102, 3), (108, 5), (202,3), etc.
>
> local tt=100
> qui foreach NN of numlist 3 5 3 4 {
>        bsample `NN'
>        local tt=`tt'+1
>        replace team=`tt'
>        save simteam_`tt'.dta, replace
>        restore
>        preserve
>        }
>
> Thanks!
> -Raymond
>


Re: simultaneous foreach loops

Sergiy Radyakin
In reply to this post by Lim, Raymond
On Fri, Nov 12, 2010 at 2:08 PM, Lim,  Raymond <[hidden email]> wrote:
> Hello Statalisters,
> Is it possible to have two foreach loops running simultaneously? For example, I'm creating simulated teams of particularly sizes. I have team 102 to have 3 members, team 108 to have 5 members, team 202 to have 3 members, etc. The team numbers do NOT have a particular pattern. If it did I would use the loop below. Hence, I need to somehow index by (102, 3), (108, 5), (202,3), etc.

Yes: see arrays, and loop over the array indices.
Best, Sergiy


Re: simultaneous foreach loops

Lim,  Raymond
In reply to this post by Lim, Raymond
Austin,
I preserve before the loop in my actual code. The loop right now creates team 101 with 3 members, team 102 with 5 members, team 103 with 3 members, etc. Yes, there's a one-to-one mapping from tt to NN. I basically need two loop indices to change simultaneously. For example, first round (i=103, j=3), second round (i=139, j=5), third round (i=230, j=3). Of course, the real thing is a couple hundred pairs.

-Raymond

preserve
local tt=100
qui foreach NN of numlist 3 5 3 4 {
       bsample `NN'
       local tt=`tt'+1
       replace team=`tt'
       save simteam_`tt'.dta, replace
       restore
       preserve
       }


Re: simultaneous foreach loops

Nick Winter
It sounds like you want something that the old,
out-of-date-and-no-longer-supported -for- command made easy:

for num 101 102 103 \ num 3 5 3 : di "team X will have Y members"

-for- is no longer documented, so you'll have to guess how to use it if
you want something more complicated than that.

In modern Stata, you could do something like:

local tnums 101 102 103
local tsizes 3 5 3

forval i=1/3 {
   local team : word `i' of `tnums'
   local size : word `i' of `tsizes'
   di "Team `team' will have `size' members"
}
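
Applied to the bootstrap task from the original question, the same word-by-word lookup might look like the sketch below. The toy dataset and the id variable are assumptions standing in for the real member pool. The team numbers and sizes are the pairs from the question, and -word count- avoids hard-coding the number of pairs.

```stata
* Toy member pool (an assumption; replace with the real data)
clear
set seed 42
set obs 50
gen id = _n
gen team = .

* Parallel lists: the i-th size belongs to the i-th team number
local tnums  102 108 202
local tsizes 3 5 3
local n : word count `tnums'

forvalues i = 1/`n' {
    local team : word `i' of `tnums'
    local size : word `i' of `tsizes'
    preserve
    bsample `size'                    // draw that many members with replacement
    replace team = `team'
    save simteam_`team'.dta, replace
    restore
}
```

With a couple of hundred pairs, the two lists only need to stay in the same order. Putting -preserve- at the top of the loop body and -restore- at the bottom also removes the need for a separate -preserve- before the loop.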

- Nick Winter



--
--------------------------------------------------------------
Nicholas Winter                                 434.924.6994 t
Assistant Professor                             434.924.3359 f
Department of Politics                  [hidden email] e
University of Virginia          faculty.virginia.edu/nwinter w
S385 Gibson Hall, South Lawn

RE: simultaneous foreach loops

Nick Cox
Like Nick Winter, I see only one loop in this problem.

The situation is also discussed at length in

SJ-3-2  pr0009  . . . . . . . . . . . . . Speaking Stata:  Problems with lists
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/03   SJ 3(2):185--202                                 (no commands)
        discusses ways of working through lists held in macros

Nick
[hidden email]


Re: simultaneous foreach loops

Sergiy Radyakin
In reply to this post by Lim, Raymond
// C:\temp\listdbl.do

// ------------ SOLUTION WITH CLASSES ---------------

class bstrap_plan {
  array teams = {102,108,202}
  array membs = {  3,  5,  3}
}

.P=.bstrap_plan.new

local npairs `.P.teams.arrnels'

forvalues i=1/`npairs' {
       local team `.P.teams[`i']'
       local memb `.P.membs[`i']'

       preserve
       //bsample `memb'
       //replace team=`team'
       //save simteam_`team'.dta, replace
       display `team' "  " `memb'
       restore
}
// output:
// 102  3
// 108  5
// 202  3


// ------------ SOLUTION WITH MATRICES --------------
matrix P = 102,3\108,5\202,3

local npairs=rowsof(P)

forvalues i=1/`npairs' {
       local team=P[`i',1]
       local memb=P[`i',2]

       preserve
       //bsample `memb'
       //replace team=`team'
       //save simteam_`team'.dta, replace
       display `team' "  " `memb'
       restore
}
// output:
// 102  3
// 108  5
// 202  3


And as a bonus we get the plan matrix in a readable form (add row and
column headers to taste):

. matrix list P

P[3,2]
     c1   c2
r1  102    3
r2  108    5
r3  202    3
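
One way to add those headers, as a small sketch: the labels team, members, and pair1 to pair3 are arbitrary names chosen for illustration.

```stata
* Same plan matrix as above, with descriptive row and column names
matrix P = 102,3\108,5\202,3
matrix colnames P = team members
matrix rownames P = pair1 pair2 pair3
matrix list P
```

The loop over rows is unaffected, because P[`i',1] and P[`i',2] still address the entries by position.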

Best, Sergiy

PS: uncomment the bootstrap part in both procedures above






RE: simultaneous foreach loops

Nick Cox
The first solution shows an impressive grasp of technique, but it's a bit like building a tank so that you can drive to the nearest convenience store. The second solution and Nick Winter's solution

local tnums 101 102 103
local tsizes 3 5 3

forval i=1/3 {
   local team : word `i' of `tnums'
   local size : word `i' of `tsizes'
   di "Team `team' will have `size' members"
}

show that the same ends can be achieved with much simpler machinery.

Nick
[hidden email]
