Interruption of a loop because of missing variables

Florian Seliger
Dear Statalist,

I want to run a foreach loop over 250 files. In some files the variables are missing, so the loop is interrupted.

I have already tried to use the capture command, but I haven't found an adequate specification yet.

Please see the following example:

foreach file in acer.dta adobe.dta affiliated.dta ... {

use "`file'"

keep id cit_*
reshape long cit_, i(id)
drop if cit_ == ""
drop _j
ren cit_ pat_no

save "sic_`file'", replace
}

***end of the example***

Generally, I have multiple cit_* variables in each dataset. The number varies across files. Some datasets may have 100 cit_* variables (cit_1, cit_2, ..., cit_100), while others have none.

How can I tell Stata not to execute the commands when there are zero cit_* variables, and to proceed with the next file in that case?

Thank you very much,
Florian

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

RE: Interruption of a loop because of missing variables

Cohen, Elan
Hi Florian.  Try this:

foreach file in acer.dta adobe.dta affiliated.dta ... {

  use "`file'", clear

  // -keep- returns a nonzero code when no cit_* variable (or no id) exists
  cap keep id cit_*
  if _rc==0 {
    reshape long cit_, i(id)
    drop if cit_ == ""
    drop _j
    ren cit_ pat_no

    save "sic_`file'", replace
  }
}

- Elan


> -----Original Message-----
> From: Florian Seliger
> Sent: Friday, November 19, 2010 14:26
> Subject: st: Interruption of a loop because of missing variables


Re: Interruption of a loop because of missing variables

Stas Kolenikov
In reply to this post by Florian Seliger
You can -capture confirm variable cit_1- and use the return code to
decide whether to -continue- to the next file.
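
A minimal sketch of that idea, placed in the loop from the original question. It assumes that every file containing citations has at least cit_1 (the numbering is not guaranteed by the thread), and the three file names stand in for the full list of 250. The skip message is my own addition.

```stata
foreach file in acer.dta adobe.dta affiliated.dta {
    use "`file'", clear

    // assumption: files with citations always contain cit_1
    capture confirm variable cit_1
    if _rc != 0 {
        display as text "`file' skipped: no cit_1 variable"
        continue    // jump to the next file in the list
    }

    keep id cit_*
    reshape long cit_, i(id)
    drop if cit_ == ""
    drop _j
    rename cit_ pat_no
    save "sic_`file'", replace
}
```

Because -continue- skips the rest of the loop body, the reshape and save steps do not need to be wrapped in an -if- block.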


--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.


RE: Interruption of a loop because of missing variables

Sergiy Radyakin
In reply to this post by Cohen, Elan

Hi, All,

In the best case, not only do we not need to -reshape- the data, we don't even need to -use- a file that does not have the required variables.

The code below implements this; perhaps even better solutions can be suggested. Ideally, commands working with lists would support some sort of masks, but in the absence of such, I check the varlist item by item.

The code is commented. Put it into your "foreach file in ..." loop. I use auto.dta for illustration only (and "rep" instead of "cit_").


        clear

        // this is in the loop where fname gets defined
        // fname is the name of a file, which may or may not contain the var

        local fname "C:\Program Files (x86)\Stata11\ado\base/a/auto.dta"
        local varroot "rep"

        // read the variable names from the file without loading the data
        quietly describe using `"`fname'"', varlist
        local vars `r(varlist)'

        local present=0
        foreach onevar of local vars {
                // ==1: the variable name must start with the root
                if strpos("`onevar'","`varroot'")==1 {
                        local present=1
                        continue, break
                }
        }

        if (`present'==1) {
                use `"`fname'"', clear
                // do analysis here
        }
        else {
                display as result "File `fname' skipped because it does not contain variables starting with `varroot'"
        }

        // close the loop by file here

        // --- end of file ---


A shorter version is simply:

foreach file in acer.dta adobe.dta affiliated.dta {
      // probe: try to load only cit_* from the first observation
      capture use cit_* in 1 using `"`file'"', clear
      if _rc==0 {
             use `"`file'"', clear
             // do analysis of file `file'
      }
}

but it may not work if you have more complex requirements for variables (e.g. some of x's and some of y's, but not z's, unless there are some t's....). Complex rules are easier to check with the first version above.

Note that here we write -use- two times. We use it once just to probe whether the file contains the variables of interest (one observation is enough for that). If the variables are not there, we go to the next file; if they are, we load the whole file (perhaps we need variables other than cit_* for the analysis).
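
For completeness, here is a rough sketch of how the -describe using- check might be combined with the reshape steps from the original question. It is only an illustration: the three file names stand in for the full list, the local macro names (allvars, citvars) and the skip message are my own, and it assumes every file contains id.

```stata
foreach file in acer.dta adobe.dta affiliated.dta {
    // read variable names only; no data are loaded
    quietly describe using "`file'", varlist
    local allvars `r(varlist)'

    // collect the names that start with cit_
    local citvars
    foreach v of local allvars {
        if strpos("`v'", "cit_") == 1 {
            local citvars `citvars' `v'
        }
    }

    if "`citvars'" == "" {
        display as text "`file' skipped: no cit_* variables"
        continue
    }

    // load only the variables needed for the reshape
    use id `citvars' using "`file'", clear
    reshape long cit_, i(id)
    drop if cit_ == ""
    drop _j
    rename cit_ pat_no
    save "sic_`file'", replace
}
```

Since the list of matching names is built before loading, the file is read only once, and only the needed variables are brought into memory.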

Hope this helps.

Best regards,
    Sergiy Radyakin