Is it possible to force the "egen cut" command to keep a specified category even when there are zero observations within that category? I want to write a script that will work on multiple different datasets, and I want to categorise age into <40, >=40 <50, >=50 <60, >=60 <70, >=70 <80, >=80 <90, >=90. The egen cut command works well unless a category has zero observations: rather than still creating that category as a level of the new categorical variable, Stata simply omits it. This would usually be fine, but I am using "file write" commands to produce identically shaped tab-delimited files, so that numbers from multiple studies can easily be overlaid.
I have searched extensively but have not found a simple solution. Of course I could create the new variable manually, level by level, but I have many such variables and this would be time-consuming (and lack elegance!). Does anyone know of an elegant solution?
For example, in one dataset there are no participants aged <40 years, so the following command yields one fewer level in the resulting categorical variable than I wanted:
. egen agecut = cut(age), at (0 40 50 60 70 80 90 130) label
. tab agecut
     agecut |      Freq.     Percent        Cum.
------------+-----------------------------------
        40- |         29        0.39        0.39
        50- |        873       11.73       12.12
        60- |      3,540       47.56       59.67
        70- |      2,364       31.76       91.43
        80- |        629        8.45       99.88
        90- |          9        0.12      100.00
------------+-----------------------------------
      Total |      7,444      100.00
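For reference, the manual fallback mentioned above could be sketched roughly as follows, looping over the cutpoints so that each variable does not have to be typed out level by level. This is only a sketch under assumptions: the names agecat, agecat_lbl, fh and age_counts.txt are hypothetical, and the output layout (label, tab, count) is just one possible shape for the tab-delimited file.

```stata
* Sketch only: agecat, agecat_lbl, fh and age_counts.txt are hypothetical names
local cuts 0 40 50 60 70 80 90 130
local nbins = `: word count `cuts'' - 1

* Assign a fixed code 1..nbins to each interval [lo, hi), whether or not it is populated
generate byte agecat = .
forvalues i = 1/`nbins' {
    local lo : word `i' of `cuts'
    local hi : word `=`i' + 1' of `cuts'
    * missing ages fail the upper-bound test and so stay missing
    replace agecat = `i' if age >= `lo' & age < `hi'
    label define agecat_lbl `i' "`lo'-", add
}
label values agecat agecat_lbl

* Write one row per intended bin, so an empty bin still appears with a count of 0
file open fh using "age_counts.txt", write text replace
forvalues i = 1/`nbins' {
    quietly count if agecat == `i'
    local n = r(N)
    local lab : label agecat_lbl `i'
    file write fh "`lab'" _tab "`n'" _n
}
file close fh
```

Because the second loop runs over the intended bins rather than over the levels actually present in the data, every dataset would produce the same seven rows in the same order. It still requires one such block per variable, which is the inelegance I would like to avoid.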
Many thanks in advance for any advice.
Anthony