Dropping Observations

Dropping Observations

momo
Dear Colleagues

I have a dataset with the following shape:

var1     var5    var10 (count of obs in var11)    var11
co. N    1203    3
                                                  1450
                                                  1203
                                                  1620
co. N    etc.

So I need to drop the observations whose var11 equals the value of var5 for the same company (var1). In the example above, I want to drop the observation with var11 = 1203, because var5 = 1203 for that company.

I tried
drop var11 if (var11=var5)
but it doesn't work and gives me "var11=var5 invalid name", or "type mismatch" if I use ==.

In addition, how can I change the value of var10 to 2 instead of 3 in the example above, after dropping the observation with var11 = 1203? That is, var10 is the number of observations in var11, so it has to be updated after the drop.

Thanks in advance

R: Dropping Observations

Cardani Roberta
drop var11 if (var11==var5)
You need == (a double equals sign) to test for equality.

________________________________________
From: [hidden email] [[hidden email]] on behalf of momo [[hidden email]]
Sent: Tuesday, 9 November 2010 17:58
To: [hidden email]
Subject: st: Dropping Observations

--
View this message in context: http://statalist.1588530.n2.nabble.com/Dropping-Observations-tp5721657p5721657.html
Sent from the Statalist mailing list archive at Nabble.com.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

RE: R: Dropping Observations

Nick Cox
Although Momo's precise problem is not at all clear to me, it was stated here that == doesn't work, in the sense that "type mismatch" is the error message.

The issue is presumably that one variable is string and the other numeric. Perhaps applying -destring- to the string variable with numeric content is the easiest thing to do first.
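
A minimal sketch of that suggestion, written as a do-file. It assumes var11 is the string variable and contains only digits; the toy values are made up for illustration and are not Momo's data:

```stata
* Toy data (hypothetical): var5 is numeric, var11 is a string holding digits
clear
input str8 var1 int var5 str4 var11
"coA" 1203 "1450"
"coA" 1203 "1203"
"coA" 1203 "1620"
end

describe var5 var11          // shows that the two storage types differ
destring var11, replace      // convert the string variable to numeric
drop if var11 == var5        // both sides are now numeric, so == is allowed
```

With these toy values, only the row with var11 equal to 1203 is dropped.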

Nick
[hidden email]

Re: Dropping Observations

momo
Thank you anyway, but as I said in my first message, I already tried
drop var11 if (var11==var5)

To clarify the problem:
I have data on about 10000 companies, and each company has a code (var5), a total number of affiliates (var10), and a code for each affiliate (var11).

So I need to drop the affiliates that have the same code as the main company (headquarters).

drop var11 if (var11==var5) does not work.

I think I have to include something to say "for each company", but how?

Any ideas?

Thanks

Re: Dropping Observations

nshephard
Administrator
On Tue, Nov 9, 2010 at 5:36 PM, momo <[hidden email]> wrote:

> Anyway thank you, but as I said before in the message if I put
> drop var11 if (var11==var5)
>
> to clarify the problem
> I have say a 10000 companies data and each company has (code(var5) total
> number of affiliates(var10) and code for each affiliate (var11))
>
> so I need to drop the affiliates that have the same code of the main
> company(headquarter))
>
> drop var11 if (var11==var5) or drop var11 if (var11==var5) do not work
>
> I think I have to include something to say "for each company" but how

Read the help for -drop- again and look at the syntax examples at the top.

"drop eliminates variables OR observations from the data in memory."


You're trying to do both at once. If you want to remove the observations
where var11 and var5 match _and_ get rid of var11, you have to...

drop if (var11 == var5)
drop var11

Neil





Re: Dropping Observations

momo
drop if (var11 == var5)
This doesn't work either, maybe because each company has one value of var5 and many corresponding observations in var11, and the var5 value of the first company, for example, could be repeated in var11 for many other companies.
So the Stata command should deal with each company separately and drop an observation if its var11 equals the value of var5 for the same company, and so on for all companies.
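
A rough sketch of the per-company logic described here, under two assumptions that the thread does not confirm: both variables are numeric, and var5 is repeated on every row of its company. In that case the comparison is already made row by row, so each affiliate is checked only against its own company's code, and the affiliate count in var10 can be rebuilt afterwards. The toy values are made up:

```stata
* Toy data (hypothetical): two companies, var5 repeated on each row
clear
input str8 var1 int var5 byte var10 int var11
"coA" 1203 3 1450
"coA" 1203 3 1203
"coA" 1203 3 1620
"coB" 1450 2 1450
"coB" 1450 2 1203
end

drop if var11 == var5              // compares var11 with var5 on the same row
bysort var1: replace var10 = _N    // recount the remaining affiliates per company
list, sepby(var1)
```

In this toy example coA keeps the affiliate coded 1450 even though 1450 is coB's own code, because the comparison never looks across companies.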

Re: Dropping Observations

momo
Ah, actually I tried that before, because var11 is string and var5 is int.

So:

. destring var11, replace
var11 contains nonnumeric characters; no replace

Maybe this is because some observations are missing, but the non-missing values are all numbers.
Thanks a lot for the help.
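
One possible way to track down that message, sketched as a do-file with made-up values. The "n/a" entry and the new variable name var11_num are illustrative assumptions, since the thread does not show which values of var11 are actually at fault:

```stata
* Toy data (hypothetical): one value of var11 is not a number, one is empty
clear
input str8 var1 int var5 str4 var11
"coA" 1203 "1450"
"coA" 1203 "1203"
"coA" 1203 "n/a"
"coA" 1203 ""
end

* List values that are non-empty but cannot be read as numbers
list var1 var11 if missing(real(var11)) & var11 != ""

* Convert into a new variable; force turns unreadable values into missing
destring var11, generate(var11_num) force
drop if var11_num == var5

* Alternative without converting: compare as strings instead
* drop if var11 == string(var5)
```

Listing the offending values first makes it possible to decide whether they can safely become missing before force is used.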