Hello
I have a huge dataset containing automotive brand-switching data. I have a variable called traded_model and another variable called new_model. In order to output a symmetric switching matrix, I need to identify all the values that occur either only in traded_model or only in new_model and replace them with "others". For example, a model called "318d" exists in traded_model but not in new_model, so every "318d" should be replaced by "others". Does anyone have an idea how I can program that? I can't get it to work, and doing it manually is close to impossible as there are more than 1000 different model names. Any help is much appreciated, thank you.
You can consider making use of -levelsof- for each of your variables,
and a -foreach- loop.
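
One way to read this suggestion, as a minimal sketch rather than a definitive solution: collect the distinct values of each variable with -levelsof-, then loop over them and recode every value that never occurs in the other variable. The toy model names below are made up for illustration; only traded_model and new_model come from the question.

```stata
* Toy data: hypothetical model names, for illustration only
clear
input str10 traded_model str10 new_model
"318d" "golf"
"golf" "polo"
"polo" "golf"
"a4"   "a4"
"golf" "x5"
end

* Collect the distinct values of each variable before changing anything
levelsof traded_model, local(traded)
levelsof new_model, local(new)

* Recode traded models that never occur as a new model
foreach m of local traded {
    quietly count if new_model == `"`m'"'
    if r(N) == 0 {
        replace traded_model = "others" if traded_model == `"`m'"'
    }
}

* Recode new models that never occur as a traded model
foreach m of local new {
    quietly count if traded_model == `"`m'"'
    if r(N) == 0 {
        replace new_model = "others" if new_model == `"`m'"'
    }
}

list
```

The compound quotes around `m' keep names that contain spaces intact. With more than 1000 distinct names the loops issue one -count- per name, which is slow but simple.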
how about
levelsof traded_model, local(traded)
levelsof new_model, local(new)
local onlytraded : list traded - new
local onlynew : list new - traded
local others : list onlynew | onlytraded

replace traded_model="others" if inlist(traded_model,"`others'")
replace new_model ="others" if inlist(new_model,"`others'")


hth,
Jeph

Thank you very much, it seems to work like a charm. The only problem is that some model names contain spaces, and then Stata says "invalid name". I know there is a command to replace all the spaces with underscores, but I can't find it anymore. Does anybody know this command? Thanks

***********
h strtoname
***********

HTH
Martin

Thanks. This would be exactly what I am looking for. But it seems this function was introduced in Stata 11 and I am still working with Stata 10. Can I download this function somehow, so I can use it in Stata 10? Or is there a similar function?

Try using -subinstr()-

replace var = subinstr(var, " ", "_", .)
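
Applied to the two variables in this thread, the same call can be wrapped in a loop over both of them. This is a minimal sketch; the model names in the toy data are made up.

```stata
* Toy data with spaces in the names (hypothetical values)
clear
input str12 traded_model str12 new_model
"3 series" "golf gti"
"a4"       "3 series"
end

* Replace every space with an underscore in both model variables
foreach v of varlist traded_model new_model {
    replace `v' = subinstr(`v', " ", "_", .)
}

list
```

The missing value (.) as the last argument of -subinstr()- asks for all occurrences to be replaced, not just the first.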

~ Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[hidden email]
Office: +979.845.6754


Just use the normal string functions, then:

***********
di in r subinstr("not a good varname", " ", "_",.)
***********


HTH
Martin

This command worked perfectly; all the names are usable now. But the code still doesn't do what I want:

levelsof traded_model, local(traded)
levelsof new_model, local(new)
local onlytraded : list traded - new
local onlynew : list new - traded
local others : list onlynew | onlytraded
replace traded_model="others" if inlist(traded_model,"`others'")
replace new_model ="others" if inlist(new_model,"`others'")

For the last two commands I get "0 real changes made". It seems like "onlytraded" and "onlynew" are empty (although traded contains about 70 more names than new). I have to do some more testing with it. If anyone sees the mistake in the code, please tell me. Anyway, thank you all very much for your fast responses, I really appreciate it!
You can use -tokenize- for each element in `others':

**************!
clear
inp str10 (traded_model new_model)
"g1" "f1"
"g2" "f2"
"g3" "f3"
"g4" "f4"
"f1" "g4"
"f2" "f5"
"g4" "g5"
"g5" "aa aa"
end

levelsof traded_model, local(traded)
levelsof new_model, local(new)
local onlytraded : list traded - new
local onlynew : list new - traded
local others : list onlynew | onlytraded
****************
di `"`onlytraded'"'
di `"`onlynew'"'
di `"`others'"'
****************
token `"`others'"'
**shouldn't need to subinstr() the spaces**
** due to the compound quotes in the macros**
while "`1'"!="" {
    di "`1'"
    replace traded_model="OTHERS" if inlist(traded_model,"`1'")
    replace new_model ="OTHERS" if inlist(new_model,"`1'")
    macro shift
}
**************!

~ Eric

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
I don't get that one. What are you doing with this inp str10? Anyway, I have already renamed all the model names, so I am still working with the following code:

levelsof traded_model, local(traded)
levelsof new_model, local(new)
local onlytraded : list traded - new
local onlynew : list new - traded
local others : list onlynew | onlytraded
replace traded_model="others" if inlist(traded_model,"`others'")
replace new_model ="others" if inlist(new_model,"`others'")

Does anybody see a reason why I always get "0 real changes made" for the last two commands? Is there a mistake in the syntax somewhere? I don't see the problem. I think there must be something wrong with lines 3-5, but I can't find the error. Thank you.

"What are you doing with this inp str10"

-input- is intended to create a fake dataset, so we can replicate your problem.

"Does anybody see a reason why i always get the "0 real changes made" for the last two commands?"

-inlist()- expects the arguments for comparison to be comma-separated, which the contents of `others' are not. See Bill's little -program- in http://www.stata-journal.com/article.html?article=dm0049 on how to change that.

HTH
Martin
Thanks - it's not the first time I've tripped up on this.
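
The -inlist()- behaviour discussed here can be checked on a toy example. This is only a sketch with made-up values: a list passed inside one pair of quotes is a single string argument, so nothing matches, whereas separately quoted, comma-separated arguments are compared one by one.

```stata
* Toy data (hypothetical values)
clear
input str10 traded_model
"g1"
"g2"
"g4"
end

* One quoted argument: traded_model is compared with the single string "g1 g2"
count if inlist(traded_model, "g1 g2")
display "matches with one quoted list: " r(N)

* Comma-separated arguments: traded_model is compared with "g1" and with "g2"
count if inlist(traded_model, "g1", "g2")
display "matches with separate arguments: " r(N)
```

The first -count- finds no observations and the second finds the two observations holding "g1" and "g2".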
The upshot of the example I provided was to use -tokenize- and a loop to replace your `others' values with something that -inlist()- could understand. The problem with the last two lines of code that you asked about again is that Stata sees the

if inlist(traded_model, "`others'")

part as

if inlist(traded_model, "`"aa aa"' `"f3"' `"f4"' `"f5"' `"g1"' `"g2"' `"g3"'")

if you're using the data I provided in my example (by the way, check out Maarten Buis's FAQ on how to run examples provided on Statalist: http://www.maartenbuis.nl/example_faq/index.html). You'll notice that this matches what is displayed if you run di `"`others'"' in my example. The problem is, as Martin Weiss points out, that -inlist()- expects a comma-separated list, which is why your last two lines of code return 0 changes.

The combination of -tokenize- and a -while- loop is probably a bit more complicated than using a -foreach- loop, so you could exchange my previous example with the code below to get the same result using -foreach- (also, I've traced some of the -replace- syntax with -tr- (from SSC) so you can see how Stata is interpreting it):

**************!
clear
inp str10 (traded_model new_model)
"g1" "f1"
"g2" "f2"
"g3" "f3"
"g4" "f4"
"f1" "g4"
"f2" "f5"
"g4" "g5"
"g5" "aa aa"
end

levelsof traded_model, local(traded)
levelsof new_model, local(new)
local onlytraded : list traded - new
local onlynew : list new - traded
local others : list onlynew | onlytraded
****************
di `"`onlytraded'"'
di `"`onlynew'"'
di `"`others'"'
****************
// I'VE ONLY CHANGED BELOW THIS LINE -->
//FIRST, install -tr- to take a look at -replace- inside the loop
cap which tr
if _rc ssc install tr, replace
**
//SECOND, run a -foreach- loop
foreach v of local others {
    tr: replace traded_model="OTHERS" if inlist(traded_model,"`v'")
    tr: replace new_model ="OTHERS" if inlist(new_model,"`v'")
}
**************!

This will give you the same result.

Finally, you could use macro extended functions to change your others macro so that -inlist()- will accept it without a loop of any sort. For example, substitute the -foreach- loop above with this code:

*****
local others: subinstr local others " " ",", all
di `"`others'"'
//note that you don't need the double quotes around `others' here
tr: replace traded_model="OTHERS" if inlist(traded_model, `others' )
tr: replace new_model ="OTHERS" if inlist(new_model,`others')
*****

With the -tokenize- and -foreach- methods you don't need to use -subinstr()- to change spaces to _ in your model names. The macro -subinstr- shortcut, however, also turns a space inside a name (such as "aa aa") into a comma, so it only works for names without spaces.

~ Eric
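
A caveat on the comma-separated shortcut, offered as a hedged sketch rather than a definitive fix: as far as I know, -inlist()- accepts only a small number of string arguments (the documentation gives 10, including the first), so with 70 or more names in `others' the shortcut may stop with an error. Looping over `others' with a plain equality test avoids that limit. The toy data are the ones from the example in this post; the final check and the macro name same are my own additions.

```stata
* Toy data from the example in this post
clear
input str10 traded_model str10 new_model
"g1" "f1"
"g2" "f2"
"g3" "f3"
"g4" "f4"
"f1" "g4"
"f2" "f5"
"g4" "g5"
"g5" "aa aa"
end

* Values that occur in only one of the two variables
levelsof traded_model, local(traded)
levelsof new_model, local(new)
local onlytraded : list traded - new
local onlynew : list new - traded
local others : list onlynew | onlytraded

* Exact comparison, one value at a time: no limit on the number of values
foreach v of local others {
    quietly replace traded_model = "others" if traded_model == `"`v'"'
    quietly replace new_model    = "others" if new_model    == `"`v'"'
}

* Check (my addition): compare the two sets of values after recoding
levelsof traded_model, local(traded)
levelsof new_model, local(new)
local same : list traded === new
display "identical value sets: `same'"
```

In this toy data both variables end up with the same set of values. That is not guaranteed in general: if only one variable had unmatched names, "others" would appear on that side alone.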
