Hi,
I'm running panel-data regressions with ~1 million observations on voting data. There are ~3,000 fixed effects covering:

- voting precincts (~2,200, identified by tracking over time)
- candidates (~500, identified by candidate order rotations)
- offices (~200, identified by variation in contest ballot position)

There is no explicit time effect; time is implicitly indexed by "offices," since in my dataset Governor in 1992 is a different office from Governor in 1994. With 1 million observations, the dataset becomes enormous as I add all of my fixed effects. I have some questions on how to implement these regressions.

1) What are the memory demands, beyond the current dataset size, of running -xtreg- with a declared panel variable? If my dataset is 1GB and I specify -i(precinct)- as an option with 2,000+ precincts, how much more memory does Stata need to execute the command? Does it create the dummies during the calculation, or does it perform the within transformation? I'm assuming the latter, but 0.5GB of extra RAM beyond the 1GB dataset is not enough.

2) Any feedback on options for making the computation feasible would be GREATLY appreciated. I'm attempting to implement them now, and some are either memory-restricted or give other errors:

i) Get rid of the precinct and candidate fixed effects by creating spells with -egen spells = group(precinct candidate)- and then specifying -i(spells)- in -xtreg-.

ii) Apply the within transformation by hand. Usually this is done over time, but since I don't have an explicit time variable, and I want to sweep out the largest number of fixed effects, I could do the transform on precincts. Or on spells? Or redefine spells as precinct-candidate pairs and transform on those?

iii) -felsdvreg-: I ran into memory restrictions.

iv) -a2reg-: I received a non-conformability error and have no idea how to get around it.

v) Are there other methods that I'm missing???

On my 32-bit Stata/SE I can get about 1.4GB of RAM.
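For a back-of-the-envelope sense of question (1): explicit dummies and the within transformation have very different memory footprints. A rough sketch using the figures above (assuming Stata's default 4-byte float storage; -egen- temporaries are doubles):

```stata
* Storing ~2,200 precinct dummies for 1 million observations:
display %15.0fc 2200 * 1000000 * 4
* -> 8,800,000,000 bytes, roughly 8.8GB: explicit dummies are infeasible.

* The within transformation instead needs only a temporary group-mean
* column per variable, created one at a time:
display %15.0fc 1000000 * 8
* -> 8,000,000 bytes, about 8MB per double variable.
```

This is only storage arithmetic; any given command will also allocate working copies on top of it.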
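Options (i) and (ii) above might be sketched as follows, assuming a hypothetical outcome variable `votes` and a single regressor `x` (your variable names will differ):

```stata
* Option (i): collapse the precinct and candidate effects into
* precinct-candidate spells and let -xtreg, fe- sweep them out;
* the ~200 office effects enter as explicit dummies via -xi-.
egen spells = group(precinct candidate)
xi: xtreg votes x i.office, fe i(spells)

* Option (ii), if -xtreg- itself runs out of memory: do the within
* transformation by hand. The office dummies must be demeaned too,
* or the estimates will not match the fixed-effects fit.
xi i.office
foreach v of varlist votes x _Ioffice* {
    egen double m`v' = mean(`v'), by(spells)
    quietly replace `v' = `v' - m`v'   // overwrites the data in place to save memory
    drop m`v'
}
regress votes x _Ioffice*, noconstant
* Caveat: -regress- does not know that ~2,700 spell means were swept
* out, so its reported standard errors are slightly too small; rescale
* them by sqrt(e(df_r) / (e(df_r) - G + 1)), G = number of spells.
```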
Alternatively, I can sometimes get 2.5GB of physical RAM (and more virtual memory) on quad-core 64-bit Linux servers, but these are shared machines that generally carry a heavy load of Matlab and simulation jobs. Ultimately it's hard for me even to know how much RAM I need to specify or to hunt down.

Thanks so much in advance for any help and suggestions you can provide.

Cheers,

Scott Nicholson
Dept of Economics
Stanford University
____________________________________________________________________
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/