# st: semiparametric mixture cure models


## st: semiparametric mixture cure models

Are there any Stata routines for estimating a semiparametric mixture cure model with fixed covariates? The data have 30,000 observations and the duration of observation is in days (1 day to 1096 days).

I have estimated several parametric mixture cure models and they did not yield good results. Cure percentage was very low, the fit had long tails and the median survival time is greater than 1096 days. Therefore, there exists an identification problem.

Further,

1. Since the data are in days, is stset enough or is episode splitting required?
2. Convergence problems with the EM algorithm and the bootstrapping procedures if the data are in days.
3. If I have to group the data to estimate a semiparametric mixture cure model, what is the optimal way to group the data? 5 days, 10 days? 30 days?

Thank you for your help,

Sridhar Telidevara

*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

## Re: st: semiparametric mixture cure models

If you are talking about a Cox-type regression on top of a cure mixture, then no. These have real identifiability problems, and identification usually involves making an assumption about where the floor is.

There is strsnmix, Paul Lambert's parametric cure-mixture model, and it works well. I have some home-grown stuff, but it is still parametric.

If you have trouble with identification using the parametric model, it may be because the floor is vague in your data. Parametric cure models fit well if you can actually see the floor in a Kaplan-Meier plot. If you do not see the survivorship leveling off in the KM plot, the cure model doesn't have any way to identify the floor in your data.

Have you thought of modeling with frailty? You may achieve the same kind of result with a slightly different set of assumptions.

P

--
E. Paul Wileyto, Ph.D.
Assistant Professor of Biostatistics
Tobacco Use Research Center
School of Medicine, U. of Pennsylvania
3535 Market Street, Suite 4100
Philadelphia, PA 19104-3309
215-746-7147
Fax: 215-746-7140
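For readers following the thread, the "floor" in this reply corresponds to the cure fraction in the usual mixture cure formulation. A sketch in generic notation follows; the symbols $\pi$, $S_u$, $S_0$, $x$, $z$, $\beta$ and $\gamma$ do not appear in the thread and are introduced here only for illustration.

```latex
\[
\begin{aligned}
S(t \mid x, z) &= \pi(z) + \{1 - \pi(z)\}\, S_u(t \mid x),
  &\qquad \operatorname{logit} \pi(z) &= z^{\top}\gamma, \\
\lim_{t \to \infty} S(t \mid x, z) &= \pi(z)
  &\qquad \text{whenever } S_u(t \mid x) &\to 0, \\
S_u(t \mid x) &= S_0(t)^{\exp(x^{\top}\beta)}
  &\qquad \text{(Cox-type latency)}.
\end{aligned}
\]
```

With a Cox-type latency, $S_0$ is left unspecified, so nothing forces it to reach zero by the end of follow-up. A larger $\pi$ with a faster-dropping $S_0$ and a smaller $\pi$ with a heavier-tailed $S_0$ can then describe the observed window about equally well, which is the identifiability problem described above. Semiparametric approaches commonly resolve it with an explicit assumption about the tail, for example treating $S_0$ as zero after the last observed event.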

## Re: st: semiparametric mixture cure models

Paul,

Thank you for your observations and suggestions. I will estimate the models using strsnmix, and frailty models.

The floor on the Kaplan-Meier estimate is 0.4. I have used accelerated failure time models with log-logistic, Weibull and lognormal distributions to model the latency part, and a logit link function for the cure fraction. The log-logistic distribution appears to be the best fit of the three at the top of the KM curve, but not towards the tail. The estimated marginal survivor function appears to taper off, but below the floor of the KM estimate, at around 0.37. The estimated cure fraction from the model is around 1.2%. Median survival time is around 1500 days.

The marginal survivor function of a simple log-logistic model without cure did not approximate the KM curve well, and it was above the KM curve throughout the observation period.

Do you think your home-grown models will be of any help?

Thanks,

Sridhar

(In reply to E. Paul Wileyto's message of Tue, Feb 23, 2010, 10:34 AM.)
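The pattern described in this message (a fitted curve near 0.4 at the end of follow-up, yet an estimated cure fraction of about 1.2%) can be explored with a small Python sketch. It assumes a log-logistic latency with survivor function 1 / (1 + (t/scale)^shape), which is one common parameterization and not necessarily the one used in the Stata fits. The parameter values in both scenarios are hypothetical, chosen only so that the two curves end close to 0.4 at 1096 days; they are not estimates from the poster's data.

```python
def loglogistic_survival(t, scale, shape):
    """Latency survivor function S_u(t) = 1 / (1 + (t/scale)**shape).

    Assumed parameterization; other software may use a different one.
    """
    if t < 0:
        raise ValueError("time must be non-negative")
    return 1.0 / (1.0 + (t / scale) ** shape)


def mixture_cure_survival(t, cure_fraction, scale, shape):
    """Marginal survivor function S(t) = pi + (1 - pi) * S_u(t)."""
    return cure_fraction + (1.0 - cure_fraction) * loglogistic_survival(
        t, scale, shape
    )


FOLLOW_UP_DAYS = 1096  # length of follow-up mentioned in the thread

# Hypothetical parameter sets (not fitted to any data).
SCENARIOS = {
    "visible plateau": {"cure_fraction": 0.40, "scale": 150.0, "shape": 2.0},
    "heavy tail": {"cure_fraction": 0.012, "scale": 700.0, "shape": 1.0},
}

if __name__ == "__main__":
    for name, params in SCENARIOS.items():
        at_end = mixture_cure_survival(FOLLOW_UP_DAYS, **params)
        floor = params["cure_fraction"]
        # The gap shows how far the curve still is from its own floor
        # when follow-up ends.
        print(
            f"{name}: S({FOLLOW_UP_DAYS}) = {at_end:.3f}, "
            f"floor = {floor:.3f}, gap = {at_end - floor:.3f}"
        )
```

Both hypothetical curves sit at roughly 0.4 when follow-up ends, but only the first has nearly reached its floor by then. In the second, the curve is still falling towards a floor close to zero, so a value near 0.4 at day 1096 does not by itself pin down the cure fraction. Plotting such curves against the KM estimate over the whole follow-up window is one way to see which description the data actually support.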