Are there any Stata routines for estimating a semiparametric mixture cure model with fixed covariates? The data have 30,000 observations, and the duration of observation is in days (1 day to 1096 days).

I have estimated several parametric mixture cure models and they did not yield good results. The cure percentage was very low, the fit had long tails, and the median survival time was greater than 1096 days. Therefore, there is an identification problem.

Further,

1. Since the data are in days, is stset enough, or is episode splitting required?

2. There are convergence problems with the EM algorithm and the bootstrapping procedures if the data are in days.

3. If I have to group the data to estimate a semiparametric mixture cure model, what is the optimal way to group the data? 5 days? 10 days? 30 days?

Thank you for your help,

Sridhar Telidevara

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
If you are talking about a Cox-type regression on top of a cure mixture, then no. These models have real identifiability problems, and identification usually involves making an assumption about where the floor is.

There is strsnmix, Paul Lambert's parametric cure-mixture model, and it works well. I have some home-grown stuff, but it is still parametric. If you have trouble with identification using the parametric model, it may be because the floor is vague in your data. Parametric cure models fit well if you can actually see the floor in a Kaplan-Meier plot. If you do not see the survivorship leveling off in the KM plot, the cure model has no way to identify the floor in your data.

Have you thought of modeling with frailty? You may achieve the same kind of result with a slightly different set of assumptions.

P

--
E. Paul Wileyto, Ph.D.
Assistant Professor of Biostatistics
Tobacco Use Research Center
School of Medicine, U. of Pennsylvania
3535 Market Street, Suite 4100
Philadelphia, PA 19104-3309

215-746-7147
Fax: 215-746-7140
[hidden email]
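One way to check for a visible floor is to compute the Kaplan-Meier estimate and see how much it still drops over the last stretch of follow-up. The sketch below is a minimal, hand-rolled estimator in Python run on toy data; the function names and the data are assumptions made for illustration and are not part of any Stata routine mentioned in this thread.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time per subject (e.g. days)
    events : 1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # everyone leaving the risk set at time t
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Subjects censored at t still count as at risk at t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


def survival_at(curve, t):
    """Step-function lookup of the KM estimate at time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


# Toy data (hypothetical): six subjects, three observed events.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)

# How far does the curve fall between an earlier time and the end of follow-up?
late_drop = survival_at(curve, 2) - survival_at(curve, 5)
print(curve)
print(late_drop)
```

With real data one would compare, say, the estimate at 731 days with the estimate at 1096 days. A drop close to zero suggests a plateau; a clearly positive drop means the curve is still declining when follow-up ends.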
Paul,

Thank you for your observations and suggestions. I will estimate the models using strsnmix, and frailty models.

The floor on the Kaplan-Meier estimate is 0.4. I have used an accelerated failure time model with loglogistic, Weibull, and lognormal distributions to model the latency part, and a logit link function for the cure fraction. The loglogistic distribution appears to be the best fit of the three at the top of the KM curve, but not towards the tail. The estimated marginal survivor function appears to taper off, but below the floor of the KM estimate, at around 0.37. The estimated cure fraction from the model is around 1.2%. Median survival time is around 1500 days. The marginal survivor function of a simple loglogistic model without cure did not approximate the KM curve well; it was above the KM curve throughout the observation period.

Do you think your home-grown models will be of any help?

Thanks,

Sridhar

On Tue, Feb 23, 2010 at 10:34 AM, E. Paul Wileyto <[hidden email]> wrote:
> If you are talking about a Cox-type regression on top of a cure mixture,
> then no. These have real identifiability problems, and identification
> usually involves making an assumption about where the floor is.
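For reference, a mixture cure model with loglogistic latency and a logit link for the cure fraction, as described in the message above, is commonly written as follows. The symbols (pi for the cure fraction, S_u for the survivor function of the uncured, x and z for covariate vectors, beta, delta, and gamma for parameters) are notation assumed here, not taken from the posts, and the loglogistic form shown is one common accelerated failure time parameterization.

```latex
S(t \mid x, z) = \pi(z) + \bigl(1 - \pi(z)\bigr)\, S_u(t \mid x),
\qquad
\pi(z) = \frac{\exp(z'\delta)}{1 + \exp(z'\delta)},
\qquad
S_u(t \mid x) = \frac{1}{1 + \bigl(t\, e^{-x'\beta}\bigr)^{1/\gamma}},
\qquad
\lim_{t \to \infty} S(t \mid x, z) = \pi(z).
```

The marginal survivor function reaches the floor pi(z) only in the limit, so the level of the fitted curve at 1096 days and the estimated cure fraction are two different quantities.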
My home-grown models are pretty much equivalent to Paul Lambert's models. His stuff has error handling; mine doesn't.

By 0.4, I assume you mean 40%. That 1.2% estimate may have to do with your data still declining at censoring time. The parametric cure model is fitted to the shape of the uncensored data, and if at censoring there's a long way to go and the curve is still declining (even slowly), the model may well project that the floor is far below the current level. It doesn't surprise me.

P
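A small numerical sketch of the point above: with loglogistic latency, two quite different cure fractions can give nearly the same marginal survival at the end of a 1096-day follow-up, so the level of the curve at censoring does not pin down the floor by itself. The function name and both parameter sets are hypothetical values chosen for illustration; they are not fitted to the data discussed in this thread.

```python
def mixture_cure_survival(t, cure_fraction, scale, shape):
    """Marginal survivor function of a mixture cure model with
    loglogistic latency: S(t) = pi + (1 - pi) / (1 + (t / scale) ** shape)."""
    latency = 1.0 / (1.0 + (t / scale) ** shape)
    return cure_fraction + (1.0 - cure_fraction) * latency


END_OF_FOLLOW_UP = 1096  # days, as in the thread

# Two hypothetical parameter sets (illustrative values, not estimates).
high_floor = dict(cure_fraction=0.35, scale=200.0, shape=1.5)
low_floor = dict(cure_fraction=0.012, scale=458.0, shape=0.5)

for label, params in (("high floor", high_floor), ("low floor", low_floor)):
    s_end = mixture_cure_survival(END_OF_FOLLOW_UP, **params)
    s_far = mixture_cure_survival(100 * END_OF_FOLLOW_UP, **params)
    print(f"{label}: S(end of follow-up) = {s_end:.3f}, "
          f"S(100 x follow-up) = {s_far:.3f}, "
          f"floor = {params['cure_fraction']:.3f}")
```

Both parameter sets put the curve close to 0.40 at day 1096, yet one implies a floor of 35% and the other a floor of 1.2%. What separates them is the shape of the decline before censoring, which is what the parametric fit relies on.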
In reply to this post by E. Paul Wileyto
Paul, I'm fitting models with Cox regression, but I cannot work out the syntax in Stata. What is the syntax for Cox regression with cure models in Stata? What are the details of the syntax? Which Stata version can be used for phi (the coefficient of cure models)? Thanks