survival analysis with unknown censoring


Wagner, Stefan
I am analyzing survival times with no time-varying covariates. At the moment, I am using a Cox proportional hazards model estimated with Stata's stcox.

The data is characterized as follows:

For every observation in the sample, the date on which the individual joined the risk pool is known. Spells can be terminated by one of two outcomes, A or B. Unfortunately, I observe only outcome A. For those cases I also know when A happened, so I can compute the duration of spells ending in A as (date of A minus entry date).

For the remaining observations it is impossible to determine whether the spell has already been terminated by event B or whether the observation is still at risk.

Given this data structure, it seems unreasonable to treat observations that did not end in A as censored, because I cannot know whether they are still in the risk pool (in which case the duration would be today's date minus entry date) or whether they have left the risk pool to destination B (in which case the duration would be date of B minus entry date).

Currently, I estimate the Cox model only on observations that ended in A and exclude all other observations from the estimation. As a robustness check, I also estimate a Heckman selection model in which selection is defined by whether the spell ended in A (yes/no) and duration is the dependent variable in the second stage. The results of both exercises are comparable.
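To make the concern about the A-only sample concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption for illustration and not taken from the data described above: constant (exponential) hazards for A and B, the rates lam_a and lam_b, the observation window horizon, and the function names. It compares a simple events-per-exposure rate for A computed on A-ended spells only against the same rate computed on all spells, with B exits and still-at-risk spells treated as censored. The second quantity is only available here because the simulation knows when B happened, which is exactly what is missing in the real data.

```python
import random


def simulate(n=20000, lam_a=0.10, lam_b=0.05, horizon=10.0, seed=1):
    """Draw spells with two competing exponential risks (A, B).

    Spells still running at `horizon` are administratively censored.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    spells = []
    for _ in range(n):
        t_a = rng.expovariate(lam_a)  # latent time to outcome A
        t_b = rng.expovariate(lam_b)  # latent time to outcome B
        t = min(t_a, t_b, horizon)
        if t == horizon:
            cause = "censored"
        elif t_a < t_b:
            cause = "A"
        else:
            cause = "B"
        spells.append((t, cause))
    return spells


def rate_a_only(spells):
    """Rate of A using only spells that ended in A (all others dropped)."""
    durations = [t for t, cause in spells if cause == "A"]
    return len(durations) / sum(durations)


def rate_cause_specific(spells):
    """Rate of A using every spell's time at risk; non-A spells are censored."""
    events = sum(1 for _, cause in spells if cause == "A")
    exposure = sum(t for t, _ in spells)
    return events / exposure


spells = simulate()
print("A-only sample rate:    ", round(rate_a_only(spells), 4))
print("All spells, censored:  ", round(rate_cause_specific(spells), 4))
```

Under these assumptions the A-only rate comes out well above the simulated rate of A, because dropping the other spells discards their time at risk. This says nothing about how large the distortion is for covariate effects in a Cox model, only that the sample restriction is not innocuous in this simple setup.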

Is anyone aware of a better way to deal with this problem, or of literature on the potential biases from excluding observations with unknown spell endings? Thanks for your support!


Stefan Wagner
Institut für Innovationsforschung, Technologiemanagement und Entrepreneurship
Ludwig-Maximilians-Universität München
Kaulbachstr. 45/III
80539 München
Tel.: ++49/89/2180-2877
Fax: ++49/89/2180-6284
[hidden email]

Re: survival analysis with unknown censoring

This post has NOT been accepted by the mailing list yet.
Dear Mr. Wagner,

All dates in my data are known (1988-2008). However, some firms have no failures in the period of analysis, while other firms have just one observation and fail.
My doubt concerns the duration of the observations that do not fail. In Stata, the failure variable has no value for firms without a failure. However, when I run stset time, failure(dep.var) id(idfirm), the message "PROBABLE ERROR" appears.
As my goal is to study second entry into the market, firms that have just one observation and did not fail have no value assigned, while firms whose first event is a failure have the value "zero". If the second event is a failure, it has the value "1".

Should I run other commands to avoid these probable errors? Am I using the right values for the circumstances I described?
Thank you very much.
Leo Quadros
Universidad Autonoma de Barcelona