Confidence regions in Cox proportional hazards model with measurement errors and unbounded parameter set

The Cox proportional hazards model with measurement errors is considered. In Kukush and Chernova (2017), we developed a simultaneous estimator of the baseline hazard rate $\lambda(\cdot)$ and the regression parameter $\beta$, with the unbounded parameter set $\varTheta=\varTheta_{\lambda}\times\varTheta_{\beta}$, where $\varTheta_{\lambda}$ is a closed convex subset of $C[0,\tau]$ and $\varTheta_{\beta}$ is a compact set in $\mathbb{R}^m$. The estimator is consistent and asymptotically normal. In the present paper, we construct confidence intervals for integral functionals of $\lambda(\cdot)$ and a confidence region for $\beta$ under restrictions on the error distribution. In particular, we handle the following cases: (a) the measurement error is bounded, (b) it is a normally distributed random vector, and (c) it has independent components which are shifted Poisson random variables.


Introduction
Survival analysis models time to an event of interest (e.g., lifetime). It is a powerful tool in biometrics, epidemiology, engineering, and credit risk assessment in financial institutions. The proportional hazards model proposed in Cox (1972) [3] is a widely used technique to characterize a relation between survival time and covariates.
Our model is presented in Augustin (2004) [1], where the baseline hazard function $\lambda(\cdot)$ is assumed to belong to a parametric space, while we consider $\lambda(\cdot)$ belonging to a closed convex subset of $C[0,\tau]$. In practice, covariates are often contaminated by errors, so we deal with an errors-in-variables model. Kukush et al. (2011) [5] derive a simultaneous estimator of the baseline hazard rate $\lambda(\cdot)$ and the regression parameter $\beta$ and prove the consistency of the estimator. There, the parameter set $\Theta_\lambda$ for the baseline hazard rate is assumed to be bounded and separated away from zero. The asymptotic normality of the estimator is shown in Chimisov and Kukush (2014) [2]. In [7,6] we construct an estimator $(\lambda_n^{(1)}(\cdot), \beta_n^{(1)})$ of $\lambda(\cdot)$ and $\beta$ over the parameter set $\Theta = \Theta_\lambda \times \Theta_\beta$, where $n$ is the sample size and $\Theta_\lambda$ is a subset of $C[0,\tau]$ which is unbounded from above and not separated away from zero. The estimator is consistent and can be modified to be asymptotically normal.
The goal of the present paper is to construct confidence intervals for integral functionals of $\lambda(\cdot)$ and a confidence region for $\beta$ based on the estimators from [7,6]. We impose certain restrictions on the error distribution. Specifically, we handle three cases: (a) the measurement error is bounded, (b) it is a normally distributed random vector, and (c) it has independent components which are shifted Poisson random variables.
The paper is organized as follows. Section 2 describes the observation model, gives main assumptions, defines an estimator under an unbounded parameter set, and states the asymptotic normality result from [7,6]. Sections 3 and 4 present the main results: a confidence region for the regression parameter and confidence intervals for integral functionals of the baseline hazard rate. Section 5 provides a method to compute auxiliary consistent estimates, and Section 6 concludes.
Throughout the paper, all vectors are column ones, E stands for the expectation, Var stands for the variance, and Cov for the covariance matrix. A relation holds eventually if it is valid for all sample sizes n starting from some random number, almost surely.

The model and estimator
Let $T$ denote the lifetime and have the intensity function $\lambda(t \mid X; \lambda_0, \beta_0) = \lambda_0(t)\exp(\beta_0^\top X)$, $t \ge 0$, where $X$ is the vector of covariates and $(\lambda_0, \beta_0)$ are the true parameters. We observe censored data, i.e., instead of $T$ only a censored lifetime $Y := \min\{T, C\}$ and the censorship indicator $\Delta := I_{\{T \le C\}}$ are available, where the censor $C$ is distributed on a given interval $[0,\tau]$. The survival function of the censor is denoted by $G_C$. The conditional survival function of $T$ given $X$ equals $G_T(t \mid X) = \exp\bigl(-e^{\beta_0^\top X}\int_0^t \lambda_0(u)\,du\bigr)$. We deal with an additive error model, which means that instead of $X$, a surrogate variable $W = X + U$ is observed. We suppose that the random error $U$ has a known moment generating function $M_U(z) := \mathsf{E} e^{z^\top U}$, where $\|z\|$ is bounded according to the assumptions stated below. The pair $(T, X)$, the censor $C$, and the measurement error $U$ are stochastically independent. We introduce assumptions from [7,6].
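(i) $\Theta_\lambda \subset C[0,\tau]$ is the following closed convex set of nonnegative functions: $\Theta_\lambda := \{ f : [0,\tau] \to \mathbb{R} \mid f(t) \ge 0 \text{ for all } t \in [0,\tau], \text{ and } |f(t) - f(s)| \le L|t - s| \text{ for all } t, s \in [0,\tau] \}$, where $L > 0$ is a fixed constant.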
(ii) $\Theta_\beta \subset \mathbb{R}^m$ is a compact set.
(vi) The covariance matrix of the random vector $X$ is positive definite.
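The observation scheme described above (censored lifetime, censorship indicator, and surrogate covariate) can be illustrated by a small simulation. This is only a sketch under assumptions that are not part of the model in the text: a constant baseline hazard `lam0`, standard normal covariates, a censor uniform on $[0,\tau]$, and a centered normal measurement error with standard deviation `sigma_u`. The function name and its parameters are hypothetical.

```python
import numpy as np

def simulate_cox_with_error(n, beta0, lam0, tau, sigma_u, rng):
    """Draw n observations (Y, Delta, W) from a Cox model with additive error.

    Illustration-only assumptions: constant baseline hazard lam0,
    standard normal covariates X, censor C uniform on [0, tau],
    and N(0, sigma_u^2 I) measurement error U.
    """
    beta0 = np.asarray(beta0, dtype=float)
    m = beta0.size
    x = rng.standard_normal((n, m))             # true covariates X (not observed)
    rate = lam0 * np.exp(x @ beta0)             # hazard lam0 * exp(beta0' X)
    t = rng.exponential(1.0 / rate)             # lifetime T given X (exponential)
    c = rng.uniform(0.0, tau, size=n)           # censor C on [0, tau]
    y = np.minimum(t, c)                        # censored lifetime Y = min(T, C)
    delta = (t <= c).astype(int)                # indicator I{T <= C}
    u = sigma_u * rng.standard_normal((n, m))   # measurement error U
    w = x + u                                   # surrogate W = X + U
    return y, delta, w

rng = np.random.default_rng(0)
y, delta, w = simulate_cox_with_error(500, [0.5, -0.3], 1.0, 2.0, 0.2, rng)
```

Only `y`, `delta` and `w` would be available to the statistician; `x`, `t` and `c` stay inside the function, mirroring the fact that $X$, $T$ and $C$ are not observed directly.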
Definition 1. Fix a sequence $\{\varepsilon_n\}$ of positive numbers with $\varepsilon_n \downarrow 0$ as $n \to \infty$. The corrected estimator $(\lambda_n^{(1)}, \beta_n^{(1)})$ of $(\lambda, \beta)$ is defined through relation (2).
Theorem 3 from [7,6] proves that under conditions (i) to (vii) the corrected estimator $(\lambda_n^{(1)}, \beta_n^{(1)})$ is a strongly consistent estimator of the true parameters $(\lambda_0, \beta_0)$. In the proof of Theorem 3 from [7,6], it is shown that eventually, and for $R$ large enough, the upper bound on the right-hand side of (2) can be taken over the set with center at the origin and radius $R$. Thus, we assume that this holds for all $n \ge 1$. The corrected estimator is then modified to obtain an estimator which is consistent and asymptotically normal.
Definition 2. The modified corrected estimator is the resulting modification of $(\lambda_n^{(1)}, \beta_n^{(1)})$.
Below we use notation from [2]. For $i = 1, 2, \ldots$, random variables are introduced that depend on $\varphi = (\varphi_\lambda, \varphi_\beta) \in C[0,\tau] \times \mathbb{R}^m$, and $q'$ denotes the Fréchet derivative.

Confidence regions for the regression parameter
Denote by $\mathsf{E}_X[\cdot]$ the conditional expectation given a random variable $X$. Recall that $M_U(z) = \mathsf{E} e^{z^\top U}$. For simplicity of notation, we write $M_{k,\beta}$ instead of $M_U((k+1)\beta)$. Using differentiation in $z$, one can easily prove the following.
Lemma 1. The equalities hold true:
Now, we state conditions on the measurement error $U$ under which one can construct unbiased estimators for $a(t)$, $b(t)$ and $p(t)$, $t \in [0,\tau]$.
Assume condition (6). Then there exist functions $B(\cdot,\cdot)$, $A(\cdot,\cdot)$ and $P(\cdot,\cdot)$ which satisfy the deconvolution equations:
Proof. We find solutions to the equations in the form of series expansions, using the idea from Stefanski (1990) [8].
(a) Utilizing the Taylor expansion of the right-hand side, we obtain a series indexed by $k \ge 0$. Using Lemma 1, we choose the $k$-th term of the solution for each $k \ge 0$. Here no additional restriction on $U$ is needed.
The latter sum is finite due to condition (6). Therefore, there exists a solution to the second equation.
(c) Finally, for the third equation we again take a series solution. Hereafter $\|Q\|$ is the Euclidean norm of a matrix $Q$. The right-hand side of (8) is a sum of four series, each of which can be bounded similarly based on condition (6). Therefore, condition (6) yields (7), and $P(W,t)$ is a solution to the third equation.
(b) For a normally distributed vector $U$ with components $U^{(i)}$, each moment generating function $\mathsf{E} e^{tU^{(i)}}$ is the exponential of a quadratic polynomial in $t$. Then (6) holds true.
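The series construction used in the proof can be illustrated numerically. The sketch below assumes a scalar covariate and a hypothetical right-hand side $g(x) = \exp(\beta x - \Lambda e^{\beta x})$ for the first deconvolution equation (this form is an assumption made for illustration, not taken from the text). Its Taylor expansion is $\sum_{k\ge 0} \frac{(-\Lambda)^k}{k!} e^{(k+1)\beta x}$, and since $\mathsf{E}_X e^{(k+1)\beta W} = M_U((k+1)\beta)\, e^{(k+1)\beta x}$, dividing the $k$-th term by $M_{k,\beta}$ gives a candidate solution. The error is taken to be bounded, with $U = \pm a$ with probability $1/2$ each, so that the conditional expectation is an exact two-point average.

```python
import math

def target(x, beta, Lam):
    """Hypothetical right-hand side: exp(beta*x - Lam*exp(beta*x))."""
    return math.exp(beta * x - Lam * math.exp(beta * x))

def mgf_two_point(z, a):
    """MGF of an error U taking values -a and +a with probability 1/2 each."""
    return math.cosh(a * z)

def corrected_series(w, beta, Lam, a, n_terms):
    """Truncated series sum_{k<=n_terms} (-Lam)^k/k! * exp((k+1)*beta*w) / M_U((k+1)*beta)."""
    total = 0.0
    for k in range(n_terms + 1):
        coeff = (-Lam) ** k / math.factorial(k)
        total += coeff * math.exp((k + 1) * beta * w) / mgf_two_point((k + 1) * beta, a)
    return total

beta, Lam, a, x = 0.5, 0.7, 0.4, 0.3
# E[B(x + U) | X = x] is an exact average over the two error values.
cond_mean = 0.5 * (corrected_series(x + a, beta, Lam, a, 40)
                   + corrected_series(x - a, beta, Lam, a, 40))
print(cond_mean, target(x, beta, Lam))
```

The two printed numbers should agree up to rounding, which is the unbiasedness property the deconvolution equations ask for; the truncation level `n_terms` plays the role of the cut-off discussed later in the paper.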
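Definition 3. The Kaplan-Meier estimator of the survival function of the censor $C$ is defined as $\hat G_C(u) = \prod_{j:\, Y_{(j)} \le u} \bigl(\frac{n-j}{n-j+1}\bigr)^{1-\Delta_{(j)}}$ if $u \le Y_{(n)}$, and $\hat G_C(u) = 0$ otherwise, where $(Y_{(j)}, \Delta_{(j)})$ are the pairs of observations ordered by the values of $Y$, $j \in \{1, \ldots, n\}$, and $Y_{(n)}$ is the largest order statistic.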
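A minimal sketch of the product-limit computation behind Definition 3 is given below. It uses the standard Kaplan-Meier form in which the censoring times ($\Delta = 0$) play the role of events, assumes there are no ties among the $Y_i$, and sets the estimate to zero beyond the largest observation. The function name is hypothetical.

```python
import numpy as np

def km_censor_survival(y, delta, u):
    """Product-limit estimate of G_C(u) = P(C > u) from pairs (Y_i, Delta_i).

    Observations with Delta = 0 (censored lifetimes) are the "events" here.
    Ties among the Y_i are assumed absent (continuous distributions).
    """
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=int)
    n = y.size
    order = np.argsort(y)
    y_sorted, d_sorted = y[order], delta[order]
    if u > y_sorted[-1]:
        return 0.0                                   # convention beyond Y_(n)
    j = np.arange(1, n + 1)                          # ranks 1, ..., n
    factors = ((n - j) / (n - j + 1.0)) ** (1 - d_sorted)
    return float(np.prod(factors[y_sorted <= u]))    # empty product equals 1

y_obs = [3.0, 1.0, 4.0, 2.0]
delta_obs = [1, 1, 0, 0]
print(km_censor_survival(y_obs, delta_obs, 2.5))
```

In this toy sample the ordered indicators are 1, 0, 1, 0, so only the second and fourth ordered observations contribute factors different from one.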
We state the convergence of the Kaplan-Meier estimator. Recall that $Y = \min\{T, C\}$. Let $G_Y(t)$ be the survival function of $Y$.

Theorem 4 ([4]). Assume the following:
(a) the survival functions $G_T$ and $G_C$ are continuous, and
(b) $G_Y(S) > \delta$ for some fixed $0 < S < \infty$ and $0 < \delta < \frac{1}{2}$.
Then relation (9) holds a.s. for all $n \ge 2$.
In our model, the lifetime $T$ has a continuous survival function, and if we assume that the same holds true for the censor $C$, then the first condition of Theorem 4 is satisfied. Next, it holds $G_Y(t) = G_T(t) G_C(t)$, and due to condition (v), for all small enough positive $\varepsilon$ there exists $0 < \delta < \frac{1}{2}$ such that $G_Y(\tau - \varepsilon) > \delta$. Therefore, the second condition holds as well, with $S = \tau - \varepsilon$.
Relation (9) is equivalent to the following: there exists a random variable $C_S(\omega)$ such that the corresponding bound holds a.s. for all $n \ge 2$. We have a decomposition (10) into two summands. Due to the above-stated consistency of $T(\cdot)K(\cdot)$ and since $\hat G_C$ is bounded by 1, the first summand in (10) converges to zero a.s. as $n \to \infty$.
Consider the second summand. Let $S = \tau - \varepsilon$ for some fixed $\varepsilon > 0$. There are two possibilities: $Y_{(n)} \le S$ and $S < Y_{(n)} \le \tau$, and the summand is bounded separately in each case. It holds that $Y_{(n)} \to \tau$ a.s. Utilizing Theorem 4, we first let $n \to \infty$ and then $\varepsilon \to 0$, and obtain convergence of the second summand of (10) to 0 a.s. as $n \to \infty$.

Confidence intervals for the baseline hazard rate
Theorem 1 implies the following statement.
Corollary 1. Let $0 < \varepsilon < \tau$. Assume that the censor $C$ has a bounded pdf on
Here we set $f(\tau)$ .
We show that the asymptotic variance $\sigma_\varphi^2$ is positive and construct its consistent estimator.
(xii) For all nonzero $z \in \mathbb{R}^m$, at least one of the random variables $z^\top X$ and $z^\top U$ is nonatomic.
It holds $\mathsf{P}(\Delta = 0) > 0$ and, according to (x), $C > 0$ a.s. Thus, in order to get a contradiction it is enough to prove that
Since $C$ and $W$ are independent, it holds
where for $x \in (0,\tau]$,
Here $v_x$ is a nonrandom real number. In the latter equality we use assumption (vii) to guarantee that $\int_0^x \lambda_0(u)\,du > 0$. Further, $\varphi_\beta = -A^{-1} m(\varphi_\lambda) \ne 0$ because, according to (xi), $m(\varphi_\lambda) \ne 0$. Using independence of $X$ and $U$ together with assumption (xii), we conclude that for all nonzero $z \in \mathbb{R}^m$, $z^\top W = z^\top X + z^\top U$ is nonatomic. Then $\varphi_\beta^\top W$ is nonatomic as well, and $\pi_x = 0$.

Results of Section 3 yield that $\hat A$ is a consistent estimator of $A$.
Define $\hat\varphi_\lambda$ as a solution in $L_2[0,\tau]$ to the Fredholm integral equation with a degenerate kernel. Eventually, the solution is unique because the limiting equation (12) has a unique solution. The function $\hat\varphi_\lambda$ can be assumed right-continuous, and it converges a.s. to $\varphi_\lambda$ from (12) in the supremum norm. Therefore, $\hat\varphi_\beta$ is a consistent estimator of $\varphi_\beta$.
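An integral equation with a degenerate kernel reduces to a finite linear system, which is what makes the solution computable in practice. The sketch below is generic and rests on assumptions not stated in the text: the equation is taken to be of the second kind, $\varphi(t) = f(t) + \sum_{i=1}^{r} a_i(t) \int_0^\tau b_i(s)\varphi(s)\,ds$, and the integrals are approximated by the trapezoidal rule on a uniform grid. The function name and the symbols $f$, $a_i$, $b_i$ are hypothetical.

```python
import numpy as np

def solve_degenerate_fredholm(f, a_funcs, b_funcs, tau, n_grid=2001):
    """Solve phi(t) = f(t) + sum_i a_i(t) * int_0^tau b_i(s) phi(s) ds on a grid."""
    t = np.linspace(0.0, tau, n_grid)
    h = t[1] - t[0]
    wts = np.full(n_grid, h)
    wts[0] = wts[-1] = h / 2.0                           # trapezoidal weights
    a = np.array([ai(t) for ai in a_funcs])              # shape (r, n_grid)
    b = np.array([bi(t) for bi in b_funcs])              # shape (r, n_grid)
    fv = f(t)
    m = (b * wts) @ a.T                                  # m[j, i] ~ int b_j(s) a_i(s) ds
    g = (b * wts) @ fv                                   # g[j] ~ int b_j(s) f(s) ds
    c = np.linalg.solve(np.eye(len(a_funcs)) - m, g)     # c[i] ~ int b_i(s) phi(s) ds
    return t, fv + c @ a

# Toy check: phi(t) = t + int_0^1 s * phi(s) ds has the solution phi(t) = t + 2/3.
t, phi = solve_degenerate_fredholm(
    f=lambda s: s,
    a_funcs=[lambda s: np.ones_like(s)],
    b_funcs=[lambda s: s],
    tau=1.0,
)
```

The linear system is uniquely solvable exactly when the matrix $I - M$ is nonsingular, which parallels the uniqueness statement for the limiting equation.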
Finally, we construct an estimator of $\sigma_\varphi^2$. Lemma 2 and the consistency of the auxiliary estimators yield the following consistency result.
Theorem 5. Assume that condition (6) together with conditions (i)-(xii) are fulfilled and the censor $C$ has a continuous survival function. Then $\sigma_\varphi^2 > 0$ and the constructed estimator of $\sigma_\varphi^2$ is consistent.
For fixed $\varepsilon > 0$, consider an integral functional of the baseline hazard rate.

Computation of auxiliary estimates
Now, we show that we can truncate the series. Let $\{N_n : n \ge 1\}$ be a strictly increasing sequence of nonrandom positive integers. Fix $t$ for the moment and omit this argument. Consider the head of the series $B(W_i)$ truncated at level $N_n$. Fix $j \ge 1$; then for $n \ge j$ a bound holds whose right-hand side tends to zero as $j \to \infty$. Therefore, the average $\frac{1}{n}\sum_{i=1}^{n}$ of the heads has almost surely the same limit as the average of the full series as $n \to \infty$. Moreover, with probability one the convergence is uniform in $(\lambda, \beta)$ belonging to a compact set. Therefore, it is enough to truncate the series $B(W, t)$ at some large numbers, which makes the computation of the estimators from Section 3 feasible.

Conclusion
At the end of Section 3, we constructed an asymptotic confidence region for the regression parameter $\beta$, and at the end of Section 4, we constructed asymptotic confidence intervals for integral functionals of the baseline hazard rate $\lambda_0(\cdot)$. We imposed some restrictions on the error distribution. In particular, we handled the following cases: (a) the measurement error is bounded, (b) it is normally distributed, and (c) it has independent components which are shifted Poisson random variables. Based on truncated series, we showed a way to compute the auxiliary estimates which are used in the construction of the confidence sets. In the future we intend to develop a method to construct confidence regions in the case of heavy-tailed measurement errors.