1 Introduction
In medical, biological and sociological studies the investigated population is frequently a mixture of subpopulations (components of the mixture) with different distributions of the observed variables. If the subpopulation to which a subject belongs is not known exactly, the distribution of its variables is a mixture of the subpopulations’ distributions. In the classical finite mixture model (FMM) the concentrations of the components in the mixture (mixing probabilities) are the same for all observations. See [11, 9] and [14] for results on parametric estimation under the FMM. In the more flexible mixture with varying concentrations (MVC) model the concentrations are different for different observations. See [7, 8] for the theory of nonparametric estimation in these models and their application to DNA-microchip data, and [10] for an application of the MVC model to the analysis of neurological data.
Regression models are usually applied to describe the dependence between different numerical variables of a subject. In the case of a homogeneous sample there exist many nonparametric estimators of the regression function, such as the Nadaraya–Watson estimator (NWE) and the local linear regression estimator (LLRE) [4]. A modification of the NWE (mNWE) for the estimation of the regression function of an MVC component is presented in [2], which also contains the derivation of asymptotic normality for the mNWE.
It is well known that for homogeneous samples the NWE exhibits an unacceptably large bias at points where the probability density function (PDF) of the regressor has a discontinuity (jump points). The bias of the LLRE at such points is significantly smaller [3]. A modification of the LLRE for the MVC model (mLLRE) was considered in [5]. The consistency of the mLLRE was shown in [6], where its performance was also compared to the mNWE by simulations.
In this paper we continue the study of the asymptotic behavior of the mLLRE at jump points and at continuity points of the regressor’s PDF. It is shown that under suitable assumptions the mLLRE is asymptotically normal at jump points as well as at continuity points of the regressor distribution. This result allows one to calculate the theoretically optimal bandwidth for the mLLRE which minimizes the asymptotic mean squared error.
Semiparametric models similar to the one considered in this paper were discussed in [15, 12] and [13]. In these papers versions of the EM algorithm are used to estimate the regression functions of the mixture components. Since the EM algorithm for mixtures is based on iteratively reweighted likelihood maximization, the authors need a parametric model for the error term in the regression model to construct their estimators. In contrast to the EM technique, the approach of this paper is nonparametric both in the regression function and in the distribution of the errors.
The rest of the paper is organized as follows. In Section 2 the mixture of regressions model is described and the definition of the mLLRE is recalled. Section 3 contains the main result on the asymptotic normality of the estimator. In Section 4 optimal bandwidth selection for the mLLRE is discussed. The proof of the main result is presented in Section 5. Simulations for the mLLRE are provided in Section 6. Concluding remarks are placed in Section 7.
2 Mixture of regressions and the locally linear estimator
2.1 Mixture of regressions
Consider a sample of n subjects ${\{{O_{j}}\}_{j=1}^{n}}$. Each subject ${O_{j}}$ belongs to one of the M subpopulations (components of the mixture). For each $j=\overline{1,n}$ the component which contains ${O_{j}}$ is unknown. The numerical index of this component is denoted by ${\kappa _{j}}=\kappa ({O_{j}})$, $1\le {\kappa _{j}}\le M$; it is a latent (unobserved) random variable, yet the distributions of ${\kappa _{j}}$ are assumed to be known. The probabilities
\[ {p_{j:n}^{(m)}}=\mathbf{P}\{{\kappa _{j}}=m\},\hspace{1em}m=\overline{1,M},\hspace{2.5pt}j=\overline{1,n},\]
are called concentrations of the components of the mixture or mixing probabilities.
For each subject ${O_{j}}$ one observes a bivariate vector of numerical variables ${\xi _{j}}=({X_{j}},{Y_{j}})$, where ${X_{j}}=X({O_{j}})$ and ${Y_{j}}=Y({O_{j}})$ are the regressor and the response, respectively. The distribution of these variables is described by the regression model
(1)
\[ {Y_{j}}={g^{({\kappa _{j}})}}({X_{j}})+{\varepsilon _{j}},\hspace{1em}j=\overline{1,n},\]
where ${g^{(k)}}$ is an unknown regression function of the k-th component of the mixture and ${\varepsilon _{j}}=\varepsilon ({O_{j}})$ is a random error term. It is assumed that the vectors ${\{({X_{j}},{Y_{j}})\}_{j=1}^{n}}$ are mutually independent for any fixed $n\ge 1$, and for all $j=\overline{1,n}$, ${X_{j}}$ and ${\varepsilon _{j}}$ are conditionally independent given $\{{\kappa _{j}}=m\}$, $m=\overline{1,M}$.
For all $k=\overline{1,M}$ the conditional distribution of ${X_{j}}\mid \{{\kappa _{j}}=k\}$ has a Lebesgue density ${f^{(k)}}$, which does not depend on j. We assume that the distributions of the errors ${\varepsilon _{j}}$ satisfy the following conditions:
\[ \mathbf{E}[{\varepsilon _{j}}\mid {\kappa _{j}}=k]=0,\hspace{1em}\mathbf{E}[{\varepsilon _{j}^{2}}\mid {\kappa _{j}}=k]={\sigma _{(k)}^{2}}\lt \infty ,\hspace{1em}k=\overline{1,M}.\]
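To make the sampling scheme behind model (1) concrete, the following minimal Python sketch generates a sample from a mixture of regressions with varying concentrations. It is an illustration only: the regression functions, regressor densities and error laws passed to it are placeholders supplied by the user, not quantities taken from this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mvc_regression(p, g_list, sample_x, sample_eps):
    """Draw {(X_j, Y_j)} from the mixture-of-regressions model (1).

    p          : (n, M) array of known concentrations p_{j:n}^{(m)} (rows sum to 1)
    g_list     : list of M regression functions g^{(1)}, ..., g^{(M)} (placeholders)
    sample_x   : sample_x(k, size) draws regressors from the density f^{(k)}
    sample_eps : sample_eps(k, size) draws zero-mean errors for component k
    """
    n, M = p.shape
    # latent component labels kappa_j (unobserved in practice)
    kappa = np.array([rng.choice(M, p=p[j]) for j in range(n)])
    X = np.empty(n)
    eps = np.empty(n)
    for k in range(M):
        idx = np.flatnonzero(kappa == k)
        X[idx] = sample_x(k, idx.size)
        eps[idx] = sample_eps(k, idx.size)
    Y = np.array([g_list[k](x) for k, x in zip(kappa, X)]) + eps
    return X, Y
```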
2.2 Minimax weights
In this paper we consider a modified locally linear estimator for ${g^{(m)}}$ at a fixed point ${x_{0}}\in \mathbb{R}$ introduced in [5]. This estimator utilizes minimax weights for the estimation of component distributions (see [7]). Let us recall the construction of these weights.
In what follows the angle brackets denote averaging of a vector:
\[ {\left\langle \mathbf{v}\right\rangle _{n}}=\frac{1}{n}{\sum \limits_{j=1}^{n}}{v_{j}},\hspace{1em}\text{for any}\hspace{2.5pt}\mathbf{v}={({v_{1}},\dots ,{v_{n}})^{T}}\in {\mathbb{R}^{n}}.\]
Arithmetic operations with vectors inside the angle brackets are performed entry-wise:
\[ {\left\langle \mathbf{v}\mathbf{u}\right\rangle _{n}}=\frac{1}{n}{\sum \limits_{j=1}^{n}}{v_{j}}{u_{j}}.\]
Consider the set of concentration vectors ${\mathbf{p}^{(m)}}={({p_{1:n}^{(m)}},\dots ,{p_{n:n}^{(m)}})^{T}}$, $m=\overline{1,M}$. Observe that ${\left\langle \mathbf{u}\mathbf{v}\right\rangle _{n}}$ can be considered as an inner product of $\mathbf{u},\mathbf{v}\in {\mathbb{R}^{n}}$. Assuming that the concentration vectors ${\{{\mathbf{p}^{(m)}}\}_{m=1}^{M}}$ are linearly independent, the Gram matrix ${\Gamma _{n}}={\left({\left\langle {\mathbf{p}^{(k)}}{\mathbf{p}^{(l)}}\right\rangle _{n}}\right)_{k,l=1}^{M}}$ is invertible. The weighting coefficients ${a_{j:n}^{(m)}}$ defined by the formula
(2)
\[ {a_{j:n}^{(m)}}=\frac{1}{\det {\Gamma _{n}}}{\sum \limits_{k=1}^{M}}{(-1)^{m+k}}{\gamma _{km}}{p_{j:n}^{(k)}},\]
where ${\gamma _{km}}$ is the $(k,m)$-th minor of ${\Gamma _{n}}$, are called minimax weighting coefficients. These weights can also be obtained by the formula
\[ ({a_{j:n}^{(1)}},\dots ,{a_{j:n}^{(M)}})=({p_{j:n}^{(1)}},\dots ,{p_{j:n}^{(M)}}){\Gamma _{n}^{-1}}.\]
The vector of minimax coefficients for the m-th component will be denoted by ${\mathbf{a}^{(m)}}={({a_{1:n}^{(m)}},\dots ,{a_{n:n}^{(m)}})^{T}}$. Observe that
\[ {\left\langle {\mathbf{a}^{(m)}}{\mathbf{p}^{(k)}}\right\rangle _{n}}=\mathbf{1}\{m=k\},\hspace{1em}m,k=\overline{1,M}.\]
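A minimal numerical sketch of the minimax weights (not code from the paper) follows; it uses the matrix identity above and, as a quick check, verifies the property $\langle \mathbf{a}^{(m)}\mathbf{p}^{(k)}\rangle_{n}=\mathbf{1}\{m=k\}$ on the two-component design used later in Section 6.

```python
import numpy as np

def minimax_weights(P):
    """Columns of the returned (n, M) array are the weight vectors a^{(m)}.

    P is an (n, M) array whose j-th row holds p_{j:n}^{(1)}, ..., p_{j:n}^{(M)};
    the weights are computed as (a_j^{(1)}, ..., a_j^{(M)}) = (p_j^{(1)}, ..., p_j^{(M)}) Gamma_n^{-1}.
    """
    n = P.shape[0]
    Gamma = P.T @ P / n                 # Gram matrix of the concentration vectors
    return P @ np.linalg.inv(Gamma)

n = 1000
j = np.arange(1, n + 1)
P = np.column_stack([j / n, 1 - j / n])  # p^{(1)}, p^{(2)} as in Section 6
A = minimax_weights(P)
print(A.T @ P / n)                       # equals the 2x2 identity matrix
```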
2.3 Construction of an estimator
The modified local linear regression estimator (mLLRE) for ${g^{(m)}}({x_{0}})$ was introduced in [5] as a generalization of local linear regression to the data described by the mixture of regressions model (1). To define it one needs to choose a kernel function $K:\mathbb{R}\to {\mathbb{R}_{+}}$ and a bandwidth $h\gt 0$. For any $p,q\in {\mathbb{Z}_{+}}$ let
(3)
\[ {S_{p,q}^{(m)}}={S_{p,q}^{(m)}}({x_{0}})=\frac{1}{nh}{\sum \limits_{j=1}^{n}}{a_{j:n}^{(m)}}K\left(\frac{{x_{0}}-{X_{j}}}{h}\right){\left(\frac{{x_{0}}-{X_{j}}}{h}\right)^{p}}{Y_{j}^{q}}.\]
Then the mLLRE can be defined as
(4)
\[ {\hat{g}_{n}^{(m)}}({x_{0}})=\frac{{S_{2,0}^{(m)}}{S_{0,1}^{(m)}}-{S_{1,0}^{(m)}}{S_{1,1}^{(m)}}}{{S_{2,0}^{(m)}}{S_{0,0}^{(m)}}-{({S_{1,0}^{(m)}})^{2}}}.\]
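The sketch below (an illustration under the notation above, not the authors' code) shows how the statistics ${S_{p,q}^{(m)}}$ and the resulting estimate can be evaluated; the Epanechnikov kernel is used only as an example.

```python
import numpy as np

def epanechnikov(z):
    return 0.75 * (1.0 - z**2) * (np.abs(z) <= 1.0)

def mllre(x0, X, Y, a_m, h, kernel=epanechnikov):
    """Modified local linear estimate of g^{(m)}(x0) built from the statistics S_{p,q}^{(m)}.

    X, Y : observations, shape (n,);  a_m : minimax weights a_{j:n}^{(m)};  h : bandwidth.
    """
    n = X.shape[0]
    z = (x0 - X) / h
    w = a_m * kernel(z)                  # a_{j:n}^{(m)} K((x0 - X_j)/h)
    S = lambda p, q: np.sum(w * z**p * Y**q) / (n * h)
    num = S(2, 0) * S(0, 1) - S(1, 0) * S(1, 1)
    den = S(2, 0) * S(0, 0) - S(1, 0)**2
    return num / den
```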
3 Asymptotic normality of an estimator
To formulate the result on the asymptotic normality of mLLRE we need some notations and definitions.
The symbol $\stackrel{\text{W}}{\longrightarrow }$ means weak convergence.
In what follows, the one-sided limits of a function $f(x)$ at a point ${x_{0}}$ are denoted by
\[ f({x_{0}}-)=\underset{x\to {x_{0}}-0}{\lim }f(x),\hspace{1em}f({x_{0}}+)=\underset{x\to {x_{0}}+0}{\lim }f(x),\]
assuming that these limits exist. With this notation, we define
\[\begin{aligned}{}{I_{d}^{(k),-}}& ={f^{(k)}}({x_{0}}+){\int _{-\infty }^{0}}{z^{d}}{(K(z))^{2}}dz,\hspace{0.2222em}{I_{d}^{(k),+}}={f^{(k)}}({x_{0}}-){\int _{0}^{+\infty }}{z^{d}}{(K(z))^{2}}dz.\\ {} {I_{d}^{(k)}}& ={I_{d}^{(k),+}}+{I_{d}^{(k),-}},\\ {} {I_{{d_{x}},{d_{y}}}^{(k)}}& =\left\{\begin{array}{l@{\hskip10.0pt}l}{I_{{d_{x}}}^{(k)}},\hspace{1em}& {d_{y}}=0,\\ {} ({I_{{d_{x}}}^{(k),+}}{g^{(k)}}({x_{0}}-)+{I_{{d_{x}}}^{(k),-}}{g^{(k)}}({x_{0}}+)),\hspace{1em}& {d_{y}}=1,\\ {} ({I_{{d_{x}}}^{(k),+}}({({g^{(k)}}({x_{0}}-))^{2}}+{\sigma _{(k)}^{2}})+{I_{{d_{x}}}^{(k),-}}({({g^{(k)}}({x_{0}}+))^{2}}+{\sigma _{(k)}^{2}})),\hspace{1em}& {d_{y}}=2,\end{array}\right.\\ {} {\Sigma _{{d_{x}}:{d_{y}}}^{(m)}}& ={\sum \limits_{k=1}^{M}}\left\langle {({\mathbf{a}^{(m)}})^{2}}{\mathbf{p}^{(k)}}\right\rangle {I_{{d_{x}},{d_{y}}}^{(k)}},\end{aligned}\]
\[ {\Sigma ^{(m)}}=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}{\Sigma _{0:0}^{(m)}}& {\Sigma _{0:1}^{(m)}}& {\Sigma _{1:0}^{(m)}}& {\Sigma _{1:1}^{(m)}}& {\Sigma _{2:0}^{(m)}}\\ {} {\Sigma _{0:1}^{(m)}}& {\Sigma _{0:2}^{(m)}}& {\Sigma _{1:1}^{(m)}}& {\Sigma _{1:2}^{(m)}}& {\Sigma _{2:1}^{(m)}}\\ {} {\Sigma _{1:0}^{(m)}}& {\Sigma _{1:1}^{(m)}}& {\Sigma _{2:0}^{(m)}}& {\Sigma _{2:1}^{(m)}}& {\Sigma _{3:0}^{(m)}}\\ {} {\Sigma _{1:1}^{(m)}}& {\Sigma _{1:2}^{(m)}}& {\Sigma _{2:1}^{(m)}}& {\Sigma _{2:2}^{(m)}}& {\Sigma _{3:1}^{(m)}}\\ {} {\Sigma _{2:0}^{(m)}}& {\Sigma _{2:1}^{(m)}}& {\Sigma _{3:0}^{(m)}}& {\Sigma _{3:1}^{(m)}}& {\Sigma _{4:0}^{(m)}}\end{array}\right),\]
\[\begin{aligned}{}{u_{p}^{-}}& ={\underset{-\infty }{\overset{0}{\int }}}{z^{p}}K(z)dz,\hspace{1em}{u_{p}^{+}}={\underset{0}{\overset{+\infty }{\int }}}{z^{p}}K(z)dz,\\ {} {u_{p}}& ={u_{p}^{-}}+{u_{p}^{+}},\\ {} {e_{{p_{x}},{p_{y}}}^{(m)}}& ={({g^{(m)}}({x_{0}}))^{{p_{y}}}}\cdot ({f^{(m)}}({x_{0}}-){u_{{p_{x}}}^{+}}+{f^{(m)}}({x_{0}}+){u_{{p_{x}}}^{-}}),\hspace{1em}{p_{y}}\in \{0,1\}.\end{aligned}\]
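For a concrete kernel the one-sided constants above are easy to evaluate numerically. The sketch below is only an illustration (with the Epanechnikov kernel of Section 6 as the default); it computes ${u_{p}^{\pm }}$ and ${I_{d}^{(k),\pm }}$ from user-supplied one-sided limits of ${f^{(k)}}$.

```python
from scipy.integrate import quad

def epanechnikov(z):
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0

def u_pm(p, K=epanechnikov, A=1.0):
    """One-sided kernel moments u_p^- and u_p^+ (kernel vanishing outside [-A, A])."""
    u_minus = quad(lambda z: z**p * K(z), -A, 0.0)[0]
    u_plus = quad(lambda z: z**p * K(z), 0.0, A)[0]
    return u_minus, u_plus

def I_pm(d, f_left, f_right, K=epanechnikov, A=1.0):
    """I_d^{(k),-} and I_d^{(k),+}; f_left = f^{(k)}(x0-), f_right = f^{(k)}(x0+)."""
    I_minus = f_right * quad(lambda z: z**d * K(z)**2, -A, 0.0)[0]
    I_plus = f_left * quad(lambda z: z**d * K(z)**2, 0.0, A)[0]
    return I_minus, I_plus
```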
Now we are ready to formulate our main result on the asymptotic behavior of mLLRE.
Theorem 1.
Assume that the following conditions hold.
1. For all $k=\overline{1,M}$, there exist ${f^{(k)}}({x_{0}}\pm )$, ${g^{(k)}}({x_{0}}\pm )$.
2. ${g^{(m)}}$ is twice continuously differentiable in some neighbourhood B of ${x_{0}}$.
3. For all $k,{k_{1}},{k_{2}}=\overline{1,M}$ the limits\[\begin{aligned}{}\left\langle {({\mathbf{a}^{(m)}})^{2}}{\mathbf{p}^{(k)}}\right\rangle & :=\underset{n\to +\infty }{\lim }{\left\langle {({\mathbf{a}^{(m)}})^{2}}{\mathbf{p}^{(k)}}\right\rangle _{n}},\\ {} \left\langle {\mathbf{a}^{(m)}}{\mathbf{p}^{({k_{1}})}}{\mathbf{p}^{({k_{2}})}}\right\rangle & :=\underset{n\to \infty }{\lim }{\left\langle {\mathbf{a}^{(m)}}{\mathbf{p}^{({k_{1}})}}{\mathbf{p}^{({k_{2}})}}\right\rangle _{n}}\end{aligned}\]exist and are finite.
4. There exists $\underset{n\to \infty }{\lim }{\Gamma _{n}}=\Gamma $, where ${\Gamma _{n}}={({\left\langle {\mathbf{p}^{({k_{1}})}}{\mathbf{p}^{({k_{2}})}}\right\rangle _{n}})_{{k_{1}},{k_{2}}=1}^{M}}$.
5. $h={h_{n}}=H{n^{-1/5}}$.
6. For some $A\gt 0$ and all z such that $|z|\gt A$, $K(z)=0$.
7. Integrals ${\textstyle\int _{-\infty }^{\infty }}|z|K(z)dz$ and ${\textstyle\int _{-\infty }^{\infty }}{z^{4}}{(K(z))^{2}}dz$ are finite.
8. ${f^{(k)}}(x)$, ${g^{(k)}}(x)$ are bounded for $x\in B$ for all $k=\overline{1,M}$.
9. ${e_{2,0}^{(m)}}{e_{0,0}^{(m)}}-{({e_{1,0}^{(m)}})^{2}}\ne 0$.
10. $\mathbf{E}\left[{\varepsilon _{j}^{4}}\mid {\kappa _{j}}=k\right]\lt \infty $ for all $k=\overline{1,M}$.
Then
(5)
\[ {n^{2/5}}({\hat{g}_{n}^{(m)}}({x_{0}})-{g^{(m)}}({x_{0}}))\stackrel{\text{W}}{\longrightarrow }N({\mu ^{(m)}}({x_{0}}),{S_{(m)}^{2}}({x_{0}})),\]
where ${\mu ^{(m)}}({x_{0}})$ and ${S_{(m)}^{2}}({x_{0}})$ are defined by
\[\begin{aligned}{}{\mu ^{(m)}}({x_{0}})& ={H^{2}}\cdot \frac{{\ddot{g}^{(m)}}({x_{0}})}{2}\cdot \frac{{({e_{2,0}^{(m)}})^{2}}-{e_{1,0}^{(m)}}{e_{3,0}^{(m)}}}{{e_{2,0}^{(m)}}{e_{0,0}^{(m)}}-{({e_{1,0}^{(m)}})^{2}}},\\ {} {S_{(m)}^{2}}({x_{0}})& =\frac{1}{H}({({g^{(m)}}({x_{0}}))^{2}}{\tilde{\Sigma }_{0}^{(m)}}-2({g^{(m)}}({x_{0}})){\tilde{\Sigma }_{1}^{(m)}}+{\tilde{\Sigma }_{2}^{(m)}}),\\ {} {\tilde{\Sigma }_{k}^{(m)}}& ={\tilde{e}_{2,2}^{(m)}}{\Sigma _{0:k}^{(m)}}-2{\tilde{e}_{1,2}^{(m)}}{\Sigma _{1:k}^{(m)}}+{\tilde{e}_{1,1}^{(m)}}{\Sigma _{2:k}^{(m)}},\\ {} {\tilde{e}_{p,q}^{(m)}}& =\frac{{e_{p,0}^{(m)}}{e_{q,0}^{(m)}}}{{({e_{2,0}^{(m)}}{e_{0,0}^{(m)}}-{({e_{1,0}^{(m)}})^{2}})^{2}}}.\end{aligned}\]
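To keep the bookkeeping of Theorem 1 transparent, the following helper (an illustration, not part of the paper) assembles ${\mu ^{(m)}}({x_{0}})$ and ${S_{(m)}^{2}}({x_{0}})$ from the limit constants defined above; all inputs are assumed to have been computed or estimated elsewhere.

```python
def asymptotic_mu_S2(H, g_x0, ddg_x0, e, Sigma):
    """mu^{(m)}(x0) and S^2_{(m)}(x0) of Theorem 1.

    e     : dict with e[p] = e_{p,0}^{(m)} for p = 0, 1, 2, 3
    Sigma : dict with Sigma[(d, k)] = Sigma_{d:k}^{(m)} for d = 0, 1, 2 and k = 0, 1, 2
    """
    den = e[2] * e[0] - e[1] ** 2
    mu = H**2 * 0.5 * ddg_x0 * (e[2] ** 2 - e[1] * e[3]) / den

    e_tilde = lambda p, q: e[p] * e[q] / den**2
    Sigma_tilde = lambda k: (e_tilde(2, 2) * Sigma[(0, k)]
                             - 2.0 * e_tilde(1, 2) * Sigma[(1, k)]
                             + e_tilde(1, 1) * Sigma[(2, k)])

    S2 = (g_x0**2 * Sigma_tilde(0) - 2.0 * g_x0 * Sigma_tilde(1) + Sigma_tilde(2)) / H
    return mu, S2
```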
4 Optimal bandwidth selection
The mLLRE ${\hat{g}_{n}^{(m)}}({x_{0}})$ defined by (4) depends on the bandwidth h, a tuning parameter that must be selected by the researcher to obtain an accurate estimator. The accuracy of ${\hat{g}_{n}^{(m)}}({x_{0}})$ is usually measured by the mean squared error
\[ \text{MSE}({\hat{g}_{n}^{(m)}}({x_{0}}))=\mathbf{E}[{({\hat{g}_{n}^{(m)}}({x_{0}})-{g^{(m)}}({x_{0}}))^{2}}]=\mathbf{Var}[{\hat{g}_{n}^{(m)}}({x_{0}})]+{(\mathbf{bias}({\hat{g}_{n}^{(m)}}({x_{0}})))^{2}},\]
where $\mathbf{bias}({\hat{g}_{n}^{(m)}}({x_{0}}))=\mathbf{E}[{\hat{g}_{n}^{(m)}}({x_{0}})]-{g^{(m)}}({x_{0}})$ is the estimator’s bias. In Theorem 1 we considered the choice $h={h_{n}}=H{n^{-1/5}}$, where H is some fixed constant. This rate of decay of the bandwidth as $n\to \infty $ is optimal: if ${h_{n}}$ vanishes more slowly, the estimator has an unacceptably large bias, while if ${h_{n}}$ decays faster, its variance becomes too large. So we need to choose the best constant H. By Theorem 1, ${n^{2/5}}({\hat{g}_{n}^{(m)}}({x_{0}})-{g^{(m)}}({x_{0}}))$ converges weakly to $\eta \sim N({\mu ^{(m)}}({x_{0}}),{S_{(m)}^{2}}({x_{0}}))$, so we will measure the asymptotic accuracy of ${\hat{g}_{n}^{(m)}}({x_{0}})$ by the asymptotic MSE (aMSE):
\[\begin{aligned}{}\text{aMSE}(H)& =\mathbf{E}[{\eta ^{2}}]={\left({\mu ^{(m)}}({x_{0}})\right)^{2}}+{S_{(m)}^{2}}({x_{0}})={H^{4}}\cdot {E_{(m)}^{2}}+\frac{1}{H}\cdot {V_{(m)}},\end{aligned}\]
where
\[\begin{aligned}{}{E_{(m)}}& =\frac{{\ddot{g}^{(m)}}({x_{0}})}{2}\cdot \frac{{({e_{2,0}^{(m)}})^{2}}-{e_{1,0}^{(m)}}{e_{3,0}^{(m)}}}{{e_{2,0}^{(m)}}{e_{0,0}^{(m)}}-{({e_{1,0}^{(m)}})^{2}}},\\ {} {V_{(m)}}& ={({g^{(m)}}({x_{0}}))^{2}}{\tilde{\Sigma }_{0}^{(m)}}-2({g^{(m)}}({x_{0}})){\tilde{\Sigma }_{1}^{(m)}}+{\tilde{\Sigma }_{2}^{(m)}}.\end{aligned}\]
The optimal bandwidth constant, which minimizes the aMSE, is
\[ {H_{\ast }^{(m)}}={\left(\frac{{V_{(m)}}}{4{E_{(m)}^{2}}}\right)^{1/5}}.\]
Observe that ${H_{\ast }^{(m)}}$ cannot be calculated from the data, since it depends on the unknown distributions of the mixture components. So it is an infeasible, theoretically optimal bandwidth constant which can be used for comparison with empirical bandwidth selection rules.
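For the reader's convenience we recall the elementary calculation behind the choice of the rate ${n^{-1/5}}$ and of ${H_{\ast }^{(m)}}$; it is the standard bias-variance balance, written out here rather than quoted from the paper, with ${C_{1}},{C_{2}}$ denoting generic positive constants for the squared bias and the variance:
\[ \text{MSE}\approx {C_{1}}{h^{4}}+\frac{{C_{2}}}{nh},\hspace{1em}\underset{h\gt 0}{\arg \min }\Big({C_{1}}{h^{4}}+\frac{{C_{2}}}{nh}\Big)={\Big(\frac{{C_{2}}}{4{C_{1}}n}\Big)^{1/5}}\propto {n^{-1/5}},\]
\[ \frac{d}{dH}\text{aMSE}(H)=4{H^{3}}{E_{(m)}^{2}}-\frac{{V_{(m)}}}{{H^{2}}}=0\hspace{1em}\Longleftrightarrow \hspace{1em}{H^{5}}=\frac{{V_{(m)}}}{4{E_{(m)}^{2}}},\]
and the second derivative $12{H^{2}}{E_{(m)}^{2}}+2{V_{(m)}}/{H^{3}}\gt 0$ (for ${V_{(m)}}\gt 0$) confirms that ${H_{\ast }^{(m)}}$ is indeed the minimizer.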
5 Proofs
The proof of Theorem 1 is based on two lemmas.
Let
(6)
\[\begin{array}{l}\displaystyle {\mathbf{S}_{n}^{(m)}}={({S_{0,0}^{(m)}},{S_{0,1}^{(m)}},{S_{1,0}^{(m)}},{S_{1,1}^{(m)}},{S_{2,0}^{(m)}})^{T}},\hspace{3.33333pt}{\mathbf{e}_{n}^{(m)}}=\mathbf{E}[{\mathbf{S}_{n}^{(m)}}],\\ {} \displaystyle {\Delta _{n}^{(m)}}=\sqrt{nh}({\mathbf{S}_{n}^{(m)}}-{\mathbf{e}_{n}^{(m)}}).\end{array}\]
Lemma 1.
Under Assumptions 1–4 and 6–10 of Theorem 1, if $h={h_{n}}\to 0$ and $n{h_{n}}\to \infty $ as $n\to \infty $, then
\[ {\Delta _{n}^{(m)}}\stackrel{\text{W}}{\longrightarrow }{\Delta _{\infty }^{(m)}},\hspace{1em}n\to \infty ,\]
where ${\Delta _{\infty }^{(m)}}\sim N(0,{\Sigma ^{(m)}})$.
For any $\mathbf{a}={({a_{0,0}},{a_{0,1}},{a_{1,0}},{a_{1,1}},{a_{2,0}})^{T}}\in {\mathbb{R}^{5}}$ let
\[ \mathbf{U}(\mathbf{a})=\frac{{a_{2,0}}{a_{0,1}}-{a_{1,0}}{a_{1,1}}}{{a_{2,0}}{a_{0,0}}-{({a_{1,0}})^{2}}}.\]
Lemma 2.
Under the assumptions of Theorem 1, if $h={h_{n}}\to 0$ as $n\to \infty $, then
\[ \mathbf{U}({\mathbf{e}_{n}^{(m)}})-{g^{(m)}}({x_{0}})=\frac{{h^{2}}}{2}\cdot {\ddot{g}^{(m)}}({x_{0}})\cdot \frac{{({e_{2,0}^{(m)}})^{2}}-{e_{1,0}^{(m)}}{e_{3,0}^{(m)}}}{{e_{2,0}^{(m)}}{e_{0,0}^{(m)}}-{({e_{1,0}^{(m)}})^{2}}}+o({h^{2}}),\hspace{1em}n\to \infty .\]
Proof of Theorem 1.
Consider
(9)
\[ {n^{2/5}}({\hat{g}^{(m)}}({x_{0}})-{g^{(m)}}({x_{0}}))={n^{2/5}}(\mathbf{U}({\mathbf{S}_{n}^{(m)}})-\mathbf{U}({\mathbf{e}_{n}^{(m)}}))+{n^{2/5}}(\mathbf{U}({\mathbf{e}_{n}^{(m)}})-{g^{(m)}}({x_{0}})).\]
Lemma 1 and the continuous mapping theorem (see Theorem 3.1 in [1]) yield
\[ {n^{2/5}}(\mathbf{U}({\mathbf{S}_{n}^{(m)}})-\mathbf{U}({\mathbf{e}_{n}^{(m)}}))\stackrel{\text{W}}{\longrightarrow }\frac{1}{\sqrt{H}}{\dot{\mathbf{U}}^{T}}({\mathbf{e}_{n}^{(m)}}){\Delta _{\infty }^{(m)}},\]
where
\[ \dot{\mathbf{U}}(\mathbf{a})={\left(\frac{d}{d{a_{0,0}}}\mathbf{U}(\mathbf{a}),\frac{d}{d{a_{0,1}}}\mathbf{U}(\mathbf{a}),\frac{d}{d{a_{1,0}}}\mathbf{U}(\mathbf{a}),\frac{d}{d{a_{1,1}}}\mathbf{U}(\mathbf{a}),\frac{d}{d{a_{2,0}}}\mathbf{U}(\mathbf{a})\right)^{T}}\]
is the gradient of U. Tedious but straightforward algebra yields
(10)
\[ \mathbf{Var}\left[\frac{1}{\sqrt{H}}{\dot{\mathbf{U}}^{T}}({\mathbf{e}_{n}^{(m)}}){\Delta _{\infty }^{(m)}}\right]=\frac{1}{H}\dot{\mathbf{U}}{({\mathbf{e}_{n}^{(m)}})^{T}}{\Sigma ^{(m)}}\dot{\mathbf{U}}({\mathbf{e}_{n}^{(m)}})={S_{(m)}^{2}}({x_{0}}).\]
By Lemma 2 for $h=H{n^{-1/5}}$,
(11)
\[ {n^{2/5}}(\mathbf{U}({\mathbf{e}_{n}^{(m)}})-{g^{(m)}}({x_{0}}))\to \frac{{H^{2}}}{2}\cdot {\ddot{g}^{(m)}}({x_{0}})\cdot \frac{{({e_{2,0}^{(m)}})^{2}}-{e_{1,0}^{(m)}}{e_{3,0}^{(m)}}}{{e_{2,0}^{(m)}}{e_{0,0}^{(m)}}-{({e_{1,0}^{(m)}})^{2}}}.\]
Combining (9)–(11) one obtains the statement of Theorem 1. □
To demonstrate Lemma 1 we need the Lindeberg–Feller central limit theorem.
Lemma 3.
Let, for each $n\ge 1$, ${\eta _{1:n}},\dots ,{\eta _{n:n}}$ be random vectors in ${\mathbb{R}^{d}}$ satisfying the following conditions.
1. ${\eta _{1:n}},\dots ,{\eta _{n:n}}$ are independent.
2. $\mathbf{E}[{\eta _{j:n}}]=0$ for all $j=\overline{1,n}$.
3. ${\textstyle\sum _{j=1}^{n}}\operatorname{\mathbf{Cov}}({\eta _{j:n}})\to \Sigma $ as $n\to \infty $.
4. For some $s\gt 2$, ${\textstyle\sum _{j=1}^{n}}\mathbf{E}[\min (|{\eta _{j:n}}{|^{2}},|{\eta _{j:n}}{|^{s}})]\to 0$ as $n\to \infty $.
Then ${\textstyle\sum _{j=1}^{n}}{\eta _{j:n}}\stackrel{\text{W}}{\longrightarrow }N(0,\Sigma )$ as $n\to \infty $.
For the proof, see [1], Theorem 8.4.1.
Proof of Lemma 1.
To simplify notation, we formally introduce random vectors $({X_{(m)}},{Y_{(m)}},{\varepsilon _{(m)}})$ with the distribution of $({X_{j}},{Y_{j}},{\varepsilon _{j}})$ given ${\kappa _{j}}=m$.
Conditions of Lemma 3 will be verified for $\{{\eta _{j:n}^{(m)}}\}$, where
\[\begin{aligned}{}{\eta _{j:n}^{(m)}}& ={a_{j:n}^{(m)}}\cdot {\tilde{\eta }^{\prime }_{j:n}},\hspace{0.2222em}{\tilde{\eta }^{\prime }_{j:n}}=({\tilde{\eta }_{j:n}}-\mathbf{E}[{\tilde{\eta }_{j:n}}]),\\ {} {\tilde{\eta }_{j:n}}& =\frac{1}{\sqrt{nh}}\cdot K\left(\frac{{x_{0}}-{X_{j}}}{h}\right){\left(1,\hspace{0.2222em}{Y_{j}},\hspace{0.2222em}\left(\frac{{x_{0}}-{X_{j}}}{h}\right),\hspace{0.2222em}\left(\frac{{x_{0}}-{X_{j}}}{h}\right){Y_{j}},\hspace{0.2222em}{\left(\frac{{x_{0}}-{X_{j}}}{h}\right)^{2}}\right)^{T}}\hspace{-0.1667em}\hspace{-0.1667em}.\end{aligned}\]
Similarly we define random variables ${\tilde{\eta }_{(m):n}}$ (${\tilde{\eta }^{\prime }_{(m):n}}$) that have the distribution of ${\tilde{\eta }_{j:n}}$ (${\tilde{\eta }^{\prime }_{j:n}}$) given ${\kappa _{j}}=m$. Obviously, ${\Delta _{n}^{(m)}}={\textstyle\sum _{j=1}^{n}}{\eta _{j:n}^{(m)}}$.
The first condition of Lemma 3 holds since $({X_{j}},{Y_{j}})$ are independent for different j. The second condition follows from the construction of ${\eta _{j:n}^{(m)}}$ (these vectors are centered).
Now we proceed to the third condition of Lemma 3. For some ${p_{x}},{p_{y}},{q_{x}},{q_{y}}$, consider
\[ nh\operatorname{\mathbf{Cov}}({S_{{p_{x}},{p_{y}}}^{(m)}},{S_{{q_{x}},{q_{y}}}^{(m)}})={\Sigma _{{p_{x}},{p_{y}}:{q_{x}},{q_{y}}}^{(m)}}(n)={Q_{1}^{(m)}}(n,h)-{Q_{2}^{(m)}}(n,h),\]
where
\[\begin{aligned}{}{Q_{1}^{(m)}}(n,h)& =\frac{1}{nh}{\sum \limits_{j=1}^{n}}{({a_{j:n}^{(m)}})^{2}}\mathbf{E}\left[{\left(K\left(\frac{{x_{0}}-{X_{j}}}{h}\right)\right)^{2}}{\left(\frac{{x_{0}}-{X_{j}}}{h}\right)^{{p_{x}}+{q_{x}}}}{Y_{j}^{{p_{y}}+{q_{y}}}}\right],\\ {} {Q_{2}^{(m)}}(n,h)& =\frac{1}{nh}{\sum \limits_{j=1}^{n}}{({a_{j:n}^{(m)}})^{2}}\mathbf{E}\left[K\left(\frac{{x_{0}}-{X_{j}}}{h}\right){\left(\frac{{x_{0}}-{X_{j}}}{h}\right)^{{p_{x}}}}{Y_{j}^{{p_{y}}}}\right]\\ {} & \hspace{1em}\times \mathbf{E}\left[K\left(\frac{{x_{0}}-{X_{j}}}{h}\right){\left(\frac{{x_{0}}-{X_{j}}}{h}\right)^{{q_{x}}}}{Y_{j}^{{q_{y}}}}\right].\end{aligned}\]
We will investigate ${Q_{1}^{(m)}}(n,h)$ and ${Q_{2}^{(m)}}(n,h)$ separately. First of all, note that
\[\begin{aligned}{}& {Q_{1}^{(m)}}(n,h)\\ {} & \hspace{1em}=\frac{1}{h}{\sum \limits_{k=1}^{M}}{\left\langle {({\mathbf{a}^{(m)}})^{2}}{\mathbf{p}^{(k)}}\right\rangle _{n}}\mathbf{E}\left[{\left(K\left(\frac{{x_{0}}-{X_{(k)}}}{h}\right)\right)^{2}}{\left(\frac{{x_{0}}-{X_{(k)}}}{h}\right)^{{p_{x}}+{q_{x}}}}{Y_{(k)}^{{p_{y}}+{q_{y}}}}\right].\end{aligned}\]
Consider the expectations in the sum and denote ${d_{x}}={p_{x}}+{q_{x}}$ and ${d_{y}}={p_{y}}+{q_{y}}\le 4$. Then, for all $k=\overline{1,M}$,
\[\begin{aligned}{}& \frac{1}{h}\mathbf{E}\left[{\left(K\left(\frac{{x_{0}}-{X_{(k)}}}{h}\right)\right)^{2}}{\left(\frac{{x_{0}}-{X_{(k)}}}{h}\right)^{{p_{x}}+{q_{x}}}}{Y_{(k)}^{{p_{y}}+{q_{y}}}}\right]\\ {} & \hspace{1em}={\sum \limits_{l=0}^{{d_{y}}}}\left(\genfrac{}{}{0.0pt}{}{{d_{y}}}{l}\right)\cdot \mathbf{E}\left[{\varepsilon _{(k)}^{{d_{y}}-l}}\right]\cdot {\underset{-\infty }{\overset{+\infty }{\int }}}{(K(z))^{2}}{z^{{d_{x}}}}{({g^{(k)}}({x_{0}}-hz))^{l}}{f^{(k)}}({x_{0}}-hz)dz,\end{aligned}\]
where $\left(\genfrac{}{}{0.0pt}{}{n}{k}\right)=n!/(k!(n-k)!)$ is the binomial coefficient. By Assumptions 1, 6 and 7 we obtain
\[\begin{aligned}{}& {\underset{-\infty }{\overset{+\infty }{\int }}}{(K(z))^{2}}{z^{{d_{x}}}}{({g^{(k)}}({x_{0}}-hz))^{l}}{f^{(k)}}({x_{0}}-hz)dz\\ {} & \hspace{1em}\to {({g^{(k)}}({x_{0}}-))^{l}}{I_{{d_{x}}}^{(k),+}}+{({g^{(k)}}({x_{0}}+))^{l}}{I_{{d_{x}}}^{(k),-}}\end{aligned}\]
as $n\to \infty $. So, for ${d_{x}}\in \{0,1,2,3,4\}$ and ${d_{y}}\in \{0,1,2\}$,
\[\begin{aligned}{}& \frac{1}{h}\mathbf{E}\left[{\left(K\left(\frac{{x_{0}}-{X_{(k)}}}{h}\right)\right)^{2}}{\left(\frac{{x_{0}}-{X_{(k)}}}{h}\right)^{{d_{x}}}}{Y_{(k)}^{{d_{y}}}}\right]\to {I_{{d_{x}},{d_{y}}}^{(k)}},\hspace{1em}n\to \infty .\end{aligned}\]
From the assumption $\mathbf{E}[{\varepsilon _{(k)}}]=0$ and Assumption 4, we obtain
\[ {Q_{1}^{(m)}}(n,h)\to {\sum \limits_{k=1}^{M}}\left\langle {({\mathbf{a}^{(m)}})^{2}}{\mathbf{p}^{(k)}}\right\rangle {I_{{d_{x}},{d_{y}}}^{(k)}},\hspace{1em}n\to \infty .\]
Now we will show that ${Q_{2}^{(m)}}(n,h)\to 0$ as $n\to \infty $. Note that
\[\begin{aligned}{}{Q_{2}^{(m)}}(n,h)& =h{\sum \limits_{{k_{1}},{k_{2}}=1}^{M}}{\left\langle {({\mathbf{a}^{(m)}})^{2}}{p^{({k_{1}})}}{p^{({k_{2}})}}\right\rangle _{n}}{Q_{{p_{x}},{p_{y}}}^{({k_{1}})}}(n,h){Q_{{q_{x}},{q_{y}}}^{({k_{2}})}}(n,h),\\ {} {Q_{{p_{x}},{p_{y}}}^{(k)}}(n,h)& ={\underset{-\infty }{\overset{+\infty }{\int }}}{\underset{-\infty }{\overset{+\infty }{\int }}}K(z){z^{{p_{x}}}}{({g^{(k)}}({x_{0}}-hz)+u)^{{p_{y}}}}{f^{(k)}}({x_{0}}-hz)dzd{F_{\varepsilon }^{(k)}}(u),\end{aligned}\]
where ${F_{\varepsilon }^{(k)}}(u)=\mathbf{P}\left({\varepsilon _{(k)}}\lt u\right)$ is the cumulative distribution function of ${\varepsilon _{(k)}}$. The multiple integrals ${Q_{{p_{x}},{p_{y}}}^{(k)}}(n,h)$ are bounded for $n\ge 1$, thus from Assumption 3 we obtain ${Q_{2}^{(m)}}(n,h)\to 0$ as $n\to \infty $.
Combining the asymptotics of ${Q_{1}^{(m)}}(n,h)$ and ${Q_{2}^{(m)}}(n,h)$ as $n\to \infty $, we obtain the asymptotics of the covariances for ${\Delta _{n}^{(m)}}$:
\[ {\Sigma _{{p_{x}},{p_{y}}:{q_{x}},{q_{y}}}^{(m)}}(n)\to {\sum \limits_{k=1}^{M}}\left\langle {({\mathbf{a}^{(m)}})^{2}}{\mathbf{p}^{(k)}}\right\rangle {I_{{p_{x}}+{q_{x}},{p_{y}}+{q_{y}}}^{(k)}}={\Sigma _{{p_{x}}+{q_{x}}:{p_{y}}+{q_{y}}}^{(m)}},\hspace{1em}n\to \infty .\]
The third condition of Lemma 3 holds.
Finally, we will show that the fourth condition of Lemma 3 holds. For some $s\gt 2$, note that
\[\begin{aligned}{}{M_{2}}(s)& ={\sum \limits_{j=1}^{n}}\mathbf{E}\left[\min (|{\eta _{j:n}}{|^{2}},|{\eta _{j:n}}{|^{s}})\right]\\ {} & ={\sum \limits_{j=1}^{n}}{\sum \limits_{k=1}^{M}}{p_{j:n}^{(k)}}\mathbf{E}\left[\min (|{a_{j:n}^{(m)}}{|^{2}}\cdot |{\tilde{\eta }^{\prime }_{(k):n}}{|^{2}},|{a_{j:n}^{(m)}}{|^{s}}\cdot |{\tilde{\eta }^{\prime }_{(k):n}}{|^{s}})\right]\\ {} & \le {C_{\Gamma }}\cdot {\sum \limits_{j=1}^{n}}{\sum \limits_{k=1}^{M}}\mathbf{E}\left[\min (|{\tilde{\eta }^{\prime }_{(k):n}}{|^{2}},|{\tilde{\eta }^{\prime }_{(k):n}}{|^{s}})\right],\end{aligned}\]
since ${p_{j:n}^{(m)}}\le 1$ and $|{a_{j:n}^{(m)}}{|^{2}}\le \max (1,{\sup _{j=\overline{1,n}}}|{a_{j:n}^{(m)}}{|^{s}})={C_{\Gamma }}\lt \infty $. By the inequality
(12)
\[ |\mathbf{a}+\mathbf{b}{|^{s}}\le {2^{s-1}}(|\mathbf{a}{|^{s}}+|\mathbf{b}{|^{s}}),\hspace{1em}\text{for any}\hspace{2.5pt}\mathbf{a},\mathbf{b}\in {\mathbb{R}^{d}},\]
we obtain
\[\begin{aligned}{}& {\sum \limits_{j=1}^{n}}{\sum \limits_{k=1}^{M}}\mathbf{E}\left[\min (|{\tilde{\eta }^{\prime }_{(k):n}}{|^{2}},|{\tilde{\eta }^{\prime }_{(k):n}}{|^{s}})\right]\\ {} & \le {2^{s-1}}\cdot {\sum \limits_{k=1}^{M}}n\cdot \left(\mathbf{E}\left[\min \left(|{\tilde{\eta }_{(k):n}}{|^{2}},|{\tilde{\eta }_{(k):n}}{|^{s}}\right)\right]+\max \left(|\mathbf{E}\left[{\tilde{\eta }_{(k):n}}\right]{|^{2}},|\mathbf{E}\left[{\tilde{\eta }_{(k):n}}\right]{|^{s}}\right)\right).\end{aligned}\]
We will show that, as $n\to \infty $,
(13)
\[ n\cdot \mathbf{E}\left[\min \left(|{\tilde{\eta }_{(k):n}}{|^{2}},|{\tilde{\eta }_{(k):n}}{|^{s}}\right)\right]\to 0\]
and
(14)
\[ n\cdot \max \left(|\mathbf{E}\left[{\tilde{\eta }_{(k):n}}\right]{|^{2}},|\mathbf{E}\left[{\tilde{\eta }_{(k):n}}\right]{|^{s}}\right)\to 0.\]
Let us show (14). Observe that for any $p\ge 2$
(15)
\[ n\cdot |\mathbf{E}[{\tilde{\eta }_{(k):n}}]{|^{p}}\to 0,\hspace{1em}n\to \infty .\]
Really, the left-hand side of (15) can be expressed as follows:
\[ n\cdot |\mathbf{E}[{\tilde{\eta }_{(k):n}}]{|^{p}}=n\cdot {\left({E_{0,0}^{(k)}}+{E_{0,1}^{(k)}}+{E_{1,0}^{(k)}}+{E_{1,1}^{(k)}}+{E_{2,0}^{(k)}}\right)^{p/2}},\]
where ${E_{{p_{x}},{p_{y}}}^{(k)}}={({(nh)^{-1/2}}\cdot \mathbf{E}[K(({x_{0}}-{X_{(k)}})/h){(({x_{0}}-{X_{(k)}})/h)^{{p_{x}}}}{Y_{(k)}^{{p_{y}}}}])^{2}}$.For instance,
\[ {E_{2,0}^{(k)}}\le \frac{h}{n}\cdot {\left({\underset{-\infty }{\overset{+\infty }{\int }}}{z^{2}}K(z){f^{(k)}}({x_{0}}-hz)dz\right)^{2}}\sim \frac{h}{n}\cdot {C_{2,0}^{(m)}},\]
as $n\to \infty $, where
\[ {C_{2,0}^{(m)}}={\left({f^{(k)}}({x_{0}}+){u_{2}^{-}}+{f^{(k)}}({x_{0}}-){u_{2}^{+}}\right)^{2}}.\]
By similar reasoning for the other terms we obtain
(16)
\[ n\cdot |\mathbf{E}[{\tilde{\eta }_{(k):n}}]{|^{p}}\le n\cdot {\left(\frac{h}{n}\right)^{p/2}}\cdot {C^{(k)}},\]
for some ${C^{(k)}}\lt \infty $. Since $p\ge 2$ and $h={h_{n}}\to 0$, the right-hand side of (16) tends to zero as $n\to \infty $, which proves (15) and therefore (14).
To show (13), observe that for any $\tau \gt 0$
(18)
\[ \mathbf{E}\left[\min \left(|{\tilde{\eta }_{(k):n}}{|^{2}},|{\tilde{\eta }_{(k):n}}{|^{s}}\right)\right]\le {Z_{n}}(\tau )+{\tau ^{s-2}}{Z_{n}}(0),\]
where
(19)
\[ {Z_{n}}(\tau )=\mathbf{E}\left[|{\tilde{\eta }_{(k):n}}{|^{2}}\mathbf{1}\{|{\tilde{\eta }_{(k):n}}|\ge \tau \}\right].\]
We will show that ${Z_{n}}(0)$ is bounded and ${Z_{n}}(\tau )\to 0$ as $n\to \infty $ for any $\tau \gt 0$. So, taking τ small enough we can make the right-hand side of (18) as small as desired.
Let
\[ V(z,x,u)={K^{2}}(z)\left(1+{({g^{(k)}}(x)+u)^{2}}+{z^{2}}+{z^{2}}{({g^{(k)}}(x)+u)^{2}}+{z^{4}}\right),\]
so that $|{\tilde{\eta }_{(k):n}}{|^{2}}={(nh)^{-1}}V(({x_{0}}-{X_{(k)}})/h,{X_{(k)}},{\varepsilon _{(k)}})$. Then
\[\begin{aligned}{}& {Z_{n}}(\tau )=\frac{1}{h}\mathbf{E}\left[V\left(\frac{{x_{0}}-{X_{(k)}}}{h},{X_{(k)}},{\varepsilon _{(k)}}\right)\mathbf{1}\left\{V\left(\frac{{x_{0}}-{X_{(k)}}}{h},{X_{(k)}},{\varepsilon _{(k)}}\right)\gt {\tau ^{2}}nh\right\}\right]\\ {} & \hspace{1em}=\frac{1}{h}{\int _{-A}^{A}}\mathbf{E}\left[V\left(\frac{{x_{0}}-x}{h},x,{\varepsilon _{(k)}}\right)\mathbf{1}\left\{V\left(\frac{{x_{0}}-x}{h},x,{\varepsilon _{(k)}}\right)\gt {\tau ^{2}}nh\right\}\right]{f^{(k)}}(x)dx\\ {} & \hspace{1em}={\int _{-A}^{A}}\mathbf{E}\left[V\left(z,{x_{0}}-hz,{\varepsilon _{(k)}}\right)\mathbf{1}\left\{V\left(z,{x_{0}}-hz,{\varepsilon _{(k)}}\right)\gt {\tau ^{2}}nh\right\}\right]{f^{(k)}}({x_{0}}-hz)dz.\end{aligned}\]
By Assumption 8, ${g^{(k)}}$ and ${f^{(k)}}$ are bounded in a neighborhood B of ${x_{0}}$. For n large enough $[{x_{0}}-hA,{x_{0}}+hA]\subseteq B$, so for $-A\le z\le A$ and all $u\in \mathbb{R}$,
(20)
\[ V(z,{x_{0}}-hz,u)\hspace{0.1667em}{f^{(k)}}({x_{0}}-hz)\le \bar{V}(z,u),\]
where
\[\begin{array}{l}\displaystyle \bar{V}(z,u)=\bar{f}{K^{2}}(z)(1+{(\bar{g}+|u|)^{2}}+{z^{2}}+{z^{2}}{(\bar{g}+|u|)^{2}}+{z^{4}}),\\ {} \displaystyle \bar{f}=\underset{x\in B}{\sup }{f^{(k)}}(x),\hspace{1em}\bar{g}=\underset{x\in B}{\sup }|{g^{(k)}}(x)|.\end{array}\]
By Assumptions 7 and 10,
(21)
\[ {\int _{-A}^{A}}\mathbf{E}\left[\bar{V}(z,{\varepsilon _{(k)}})\right]dz={V^{\ast }}\lt \infty .\]
So, by (20), ${Z_{n}}(0)\lt {V^{\ast }}$. Observe that
\[ \mathbf{1}\left\{V(z,{x_{0}}-hz,{\varepsilon _{(k)}})\gt {\tau ^{2}}nh\right\}\to 0\hspace{1em}\text{almost surely}\]
as $n\to \infty $, since $nh\to \infty $ by Assumption 5. So, with (20) and (21) in mind, by the Lebesgue dominated convergence theorem we obtain ${Z_{n}}(\tau )\to 0$ as $n\to \infty $ for any $\tau \gt 0$.
Thus, for any $\delta \gt 0$ we can take $\tau \gt 0$ so small that ${\tau ^{s-2}}{Z_{n}}(0)\lt {\tau ^{s-2}}{V^{\ast }}\lt \delta /2$ and then ${n_{0}}$ so large that ${Z_{n}}(\tau )\lt \delta /2$ for $n\gt {n_{0}}$. By (18) this yields ${M_{2}}(s)\to 0$ as $n\to \infty $. So the fourth condition of Lemma 3 holds. □
Proof of Lemma 2.
Consider ${c_{n}^{(m)}}={e_{2,0:n}^{(m)}}{e_{0,1:n}^{(m)}}-{e_{1,1:n}^{(m)}}{e_{1,0:n}^{(m)}}$ and ${d_{n}^{(m)}}={e_{2,0:n}^{(m)}}{e_{0,0:n}^{(m)}}-{({e_{1,0:n}^{(m)}})^{2}}$, where
(22)
\[ {e_{{p_{x}},{p_{y}}:n}^{(m)}}=\mathbf{E}[{S_{{p_{x}},{p_{y}}}^{(m)}}]={\underset{-\infty }{\overset{+\infty }{\int }}}K(z){z^{{p_{x}}}}{({g^{(m)}}({x_{0}}-hz))^{{p_{y}}}}{f^{(m)}}({x_{0}}-hz)dz.\]
By continuity of U and the convergence ${\mathbf{e}_{n}^{(m)}}\to {\mathbf{e}^{(m)}}$, we get $\mathbf{U}({\mathbf{e}_{n}^{(m)}})\to \mathbf{U}({\mathbf{e}^{(m)}})={g^{(m)}}({x_{0}})$. We will examine the rate of convergence to zero for the difference
(23)
\[ \mathbf{U}({\mathbf{e}_{n}^{(m)}})-{g^{(m)}}({x_{0}})=\frac{{c_{n}^{(m)}}-{d_{n}^{(m)}}{g^{(m)}}({x_{0}})}{{d_{n}^{(m)}}}.\]
From (22) one obtains
(24)
\[ {c_{n}^{(m)}}={\underset{-\infty }{\overset{+\infty }{\int }}}{g^{(m)}}({x_{0}}-hz)K(z)({e_{2,0:n}^{(m)}}-z{e_{1,0:n}^{(m)}}){f^{(m)}}({x_{0}}-hz)dz\]
and
(25)
\[ {d_{n}^{(m)}}={\underset{-\infty }{\overset{+\infty }{\int }}}K(z)({e_{2,0:n}^{(m)}}-z{e_{1,0:n}^{(m)}}){f^{(m)}}({x_{0}}-hz)dz.\]
By Taylor’s expansion for ${g^{(m)}}$ in the neighborhood of ${x_{0}}$, we obtain, as $n\to \infty $,
\[\begin{aligned}{}& {c_{n}^{(m)}}-{d_{n}^{(m)}}{g^{(m)}}({x_{0}})\\ {} & \hspace{1em}={\underset{-\infty }{\overset{+\infty }{\int }}}({g^{(m)}}({x_{0}}-hz)-{g^{(m)}}({x_{0}}))K(z)({e_{2,0:n}^{(m)}}-z{e_{1,0:n}^{(m)}}){f^{(m)}}({x_{0}}-hz)dz\\ {} & \hspace{1em}={\dot{g}^{(m)}}({x_{0}})(-h){\underset{-A}{\overset{A}{\int }}}zK(z)({e_{2,0:n}^{(m)}}-z{e_{1,0:n}^{(m)}}){f^{(m)}}({x_{0}}-hz)dz\\ {} & \hspace{2em}+\frac{{h^{2}}}{2}\cdot {\ddot{g}^{(m)}}({x_{0}}){\underset{-A}{\overset{A}{\int }}}{z^{2}}K(z)({e_{2,0:n}^{(m)}}-z{e_{1,0:n}^{(m)}}){f^{(m)}}({x_{0}}-hz)dz\\ {} & \hspace{2em}+{\underset{-A}{\overset{A}{\int }}}R(hz)K(z)({e_{2,0:n}^{(m)}}-z{e_{1,0:n}^{(m)}}){f^{(m)}}({x_{0}}-hz)dz=:{J_{1:n}^{(m)}}+{J_{2:n}^{(m)}}+{J_{3:n}^{(m)}},\end{aligned}\]
where $R(t)$ is some function such that $|R(t)|/{t^{2}}\to 0$ as $t\to 0$. By (22),
(26)
\[ {J_{1:n}^{(m)}}={\dot{g}^{(m)}}({x_{0}})(-h)({e_{2,0:n}^{(m)}}{e_{1,0:n}^{(m)}}-{e_{1,0:n}^{(m)}}{e_{2,0:n}^{(m)}})=0\]
and
(27)
\[ {J_{2:n}^{(m)}}=\frac{{h^{2}}}{2}\cdot {\ddot{g}^{(m)}}({x_{0}})({({e_{2,0:n}^{(m)}})^{2}}-{e_{1,0:n}^{(m)}}{e_{3,0:n}^{(m)}}).\]
The asymptotics of ${J_{3:n}^{(m)}}$ remains to be examined as $n\to \infty $. Note that
(28)
\[ {J_{3:n}^{(m)}}={e_{2,0:n}^{(m)}}{J_{R,0:n}^{(m)}}-{e_{1,0:n}^{(m)}}{J_{R,1:n}^{(m)}},\hspace{1em}{J_{R,p:n}^{(m)}}={\int _{-A}^{A}}R(hz){z^{p}}K(z){f^{(m)}}({x_{0}}-hz)dz,\]
where $p\in \{0,1\}$. Since ${e_{p,q:n}^{(m)}}\to {e_{p,q}^{(m)}}$, it suffices to investigate the asymptotics of ${J_{R,p:n}^{(m)}}$.
Consider ${J_{R,p:n}^{(m),+}}={\textstyle\int _{0}^{A}}R(hz){z^{p}}K(z){f^{(m)}}({x_{0}}-hz)dz$. For any $\varepsilon \in (0,1)$, there exists $N(\varepsilon )$ such that $|R(hz)|\le \varepsilon {(hz)^{2}}$ for all $|z|\le A$ and $n\ge N(\varepsilon )$. Hence, for $n\ge N(\varepsilon )$,
(29)
\[\begin{aligned}{}& \bigg|{\int _{0}^{A}}R(hz){z^{p}}K(z)dz\bigg|\le \varepsilon {h^{2}}{\int _{0}^{A}}{z^{2+p}}K(z)dz=o({h^{2}}),\hspace{1em}n\to \infty ,\end{aligned}\]
and, by the boundedness of ${f^{(m)}}$ on B, ${J_{R,p:n}^{(m),+}}=o({h^{2}})$ as $n\to \infty $. Similarly
\[ {J_{R,p:n}^{(m),-}}={\int _{-A}^{0}}R(hz){z^{p}}K(z){f^{(m)}}({x_{0}}-hz)dz=o({h^{2}}),\hspace{1em}n\to \infty .\]
Thus,
(30)
\[ {J_{R,p:n}^{(m)}}={J_{R,p:n}^{(m),+}}+{J_{R,p:n}^{(m),-}}=o({h^{2}}),\hspace{1em}n\to \infty .\]
From (28) and (30), we get
(31)
\[ {J_{3:n}^{(m)}}=o({h^{2}}),\hspace{1em}n\to \infty .\]
From (26), (27) and (31), we obtain the statement of Lemma 2. □

6 Simulations
6.1 Description of simulations
For simulations we considered a mixture of regressions with $M=2$ components. The concentrations $\{{p_{j:n}^{(m)}}\}$ were defined by
\[ {p_{j:n}^{(1)}}=\frac{j}{n},\hspace{1em}{p_{j:n}^{(2)}}=1-\frac{j}{n},\hspace{1em}j=\overline{1,n}.\]
The distribution of the regressor ${X_{j}}$ was the same for both components. Its PDF was
\[ f(t)=\frac{3}{2}\cdot {\mathbf{1}_{(0,1/2]}}(t)+\frac{1}{2}\cdot {\mathbf{1}_{(1/2,1)}}(t),\hspace{1em}t\in \mathbb{R}.\]
The distribution of ${\varepsilon _{j}}$ was different for different experiments. Regression functions were defined as
Estimation was performed at ${x_{0}}=1/2$, which is a discontinuity point of $f(t)$. The simulation procedure was as follows:
1. For each sample size $n\in \{100,500,1000,5000,10000\}$, we generate $B=1000$ copies of ${\{({X_{j}},{Y_{j}})\}_{j=\overline{1,n}}}$ from the described model.
2. In each copy, the modified local linear regression estimator ${\hat{g}_{n}^{(m)}}({x_{0}})$ is computed at ${x_{0}}$ for each $m=\overline{1,M}$.
3. From the array of B values of ${\hat{g}_{n}^{(m)}}({x_{0}})$, we compute the sample bias ${\text{Bias}_{n}^{(m)}}$ and standard deviation ${\text{Std}_{n}^{(m)}}$ of the normalized deviations ${n^{2/5}}({\hat{g}_{n}^{(m)}}({x_{0}})-{g^{(m)}}({x_{0}}))$; the rows $n=\infty $ in the tables below contain the corresponding asymptotic values ${\mu ^{(m)}}({x_{0}})$ and ${S_{(m)}}({x_{0}})$ from Theorem 1.
For ${\hat{g}_{n}^{(m)}}({x_{0}})$ we select an optimal bandwidth $h={H_{\ast }^{(m)}}{n^{-1/5}}$ and the Epanechnikov kernel
\[ K(z)=\frac{3}{4}(1-{z^{2}}){\mathbf{1}_{[-1,1]}}(z).\]
In this scenario, ${H_{\ast }^{(1)}}$ does not exist since ${E_{(1)}}=0$. So, in the experiments, we let ${H^{(1)}}={H_{\ast }^{(2)}}$.
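The following self-contained Python sketch reproduces the structure of the experiment. It is illustrative only: the regression functions, the error scale and the bandwidth constant are placeholders, since their exact values are not restated in this section.

```python
import numpy as np

rng = np.random.default_rng(1)

def epanechnikov(z):
    return 0.75 * (1.0 - z**2) * (np.abs(z) <= 1.0)

def sample_regressor(size):
    """PDF f: 3/2 on (0, 1/2] and 1/2 on (1/2, 1), i.e. mass 3/4 left of 1/2."""
    left = rng.random(size) < 0.75
    return np.where(left, 0.5 * rng.random(size), 0.5 + 0.5 * rng.random(size))

# placeholder regression functions and error sampler (NOT the ones of the paper)
g = [lambda x: 1.0 + 2.0 * x, lambda x: np.sin(3.0 * x)]
sample_eps = lambda size: rng.normal(0.0, 1.0, size)

def mllre(x0, X, Y, a_m, h):
    z = (x0 - X) / h
    w = a_m * epanechnikov(z)
    S = lambda p, q: np.mean(w * z**p * Y**q) / h
    return (S(2, 0) * S(0, 1) - S(1, 0) * S(1, 1)) / (S(2, 0) * S(0, 0) - S(1, 0)**2)

def experiment(n, B=1000, H=0.63, x0=0.5):
    j = np.arange(1, n + 1)
    P = np.column_stack([j / n, 1.0 - j / n])      # concentrations p^{(1)}, p^{(2)}
    A = P @ np.linalg.inv(P.T @ P / n)             # minimax weights, columns a^{(1)}, a^{(2)}
    h = H * n ** (-0.2)
    est = np.empty((B, 2))
    for b in range(B):
        kappa = (rng.random(n) >= P[:, 0]).astype(int)   # P(kappa_j = 0) = p_{j:n}^{(1)}
        X = sample_regressor(n)                          # same regressor law in both components
        Y = np.where(kappa == 0, g[0](X), g[1](X)) + sample_eps(n)
        est[b] = [mllre(x0, X, Y, A[:, m], h) for m in range(2)]
    dev = n ** 0.4 * (est - np.array([g[0](x0), g[1](x0)]))
    return dev.mean(axis=0), dev.std(axis=0)             # normalized bias and std per component

print(experiment(n=1000, B=200))
```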
6.2 Performance of the mLLRE
Experiment 1. In this experiment, we consider ${\varepsilon _{j}}\sim N(0,1.25)$. The results of Experiment 1 for the mLLRE are presented in Table 1.
Table 1.
Computed values of sample bias and standard deviation for each n and m
n | ${\text{Bias}_{n}^{(1)}}$ | ${\text{Std}_{n}^{(1)}}$ | ${\text{Bias}_{n}^{(2)}}$ | ${\text{Std}_{n}^{(2)}}$ |
100 | 0.7239 | 13.021 | −3.5882 | 149.4207 |
500 | 0.2935 | 2.6544 | 1.0857 | 2.692 |
1000 | 0.0979 | 2.5753 | 1.3031 | 2.6607 |
5000 | 0.1256 | 2.6908 | 1.3599 | 2.6578 |
10000 | −0.0103 | 2.5745 | 1.2657 | 2.5809 |
∞ | 0 | 2.6547 | 1.3274 | 2.6547 |
Here ${H_{\ast }^{(2)}}\approx 0.6261$. For large n, the simulation results show agreement with the asymptotic values.
Experiment 2. In this experiment, we consider ${\varepsilon _{j}}\sim {T_{5}}$, where ${T_{5}}$ is Student’s t distribution with 5 degrees of freedom. The results of Experiment 2 for the mLLRE are presented in Table 2.
Here ${H_{\ast }^{(2)}}\approx 0.6575$. These results are also in accordance with the asymptotic calculations.
Table 2.
Computed values of sample bias and standard deviation for each n and m
n | ${\text{Bias}_{n}^{(1)}}$ | ${\text{Std}_{n}^{(1)}}$ | ${\text{Bias}_{n}^{(2)}}$ | ${\text{Std}_{n}^{(2)}}$ |
100 | 0.1082 | 4.4768 | 0.6359 | 6.724 |
500 | 0.1659 | 3.059 | 0.9511 | 3.0376 |
1000 | 0.1577 | 3.0195 | 1.1993 | 3.0792 |
5000 | 0.1386 | 2.9185 | 1.1616 | 3.012 |
10000 | 0.2131 | 2.8123 | 1.2613 | 2.9326 |
∞ | 0 | 2.9282 | 1.4641 | 2.9282 |
Table 3.
Computed values of sample bias and standard deviation for each n and m
n | ${\text{Bias}_{n}^{(1)}}$ | ${\text{Std}_{n}^{(1)}}$ | ${\text{Bias}_{n}^{(2)}}$ | ${\text{Std}_{n}^{(2)}}$ |
100 | 0.0616 | 4.5498 | 0.9148 | 5.3876 |
500 | 0.2661 | 2.8404 | 1.1379 | 2.9482 |
1000 | 0.2239 | 2.7614 | 1.1546 | 2.9566 |
5000 | −0.0964 | 2.6225 | 1.4349 | 2.8296 |
10000 | 0.0435 | 2.626 | 1.3573 | 2.9192 |
∞ | 0 | 2.6938 | 1.4317 | 2.8635 |
Experiment 3. In this experiment, we consider
\[ {\varepsilon _{j}}\mid \{{\kappa _{j}}=m\}\sim \left\{\begin{array}{l@{\hskip10.0pt}l}N(0,1.25),\hspace{1em}& m=1,\\ {} {T_{5}},\hspace{1em}& m=2.\end{array}\right.\]
The results of Experiment 3 for the mLLRE are presented in Table 3. Here ${H_{\ast }^{(2)}}\approx 0.6502$. The simulation results exhibit a pattern similar to that observed in the previous experiments.
7 Conclusions
We examined the asymptotic behavior of the modified local linear regression estimator for the mixture of regressions model and proved that this estimator is asymptotically normal. The obtained rate of convergence of the estimator to the unknown value of the regression function at a given point is the same regardless of whether the density of the regressor has a jump at this point or not.
Based on the proven asymptotic theory, the optimal bandwidth parameter was derived that minimizes the asymptotic mean squared error of the estimator. The derived asymptotic theory was tested in simulation experiments. The results obtained in these experiments are consistent with the theoretical results.
A subject of further research is the development of a theory of optimal choice of the parameters of the modified local linear regression estimator, that is, the bandwidth and the kernel function.