We investigate the fractional Vasicek model described by the stochastic differential equation dXt=(α−βXt)dt+γdBtH, X0=x0, driven by the fractional Brownian motion BH with the known Hurst parameter H∈(1/2,1). We study the maximum likelihood estimators for unknown parameters α and β in the non-ergodic case (when β<0) for arbitrary x0∈R, generalizing the result of Tanaka, Xiao and Yu (2019) for particular x0=α/β, derive their asymptotic distributions and prove their asymptotic independence.

Fractional Brownian motionfractional Vasicek modelmaximum likelihood estimationmoment generating functionasymptotic distributionnon-ergodic process60G2262F1062F12Research Council of Norway274410The second author acknowledges that the present research is carried through within the frame and support of the ToppForsk project nr. 274410 of the Research Council of Norway with title STORM: Stochastics for Time-Space Risk Models. Introduction

The present paper deals with the fractional Vasicek model of the form
dXt=(α−βXt)dt+γdBtH,X0=x0∈R,
where BH is the fractional Brownian motion with the Hurst index H∈(1/2,1). It is a generalization of the classical interest rate model proposed by O. Vasicek [34] in 1977. This generalization enables to study processes with long-range dependence, which arise in financial mathematics and several other areas such as telecommunication networks, investigation of turbulence and image processing. In recent years, many articles on various financial applications of the fractional Vasicek model (1) have appeared, see e.g. [8, 9, 12, 13, 30, 40]). In order to use this model in practice, a theory of parameter estimation is necessary.

Notice that in the particular case α=0, (1) is a so-called fractional Ornstein–Uhlenbeck process, introduced in [7]. The drift parameter estimation for it has been studied since 2002, see the paper [17], where the maximum likelihood estimation was considered. The asymptotic and exact distributions of the maximum likelihood estimator (MLE) were investigated later in [4, 31, 32]. Alternative approaches to the drift parameter estimation were proposed and studied in [3, 6, 14–16, 21]. We refer to the article [28] for a survey on this topic, and to the book [20] for its more detailed presentation.

In the general case, the least squares and ergodic-type estimators of unknown parameters α and β were studied in [27, 38, 39]. The corresponding MLEs of α and β were presented in [25]. Their consistency and asymptotic normality were proved there for the case β>00$]]>. Slightly more general results were proved in [26], where joint asymptotic normality of MLE of the vector parameter (α,β) was established. Recently Tanaka et al. [33] investigated asymptotic behavior of MLEs in the cases β=0 and β<0. However, in the latter case the asymptotic distribution was obtained only under assumption that x0=αβ. The study of the case x0≠αβ requires a different technique and still remains an open problem. The goal of the present paper is to fill in the gap and to derive asymptotic distributions of the MLEs of α and β for arbitrary x0∈R, α∈R and β<0. Moreover, we prove that the MLEs for α and β are asymptotically independent.

The asymptotic behavior of the process X and of the estimators substantially depends on the sign of the parameter β. If β<0, then the process X behaves as OP(e−βT) as T→∞, hence it is non-ergodic. If β>00$]]>, then XT=OP(1), as T→∞, and the process has ergodic properties, see, e.g., [27]. The method for the hypothesis testing of the sign of β was developed in [22].

In this article we restrict ourselves to the case 12<H<1. Our proofs are based on the results of the papers [17] and [26], which are valid only for H∈(12,1) and cannot be immediately extended to the case H∈(0,12). In particular, the integral representation (7) below, which is the starting point for derivation of moment generating functions (MGFs) in Lemmas 1 and 2, holds for H∈(0,12) with different (and more complicated) kernel KH. Therefore, the asymptotic behavior of the MLEs in this case requires a separate study.

The paper is organized as follows. In Section 2 we describe the model and the estimators, and introduce the notation. Section 3 contains the results on distributions and asymptotic behavior of stochastic processes involved into MLEs. In Section 4 we formulate and prove the main results on asymptotic distributions of MLEs. Some auxiliary facts and results concerning modified Bessel functions of the first kind and MGFs related to the normal distribution are collected in the appendices.

Preliminaries

Let (Ω,F,{Ft},P) be a complete probability space with filtration. Let BH={BtH,t≥0} be a fractional Brownian motion on this probability space, that is, a centered Gaussian process with the covariance function
EBtHBsH=12(t2H+s2H−|t−s|2H).
It follows from (2) that E(BtH−BsH)2=|t−s|2H. Hence, there exists a modification of BH, which is δ-Hölder continuous for all δ∈(0,H).

We study the fractional Vasicek model, described by the stochastic differential equation
Xt=x0+∫0t(α−βXs)ds+γBtH,t≥0.
The main goal is to estimate parameters α∈R and β<0 by continuous observations of a trajectory of X on the interval [0,T]. We assume that the parameters γ>00$]]> and H∈(1/2,1) are known. This assumption is natural, because γ and H can be obtained explicitly from the observations by considering realized power variations, see Remark 1 below.

Equation (3) has a unique solution, which is given by
Xt=x0e−βt+αβ(1−e−βt)+γ∫0te−β(t−s)dBsH,t≥0,
where ∫0te−β(t−s)dBsH is a path-wise Riemann–Stieltjes integral. It exists due to [7, Proposition A.1].

Following [18], for 0<s<t≤T we define
κH=2HΓ(3/2−H)Γ(H+1/2),λH=2HΓ(3−2H)Γ(H+1/2)Γ(3/2−H),kH(t,s)=κH−1s1/2−H(t−s)1/2−H,wtH=λH−1t2−2H.
We introduce also three stochastic processes
PH(t)=1γddwtH∫0tkH(t,s)Xsds,QH(t)=1γddwtH∫0tkH(t,s)(α−βXs)ds,St=1γ∫0tkH(t,s)dXs.
Note that by [25, Lemma 4.1]
QH(t)=αγ−βPH(t).
According to [18, Theorem 1], the process S is an (Ft)-semimartingale with the decomposition
St=∫0tQH(s)dwsH+MtH,
where MtH=∫0tkH(t,s)dBsH is a Gaussian martingale with the variance function ⟨MH⟩=wH. The natural filtrations of processes S and X coincide. Moreover, the process X admits the following representation
Xt=∫0tKH(t,s)dSs,
where KH(t,s)=γH(2H−1)∫strH−1/2(r−s)H−3/2dr.

If we observe the whole path {Xt,t∈[0,T]}, then the parameters γ and H can be obtained from observations explicitly in the following way. Let {ti(n)} be a partition of [0,T], such that supi|ti+1(n)−ti(n)|→0, as n→∞. Denote ZT=∫0TkH(T,s)dXs=γST. From (6) it follows that ⟨Z⟩T=γ2wTH a.s. Hence, the parameter γ is calculated as the limit
γ2=(wTH)−1limn∑iZti+1(n)−Zti(n)2a.s.

The Hurst index H can be evaluated as follows:
H=12−12limnlog2∑i=12n−1Xti+1(2n)−2Xti(2n)+Xti−1(2n)2∑i=1n−1Xti+1(n)−2Xti(n)+Xti−1(n)2a.s.,
see, e.g., [20, Sec. 3.1]. There exist several other methods of the Hurst index evaluation based on power variations of X. We refer to the books [5, 20] for further information on this subject.

Applying the analog of the Girsanov formula for a fractional Brownian motion ([18, Theorem 3], see also [19]) and (6), one can obtain the likelihood ratio dPα,β(T)dP0,0(T) for the probability measure Pα,β(T) corresponding to our model and the probability measure P0,0(T) corresponding to the model with zero drift [25]:
dPα,β(T)dP0,0(T)=exp{∫0TQH(t)dSt−12∫0T(QH(t))2dwtH}=expαγST−β∫0TPH(t)dSt−α22γ2wTH+αβγ∫0TPH(t)dwtH−β22∫0T(PH(t))2dwtH.
MLEs of parameters α and β maximize (8) and have the following form [25]:
αˆT=STKT−ITJTwTHKT−JT2γ,βˆT=STJT−wTHITwTHKT−JT2,
where
IT=∫0TPH(t)dSt,JT=∫0TPH(t)dwtH,KT=∫0T(PH(t))2dwtH.
It is worth noting that using the definition of PH(t) one can easily represent JT in the following way
JT=1γ∫0TkH(T,s)Xsds.

Auxiliary results

In this section we find exact and asymptotic distributions of the statistics ST, IT, JT, KT and related random variables and vectors.

We start with the bivariate MGF of the vector (ST,IT). For the case β>00$]]>, it was derived in [26, Lemma 3.3]. However, the same proof is valid for the case β<0. The following result is a reformulation of [26, Lemma 3.3], obtained by applying the formula (44) from Appendix A.

The moment generating function of(ST,IT)equalsm1(α,β)(ξ1,ξ2)=E[exp{ξ1ST+ξ2IT}]=D(α,β)(ξ2)−12exp{18D(α,β)(ξ2)∑i=14Ai(α,β)(ξ1,ξ2)−ξ2T2},whereD(α,β)(ξ2)=(1−ξ22β)2+ξ224β2e−2βT+(ξ2β−ξ222β2)(−β)πT4sinπHe−βT×[I−H(−βT2)IH−1(−βT2)+I1−H(−βT2)IH(−βT2)],A1(α,β)(ξ1,ξ2)=ξ2(c1(αβ)ξ1−c2(αβ)ξ2)(−β)H−1T1−He−3βT2I1−H(−βT2),A2(α,β)(ξ1,ξ2)=(ξ12c3−ξ1ξ2c4(αβ)+ξ22c5(αβ))T2−2He−βT×I1−H(−βT2)IH−1(−βT2),A3(α,β)(ξ1,ξ2)=ξ2(ξ2−2β)c6(αβ)(−β)2H−1Te−βTI1−H(−βT2)I−H(−βT2),A4(α,β)(ξ1,ξ2)=(c1(αβ)ξ1−c2(αβ)ξ2)(ξ2−2β)(−β)H−1×T1−He−βT2I1−H(−βT2),c1(αβ)=(x0−αβ)4ρH,c4(αβ)=(x0−αβ)ρH22H+1Γ(H),c2(αβ)=(x0−αβ)2λH∗22H+1ρH2Γ(1−H),c5(αβ)=(x0−αβ)2λH∗24H−1ρH2Γ(H)Γ(1−H),c3=2Γ(H)Γ(1−H)λH∗,c6(αβ)=(x0−αβ)22λH∗ρH2,λH∗=λH2−2H,ρH=πΓ(3/2−H)γκH.The domain of the functionm1(α,β)equals{(ξ1,ξ2)∈R2:D(α,β)(ξ2)>0}0\}$]]>.

The following lemma gives a joint MGF of (ST,IT,JT,KT).

The moment generating function of(ST,IT,JT,KT)equalsm2(θ1,θ2,θ3,θ4)=E[exp{θ1ST+θ2IT+θ3JT+θ4KT}]=m1(α1,β1)(θ1+α−α1γ,θ2−β+β1)exp{α12−α22γ2wTH},whereα1=−γθ3+αββ2−2θ4,β1=−β2−2θ4.The domain of the functionm2equals(θ1,θ2,θ3,θ4)∈R4:θ4<β2/2,D(α,β)θ2−β−β2−2θ4>0,0\right\},\]]]>whereD(α,β)is defined in (10).

The proof is the same as for [26, Theorem 3.4]. □

Under stated conditions the processSThas the normal asymptotic distribution asT→∞, namelyTH−1/2eβTST→dN((x0−αβ)ρH(−β)H−1/2π,Γ(H)Γ(1−H)2π(−β)λH∗).

We obtain the distribution via MGF. Using Lemma 1 we have
E[exp{θTH−1/2eβTST}]=m1(θTH−1/2eβT,0).
Taking each term of the function m1 separately with ξ1=θTH−1/2eβT and ξ2=0 and applying (45) we obtain that D(ξ2)=1, A1(ξ1,ξ2)=A3(ξ1,ξ2)=0,
A2(ξ1,ξ2)=ξ12c3T2−2He−βTI1−H(−βT2)IH−1(−βT2)=θ2c3TeβTI1−H(−βT2)IH−1(−βT2)→c3π(−β)θ2,asT→∞,
and
A4(ξ1,ξ2)=ξ12c1(αβ)(−β)HT1−He−βT2I1−H(−βT2)=θ2c1(αβ)(−β)HT1/2eβT2I1−H(−βT2)→2c1(αβ)(−β)H−1/2πθ,asT→∞.
Hence
E[exp{θTH−1/2eβTST}]→exp{c38π(−β)θ2+c1(αβ)(−β)H−1/24πθ},
as T→∞. This means that
TH−1/2eβTST→dN(c1(αβ)(−β)H−1/24π,c34π(−β)),asT→∞,
which is equivalent to (15). □

The following result is crucial for the derivation of the joint asymptotic distribution of MLE.

The vector of main components of the MLE has the following behaviorTH−1(ST+βJT−αγwTH)eβT(IT+βKT)e2βTKT→dξηζζ2,asT→∞,where ξ, η, ζ are independent andξ=dN(0,λH−1),η=dN(0,1),ζ=dN((x0−αβ)ρHλH∗(−β)H−12π,14β2sinπH).

We again obtain the stated asymptotic distribution via MGF of the presented vector. It could be easily reduced to already studied MGF. That said, using Lemma 2, we have
E[exp{θ1TH−1(ST+βJT−αγwTH)+θ2eβT(IT+βKT)+θ3e2βTKT}]=m2(θ1TH−1,θ2eβT,θ1βTH−1,θ2βeβT+θ3e2βT)exp{−θ1αγTH−1wTH}=m1(α1(T),β1(T))(θ1TH−1+α−α1(T)γ,θ2eβT−β+β1(T))×exp{α1(T)2−α22γ2wTH−θ1αγTH−1wTH},
where
α1(T)=βγθ1TH−1+αβ−β2−2(θ2βeβT+θ3e2βT),β1(T)=−β2−2(θ2βeβT+θ3e2βT).
Applying the Taylor series expansion, we get as T→∞α1(T)=(γθ1TH−1+α)[1−2(θ2βeβT+θ3β2e2βT)]−1/2=(γθ1TH−1+α)[1+(θ2βeβT+θ3β2e2βT)+O(e2βT)]=α+γθ1TH−1+θ2αβeβT+θ2γθ1βTH−1eβT+O(e2βT)
and
β1(T)=β[1−2(θ21βeβT+θ31β2e2βT)]1/2=β[1−(θ21βeβT+θ31β2e2βT)−12(θ21βeβT+θ31β2e2βT)2+O(e3βT)]=β−θ2eβT+θ22+2θ32(−β)e2βT+O(e3βT).
Note that α1(T)→α and β1(T)→β, as T→∞. Moreover, the arguments of the function m1(α1(T),β1(T)) in (18) have the following asymptotic behavior: ξ1(T):=θ1TH−1+α−α1(T)γ=θ2α−βγeβT+θ1θ2−βTH−1eβT+O(e2βT),ξ2(T):=θ2eβT−β+β1(T)=θ22+2θ32(−β)e2βT+O(e3βT), as T→∞. Further, inserting (22) into (10), and applying the expansion (45) from Appendix A, we obtain
D(α1(T),β1(T))(ξ2(T))=(1−(θ22+2θ3)e2βT4β1(T)(−β))2+(θ22+2θ3)2e4βT16β1(T)2β2e−2β1(T)T+((θ22+2θ3)e2βT2β1(T)(−β)−(θ22+2θ3)2e4βT8β1(T)2β2)(−β1(T))πT4sinπHe−β1(T)T×[I−H(−β1(T)T2)IH−1(−β1(T)T2)+I1−H(−β1(T)T2)IH(−β1(T)T2)]+O(eβT)∼(1−(θ22+2θ3)e2βT4β1(T)(−β))2+(θ22+2θ3)2e4βT16β1(T)2β2e−2β1(T)T+((θ22+2θ3)e2βT2β1(T)(−β)−(θ22+2θ3)2e4βT8β1(T)2β2)12sinπHe−2β1(T)T+O(eβT)→1−θ22+2θ34β2sinπH,asT→∞.
It follows from (21), (22) that
c1(α1(T)β1(T))ξ1(T)−c2(α1(T)β1(T))ξ2(T)∼c1(αβ)θ2α−βγeβT,asT→∞.
Using this relation, (22) and (45), we get from (11) that
A1(α1(T),β1(T))(ξ1(T),ξ2(T))=ξ2(T)(c1(α1(T)β1(T))ξ1(T)−c2(α1(T)β1(T))ξ2(T))×(−β1(T))H−1T1−He−3β1(T)T2I1−H(−β1(T)T2)∼θ22+2θ32(−β)e2βTc1(αβ)α(−β)γθ2eβT(−β1(T))H−1T1−H×e−3β1(T)T2I1−H(−β1(T)T2)∼αθ2(θ22+2θ3)2β2γe3βTc1(αβ)(−β1(T))H−3/21πT1/2−He−2β1(T)T=O(T1/2−HeβT)→0,asT→∞.
It follows from (21), (22) that
ξ1(T)2=θ22α2β2γ2e2βT+O(TH−1e2βT),ξ1(T)ξ2(T)=O(e3βT),ξ2(T)2=O(e4βT),
as T→∞. Therefore, by (12) and (45) we obtain
A2(α1(T),β1(T))(ξ1(T),ξ2(T))=(ξ1(T)2c3−ξ1(T)ξ2(T)c4(α1(T)β1(T))+ξ2(T)2c5(α1(T)β1(T)))×T2−2He−β1(T)TI1−H(−β1(T)T2)IH−1(−β1(T)T2)∼c3α2β2γ2θ22e2βTT2−2He−β1(T)TI1−H(−β1(T)T2)IH−1(−β1(T)T2)∼e2βTc3α2β2γ2θ221(−β)πT1−2He−2β1(T)T=O(T1−2H)→0,asT→∞.

Note that ξ2(T)−2β1(T)→−2β, as T→∞, by (20) and (22). Hence, by (13) and (45),
A3(α1(T),β1(T))(ξ1(T),ξ2(T))=ξ2(T)(ξ2(T)−2β1(T))c6(α1(T)β1(T))×(−β1(T))2H−1Te−β1(T)TI1−H(−β1(T)T2)I−H(−β1(T)T2)∼θ22+2θ3−2βe2βT(−2β)c6(αβ)(−β1(T))2H−21πe−2β1(T)T→c6(αβ)(−β)2H−2π(θ22+2θ3),asT→∞.
Similarly, using (14), (24) and (45), we get
A4(α1(T),β1(T))(ξ1(T),ξ2(T))=(c1(α1(T)β1(T))ξ1(T)−c2(α1(T)β1(T))ξ2(T))×(ξ2(T)−2β1(T))(−β1(T))H−1T1−He−β1(T)T2I1−H(−β1(T)T2)∼c1(αβ)θ2α−βγeβT(−2β)(−β1(T))H−3/21πT1/2−He−β1(T)T=O(T1/2−H)→0,asT→∞.
Also, (19) implies
α1(T)2=α2+2αγθ1TH−1+γ2θ12T2H−2+O(eβT),asT→∞.
Then, for the expression under the exponential function in (18) we have
α1(T)2−α22γ2wTH−θ1αγTH−1wTH=12θ12T2H−2wTH+O(wTHeβT)=12θ12λH−1+O(wTHeβT)→12θ12λH−1,asT→∞,
since wTH=λH−1T2−2H.

Thus, inserting (23) and (25)–(29) into (18), we arrive at
E[exp{θ1TH−1(ST+βJT−αγwTH)+θ2eβT(IT+βKT)+θ3e2βTKT}]→exp{12θ12λH−1}(1−θ22+2θ34β2sinπH)−1/2×exp{c6(αβ)(−β)2H−2(θ22+2θ3)8π(1−θ22+2θ34β2sinπH)},asT→∞.
We see that the limit is a product of MGF of the normal random variable ξ=dN(0,λH−1) and MGF of the random vector ηζζ2, where the random variables η=dN(0,1) and ζ=dN(c6(αβ)(−β)H−12π,14β2sinπH) are independent, see Lemma 5 in Appendix B. This concludes the proof, since c6(αβ)=(x0−αβ)22λH∗ρH2. □

In fact, N(0,λH−1) is an exact distribution of the random variable TH−1(ST+βJT−αγwTH) for any T. It can be easily seen from the above proof by putting θ2=θ3=0 (then α1(T)=α+γθ1TH−1, β1(T)=β, ξ1(T)=ξ2(T)=0).

The following series of corollaries will describe asymptotic distributions of minor components of the MLE.

First, by considering the convergence of the first component of the random vector in (16), we immediately get the following result.

For the process(ST+βJT)we have1wTH(ST+βJT)→Pαγ,T→∞.

Next, we focus on the process IT. In order to obtain its asymptotic behavior it suffices to write
e2βTIT=e2βT(IT+βKT)−βe2βTKT
and then apply (16).

For the processITwe havee2βTIT→d−βζ2,T→∞,where ζ has the normal distribution defined in (17).

Finally, the asymptotic behavior of JT can be easily derived using Lemma 3, Corollary 1 and the identity
−βTH−12eβTJT=TH−12eβTST−TH−12eβT(ST+βJT).

For the processJTwe haveTH−12eβTJT→dN(8(x0−αβ)ρH(−β)H−3/2π,4Γ(H)Γ(1−H)λH∗(−β)3π),T→∞.

Main results

Now we are ready to prove the main result of the article.

Letβ<0,H∈(1/2,1). ThenT1−H(αˆT−α)e−βT(βˆT−β)→dνηζ,T→∞,whereν=dN(0,λHγ2),η=dN(0,1), andζ=dN((x0−αβ)ρHλH∗(−β)H−12π,14β2sinπH)are independent random variables. In particular, the estimatorsαˆTandβˆTare asymptotically independent.

Using (9) and the equality T1−H=λHwTHTH−1, we can write
T1−H(αˆT−α)=λHwTHTH−1(STKT−ITJTwTHKT−JT2γ−α)=λHwTHTH−1γSTKT−γITJT−αwTHKT+αJT2wTHKT−JT2=e2βTKTγλHTH−1(ST+βJT−αγwTH)e2βTKT−1wTHe2βTJT2+−γλHTH−1e2βTJT(IT+βKT)+αλHTH−1e2βTJT2e2βTKT−1wTHe2βTJT2.
Note that by Corollary 3 and Lemma 4, we see that JT=OP(T12−He−βT), IT+βKT=OP(e−βT), and e2βTKT−1wTHe2βTJT2→dζ2, as T→∞. Consequently, the second term in the right-hand side of (32) converges to zero in probability.

Further, by (9),
e−βT(βˆT−β)=e−βT(STJT−wTHITwTHKT−JT2−β)=e−βTSTJT−wTHIT−βwTHKT+βJT2wTHKT−JT2=−eβT(IT+βKT)+eβTJT1wTH(ST+βJT)e2βTKT−1wTHe2βTJT2.
Corollaries 1 and 3 imply that eβTJT1wTH(ST+βJT) converges to zero in probability. Then applying Lemma 4 and Slutsky’s theorem, from (32), (33) we get the convergence (30). □

Unlike the ergodic case (studied in [26]), in the non-ergodic case the initial value x0 affects the asymptotic bias of βˆT. If β<0, then the deterministic term (x0−αβ)e−βt in (4) does not converge to zero and, moreover, has the same asymptotic order O(e−βT) as the stochastic term γ∫0te−β(t−s)dBsH. This implies that the asymptotic behavior of the statistics ST, IT, JT, and KT depends on x0. A similar dependence on initial condition holds for the non-ergodic Ornstein–Uhlenbeck model driven by the Brownian motion (see [11] and [23, Prop. 3.46]) and some explosive autoregressive models [2, 35, 37].

The model (1) with x0=αβ was considered in [33, Th. 5.2]. In this particular case we have ζ=dN(0,14β2sinπH). Consequently,
e−βT2β(βˆT−β)→dXsinπHY,asT→∞,
where X and Y are two independent N(0,1) random variables. This completely agrees with [33, Th. 5.2].

(Alternative parameterization).

An alternative specification of the fractional Vasicek model is
dXt=κ(μ−Xt)dt+γdBtH,t∈[0,T],X0=x0.
For the model (34), the MLEs of the parameters μ and κ have the following form [26]:
μˆT=STKT−ITJTSTJT−wTHITγ,κˆT=STJT−wTHITwTHKT−JT2.
One can establish the following result: if κ<0 and H∈(1/2,1), then
T1−H(μˆT−μ)e−κT(κˆT−κ)→dν˜η/ζ˜,asT→∞,
where ν˜=dN(0,λHγ2κ2), η=dN(0,1), and
ζ˜=dN((x0−μ)ρHλH∗(−κ)H−12π,14κ2sinπH),
are independent random variables.

The proof of (35) is carried out by the delta-method. By Taylor’s theorem, for the function g(x,y)=xy, we have as (x,y)→(α,β)g(x,y)−g(α,β)=1β(x−α)−αβ2(y−β)+o((x−α)2+(y−β)2).
Multiplying both sides of (36) by T1−H, and putting x=αˆT, y=βˆT, we get
T1−H(αˆTβˆT−αβ)=1βT1−H(αˆT−α)+RT,
where
RT=−αβ2T1−H(βˆT−β)+oP(T1−H(αˆT−α)2+(βˆT−β)2)→P0,
as T→∞, since T1−H(βˆT−β)→P0 and T1−H(αˆT−α)→dν due to (30).

Finally, by Slutsky’s theorem, from (30), (37) and (38) we obtain the convergence
T1−H(αˆTβˆT−αβ)e−βT(βˆT−β)=1βT1−H(αˆT−α)e−βT(βˆT−β)+RT0→dνβηζ,asT→∞,
which is equivalent to (35), since μˆT=αˆTβˆT, κˆT=βˆT, μ=αβ, κ=β.

Now let us consider the situation when one of the parameters is known. In this case we can obtain strong consistency of the corresponding MLEs (instead of weak one) by applying the strong law of large numbers for martingales, see, e.g., [24, Theorem 2.6.10].

Letβ<0be known andH∈(1/2,1). The MLE for α isα˜T=γwTH(ST+βJT).It is unbiased, strongly consistent and normal:T1−H(α˜T−α)=dN(0,λHγ2).

The form of the MLE (39) was established in [25, Eq. (3.2)]. The normality follows from Remark 2:
T1−H(α˜T−α)=λHγTH−1(ST+βJT−αγwTH)=dN(0,λHγ2).
In order to obtain the strong consistency, we rewrite this equality using the relation
ST+βJT−αγwTH=MTH,
which follows from (6) and (5). Then
α˜T−α=γMTHwTH.
Recall that MTH is a martingale and ⟨MH⟩T=wTH→∞, as T→∞. Then MTHwTH→0 a.s., as T→∞, by the strong law of large numbers for martingales [24, Th. 2.6.10, Cor. 1]. □

Actually, the statement of Theorem 2 is true regardless of the sign of β. For β>00$]]>, (40) was proved in [25, Th. 3.1].

Let α be known,H∈(1/2,1)andβ<0. The MLE for β isβ˜T=αγJT−ITKT.It is strongly consistent ande−βT(β˜T−β)→dηζ,asT→∞,where η and ζ are the same as in Theorem1.

The form of the MLE (42) is found in [25, Eq. (3.3)]. The strong consistency is established in the same way as in [25, Th. 3.2]. It follows from (41) that
αγJT−IT−βKT=αγ∫0TPH(t)dwTH−∫0TPH(t)dSt−β∫0T(PH(t))2dwtH=−∫0TPH(t)dMTH.
Whence, (42) implies that
β˜T−β=αγJT−IT−βKTKT=−∫0TPH(t)dMTH∫0T(PH(t))2dwtH.
Since the process MH is a martingale with the quadratic variation wH, the process ∫0·PH(t)dMtH is a martingale with the quadratic variation ∫0·(PH(t))2dwtH. Note that ∫0T(PH(t))2dwtH=KT→∞ in probablility, by Lemma 4. This convergence holds almost surely, because ∫0T(PH(t))2dwtH is increasing in upper bound T. Therefore, by the strong law of large numbers for martingales [24, Th. 2.6.10, Cor. 1], we get that β˜T→β a.s., as T→∞.

Finally, the convergence (43) follows from the representation
e−βT(β˜T−β)=αγeβTJT−eβT(IT+βKT)e2βTKT,
Lemma 4, Corollary 3, and Slutsky’s theorem. □

The particular case when the parameter α=0 is known and x0=0 was studied in [32]. Similarly to Remark 4, we see that in this case the convergence (43) takes the form
e−βT2β(β˜T−β)→dXsinπHY,asT→∞,
where X and Y are two independent N(0,1) random variables. This coincides with the result of [32, Th. 2].

Modified Bessel function of the first kind

In this section we present some properties of the modified Bessel function of the first kind Iν(x), which are helpful for our proofs. For more details on this topic we recommend the book [36]. Let ν>−1-1$]]>, x∈R. Then the function Iν(x) could be defined by the following power series [29, Formula 50:6:1]:
Iν(x)=∑j=0∞(x/2)2j+νj!Γ(j+1+ν).
Note, that if x is negative and ν is non-integer, then the function Iν(x) is complex-valued. However, a function Iν(x)/xν is always real-valued. This function equals 2−ν/Γ(1+ν) when x=0 and it is even, i.e.
Iν(−x)(−x)ν=Iν(x)xν,
see [29, Formula 50:2:1]. For large values of x the function Iν(x) has the following asymptotic behavior [1, Formula 9.7.1]:
Iν(x)=ex2πx(1−4ν2−18x+O(x−2)),x→∞.

MGFs related to the bivariate normal distribution

MGF of the product of two normal random variables X=dN(m1,σ12), Y=dN(m2,σ22) with the correlation coefficient r=Cov(X,Y)σ1σ2 equals (see e.g. [10])
E[exp{tXY}]=D−1/2exp{(m12σ22+m22σ12−2rm1m2σ1σ2)t2+2m1m2t2D},
where
D=(1−(1+r)σ1σ2t)(1+(1−r)σ1σ2t).

For two independent normal random variablesX=dN(m,σ2)andY=dN(0,1),E[exp{θ1XY+θ2X2}]=[1−σ2(θ12+2θ2)]−12exp{m2(θ12+2θ2)2[1−σ2(θ12+2θ2)]}.

Evidently,
θ1XY+θ2X2=X(θ1Y+θ2X),
where X=dN(m,σ2), θ1Y+θ2X=dN(θ2m,θ12+θ22σ2), and
Cov(X,θ1Y+θ2X)=θ2Cov(X,X)=θ2σ2.
Applying (46) with t=1, m1=m, σ12=σ2, m2=θ2m, σ22=θ12+θ22σ2, and r=θ2σθ12+θ22σ2, we get
E[exp{X(θ1Y+θ2X)}]=D−1/2exp{m2(θ12+θ22σ2)+θ22m2σ2−2θ22m2σ2+2θ2m22D}=D−1/2exp{m2(θ12+2θ2)2D},
where
D=(1−(1+θ2σθ12+θ22σ2)σθ12+θ22σ2)×(1+(1−θ2σθ12+θ22σ2)σθ12+θ22σ2)=(1−σθ12+θ22σ2−θ2σ2)(1+σθ12+θ22σ2−θ2σ2)=(1−θ2σ2)2−σ2(θ12+θ22σ2)=1−σ2(θ12+2θ2),
whence (47) follows. □

Acknowledgement

The authors are grateful to the referees for their valuable comments and suggestions.

ReferencesAbramowitz, M., Stegun, I.A. (eds.): Anderson, T.W.: On asymptotic distributions of estimates of parameters of stochastic difference equations. Belfadli, R., Es-Sebaiy, K., Ouknine, Y.: Parameter estimation for fractional Ornstein–Uhlenbeck processes: non-ergodic case. Bercu, B., Coutin, L., Savy, N.: Sharp large deviations for the fractional Ornstein–Uhlenbeck process. Berzin, C., Latour, A., León, J.R.: Bishwal, J.P.N.: Minimum contrast estimation in fractional Ornstein-Uhlenbeck process: Continuous and discrete sampling. Cheridito, P., Kawaguchi, H., Makoto, M.: Fractional Ornstein–Uhlenbeck processes.Chronopoulou, A., Viens, F.G.: Estimation and pricing under long-memory stochastic volatility. Corlay, S., Lebovits, J., Véhel, J.L.: Multifractional stochastic volatility models. Craig, C.C.: The frequency of function of y/x. Feigin, P.D.: Maximum likelihood estimation for continuous-time stochastic processes. Fink, H., Klüppelberg, C., Zähle, M.: Conditional distributions of processes related to fractional Brownian motion. Hao, R., Liu, Y., Wang, S.: Pricing credit default swap under fractional Vasicek interest rate model. Hu, Y., Nualart, D.: Parameter estimation for fractional Ornstein–Uhlenbeck processes. Hu, Y., Song, J.: Parameter estimation for fractional Ornstein–Uhlenbeck processes with discrete observations. In: Hu, Y., Nualart, D., Zhou, H.: Parameter estimation for fractional Ornstein–Uhlenbeck processes of general Hurst parameter. Kleptsyna, M.L., Le Breton, A.: Statistical analysis of the fractional Ornstein–Uhlenbeck type process. Kleptsyna, M.L., Le Breton, A., Roubaud, M.-C.: Parameter estimation and optimal filtering for fractional type stochastic systems. Kozachenko, Y., Melnikov, A., Mishura, Y.: On drift parameter estimation in models with fractional Brownian motion. Kubilius, K., Mishura, Y., Ralchenko, K.: Kubilius, K., Mishura, Y., Ralchenko, K., Seleznjev, O.: Consistency of the drift parameter estimator for the discretized fractional Ornstein–Uhlenbeck process with Hurst index H∈(0,12). Kukush, A., Mishura, Y., Ralchenko, K.: Hypothesis testing of the drift parameter sign for fractional Ornstein–Uhlenbeck process. Kutoyants, Y.A.: Liptser, R.S., Shiryayev, A.N.: Lohvinenko, S., Ralchenko, K.: Maximum likelihood estimation in the fractional Vasicek model. Lohvinenko, S., Ralchenko, K.: Asymptotic distribution of maximum likelihood estimator in fractional Vasicek model. Lohvinenko, S., Ralchenko, K., Zhuchenko, O.: Asymptotic properties of parameter estimators in fractional Vasicek model. Mishura, Y., Ralchenko, K.: Drift parameter estimation in the models involving fractional Brownian motion. In: Oldham, K., Myland, J., Spanier, J.: Song, L., Li, K.: Pricing option with stochastic interest rates and transaction costs in fractional Brownian markets. Tanaka, K.: Distributions of the maximum likelihood and minimum contrast estimators associated with the fractional Ornstein–Uhlenbeck process. Tanaka, K.: Maximum likelihood estimation for the non-ergodic fractional Ornstein–Uhlenbeck process. Tanaka, K., Xiao, W., Yu, J.: Maximum likelihood estimation for the fractional Vasicek model. Research Collection School Of Economics, Paper No. 08-2019, 1–31 (2019)Vasicek, O.: An equilibrium characterization of the term structure. Wang, X., Yu, J.: Limit theory for an explosive autoregressive process. Watson, G.N.: White, J.S.: The limiting distribution of the serial correlation coefficient in the explosive case. Xiao, W., Yu, J.: Asymptotic theory for estimating drift parameters in the fractional Vasicek model. Xiao, W., Yu, J.: Asymptotic theory for rough fractional Vasicek models. Xiao, W., Zhang, W., Zhang, X., Chen, X.: The valuation of equity warrants under the fractional Vasicek process of the short-term interest rate.