Parameter estimation in mixed fractional stochastic heat equation

The paper is devoted to a stochastic heat equation with a mixed fractional Brownian noise. We investigate the covariance structure, stationarity, upper bounds and asymptotic behavior of the solution. Based on its discrete-time observations, we construct a strongly consistent estimator for the Hurst index H and prove the asymptotic normality for H < 3 / 4. Then assuming the parameter H to be known, we deal with joint estimation of the coefﬁcients at the Wiener process and at the fractional Brownian motion. The quality of estimators is illustrated by simulation experiments


Introduction
The paper is devoted to parameter estimation in a stochastic heat equation of the following form: The right-hand side of (1) is a mixed fractional noise. It consists of two independent stochastic processes, namely, a fractional Brownian motion B H = {B H x , x ∈ R} with Hurst parameter H ∈ (0, 1) and a Wiener process W = {W x , x ∈ R}; σ and κ are some positive coefficients. The solution to the problem (1)- (2) is understood in mild sense; the precise definition will be given in Section 2.
The mixed fractional Brownian motion was first introduced by P. Cheridito [8] in order to model financial markets that are simultaneously arbitrage-free and have a memory. The properties of this process were studied in [36]. We refer to the book [28] for a detailed presentation of the modern theory in this area. More involved mixed fractional models described by stochastic differential equations are the subject of numerous publications [13,35,33,17] appeared during last decades. Such equations can be used to model processes on financial markets, where two principal random noises influence the prices. The first source of randomness is the stock exchange itself with thousands of agents. The noise coming from this source can be assumed white and is best modeled by a Wiener process. The second one has the financial and economical background and can be modeled by a fractional Brownian motion B H . Stochastic partial differential equations with such noises can be used, in particular, for the modeling of instantaneous forward rates, where the space variable corresponds to time until maturity [9,22]. Such equations arise also in geophysics, especially in physical oceanography [30] and in geostatistics [34]. For example, in models for sea surface temperature, noise terms can represent various heat fluxes and ocean processes [29].
Existence, uniqueness and properties of solutions for stochastic differential equations with mixed noise were studied in various papers [12,16,17,[25][26][27]. A stochastic heat equation with white and fractional noises was investigated in [24]. Several approaches to parameter identification in simple linear mixed fractional models for various observation schemes were proposed in [6,14,23,20,21]. The problem of drift parameter estimation in a mixed stochastic differential equation of a general form was studied in [19]. The statistical problems for the mixed fractional Vasicek model were investigated in the recent papers [7,31].
Similarly to our previous papers the solution u(t, x) is observed at equidistant spatial points for a several fixed time instants. On one hand, there are many practical cases where the solution is observed at some discrete space points such as temperature of a heated body or velocity of a turbulent flow. In many cases the measurements with a high space resolution are available, but the time series are short. For example, this is the case for satellite observations of sea surface temperature, see [30]. For this reason, it is suitable to assume that u(t, x) is observed at a large number of space points x k and only few different time instants t i . On the other hand, observing the solution at three time points is enough to construct estimators for unknown parameters σ , κ and H . Nevertheless, the additional information of observing the solution discretely in time can be consolidated by taking the (weighted) average of the estimators similarly to [5] or [9].
The present paper is devoted to the problem of estimating unknown parameters H , σ , κ in the equation (1), by discrete observations of its solution u(t, x). The re-sults of this paper are an extension of our previous works [2,3], where the problems of estimating H and σ were studied for the equation (1) with the fractional Brownian motion only (that is, for κ = 0). The diffusion parameter estimator for SPDE with white noise and its properties was considered in [4]. Similarly to the mentioned articles, in the present paper we start with proving stationarity and ergodicity of the solution u(t, x) as a function of the spatial variable x by analyzing the behavior of the covariance function. Based on these results we construct a strongly consistent estimator of H (assuming that the parameters σ and κ are unknown). The asymptotic normality of this estimator is proved for any H ∈ (0, 1 2 ) ∪ ( 1 2 , 3 4 ). Then we consider the problem of estimating the couple of parameters (σ, κ) when the value of H is known. We prove strong consistency of the estimator and investigate its asymptotic normality.
The paper is organized as follows. In Section 2 we introduce the definition of a mild solution to SPDE (1) and present its properties. Furthermore, we prove the limit theorem for it, needed for establishing properties of statistical estimators. The statistical problems are investigated in Section 3. In Subsection 3.1 we construct an estimator of the Hurst index H and prove its strong consistency and asymptotic normality. Subsection 3.2 is devoted to the estimators of the parameters σ and κ and their asymptotic properties. Numerical results are presented and discussed in Section 4.

Preliminaries
Assume that B H = B H x , x ∈ R is a two-sided fractional Brownian motion with the Hurst index H ∈ (0, 1), while W = {W x , x ∈ R} is a Wiener process, independent of B H . Let G be Green's function of the heat equation, that is Similarly to [2][3][4] (see also [10] and the references cited therein), we define a solution to SPDE (1) in a mild sense as follows.

Remark 1.
As shown in [2], both stochastic integrals in (3) exist as pathwise Riemann-Stieltjes integrals. This fact follows from the Hölder regularity of the integrands and integrators. Namely, Green's function is obviously Lipshitz continuous, while sample paths of the fractional Brownian motion are Hölder continuous up to order H . Such regularity guarantees the existence of the first integral in (3). The second integral is also well defined, since the integrand is square integrable, see [4, The next proposition summarizes basic properties of the solution u(t, x). These properties, especially stationarity and ergodicity, enable us to construct and investigate statistical estimators for H , κ and σ . Proposition 1. Let u = {u(t, x), t ∈ [0, T ], x ∈ R} be a solution to (1) defined by (3). Then the following properties hold.

The variance of u(t, x) is given by
where denotes the gamma function.
3. For all t, s ∈ [0, T ] and x > 0, the covariance function admits the following upper bound: where C H is a positive constant depending only on H .
5. For a fixed t > 0, the random process {u(t, x), x ∈ R} is ergodic.
Proof. The proposition follows from the corresponding results for the equation with pure fractional noise studied in [2,3]. Indeed, all the statements are based on the properties of the covariance function of the solution. However, since B H and W are independent, we see that this covariance function can be represented as Then combining the equality (9) with statements of [3, Prop. 2.2], we immediately obtain formulas (4), (7) and (8). The equality (5) follows from (9) and [2,Prop. 3]. Finally, the last statement of the proposition holds, because the solution {u(t, x), x ∈ R} is a stationary Gaussian process, whose covariance function vanishes as x → ∞, according to (8). Hence, the process u(t, ·) is ergodic.
Let us fix some δ > 0 and consider the following sequence: The sequence (10) will serve as a statistic for parameter estimation problems in Section 3. We introduce the following notation in addition to (6): The next theorem describes an asymptotic behavior of the stochastic process V N . It gives a law of large numbers and a central limit theorem for its finite-dimensional distributions (V N (t 1 ), . . . , V N (t n )) as N → ∞. This result is crucial for construction of the estimators and for establishing their asymptotic properties.

For any
2. If additionally H ∈ (0, 3 4 ), then for any n ≥ 1 and any distinct positive t 1 , . . . , t n , where N (0, R) is the multivariate normal distribution with zero mean and the covariance matrix Proof. 1. The ergodic theorem implies that for any t > 0 which is equivalent to (12). 2. In order to prove the convergence (13), we shall apply the Cramér-Wold theorem. In other words, we need to show that for all α 1 , . . . , α n ∈ R, the convergence holds with the asymptotic variance Representing V N as the sum (10) and using (11), we rewrite (14) in the form Further, since , x ∈ R, is a multivariate stationary Gaussian sequence, the convergence (16) can be established by application of the multivariate Breuer-Major theorem [1,Theorem 4]. In order to verify the conditions of this theorem, note that the function F (x 1 , . . . , x n ) = n i=1 α i x 2 i has Hermite rank 2, therefore we need to check the convergence of the series: It follows immediately from the upper bound (7) that these power series converge if and only if H ∈ (0, 3 4 ). Thus, the assumptions of [1,Theorem 4] are satisfied, whence the convergence (16) holds with the following asymptotic variance: 0), . . . , u(t n , 0) , F u(t 1 , kδ), . . . , u(t n , kδ) . Now we must only check that the asymptotic variance s 2 satisfies (15). By the definition of the function F , we have Now we can use the following well-known fact: if ξ and η are zero-mean Gaussian random variables, then cov(ξ 2 , η 2 ) = 2 cov(ξ, η) 2 , in particular, Var(ξ 2 ) = 2 Var(ξ ) 2 (this is a corollary of the Isserlis theorem [18]). Then we get Taking into account the equality ρ H ts (k) = ρ H st (−k), we may rewrite this expression in the more compact form: Thus the equality (15) is verified. This completes the proof of Theorem 1.

Parameter estimation
Let us consider the following statistical problem. It is supposed that for fixed t 1 , . . . , t n and fixed step δ > 0, the random field u given by (3)  Our aim is to estimate the unknown parameters H , σ and κ. We shall do this it two steps. We start with construction of a strongly consistent estimator of H that does not depend on κ and σ . Also, we shall establish its asymptotic normality. Then assuming that H is known, we shall estimate the parameters σ and κ.
In what follows we assume that H = 1 2 , because otherwise the model is nonidentifiable. The parameters σ and κ are assumed to be positive.

Estimation of H
In order to estimate H without knowledge of σ and κ, it suffices to observe u(t i , x k ) only at three time instants t 1 < t 2 < t 3 . If we write (12) at these points and replace convergences with equalities, we shall get the following system of equations Excluding the unknown parameter κ from the system, we obtain Then excluding σ we arrive at the following estimating equation for H : .
The solution of (19) (if exists), can be viewed as an estimator of H . Note that the left-hand side of (19) is indeterminate for H = 1/2. However, it is easy to see by l'Hôpital's rule that there exists the limit Therefore, one may define by continuity Then the estimator of H is defined as where f (−1) denotes the inverse function of f . In order to prove its existence, we need to establish that f : R → (0, ∞) is a one-to-one function. This is true, since f is always a strictly increasing function (see Fig. 1) as shown in the following lemma.

Lemma 1.
For any 0 < t 1 < t 2 < t 3 , the function f : R → (0, ∞) defined by (20) is strictly increasing with respect to H . Proof. We prove the statement for the case H ∈ ( 1 2 , ∞). The interval (−∞, 1 2 ) is considered similarly. The derivative of f with respect to H is equal to Therefore, it suffices to prove the inequality In order to establish (23), observe that the function h(x) = x log x, x > 0, is strictly convex (indeed, its second derivative h (x) = 1/x > 0). This means that for any α ∈ (0, 1), x > 0 and y > 0, Let us take which is equivalent to (23).
The above lemma yields that the estimator H N is well defined at least for sufficiently large N (when the right-hand side of estimating equation (19) becomes positive). The asymptotic properties of H N are summarized in the following theorem, which is the first main result of the paper.
Remark 2. The inequality (23) from the proof of Lemma 1 implies that D > 0 for all H = 1/2.
Proof. 1. The strong consistency follows from the construction of the estimator. Indeed, (12) implies that Then the convergence H N → H a. s. as N → ∞ follows from (21) and (25) due to the continuity of f (−1) . 2. By taking expectations in the equalities (18), we get the following relations or In order to derive the desired asymptotic normality from the convergence (13), we shall apply the delta method. Denoting , (28) we see that the partial derivatives of g equal By the delta method, we derive from (13) the convergence (2) with It remains to prove that g i (μ(t 1 ), μ(t 2 ), μ(t 3 )) = L i /(Dσ 2 c H ), i = 1, 2, 3. Let us consider in detail the proof of this equality for i = 1, the cases i = 2 and i = 3 are considered similarly. Using successively (27), (28) and (22), we get After inserting this expression into (29) and taking into account the relations (26), we obtain Note also that the above representation yields g 1 (x 1 , x 2 , x 3 ) = 0 in the neighborhood of the point (μ(t 1 ), μ(t 2 ), μ(t 3 )), which is necessary for applying the delta method. The derivatives g 2 and g 3 are considered similarly.
The estimator of H was obtained as a solution to some exponential equation. However, it would be more convenient for applications and modeling to have the explicit form of the estimator. It turns out that in some particular cases it is possible to solve the estimating equation explicitly. Let us consider such example in more detail.
Assume that t 1 = h > 0, t 2 = 2h, t 3 = 4h. Substituting these values in the definition of f , we get Therefore, f (−1) (x) = 1 2 + log 2 x, x > 0, consequently (21) becomes 1 . (30) In this case 1 We use log + 2 rather than log 2 so that the right-hand side of (30) is always well defined. The function log + 2 x is defined as log 2 x if x > 0 and 0 if x ≤ 0. Note that the expression under log + 2 in (30) eventually becomes positive, since in converges to 2 H −1/2 .
Denoting i = D −1 L i h 1+H log 2, we arrive at the following result.
Then the estimator H N can be written in the explicit form (30). In this case Theorem 2 holds with

Remark 3.
Evidently, the explicit form of the estimator can be obtained also in a slightly more general case, when t 1 = h, t 2 = ah, t 3 = a 2 h with some a > 0. This leads to the estimator of the form (30) with log + a instead of log + 2 .

Estimation of σ and κ
Now we assume that the Hurst index H is known and investigate the estimation of the coefficients σ and κ. From the first two equations of (17), one can derive the following estimators: Now we are ready to formulate and prove the second main result of the paper.
2. For H ∈ (0, 1 2 ) ∪ ( 1 2 , 3 4 ), the estimator ( σ 2 N , κ 2 N ) is asymptotically normal: where the asymptotic covariance matrix consists of the following elements: Due to (9), this matrix can be decomposed as N = σ 2 b N + κ 2 w N , where b N and w N are the covariance matrices of and respectively. The log-likelihood function is Then, MLE of θ = (σ 2 , κ 2 ) is obtained as the solution to the following system of equations: (here and after we use the differentiation formulas of a matrix with respect to given parameter, 2 see, e. g., [32] for more details). The maximum likelihood estimatorθ mle N of θ can hardly be written in the explicit form, since the estimating equations involve the inverse matrix −1 N , which depends nonlinearly on σ 2 and κ 2 . However, using the general theory of maximum likelihood estimation for dependent observations [11], it is possible to establish the asymptotic normality in the form where T N (θ ) is the Fisher information matrix and I 2 is the 2 × 2 identity matrix. The rigorous proof of (36) as well as a careful analysis of the asymptotic behavior of T N (θ ) requires an additional investigation. To the best of our knowledge, even for much simpler model of the mixed fractional Brownian motion, this problem has not been completely solved yet, see the recent paper [15] for details. Therefore, here we restrict ourselves to the identification of the matrix T N (θ ).

Lemma 2. The Fisher information matrix T N (θ ) has the form
Proof. In order to identify T N (θ ), let us calculate the second derivatives. Note that Hence, Taking expectations, we obtain that the corresponding element of the Fisher information matrix equals since for any matrix A = (a ij ) i,j =1,...2N , we have the equality E X N AX N = i,j a ij E X i X j = tr (A N ). Arguing as above, one can write the derivatives and calculate their expectations, identifying other elements of T N (θ ).
Remark 4. 1. Similarly to the previous subsection, in the case H = 1 2 it is impossible to estimate both parameters, σ 2 and κ 2 , simultaneously. Only estimation of the sum σ 2 + κ 2 is possible. In this case b N = w N , therefore the estimation equations (34) and (35) coincide.
2. The results of this subsection are valid for any other observations vector of the form X = (u(t i , x k ), i = 1 . . . , M, k = 1, . . . N) and its covariance matrix (with decomposition = σ 2 b + κ 2 w ).
3. Similar approach can be applied to the case, when H is unknown, that is, to the problem of estimation of all three parameters σ 2 , κ 2 and H .

Simulations
Let us illustrate the theoretical properties of the estimator by some numerical results. We consider the model with the coefficients σ = κ = 1 for various values of H . For each value of the Hurst index H , we simulate 50 sample paths of the solution u(t, x) to the equation (1). The trajectories of a solution are generated by the discretization of the formula (3).
We choose t 1 = 0.25, t 2 = 0.5, t 3 = 1 as the moments of observations, so that the conditions of Corollary 1 are satisfied and the estimator of H can be computed by the explicit formula (30). For each t i we observe u(t i , kδ), k = 1, . . . , N, with the step δ = 1. Table 1 reports the means and standard deviations of H N for various H and N . We see that the estimates converge to the true value of the parameter H . However, the convergence is much slower compared to the estimation of H in the pure fractional case (i. e. κ = 0), which was considered in [3]. The results become poorer when H approaches 1/2, or when H is close to zero. It's worth mentioning that the best performance of H N is observed for large values of H (0.8 and 0.9), for which the asymptotic normality does not hold.
The means and standard deviations of the estimates σ 2 N and κ 2 N are reported in Tables 2-3. Here we also clearly see that the estimators converge to the true values of the parameters, however the results for both estimators become worse when H  Table 3. Means and standard deviations of the estimator κ 2 N for σ = 1, κ = 1