Large deviations for conditionally Gaussian processes: estimates of level crossing probability

The problem of (pathwise) large deviations for conditionally continuous Gaussian processes is investigated. The theory of large deviations for Gaussian processes is extended to a wider class of random processes: the conditionally Gaussian processes. As an application, estimates of the level crossing probability for such processes are given.


Introduction
In this paper we study some large deviation principles for conditionally continuous Gaussian processes, and we then derive estimates of the level crossing probability for such processes. Large deviations theory is concerned with the study of probabilities of very "rare" events. These events have very small probability, yet they are of great importance: they may represent an atypical situation (i.e. a deviation from the average behavior) with disastrous consequences: an insurance company or a bank may go bankrupt; a statistical estimator may give wrong information; a physical or chemical system may show an atypical configuration. The aim of this paper is to extend the theory of large deviations for Gaussian processes to a wider class of random processes: the conditionally Gaussian processes. Such processes have been introduced in applications in finance, optimization and control problems. See, for instance, [12, 16, 14] and [1]. More precisely, Doucet et al. in [12] considered modelling the behavior of latent variables in neural networks by Gaussian processes with random parameters; Lototsky in [16] studied stochastic parabolic equations with random coefficients; Gulisashvili in [14] studied a large deviation principle for some particular stochastic volatility models where the log-price is, conditionally, a Gaussian process; in [1] probabilities of large extremes of conditionally Gaussian processes were considered, in particular sub-Gaussian processes, i.e. Gaussian processes with a random variance. Let (Y, Z) be a random element on the probability space (Ω, F, P), where Z = (Z_t)_{t∈[0,1]} is a process taking values in R and Y is an arbitrary random element (a process or a random variable). We say that Z is a conditionally Gaussian process if the conditional distribution of Z given Y is (almost surely) Gaussian. The theory of large deviations for Gaussian processes and for conditioned Gaussian processes is already well developed.
See, for instance, Section 3.4 in [11] (and the references therein) for Gaussian processes, [7] and [13] for particular conditioned Gaussian processes. The extension of this theory is possible thanks to the results obtained by Chaganty in [8].
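To fix ideas, the simplest conditionally Gaussian process can be simulated directly. The sketch below is our own illustration, not taken from the paper: it builds a sub-Gaussian process Z_t = Y·W_t, where W is a Brownian motion and Y is an independent random scale (the two-point law chosen for Y is arbitrary). Given Y = y, Z is a centered Gaussian process; unconditionally it is not Gaussian.

```python
import math
import random

random.seed(0)

def conditionally_gaussian_path(n_steps=100):
    """One path of Z_t = Y * W_t: given Y = y, Z is a centered Gaussian
    process with covariance y^2 * min(s, t) (a sub-Gaussian process)."""
    y = random.choice([1.0, 2.0])      # random variance factor, independent of W
    dt = 1.0 / n_steps
    w, path = 0.0, [0.0]
    for _ in range(n_steps):
        w += math.sqrt(dt) * random.gauss(0.0, 1.0)
        path.append(y * w)
    return path

# Unconditionally Z_1 is NOT Gaussian: Var(Z_1) = E[Y^2] = 2.5, but the
# 50/50 mixture of N(0,1) and N(0,4) has heavier tails than N(0, 2.5).
samples = [conditionally_gaussian_path()[-1] for _ in range(20000)]
var_hat = sum(z * z for z in samples) / len(samples)
print(round(var_hat, 2))
```

The heavy mixture tails are exactly what makes the large deviation behavior of such processes differ from the Gaussian case.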
We consider a family of processes (Y^n, Z^n)_{n∈N} on a probability space (Ω, F, P), where (Y^n)_{n∈N} is a family of processes taking values in a measurable space (E_1, E_1) that satisfies a large deviation principle (LDP for short), and (Z^n)_{n∈N} is a family of processes taking values in (E_2, E_2) such that, for every n ∈ N, Z^n|Y^n is a Gaussian process (P-a.s.). We want to find a LDP for the family (Z^n)_{n∈N}.
A possible application of LDPs is the estimation of the level crossing probability (ruin problem). We will give the asymptotic behavior (in terms of large deviations) of the probability

p_n = P( Z^n_t ≥ 1 + ϕ(t) for some t ∈ [0,1] ),

where ϕ is a suitable function. We will consider the following families of conditionally Gaussian processes.
1) The class of Gaussian processes with random variance and random mean, i.e. the processes of the type

Z_t = X_t Y_1(t) + Y_2(t), t ∈ [0,1],

where X is a centered continuous Gaussian process with covariance function k and Y = (Y_1, Y_2) is a random element independent of X. Notice that Z|Y is Gaussian with mean Y_2 and covariance function (s,t) → Y_1(s)Y_1(t)k(s,t).

2) The class of Ornstein-Uhlenbeck type processes with random diffusion coefficient. More precisely, (Z_t)_{t∈[0,1]} is the solution of the following stochastic differential equation:

dZ_t = (a_0 + a_1 Z_t) dt + Y_t dW_t, Z_0 = x,

where x, a_0, a_1 ∈ R and Y is a random element independent of the Brownian motion (W_t)_{t∈[0,1]}.

The paper is organized as follows. In Section 2 we recall some basic facts on large deviations theory for continuous Gaussian processes. In Section 3 we introduce the conditionally Gaussian processes and the Chaganty theory. In Sections 4 and 5 we study the theoretical problem and give the main results. Finally, in Section 6 we investigate the ruin problem for such processes.
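The second class can be simulated with a standard Euler-Maruyama scheme. The sketch below is our own illustration (parameter values and the two-point law of Y are arbitrary, and Y is taken constant in time for simplicity); it checks that the conditional mean of the solution does not depend on the random diffusion coefficient.

```python
import math
import random

random.seed(1)

A0, A1, X0 = 1.0, -1.0, 0.0   # illustrative drift parameters and starting point

def ou_random_diffusion(n_steps=200):
    """Euler-Maruyama for dZ_t = (a0 + a1*Z_t) dt + Y dW_t, with a random
    diffusion coefficient Y drawn once per path, independent of W."""
    y = random.choice([0.5, 1.0])
    dt = 1.0 / n_steps
    z = X0
    for _ in range(n_steps):
        z += (A0 + A1 * z) * dt + y * math.sqrt(dt) * random.gauss(0.0, 1.0)
    return z

# The conditional mean m(t) = e^{a1 t}(x + (a0/a1)(1 - e^{-a1 t})) does not
# depend on Y; at t = 1 with these parameters m(1) = 1 - e^{-1} ≈ 0.632.
mean_hat = sum(ou_random_diffusion() for _ in range(4000)) / 4000
print(round(mean_hat, 3))
```

Only the conditional covariance feels the randomness of Y, which is why the conditional law Z|Y stays Gaussian with a random covariance.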

Large deviations for continuous Gaussian processes
We briefly recall some main facts on large deviation principles and reproducing kernel Hilbert spaces for Gaussian processes that we are going to use. For a detailed development of this very wide theory we refer, for example, to the following classical references: Chapitre II in Azencott [2], Section 3.4 in Deuschel and Stroock [11], and Chapter 4 (in particular Sections 4.1 and 4.5) in Dembo and Zeitouni [10] for large deviation principles; Chapter 4 (in particular Section 4.3) in [15] and Chapter 2 (in particular Sections 2.2 and 2.3) in [5] for reproducing kernel Hilbert spaces. Without loss of generality, we can consider centered Gaussian processes.

Reproducing kernel Hilbert space
An important tool to handle continuous Gaussian processes is the associated reproducing kernel Hilbert space (RKHS).
Let U = (U_t)_{t∈[0,1]} be a continuous, centered Gaussian process on a probability space (Ω, F, P), with covariance function k. From now on, we will denote by M[0,1] the set of finite signed Borel measures on [0,1] and, for λ ∈ M[0,1] and x ∈ C([0,1]), we write

⟨λ, x⟩ = ∫_0^1 x(t) dλ(t).

Consider the set

L = { x : [0,1] → R : x(·) = Σ_{i=1}^m λ_i k(t_i, ·), m ∈ N, λ_i ∈ R, t_i ∈ [0,1] }.

The RKHS relative to the kernel k can be constructed as the completion of the set L with respect to a suitable norm. Consider the set of (real) Gaussian random variables

Γ = { Σ_{i=1}^m λ_i U_{t_i} : m ∈ N, λ_i ∈ R, t_i ∈ [0,1] }.

Define now H as the closure of Γ in L²(Ω, F, P).
Since L²-limits of Gaussian random variables are still Gaussian, H is a closed subspace of L²(Ω, F, P) consisting of real Gaussian random variables. Moreover, it becomes a Hilbert space when endowed with the inner product

⟨V, W⟩_H = E[VW].

Remark 1. We remark that, since any signed Borel measure λ can be weakly approximated by a linear combination of Dirac deltas, the Hilbert space H above is nothing but the Hilbert space generated by the Gaussian process U, namely the closure in L²(Ω, F, P) of { ⟨λ, U⟩ : λ ∈ M[0,1] }, where ⟨λ, U⟩ = ∫_0^1 U_t dλ(t).
Consider now the following mapping:

S : H → C([0,1]), S(V)(t) = E[V U_t], t ∈ [0,1]. (2)

Definition 1. Let U = (U_t)_{t∈[0,1]} be a continuous Gaussian process. We define the reproducing kernel Hilbert space relative to the Gaussian process U as ℋ = S(H), with the inner product

⟨S(V), S(W)⟩_ℋ = ⟨V, W⟩_H = E[VW].

Then we have

Lemma 1. (Theorem 35 in [5]). Let H be the Hilbert space of the continuous Gaussian process U defined above. Then H is isometrically isomorphic to the reproducing kernel Hilbert space ℋ of U, and the corresponding isometry is given by (2).

The map S defined in (2) is referred to as the Loève isometry. Since the covariance function fully identifies a Gaussian process up to the mean, we can talk equivalently of the RKHS associated with the process or with its covariance function.
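The reproducing property makes RKHS norms of finite combinations directly computable: for h(·) = Σ_i λ_i k(t_i, ·), one has ||h||²_ℋ = Σ_{i,j} λ_i λ_j k(t_i, t_j). A small numerical sketch of ours (the helper name and the choice of the Brownian kernel are just examples):

```python
def rkhs_norm_sq(lams, ts, k):
    """||h||^2 in the RKHS for h(.) = sum_i lams[i] * k(ts[i], .):
    by the reproducing property, <k(s,.), k(t,.)> = k(s, t)."""
    return sum(a * b * k(s, t) for a, s in zip(lams, ts) for b, t in zip(lams, ts))

k_bm = lambda s, t: min(s, t)   # covariance of Brownian motion on [0,1]

# h(.) = k(1, .) = min(1, .), i.e. h(t) = t: the identity path.
print(rkhs_norm_sq([1.0], [1.0], k_bm))            # k(1,1) = 1
# For Brownian motion the Cramer transform of Theorem 1 gives ||h||^2/2 = 1/2,
# which matches Schilder's rate (1/2) * integral of h'(t)^2 dt for h(t) = t.
print(rkhs_norm_sq([1.0, 1.0], [0.5, 1.0], k_bm))  # 0.5 + 0.5 + 0.5 + 1 = 2.5
```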

Large deviations
Definition 2. (LDP) Let E be a topological space, B(E) the Borel σ-algebra and (µ_n)_{n∈N} a family of probability measures on B(E); let γ : N → R^+ be a function such that γ(n) → +∞ as n → +∞. We say that the family of probability measures (µ_n)_{n∈N} satisfies a large deviation principle (LDP) on E with the rate function I and the speed γ(n) if, for any open set Θ,

lim inf_{n→+∞} (1/γ(n)) log µ_n(Θ) ≥ − inf_{x∈Θ} I(x),

and for any closed set Γ,

lim sup_{n→+∞} (1/γ(n)) log µ_n(Γ) ≤ − inf_{x∈Γ} I(x). (3)

A rate function is a lower semicontinuous mapping I : E → [0, +∞]. A rate function I is said to be good if the level sets {I ≤ a} are compact for every a ≥ 0.
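A toy example of ours illustrating Definition 2 (not from the paper): the laws µ_n = N(0, 1/n) on R satisfy a LDP with speed γ(n) = n and rate function I(x) = x²/2, and the exact Gaussian tail lets one watch the normalized log-probabilities converge to −inf_{[a,∞)} I = −a²/2:

```python
import math

# Exact tail of mu_n = N(0, 1/n): mu_n([a, inf)) = erfc(a * sqrt(n/2)) / 2.
def log_tail(n, a):
    return math.log(0.5 * math.erfc(a * math.sqrt(n / 2.0)))

a = 1.0
for n in (10, 100, 1000):
    print(n, log_tail(n, a) / n)   # approaches -a^2/2 = -0.5 from below
```

The subexponential prefactor 1/(a√(2πn)) is exactly what the normalization 1/γ(n) washes out in the limit.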
Definition 3. (WLDP) Let E be a topological space, B(E) the Borel σ-algebra and (µ_n)_{n∈N} a family of probability measures on B(E); let γ : N → R^+ be a function such that γ(n) → +∞ as n → +∞. We say that the family of probability measures (µ_n)_{n∈N} satisfies a weak large deviation principle (WLDP) on E with the rate function I and the speed γ(n) if the lower bound holds for every open set and the upper bound (3) holds for compact (instead of closed) sets.
Remark 2. We say that a family of continuous processes ((X^n_t)_{t∈[0,1]})_{n∈N} satisfies a LDP if the associated family of laws satisfies a LDP on C([0,1]).
The following remarkable theorem (Proposition 1.5 in [2]) gives an explicit expression of the Cramér transform of a continuous centered Gaussian process.

Theorem 1. Let U = (U_t)_{t∈[0,1]} be a continuous, centered Gaussian process with covariance function k, and let Λ(λ) = log E[exp⟨λ, U⟩], λ ∈ M[0,1]. Let Λ* denote the Cramér transform of Λ, that is,

Λ*(x) = sup_{λ∈M[0,1]} ( ⟨λ, x⟩ − Λ(λ) ), x ∈ C([0,1]).

Then,

Λ*(x) = (1/2) ||x||²_ℋ if x ∈ ℋ, and Λ*(x) = +∞ otherwise,

where ℋ and ||·||_ℋ denote, respectively, the reproducing kernel Hilbert space and the related norm associated to the covariance function k.
In order to state a large deviation principle for a family of Gaussian processes, we need the following definition.

Definition 4. A family of continuous processes ((X^n_t)_{t∈[0,1]})_{n∈N} is said to be exponentially tight with respect to the speed γ(n) if for every R > 0 there exists a compact set K_R ⊂ C([0,1]) such that

lim sup_{n→+∞} (1/γ(n)) log P(X^n ∉ K_R) ≤ −R.

If the means and the covariance functions of an exponentially tight family of Gaussian processes have a good limit behavior, then the family satisfies a large deviation principle, as stated in the following theorem, which is a consequence of the classical abstract Gärtner-Ellis Theorem (Baldi's Theorem, i.e. Theorem 4.5.20 and Corollary 4.6.14 in [10]) and Theorem 1.

Theorem 2. Let ((X^n_t)_{t∈[0,1]})_{n∈N} be an exponentially tight family of continuous Gaussian processes with respect to the speed function γ(n). Suppose that, for any λ ∈ M[0,1],

lim_{n→+∞} E⟨λ, X^n⟩ = 0 (6)

and that, for any λ, η ∈ M[0,1], the limit

lim_{n→+∞} γ(n) Cov( ⟨λ, X^n⟩, ⟨η, X^n⟩ ) = ∫_0^1 ∫_0^1 k̄(s,t) dλ(s) dη(t) (7)

exists for some continuous, symmetric, positive definite function k̄, which is the covariance function of a continuous Gaussian process. Then ((X^n_t)_{t∈[0,1]})_{n∈N} satisfies a large deviation principle on C([0,1]) with the speed γ(n) and the good rate function

I(x) = (1/2) ||x||²_H̄ if x ∈ H̄, and I(x) = +∞ otherwise,

where H̄ and ||·||_H̄ respectively denote the reproducing kernel Hilbert space and the related norm associated to the covariance function k̄.
A useful result which can help in investigating the exponential tightness of a family of continuous centered Gaussian processes is the following proposition (Proposition 2.1 in [17]); the required property follows from Hölder continuity of the mean and the covariance function.

Proposition 1. Let ((X^n_t)_{t∈[0,1]})_{n∈N} be a family of continuous Gaussian processes and suppose there exist constants α > 0 and C > 0 such that, for every n ∈ N and s, t ∈ [0,1],

|E[X^n_t] − E[X^n_s]| ≤ C|t − s|^α and γ(n) Var(X^n_t − X^n_s) ≤ C|t − s|^α.

Then the family ((X^n_t)_{t∈[0,1]})_{n∈N} is exponentially tight with respect to the speed function γ(n).

Conditionally Gaussian processes
In this section we introduce conditionally Gaussian processes and the Chaganty theorem, which allows us to find a LDP for families of such processes. We also recall, for the sake of completeness, some results about conditional distributions in Polish spaces. We refer to Section 3.1 in [6] and Section 4.3 in [3].
Let Y and Z be two random variables, defined on the same probability space (Ω, F, P), with values in the measurable spaces (E_1, E_1) and (E_2, E_2) respectively; let us denote by µ_1, µ_2 the (marginal) laws of Y and Z respectively, and by µ the joint law of (Y, Z) on (E_1 × E_2, E_1 ⊗ E_2). When a regular version µ_2(·|y) of the conditional distribution of Z given Y = y exists, we have µ(dy, dz) = µ_2(dz|y) µ_1(dy). In this section we will use the notation (E, B) to indicate a Polish space (i.e. a complete separable metric space) endowed with its Borel σ-field, and d_E will denote the metric on E. Regular conditional probabilities do not always exist, but they do in many cases. The following result, which immediately follows from Corollary 3.2.1 in [6], shows that in Polish spaces the regular version of the conditional probability is well defined. In what follows we always suppose that random variables take values in a Polish space.

Proposition 2. Let Y and Z be random variables with values in the Polish spaces (E_1, B_1) and (E_2, B_2) respectively. Then there exists a regular version µ_2(·|y), y ∈ E_1, of the conditional distribution of Z given Y.
Definition 5. Let (Y, Z) be a random element on the probability space (Ω, F, P), where Z = (Z_t)_{t∈[0,1]} is a real process and Y is an arbitrary random element (a process or a random variable). We say that Z is a conditionally Gaussian process if the conditional distribution of Z given Y is (almost surely) Gaussian. We denote by (Z^y_t)_{t∈[0,1]} the Gaussian process Z|Y = y.

The main tool that we will use to study the LDP for a family of conditionally Gaussian processes is Chaganty's Theorem (Theorem 2.3 in [8]). Let (E_1, B_1) and (E_2, B_2) be two Polish spaces. We denote by (µ_n)_{n∈N} a sequence of probability measures on (E, B) = (E_1 × E_2, B_1 ⊗ B_2) (the sequence of joint distributions), by (µ_{1n})_{n∈N} the sequence of the marginal distributions on (E_1, B_1), and by (µ_{2n}(·|x_1))_{n∈N}, x_1 ∈ E_1, the sequence of conditional distributions on (E_2, B_2) given by Proposition 2, i.e. such that µ_n(dx_1, dx_2) = µ_{2n}(dx_2|x_1) µ_{1n}(dx_1).
Definition 6. Let (E 1 , B 1 ), (E 2 , B 2 ) be two Polish spaces and x 1 ∈ E 1 . We say that the sequence of conditional laws (µ 2n (·|x 1 )) n∈N on (E 2 , B 2 ) satisfies the LDP continuously in x 1 with the rate function J(·|x 1 ) and the speed γ(n), or simply, the LDP continuity condition holds, if a) For each x 1 ∈ E 1 , J(·|x 1 ) is a good rate function on E 2 . b) For any sequence (x 1n ) n∈N ⊂ E 1 such that x 1n → x 1 , the sequence of measures (µ 2n (·|x 1n )) n∈N satisfies a LDP on E 2 with the (same) rate function J(·|x 1 ) and the speed γ(n).
c) J(·|·) is lower semicontinuous as a function of (x_1, x_2) ∈ E_1 × E_2.

Theorem 3 (Theorem 2.3 in [8]). Let (E_1, B_1), (E_2, B_2) be two Polish spaces. For i = 1, 2 let (µ_{in})_{n∈N} be a sequence of measures on (E_i, B_i). For x_1 ∈ E_1, let (µ_{2n}(·|x_1))_{n∈N} be the sequence of the conditional laws on (E_2, B_2). Suppose that the following two conditions are satisfied: i) (µ_{1n})_{n∈N} satisfies a LDP on E_1 with the good rate function I_1(·) and the speed γ(n).
ii) For every x 1 ∈ E 1 , the sequence (µ 2n (·|x 1 )) n∈N satisfies the LDP continuity condition on E 2 with the rate function J(·|x 1 ) and the speed γ(n).
Then the sequence of joint distributions (µ_n)_{n∈N} satisfies a WLDP on E = E_1 × E_2 with the speed γ(n) and the rate function

I(x_1, x_2) = I_1(x_1) + J(x_2|x_1).

The sequence of marginal distributions (µ_{2n})_{n∈N} defined on (E_2, B_2) satisfies a LDP with the speed γ(n) and the rate function

I_2(x_2) = inf_{x_1∈E_1} I(x_1, x_2) = inf_{x_1∈E_1} ( I_1(x_1) + J(x_2|x_1) ).

Moreover, if I(·, ·) is a good rate function, then (µ_n)_{n∈N} satisfies a LDP and I_2(·) is a good rate function.

Gaussian process with random mean and random variance

Consider the family of processes (Y^n, Z^n)_{n∈N}, where (Y^n)_{n∈N} = (Y^n_1, Y^n_2)_{n∈N} is a family of processes with paths in C^α([0,1]) × C([0,1]) and, for n ∈ N, Z^n = X^n Y^n_1 + Y^n_2, with (Y^n)_{n∈N} independent of (X^n)_{n∈N}. Suppose ((X^n_t)_{t∈[0,1]})_{n∈N} is a family of continuous centered Gaussian processes which satisfies the hypotheses of Theorem 2 and suppose that (Y^n)_{n∈N} satisfies a LDP with the good rate function I_Y and the speed γ(n). We want to prove a LDP for (Z^n)_{n∈N}.

Proposition 3. Let ((X^n_t)_{t∈[0,1]})_{n∈N} be a family of continuous Gaussian processes which satisfies the hypotheses of Theorem 2 and let y = (y_1, y_2) ∈ C^α([0,1]) × C([0,1]). Then the family ((X^n_t y_1(t) + y_2(t))_{t∈[0,1]})_{n∈N} is still a family of continuous Gaussian processes which satisfies the hypotheses of Theorem 2, with the same speed function and limit covariance function (depending only on y_1) k̄_{y_1} given by

k̄_{y_1}(s, t) = y_1(s) y_1(t) k̄(s, t). (10)

Therefore, ((X^n_t y_1(t) + y_2(t))_{t∈[0,1]})_{n∈N} also satisfies a LDP with the good rate function

J(z|y) = (1/2) ||z − y_2||²_{H̄_{y_1}} if z − y_2 ∈ H̄_{y_1}, and J(z|y) = +∞ otherwise,

where H̄_{y_1} is the RKHS associated to the covariance function defined in (10).
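The way Theorem 3 composes the two rates can be seen in a scalar toy version, our own sketch (both rate functions below are illustrative assumptions, not the paper's): take a conditional rate J(z|y) = z²/(2y²), corresponding to a centered Gaussian with conditional variance y², and a rate I_Y(y) = (y − 1)²/2 for the random scale, then minimize over y on a grid.

```python
# Toy scalar analogue of I_Z(z) = inf_y { I_Y(y) + J(z|y) }, with the
# illustrative choices I_Y(y) = (y - 1)^2 / 2 and J(z|y) = z^2 / (2 y^2).
def rate_Z(z, grid_step=1e-3):
    best = float("inf")
    y = grid_step
    while y <= 5.0:
        val = (y - 1.0) ** 2 / 2.0 + z * z / (2.0 * y * y)
        best = min(best, val)
        y += grid_step
    return best

print(round(rate_Z(0.0), 4))   # 0.0: the typical behaviour costs nothing
print(round(rate_Z(1.0), 4))   # strictly below J(1|1) = 0.5
```

The second value is strictly smaller than 0.5: a slightly atypical (larger) variance lowers the total cost of the deviation, which is precisely the effect the infimum over y captures.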
Proof. This is a simple application of the contraction principle.

Definition 7. Let (E, d_E) be a metric space, and let (µ_n)_{n∈N}, (µ̃_n)_{n∈N} be two families of probability measures on E. Then (µ_n)_{n∈N} and (µ̃_n)_{n∈N} are exponentially equivalent (at the speed γ(n)) if there exist a family of probability spaces ((Ω, F_n, P_n))_{n∈N} and two families of E-valued random variables (Z_n)_{n∈N} and (Z̃_n)_{n∈N} such that, for any δ > 0, the set {ω : d_E(Z_n(ω), Z̃_n(ω)) > δ} is F_n-measurable and

lim sup_{n→+∞} (1/γ(n)) log P_n( d_E(Z_n, Z̃_n) > δ ) = −∞.

As far as the LDP is concerned, exponentially equivalent measures are indistinguishable. See Theorem 4.2.13 in [10].
Proof. Thanks to the lower semicontinuity of ||·||²_H̄,

lim inf_{(y^n, z^n)→(y, z)} J(z^n|y^n) ≥ J(z|y),

and the LDP continuity condition follows.

Theorem 4. Consider the family of processes (Y^n, Z^n)_{n∈N}, where (Y^n)_{n∈N} = (Y^n_1, Y^n_2)_{n∈N} is a family of processes with paths in C^α([0,1]) × C([0,1]) and, for n ∈ N, Z^n = X^n Y^n_1 + Y^n_2, with (Y^n)_{n∈N} independent of (X^n)_{n∈N}. Suppose ((X^n_t)_{t∈[0,1]})_{n∈N} is a family of continuous centered Gaussian processes which satisfies the hypotheses of Theorem 2 and suppose that (Y^n)_{n∈N} satisfies a LDP with the good rate function I_Y and the speed γ(n). Then (Y^n, Z^n)_{n∈N} satisfies the WLDP with the speed γ(n) and the rate function I(y, z) = I_Y(y) + J(z|y), and (Z^n)_{n∈N} satisfies the LDP with the speed γ(n) and the rate function

I_Z(z) = inf_{y ∈ C^α([0,1])×C([0,1])} ( I_Y(y) + J(z|y) ).

Proof. Thanks to Propositions 3, 4 and 5, the family of processes (Y^n, Z^n)_{n∈N} satisfies the hypotheses of Theorem 3, therefore the theorem holds.
Ornstein-Uhlenbeck type processes with random diffusion coefficient

Consider the family of processes (Z^n)_{n∈N}, where Z^n is the solution of the stochastic differential equation

dZ^n_t = (a_0 + a_1 Z^n_t) dt + (Y^n_t / √n) dW_t, Z^n_0 = x, (12)

where x, a_0, a_1 ∈ R and (Y^n)_{n∈N} is a family of random processes independent of the Brownian motion (W_t)_{t∈[0,1]}. Suppose that (Y^n)_{n∈N} satisfies a LDP with the good rate function I_Y and the speed γ(n) = n. We want to prove a LDP for (Z^n)_{n∈N}.
Let Z^{n,y}, y ∈ C^α([0,1]), be the solution of the following stochastic differential equation:

dZ^{n,y}_t = (a_0 + a_1 Z^{n,y}_t) dt + (y(t)/√n) dW_t, Z^{n,y}_0 = x, (13)

that is,

Z^{n,y}_t = m(t) + (1/√n) ∫_0^t e^{a_1(t−s)} y(s) dW_s,

where m(t) = e^{a_1 t}( x + (a_0/a_1)(1 − e^{−a_1 t}) ). It is well known from the theory of Gaussian processes that the family ((Z^{n,y}_t)_{t∈[0,1]})_{n∈N} satisfies the LDP in C([0,1]) with the speed γ(n) = n and the good rate function

J(f|y) = (1/2) ||f − m||²_{H_y} if f − m ∈ H_y, and J(f|y) = +∞ otherwise, (16)

where H_y is the reproducing kernel Hilbert space associated to the covariance function

k_y(s, t) = e^{a_1(s+t)} ∫_0^{s∧t} e^{−2a_1 u} y²(u) du.

By uniqueness of the rate function, the two descriptions give the same rate function, so we can deduce a LDP for the family (Z^n)_{n∈N} in two different ways: first, by regarding ((Z^{n,y}_t)_{t∈[0,1]})_{n∈N} as a family of diffusions.

Remark 4. For y^n ∈ C^α([0,1]), let Z^{n,y^n} denote the solution of equation (13) with y replaced by y^n.

Proof. If y^n → y in C^α([0,1]), then for any ε > 0, eventually inf_{t∈[0,1]} |y(t)/y^n(t)|² ≥ 1 − ε, and the claim follows from the lower semicontinuity of J(·|y).
Theorem 5. Consider the family of processes (Y^n, Z^n)_{n∈N}, where (Y^n)_{n∈N} is a family of processes with paths in C^α([0,1]) and, for n ∈ N, Z^n is the solution of (12). Suppose that (Y^n)_{n∈N} is independent of the Brownian motion and satisfies a LDP with the good rate function I_Y and the speed γ(n) = n. Then (Y^n, Z^n)_{n∈N} satisfies the WLDP with the speed γ(n) = n and the rate function I(y, z) = I_Y(y) + J(z|y), and (Z^n)_{n∈N} satisfies the LDP with the speed γ(n) and the rate function

I_Z(z) = inf_{y∈C^α([0,1])} ( I_Y(y) + J(z|y) ).

Proof. Thanks to Remark 4 and Proposition 6, the family of processes (Y^n, Z^n)_{n∈N} satisfies the hypotheses of Theorem 3, therefore the theorem holds. Indeed, the family (Z^{n,y^n})_{n∈N} is exponentially tight at the speed n. Furthermore, conditions (6) and (7) hold with limit covariance function

k_y(s, t) = e^{a_1(s+t)} ∫_0^{s∧t} e^{−2a_1 u} y²(u) du,

so that (Z^{n,y^n})_{n∈N} satisfies a LDP on C([0,1]) with the rate function J(·|y) defined in (16). We have thus verified the hypotheses of Theorem 3, and the LDP for (Z^n)_{n∈N} follows.

Estimates of level crossing probability
In this section we will study the probability of level crossing for a family of conditionally Gaussian processes. In particular, we will study, as n → ∞, the probability

p_n = P( Z^n_t ≥ 1 + ϕ(t) for some t ∈ [0,1] ),

where (Z^n)_{n∈N} is a family of conditionally Gaussian processes and ϕ ∈ C([0,1]) is a fixed continuous path. In this situation the probability p_n has a large deviation limit

lim_{n→∞} (1/γ(n)) log p_n = − inf_{w∈A} I_Z(w), where A = { w ∈ C([0,1]) : w(t) ≥ 1 + ϕ(t) for some t ∈ [0,1] }.

The main reference in this section is [4]. The computation is simple: (Z^n)_{n∈N} satisfies a LDP with the rate function

I_Z(z) = inf_{y∈C} ( I_Y(y) + J(z|y) ),

where I_Y(·) is the rate function associated to the family of conditioning processes (Y^n)_{n∈N}, C is the Polish space where (Y^n)_{n∈N} takes values, and J(·|y) is the good rate function of the family of Gaussian processes (Z^{n,y})_{n∈N}. If we denote, for every t ∈ [0,1],

A_t = { w ∈ C([0,1]) : w(t) ≥ 1 + ϕ(t) }, so that A = ∪_{t∈[0,1]} A_t,

a simple calculation shows that inf_{w∈Å} I_Z(w) = inf_{w∈Ā} I_Z(w). Therefore,

lim_{n→∞} (1/γ(n)) log p_n = − inf_{0≤t≤1} inf_{w∈A_t} I_Z(w).
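As a sanity check of this asymptotics in the simplest possible case (our own benchmark, with no conditioning): for Z^n = W/√n and ϕ ≡ 0, the reflection principle gives the crossing probability exactly, and the large deviation prediction is −inf_t 1/(2 k(t,t)) = −inf_t 1/(2t) = −1/2:

```python
import math

# Benchmark without conditioning: Z^n = W / sqrt(n), phi = 0, level 1.
# Reflection principle: p_n = P(sup_{[0,1]} W_t >= sqrt(n)) = erfc(sqrt(n/2)).
# The large deviation prediction is inf_t 1/(2 k(t,t)) with k(t,t) = t,
# attained at t = 1, giving the limit -1/2.
for n in (25, 100, 400):
    log_p = math.log(math.erfc(math.sqrt(n / 2.0)))
    print(n, round(log_p / n, 4))
```

The normalized log-probabilities increase towards −1/2, confirming that the crossing cost is dominated by the most likely crossing time t = 1, where the variance is largest.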

Gaussian process with random mean and variance
For every n ∈ N, let Z^n = X^n Y^n_1 + Y^n_2 as in Section 4. In this case we know that

J(z|y) = (1/2) ||z − y_2||²_{H̄_{y_1}} if z − y_2 ∈ H̄_{y_1}, and J(z|y) = +∞ otherwise.
Therefore, we have

inf_{w∈A_t} I_Z(w) = inf_y inf_{w∈A_t} ( I_Y(y) + (1/2) ||w − y_2||²_{H̄_{y_1}} ).

The set of paths of the form

w(·) = y_2(·) + ∫_0^1 k̄_{y_1}(·, u) dλ(u), λ ∈ M[0,1],

is dense in y_2 + H̄_{y_1}, and therefore the infimum over A_t coincides with the infimum over such paths satisfying w(t) = 1 + ϕ(t). For paths of this kind, recalling the expression of their norms in the RKHS, the functional we aim to minimize is given by

I_Y(y) + (1/2) ∫_0^1 ∫_0^1 k̄_{y_1}(s, u) dλ(s) dλ(u),

with respect to the measure λ, with the additional constraint w(t) = 1 + ϕ(t), which we can write in the equivalent form

∫_0^1 k̄_{y_1}(t, u) dλ(u) = 1 + ϕ(t) − y_2(t).

This is a constrained extremum problem, and thus we are led to use the method of Lagrange multipliers. The measure λ must be such that

∫_0^1 k̄_{y_1}(s, u) dλ(u) = β k̄_{y_1}(s, t) for every s ∈ [0,1],

for some β ∈ R. We find

β = (1 + ϕ(t) − y_2(t)) / ( y_1²(t) k̄(t, t) ) and λ̄ = β δ_t.

Such a measure satisfies the Lagrange multipliers problem, and it is therefore a critical point for the functional we want to minimize. Since this is a strictly convex functional, and it remains strictly convex when restricted to a linear subspace of M[0,1], the critical point λ̄ is actually its unique point of minimum. Hence, we have

lim_{n→∞} (1/γ(n)) log p_n = − inf_y inf_{0≤t≤1} ( I_Y(y) + (1 + ϕ(t) − y_2(t))² / ( 2 y_1²(t) k̄(t, t) ) ).
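As a consistency check (our own verification of the computation just carried out), plugging the candidate measure back into the functional recovers the stated value:

```latex
\[
\bar\lambda = c\,\delta_{t},\qquad
w(s)-y_2(s)=\int_0^1 \bar k_{y_1}(s,u)\,d\bar\lambda(u)=c\,\bar k_{y_1}(s,t),
\]
so the constraint $w(t)=1+\varphi(t)$ forces
\[
c=\frac{1+\varphi(t)-y_2(t)}{\bar k_{y_1}(t,t)}
 =\frac{1+\varphi(t)-y_2(t)}{y_1^2(t)\,\bar k(t,t)},
\]
and the attained value of the quadratic functional is
\[
\frac12\int_0^1\!\!\int_0^1 \bar k_{y_1}(s,u)\,d\bar\lambda(s)\,d\bar\lambda(u)
=\frac{c^2}{2}\,\bar k_{y_1}(t,t)
=\frac{\bigl(1+\varphi(t)-y_2(t)\bigr)^2}{2\,y_1^2(t)\,\bar k(t,t)},
\]
\]which is exactly the expression minimized over $t$ and $y$ above.
```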

Ornstein-Uhlenbeck processes with random diffusion coefficient
In this case

J(f|y) = (1/2) ||f − m||²_{H_y} if f − m ∈ H_y, and J(f|y) = +∞ otherwise,

where m(t) = e^{a_1 t}( x + (a_0/a_1)(1 − e^{−a_1 t}) ), t ∈ [0,1], and H_y is the RKHS associated to

k_y(s, t) = e^{a_1(s+t)} ∫_0^{s∧t} e^{−2a_1 u} y²(u) du.

As before, it is enough to minimize the quadratic RKHS functional with respect to the measure λ, over paths of the form w(·) = ∫_0^1 k_y(·, u) dλ(u) with w = f − m, with the additional constraint w(t) = 1 + ϕ(t) − m(t). This is a constrained extremum problem, and thus we are led to use the method of Lagrange multipliers. We find

β = (1 + ϕ(t) − m(t)) / k_y(t, t) and λ̄ = β δ_t,

so that

lim_{n→∞} (1/n) log p_n = − inf_y inf_{0≤t≤1} ( I_Y(y) + (1 + ϕ(t) − m(t))² / ( 2 k_y(t, t) ) ).
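A quick numerical evaluation of this formula (our own sketch, for a constant path y ≡ 1, ϕ ≡ 0 and the illustrative parameters a_0 = 0, a_1 = 1, x = 0, so that m ≡ 0 and the closed form is available):

```python
import math

A0, A1, X = 0.0, 1.0, 0.0    # illustrative parameters: then m(t) = 0

def k_y(t, y=1.0, n_quad=2000):
    """k_y(t,t) = e^{2 a1 t} * integral_0^t e^{-2 a1 u} y(u)^2 du,
    for a constant path y, via the trapezoidal rule."""
    h = t / n_quad
    f = lambda u: math.exp(-2.0 * A1 * u) * y * y
    s = 0.5 * (f(0.0) + f(t)) + sum(f(i * h) for i in range(1, n_quad))
    return math.exp(2.0 * A1 * t) * h * s

# Inner infimum for fixed y, phi = 0: inf_t (1 + phi(t) - m(t))^2 / (2 k_y(t,t)).
# With m = 0 the infimum is at t = 1, where the variance is largest:
rate = min(1.0 / (2.0 * k_y(t / 100.0)) for t in range(1, 101))
print(round(rate, 5))   # closed form: 1 / (e^2 - 1), about 0.15652
```

Here k_y(1,1) = (e² − 1)/2, so the crossing cost for this fixed y is 1/(e² − 1); the full asymptotics then adds I_Y(y) and optimizes over y.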