On backward Kolmogorov equation related to CIR process

We consider the existence of a classical smooth solution to the backward Kolmogorov equation \begin{align*} \begin{cases} \partial_t u(t,x)=Au(t,x),&x\ge0,\ t\in[0,T],\\ u(0,x)=f(x),&x\ge0, \end{cases} \end{align*} where $A$ is the generator of the CIR process, the solution to the stochastic differential equation \begin{equation*} X^x_t=x+\int_0^t\theta \bigl(\kappa-X^x_s\bigr)\,ds+\sigma\int _0^t\sqrt {X^x_s} \,dB_s, \quad x\ge0,\ t\in[0,T], \end{equation*} that is, $Af(x)=\theta(\kappa-x)f'(x)+\frac{1}{2}\sigma^2xf''(x)$, $ x\ge0$ ($\theta,\kappa,\sigma>0$). Alfonsi \cite{Alfonsi} showed that the equation has a smooth solution with partial derivatives of polynomial growth, provided that the initial function $f$ is smooth with derivatives of polynomial growth. His proof was mainly based on the analytical formula for the transition density of the CIR process in the form of a~rather complicated function series. In this paper, for a CIR process satisfying the condition $\sigma^2\le4\theta\kappa$, we present a direct proof based on the representation of a CIR process in terms of a~squared Bessel process and its additivity property.


Introduction
Let us recall the well-known relationship between the one-dimensional stochastic differential equation (SDE)
\[
X^x_t=x+\int_0^t b\bigl(X^x_s\bigr)\,ds+\int_0^t \sigma\bigl(X^x_s\bigr)\,dB_s,\qquad t\in[0,T], \tag{1.1}
\]
and the following parabolic partial differential equation (PDE), called the backward Kolmogorov equation, with initial condition:
\[
\partial_t u(t,x)=Au(t,x),\qquad u(0,x)=f(x), \tag{1.2}
\]
where $Af=bf'+\frac{1}{2}\sigma^2 f''$ is the generator of the diffusion defined by SDE (1.1). If the coefficients $b,\sigma\colon\mathbb{R}\to\mathbb{R}$ and the initial function $f$ are sufficiently "good," then the function $u=u(t,x):=\mathbf{E}f(X^x_t)$ is a (classical) solution to PDE (1.2). From this, by Itô's formula, it follows that the random process $u(T-s,X^x_s)$, $s\in[0,T]$, is a martingale. This fact is essential in rigorous proofs of the convergence rates of weak approximations of SDEs: the higher the convergence rate, the greater the smoothness of the coefficients and of the initial function that must be assumed to obtain sufficient smoothness of the solution $u$ to (1.2). The question of the existence of smooth classical solutions to the backward Kolmogorov equation is more complicated than it might seem at first sight. General results typically require smoothness and polynomial growth of several higher-order derivatives of the coefficients; we refer to the book by Kloeden and Platen [9], Theorem 4.8.6 on p. 153.
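The Itô-formula step behind the martingale claim can be spelled out as follows (a standard sketch, assuming $u$ is smooth enough and the stochastic integral below is a true martingale):

```latex
% Apply Itô's formula to s \mapsto u(T-s, X^x_s) on [0, T]:
\begin{align*}
d\,u(T-s, X^x_s)
  &= \bigl(-\partial_t u + A u\bigr)(T-s, X^x_s)\,ds
     + \sigma(X^x_s)\,\partial_x u(T-s, X^x_s)\,dB_s \\
  &= \sigma(X^x_s)\,\partial_x u(T-s, X^x_s)\,dB_s,
\end{align*}
% since \partial_t u = A u by (1.2). Hence u(T-s, X^x_s), s \in [0, T],
% is a martingale; taking expectations at s = 0 and s = T gives
% u(T, x) = \mathbf{E} f(X^x_T).
```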
However, the coefficients of many SDEs used in financial mathematics are not sufficiently good, and therefore the general theory is not applicable. A classic example is the well-known Cox–Ingersoll–Ross (CIR) process [5], the solution to the SDE
\[
X^x_t=x+\int_0^t\theta\bigl(\kappa-X^x_s\bigr)\,ds+\sigma\int_0^t\sqrt{X^x_s}\,dB_s,\qquad t\in[0,T], \tag{1.3}
\]
with parameters $\theta,\kappa,\sigma>0$ and $x\ge0$, where the diffusion coefficient $\sigma(x)=\sigma\sqrt{x}$ has unbounded derivatives. Alfonsi [1, Prop. 4.1], using the known expression of the transition density of the CIR process by a rather complicated function series, gave an ad hoc proof that $Af(x)=\theta(\kappa-x)f'(x)+\frac{1}{2}\sigma^2xf''(x)$ is indeed the generator of the CIR process (1.3). Moreover, he proved that if $f\colon\mathbb{R}_+\to\mathbb{R}$ is sufficiently smooth with derivatives of polynomial growth, then so is the solution $u$.
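As a concrete illustration of the relation $u(t,x)=\mathbf{E}f(X^x_t)$ for the CIR process, take $f(x)=x$: then $u(t,x)=xe^{-\theta t}+\kappa(1-e^{-\theta t})$, which indeed solves $\partial_t u=\theta(\kappa-x)\partial_x u$. The sketch below checks this mean by Monte Carlo; the parameter values and the full-truncation Euler scheme are illustrative choices of ours, not taken from the paper.

```python
# Monte Carlo check of the CIR mean E[X_t^x] = x e^{-theta t} + kappa (1 - e^{-theta t}),
# i.e. the solution u(t, x) of the backward Kolmogorov equation for f(x) = x.
# Parameters and the full-truncation Euler scheme are illustrative only.
import numpy as np

def cir_euler_mean(x0, theta, kappa, sigma, T, n_steps=400, n_paths=200_000, seed=0):
    """Estimate E[X_T^x] for the CIR process by full-truncation Euler."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, x0, dtype=float)
    for _ in range(n_steps):
        xp = np.maximum(x, 0.0)  # full truncation keeps the square root well defined
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)
        x = x + theta * (kappa - xp) * dt + sigma * np.sqrt(xp) * dw
    return x.mean()

theta, kappa, sigma = 1.0, 1.0, 1.0      # sigma^2 <= 4*theta*kappa holds
x0, T = 0.5, 1.0
est = cir_euler_mean(x0, theta, kappa, sigma, T)
exact = x0 * np.exp(-theta * T) + kappa * (1 - np.exp(-theta * T))
print(est, exact)                         # the two values should be close
```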
In this paper, in the case where the coefficients of Eq. (1.3) satisfy the condition $\sigma^2\le4\theta\kappa$, we give another proof of this result that does not use the transition function. We believe that our approach will be applicable to a wider class of "square-root-type" processes for which an explicit form of the transition function is not known (e.g., the well-known square-root stochastic-volatility Heston model [7]). The main tools are the additivity property of CIR processes and their representation in terms of squared Bessel processes. More precisely, after a smooth time–space transformation, we use the representation of the solution to Eq. (1.3) in the form $\bigl(\sqrt{x}+B_t\bigr)^2+Y^{\delta-1}_t$, where $Y^{\delta-1}$
is a squared Bessel process independent of $B$. The main challenge is the negative powers of $x$ appearing in the expression of $u(t,x)=\mathbf{E}f(X^x_t)$ after differentiation with respect to $x>0$. To overcome this, we use a "symmetrization" trick (see Step 1 in the proof of Theorem 4) based on the simple fact that replacing $B_t$ by the "opposite" Brownian motion $\bar B_t:=-B_t$ does not change the distribution of $X^x_t$. Both proofs, Alfonsi's and ours, are "probabilistic." It is interesting whether there are similar results with "nonprobabilistic" proofs in the literature. Equation (1.2) seems to be a very simple equation, with coefficients analytic everywhere and the diffusion nondegenerate everywhere except a single point. However, although there is a vast literature on degenerate parabolic and elliptic equations, we could find only a few related results, which, however, do not cover the case of initial functions $f$ from $C^n_{\mathrm{pol}}(\mathbb{R}_+)$ or $C^\infty_{\mathrm{pol}}(\mathbb{R}_+)$ (see the notation below); instead, the boundedness of $f$ and its derivatives is assumed as a rule. For example, the general Theorem 1.1 of Feehan and Pop [6] (see also Cerrai [4]) in our particular (one-dimensional) case gives an a priori estimate in terms of the corresponding Hölder and weighted Hölder space supremum norms.
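The symmetrization trick can be sketched in the simplest integer-dimension setting of Lemma 1 below (one-dimensional Brownian part started at $\sqrt{x}$, with $Y$ the independent squared Bessel part); this is only an illustration of the cancellation, not the exact formula used in Step 1:

```latex
% Since \bar B := -B is again a Brownian motion,
\[
\mathbf{E} f\bigl((\sqrt{x}+B_t)^2 + Y_t\bigr)
  = \tfrac12\,\mathbf{E}\Bigl[f\bigl((\sqrt{x}+B_t)^2+Y_t\bigr)
      + f\bigl((\sqrt{x}-B_t)^2+Y_t\bigr)\Bigr].
\]
% Differentiating in x and using
%   \partial_x (\sqrt{x} \pm B_t)^2 = 1 \pm B_t/\sqrt{x},
% the two 1/\sqrt{x}-terms enter with opposite signs, so the negative
% power of x cancels inside the symmetrized expectation.
```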
The solution to the SDE
\[
Y_t=x+\delta t+2\int_0^t\sqrt{Y_s}\,dW_s,\qquad x\ge0,\ \delta\ge0,
\]
is called a squared Bessel process with dimension $\delta$, starting at $x$ ($\mathrm{BESQ}^\delta_x$ for short). We further denote it by $Y^\delta_t(x)$ or $Y^\delta(t,x)$, and also $Y^\delta_t:=Y^\delta_t(0)$.

Lemma 1 (See [8], Section 6.1). Let $B=(B^1,B^2,\dots,B^n)$ be a standard $n$-dimensional Brownian motion, $n\in\mathbb{N}$. Then the process
\[
\sum_{i=1}^n\bigl(z_i+B^i_t\bigr)^2,\qquad t\ge0,
\]
where $z=(z_1,\dots,z_n)\in\mathbb{R}^n$, coincides in distribution with $Y^n_t(\|z\|^2)$, that is, with a $\mathrm{BESQ}^n_x$ random process starting at $x=\|z\|^2=\sum_{i=1}^n z_i^2$.

We will frequently use differentiation under the integral sign (in particular, under the expectation sign). Without special mention, this will be justified by Lemma 3, which seems to be a folklore theorem; we refer to the technical report [3].
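A quick numerical sanity check of Lemma 1: since $\mathbf{E}\sum_i(z_i+B^i_t)^2=\sum_i(z_i^2+t)=x+nt$, the sample mean of $|z+B_t|^2$ should match the $\mathrm{BESQ}^n_x$ mean $x+nt$. The concrete $z$ and $t$ below are illustrative only.

```python
# Sanity check of Lemma 1: for an n-dimensional Brownian motion B,
# sum_i (z_i + B_t^i)^2 has the BESQ^n law started at x = |z|^2;
# in particular its mean is x + n*t.
import numpy as np

rng = np.random.default_rng(1)
z = np.array([0.6, -0.8, 1.0])          # x = |z|^2 = 2.0, n = 3
t, n_paths = 0.7, 400_000
B_t = rng.normal(0.0, np.sqrt(t), size=(n_paths, z.size))
Y = ((z + B_t) ** 2).sum(axis=1)        # one sample of |z + B_t|^2 per path
x = float((z ** 2).sum())
print(Y.mean(), x + z.size * t)         # the two values should be close
```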
Lemma 3 (Differentiation under the integral sign; see [3], Thm. 4.1). Let $(E,\mathcal{A},\mu)$, $X$, and $f$ be as in Definition 2. Suppose that $f$ has partial derivatives $\frac{\partial f}{\partial x_i}(x,\omega)$ for all $(x,\omega)\in X\times E$ and that both $f$ and $\frac{\partial f}{\partial x_i}$ are locally integrable in $X$. Then
\[
\frac{\partial}{\partial x_i}\int_E f(x,\omega)\,\mu(d\omega)=\int_E\frac{\partial f}{\partial x_i}(x,\omega)\,\mu(d\omega)
\]
for almost all $x\in X$. In particular, if both sides are continuous in $X$, then the equality holds for all $x\in X$.
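For a Gaussian expectation, Lemma 3 can be illustrated numerically: with $f(x,\omega)=(x+\omega)^2$ and $\omega\sim N(0,1)$, $\frac{d}{dx}\mathbf{E}(x+W)^2=\mathbf{E}\,2(x+W)=2x$. The example (our own toy case, not from the paper) compares a finite difference of the integral with the integral of the derivative.

```python
# Illustration of differentiation under the expectation sign:
# F(x) = E[(x + W)^2] with W ~ N(0,1), so F'(x) = E[2(x + W)] = 2x.
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=500_000)

def F(x):                               # Monte Carlo version of F(x) = E[(x + W)^2]
    return ((x + W) ** 2).mean()

x0, h = 1.3, 1e-3
fd = (F(x0 + h) - F(x0 - h)) / (2 * h)  # finite difference of the integral
inner = (2 * (x0 + W)).mean()           # integral of the derivative
print(fd, inner, 2 * x0)                # all three should be close
```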
Notation. As usual, $\mathbb{N}$ and $\mathbb{R}$ are the sets of natural and real numbers, $\mathbb{R}_+:=[0,\infty)$, and $\mathbb{N}_0:=\mathbb{N}\cup\{0\}$. We denote by $C^n_{\mathrm{pol}}(\mathbb{R}_+)$ the set of $n$ times continuously differentiable functions $f\colon\mathbb{R}_+\to\mathbb{R}$ for which there exist constants $C_i\ge0$ and $k_i\in\mathbb{N}$, $i=0,1,\dots,n$, such that
\[
\bigl|f^{(i)}(x)\bigr|\le C_i\bigl(1+x^{k_i}\bigr),\qquad x\ge0,
\]
for all $i=0,1,\dots,n$. Following Alfonsi [2], if these estimates hold, then the sequence of constants $\{(C_i,k_i)\}$ is said to be good for $f$. Finally, by $C\ge0$ and $k\in\mathbb{N}$ we denote constants that depend only on a good set of a function $f$ and may vary from line to line.

Existence and properties of a solution to backward Kolmogorov equation related to CIR process
Our main result is a direct proof of the following theorem.

Theorem 4 (cf. Alfonsi [1], Prop. 4.1). Let $X_t(x)=X^x_t$ be a CIR process with coefficients satisfying the condition $\sigma^2\le4\theta\kappa$ and starting at $x\ge0$. Let $f\in C^q_{\mathrm{pol}}(\mathbb{R}_+)$ for some $q\ge4$. Then the function $u(t,x):=\mathbf{E}f(X^x_t)$ is smooth, and its derivatives in $x$ satisfy the polynomial-growth estimate (3.1).

Proof. We first focus on differentiability in $x\ge0$. By Lemma 2 the process $X_t(x)$ can be reduced, by a space–time transformation, to a $\mathrm{BESQ}^\delta$ process. With an abuse of notation, we further write $T$ instead of $\widetilde T$. We proceed by induction on $l$.
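The explicit transformation of Lemma 2 is not reproduced above; the standard CIR–BESQ relation (the formula below is our reconstruction, with $\delta:=4\theta\kappa/\sigma^2$ and $\tau(t):=\frac{\sigma^2}{4\theta}(e^{\theta t}-1)$) reads $X^x_t \stackrel{d}{=} e^{-\theta t}\,Y^{\delta}_{\tau(t)}(x)$, and it is at least consistent with the known CIR mean:

```latex
% E Y^{\delta}_s(x) = x + \delta s for a BESQ^{\delta} process, hence
\[
\mathbf{E}\bigl[e^{-\theta t} Y^{\delta}_{\tau(t)}(x)\bigr]
  = e^{-\theta t}\bigl(x + \delta\,\tau(t)\bigr)
  = x e^{-\theta t} + \kappa\bigl(1 - e^{-\theta t}\bigr)
  = \mathbf{E} X^x_t .
\]
% The standing assumption sigma^2 <= 4*theta*kappa is exactly delta >= 1.
```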
Step 1. Let $l=1$. First, suppose that $\delta=n\in\mathbb{N}$, and apply Lemma 1. We now estimate $P(t,x)$ and $R(t,x)$ separately. A well-known inequality yields preliminary estimates, and, as a consequence, a bound for $P(t,x)$ in which the constant $C$ depends only on $C_1$, $k_1$, $T$, and $n$. At this point, we need the following technical lemma, which we prove in the Appendix.

Lemma 5.
For a function $f\colon\mathbb{R}_+\to\mathbb{R}$, define the function $g=g(x;a,b)$. If $f\in C^q_{\mathrm{pol}}(\mathbb{R}_+)$ for some $q=2l+1$ ($l\in\mathbb{N}$), then $g$ extends to a continuous function on $\mathbb{R}_+\times\mathbb{R}\times\mathbb{R}_+$ such that $g(\cdot\,;a,b)\in C^l_{\mathrm{pol}}(\mathbb{R}_+)$ for all $a\in\mathbb{R}$ and $b\in\mathbb{R}_+$. Moreover, there exist constants $C\ge0$ and $k\in\mathbb{N}$, depending only on a good set $\{(C_i,k_i),\ i=0,1,\dots,q\}$ for $f$, such that the corresponding polynomial-growth estimate holds.
Now consider $R(t,x)$. Applying Lemma 5 with $f'$ instead of $f$ (and thus with $g_1$ instead of $g$), we obtain a bound in which the constant $C$ clearly depends only on $C_2$, $k_2$, $T$, and $n$.
Combining the obtained estimates, we finally get a bound with $k=\max\{k_1,k_2\}$ and a constant $C$ depending only on $C_1$, $C_2$, $k_1$, $k_2$, $T$, and $n$. Now consider the general case where $\delta\ge1$, $\delta\notin\mathbb{N}$. Note that we consider the general case only for $l=1$ because the reasoning for higher-order derivatives is the same.

Combination of estimates (3.4) and (3.5) leads to an estimate with a constant $C$ depending only on $C_1$, $k_1$, $T$, and $n$. By Lemma 5, similarly to estimate (3.8), we obtain a bound with a constant $C$ depending only on $C_2$, $k_2$, $T$, and $n$. Combining the last two estimates, we get the desired bound with $k=\max\{k_1,k_2\}$ and a constant $C$ depending only on $C_1$, $C_2$, $k_1$, $k_2$, and $T$.
Step 2. Let $l=2$. From Step 1 we obtain a decomposition of the second derivative into $P_2(t,x)$ and $R_2(t,x)$. From estimate (3.9) with $f$ replaced by $f'$ we bound $P_2(t,x)$ with a constant $C$ depending only on $C_1$, $C_3$, $k_1$, $k_3$, $T$, and $n$. For $R_2(t,x)$, applying Lemma 5 once more, with $g_1$ instead of $g$, we get a bound with constants $C$ and $k\in\mathbb{N}$ depending only on $\{(C_i,k_i),\ i=1,2,3,4\}$, $T$, and $n$.
Combining the obtained estimates, we finally get the required bound, where the constants $C$ and $k\in\mathbb{N}$ depend only on $\{(C_i,k_i),\ i=1,2,3,4\}$, $T$, and $n$.
Step 3. Now we continue by induction on $l$. Suppose that estimate (3.1) is valid for $l=m-1$; let us show that it remains valid for $l=m$. The arguments are similar to those in the case $m=2$ (Step 2). Similarly to estimates (3.6) and (3.10), we bound $P_m(t,x)$ with a constant $C$ depending only on $\{(C_i,k_i),\ i=1,3,\dots,2m-1\}$, $T$, and $n$. For $R_m(t,x)$, applying Lemma 5 with $g_1$ instead of $g$, we get a bound with constants $C$ and $k\in\mathbb{N}$ depending only on $\{(C_i,k_i),\ i=1,\dots,2m\}$, $T$, and $n$. Combining the obtained estimates completes the induction step, again with constants $C$ and $k\in\mathbb{N}$ depending only on $\{(C_i,k_i),\ i=1,\dots,2m\}$, $T$, and $n$. Thus, Theorem 4 is proved for all $l\in\mathbb{N}$.
Then we have an estimate in which $C$ depends on $C_{i+1}$ and $k_{i+1}$ only. Now let us concentrate on the derivatives of $g=g_0$ with respect to $x$. Differentiating, note that the term at the negative power of $x$, that is, at $1/\sqrt{x}$, vanishes, since the corresponding integral equals zero. In particular, the function $g=g_0$ is continuously differentiable at $x=0$ and thus belongs to $C^1(\mathbb{R}_+)$, since $g'_0(0;a,b)=\lim_{x\downarrow0}g'_0(x;a,b)$ by the Lagrange theorem. If, moreover, $f\in C^5_{\mathrm{pol}}(\mathbb{R}_+)$ satisfies estimates (2.4) for $i\le5$, then we have the corresponding estimate for $g'_0$, where $C$ and $k$ depend only on $C_2$, $C_3$, $k_2$, $k_3$, and $A:=a^2+b$.
Thus, we have proved that $g=g_0\in C^1_{\mathrm{pol}}(\mathbb{R}_+)$, provided that $f\in C^5_{\mathrm{pol}}(\mathbb{R}_+)$. (In fact, for estimate (A.3), it suffices that $f\in C^3_{\mathrm{pol}}(\mathbb{R}_+)$.) The constants $C>0$ and $k\in\mathbb{N}$ here depend only on $C_i$ and $k_i$, $i=1,2,3$, and, in particular, on a good set of the function $f\in C^5_{\mathrm{pol}}(\mathbb{R}_+)$. Now, let us proceed to the second derivative of $g_0$. From Eq. (A.2) we see that, again, the term at the negative power of $x$, that is, at $1/\sqrt{x}$, vanishes, since the corresponding integral equals zero. In particular, again by the Lagrange theorem, $g_0$ is twice continuously differentiable on the whole half-line $\mathbb{R}_+$ since the finite limit $\lim_{x\downarrow0}g''_0(x;a,b)$ exists. If, moreover, $f\in C^5_{\mathrm{pol}}(\mathbb{R}_+)$ satisfies estimates (A.1), then we have the corresponding estimate for $g''_0$, with constants $C>0$ and $k\in\mathbb{N}$ depending only on $C_i$ and $k_i$, $i=3,4,5$, and, in particular, on a good set of the function $f\in C^5_{\mathrm{pol}}(\mathbb{R}_+)$. For $l>2$, we proceed similarly. For $f\in C^{2l+1}_{\mathrm{pol}}(\mathbb{R}_+)$, denote by $F_{j,m}$ the auxiliary integrals used below. Then, in addition to the first two derivatives
\[
g'_0(x;a,b)=\tfrac{a}{2}F_{0,2}+4a^2F_{1,3}
\quad\text{and}\quad
g''_0(x;a,b)=\tfrac{a}{2}F_{0,3}+8a^2F_{1,4}+8a^4F_{2,5},
\]
we obtain an expression (A.5) for the higher derivatives with some constants $c_{j,l}$, $0\le j\le l$. Note that, as before, on the right-hand side of Eq. (A.5) there are no negative powers of $x$, so that $g_0$ is $l$ times continuously differentiable on the whole half-line $\mathbb{R}_+$, provided that $f\in C^{2l+1}_{\mathrm{pol}}(\mathbb{R}_+)$. Moreover, as before, from (A.5) we get the estimates
\[
\bigl|g^{(r)}_0(x;a,b)\bigr|\le C|a|\bigl(1+A^k+x^k\bigr),\qquad x\ge0,\ r=0,1,\dots,l,
\]
where the constants $C>0$ and $k\in\mathbb{N}$ depend only on $C_i$ and $k_i$, $i=0,\dots,2l+1$, that is, only on a good set of the function $f\in C^{2l+1}_{\mathrm{pol}}(\mathbb{R}_+)$.