Stable L\'{e}vy diffusion and related model fitting

A fractional advection-dispersion equation (fADE) has been advocated for heavy-tailed flows where the usual Brownian diffusion models fail. A stochastic differential equation (SDE) driven by a stable L\'{e}vy process gives a forward equation that matches the space-fractional advection-dispersion equation and thus gives the stochastic framework of particle tracking for heavy-tailed flows. For constant advection and dispersion coefficient functions, the solution to such SDE itself is a stable process and can be derived easily by least square parameter fitting from the observed flow concentration data. However, in a more generalized scenario, a closed form for the solution to a stable SDE may not exist. We propose a numerical method for solving/generating a stable SDE in a general set-up. The method incorporates a discretized finite volume scheme with the characteristic line to solve the fADE or the forward equation for the Markov process that solves the stable SDE. Then we use a numerical scheme to generate the solution to the governing SDE using the fADE solution. Also, often the functional form of the advection or dispersion coefficients are not known for a given plume concentration data to start with. We use a Levenberg--Marquardt (L-M) regularization method to estimate advection and dispersion coefficient function from the observed data (we present the case for a linear advection) and proceed with the SDE solution construction described above.


Introduction
The usual hydrological model for contamination/tracer transport through a porous media is given by a second order advection-dispersion equation (ADE) of the form where c(x, t) is the tracer concentration at time t, location x, v is the drift velocity and D is related to the diffusivity of the media [4,15]. The probabilistic approach to describe this flow from a mesoscopic view is given by the hypothesis that the path of a randomly chosen tracer particle is a Markov process that solves a stochastic differential equation (SDE) driven by Brownian motion. The basis of this hypothesis is that the conditional probability density function of this Markov process solves a forward equation of the same form as the ADE [7,6].
However, for some heavy-tailed flows, the second order diffusion model can be inadequate. For such cases a model called the fractional advection-dispersion equation or fADE of the following form has been proposed [27]: For the current discussion, we will consider a one-dimensional concentration c(x, t) where x denotes the distance from the origin of the plume, v is the drift velocity and D is a function that changes with the diffusion of the tracer. The fractional differentiation order α ∈ (1, 2) controls the tail of the flow. Here the fractional derivative of order α for any function f is defined as in [3] by: The negative fractional derivative is given by If we assume that the diffusion coefficient D is location invariant, then the fADE can be associated with an SDE driven by an α-stable Lévy process X t 1 of the form: Following the Brownian diffusion argument, in the heavy-tailed diffusion model, we assume a random particle's position at time t is given by the process Y t that solves SDE (1.2). It can be shown that Y t is a Markov process [1]. Let us denote the transition probability density function of Y t by p y0 (y, t), i.e. P (Y t ∈ A|Y 0 = y 0 ) = A p y0 (y, t)dy. For the initial distribution µ(u) the pdf of Y t is given by p(y, t) = p u (y, t)µ(u)du, (1.3) e.g., for a ground water tracer concentration modeling it is reasonable to assume that all particles start at location y 0 at time t = 0 hence µ(u) = I(u = y 0 ). If 1 < α < 2, it can be shown [10] that p(y, t) solves the forward equation: ∂p(y, t) ∂t = − ∂ ∂y a(y)p(y, t) + (1 + β) 2 − cos πα 2 −1 · ∂ α ∂y α b α (y)p(y, t) (1.4) Typically the coefficient functions a and b depend on t via Y t as in (1.2) hence the terms in the forward equation are expressed as a(y) and b(y), where y is supposed to be in the range space of Y t . However, without loss of generality, the functions can also be considered as a mapping from T × R → R as a(y, t) and b(y, t) and the associated forward equation will remain the same. Thus for a location invariant D, the fADE in (1.1) can be written as: which essentially has the same form as the forward equation given in (1.4) This shows that in a heavy-tailed plume that follows fADE (1.5), the position Y t of a randomly chosen tracer particle at time t solves SDE (1.2). Choosing β = 1 will allow the particle transition in forward or backward direction.
In most practical tracking scenarios, the plume data consists of observed tracer concentration values c(x, t) over a range of locations at certain time points. We write p(y, t) = K t c(y, t), where K t is a suitable scale parameter that adjusts the total mass for c(x, t) so that R p(y, t)dy = 1 for each t.
In case v(·) and D(·) are constants, (1.2) reduces to Y t = at + bX t with a = v, b = [− cos( πα 2 )D] 1/α and consequently, Y t itself is a stable process. The parameters of this stable process then can be estimated using the observed data (see [11] for details). However, when the coefficient functions are not constants, the solution to the SDE in (1.2) may not have a closed form.
In this paper, we present a numerical method to solve Y t through the fitted fADE equation using the observed tracer concentration data. Also, often for a groundwater contamination modeling problem the functional form of the advection or dispersion coefficients are not known to start with. In that case, the proper advection and dispersion coefficient functions are needed to be estimated. This presents an inverse problem. We formulate this inverse problem as an optimization problem and develop a Levenberg-Marquardt (L-M) regularization method to obtain the proper advection and dispersion coefficient function from the observed data (the case for a linear drift function is presented here). Then we proceed to generate Y t using these fitted coefficient functions. The detailed methodology is described in Section 2. Section 3 provides an illustrative example. Section 4 includes concluding remarks.

Methodology
We propose a method for generating a weak solution for the SDE given in (1.2). The core technique is the standard probability integral transformation. Suppose the probability density function (pdf) of the solution process Y t at a given time point t is given by p(y, t) and the cumulative distribution function (CDF) is defined by , a random observation of Y t , say y, can be generated by F −1 Yt (u) = y where u is a random observation from Uniform(0, 1). For the present problem, closed form of p(t, y) is not available, and we use point-wise numerical approximations to estimate p(y, t) for any fixed t and fixed y. Therefore we can only use a numerical scheme to approximate F Yt and F −1 Yt .
2.1 Simulating a solution process for an SDE driven by a heavy-tailed stable process For a heavy-tailed plume where the particle path can be modeled by the SDE in (1.2), the forward equation is of the form (1.5). To generate the solution to this SDE we can consider two scenarios: (i) we may want to solve an SDE with a given form of coefficient functions v(·) and D(·) and driven by a given α-stable process; (ii) a more practical problem when only the tracer concentrations at different time points over a range of locations are observed from a plume, while the coefficient functions need to be estimated along with the parameters of the driving process X t in (1.2) from the observed concentration values.
In either case, first, we find p(t, y) that solves the forward equation (1.4) with given or estimated coefficient functions and then generate Y t using these p(t, y)'s. Y t Simulation Steps: Step 1: Numerical Solution for p(t, y): Assuming the probability density p(y, t) = K t c(y, t), we estimate K t , α and β from the observed data that assumed to satisfy fADE (1.5). p(y, t) can be solved numerically using a discretization scheme from the forward equation (1.4). In case the coefficient functions in (1.4) are not known, an inverse problem optimization technique can be used for estimating the parameters. See Section 2.2 for details of this step.
In either case the discretization method provides only point-wise approximation of p(y, t) values for given t and y and not a closed functional form.
Step 2: Generating Y t : To generate Y t for a fixed t using an inverse CDF transformation, we present a straightforward scheme with a trapezoidal rule to approximate F Yt and F −1 Yt . Approximate Y t 's can be simulated using p(y, t)'s from step 1 as follows: (a) Choose suitable grid y 0 < y 1 < · · · < y n and solve for p(y i , t) as mentioned in step 1. The endpoints y 0 and y n can be chosen so that p(y, t) is small or negligible for y < y 0 or y > y n . The fitted coefficient and parameters from the observed data in step 1 can be used for any y i 's.
(c) For a randomly generated Uniform(0, 1) observation u, find its placement in the grid, i.e. y r , y r+1 such thatF Yt (y r ) ≤ u <F Yt (y r+1 ), for some r ∈ {0, 1, . . . , n}. A way of generating associated random observation of Y t or F −1 Yt (u) (approximated) asŷ is described below: In case u <F Yt (y 0 ) setŷ = y 0 and if u ≥F Yt (y n ) set y = y n .
(d) To generate N i.i.d approximated random observationsŷ 1 ,ŷ 2 , . . . ,ŷ N of Y t , start with randomly generated independent Uniform(0, 1) observations u 1 , u 2 , . . . u N and then repeat step (c). Note that the efficiency of this approximation will depend on the spacing of the chosen grid since we are actually generating sample from distributionF Yt here andF Yt → F Yt as max 0≤i≤n |y i+1 − y i | → 0 following the convergence property of the standard trapezoidal rule.

A formulation for a numerical solution of the forward equation
In this subsection, we will present a finite volume scheme with the characteristic line for solving a forward equation of the form (1.4). To incorporate a discretized finite volume scheme with proper boundary condition, we re-parameterize the forward equation with fractional order 2 − λ with 0 < λ < 1 (consistent with the construction described in [25]) as follows: (2.1) Essentially, compared to (1.4) the fractional derivative order α = 2 − λ, while γ = 1+β 2 indicates the relative weight of forward versus backward transition. x l D −λ x and x D −λ xr represent the left and right fractional integral operators defined as To comply with the boundary conditions, x = x l and x = x r are set to be the inflow and outflow boundaries respectively [25], while a(x l ) and a(x r ) are assumed to be non-negative.

The discretized finite volume scheme and the accumulation terms
Let us define a partition P t on the time interval [0, T ], and a partition P x on the space Let y = r(θ;x,t) be a continuous and piecewise-smooth curve that passes through the pointx at timet, such that r(θ; x i−1/2 , t n ) and r(θ; x i+1/2 , t n ) do not meet each other during the time period [t n−1 , t n ]. We define a space-time control volume Ω n i by extending the cell Assuming the prism Ω n i does not intersect the boundary x = x l and x = x r of the domain during the time period [t n−1 , t n ], let x * = r(t n−1 ; x, t n ) be the foot at time t n−1 of the curve with head x at time t n .
Integrating equation (2.1) over the control volume Ω n i we get Without loss of generality, the accumulation term in (2.4) can be evaluated by assuming and accumulation term can be re-written as follows: (2.7) Here the notation t(x; x i−1/2 , t n ) represents the time instant that the curve r(t; , x i−1/2 , t n ) = x. The notation t(x; x i+1/2 , t n ) is defined similarly. A simple calculation of the second term on the right-hand side of (2.7) yields The first and third terms on the right-hand side of (2.7) are integrated as follows (2.10) (2.11) The derivation shows that (2.11) does not depend on the assumption (2.4). Now we just set the space-time boundaries r(t; x i±1/2 , t n ) of the control volume Ω n i to be the characteristic curves, which are defined by the initial value problem of the ordinary differential equation That is, because the characteristics r(t; x i±1/2 , t n ) are assumed to be tracked exactly and hence the residual advection term vanishes naturally. Substituting (2.11) for the accumulation term in (2.5), we obtain a locally conservative reference equation on an interior space-time control volume Ω n i as follows: which can be approximated as (2.14) with the Lagrangian interface fluxes Note that the interface fluxes F (p, b)(r(t; x i±1/2 , t n ), t) are defined across the space-time boundary r(t; x i−1/2 , t n ) and r(t; x i+1/2 , t n ) of the space-time control volume Ω n i . To discretize equation (2.1), we further let {φ i } I i=1 be a set of hat functions such that φ i (x i ) = 1 and φ i (x j ) = 0 for j = i. Hence the finite volume approximation p h to the true solution p can be expressed as and the finite volume scheme on the interval [x i−1/2 , x i+1/2 ] for i = 1, . . . , I − 1 has the form where, [z i,j ] I−1 i,j=1 is the coefficient matrix of the fractional term, and is given by Assuming the trial functions p(x, t n ) are chosen to be piecewise linear functions on [x l , x r ] with respect to the fixed spatial partition P x , we evaluate the accumulation term at time step t n in the finite volume scheme (2.16) analytically as In addition, we can compute x * i−1/2 and x * i+1/2 at time step t n−1 via a backward approximate characteristic tracking. Since the trial function p(x, t n−1 ) is also piecewise linear with respect to the fixed spatial partition P x at time t n−1 , we can evaluate the accumulation term at time step t n−1 analytically. Because the accumulation term at time step t n−1 affects only the right-hand side of the finite volume scheme (2.16), the scheme retains a symmetric and positive-definite coefficient matrix. Furthermore, the finite volume scheme (2.16) is locally conservative, even if the characteristics are computed approximately.

The fractional diffusion term and the stiffness matrix
The finite volume scheme (2.16) appears similar to the one for the canonical secondorder diffusion equation, but has a fundamental difference. Although the hat functions φ j have local support, (2.2) reveals that x l D −λ x φ j and x D −λ xr φ j have global support. Therefore, the stiffness matrix Z := [z i,j ] I−1 i,j=1 is a full matrix, which requires O(N 2 ) of memory to store. Numerical schemes for space-fractional differential equations were traditionally solved by the Gaussian type direct solvers that require O(N 3 ) of computations [13,14,18,19]. In recent years, there have been some other notable developments in methods for solving the algebraic linear systems arising from discretization of fractional-order problems, especially for space dimension higher than one (see [9]); we plan to explore them in our future work.
To simplify the computation of the diffusion term, we have to explore the structure of the stiffness matrix Z [17,26]. The entries of the stiffness matrix Z are presented below.
Its diagonal entries are given by: Whereas, the sub-triangular entries of the matrix Z are given by: The super-triangular entries of matrix Z can be also derived as for 1 ≤ i ≤ I − 2, and for 1 ≤ j ≤ I − 3 and j + 2 ≤ i ≤ I − 1.

Estimation of coefficient function
We present the case where the drift coefficient function in fADE (1.5) (and therefore in equation (2.1)) is assumed to be a piecewise linear function of x, where parameters a 0 , a 1 , a 2 and a 3 are to be estimated from the observed concentration data. The main idea on this part is to obtain certain measurements through physical or mechanical experiments, and then use the data to calibrate these parameters in the fADE (see [16]). This is an inverse problem: based on the initial guess p 0 of the equation (2.1), and certain observation (or desired) data such as values of the state variable g at the final time, we attempt to seek for the constant parameters a 0 , a 1 , a 2 and a 3 from the governing differential equation (2.1).
We formulate the inverse problem as an optimization and develop a Levenberg-Marquardt (L-M) regularization method (see, [12,20,24]) to iteratively identify the parameter. It is known that the inverse problem usually requires multiple runs of the forward problem. Considering the computational cost of the forward problem is already high, the inverse problem could become infeasible. Hence we propose an optimization algorithm for the parameter estimation.
Here we only present the details of the fitting of a 0 and a 1 . The other two parameters a 2 and a 3 can be estimated similarly. The parameter identification of a 0 and a 1 can be formulated as follows: let α := {a 0 , a 1 }, then to find α inv that satisfies In case the data is available for time points t 1 , t 2 , . . . , t R we rewrite equation (2.25) as here x t k ,i is the ith observed location at time t k and w k is the weight assigned for the sample set available for the same time point. (For example the data set used in Section 3 has R = 3).
An iteration algorithm such as the Newton method with line searching could be employed to find the solution to (2.26). Basically, the Newton algorithm for minimizing (2.26) uses the first and second derivatives of the objective function G(α): where k represents the kth iteration. It is easy to check that (2.27) is equivalent to solve and r k = r 1 ; r 2 ; . . . ; r R T , with for j = 1, . . . , M i , and i = 1, 2, . . . , R. Note that in practice, we always use the finite difference p(x ti,j , t i ; α + δ) − p(x ti,j , t i ; α) δ with a small enough δ to approximate the derivatives in (2.29).
However, the Newton method may fail to work because of J T k J k may be nearly zero. Therefore, the search direction d k := −J T k r k /J T k J k may not point in a decent direction.
A common technique to overcome this kind of problem is the L-M algorithm (or Levenberg algorithm since a single parameter case is considered in this paper), which modifies (2.27) by the following formulation where ̺ k is a positive penalty parameter, and I 2 is a 2×2 identity matrix. The method coincides with the Newton algorithm when ̺ k = 0; and it gives a step closer to the gradient descent direction when ̺ k is large.
(2) Compute J k and r k , and update the search direction (3) Determine the search step ρ m by Armijo rule: where m is the smallest non-negative integer.
Algorithm A summarizes the proposed parameter estimation steps through the inverse problem approach which includes the details of the L-M method. In particular, the Armijo rule [2] in the third step, known as one of the inexact line search techniques, is imposed to ensure the objective function G has sufficient descent. Other rules and the related convergence theory can be found in [24].

An illustrative example
In this section, we use a heavy-tailed groundwater tracer concentration data to illustrate the methodology described in Section 2. The data comes from natural-gradient tracer tests conducted at the MacroDispersion Experimental (MADE) site at Columbus Air Force Base in northeastern Mississippi, precisely the MADE-2 tritium plume data [8]. The data consists of the maximum concentration measured in vertical slices perpendicular to the direction of plume travel, at day 27, day 132, day 224, and day 328 days after injection (see Figure 4 in [5]).
In [11], an fADE with constant drift and diffusion coefficient was fitted to the same data. With constant coefficient functions, the resulting solution process Y t in the SDE (1.2) itself is a stable process. The parameters of the fitted stable process were estimated using the least-square method.
Here we are considering an fADE that may have variable coefficients. For the fits presented in this section, the diffusion coefficient is still assumed to be location invariant, but we use a piecewise linear drift function of the form (2.24). The choice of x l , x m and x r are subjective but should comply with the boundary conditions. x l = 0 (starting location), x r = 300 (beyond the observed location range) and x m = 9.375 (a rough mid-location of the observed range) were set for the current simulation. We used the fitted parameter values from the constant coefficient model [11] as initial values for the simulation; these parameter values are included in Appendix A. Then we used the data to estimate the linear drift and to obtain p(x, t) that solves (2.1) by the methodology presented in Section 2. c(x, t) = Kp(x, t) gives the fitted concentration that solves fADE associated with (2.1).
The concentrations fitted by the proposed method for the MADE site data at day 224 and day 328 are included here.  Fitted parameters for day 328 MADE site data: a 0 = 0.105, a 1 = 0.00030, a 2 = 0.0005, a 3 = 0.00018, b = 0.2233695, λ = 0.79, γ = 0.9999, K = 37195.05. Comments: 1. The fit above uses piecewise linear drift velocity and location invariant diffusion coefficient as opposed to constant drift but same diffusion coefficient used in [11]. The fitted curves in Figures 2 and 4 show a better tail fit here compared to similar plots in [11]. The fractional order α for the fADE with constant coefficient was fitted as 1.09 and 1.05 for day 224 and day 328 respectively in [11].
Here the fitted fractional order α(= 2 − λ) are 1.2 and 1.19 for day 224 and day 328 respectively. Since the new fit improves on the fit in [11], this indicates that a better drift velocity fit may lead to an adjustment of the fractional order to improve the tail fit in the fADE. Ideally, the drift velocity function might be mechanically estimated from a properly designed field experiment while the    diffusion coefficient and the fractional order can be estimated as described in this paper for an even better fADE fit.
2. Using simulation step 2, iid sample of size 1000 for Y t was generated at t = day 224 and t = day 328. Histograms for the generated samples in Figures 5  and 6 are fairly consistent with the observed concentration (or observed density) data. These figures indicate that although we had adopted a numerical approach to generate Y t 's, the simulation method is efficient enough to follow the underlying probability distribution.

Discussion
The trapezoidal rule provides a very simple and quick numerical approximation of the CDF of Y t in (1.2) and for its inverse in the proposed simulation step 2. A more sophisticated numerical method like Simpson's rule or other higher order Newton-Cotes formula or quadrature rules [23] can be used to obtain a more accurateF Yt , but the procedure will be much more computationally taxing. The method described here can be used to simulate a large number of observations (approximated) from Y t . With the usual particle tracking insights, these simulated values can also be used to build empirical confidence intervals for the fitted concentration values. Let us consider P [Y t ∈ (a, b)] = F Yt (b) − F Yt (a). This is the probability that a randomly chosen tracer particle will be in a given interval (a, b) at a time point t. For example, in groundwater pollution modeling this expression can be an important one that estimates the chance that the pollution will reach a certain area after a certain time. Using i.i.d. random simulations Y of Y t , an empirical estimate for this probability can be given byP = 1 ∈ (a, b)]. Then by the central limit theoremP asymptotically follows the normal distribution with mean Hence for large N 's, can be given bŷ Now noting that density p(y, t) can be approximated by 1 b−a P [Y t ∈ (a, b)] with a small interval (a, b), the simulated Y t values can be used to calculate the confidence intervals for the fitted densities. See [11] for detailed asymptotic confidence interval construction steps and related results associated with the empirical density function that is calculated using the generated Y t 's.
Assuming that the observed concentrations are scaled versions of density p(y, t) of Y t , one can repeat simulation step 2 described in Section 2.1 without solving the forward equation to generate Y t 's. However, the grid used for the simulation will be limited only to the fixed observed data location points. Since Y t 's are simulated through numerical approximation the performance of this simulation depends on the grid spacing used. Thus the fitted fADE in simulation step 1 not only gives us the better understanding of the underlying process but also is essential for generating a good approximate process that resembles Y t by enabling the use of a finer grid for the numerical approximation. Further, modeling the underlying fADE process facilitates better prediction than mere extrapolation from an observed sample set.
On the other hand, an fADE can be fitted to the observed concentration by simulation step 1 only, without describing the underlying stochastic process. But the stochastic diffusion description is useful for understanding the mesoscopic flow and particle tracking methods. Further, generating the solution to the SDE associated with the fADE can be used to construct confidence intervals for the fADE fits that account for the error of estimation by a sample set.
In conclusion, the fADE and the associated α-stable SDE are essential tools to model heavy-tailed diffusion. However, the model fitting part may get complicated if we consider any scenario other than constant drift and diffusion coefficients. A numerical scheme for solving the fADE and the related SDE with linear drift is presented here. The general ideas used in the simulation steps can be replicated for a more complicated form of the drift. We plan to explore such models in our future work along with the convergence and the efficiency of the numerical approximations applied here.

A Appendix
In [11] the tracer trajectory Y t of the MADE site data was assumed to follow an SDE of the form (1.2) with constant parameters and hence itself was a stable process with index of stability α, skewness parameter β, location parameter v (same as the assumed constant drift velocity in (1.5)) and scale parameter σ, where σ α = Dt| cos(πα/2)| with constant diffusion coefficient function D in (1.5).
The fitted values in [11] are: These were used as initial values for the iterations in Section 3.