Note on AR(1)-characterisation of stationary processes and model fitting

It was recently proved that any strictly stationary stochastic process can be viewed as an autoregressive process of order one with coloured noise. Furthermore, it was proved that, using this characterisation, one can define closed form estimators for the model parameter based on autocovariance estimators for several different lags. However, this estimation procedure may fail in some special cases. In this article we provide a detailed analysis of these special cases. In particular, we prove that these cases correspond to degenerate processes.


Introduction
Stationary processes are an important tool in many practical applications of time series analysis, and the topic is extensively studied in the literature. Traditionally, stationary processes are modelled by using autoregressive moving average processes or linear processes (see monographs [2,4] for details).
One of the simplest examples of an autoregressive moving average process is an autoregressive process of order one, that is, a process $(X_t)_{t\in\mathbb{Z}}$ defined by

$$X_t = \varphi X_{t-1} + \varepsilon_t, \tag{1}$$

where $\varphi \in (-1, 1)$ and $(\varepsilon_t)_{t\in\mathbb{Z}}$ is a sequence of independent and identically distributed square integrable random variables. The continuous time analogue of (1) is called the Ornstein-Uhlenbeck process, which can be defined as the stationary solution of the Langevin-type stochastic differential equation

$$dU_t = -\varphi U_t\,dt + dW_t, \tag{2}$$

where $\varphi > 0$ and $(W_t)_{t\in\mathbb{R}}$ is a two-sided Brownian motion. Such equations also have applications in mathematical physics. Statistical inference for the AR(1)-process or the Ornstein-Uhlenbeck process is well-established in the literature. Furthermore, a generalised continuous time Langevin equation, in which the Brownian motion $W$ in (2) is replaced with a more general driving force $G$, has recently been a subject of active study. In particular, the so-called fractional Ornstein-Uhlenbeck processes introduced in [3] have been studied extensively. For parameter estimation in such models, we mention a recent monograph [5] dedicated to the subject, and the references therein.
When the model becomes more complicated, the number of parameters increases and the estimation may become a challenging task. For example, it may happen that standard maximum likelihood estimators cannot be expressed in closed form [2]. Even worse, it may happen that classical estimators such as maximum likelihood or least squares estimators are biased and not consistent (cf. [1] for discussions on the generalised ARCH-model with fractional Brownian motion driven liquidity). One way to tackle such problems is to consider a one-parameter model, and to replace the white noise in (1) with some other stationary noise. It was proved in [7] that each discrete time strictly stationary process can be characterised by

$$X_t = \varphi X_{t-1} + Z_t, \tag{3}$$

where $\varphi \in (0, 1)$ and $Z = (Z_t)_{t\in\mathbb{Z}}$ is a stationary noise. This representation can be viewed as a discrete time analogue of the fact that the Langevin-type equation characterises strictly stationary processes in continuous time [6]. The authors in [7] applied characterisation (3) to model fitting and parameter estimation. The presented estimation procedure is straightforward to apply with the exception of certain special cases. The purpose of this paper is to provide a comprehensive analysis of these special cases. In particular, we show that such cases do not provide very useful models. This highlights the wide applicability of characterisation (3) and the corresponding estimation procedure.
The rest of the paper is organised as follows. In Section 2 we briefly discuss the motivating estimation procedure of [7]. We also present and discuss our main results together with some illustrative figures. All the proofs and technical lemmas are postponed to Section 3.

Motivation and formulation of the main results
Let $X = (X_t)_{t\in\mathbb{Z}}$ be a stationary process. It was shown in [7] that the equation

$$X_t = \varphi X_{t-1} + Z_t, \tag{4}$$

where $\varphi \in (0, 1)$ and $Z_t$ is another stationary process, characterises all discrete time (strictly) stationary processes. Throughout this paper we suppose that $X$ and $Z$ are square integrable processes with autocovariance functions $\gamma(\cdot)$ and $r(\cdot)$, respectively. Using Equation (4), one can derive Yule-Walker type equations for the parameter $\varphi$, which can be solved in an explicit form. Indeed, (4) gives $r(m) = (1+\varphi^2)\gamma(m) - \varphi\big(\gamma(m+1)+\gamma(m-1)\big)$, which is a quadratic equation in $\varphi$. Namely, for any $m \in \mathbb{Z}$ such that $\gamma(m) \neq 0$ we have

$$\varphi = \frac{\gamma(m+1)+\gamma(m-1) \pm \sqrt{\big(\gamma(m+1)+\gamma(m-1)\big)^2 - 4\gamma(m)\big(\gamma(m)-r(m)\big)}}{2\gamma(m)}. \tag{5}$$

The estimation of the parameter $\varphi$ is straightforward from (5) provided that one can determine which sign, plus or minus, one should choose. In practice, this can be done by choosing different lags $m$ for which to estimate the covariance function $\gamma(m)$. Then one can determine the correct value of $\varphi$ by comparing different signs in (5) for different lags $m$ (we refer to [7, p. 387] for a detailed discussion). However, this approach fails, i.e. one cannot find suitably chosen lags leading to the correct choice of the sign and only one value of $\varphi$, if for every $m \in \mathbb{Z}$ such that $\gamma(m) = 0$ we also have $r(m) = 0$, and for every $m \in \mathbb{Z}$ such that $\gamma(m) \neq 0$ the ratio satisfies

$$\frac{r(m)}{\gamma(m)} = a \tag{6}$$

for some constant $a \in (0, 1)$. The latter is equivalent [7, p. 387] to the fact that

$$\frac{\gamma(m+1)+\gamma(m-1)}{\gamma(m)} = b \tag{7}$$

for some constant $b$ with $\varphi < b < \varphi + \varphi^{-1}$. This leads to

$$\gamma(m+1) = b\gamma(m) - \gamma(m-1). \tag{8}$$

Moreover, if $\gamma(m) = r(m) = 0$ for some $m$, it is straightforward to verify that (8) holds in this case as well. Thus (8) holds for all $m \in \mathbb{Z}$. Since covariance functions are necessarily symmetric, we obtain an "initial" condition $\gamma(1) = \frac{b}{2}\gamma(0)$. Thus (8) admits a unique symmetric solution.
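The sign ambiguity in (5) can be illustrated numerically. The following is a minimal sketch, not the estimation procedure of [7]: it assumes the classical AR(1) case with unit-variance white noise as a hypothetical test model, and the helper name `candidate_roots` is introduced here for illustration only.

```python
import math

def candidate_roots(gamma, r, m):
    """Both solutions of the quadratic behind (5) at lag m.

    The quadratic is phi^2 * gamma(m) - phi * (gamma(m+1) + gamma(m-1))
    + gamma(m) - r(m) = 0, and it requires gamma(m) != 0.
    """
    s = gamma(m + 1) + gamma(m - 1)
    disc = s * s - 4.0 * gamma(m) * (gamma(m) - r(m))
    root = math.sqrt(max(disc, 0.0))  # clip tiny negative rounding errors
    return (s - root) / (2.0 * gamma(m)), (s + root) / (2.0 * gamma(m))

# Hypothetical test model: AR(1) with white noise Z of unit variance.
phi = 0.6
gamma = lambda m: phi ** abs(m) / (1.0 - phi ** 2)  # autocovariance of X
r = lambda m: 1.0 if m == 0 else 0.0                # autocovariance of Z

roots_lag0 = candidate_roots(gamma, r, 0)  # double root at phi
roots_lag1 = candidate_roots(gamma, r, 1)  # two candidates: phi and 1/phi
```

In this toy model lag 1 offers the two candidates $\varphi$ and $1/\varphi$, while lag 0 has a double root at $\varphi$, so comparing lags singles out the correct value.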
From $\gamma(1) = \frac{b}{2}\gamma(0)$ it is clear that (8) does not define a covariance function for $b > 2$. Furthermore, since $\varphi > 0$, it suffices to study the regime $b \in [0, 2]$ (we include the trivial case $b = 0$). The case $b = 2$ corresponds to $X_t = X_0$ for all $t \in \mathbb{Z}$, which is hardly interesting. Similarly, the case $b = 0$ leads to a process $(\ldots, X_0, X_1, -X_0, -X_1, X_0, X_1, \ldots)$, which again does not provide a practical model. On the other hand, it is not clear whether for some other values $b \in (0, 2)$ Equation (8) can lead to a non-trivial model in which the estimation procedure explained above cannot be applied. It turns out that, for any $b \in [0, 2]$, Equation (8) defines a covariance function. However, the resulting covariance function, denoted by $\gamma_b$, leads to a model that is not very interesting either.
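The recursion (8) with the initial condition $\gamma(1) = \frac{b}{2}\gamma(0)$ is easy to iterate. The sketch below is illustrative only; the function name `gamma_b` and the value $b = 1.3$ are choices made here. It also compares the recursion with $\cos(m\theta)$, $\theta = \arccos(b/2)$, which solves the same recursion by the identity $2\cos\theta\cos(m\theta) = \cos((m+1)\theta) + \cos((m-1)\theta)$.

```python
import math

def gamma_b(b, n, gamma0=1.0):
    """Values gamma(0), ..., gamma(n) from (8) with gamma(1) = (b/2) * gamma(0)."""
    values = [gamma0, 0.5 * b * gamma0]
    for m in range(1, n):
        values.append(b * values[m] - values[m - 1])
    return values[: n + 1]

constant = gamma_b(2.0, 8)     # b = 2: gamma is constant, i.e. X_t = X_0
alternating = gamma_b(0.0, 8)  # b = 0: pattern 1, 0, -1, 0, ...
generic = gamma_b(1.3, 8)      # an arbitrary value in (0, 2)
exploding = gamma_b(2.5, 8)    # b > 2: |gamma(1)| > gamma(0), not a covariance

theta = math.acos(1.3 / 2.0)
closed_form = [math.cos(m * theta) for m in range(9)]
```

For $b = 2.5$ the second value already exceeds $\gamma(0)$, which is why only $b \in [0, 2]$ needs to be considered.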

For any
In many applications of stationary processes, it is assumed that the covariance function $\gamma(\cdot)$ vanishes at infinity, or that $\gamma(\cdot)$ is periodic. Note that the latter case corresponds simply to the analysis of finite-dimensional random vectors with identically distributed components. Indeed, $\gamma(m) = \gamma(0)$ implies $X_m = X_0$ almost surely, so periodicity of $\gamma(\cdot)$ with period $N$ implies that there exist at most $N$ random variables as the source of randomness. By items (2) and (3) of Theorem 2.1, we observe that, for suitable values of $b$, (8) can be used to construct covariance functions that are neither periodic nor vanishing at infinity. On the other hand, in this case there are arbitrarily large lags $m$ such that $\gamma_b(m)$ is arbitrarily close to $\gamma_b(0)$. Consequently, it is expected that different estimation procedures fail. Indeed, even the standard covariance estimators are not consistent. A consequence of Theorem 2.1 is that only a little structure in the noise $Z$ is needed in order to apply the estimation procedure of the parameter $\varphi$ introduced in [7], provided that one has consistent estimators for the covariances of $X$. The following is a precise mathematical formulation of this observation.
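The near-returns of $\gamma_b(m)$ to $\gamma_b(0)$ can be observed numerically. The following sketch assumes the parametrisation $b = 2\sin(r\frac{\pi}{2})$ used in the proofs, with the illustrative irrational value $r = 1/\sqrt{2}$, and uses the fact that $\gamma_b(m)/\gamma_b(0) = \cos(m\theta)$ with $\theta = \arccos(b/2)$ solves (8). The threshold 0.999 and the lag range are arbitrary choices.

```python
import math

r = 1.0 / math.sqrt(2.0)               # illustrative irrational number in (0, 1)
b = 2.0 * math.sin(r * math.pi / 2.0)
theta = math.acos(b / 2.0)             # gamma_b(m) / gamma_b(0) = cos(m * theta)

# Lags at which the normalised covariance almost returns to its value at lag 0.
near_returns = [m for m in range(1, 2001) if math.cos(m * theta) > 0.999]

# The covariance does not decay: large values persist at large lags.
tail = [abs(math.cos(m * theta)) for m in range(1900, 2001)]
```

Any lag in `near_returns` corresponds to a pair $X_0$, $X_m$ that is almost perfectly correlated, which is the mechanism behind the failure of standard covariance estimators mentioned above.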
Remark 3.2. Using (8) directly, we observe, for even m ≥ 1, that Similarly, for odd m ≥ 1, we obtain γ(m) =  These formulas are finite polynomial expansions, in variable b, of the functions presented in (9) which could have been deduced also by using some well-known trigonometric identities.
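The polynomial structure in $b$ can be generated mechanically from (8). The sketch below is an illustration and not the explicit formulas of the remark: it iterates the recursion on coefficient lists, starting from $\gamma(0)/\gamma(0) = 1$ and $\gamma(1)/\gamma(0) = b/2$. The function name `gamma_polynomials` is introduced here.

```python
from fractions import Fraction

def gamma_polynomials(n):
    """Coefficient lists (ascending powers of b) of gamma(m)/gamma(0), m = 0..n."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1, 2)]]
    for m in range(1, n):
        shifted = [Fraction(0)] + polys[m]  # multiply gamma(m) by b
        prev = polys[m - 1] + [Fraction(0)] * (len(shifted) - len(polys[m - 1]))
        polys.append([s - p for s, p in zip(shifted, prev)])
    return polys[: n + 1]

polys = gamma_polynomials(4)
# polys[2] encodes b^2/2 - 1, polys[3] encodes b^3/2 - 3b/2,
# polys[4] encodes b^4/2 - 2b^2 + 1.
```

Even lags produce polynomials in even powers of $b$ only, and odd lags in odd powers only, matching the even/odd split in the remark.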
Before proving our main theorems we need several technical lemmas.
Definition 3.3. We denote by $Q$ the subset of rationals defined by

Remark 3.4. The modulo condition above means only that either $k$ is even and $l$ is odd, or vice versa.

Change of variable $t = j - l$ gives

Consequently, for even $k$ and odd $l$ we have

Similarly, for odd $k$ and even $l$,

Lemma 3.6. Let $\gamma(\cdot)$ be given by (9) with $b = 2\sin\left(\frac{k}{l}\frac{\pi}{2}\right)$ for some $\frac{k}{l} \in Q$. Then the non-zero eigenvalues of the matrix

$$C = \big(\gamma(i-j)\big)_{i,j=0}^{4l-1} \tag{12}$$

are either $2l$ with multiplicity two or $4l$ with multiplicity one.
Proof. Let $c_i$ denote the $i$th column of $C$. Then, by the defining equation (8),

$$c_{i+1} = b c_i - c_{i-1}.$$

Consequently, there exist at most two linearly independent columns. Thus $\operatorname{rank}(C) \le 2$, which in turn implies that there exist at most two non-zero eigenvalues $\lambda_1$ and $\lambda_2$. In order to compute $\lambda_1$ and $\lambda_2$, we recall the following identities:

$$\operatorname{tr}(C) = \lambda_1 + \lambda_2, \tag{13}$$

$$\|C\|_F^2 = \lambda_1^2 + \lambda_2^2, \tag{14}$$

where $\|\cdot\|_F$ is the Frobenius norm. If $\operatorname{rank}(C) = 1$, then $\lambda_2 = 0$, implying the second part of the claim. Suppose then that $\operatorname{rank}(C) = 2$. Observing that the sum of the squared diagonal entries is $4l$ and that, for $j = 1, 2, \ldots, 4l-1$, the term $\gamma(j)$ appears in $C$ exactly $2(4l-j)$ times, we obtain

$$\|C\|_F^2 = 4l + 2\sum_{j=1}^{4l-1}(4l-j)\gamma(j)^2.$$

Dividing the sum into two parts and using $\sin^2(x) = 1 - \cos^2(x)$ we have

where in the last equality we have used

where the substitution $j = 4l - t$ yields

Now (15), (16), and Lemma 3.5 imply

Finally, using (13) and (14) together with $\|C\|_F^2 = 8l^2$, we obtain

Hence $\lambda_1 = \lambda_2 = 2l$.
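The quantities used in the proof can be checked numerically for a small case. The sketch below rests on assumptions stated here rather than in the text: $C$ is taken to be the $4l \times 4l$ Toeplitz matrix with entries $\gamma(i-j)$ and $\gamma(0) = 1$, the covariance is evaluated as $\gamma(m) = \cos(m\theta)$ with $\theta = \arccos(b/2)$, and $k = 1$, $l = 2$ is an illustrative choice with $k$ odd and $l$ even.

```python
import math

k, l = 1, 2                                  # illustrative choice: k odd, l even
b = 2.0 * math.sin(k * math.pi / (2.0 * l))
theta = math.acos(b / 2.0)
n = 4 * l

# Assumed form of C: Toeplitz matrix with entries gamma(i - j), gamma(0) = 1.
C = [[math.cos((i - j) * theta) for j in range(n)] for i in range(n)]

trace = sum(C[i][i] for i in range(n))
frobenius_sq = sum(x * x for row in C for x in row)

# Column recursion c_{i+1} = b * c_i - c_{i-1}: all columns lie in span(c_0, c_1).
max_residual = max(
    abs(C[row][i + 1] - b * C[row][i] + C[row][i - 1])
    for i in range(1, n - 1)
    for row in range(n)
)

# Solve lambda_1 + lambda_2 = trace, lambda_1^2 + lambda_2^2 = ||C||_F^2.
product = (trace ** 2 - frobenius_sq) / 2.0
disc = max(trace ** 2 - 4.0 * product, 0.0)  # clip rounding noise
lambda_1 = (trace + math.sqrt(disc)) / 2.0
lambda_2 = (trace - math.sqrt(disc)) / 2.0
```

For this choice the trace equals $4l = 8$ and the squared Frobenius norm equals $8l^2 = 32$, so both recovered eigenvalues are close to $2l = 4$.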
We are now ready to prove Theorem 2.1 and Theorem 2.2.
Proof of Theorem 2.1. Throughout the proof we write $a_2 \equiv a_1 \pmod{2\pi}$ if $a_2 = a_1 + 2k\pi$ for some $k \in \mathbb{Z}$. That is, $a_1$ and $a_2$ are identified when regarded as points on the unit circle. By $a_3 \in (a_1, a_2) \pmod{2\pi}$ we mean that $a_3 \equiv a \pmod{2\pi}$ for some $a \in (a_1, a_2)$.

Since $\arcsin\left(\frac{b}{2}\right) = \frac{k}{l}\frac{\pi}{2}$, the first claim follows from Proposition 3.1 together with the fact that the functions $\sin(\cdot)$ and $\cos(\cdot)$ are periodic. In particular, we have $\gamma(4l + m) = \gamma(m)$ for every $m \in \mathbb{Z}$.

This implies $r = \frac{4k}{m \pm \tilde{m}}$, which contradicts $r \notin Q$. Since $m \mapsto \cos(mA)$ is injective, it is intuitively clear that the set $\{\cos(mA) : m \equiv 0 \pmod 4\}$ is dense in $[-1, 1]$. For a precise argument, we argue by contradiction and assume that there exists an interval $(c_1, d_1) \subset [-1, 1]$ such that $\cos(mA) \notin (c_1, d_1)$ for any $m \equiv 0 \pmod 4$. This implies that there exists an interval $(c_2, d_2) \subset [0, 2\pi]$ such that for every $m \equiv 0 \pmod 4$ it holds that $mA \notin (c_2, d_2) \pmod{2\pi}$. Without loss of generality, we can assume that $c_2 = 0$ and that for some $m_0 \equiv 0 \pmod 4$ we have $m_0 A \equiv 0 \pmod{2\pi}$. Let $m_n = m_0 + 4n$ with $n \in \mathbb{N}$ and denote by $\lfloor\cdot\rfloor$ the standard floor function.

Suppose that for some $n \in \mathbb{N}$ and $p_n \in (-d_2, 0)$ we have $m_n A \equiv p_n \pmod{2\pi}$. Since by injectivity $\frac{2\pi}{|p_n|} \notin \mathbb{N}$, we get $m_n \left\lfloor \frac{2\pi}{|p_n|} \right\rfloor A \in (0, d_2) \pmod{2\pi}$, leading to a contradiction. This implies that for every $n \in \mathbb{N}$ we have $m_n A \notin (-d_2, d_2) \pmod{2\pi}$ (for a visual illustration, see Figure 4). Similarly, assume next that $m_{n_1} A \equiv p_{n_1} \pmod{2\pi}$ and $m_{n_1+n_2} A - m_{n_1} A \in (-d_2, d_2) \pmod{2\pi}$. Then $m_{n_2} A \in (-d_2, d_2) \pmod{2\pi}$, which again leads to a contradiction (see Figure 5). This means that for an arbitrary point $p_n$ on the unit circle such that $m_n A \equiv p_n \pmod{2\pi}$, we get an interval $(p_n - d_2, p_n + d_2)$ (understood as an angle on the unit circle) that cannot be visited later. As the whole unit circle is covered eventually, we obtain the expected contradiction.

In the figure, $n^* = \left\lfloor \frac{2\pi}{|p_n|} \right\rfloor$, and we have visualised the points on the unit circle corresponding to the steps $0$, $n$, $2n$, $(n^* - 1)n$ and $n^* n$.
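The density claim can be probed numerically. This is only a sanity check, not a substitute for the argument above. It assumes that $A$ stands for $\arcsin(b/2) = r\frac{\pi}{2}$ with an irrational $r$, here the illustrative value $r = 1/\sqrt{2}$; the number of bins and the range of lags are arbitrary choices.

```python
import math

r = 1.0 / math.sqrt(2.0)     # illustrative irrational value
A = r * math.pi / 2.0        # assumed meaning of A: arcsin(b/2) with b = 2 sin(r*pi/2)

bins = 20                    # split [-1, 1] into 20 intervals of length 0.1
visited = [False] * bins
for m in range(0, 8001, 4):  # lags m with m = 0 (mod 4)
    value = math.cos(m * A)
    index = min(int((value + 1.0) / 2.0 * bins), bins - 1)
    visited[index] = True

all_bins_visited = all(visited)
```

Every interval of length 0.1 in $[-1, 1]$ is hit within the first two thousand admissible lags, which is what density suggests at this resolution.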

Denote
3. Consider first the case $b = 2\sin\left(\frac{k}{l}\frac{\pi}{2}\right)$, where $\frac{k}{l} \in Q$. By Lemma 3.6, the symmetric matrix $C$ defined by (12) has non-negative eigenvalues, and thus $C$ is a covariance matrix of some random vector $(X_0, X_1, \ldots, X_{4l-1})$. Now it suffices to extend this vector to a process $X = (X_t)_{t\in\mathbb{Z}}$ by the relation $X_{4l+t} = X_t$. Indeed, it is straightforward to verify that $X$ has the covariance function $\gamma$.
Assume next $b = 2\sin\left(r\frac{\pi}{2}\right)$, where $r \in (0, 1) \setminus Q$. We argue by contradiction and assume that there exist $k \in \mathbb{N}$ and vectors $t = (t_1, t_2, \ldots, t_k)^T \in \mathbb{Z}^k$ and $a = (a_1, a_2, \ldots, a_k)^T \in \mathbb{R}^k$ such that

$$\sum_{i,j=1}^{k} a_i \gamma(t_i - t_j) a_j = -\varepsilon < 0,$$

where $\gamma(\cdot)$ is the covariance function corresponding to the value $b$. Since $Q$ is dense in $[0, 1]$, it follows that there exists $(q_n)_{n\in\mathbb{N}} \subset Q$ such that $q_n \to r$. Denote the corresponding sequence of covariance functions by $(\gamma_n(\cdot))_{n\in\mathbb{N}}$. By definition,

$$\sum_{i,j=1}^{k} a_i \gamma_n(t_i - t_j) a_j \ge 0 \quad \text{for every } n.$$

On the other hand, continuity implies $\gamma_n(m) \to \gamma(m)$ for every $m$. This leads to

$$\lim_{n\to\infty} \sum_{i,j=1}^{k} a_i \gamma_n(t_i - t_j) a_j = \sum_{i,j=1}^{k} a_i \gamma(t_i - t_j) a_j = -\varepsilon,$$

giving the expected contradiction.
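Non-negativity of the quadratic form can also be checked directly. This is a complementary check and not the approximation argument used in the proof. It relies on $\gamma(m) = \gamma(0)\cos(m\theta)$ with $\theta = \arccos(b/2)$ and on the identity $\cos(x - y) = \cos x \cos y + \sin x \sin y$, which rewrites the form as a sum of two squares. The sketch assumes $\gamma(0) = 1$ and the illustrative value $r = 1/\sqrt{2}$; the helper names are introduced here.

```python
import math
import random

def quadratic_form(a, t, theta):
    """sum_{i,j} a_i * gamma(t_i - t_j) * a_j with gamma(m) = cos(m * theta)."""
    return sum(
        ai * math.cos((ti - tj) * theta) * aj
        for ai, ti in zip(a, t)
        for aj, tj in zip(a, t)
    )

def sum_of_squares(a, t, theta):
    """Same quantity rewritten via cos(x - y) = cos x cos y + sin x sin y."""
    c = sum(ai * math.cos(ti * theta) for ai, ti in zip(a, t))
    s = sum(ai * math.sin(ti * theta) for ai, ti in zip(a, t))
    return c * c + s * s

rng = random.Random(0)  # fixed seed so the check is reproducible
b = 2.0 * math.sin(math.pi / (2.0 * math.sqrt(2.0)))  # b = 2 sin(r*pi/2), r = 1/sqrt(2)
theta = math.acos(b / 2.0)

gaps = []
for _ in range(100):
    size = rng.randint(1, 6)
    a = [rng.uniform(-1.0, 1.0) for _ in range(size)]
    t = [rng.randint(-50, 50) for _ in range(size)]
    gaps.append(abs(quadratic_form(a, t, theta) - sum_of_squares(a, t, theta)))

largest_gap = max(gaps)  # agreement up to floating-point rounding
```

Since the right-hand side is a sum of two squares, the quadratic form cannot be negative for any choice of $a$ and $t$.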
Remark 3.7. Note that in the periodic case the covariance matrix $C$ defined by (12) satisfies $\operatorname{rank}(C) \le 2$. Thus, in this case, the process $(X_t)_{t\in\mathbb{Z}}$ is driven linearly by only two random variables $Y_1$ and $Y_2$. In other words, we have

$$X_t = a_1(t) Y_1 + a_2(t) Y_2$$

for some deterministic coefficients $a_1(t)$ and $a_2(t)$.

Furthermore, (8) implies (6) for every $m$ such that $\gamma(m) \neq 0$. Now

leading to a contradiction. Treating the case $r(m) \ge -r(0)(1 - \varepsilon)$ for all $m \ge M$ similarly concludes the proof.