Extreme residuals in regression model. Minimax approach

We obtain limit theorems for extreme residuals in a linear regression model in the case of minimax estimation of the parameters.

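Let $\hat\theta = (\hat\theta_1, \dots, \hat\theta_q)$ be the least squares estimator (LSE) of $\theta$. Introduce the notation
$$\hat y_j = \sum_{i=1}^{q} \hat\theta_i x_{ji}, \qquad \hat\varepsilon_j = y_j - \hat y_j, \qquad j = 1, \dots, N;$$
$$Z_N = \max_{1\le j\le N} \varepsilon_j, \qquad \hat Z_N = \max_{1\le j\le N} \hat\varepsilon_j, \qquad Z_N^* = \max_{1\le j\le N} |\varepsilon_j|, \qquad \hat Z_N^* = \max_{1\le j\le N} |\hat\varepsilon_j|.$$
The asymptotic behavior of the r.v.-s $Z_N$, $Z_N^*$ is studied in the theory of extreme values (see the classical works by Fréchet [10], Fisher and Tippett [3], and Gnedenko [5] and the monographs [4,8]). In the papers [6,7], it was shown that, under mild assumptions, the asymptotic properties of the r.v.-s $Z_N$, $\hat Z_N$, $Z_N^*$, and $\hat Z_N^*$ are similar both for finite variance and for heavy tails of the observation errors $\varepsilon_j$.
In the present paper, we study the asymptotic properties of the minimax estimator (MME) of $\theta$ and of the maximal absolute residual. For the MME, we keep the same notation $\hat\theta$.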

Definition 1.
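A random variable $\hat\theta = (\hat\theta_1, \dots, \hat\theta_q)$ is called the MME for $\theta$ by the observations (1) if
$$\Delta = \max_{1\le j\le N} \Big| y_j - \sum_{i=1}^{q} \hat\theta_i x_{ji} \Big| = \min_{\tau} \max_{1\le j\le N} \Big| y_j - \sum_{i=1}^{q} \tau_i x_{ji} \Big|, \qquad (2)$$
where the minimum is taken over all $\tau = (\tau_1, \dots, \tau_q) \in \mathbb{R}^q$.
Denote $W_N = \min_{1\le j\le N} \varepsilon_j$, and let $R_N = Z_N - W_N$ and $Q_N = \frac{Z_N + W_N}{2}$ be the range and midrange of the sequence $\varepsilon_j$, $j = 1, \dots, N$.
The following statement shows an essential difference between the behavior of the MME and that of the LSE.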

Remark 1.
From point (ii) of Statement 1 it follows that the MME $\hat\theta$ can fail to be consistent in model (4) even when the errors $\varepsilon_j$ have all moments (see Example 2).

Remark 2.
The value $\Delta$ can be represented as the solution of the following linear programming problem (LPP):
$$\Delta \to \min, \qquad \sum_{i=1}^{q} \tau_i x_{ji} + \Delta \ge y_j, \qquad -\sum_{i=1}^{q} \tau_i x_{ji} + \Delta \ge -y_j, \qquad j = 1, \dots, N. \qquad (5)$$
So problem (2) of determining the values $\Delta$ and $\hat\theta$ is reduced to solving LPP (5). The LPP can be solved efficiently by the simplex method (see [2,12]).
The investigation of the asymptotic properties of the maximal absolute residual $\Delta$ and the MME $\hat\theta$ is quite difficult in the case of the general model (1). However, under additional assumptions on the regression experiment design and the observation errors $\varepsilon_j$, it is possible to find the limiting distribution of $\Delta$, to prove the consistency of the MME $\hat\theta$, and even to estimate the rate of convergence $\hat\theta \to \theta$ as $N \to \infty$.

The main theorems
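First, we briefly recall some results of extreme value theory. Let the r.v.-s $(\varepsilon_j)$ have the d.f. $F(x)$. Assume that, for some constants $b_n > 0$ and $a_n$, as $n \to \infty$,
$$b_n (Z_n - a_n) \xrightarrow{D} \zeta, \qquad (6)$$
and $\zeta$ has a nondegenerate d.f. $G(x) = P(\zeta < x)$. If assumption (6) holds, then we say that the d.f. $F$ belongs to the domain of maximum attraction of the probability distribution $G$ and write $F \in D(G)$.
If $F \in D(G)$, then $G$ must be of exactly one of the following three types [5,8]:
Type I: $\Phi_\alpha(x) = 0$ for $x \le 0$ and $\Phi_\alpha(x) = \exp(-x^{-\alpha})$ for $x > 0$;
Type II: $\Psi_\alpha(x) = \exp(-(-x)^{\alpha})$ for $x \le 0$ and $\Psi_\alpha(x) = 1$ for $x > 0$;
Type III: $\Lambda(x) = \exp(-e^{-x})$, $-\infty < x < \infty$, (7)
where $\alpha > 0$. Necessary and sufficient conditions for convergence to each of the d.f.-s $\Phi_\alpha$, $\Psi_\alpha$, $\Lambda$ are also well known.
Suppose in the model (1) that: (A2) $(\varepsilon_j)$ satisfy relation (6), that is, $F \in D(G)$ with normalizing constants $a_n$ and $b_n$, where $G$ is one of the d.f.-s $\Phi_\alpha$, $\Psi_\alpha$, $\Lambda$ defined in (7).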
Assume further that the regression experiment design is organized as follows:
$$x_j = (x_{j1}, \dots, x_{jq}) = v_l = (v_{l1}, \dots, v_{lq}), \qquad j \in I_l, \quad l = 1, \dots, k, \qquad (8)$$
that is, $x_j$ take some fixed values only. Besides, suppose that
$$\operatorname{card}(I_l) = n, \qquad I_m \cap I_l = \emptyset, \quad m \ne l, \qquad (9)$$
and $N = kn$ is the sample size.
Then the MME $\hat\theta$ is consistent, and

Example 1. Let, in the model of simple linear regression, $x_j = v$, $j = 1, \dots, N$, that is, $k = 1$ and $q = 2$. Then such a model can be rewritten in the form (4) with $\theta = \theta_0 + \theta_1 v$. Clearly, the parameters $\theta_0$, $\theta_1$ cannot be determined uniquely here. So it does not make sense to speak about the consistency of the MME $\hat\theta$ when $k < q$.

Example 2. Consider the regression model
The limiting distribution is a logistic one (see [9], p. 62). Using further the well-known formulas for the type $\Lambda$ ([9], p. 49), $a_n = F^{-1}(1 - \frac{1}{n})$ and $b_n = n f(a_n)$, we find $a_n = \ln\frac{n}{2}$ and $b_n = 1$. From Statement 1 it now follows that the MME $\hat\theta$ is not consistent. Thus, condition (13) of Theorem 2 cannot be weakened.
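The normalizing constants quoted in Example 2 can be checked numerically. The sketch below assumes, purely for illustration, that the errors follow the standard Laplace law with density $f(x) = \frac{1}{2}e^{-|x|}$; this choice is an assumption made here because it reproduces $a_n = \ln\frac{n}{2}$ and $b_n = 1$ from the formulas $a_n = F^{-1}(1 - \frac{1}{n})$, $b_n = n f(a_n)$. The function names are hypothetical.

```python
import math

def laplace_upper_quantile(p):
    """Quantile F^{-1}(p) of the standard Laplace law for p in [1/2, 1),
    using F(x) = 1 - exp(-x)/2 for x >= 0 (assumed error distribution)."""
    if not 0.5 <= p < 1.0:
        raise ValueError("p must lie in [1/2, 1)")
    return -math.log(2.0 * (1.0 - p))

def laplace_density(x):
    """Density f(x) = exp(-|x|)/2 of the standard Laplace law."""
    return 0.5 * math.exp(-abs(x))

def gumbel_type_constants(n):
    """Constants a_n = F^{-1}(1 - 1/n) and b_n = n * f(a_n) for n >= 2."""
    a_n = laplace_upper_quantile(1.0 - 1.0 / n)
    b_n = n * laplace_density(a_n)
    return a_n, b_n

for n in (10, 1000, 10**6):
    a_n, b_n = gumbel_type_constants(n)
    # a_n should agree with ln(n/2) and b_n with 1, up to rounding
    print(n, a_n, math.log(n / 2), b_n)
```

Since $b_n$ does not grow with $n$ under this assumption, the midrange is not pulled toward zero by the normalization, which is in line with the lack of consistency stated in the example.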
The following lemma allows us to check condition (13).
Lemma 1. Let $F \in D(G)$. Then we have: Thus, (13) does not hold.
2. If $G = \Psi_\alpha$, then where $L(x)$ is a function slowly varying (s.v.) at zero, and there exists a function $L_1(x)$, s.v. at infinity, such that So (13) is true.

Theorem 3. Let for the model
where the matrix $V_Q(i)$ is obtained from $V$ by replacing the $i$th column with the column $(Q_{n1}, \dots, Q_{nq})^T$.

Proofs of the main results
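Let us start with the following elementary lemma, where $Z_n(t)$, $W_n(t)$, $R_n(t)$, and $Q_n(t)$ are determined by a sequence $t = \{t_1, \dots, t_n\}$ and are, respectively, the maximum, minimum, range, and midrange of the sequence $t$.
Lemma 2. Let $t_1, \dots, t_n$ be any real numbers, and
$$\alpha_n = \min_{s \in \mathbb{R}} \max_{1\le i\le n} |t_i - s|. \qquad (20)$$
Then $\alpha_n = R_n(t)/2$; moreover, the minimum in (20) is attained at the point $s = Q_n(t)$.
Proof. Choose $s = Q_n(t)$. Then
$$\max_{1\le i\le n} |t_i - s| = \max\{Z_n(t) - Q_n(t),\ Q_n(t) - W_n(t)\} = \frac{R_n(t)}{2}.$$
If $s = Q_n(t) + \delta$, then, for $\delta > 0$,
$$\max_{1\le i\le n} |t_i - s| \ge s - W_n(t) = \frac{R_n(t)}{2} + \delta > \frac{R_n(t)}{2},$$
and, for $\delta < 0$,
$$\max_{1\le i\le n} |t_i - s| \ge Z_n(t) - s = \frac{R_n(t)}{2} - \delta > \frac{R_n(t)}{2},$$
that is, $s = Q_n(t)$ is the point of minimum.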
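Lemma 2 is easy to check numerically. The following minimal sketch (function names are hypothetical, chosen here for illustration) computes the midrange and half-range of a finite sequence and compares the maximal absolute deviation at the midrange with its value at shifted points.

```python
def midrange_and_half_range(t):
    """Return (Q_n(t), R_n(t)/2): the midrange and half of the range of t."""
    lo, hi = min(t), max(t)
    return (hi + lo) / 2, (hi - lo) / 2

def max_abs_deviation(t, s):
    """Objective of Lemma 2: max_i |t_i - s|."""
    return max(abs(x - s) for x in t)

t = [2.0, -1.0, 0.5, 4.0, 3.0]
s_star, alpha = midrange_and_half_range(t)

# At the midrange the objective equals R_n(t)/2 ...
print(s_star, alpha, max_abs_deviation(t, s_star))

# ... and any shift delta increases it by |delta|.
for delta in (-1.0, -0.25, 0.25, 1.0):
    print(delta, max_abs_deviation(t, s_star + delta))
```

For a model with a single location parameter, the same computation applied to the observations gives the minimax fit directly, without solving a linear program.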
Proof of Statement 1. We use Lemma 2. Point (ii) of Statement 2 follows directly from Lemma 2.
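Proof of Theorem 1. Using the notation
$$d_i = \tau_i - \theta_i, \quad i = 1, \dots, q; \qquad Z_{nl} = \max_{j \in I_l} \varepsilon_j, \quad W_{nl} = \min_{j \in I_l} \varepsilon_j, \quad l = 1, \dots, k,$$
and taking into account Eq. (1) and conditions (8) and (9), we rewrite LPP (5) in the following form:
$$\Delta \to \min, \qquad \sum_{i=1}^{q} d_i v_{li} + \Delta \ge Z_{nl}, \qquad -\sum_{i=1}^{q} d_i v_{li} + \Delta \ge -W_{nl}, \qquad l = 1, \dots, k. \qquad (21)$$
The LPP dual to (21) has the form $\max_{u \in D^*} L_n^*(u)$, where $L_n^*(u) = \sum_{l=1}^{k} (u_l Z_{nl} - u'_l W_{nl})$, and the domain $D^*$ is given by (11). According to the basic duality theorem ([11], Chap. 4),
$$\Delta = \max_{u \in D^*} L_n^*(u).$$
Hence, we obtain
$$b_n(\Delta - a_n) = \max_{u \in D^*} b_n \big( L_n^*(u) - a_n \big) = \max_{u \in D^*} g_n(u).$$
Denote by $\Gamma^*$ the set of vertices of the domain $D^*$. Since the maximum in LPP (22) is attained at one of the vertices in $\Gamma^*$,
$$\max_{u \in D^*} g_n(u) = \max_{u \in \Gamma^*} g_n(u), \qquad n \ge 1.$$
Obviously, $\operatorname{card}(\Gamma^*) < \infty$. Thus, to prove (10), it suffices to prove that, as $n \to \infty$, $\max_{u \in \Gamma^*} g_n(u)$ The Cramér-Wold argument (see, e.g., §7 of the book [1]) reduces (23) to the following relation: for any $t_m \in \mathbb{R}$, as $n \to \infty$, The last convergence holds if, for any $c_l$, $c'_l$, as $n \to \infty$, Under the conditions of Theorem 1, The vectors $(Z_{nl}, W_{nl})$, $l = 1, \dots, k$, are independent, and, on the other hand, $Z_{nl}$ and $W_{nl}$ are asymptotically independent as $n \to \infty$ ([8], p. 28). To obtain (24), it remains to apply the Cramér-Wold argument once more.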
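As a sanity check on the duality step, consider the simplest case $k = q = 1$ with design value $v_1 = 1$. The explicit primal and dual below are written out here from standard linear programming duality, with $\Delta$ treated as a free variable; they are an illustrative sketch under that assumption and need not coincide in notation with (21) and (11).

```latex
\begin{aligned}
\text{primal:}\quad & \min_{d,\ \Delta}\ \Delta
  \quad\text{subject to}\quad d + \Delta \ge Z_n,\qquad -d + \Delta \ge -W_n;\\
\text{dual:}\quad & \max_{u,\ u' \ge 0}\ \bigl(u Z_n - u' W_n\bigr)
  \quad\text{subject to}\quad u + u' = 1,\qquad u - u' = 0.
\end{aligned}
```

The dual feasible set reduces to the single point $u = u' = \frac{1}{2}$, so the common optimal value is $\frac{1}{2}(Z_n - W_n) = R_n/2$, in agreement with Lemma 2.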
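Therefore, the minimum over $d$ in (29) is attained at the point $\hat d$ that solves the system of linear equations
$$\sum_{i=1}^{q} \hat d_i v_{li} = Q_{nl}, \qquad l = 1, \dots, q.$$
Since the matrix $V$ is nonsingular, by Cramer's rule
$$\hat d_i = \frac{\det V_Q(i)}{\det V}, \qquad i = 1, \dots, q.$$
Obviously, for such a choice of $\hat d$, $\Delta = \frac{1}{2} \max_{1\le l\le q} R_{nl}$, that is, we have obtained formulas (15) and (16).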
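The construction via Cramer's rule can be illustrated for $q = k = 2$. The sketch below is an assumption-laden illustration rather than the authors' procedure: it applies the same argument directly to the observations grouped by design point (by linearity, applying it to the errors instead would return $\hat\theta - \theta$), and the function names and the data are hypothetical.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def minimax_fit_two_points(V, groups):
    """Minimax fit for a design with two distinct points (rows of V).

    V      : 2x2 nonsingular matrix, row l is the design point v_l
    groups : groups[l] is the list of values observed at design point l
    Returns (tau, delta): the coefficients and the maximal absolute residual.
    """
    Q = [(max(g) + min(g)) / 2 for g in groups]   # group midranges
    R = [max(g) - min(g) for g in groups]         # group ranges
    det_v = det2(V)
    if det_v == 0:
        raise ValueError("V must be nonsingular")
    tau = []
    for i in range(2):
        # V_Q(i): replace the i-th column of V by the column of midranges
        v_q = [row[:] for row in V]
        for l in range(2):
            v_q[l][i] = Q[l]
        tau.append(det2(v_q) / det_v)
    return tau, max(R) / 2

V = [[1, 0], [1, 1]]                       # intercept plus x in {0, 1}
groups = [[1.0, 3.0, 2.5], [4.0, 5.0, 7.0]]
tau, delta = minimax_fit_two_points(V, groups)

# Largest absolute residual of the fitted model, recomputed directly
worst = max(
    abs(y - sum(tau[i] * V[l][i] for i in range(2)))
    for l in range(2) for y in groups[l]
)
print(tau, delta, worst)
```

With these data the fitted values at the two design points are the group midranges 2.0 and 5.5, and the maximal absolute residual equals half of the larger group range.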
(ii) Using the asymptotic independence of the r.v.-s $Z_n$ and $W_n$, we derive the following statement.
where $\zeta$ and $\zeta'$ are independent r.v.-s with d.f. $G$.
In fact, this lemma is contained in Theorem 2.9.2 of the book [4] (see also Theorem 2.10 in [9]).
Remark 3 follows directly from Theorem 3. Indeed, let $k < q$, and let there exist a nonsingular submatrix $\tilde V \subset V$. Choosing in LPP (21) from Theorem 1 $d_i = 0$ for all $i \ne i_1, i_2, \dots, i_k$ (i.e., taking $\tau_i = \theta_i$ for such indices $i$), we pass to problem (29). It remains to apply Eq. (15) of Theorem 3.